Google AI Researchers Introduce HyperDreamBooth: An AI Approach That Efficiently Generates Custom Weights From A Single Image Of A Person And Is Smaller And 25x Faster Than DreamBooth

Generative Artificial Intelligence is drawing considerable attention, and recent developments in text-to-image (T2I) personalization have opened up intriguing possibilities for creative applications. Personalization, the generation of specific people in varied contexts and styles while preserving their identities with high fidelity, has become a prominent topic in generative AI. Face personalization, the ability to generate new images of a particular face or person in different styles, has been made possible by pre-trained diffusion models, which have strong priors over diverse styles.

Current approaches such as DreamBooth and comparable techniques succeed because they can add new subjects to the model without eroding its prior knowledge, while retaining the essence and details of the subject even when it is rendered in very different ways. But DreamBooth still has limitations, notably model size and training speed. It fine-tunes all the UNet and text encoder weights of the diffusion model, which yields a personalized model of over 1 GB for Stable Diffusion. Furthermore, training takes approximately 5 minutes for Stable Diffusion, which may impede widespread adoption and practical application.

To overcome these problems, a team of researchers at Google Research introduced HyperDreamBooth, a hypernetwork that efficiently generates a small set of custom weights from a single image of a person. These weights are then plugged into the diffusion model, which undergoes fast fine-tuning. The end result is a system capable of generating a person's face in a variety of contexts and styles, while retaining fine subject details and the diffusion model's essential knowledge of diverse styles and semantic modifications.


Speed is one of HyperDreamBooth's greatest strengths. It personalizes faces in roughly 20 seconds, which is 25x faster than DreamBooth and 125x faster than a related technique called Textual Inversion. Furthermore, this rapid personalization needs only a single reference image while maintaining the same quality and style diversity as DreamBooth. HyperDreamBooth also excels in model size. The resulting custom model is 10,000 times smaller than a regular DreamBooth model, which makes it more manageable and significantly reduces storage requirements.
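As a quick sanity check, the size figures quoted in this article agree with each other. The short calculation below is only back-of-the-envelope arithmetic: it rounds "over 1 GB" down to exactly 10^9 bytes, which is an assumption made for illustration, and the variable names are hypothetical.

```python
# Back-of-the-envelope check of the size figures quoted in the article.
# Assumption: "over 1 GB" is rounded down to exactly 10**9 bytes.
dreambooth_bytes = 1_000_000_000   # full DreamBooth model for Stable Diffusion
reduction_factor = 10_000          # "10,000 times smaller"

personalized_bytes = dreambooth_bytes // reduction_factor
personalized_kb = personalized_bytes // 1_000

print(f"Personalized weights: about {personalized_kb} KB")
```

Under that rounding, the result lines up with the roughly 100 KB custom part reported for the lightweight model.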

The team summarized their contributions as follows:

Lightweight DreamBooth (LiDB): A personalized text-to-image model whose custom part is approximately 100 KB. This is achieved by training the DreamBooth model in a low-dimensional weight space generated by a random orthogonal incomplete basis inside the low-rank adaptation (LoRA) weight space.

New HyperNetwork Architecture: Using the LiDB configuration, the HyperNetwork generates custom weights for a specific subject in a text-to-image diffusion model. This provides a strong directional initialization, allowing quick fine-tuning to reach high subject fidelity in just a few iterations. The method is 25 times faster than DreamBooth with comparable performance.

Relaxed Rank Optimization: A rank-relaxation technique is proposed, in which the rank of a LoRA DreamBooth model is relaxed during optimization to improve subject fidelity. This allows the custom model to be initialized with an approximation from the HyperNetwork and then refined to capture the high-level subject details.
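To make the LiDB idea more concrete, here is a minimal NumPy sketch of one way to read "a random orthogonal incomplete basis inside the LoRA weight space". It is not the authors' implementation. The factorization shown (frozen random bases `A_aux` and `B_aux` wrapped around small trainable factors `A_train` and `B_train`), all variable names, and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def orthonormal_columns(rows, cols, rng):
    """Random (rows x cols) matrix with orthonormal columns; needs cols <= rows."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not values taken from the article).
n, m = 320, 320   # shape of one frozen weight matrix W in the diffusion model
r = 1             # LoRA rank
a, b = 100, 50    # sizes of the reduced (incomplete) bases

# Frozen random orthogonal incomplete bases: generated once, never trained.
A_aux = orthonormal_columns(n, a, rng)      # n x a
B_aux = orthonormal_columns(m, b, rng).T    # b x m

# The only trainable tensors for this layer in the sketch.
A_train = rng.standard_normal((a, r))       # a x r
B_train = rng.standard_normal((r, b))       # r x b

# Low-rank update that would be added to the frozen weight W.
delta_w = A_aux @ A_train @ B_train @ B_aux  # n x m

full_params = n * m          # fine-tuning W directly
lora_params = r * (n + m)    # plain LoRA factors (n x r) and (r x m)
lidb_params = r * (a + b)    # trainable entries in this sketch

print(full_params, lora_params, lidb_params)
```

In this toy setting the trainable count per layer drops from `n * m` for full fine-tuning, to `r * (n + m)` for plain LoRA, to `r * (a + b)` once the frozen bases are introduced. Relaxing the rank, as in the third contribution, would correspond to fine-tuning with a larger `r` than the one used for the initial prediction.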

Check out the Paper and Project page.


Tanya Malhotra is a final year student at Petroleum and Energy University, Dehradun pursuing BTech in Computer Engineering with a major in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, coupled with a burning interest in acquiring new skills, leading teams, and managing work in an organized manner.

Image Source : www.marktechpost.com