The Composable Diffusion model, created by researchers at the Massachusetts Institute of Technology and the University of Illinois at Urbana-Champaign, can help artificial intelligence-based image generators produce more complex imagery while better comprehending compound prompts.
Models like DALL-E 2 sometimes struggle to understand the composition of certain concepts from natural language prompts; Composable Diffusion blends models that jointly generate desired images incorporating multiple aspects as directed by input text or labels.
The system uses diffusion models alongside compositional operators to integrate text descriptions without the need for additional training.
The models collaboratively and incrementally refine the image's appearance, producing an image that reflects the attributes contributed by each model.
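The joint refinement described above can be sketched as combining each component model's noise prediction at every denoising step. The function names, shapes, and weighting scheme below are illustrative assumptions rather than the researchers' actual code; the combination rule follows the compositional conjunction (AND) operator associated with composable diffusion.

```python
import numpy as np

def compose_noise_predictions(eps_uncond, eps_per_concept, weights):
    """Conjunction (AND) operator over per-concept noise predictions:
    eps = eps_uncond + sum_i w_i * (eps_i - eps_uncond).

    eps_uncond: the unconditional model's noise prediction for the
    current noisy image; eps_per_concept: one prediction per concept
    (e.g. per text phrase); weights: guidance weight for each concept.
    """
    eps = eps_uncond.astype(float).copy()
    for eps_i, w in zip(eps_per_concept, weights):
        eps += w * (eps_i - eps_uncond)  # pull toward each concept
    return eps

# Toy example: two "concept" predictions on a 2x2 "image".
rng = np.random.default_rng(0)
eps_uncond = rng.standard_normal((2, 2))
eps_a = rng.standard_normal((2, 2))
eps_b = rng.standard_normal((2, 2))

combined = compose_noise_predictions(eps_uncond, [eps_a, eps_b], [1.0, 1.0])
```

In a full sampler, `combined` would replace the single model's noise estimate at each denoising step, so the emerging image is steered by every concept simultaneously without retraining any of the component models.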
From MIT News
Abstracts Copyright © 2022 SmithBucklin, Washington, DC, USA