Yichun Shi, Peng Wang, Jianglong Ye, Long Mai, Kejie Li, Xiao Yang
ByteDance and University of California San Diego, USA (2023)
This paper introduces MVDream, a multi-view diffusion model that generates consistent multi-view images from a text prompt. The model is trained on both 2D image data and 3D renderings, combining the generalizability of 2D diffusion models with the multi-view consistency afforded by 3D data. The authors note that existing 3D object generation methods, whether template-based pipelines or 2D-lifting techniques, struggle to maintain consistency across views. MVDream addresses this by serving as a 3D-aware prior within Score Distillation Sampling (SDS), improving both the consistency and the quality of the generated 3D assets. The paper details the architectural modifications to a pretrained 2D diffusion model, the training data and procedure, and extensive experiments against state-of-the-art methods. Results show that MVDream markedly improves multi-view consistency and quality in 3D generation, and that it also supports personalized 3D generation through a DreamBooth-style adaptation.
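For reference, the 2D-lifting setup the summary refers to optimizes 3D scene parameters $\theta$ with the standard SDS gradient from DreamFusion. A minimal sketch follows, assuming (as the summary describes but does not formalize) that the multi-view denoiser $\epsilon_\phi$ is additionally conditioned on camera parameters $c$; the notation $w(t)$, $g$, and $y$ is the common SDS convention, not defined in this summary:

$$
\nabla_\theta \mathcal{L}_{\text{SDS}} = \mathbb{E}_{t,\,c,\,\epsilon}\!\left[\, w(t)\,\big(\epsilon_\phi(x_t;\, y,\, c,\, t) - \epsilon\big)\, \frac{\partial x}{\partial \theta} \,\right], \qquad x = g(\theta, c),
$$

where $g$ renders the views $x$ from cameras $c$, $x_t$ is the noised rendering at diffusion timestep $t$, $y$ is the text embedding, $\epsilon \sim \mathcal{N}(0, I)$, and $w(t)$ is a timestep-dependent weight. Conditioning the denoiser on $c$ is what makes the prior 3D-aware: all views are scored jointly rather than independently.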
This paper employs the following methods:
- Multi-view diffusion: joint denoising of multiple views of an object, conditioned on camera parameters
- Score Distillation Sampling (SDS) using the multi-view diffusion model as a 3D-aware prior
- DreamBooth-style fine-tuning of the multi-view model for personalized 3D generation
The following datasets were used in this research:
- Objaverse (multi-view renderings of 3D objects for the 3D training data)
- LAION (2D text-image pairs, jointly trained to preserve generalizability)
The authors identified the following limitations:
- Generated images are limited to 256x256 resolution, below that of the base Stable Diffusion model
- Generalizability is bounded by the capabilities of the underlying 2D base model
- Generated styles can be biased toward the rendering style of the 3D training data