
DreamCube: 3D Panorama Generation via Multi-plane Synchronization

(2025)

Paper Information

arXiv ID

2506.17206

Contents

Abstract
Methods
Datasets
Results
Limitations
Related Work
External Resources

Abstract

Figure 1. In this work, we introduce Multi-plane Synchronization to generalize 2D diffusion models to multi-plane omnidirectional representations (i.e., cubemaps), and DreamCube for RGB-D cubemap generation. The proposed approaches can be applied to different tasks including RGB-D panorama generation, panorama depth estimation, and 3D scene generation.

Summary

The paper presents DreamCube, a framework that generates RGB-D cubemaps from single-view inputs using a technique called Multi-plane Synchronization. The work targets the problems that existing 2D diffusion models run into when applied to multi-plane panoramic representations. Multi-plane Synchronization adapts the model's spatial operators to preserve translation equivariance across planes, so the cubemap faces join seamlessly without relying on overlapping-FoV techniques, which can degrade image quality. Key contributions include an analysis of the limitations of existing methods and a synchronized generation approach that improves the quality of generated RGB-D scenes. Experiments cover RGB-D panorama generation, depth estimation, and 3D scene reconstruction, where the authors report better performance than existing models.
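The summary does not say which operators are adapted or how. As one illustration of the general idea, and as an assumption rather than the authors' implementation, the sketch below contrasts a normalization step computed separately per face with one whose statistics are pooled over all faces. The function names and toy values are hypothetical.

```python
import math

def stats(values):
    """Mean and (population) variance of a flat list of feature values."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def normalize(values, mean, var, eps=1e-5):
    """Standardize values with the given statistics."""
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def per_face_norm(faces):
    """Unsynchronized: each face is normalized with its own statistics."""
    return [normalize(f, *stats(f)) for f in faces]

def synced_norm(faces):
    """Synchronized: one mean/variance pooled over every face."""
    pooled = [v for f in faces for v in f]
    mean, var = stats(pooled)
    return [normalize(f, mean, var) for f in faces]

# Two toy "faces" whose adjacent border features share the value 2.0.
faces = [[0.0, 1.0, 2.0], [2.0, 5.0, 8.0]]
unsynced = per_face_norm(faces)
synced = synced_norm(faces)

# With per-face statistics the shared border value maps to different outputs;
# with pooled statistics it maps to the same output on both faces.
print(unsynced[0][2], unsynced[1][0])
print(synced[0][2], synced[1][0])
```

In this toy case the pooled version keeps the shared border value consistent across faces, which is the kind of cross-face agreement a synchronized operator is meant to provide.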

Methods

This paper employs the following methods:

Multi-plane Synchronization

Models Used

DreamCube
Stable Diffusion v2

Datasets

The following datasets were used in this research:

Structured3D
SUN360

Evaluation Metrics

FID
IS
δ-1.25
AbsRel
RMSE
MAE
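For reference, the four depth metrics in this list have standard definitions, sketched below in plain Python. The masking rule (skipping non-positive values) and the function name are assumptions made for the example; the paper's evaluation protocol, such as any scale alignment or valid-pixel mask, may differ.

```python
import math

def depth_metrics(pred, gt, threshold=1.25):
    """Compute AbsRel, RMSE, MAE and delta-1.25 over valid depth pairs.

    pred, gt: flat sequences of predicted and ground-truth depths.
    Pairs with a non-positive value are skipped (an assumed validity mask).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    n = len(pairs)
    if n == 0:
        raise ValueError("no valid depth pairs")
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    mae = sum(abs(p - g) for p, g in pairs) / n
    # Fraction of pixels whose ratio max(p/g, g/p) is below the threshold.
    delta = sum(1 for p, g in pairs if max(p / g, g / p) < threshold) / n
    return {"AbsRel": abs_rel, "RMSE": rmse, "MAE": mae, "delta_1.25": delta}

metrics = depth_metrics(pred=[1.0, 2.0, 4.0], gt=[1.0, 2.0, 2.0])
print(metrics)
```

Lower is better for AbsRel, RMSE, and MAE; higher is better for δ-1.25.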

Results

Improved RGB-D panorama generation
Enhanced depth estimation accuracy
Effective 3D scene reconstruction
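Turning an RGB-D cubemap into a 3D scene requires lifting each face pixel and its depth into a 3D point. The following is a minimal sketch of that step for 90° FoV faces, not the paper's reconstruction pipeline. The face orientation table, the axis convention (x right, y down, z forward), and the `depth_is_distance` switch are all assumptions made for illustration.

```python
import math

# Assumed (right, down, forward) basis per face; a hypothetical convention.
FACES = {
    "front": ((1, 0, 0), (0, 1, 0), (0, 0, 1)),
    "right": ((0, 0, -1), (0, 1, 0), (1, 0, 0)),
    "back": ((-1, 0, 0), (0, 1, 0), (0, 0, -1)),
    "left": ((0, 0, 1), (0, 1, 0), (-1, 0, 0)),
    "up": ((1, 0, 0), (0, 0, 1), (0, -1, 0)),
    "down": ((1, 0, 0), (0, 0, -1), (0, 1, 0)),
}

def backproject(face, u, v, depth, size, depth_is_distance=True):
    """Lift pixel (u, v) of a size x size cubemap face to a 3D point.

    depth_is_distance=True treats depth as distance from the center along
    the ray; False treats it as distance along the face's forward axis.
    """
    right, down, fwd = FACES[face]
    # Pixel centers mapped to [-1, 1]; a 90-degree face has tan(45 deg) = 1.
    x = 2.0 * (u + 0.5) / size - 1.0
    y = 2.0 * (v + 0.5) / size - 1.0
    ray = [x * r + y * d + f for r, d, f in zip(right, down, fwd)]
    if depth_is_distance:
        scale = depth / math.sqrt(sum(c * c for c in ray))
    else:
        scale = depth
    return tuple(scale * c for c in ray)

# A 1 x 1 face has a single pixel that looks straight along the face axis.
point = backproject("right", u=0, v=0, depth=2.0, size=1)
print(point)
```

Applying `backproject` to every pixel of all six faces yields a colored point cloud centered on the camera; which depth convention applies depends on how the depth maps were produced.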

Limitations

The authors identified the following limitations:

High computational cost
Restricted input conditions

Technical Requirements

Number of GPUs: 4
GPU Type: Nvidia L40S
Compute Requirements: batch size of 4; RGB images and depth maps at 512 × 512 resolution; training took approximately two days on four Nvidia L40S GPUs.

External Resources

References: 60
