← ML Research Wiki / 2404.07191

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

Jiale Xu (ARC Lab, Tencent PCG; ShanghaiTech University), Weihao Cheng (ARC Lab, Tencent PCG), Yiming Gao (ARC Lab, Tencent PCG), Xintao Wang (ARC Lab, Tencent PCG), Shenghua Gao (ShanghaiTech University), Ying Shan (ARC Lab, Tencent PCG) (2024)
Code: https://github.com/TencentARC/InstantMesh

Paper Information

  • arXiv ID: 2404.07191
  • Venue: arXiv.org
  • Domain: computer vision / 3D reconstruction
  • SOTA Claim: Yes
  • Code: https://github.com/TencentARC/InstantMesh
  • Reproducibility: 8/10

Abstract

To enhance the training efficiency and exploit more geometric supervisions, e.g., depths and normals, we integrate a differentiable iso-surface extraction module into our framework and directly optimize on the mesh representation. Experimental results on public datasets demonstrate that InstantMesh significantly outperforms other latest image-to-3D baselines, both qualitatively and quantitatively. We release all the code, weights, and demo of InstantMesh, with the intention that it can make substantial contributions to the community of 3D generative AI and empower both researchers and content creators.

Summary

InstantMesh is a framework for efficiently generating 3D meshes from single images, leveraging advances in large reconstruction models and multi-view diffusion. The model combines a multi-view diffusion component, which produces consistent novel views from the input image, with a sparse-view reconstruction model that generates the mesh directly, achieving high quality in a significantly reduced timeframe. The paper highlights the integration of a differentiable iso-surface extraction module, which lets the framework optimize on the mesh representation directly and apply geometric supervision (depths and normals), improving training efficiency and output quality. InstantMesh is evaluated against existing methods on public datasets, showing notable improvements on both qualitative and quantitative metrics, and aims to bolster the 3D generative AI community.
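
The two-stage pipeline described above can be sketched as follows. This is an illustrative stand-in only: the function names, tensor shapes, and stub computations are hypothetical and do not reflect the authors' actual API or models.

```python
import numpy as np

def multiview_diffusion(image, n_views=6):
    """Stage 1 (stand-in): generate n_views consistent novel views.

    A real system would run a multi-view diffusion sampler; here we
    simply tile the input image to illustrate the data flow.
    """
    return np.repeat(image[None], n_views, axis=0)

def sparse_view_lrm(views):
    """Stage 2 (stand-in): regress a triplane from the sparse views.

    A real LRM-style transformer maps view tokens to triplane features;
    here we mean-pool the views into a crude global feature.
    """
    feat = views.mean(axis=(0, 1, 2))                       # (3,)
    return np.broadcast_to(feat, (3, 64, 64, views.shape[-1]))

image = np.random.rand(256, 256, 3)       # single input image
views = multiview_diffusion(image)        # (6, 256, 256, 3)
triplane = sparse_view_lrm(views)         # (3, 64, 64, 3)
print(views.shape, triplane.shape)
```

In the actual framework, the triplane would then be decoded and meshed by the differentiable iso-surface extraction module.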

Methods

This paper employs the following methods:

  • Multi-view diffusion
  • Sparse-view reconstruction
  • Differentiable iso-surface extraction
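
The third method, iso-surface extraction, converts an implicit field into an explicit mesh so that losses can be applied on the mesh directly. As a minimal (non-differentiable) analogue of the core idea, the sketch below finds grid cells that straddle the zero level set of a sphere's signed distance field; grid size and shapes are illustrative choices, not the paper's settings.

```python
import numpy as np

# Sample a signed distance field for a unit sphere on a regular grid.
n = 32
xs = np.linspace(-1.5, 1.5, n)
x, y, z = np.meshgrid(xs, xs, xs, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 1.0   # negative inside the sphere

# A cell straddles the iso-surface if its 8 corners disagree in sign.
corners = np.stack([
    sdf[i:n - 1 + i, j:n - 1 + j, k:n - 1 + k]
    for i in (0, 1) for j in (0, 1) for k in (0, 1)
])
crossing = (corners.min(axis=0) < 0) & (corners.max(axis=0) > 0)
print(int(crossing.sum()))  # number of grid cells straddling the surface
```

Schemes such as FlexiCubes (used by the paper) make this extraction step differentiable, so gradients from rendered depths and normals can flow back into the implicit field.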

Models Used

  • InstantMesh

Datasets

The following datasets were used in this research:

  • Google Scanned Objects
  • OmniObject3D

Evaluation Metrics

  • PSNR
  • SSIM
  • LPIPS
  • Chamfer Distance
  • F-Score
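
The geometry metrics above (Chamfer Distance and F-Score) compare a predicted point set against a ground-truth point set. A minimal brute-force numpy implementation, suitable only for small point clouds, looks like this; the threshold `tau` is an illustrative choice, not the paper's evaluation setting.

```python
import numpy as np

def chamfer_and_fscore(pred, gt, tau=0.05):
    """Symmetric Chamfer distance and F-score between point sets.

    pred: (N, 3) predicted points; gt: (M, 3) ground-truth points.
    tau is the distance threshold used for the F-score.
    """
    # Pairwise distances, (N, M); O(N*M) memory, fine for small clouds.
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    d_pg = d.min(axis=1)          # each pred point -> nearest gt point
    d_gp = d.min(axis=0)          # each gt point -> nearest pred point
    chamfer = d_pg.mean() + d_gp.mean()
    precision = (d_pg < tau).mean()
    recall = (d_gp < tau).mean()
    fscore = 2 * precision * recall / max(precision + recall, 1e-8)
    return chamfer, fscore

pts = np.random.rand(100, 3)
c, f = chamfer_and_fscore(pts, pts)
print(c, f)   # identical clouds: chamfer 0.0, F-score 1.0
```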

Results

  • Significantly outperforms other image-to-3D baselines both qualitatively and quantitatively
  • Achieves high-quality mesh generation within 10 seconds

Limitations

The authors identified the following limitations:

  • Resolution bottleneck due to the triplane decoder
  • Multi-view inconsistency from the diffusion model affects generation quality
  • FlexiCubes less effective for thin structures

Technical Requirements

  • Number of GPUs: 8
  • GPU Type: NVIDIA H800

Keywords

single-image 3D reconstruction, mesh generation, diffusion models, transformer models, differentiable iso-surface extraction
