ByteDance-Seed/SimArt

[SIGGRAPH 2026] SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM

SIMART

Project Page ArXiv HuggingFace Weights Video Demo

🌟 Overview

SIMART is a unified MLLM framework that performs part-level decomposition and kinematic prediction jointly to transform monolithic meshes into sim-ready articulated assets.

  • Unified MLLM Framework: Offers a single-stage path to joint static asset understanding and sim-ready asset generation.
  • Sparse 3D VQ-VAE: Reduces the token count by 70% relative to dense voxel tokens, enabling high-fidelity multi-part assemblies.
  • Sim-Ready Assets: Generates structured URDF metadata and decomposed segments, enabling deployment into physics-based simulators and interactive robotic environments.
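
To make "structured URDF metadata" concrete, here is a minimal sketch of a URDF with two links and one revolute joint, built with Python's standard library. The asset, link names, joint origin, and limit values are all hypothetical and chosen for illustration only; this is not the exact output SIMART produces.

```python
import xml.etree.ElementTree as ET


def build_cabinet_urdf() -> str:
    """Build a tiny URDF: a fixed body with one hinged door (hypothetical asset)."""
    robot = ET.Element("robot", name="cabinet")
    ET.SubElement(robot, "link", name="body")
    ET.SubElement(robot, "link", name="door")

    # A revolute joint attaches the door to the body along a vertical (+Z) hinge.
    joint = ET.SubElement(robot, "joint", name="door_hinge", type="revolute")
    ET.SubElement(joint, "parent", link="body")
    ET.SubElement(joint, "child", link="door")
    ET.SubElement(joint, "origin", xyz="0.25 -0.2 0.0", rpy="0 0 0")
    ET.SubElement(joint, "axis", xyz="0 0 1")
    # Revolute joints need limits: angles in radians, plus effort and velocity.
    ET.SubElement(joint, "limit", lower="0.0", upper="1.57", effort="10.0", velocity="1.0")

    return ET.tostring(robot, encoding="unicode")


urdf_text = build_cabinet_urdf()
print(urdf_text)
```

Each decomposed part maps to a link, and each predicted kinematic relation maps to a joint with a type, an axis, and motion limits. This is the structure that physics simulators read.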

🔧 Installation

Our implementation is tested on Python 3.10.

conda create -n simart python=3.10
conda activate simart
pip install -r requirements.txt

📥 Model Weights

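Download the pre-trained checkpoints for the MLLM and the VQ-VAE from Hugging Face (see the HuggingFace Weights link at the top of this README).

Place the downloaded weights in ./checkpoints (the defaults are ./checkpoints/simart_mllm for the MLLM and ./checkpoints/simart_vqvae for the VQ-VAE), or specify custom paths with the --model_path and --vqvae_ckpt_dir inference arguments below.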

🚀 Inference

1. Data Preprocessing (Coordinate Alignment)

Our model is trained on 3D assets that follow a right-handed coordinate system:

  • Up Direction: +Z
  • Forward Direction: -Y (+Y also works, provided it is used consistently, since part orientation depends on it)

Pre-aligned Models: If your models are generated by Seed3D or Hunyuan3D, they are typically pre-aligned to the +Z up convention. You can run the normalization script directly without additional rotation arguments:

python scripts/process_raw_objects.py --input <object_path> --output ./assets --render

Manual Alignment: For models from other sources that might use +Y up, you must use the rotation flags to align them.

  • Important: Beyond just the "Up" direction, ensure the "Front" of the object faces the intended direction to help the MLLM correctly identify parts like "front legs" or "handles".
  • Reference: Please refer to the processed models in the assets/ directory for the standard orientation.
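
As a rough illustration of what converting a +Y-up model to the +Z-up convention means geometrically, the sketch below rotates points by +90 degrees about the X axis in a right-handed frame. The helper names are hypothetical. Whether --rot_x uses this same sign convention is an assumption the README does not confirm, so check the result with --render.

```python
import math


def rotate_about_x(vertex, degrees):
    """Rotate a 3D point about the X axis (right-handed, counter-clockwise)."""
    x, y, z = vertex
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return (x, y * c - z * s, y * s + z * c)


def y_up_to_z_up(vertex):
    """Map a +Y-up point into a +Z-up frame via a +90 degree rotation about X."""
    x, y, z = rotate_about_x(vertex, 90.0)
    # Round away floating-point noise so axis-aligned results compare cleanly.
    return (round(x, 9), round(y, 9), round(z, 9))


print(y_up_to_z_up((0.0, 1.0, 0.0)))  # where the old up axis (+Y) lands
print(y_up_to_z_up((0.0, 0.0, 1.0)))  # where the old +Z axis lands
```

Under this rotation the old +Y axis lands on +Z, and the old +Z axis lands on -Y. So a model whose front faces +Z in a +Y-up frame ends up facing -Y, which matches the forward direction listed above.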

Arguments:

  • --input: Path to the input raw object (.glb).
  • --output: Output directory for the normalized model.
  • --rot_x, --rot_y, --rot_z: Rotation angles in degrees to align the mesh.
  • --render: Renders a preview image so you can verify that the object stands upright and faces forward (highly recommended).
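
When many raw models need the same alignment, the command above can be assembled programmatically. The following is a minimal sketch that uses only the flags listed above. The helper names and the `raw_dir` folder are hypothetical, and it assumes the rotation flags take a numeric value (for example `--rot_x 90`) and that the script is launched from the repository root.

```python
import subprocess
from pathlib import Path


def build_preprocess_cmd(glb_path, output_dir="./assets",
                         rot_x=0.0, rot_y=0.0, rot_z=0.0, render=True):
    """Assemble the argument list for scripts/process_raw_objects.py."""
    cmd = ["python", "scripts/process_raw_objects.py",
           "--input", str(glb_path), "--output", str(output_dir)]
    # Only pass rotation flags that are non-zero, mirroring the pre-aligned case.
    for flag, angle in (("--rot_x", rot_x), ("--rot_y", rot_y), ("--rot_z", rot_z)):
        if angle:
            cmd += [flag, str(angle)]
    if render:
        cmd.append("--render")
    return cmd


def preprocess_folder(raw_dir, dry_run=True, **kwargs):
    """Build (and optionally run) one preprocessing command per .glb in raw_dir."""
    cmds = [build_preprocess_cmd(p, **kwargs)
            for p in sorted(Path(raw_dir).glob("*.glb"))]
    if not dry_run:
        for cmd in cmds:
            subprocess.run(cmd, check=True)  # stop at the first failing model
    return cmds


print(" ".join(build_preprocess_cmd("chair.glb", rot_x=90)))
```

Keeping `dry_run=True` by default lets you inspect the generated commands before anything is executed.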

2. Run Inference

To predict the articulated structure and generate the URDF of a processed 3D model, run the main inference pipeline:

python inference/infer.py --object_path ./assets/box_00.glb --debug

Arguments:

  • --object_path: Path to the object file or a folder containing multiple GLBs (Required).
  • --output_path: Directory to save outputs (Default: ./output/raw).
  • --name: Base name for outputs (JSON, URDF, PLY, folders). If not provided, it is derived from the object_path.
  • --model_path: Path to the trained MLLM checkpoint directory (Default: ./checkpoints/simart_mllm).
  • --vqvae_ckpt_dir: Path to the VQ-VAE checkpoint directory (Default: ./checkpoints/simart_vqvae).
  • --blender_path: Custom path to the Blender executable. If not provided, Blender is downloaded automatically to /tmp.
  • --debug: Enable debug mode to output intermediate visualizations (colored PLY files, joint axes, etc.).
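
After inference, a quick way to sanity-check a generated URDF is to list its joints. The sketch below parses a hand-written sample with the standard library. The sample content and the `list_joints` helper are illustrative assumptions, and the exact file names written under --output_path are not specified here.

```python
import xml.etree.ElementTree as ET


def list_joints(urdf_text):
    """Return (name, type, parent, child, axis) for every joint in a URDF string."""
    root = ET.fromstring(urdf_text)
    joints = []
    for joint in root.findall("joint"):
        axis = joint.find("axis")
        joints.append((
            joint.get("name"),
            joint.get("type"),
            joint.find("parent").get("link"),
            joint.find("child").get("link"),
            # Not every joint declares an axis (e.g. fixed joints), so allow None.
            axis.get("xyz") if axis is not None else None,
        ))
    return joints


# Hand-written sample; a real file would be read from the output directory.
SAMPLE_URDF = """
<robot name="box">
  <link name="base"/>
  <link name="lid"/>
  <joint name="lid_hinge" type="revolute">
    <parent link="base"/>
    <child link="lid"/>
    <axis xyz="1 0 0"/>
    <limit lower="0" upper="1.57" effort="10" velocity="1"/>
  </joint>
</robot>
"""

for row in list_joints(SAMPLE_URDF):
    print(row)
```

For a file on disk, read its text first (for example with `pathlib.Path(...).read_text()`) and pass the result to `list_joints`.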

📁 Repository Structure

SIMART/
├── assets/               # Sample 3D GLB assets
├── blender_script/        # Scripts for headless Blender rendering
├── inference/             # Main MLLM inference pipeline
├── scripts/               # Data preprocessing scripts
├── utils/                 # Modular utility functions (mesh, URDF, parsing, etc.)
└── vqvae/                 # Sparse VQ-VAE model definitions

License

This project is licensed under the Apache 2.0 License.

Citation

If you find our work helpful, please cite it as:

@article{zhang2026simart,
  title={SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM},
  author={Zhang, Chuanrui and Qin, Minghan and Wang, Yuang and Xie, Baifeng and Li, Hang and Wang, Ziwei},
  journal={arXiv preprint arXiv:2603.23386},
  year={2026}
}
