

Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes [CVPR 2024]


In this paper, we democratise 3D content creation, enabling precise generation of 3D shapes from abstract sketches while overcoming limitations tied to drawing skills. We introduce a novel part-level modelling and alignment framework that facilitates abstraction modelling and cross-modal correspondence. Leveraging the same part-level decoder, our approach seamlessly extends to sketch modelling by establishing correspondence between CLIPasso edgemaps and projected 3D part regions, eliminating the need for a dataset pairing human sketches and 3D shapes. Additionally, our method introduces a seamless in-position editing process as a byproduct of cross-modal part-aligned modelling. Operating in a low-dimensional implicit space, our approach significantly reduces computational demands and processing time.


We synthesize our own dataset by:

Sample data is available in data/sample/


Training is a two-stage process: first the sketch segmentation network is trained, then the diffusion model for conditional shape generation. Before training either network, the dataset has to be created in the form of sketches of 3D shape projections and their segment maps.

To train the segmentation network, run:

sh scripts/segment_train.sh

To precompute sketch encodings (for fast diffusion model training), run:

sh scripts/precompute.sh

To train the diffusion model, run:

sh scripts/diffusion_train.sh
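
The three commands above can be chained into a single run. The following is a minimal sketch, assuming the scripts are run from the repository root in the order listed; the `run_pipeline` function name and the `--dry-run` flag are illustrative and not part of the repository.

```shell
#!/bin/sh
# Sketch: run the three training stages in the order given above.
# run_pipeline and --dry-run are hypothetical helpers, not repository code.

run_pipeline() {
    dry_run=0
    if [ "${1:-}" = "--dry-run" ]; then
        dry_run=1
    fi

    for stage in scripts/segment_train.sh scripts/precompute.sh scripts/diffusion_train.sh; do
        if [ "$dry_run" -eq 1 ]; then
            # Only print the command that would be executed.
            echo "sh $stage"
        else
            echo "Running $stage"
            # Abort on failure so a later stage never starts after a failed one.
            sh "$stage" || { echo "Stage failed: $stage" >&2; return 1; }
        fi
    done
}

# Preview the commands without launching any training.
run_pipeline --dry-run
```

Calling `run_pipeline` without arguments would launch the actual training stages one after another; the dry run only prints the three commands.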

To scale training, change the size parameter of the diffusion model. In my preliminary tests, increasing size to 1024 results in a larger, 73M-parameter diffusion model, but the shape quality goes up.