Hi, I’m Thomas, a Mechanical/Robotics Engineer turned Software & ML Engineer.
This is an experiment to create 3D models using a Small Language Model.
Armed with GPU credits from CloudRift & Prime Intellect, plus the generous free usage from Hugging Face, I set out to build a language model that generates 3D files: CADMonkey.
At Starmind, we need our models to be small enough to run on Raspberry Pi 4 & 5. The base language model of choice is Gemma3-1B, chiefly because at 1B parameters it fits that footprint.
We also briefly considered a diffusion model, but the development complexity was too high. Perhaps we will revisit that idea another day.
The model generates OpenSCAD code, which is then rendered into 3D models.
Why OpenSCAD? As a Mechanical Engineer, I find that traditional voxel- and mesh-based 3D models bring little to no value. Engineering requires constant revisions and iterations, and shape-based, code-based models are ideal for that.
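To make that concrete, here is a minimal sketch of what "code-based" means in practice: the model emits plain-text OpenSCAD source, and the stock openscad CLI renders it headlessly. The bracket geometry and file names below are illustrative, not taken from the CADMonkey pipeline.

```python
import pathlib
import subprocess
import tempfile

# A hypothetical parametric part: changing `thickness` revises the whole model.
SCAD_SOURCE = """
thickness = 4;
difference() {
    cube([40, 30, thickness]);
    translate([20, 15, -1]) cylinder(h = thickness + 2, r = 5, $fn = 64);
}
"""

def render_scad(source: str, out_path: str = "part.stl") -> None:
    """Write OpenSCAD source to disk and render it headlessly with the openscad CLI."""
    scad_file = pathlib.Path(tempfile.mkdtemp()) / "part.scad"
    scad_file.write_text(source)
    # Requires OpenSCAD to be installed and on PATH.
    subprocess.run(["openscad", "-o", out_path, str(scad_file)], check=True)

if __name__ == "__main__":
    render_scad(SCAD_SOURCE)
```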
Your model is only as good as its dataset. But the definition of “good” is relative to the task at hand.
Here are the attempts we made at creating a dataset:
#1: There is a ~7,000-row open-source OpenSCAD dataset on Hugging Face, scraped from Thingiverse (redcathode/thingiverse-openscad). However, we had a few issues with it.
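If you want to inspect the scrape yourself, it is one `datasets` call away (a sketch; the exact column names depend on the dataset's schema):

```python
from datasets import load_dataset

# The Thingiverse scrape mentioned above, hosted on the Hugging Face Hub.
ds = load_dataset("redcathode/thingiverse-openscad", split="train")
print(len(ds))   # roughly 7k rows
print(ds[0])     # inspect one record's fields
```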
#2: Synthetic data generation is the method we chose to go with.
The result is the following dataset: https://kdataset.web.app
To our knowledge, this is the first large-scale synthetically generated & reviewed OpenSCAD dataset on the internet: ThomasTheMaker/Synthetic-Object-v3 (35k rows of verified data).
This would not have been possible without the grants provided by CloudRift. Thanks a bunch!
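Conceptually, the generation loop pairs a teacher LLM with a render check, keeping only code that OpenSCAD accepts. A sketch under those assumptions (the prompts, field names, and the `teacher` callable are illustrative, not the exact pipeline):

```python
import pathlib
import subprocess
import tempfile

def compiles(scad_code: str) -> bool:
    """Verification step: keep a sample only if OpenSCAD can actually render it."""
    scad_file = pathlib.Path(tempfile.mkdtemp()) / "candidate.scad"
    scad_file.write_text(scad_code)
    result = subprocess.run(
        ["openscad", "-o", str(scad_file.with_suffix(".stl")), str(scad_file)],
        capture_output=True,
    )
    return result.returncode == 0

def generate_sample(teacher, object_name: str) -> dict | None:
    # `teacher` is any callable mapping a prompt string to a completion string.
    code = teacher(f"Write OpenSCAD code for a {object_name}. Output only code.")
    if not compiles(code):
        return None  # discard candidates that fail to render
    return {"prompt": f"hey cadmonkey, make me a {object_name}", "completion": code}
```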
After fine-tuning the model on this dataset, we found that the generated shapes rarely matched the requested object. As a matter of fact, only 1 in 400 models matched. See below for the only good object created, the duck:
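Judging whether a shape "matches" still means eyeballing renders, so a small batch script helps: OpenSCAD's CLI can export PNG previews directly (the directory layout here is an assumption):

```python
import pathlib
import subprocess

scad_dir = pathlib.Path("outputs")  # one .scad file per generated model (assumed layout)
for scad in sorted(scad_dir.glob("*.scad")):
    # `-o <name>.png` makes OpenSCAD render a preview image; --imgsize sets its resolution.
    subprocess.run(
        ["openscad", "-o", str(scad.with_suffix(".png")), "--imgsize=512,512", str(scad)],
        capture_output=True,
    )
```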
#3: Scaling dataset horizontally
We tried scaling up with a dataset covering more distinct objects, but the same issue of non-matching outputs persisted.
#4: Scaling up dataset vertically
Model performance only truly improved when we scaled the dataset vertically, i.e., with more examples per object rather than more objects. You can see the improvement below:
There are many things we tried that did not work, and we hope this list helps you avoid wasting time & effort:
First, we tried generating data with Claude Sonnet & Haiku models on AWS Bedrock. The cost was estimated at $40-60 based on token counts, but due to reasoning tokens it came out to $170, while the output barely surpassed open models like Kimi-K2 (non-thinking) and DeepSeek-V2.
Second, we tried generating the list of object names from word libraries and dictionaries. This was a bad idea: the list was quite random, containing objects that the base model had no knowledge of (see the sketch below).
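For context, here is roughly what that looks like, sketched with NLTK's WordNet as the dictionary (an assumption; we are not claiming this is the exact library used):

```python
import random

from nltk.corpus import wordnet as wn  # pip install nltk; then nltk.download("wordnet")

# Collect every noun lemma in WordNet and sample a handful.
nouns = {
    lemma.name().replace("_", " ")
    for synset in wn.all_synsets("n")
    for lemma in synset.lemmas()
}
print(random.sample(sorted(nouns), 5))
# The sample mixes everyday objects with obscure taxonomy terms that a 1B
# model has never seen described as a 3D shape.
```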
With the dataset ready, we fine-tuned the Gemma3-1B model on the datasets with the following prompt:
‘hey cadmonkey, make me a {object name}’
This was done using Unsloth's 4-bit fine-tuning.
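For reference, a condensed sketch of what that training setup can look like. Only "Unsloth, 4-bit, Gemma3-1B" comes from this write-up; the model ID, dataset field names, and hyperparameters are assumptions:

```python
from unsloth import FastLanguageModel  # import unsloth first so its patches apply

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load Gemma3-1B in 4-bit for LoRA fine-tuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-1b-it",  # assumed model ID
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("ThomasTheMaker/Synthetic-Object-v3", split="train")

def to_text(row):
    # Pair the fixed trigger prompt with the verified OpenSCAD completion.
    # `object_name` and `code` are assumed field names.
    prompt = f"hey cadmonkey, make me a {row['object_name']}"
    return {"text": f"{prompt}\n{row['code']}{tokenizer.eos_token}"}

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset.map(to_text),
    args=SFTConfig(output_dir="cadmonkey", dataset_text_field="text"),
)
trainer.train()
```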
The output model is converted into a GGUF model with q8 quantization.
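Unsloth can emit the GGUF directly; a one-liner sketch (`q8_0` is llama.cpp's name for 8-bit quantization, and the output directory is arbitrary):

```python
# Export the fine-tuned model straight to GGUF for llama.cpp-compatible runtimes.
model.save_pretrained_gguf("cadmonkey-gguf", tokenizer, quantization_method="q8_0")
```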
Everything is available here: https://hf.co/collections/ThomasTheMaker/cadmonkey
I’m using Modal to host the model. Since the model is small, it runs just fine even on CPUs, Raspberry Pis, etc.
For speed optimization, I’m using a T4 GPU on Modal, which gives awesome output speed, although only about 8% of the GPU is utilized.
On average, each prompt costs 2 cents to run.
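The serving side is a thin wrapper around the GGUF. A minimal sketch of what hosting it on Modal can look like (the app name, model path, and llama-cpp-python as the runtime are my assumptions, not the actual deployment):

```python
import modal

image = modal.Image.debian_slim().pip_install("llama-cpp-python")
app = modal.App("cadmonkey-server")

@app.function(gpu="T4", image=image)
def generate(object_name: str) -> str:
    from llama_cpp import Llama

    # In a real deployment the weights would be baked into the image or
    # mounted from a modal.Volume; the path here is illustrative.
    llm = Llama(model_path="/models/cadmonkey-q8_0.gguf", n_gpu_layers=-1)
    out = llm(f"hey cadmonkey, make me a {object_name}", max_tokens=1024)
    return out["choices"][0]["text"]
```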
Try out the app at https://cadmonkey.web.app
I know it’s cliche, but you can just make things!
5 years ago, it would have cost me 5 figures and a team of 20 scientists to achieve this.
Now, I ran the whole experiment over 3 weekends, using $500 in credits from various sources.
Up until now, my knowledge of language models came from one year of obsessive self-teaching.
You really can just do things. You just have to be crazy enough to start.
The dataset & model files:
The datasets used in this experiment are named “Synthetic-Openscad-v**”
Dataset generation & training code:
Awesome compute sponsors (:
If you have any questions, ask me here: https://discord.gg/XT23RZ5U6R