How Far Can We Take LLM-Driven CAD?

My journey toward making a home replicator.

This post explores LLM-to-CAD: using language models to generate 3D CAD models. Sparked by a personal project, GrandpaCAD, it covers current capabilities, spatial reasoning challenges, a custom birdhouse benchmark, the competitive landscape, and multimodal AI's potential to refine 3D generation from text.

So, What's LLM-to-CAD Anyway?

LLM-to-CAD uses Large Language Models (like ChatGPT) to generate 3D CAD models from text. You describe an object, like 'a simple birdhouse,' and the AI attempts to create its 3D design. It's cool... mostly.

My motivation is personal: my grandpa loves 3D printing but not complex modeling software. GrandpaCAD aims to let users create 3D models via text. Inspired by Simon Willison's 'pelican on a bicycle' LLM benchmark, our own benchmark uses a 3D printable birdhouse to test AI's CAD capabilities - a practical item for grandpa.

What Can LLMs Actually Do in CAD Today?

So, where are we at? LLMs can whip up simple stuff in one go. A cube? A sphere? Usually, no problem. But for anything more complex... well, it takes some serious hand-holding. Lots of careful prompting. Our benchmark often starts with one-shot prompts (giving the AI a single worked example to follow), but the idea with GrandpaCAD is to lean into simpler, perhaps few-shot, design interactions. It's a journey of trial and error, that's for sure.
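To make "simple stuff in one go" concrete, here's the shape of output a well-prompted model can usually manage: a short OpenSCAD script for a hollow box with an entrance hole. This is my own illustrative sketch, not actual model output; the `birdhouse_scad` helper, its dimensions, and the template approach are all assumptions for the example.

```python
def birdhouse_scad(width=100, depth=100, height=120, wall=4, hole_d=32):
    """Emit OpenSCAD source for a minimal birdhouse:
    a hollow box with a circular entrance hole in the front wall.
    All dimensions are in millimetres and purely illustrative."""
    return f"""
difference() {{
    // outer shell
    cube([{width}, {depth}, {height}]);
    // hollow interior (leave walls and a floor)
    translate([{wall}, {wall}, {wall}])
        cube([{width - 2 * wall}, {depth - 2 * wall}, {height}]);
    // entrance hole punched through the front wall
    translate([{width / 2}, 0, {height * 0.6}])
        rotate([-90, 0, 0])
            cylinder(h={wall * 2}, d={hole_d}, $fn=64);
}}
""".strip()
```

A script this size is squarely in the "one go" category; it's once you add roofs, perches, and drainage holes that the hand-holding starts.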

The pace of new models coming out is just dizzying. That's a big reason I started my own benchmark. I wanted a consistent way to see if the latest "groundbreaking" model was actually any better for 3D CAD than the last one.

Here's a glimpse of some models I've found promising (or at least interesting!) for this task:

Models on My Radar

I'm constantly testing, but here are a few that have shown some promise:

  • Model A (e.g., GPT-4o with specific CAD plugins/prompts)
  • Model B (e.g., Claude 3.7 for code generation that can be adapted)
  • Model C (Perhaps a specialized open-source model)

This list changes fast, so keep an eye on the benchmark results!

And here's a visual comparison – same prompt, different AI outputs. It's... illuminating.

Image Placeholder: Comparison of AI-generated models

(Imagine a side-by-side of a few birdhouse attempts here!)

Birdhouse Benchmark

My benchmark isn't super scientific, not like some academic papers. It's more... practical. It works like this: I give different LLMs the same set of prompts, starting simple and getting more complex, all aiming for that 3D birdhouse. Then, I evaluate the results. Sometimes it's a clear win, sometimes it's a "what even IS that?" moment.
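The harness behind this is nothing fancy. A minimal sketch, assuming a caller-supplied `ask_model(model, prompt)` function that wraps whatever API each model exposes (the prompt texts and the `BenchmarkRun` structure here are illustrative, not my exact prompts):

```python
from dataclasses import dataclass, field

@dataclass
class BenchmarkRun:
    """One model's attempts at the escalating birdhouse prompts."""
    model: str
    outputs: list = field(default_factory=list)

# Prompts escalate from a primitive toward the full birdhouse.
PROMPTS = [
    "Generate OpenSCAD code for a 100 mm cube.",
    "Hollow the cube out, leaving 4 mm walls and a floor.",
    "Add a 32 mm circular entrance hole and a sloped roof.",
]

def run_benchmark(models, ask_model):
    """ask_model(model, prompt) -> generated CAD code.
    Runs every model against every prompt and collects the outputs
    for later (human) evaluation."""
    runs = []
    for model in models:
        run = BenchmarkRun(model=model)
        for prompt in PROMPTS:
            run.outputs.append(ask_model(model, prompt))
        runs.append(run)
    return runs
```

The scoring step stays human: the collected outputs get rendered and put in front of voters.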

And this is where you come in! I've built tools where you can see these AI-generated designs and vote for the one you think is best (or least worst, sometimes!). It's all part of figuring out what works. You can check out the ongoing results and even cast your vote over at the benchmark results page.

The Big Challenge: AI's 3D Blind Spot (And How We Might Fix It)

One of the biggest hurdles? LLMs don't really "understand" 3D space like we do. They're amazing with words, pretty good with code, but true spatial reasoning? That's still a frontier. They might generate code for a 3D model that looks fine on paper (or... screen?) but ends up being a non-manifold mess or just geometrically impossible.
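You can actually catch the "non-manifold mess" case cheaply: in a closed, watertight triangle mesh, every edge must be shared by exactly exactly two faces. Here's a stdlib-only sketch of that check (real tools like mesh repair libraries do far more, but this is the core idea):

```python
from collections import Counter

def is_watertight(triangles):
    """triangles: list of (v0, v1, v2) vertex-index tuples.
    A closed 2-manifold mesh has every undirected edge
    shared by exactly two faces; any other count means
    a hole or a non-manifold edge."""
    edges = Counter()
    for a, b, c in triangles:
        for edge in ((a, b), (b, c), (c, a)):
            edges[tuple(sorted(edge))] += 1
    return all(count == 2 for count in edges.values())

# A lone triangle: each edge borders only one face -> not printable.
assert not is_watertight([(0, 1, 2)])

# A tetrahedron: 4 faces, 6 edges, each shared twice -> watertight.
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
assert is_watertight(tetra)
```

A check like this runs in milliseconds, so it can gate AI output before anything ever reaches a slicer.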

But here's where it gets exciting. The future could involve a feedback loop with multimodal AIs. Imagine an AI that not only generates the CAD code but can also "look" at a render of the 3D model (like a picture) and say, "Hmm, that wall is too thin," or "That part isn't connected." This kind of visual feedback could be a game-changer.

Image Placeholder: How AI "sees" a 3D model's issues

(A diagram showing an AI analyzing an image of a flawed 3D model could go here.)
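That feedback loop is easy to sketch in code, even though the hard parts live inside the models. Everything here is hypothetical: `generate`, `render`, and `critique` are stand-ins for an LLM call, a headless renderer, and a vision-model call respectively.

```python
def refine(prompt, generate, render, critique, max_rounds=3):
    """Generate CAD code, render it, ask a vision model what's
    wrong, and fold that feedback into the next attempt.
    All three callables are caller-supplied stand-ins for real
    model/renderer APIs."""
    code = generate(prompt)
    for _ in range(max_rounds):
        image = render(code)
        feedback = critique(image)   # e.g. "that wall is too thin"
        if not feedback:             # empty feedback: looks good, stop
            break
        code = generate(f"{prompt}\nFix these issues: {feedback}")
    return code
```

The `max_rounds` cap matters: without it, a model that keeps "fixing" its own fixes can loop forever.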

It's a bit like teaching a kid to build with blocks. First, they just stack 'em. Then, with some guidance (and a few tumbles), they start to understand balance, structure, and how things fit together.

So, Where Do We Go From Here?

The vision for GrandpaCAD, and for LLM-to-CAD in general, is pretty ambitious: you describe what you want, and a few hours later, you're holding a 3D printed version of it. It's a long road, though. Both the LLMs themselves and 3D printing tech need to keep evolving.

But I'm optimistic. Every new model, every new technique, gets us a tiny bit closer. And honestly, it's a fascinating journey. It's about democratizing creation, making it easier for anyone – even my grandpa – to bring their ideas to life in three dimensions. And that, I think, is pretty darn cool.

Thanks for reading my ramblings. Hopefully, it gives you a sense of where things are and where they might be headed. It's a rapidly changing field, so who knows what the next six months will bring? Maybe that perfect AI-designed birdhouse is just around the corner... or maybe it'll be a pelican on a tricycle this time. We'll see!

Who Else is Doing This? The Competitive Landscape

I'm not the only one tinkering in this space, of course. There are some big players. You've got companies like Zoo.dev, who are tackling this for serious, enterprise-level CAD. Think industrial parts, complex machinery. That's a whole different ballgame.

There are also a few open-source projects out there, but many seem to be... well, let's just say "less active" recently. It's a tough nut to crack.

GrandpaCAD is different. We're not aiming for aerospace components here. The idea is simpler, more accessible designs. Think hobbyist projects, custom trinkets, things for your grandkids. It's about making 3D creation fun and easy, not another chore to learn. We're focusing on those "few-shot" designs – quick, iterative, and hopefully, delightful.