This has been easy with OpenSCAD for a long time. I have made lots of cool, complex models this way. I built a repo of the prompts I use to show the LLM how to do this, and it includes many of the models I've created this way...
https://github.com/cjtrowbridge/vibe-modeling
Same. Working with an LLM and OpenSCAD has been totally painless.
Ideally it would tie in with an LLM, no? You would want to be able to say something like "create a design of a car suspension subject to constraints x, y, z".
The input is images and the output is CAD models, so it appears you could use a multi-modal LLM to go natural language -> image -> CAD.
Another take on this problem is zoo.dev. They wrote a brand-new, from-scratch CAD engine that is driven by a custom OpenSCAD-style language called KCL.
They then have a trained LLM that can generate KCL, either to create new parts or to act as an assistant for changes to existing parts.
It's neat that LLMs can do 3D, but I wonder how much of the problem is integration.
It says "can convert cad latents into a sequence of parametric CAD commands"
Which CAD program? I'm confused
Am I reading this right?
> Most importantly, GenCAD does not merely generate a 3D solid but also the entire CAD program.
> Which CAD program? I'm confused
Clue here:
> Our proposed GenCAD architecture...
So, at this point, it seems like this will work with all CAD programs, since they have yet to encounter any systems that they can't work with. More seriously, my guess would be whatever one is available for free in their lab. Kind of standard operating procedure for academic projects -- do a proof of concept, make a video that avoids known bugs, get a grade, push source to git, graduate. Good ideas come out of that... production code... eh... maybe.
More likely, someone ends up in the situation my kid did: the previous graduate student's git repo is stale by 2 versions of C++ and 4 versions of ROS, and neither of the two unit tests still works after porting.
It looks like it's DeepCAD* output: a JSON payload describing the sketch / extrude / whatever steps, which is itself based on Onshape output.
Looks like you can go JSON -> STEP files, but not really in such a way that you can modify any of the operations.
* https://github.com/mightyhorst/DeepCAD
> Which CAD program?
Doesn't matter. CAD models/objects are represented by a sequence of operations on a primitive or sketch, unlike meshes, which only describe the resulting shape of objects in 3D programs like Blender.
So the point is that their model outputs that hierarchy of operations: the history of development, not just the result.
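To make the "history, not just the result" point concrete, here is a minimal Python sketch. The `RectSketch` and `Extrude` classes and the `volume` helper are made up for illustration; they are not GenCAD's or DeepCAD's actual representation. The idea is only that a part stored as a list of operations can be edited by changing one parameter and replaying the list, which a bare mesh does not let you do.

```python
from dataclasses import dataclass, replace


# Hypothetical operation types, invented for this illustration only.
@dataclass(frozen=True)
class RectSketch:
    width: float
    height: float


@dataclass(frozen=True)
class Extrude:
    depth: float


def volume(history):
    """Replay the operation history and return the resulting solid's volume."""
    area = 0.0  # area of the most recent sketch
    total = 0.0  # accumulated volume of all extrusions
    for op in history:
        if isinstance(op, RectSketch):
            area = op.width * op.height
        elif isinstance(op, Extrude):
            total += area * op.depth
        else:
            raise TypeError(f"unknown operation: {op!r}")
    return total


# The "CAD program": a sketch followed by an extrusion.
history = [RectSketch(width=10.0, height=5.0), Extrude(depth=2.0)]

# Editing the design means changing one parameter and replaying the history.
edited = [replace(history[0], width=20.0), history[1]]

print(volume(history), volume(edited))
```

A mesh of the first part would just be a list of vertices and faces, so widening it means moving vertices by hand rather than changing `width`.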
How does it not matter? Not every CAD program has the same interface and commands. I doubt, for example, that this will generate an OpenSCAD text file.
It could be used as pseudo-code for LLMs to produce specific CAD commands?
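As a rough sketch of what that translation step could look like, assuming a made-up JSON schema (the `op`, `width`, `height`, `radius` and `depth` keys are hypothetical, not the DeepCAD format), a small Python function can turn a generic sketch/extrude sequence into OpenSCAD text. Only `square`, `circle` and `linear_extrude` are used on the OpenSCAD side.

```python
import json

# Hypothetical command payload; the schema is an assumption for illustration.
payload = json.loads("""
[
  {"op": "sketch_rect", "width": 10, "height": 5},
  {"op": "extrude", "depth": 2}
]
""")


def to_openscad(commands):
    """Translate a generic sketch/extrude sequence into OpenSCAD source text."""
    lines = []
    pending = None  # 2D shape waiting to be extruded
    for cmd in commands:
        if cmd["op"] == "sketch_rect":
            pending = f"square([{cmd['width']}, {cmd['height']}]);"
        elif cmd["op"] == "sketch_circle":
            pending = f"circle(r = {cmd['radius']});"
        elif cmd["op"] == "extrude":
            if pending is None:
                raise ValueError("extrude without a preceding sketch")
            lines.append(f"linear_extrude(height = {cmd['depth']}) {pending}")
            pending = None
        else:
            raise ValueError(f"unsupported op: {cmd['op']}")
    return "\n".join(lines)


scad = to_openscad(payload)
print(scad)
```

Every target CAD program would need its own emitter like this, which is roughly why the question of what the model actually outputs still matters.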
It could be anything, which is why the question was asked what it actually outputs. I skimmed the page and code but couldn't see what the output was.
Is this Google-affiliated? The heading font is Product/Google Sans, which IIRC only Alphabet is allowed to use, and the entire webpage seems to be Google-style, but neither of the two named researchers seems to be employed by Google.
Per https://fonts.google.com/specimen/Google+Sans/license
"These fonts are licensed under the Open Font License. You can use them in your products & projects – print or digital, commercial or otherwise."