Abstract
We introduce BrickGPT, the first approach for generating physically stable toy brick models from text prompts. To achieve this, we construct a large-scale, physically stable dataset of brick designs, along with their associated captions, and train an autoregressive large language model to predict the next brick to add via next-token prediction. To improve the stability of the resulting designs, we employ an efficient validity check and physics-aware rollback during autoregressive inference, which prunes infeasible token predictions using physics laws and assembly constraints. Our experiments show that BrickGPT produces stable, diverse, and aesthetically pleasing brick designs that align closely with the input text prompts. We also develop a text-based brick texturing method to generate colored and textured designs. We show that our designs can be assembled manually by humans and automatically by robotic arms. We also release our new dataset, StableText2Brick, containing over 47,000 brick structures of over 28,000 unique 3D objects accompanied by detailed captions, along with our code and models.
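The abstract describes pruning infeasible next-brick predictions and rolling back to a stable prefix during autoregressive inference. The following is a minimal sketch of such an inference loop, under assumptions: `sample_brick` stands in for the language model's next-brick prediction, the `(x, y, z, width, depth)` brick encoding is illustrative, and `is_valid`/`is_stable` are simplified placeholders for the paper's assembly and physics checks; none of these names come from the released BrickGPT code.

```python
# Sketch of validity checking and physics-aware rollback during autoregressive
# inference. All names and the brick format are illustrative assumptions.
import random
from typing import Callable, List, Tuple

Brick = Tuple[int, int, int, int, int]  # (x, y, z, width, depth) on a fixed grid


def is_valid(brick: Brick, structure: List[Brick]) -> bool:
    """Assumed validity check: brick stays inside the build volume and is not
    a duplicate of an already-placed brick (collision handling simplified)."""
    x, y, z, w, d = brick
    inside = 0 <= x and 0 <= y and 0 <= z and x + w <= 20 and y + d <= 20 and z < 20
    return inside and brick not in structure


def is_stable(structure: List[Brick]) -> bool:
    """Stand-in for a physics-based stability analysis of the whole structure:
    every brick must rest on the ground or have some brick one layer below."""
    return all(
        z == 0 or any(b[2] == z - 1 for b in structure)
        for (_, _, z, _, _) in structure
    )


def generate(sample_brick: Callable[[List[Brick]], Brick],
             max_bricks: int = 50, max_retries: int = 10) -> List[Brick]:
    structure: List[Brick] = []
    checkpoints: List[int] = [0]  # lengths of known-stable prefixes
    for _ in range(max_bricks):
        for _ in range(max_retries):
            brick = sample_brick(structure)       # next-brick prediction
            if not is_valid(brick, structure):
                continue                          # prune infeasible predictions
            structure.append(brick)
            if is_stable(structure):
                checkpoints.append(len(structure))
                break
            structure.pop()                       # reject an unstable addition
        else:
            # Physics-aware rollback: discard recent bricks and resume
            # generation from the previous stable prefix.
            if len(checkpoints) > 1:
                checkpoints.pop()
            structure = structure[:checkpoints[-1]]
    return structure


if __name__ == "__main__":
    rng = random.Random(0)

    def random_brick(structure: List[Brick]) -> Brick:
        # Toy stand-in for the model: propose a random 2x2 brick.
        return (rng.randrange(18), rng.randrange(18), rng.randrange(5), 2, 2)

    print(len(generate(random_brick)), "bricks placed")
```

In this sketch, rejection sampling handles locally infeasible bricks, while the checkpoint stack lets generation back up to an earlier stable prefix when no valid continuation is found; the actual system may interleave these checks with token-level decoding differently.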
StableText2Brick dataset

BrickGPT pipeline

Step-by-step generation of brick structures from text

Automated assembly of generated brick structures using robots (8x speed)

Generated brick structures assembled by humans