Granny Squares
AI can convincingly imitate crochet diagrams, but without the ability to execute structured instructions, it fails to produce patterns that actually work.
When Code Does Not Compile
Knitting and crochet patterns function as a form of code, but unlike software, they must compile in the physical world. A valid pattern is not one that reads well. It is one that produces a structurally sound object. Current AI systems can generate instructions that appear correct, yet fail at this fundamental test because they do not enforce the constraints that make patterns work.
In practice, that code is unforgiving. Patterns define structure through sequence, repetition, and constraint. A pattern specifies how many stitches enter a row, how they transform, and how they resolve. Each instruction depends on the last. Errors accumulate. A single misplaced stitch propagates forward and alters the entire object. Makers have always understood this. They read the fabric as carefully as the instructions, correcting mistakes as they appear, ensuring the pattern runs in the real world.
From the earliest public releases of ChatGPT, this domain exposed a limitation. The system produced patterns that looked correct, often polished and confident in tone. Yet practitioners quickly found that these instructions failed in execution. Stitch counts drifted. Repeats did not close. Corners in a square did not resolve. The pattern read like code, but it did not compile.
Some years ago, coverage from CNN documented this clearly. A crocheter followed AI-generated instructions and found that the finished object diverged from the intended design. The language passed inspection. The artifact did not. The gap between description and execution became visible.
That gap has not disappeared. Progress has improved formatting, terminology, and surface coherence. Simple patterns sometimes succeed with minor corrections. Yet the deeper issue remains. The model predicts plausible instructions rather than validating structural integrity. It does not enforce stitch counts across rows. It does not ensure that geometric constraints hold from beginning to end.
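The kind of validation a pattern demands can be sketched in a few lines of Python. Everything below is illustrative: the function name, the fixed growth rule, and the sample counts are assumptions for the sake of the example, not taken from any published pattern.

```python
# A minimal sketch of the per-round check a generative model skips:
# verifying that each round's stitch count follows from the previous one.

def check_rounds(counts, expected_growth):
    """Report rounds whose stitch count breaks the expected growth rule."""
    errors = []
    for i in range(1, len(counts)):
        if counts[i] - counts[i - 1] != expected_growth:
            errors.append(
                f"Round {i + 1}: expected {counts[i - 1] + expected_growth} "
                f"stitches, got {counts[i]}"
            )
    return errors

# A classic granny square grows by a fixed number of stitches per round
# (12 here, assuming three extra stitches per side).
ok = check_rounds([12, 24, 36, 48], expected_growth=12)   # no errors
bad = check_rounds([12, 24, 35, 48], expected_growth=12)  # two mismatches
```

A check this simple is exactly what "the pattern compiles" means in practice: a drifting count is caught at the round where it first appears, not rows later in the fabric.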
More recent coverage shows how the problem has expanded. AI-generated patterns now circulate widely, often paired with convincing images and presented as if they had been tested. What was once a technical limitation has become a broader question of trust: beginners encounter instructions that appear authoritative but fail in practice, while verification, long grounded in tradition and community, struggles to keep pace with automated generation. A parallel line of coverage highlights how these flawed patterns, when actually realized or even just visualized, look almost correct yet remain structurally unsound. Symmetry breaks down, proportions drift, and the final object reveals in its form what the text quietly conceals.
Knitting, in this sense, retains an older discipline. It demands that instructions survive contact with material. Yarn, tension, and repetition enforce rules that language alone cannot bypass. A working pattern is not one that reads well. It is one that holds together.
Can ChatGPT Generate a Crochet Diagram?
There is a persistent and well-documented limitation in AI image generators: they cannot execute structured instructions. They can recognize the aesthetic of a diagram, a chart, or a technical drawing — and reproduce something that looks plausible — but the underlying logic of that diagram is invisible to them. This post documents a controlled test of that limitation using one of the most deceptively simple objects in fiber arts: the granny square.
The Test Object
A Granny Square is a modular crochet unit built from four concentric rounds of precisely sequenced stitches. Each round uses distinct stitch types — double crochet (Dc), treble (Tr), double treble (Dtr), half-double crochet (Hdc), and slip stitches — positioned at exact spatial locations. The result is a square with fourfold rotational symmetry and a readable stitch diagram that an expert can verify at a glance.
The pattern used for this test comes from yeezhee.com. The written instructions are:
Round 1: MR, Ch×3, Dc×2, Ch×2, [Dc×3, Ch×2,]×3, Sl.st [12]
Round 2: (Ch×3, Dc×2, Ch×3), [K1, Sl.st, K1, (Hdc, Dc×6, Hdc)]×3, K1, Sl.st
Round 3: Ch×1, Sc, Hdc, Dc, Tr, (Dtr, Ch×3, Dtr, Tr), (Dc, Hdc), Sc, (Hdc, Dc),
(Tr, Dtr, Ch×3, Dtr), Tr, Dc, [Sc, Hdc, Dc, Tr, (Dtr, Ch×3, Dtr), Tr, Dc, Hdc]×2, Sl.st
Round 4: (Ch×3, Dc×2, Ch×2, Dc×3), Dc×9, [(Dc×3, Ch×2, Dc×3), Dc×9]×3, Sl.st
The reference diagram renders this as a two-color technical illustration: inner rounds (1–3) in gray, the outer border (Round 4) in blue, with standard US crochet chart symbols throughout.
Yeezhee, Club Granny Square. © Yeezhee.
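The notation above is mechanically parseable, which is worth making concrete. Here is a hedged sketch of a parser for a simple subset of it: Stitch×N tokens and flat bracketed repeats. The function is my own invention; nested groups, parenthesized clusters, and count annotations like [12] are deliberately out of scope.

```python
import re

def expand(notation):
    """Expand "Stitch×N" tokens and flat "[...]×N" repeats into a stitch list."""
    # Expand bracketed repeats first: "[Dc×3, Ch×2,]×3" becomes three copies.
    def repeat(m):
        body, n = m.group(1), int(m.group(2))
        return ", ".join([body.strip().rstrip(",")] * n)
    notation = re.sub(r"\[([^\]]+)\]×(\d+)", repeat, notation)
    stitches = []
    for token in (t.strip() for t in notation.split(",") if t.strip()):
        m = re.fullmatch(r"([A-Za-z.]+)×(\d+)", token)
        if m:
            stitches.extend([m.group(1)] * int(m.group(2)))
        else:
            stitches.append(token)
    return stitches

round1 = "MR, Ch×3, Dc×2, Ch×2, [Dc×3, Ch×2,]×3, Sl.st"
seq = expand(round1)
dc_count = seq.count("Dc")  # 2 + 3*3 = 11; the Ch×3 conventionally stands in for a 12th Dc
```

Once the notation is a flat list, stitch counts and repeat closure become checkable facts rather than stylistic impressions.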
The Prompt
Rather than asking for a generic crochet image, the prompt was designed to stress-test whether the model could translate stitch notation into a structurally accurate diagram. The key elements:
- Explicit round-by-round stitch descriptions
- Specification of symbol types (T-shapes for Dc, taller symbols for Tr, diagonal lines for Dtr, oval loops for chain spaces)
- Two-color rendering (gray inner rounds, blue outer border)
- Overhead flat view, vector-style, white background
- No yarn texture, no photorealism
This is not a casual prompt. It provides the model with all the information a human diagrammer would need.
The Output
AI-generated crochet stitch diagram of a granny square, created in a generic instructional style using standard symbol notation; not based on or derived from any specific published pattern or commercial design. Generated with DALL·E. © Original prompt author.
The output is visually convincing at first glance. It is clearly a crochet diagram: two-color, white background, radiating stitch symbols from a central ring. For a non-specialist, it reads as plausible.
A domain expert sees something different.
Scoring the Fidelity of the Output
The output was evaluated against a 40-point rubric covering five dimensions: Round 1 (center structure), Round 2 (shell expansions), Round 3 (treble clusters), Round 4 (outer border), and global fidelity.
Final score: 19 / 40 — Partial tier (48%)
| Section | Score | Max |
|---|---|---|
| Round 1 — magic ring center | 5 | 8 |
| Round 2 — shell expansions | 3 | 6 |
| Round 3 — treble/double-treble clusters | 2 | 10 |
| Round 4 — outer border | 5 | 10 |
| Global structure and fidelity | 4 | 6 |
The score distribution tells the story clearly. The model succeeded on surface features — color separation, clean lines, white background, no text hallucinations — and failed on structural ones.
Round 3 is the diagnostic round. It scored 2 out of 10. The double-treble (Dtr) symbols, which should appear as tall diagonal lines at precise angles converging into corner points, are absent entirely. The treble (Tr) symbols show no height differentiation from double crochets. The chain-3 corner arches — which should render as paired oval loops at each corner — are scattered loosely with no structural placement logic. This is the round that distinguishes a granny square from generic crochet. The model did not render it.
The outer border (Round 4) fared better: the blue fan shapes at corners are present, and the side Dc symbols are consistent. But the chain ovals flanking each corner fan — a structurally necessary element — are missing.
What This Reveals
The failure is architectural, not a prompting problem. DALL·E and comparable diffusion models generate images by sampling from a learned distribution of visual patterns. They have absorbed thousands of crochet diagram images during training and can reproduce the general appearance of one. What they cannot do is parse (Dtr, Ch×3, Dtr) and place two tall diagonal symbols flanking three oval loops at a specific corner position. That requires instruction execution — reading notation, computing spatial coordinates, placing symbols — which is simply not what these models do.
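What "instruction execution" requires can be made concrete with a toy example: given one corner of the square and the group (Dtr, Ch×3, Dtr), compute explicit coordinates for each symbol. All geometry here, the radius, the angular spread, the function name, is invented for illustration and does not follow any charting standard.

```python
import math

def place_corner_group(corner_index, radius=4.0, spread_deg=24.0):
    """Place two Dtr symbols flanking three Ch loops at one corner."""
    corner_deg = 45.0 + 90.0 * corner_index      # corners at 45°, 135°, 225°, 315°
    symbols = ["Dtr", "Ch", "Ch", "Ch", "Dtr"]
    n = len(symbols)
    placed = []
    for i, sym in enumerate(symbols):
        # Spread the five symbols evenly across the corner arc.
        offset = (i - (n - 1) / 2) * (spread_deg / (n - 1))
        theta = math.radians(corner_deg + offset)
        placed.append((sym, radius * math.cos(theta), radius * math.sin(theta)))
    return placed

# Four calls yield four corner groups with exact fourfold rotational symmetry.
corners = [place_corner_group(k) for k in range(4)]
```

A deterministic renderer gets the fourfold symmetry for free, because it is computed rather than sampled. A diffusion model has no step in which any of these coordinates exist.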
The two-step framing sometimes used ("convert to symbolic code, then generate") does not resolve this. ChatGPT's text model and DALL·E are separate systems with no shared execution layer. The symbolic code, if generated, becomes another natural language prompt before reaching the image model. The structural information is lost in translation.
A few caveats worth acknowledging:
- Single generation. This test used one output. DALL·E has meaningful variance across runs; a rigorous comparison would average scores across five or more generations with the same prompt.
- Prompt sensitivity. The rubric tests the model against a specific reference diagram. A different prompt framing might produce a higher score on some dimensions — though the Round 3 failure is unlikely to resolve without architectural change.
- Rubric subjectivity. Partial-credit judgments (particularly for Rounds 1 and 2) involve interpretation. A second scorer might land within ±3 points of the total.
This is one model at one point in time. Multimodal architectures that tightly couple language understanding with spatial rendering may eventually close this gap. The test is worth repeating as models evolve.
A score of 19/40 on a domain-expert rubric is not a failure of prompting. It is a failure of architecture. The model can pattern-match to crochet diagrams. It cannot read one. For any application where AI-generated technical diagrams need to be structurally correct — instructional materials, pattern documentation, educational tools — this gap matters. The granny square is an unusually clean test case precisely because its structure is regular, verifiable, and unambiguous. If a model cannot reproduce it from explicit notation, more complex structured diagram tasks are out of reach by the same mechanism.
Commentary
The observed failure is structural, not a matter of prompting. Current image models can replicate the appearance of crochet diagrams but cannot execute the underlying notation that ensures correctness. This creates a gap between visual plausibility and functional reliability.
Key Takeaway
Crochet diagrams are not images in the conventional sense. They are instruction sets. Each symbol encodes position, sequence, and relationship. Traditional practice depends on exact translation from notation to form. Generative image models, by design, do not perform this translation. They approximate patterns without enforcing structure.
Implication
Outputs that appear authoritative may fail when used. This introduces a trust risk, particularly for beginners and instructional contexts. Established verification methods, historically grounded in community practice and repetition, do not scale effectively against high-volume automated generation.
Root Cause
The limitation arises from architecture. Language models and image models operate independently, with no shared execution layer. Symbolic information is not preserved as structured data through the pipeline. As a result, positional accuracy, symmetry, and stitch logic degrade during generation.
Recommended Direction
A more reliable approach follows established engineering principles:
- Parse crochet notation as a formal language
- Convert patterns into a structured intermediate representation with explicit spatial relationships
- Apply rule-based validation, including symmetry and stitch counts
- Render diagrams programmatically using deterministic methods
This pipeline treats the problem as execution rather than generation. It aligns with how technical systems ensure correctness in domains such as compilers, CAD, and data visualization.
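The validation step of such a pipeline can be sketched briefly. The representation below, a plain list of stitch names per round, is invented for illustration; a real intermediate representation would also carry positions and round membership.

```python
def has_fourfold_symmetry(stitches):
    """True if the sequence is exactly four repeats of one quarter."""
    if not stitches or len(stitches) % 4 != 0:
        return False
    quarter = len(stitches) // 4
    return all(stitches[i] == stitches[i % quarter] for i in range(len(stitches)))

# One side of an outer round, loosely modeled on the border above:
# a corner group (3 Dc, 2 Ch, 3 Dc) followed by nine side Dc.
side = ["Dc"] * 3 + ["Ch"] * 2 + ["Dc"] * 3 + ["Dc"] * 9
valid = has_fourfold_symmetry(side * 4)          # four identical sides pass
broken = has_fourfold_symmetry(side * 4 + ["Dc"])  # one stray stitch fails
```

A rule like this rejects exactly the failure mode observed in the test output: a diagram that looks roughly square but whose quarters do not match.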
Conclusion
This approach is feasible with current technology and offers predictable results. Alternative strategies that rely on improving generative models remain uncertain and do not directly address the need for structural fidelity.
The current limitation defines a clear boundary. Where correctness depends on the faithful execution of structured notation, visual approximation is insufficient. Systems that enforce structure will outperform those that rely on pattern matching.
Further: Video!
Siân Over, Why AI Can’t Write Knitting Patterns, 2025. A detailed video explaining why large language models produce convincing but structurally unreliable knitting instructions, highlighting the gap between linguistic fluency and executable pattern design. Source: YouTube.