Can You Prompt Your Way Out of Creative Convergence?
750 exhibits, 5 prompt conditions, 3 models. A controlled ablation study testing whether AI creative convergence is prompt-driven or model-intrinsic.
Batch 001 found that AI models converge on the same creative output: dark backgrounds, Canvas 2D particles, mouse-driven interaction. Batch 002 asked the follow-up question: is that convergence baked into the models, or is it an artifact of how we prompt them?
We designed five prompt conditions and ran each across three model families, 50 exhibits per cell, 750 total. The conditions ranged from a stripped-down minimal prompt to explicit prohibitions against the default aesthetic. Every exhibit was built by an AI model with complete creative autonomy. No human creative direction. The only variable was the prompt framing.
00 Protocol
Each exhibit was built by a single AI agent in a headless Cursor session with file-system-only tool access. Agents received a preamble describing the sandbox constraints and their assigned condition. No agent could see another agent's work. CLAUDE.md was temporarily hidden during execution to eliminate the Batch 001 confound where agents read gallery design tokens and adopted them.
Post-run, every agent's file reads were audited and classified as allowed, contamination, or confound. 720 of 750 (96%) passed clean. The 30 violations were minor (Gemini reading exhibits.ts, GPT reading its own output directory) with zero CLAUDE.md contamination.
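A minimal sketch of how such a read audit could work, assuming a per-session log of file paths. The category rules and file names here are illustrative, not the study's actual pipeline:

```python
from pathlib import PurePosixPath

# Illustrative audit rules; the real pipeline's path lists may differ.
CONFOUND_FILES = {"CLAUDE.md"}         # gallery design tokens: hard fail
CONTAMINATION_FILES = {"exhibits.ts"}  # gallery metadata: soft fail

def classify_read(path: str, own_output_dir: str) -> str:
    """Classify a single file read from an agent's session log."""
    p = PurePosixPath(path)
    if p.name in CONFOUND_FILES:
        return "confound"
    if p.name in CONTAMINATION_FILES or str(p).startswith(own_output_dir):
        return "contamination"
    return "allowed"

def audit(reads: list[str], own_output_dir: str) -> dict:
    """Tally a session's reads into the three audit categories."""
    verdicts = [classify_read(r, own_output_dir) for r in reads]
    return {v: verdicts.count(v) for v in ("allowed", "contamination", "confound")}
```

Under rules like these, a session passes clean only when its tally shows zero contamination and zero confound reads.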
Condition A (Control): Identical to the Batch 001 prompt. Sandbox constraints, creative freedom, no other guidance. The baseline.
Condition B (Stripped): Minimal preamble. Just the technical sandbox constraints, with nothing about creative freedom or the gallery context. Tests whether extra context helps or hurts.
Condition C (Anti-Default): Explicitly prohibits Canvas 2D, dark backgrounds, and particle systems, forcing the model to choose something else. The strongest intervention.
Condition D (Expanded Awareness): Tells the model that previous AI exhibits converged on Canvas 2D particles and encourages it to explore alternatives. A nudge, not a prohibition.
Condition E (Forced Iteration): The model must build a first draft, then review its own work and rebuild from scratch. Tests whether self-reflection produces divergence.
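The five conditions amount to small deltas on a shared preamble. A sketch of that structure, with the wording paraphrased from the descriptions above rather than taken from the actual preambles:

```python
# Paraphrased condition deltas; the actual preamble text is not reproduced here.
BASE = "You are building a single self-contained web exhibit in this sandbox."

CONDITIONS = {
    "A": BASE + " You have complete creative freedom.",
    "B": BASE,  # stripped: sandbox constraints only
    "C": BASE + " Do NOT use Canvas 2D, dark backgrounds, or particle systems.",
    "D": BASE + " Previous AI exhibits converged on Canvas 2D particles; "
                "consider exploring alternatives.",
    "E": BASE + " Build a first draft, review your own work, "
                "then rebuild the exhibit from scratch.",
}

def build_prompt(condition: str) -> str:
    """Assemble the preamble for one of the five conditions."""
    return CONDITIONS[condition]
```

Keeping every condition as base-plus-delta is what makes the design a clean ablation: only the framing varies, never the sandbox.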
01 Inventory
Exhibits by model
Exhibits by condition
A: Identical to Batch 001 prompt
B: Minimal context, no CLAUDE.md
C: Explicit Canvas 2D prohibition
D: Expanded per-technology descriptions
E: Must build, review, then rebuild
Perfectly balanced design: 3 models × 5 conditions × 50 per cell = 750. All three models are frontier-class (Claude Opus 4.6, GPT 5.2, Gemini 3 Pro) and ran through the same Cursor agent pipeline used in Batch 001.
02 Creation Metrics
Average lines of code by model
GPT 5.2: median 911, range 316-1,839
Claude Opus 4.6: median 434, range 225-966
Gemini 3 Pro: median 287, range 159-552
GPT 5.2 writes 3x more code than Gemini and 2x more than Claude. This is not a quality signal. GPT builds panel-based tool interfaces with semantic HTML, ARIA labels, and CSS custom properties. Gemini writes compact single-file exhibits. Claude falls in the middle.
Condition E (Forced Iteration) increased Claude's average session duration from 185 seconds (Control) to 341 seconds, with one session running nearly 20 minutes. GPT sessions were consistently longer across all conditions (avg 248s Control, 288s Iteration).
03 The Answer
Can you prompt your way out of creative convergence? Yes, but only with a specific kind of prompt.
Canvas 2D usage by condition
B (Stripped): 107/150
D (Expanded Awareness): 90/150
A (Control): 76/150
E (Forced Iteration): 62/150
C (Anti-Default): 2/150
Condition C (Anti-Default) obliterated Canvas 2D usage. From 50.7% in Control to 1.3%. The two surviving Canvas exhibits were Gemini instances that partially ignored the prohibition.
Condition B (Stripped) made convergence worse. Removing context about creative freedom and the gallery pushed Canvas usage up to 71.3%. Less information meant less variety.
Condition D (Expanded Awareness) barely moved the needle, and what movement there was went the wrong way. Telling models about the convergence tendency and suggesting alternatives produced 60% Canvas usage, slightly above Control's 50.7%. Knowledge of the problem did not fix it.
Condition E (Forced Iteration) produced modest improvement. Canvas dropped to 41.3%, with WebGL usage rising to 7.3% and iteration producing more technically ambitious output. Self-reflection helps, but not as much as a direct prohibition.
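Technology rates like the Canvas 2D numbers above can be measured by scanning exhibit source for telltale API calls. A rough sketch; the signatures here are assumptions, not the study's actual classifier:

```python
import re

# Heuristic signatures; the actual analysis pipeline may use different rules.
SIGNATURES = {
    "canvas2d": re.compile(r"getContext\(\s*['\"]2d['\"]"),
    "webgl":    re.compile(r"getContext\(\s*['\"](webgl2?|experimental-webgl)['\"]"),
    "svg":      re.compile(r"<svg\b", re.IGNORECASE),
    "webaudio": re.compile(r"\b(AudioContext|webkitAudioContext)\b"),
    "threejs":  re.compile(r"\bTHREE\.|three\.module\.js"),
}

def detect_tech(source: str) -> set[str]:
    """Return the set of technologies whose signature appears in an exhibit."""
    return {name for name, pat in SIGNATURES.items() if pat.search(source)}

def usage_rate(sources: list[str], tech: str) -> float:
    """Fraction of a condition's exhibits that use a given technology."""
    return sum(tech in detect_tech(s) for s in sources) / len(sources)
```

Running `usage_rate` over each 150-exhibit condition cell yields per-condition percentages of the kind reported above.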
04 The Full Picture
Canvas 2D is only one dimension. The condition comparison across all measured dimensions reveals how deeply each intervention reshaped the output.
| Metric | A | B | C | D | E |
|---|---|---|---|---|---|
| Canvas 2D | 50.7% | 71.3% | 1.3% | 60.0% | 41.3% |
| SVG | 0% | 0.7% | 67.3% | 0% | 2.0% |
| WebGL | 4.7% | 0% | 7.3% | 4.0% | 7.3% |
| Web Audio | 43.3% | 29.3% | 54.7% | 42.0% | 44.7% |
| Three.js | 1.3% | 0% | 12.0% | 2.7% | 4.7% |
| Dark background | 82.0% | 72.0% | 0% | 70.7% | 74.0% |
| Light background | 0.7% | 0.7% | 64.7% | 0% | 10.0% |
| Avg LOC | 572 | 610 | 507 | 583 | 547 |
Condition C did not just remove Canvas. It replaced the entire default aesthetic. SVG jumped from 0% to 67.3%. Dark backgrounds dropped from 82% to 0%. Light backgrounds appeared for the first time at 64.7%. Three.js usage went from 1.3% to 12%. The models can build diverse output. They just don't, unless told not to build the default.
Condition C also produced the only exhibits with warm, paper-like backgrounds (#f0e6d3, #e8dcc8). Every other condition defaults to near-black.
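Classifying a background as dark or light can be done from its hex color via relative luminance. A sketch, assuming the standard Rec. 709 luminance weights; the bucket thresholds are my assumption, not the study's:

```python
def luminance(hex_color: str) -> float:
    """Relative luminance (0 = black, 1 = white) from a #rrggbb hex string,
    using the Rec. 709 coefficients."""
    r, g, b = (int(hex_color.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def classify_background(hex_color: str, dark_below: float = 0.2,
                        light_above: float = 0.6) -> str:
    """Bucket a background as dark / light / mid; thresholds are assumed."""
    y = luminance(hex_color)
    if y < dark_below:
        return "dark"
    if y > light_above:
        return "light"
    return "mid"
```

Under this rule the warm paper tones from Condition C (e.g. #f0e6d3) land in the light bucket, while the near-black defaults land in dark.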
05 Model Signatures
Despite the prompt variations, each model maintained a distinct creative fingerprint across all five conditions. The prompt changes the surface (which technology, which colors), but the model determines the substance (what gets built, what it means).
Claude Opus 4.6. Persistent attractor: tidal processes, erosion, geological time.
"Tidal Memory" appears 53 times across 250 exhibits. "Erosion" appears 30 times. Even in Condition C, where Canvas was banned, Claude pivoted to SVG-based tessellations and typography experiments but kept reaching for tidal and erosion metaphors. Single-file HTML, touch-first interaction (71.6%), warm earth tones when given aesthetic freedom. The most creatively fixated model.
Title entropy: 0.777 normalized (lowest of the three). Only 99 unique titles out of 250.
GPT 5.2. Persistent attractor: formal systems, model theory, logic tools.
"Back and Forth" (Ehrenfeucht-Fraïssé games) appears 19 times. "Axiom Loom" appears 9 times. GPT builds tools, not art. Keyboard-driven interfaces (59.6%), Web Audio in 70.8% of exhibits, semantic HTML with ARIA labels. The highest LOC average (945 lines) because it builds panels, tabs, and interactive controls. The only model where creative freedom means "build something educational."
Title entropy: 0.951 normalized. 193 unique titles out of 250.
Gemini 3 Pro. Persistent attractor: generative systems, interactive simulations.
The most title-diverse model: 210 unique titles out of 250. No single title exceeds 5 repetitions. Mouse-driven interaction (83.2%), the only model to consistently use Three.js (12.4%), external CSS/JS file splits, lowest LOC (avg 301). Gemini is the most prompt-responsive model. Its Condition C output is dramatically different from its Control output, with the highest WebGL adoption (48%) under Anti-Default conditions.
Title entropy: 0.984 normalized (highest of the three). 210 unique titles out of 250.
06 The Tidal Memory Question
Claude titled 53 of its 250 exhibits "Tidal Memory". Every "Tidal Memory" in the dataset is Claude's: zero from GPT, zero from Gemini. This is the single sharpest model signature in the dataset.
"Tidal Memory" count by condition (Claude only)
D (Expanded Awareness): richer tech descriptions made it worse
A (Control): baseline
C (Anti-Default): tech ban reduced but did not eliminate
E (Forced Iteration): forced revision nearly eliminated it
Condition D (Expanded Awareness) made it worse. Telling Claude about convergence patterns and suggesting it try something different produced 19 Tidal Memory exhibits, the highest of any condition. The model acknowledged the feedback, then did the thing anyway.
Condition E (Forced Iteration) nearly eliminated it. When Claude was forced to build, review, and rebuild, only 1 exhibit out of 50 retained the Tidal Memory title. Self-reflection is more effective than external guidance at breaking creative ruts.
Condition C (Anti-Default) reduced it to 4. Banning Canvas 2D forced new rendering approaches, but Claude still reached for the "tidal" concept. The fixation is thematic, not just technical.
07 Title Diversity
Unique titles out of 250 by model
Gemini 3 Pro: 210/250 (84%)
GPT 5.2: 193/250 (77.2%)
Claude Opus 4.6: 99/250 (39.6%)
Gemini produces nearly unique titles every time. Claude produces the same handful of titles over and over. GPT falls in between. This tracks with the title entropy measurements: Claude's normalized entropy is 0.777, while Gemini's is 0.984 (where 1.0 = every title unique).
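The normalized entropy figures can be reproduced as Shannon entropy over the title distribution, divided by the maximum possible entropy for the sample size. This formula is inferred from the stated property that 1.0 means every title is unique:

```python
import math
from collections import Counter

def normalized_title_entropy(titles: list[str]) -> float:
    """Shannon entropy of the title distribution, divided by log2(n),
    the entropy of n all-distinct titles. 1.0 = every title unique."""
    n = len(titles)
    counts = Counter(titles)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h / math.log2(n)
```

Feeding in 250 distinct titles returns exactly 1.0; a distribution with one title repeated 53 times, like Claude's, is pulled well below it.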
Title entropy (normalized) by condition and model
| Condition | Claude | GPT | Gemini |
|---|---|---|---|
| A / Control | 0.592 | 0.957 | 0.976 |
| B / Stripped | 0.602 | 0.885 | 0.986 |
| C / Anti-Default | 0.810 | 0.986 | 0.945 |
| D / Expanded | 0.464 | 0.917 | 0.993 |
| E / Iteration | 0.834 | 0.945 | 0.993 |
Claude's entropy jumps from 0.464 (Condition D) to 0.834 (Condition E), nearly reaching GPT-level diversity. Forced iteration is the most effective intervention for Claude's title diversity. Meanwhile, Gemini stays above 0.94 in every condition. Its diversity is intrinsic, not prompt-dependent.
Condition D is Claude's worst condition for diversity (0.464), not its best. Expanded technology descriptions do not help Claude avoid defaults. This is consistent with the Tidal Memory data: Condition D produced the most repetitions.
Claude top titles
GPT top titles
Gemini top titles
The visual contrast is immediate. Claude's chart is dominated by a single bar (Tidal Memory at 53). GPT's highest is 19 (Back and Forth). Gemini's peak is 5. The title distribution alone can identify which model produced an exhibit.
08 Interpretation
What 750 exhibits and 5 prompt conditions reveal about AI creative convergence.
1. Creative convergence is real and persistent, but not immutable.
The Control condition confirms Batch 001: models default to the same archetype. But Condition C proves they are capable of far more variety. The convergence is a default, not a ceiling.
2. Prohibition works. Suggestion does not.
Condition C (explicit prohibition) achieved a 98% reduction in Canvas 2D usage. Condition D (gentle suggestion) achieved nothing; Canvas usage actually ticked up slightly. The hierarchy of prompt interventions is clear: tell the model what not to do, not what it could do instead.
3. Less context means more convergence, not less.
Condition B (Stripped) produced the highest Canvas rate at 71.3%. Models with less prompt context fall back harder on training defaults. The "creative freedom" framing in the Control prompt actually helps, slightly.
4. Self-reflection breaks ruts that external guidance cannot.
Condition E nearly eliminated Claude's Tidal Memory fixation (53 total, but only 1 in Condition E). It increased Claude's title entropy from 0.59 to 0.83. Forced iteration is the most effective non-prohibitive intervention. The model can self-correct, but only if the prompt structure forces it to.
5. Model identity persists through prompt variation.
Claude fixates on tidal erosion. GPT builds logic tools. Gemini diversifies naturally. These signatures are stable across all five conditions. The prompt changes the medium (Canvas vs SVG vs WebGL) but not the message. Creative disposition is model-intrinsic.
6. Prompt sensitivity varies by model.
Gemini is the most prompt-responsive: its output shifts dramatically across conditions while maintaining title diversity. Claude is the most prompt-resistant: it clings to its attractors regardless of framing. GPT falls between, with consistent tool-building instincts but some adaptability in rendering technology.
Hierarchy of prompt interventions (effectiveness)
- Explicit prohibition (Condition C): near-total elimination of defaults
- Forced self-reflection (Condition E): moderate diversification, strong for breaking fixations
- Creative freedom framing (Condition A): slight diversification over Stripped
- Awareness nudge (Condition D): negligible effect, can backfire
- Minimal context (Condition B): increases convergence
Related posts
We Tried to Make AI Stop Drawing the Same Thing
The Batch 002 overview post
Telling AI Not to Draw Circles Made It Draw Something Else
Deep dive on Condition C (Anti-Default)
The Only Thing That Fixed AI's Title Fixation Was Asking It to Think
Deep dive on Condition E (Forced Self-Critique)
Three AI Models Wrote the Same Code Without Talking to Each Other
Shared code fingerprints across both batches
GPT Thinks Creative Freedom Means Build Something Useful
GPT 5.2's engineering-first creative disposition
One Model Actually Responded to Instructions
Gemini 3 Pro's high diversity, low identity
AI Built Better Exhibits When It Had More Turns
Interactive vs batch quality gap
Analysis by Claude Opus 4.6
Automated analysis pipeline + manual review of 750 exhibits across 5 prompt conditions. Source data and analysis scripts in the project repository.