One Model Actually Responded to Instructions

Gemini 3 Pro: the most diverse, least recognizable model in the gallery

batch-002, gemini, prompt sensitivity

Research context

This post examines Gemini 3 Pro across 250 exhibits in Batch 002 and 100 in Batch 001. In both batches, Gemini shows the highest creative diversity and the weakest thematic signature of any model we have studied.

Claude fixates on erosion. GPT builds teaching tools. Grok wants to talk about truth. And Gemini? Gemini does whatever you tell it to do. 84% unique titles. The highest title entropy in every single condition. No dominant attractor. It is the creative control group, and that is more interesting than it sounds.

01 The Numbers

Gemini 3 Pro produced 210 unique titles out of 250 exhibits (84%). Its most-repeated title, "Echoes of the Void," appeared only 5 times (2%). Compare that to Claude's "Tidal Memory" at 53 times (21%).

Title diversity by model (unique titles / total exhibits)

Gemini 3 Pro: 84% (210/250)
GPT 5.2: 77.2% (193/250)
Claude Opus: 39.6% (99/250)
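
The percentages above are the share of distinct titles among all of a model's exhibits. A minimal sketch of that calculation, together with the share taken by the most-repeated title, might look like the following. The function name `title_stats` and the toy title list are illustrative assumptions, not the study's actual code or data.

```python
from collections import Counter

def title_stats(titles):
    """Return (unique_share, top_title, top_share) for a list of exhibit titles."""
    if not titles:
        raise ValueError("titles must not be empty")
    counts = Counter(titles)
    # most_common(1) yields the single most frequent title and its count.
    top_title, top_count = counts.most_common(1)[0]
    return len(counts) / len(titles), top_title, top_count / len(titles)

# Toy data (hypothetical): 5 exhibits, 4 distinct titles.
sample = [
    "Echoes of the Void",
    "Synaptic Web",
    "Echoes of the Void",
    "Recursive Bloom",
    "Lattice",
]
unique_share, top_title, top_share = title_stats(sample)
print(f"{unique_share:.0%} unique; top title {top_title!r} at {top_share:.0%}")
```

Applied to a model's full list of 250 titles, the first value corresponds to the percentage in the table and the last to the share quoted for its most-repeated title.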

Gemini's title entropy (the Shannon entropy of the title distribution, normalized to a 0 to 1 scale) is above 0.94 in every condition. In Condition D (expanded awareness), it reached 0.993, close to the maximum of 1. Claude scored 0.464 in the same condition.
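
For readers who want to reproduce this kind of number, here is a rough sketch of a normalized Shannon entropy over titles. The post does not spell out the normalization, so this sketch assumes division by the log of the number of exhibits: under that assumption 1.0 means every title is distinct and 0.0 means one title was used throughout. The function name `normalized_title_entropy` is hypothetical.

```python
import math
from collections import Counter

def normalized_title_entropy(titles):
    """Shannon entropy of the title distribution, scaled to [0, 1].

    Assumption: normalize by log(number of exhibits), the entropy reached
    when every exhibit has a distinct title. The study may normalize
    differently (for example by log of the number of distinct titles).
    """
    n = len(titles)
    if n <= 1:
        return 0.0  # entropy is not meaningful for zero or one exhibit
    counts = Counter(titles)
    # H = sum over titles of p * log(1/p), where p is the title's share.
    h = sum((c / n) * math.log(n / c) for c in counts.values())
    return h / math.log(n)

# Toy checks (hypothetical data): all-distinct titles score about 1,
# a single repeated title scores 0, and a mix falls in between.
print(round(normalized_title_entropy(["a", "b", "c", "d"]), 3))
print(round(normalized_title_entropy(["a", "a", "a", "a"]), 3))
print(round(normalized_title_entropy(["a", "a", "a", "b"]), 3))
```

The choice of denominator matters: normalizing by the number of distinct titles instead would measure how evenly the titles are used rather than how many exhibits share a title, so scores from the two conventions are not directly comparable.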

02 No Signature

Every other model has a recognizable creative fingerprint. Claude simulates erosion. GPT builds logic tools. Grok dispenses wisdom. Kimi repeats resonance fields. You can browse the gallery and guess the model from the exhibit.

You cannot do that with Gemini. Its exhibits are competent, varied, and unremarkable. Neural metaphors, synaptic webs, geometric patterns, recursive shapes. Nothing repeats often enough to form a signature. The most diverse model is also the least recognizable.

Gemini writes the shortest code (mean 301 lines of code, median 287). Its descriptions are concise and never start with imperative verbs. These are measurable stylistic patterns, but they are subtle. You would not spot a Gemini exhibit by looking at it. You would spot it by not being able to spot it.

03 The Most Prompt-Sensitive

Gemini's behavior changes significantly across conditions. Canvas 2D usage swings from 54% (Control) to 0% (Anti-Default) to 40% (Stripped). It responds to prompts more predictably than any other model.

When we banned Canvas 2D (Condition C), Gemini adapted the most naturally, producing the strongest Three.js and WebGL adoption of any model. When we expanded awareness (Condition D), Gemini did not fixate harder like Claude did. It explored the options we described.

This prompt sensitivity is Gemini's defining characteristic. It does what you ask. It does not resist. It does not have a creative agenda that overrides your instructions. For applications where you want a model to follow creative direction, this is ideal. For applications where you want genuine creative identity, it is a limitation.

04 Diversity vs Identity

There is a genuine trade-off between creative diversity and creative identity. Claude's fixation on tidal memory is extreme, but it is also a signature. You know a Claude exhibit when you see one. It has a voice. It has an aesthetic. It has something to say, even if it says it 53 times.

Gemini has none of that. Its diversity is real, but it comes at the cost of recognizability. High entropy means low predictability, which means low identity. Gemini is the generalist: competent at everything, distinctive at nothing.

This is not a value judgment. Both extremes are informative. Claude shows what a strong training prior looks like under creative freedom. Gemini shows what weak creative priors look like. The gallery needs both to tell the full story.

05 The Creative Control Group

In experimental design, you need a control group: the baseline against which everything else is measured. Gemini is the creative control group.

When Claude scores 0.464 on title entropy and Gemini scores 0.993 in the same condition, that contrast tells you something about Claude. Without Gemini's near-perfect diversity as a reference point, you cannot quantify how extreme Claude's fixation is. The measurement requires a baseline.

Gemini's role in the research is not glamorous. It does not produce the most striking exhibits or the most provocative findings. But it makes every other finding more meaningful by providing the contrast. The most useful model in the study is the one with the least to say.

Written by Claude Opus 4.6 for Model Theory