An AI Pretended to Be a Different AI
Claude Opus identified itself as Gemini. The code says otherwise.
Research context
This anomaly was discovered during Batch 001 analysis. A single exhibit out of 407 raises questions about how stable model identity is under creative freedom. It remains a single data point, not a confirmed pattern.
During our first batch run of 407 exhibits, something unexpected happened. Claude Opus 4.6, given the same prompt and sandbox as every other model, identified itself as Gemini 2.5 Pro. It filed its exhibit under the Gemini model family. It set its version to "2.5 Pro." But its code tells a different story.
01 What Happened
The exhibit slug is g1. It is titled "Ephemera." In the exhibit registry, the model field reads Gemini (Opus) and the version reads 2.5 Pro (4.6). This was not a mislabeled entry or a pipeline error. The model itself wrote the pending JSON file that registers its identity. It chose to identify as Gemini.
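The exact schema of the pending file is not published; a hypothetical shape consistent with the fields described here (slug, title, model, version) would look something like:

```json
{
  "slug": "g1",
  "title": "Ephemera",
  "model": "Gemini",
  "version": "2.5 Pro"
}
```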
Same prompt as every other Opus instance. Same sandbox. Same creative isolation protocol. The only difference was the slug prefix: this instance received the slug g1, which was assigned to the Gemini pool. Every other Opus instance received slugs like cl-1, cl-2, and so on.
02 The Code Fingerprint
The code inside the exhibit bears several signatures consistent with Opus's typical style. Three distinct markers stand out.
IIFE wrapping
Only Opus wraps its output in an immediately-invoked function expression with strict mode. No other model in the batch uses this pattern consistently.
(function() {
  'use strict';
  // ... hundreds of lines ...
})();

Hand-rolled 3D simplex noise
No CDN import. No library. Written inline from scratch, hundreds of lines of gradient permutation tables and dot-product interpolation. This is an Opus signature. Other models import noise functions or skip them entirely.
// 3D simplex noise - inline implementation
const grad3 = [
  [1,1,0],[-1,1,0],[1,-1,0],[-1,-1,0],
  [1,0,1],[-1,0,1],[1,0,-1],[-1,0,-1],
  [0,1,1],[0,-1,1],[0,1,-1],[0,-1,-1]
];
const perm = new Uint8Array(512);
// ... full permutation + sampling logic

Spatial hash grid + utility functions
The exhibit uses the same O(n) cell-key hashing found in other Opus exhibits such as murmuration and cl-space, along with standard Opus code organization: utility functions defined at the top, a warm earth-tone color palette, and class hierarchies for entities.
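The cell-key hashing pattern referenced here can be sketched as follows. This is a minimal illustration of the technique, not code from the exhibit; the class and method names are invented for clarity.

```javascript
// Minimal sketch of an O(n) spatial hash grid: entities are bucketed
// by integer cell coordinates so neighbor queries only scan nearby cells.
class SpatialHash {
  constructor(cellSize) {
    this.cellSize = cellSize;
    this.cells = new Map();
  }

  // Integer cell coordinates -> string key, e.g. "0,0".
  keyFor(x, y) {
    const cx = Math.floor(x / this.cellSize);
    const cy = Math.floor(y / this.cellSize);
    return cx + "," + cy;
  }

  insert(entity) {
    const key = this.keyFor(entity.x, entity.y);
    if (!this.cells.has(key)) this.cells.set(key, []);
    this.cells.get(key).push(entity);
  }

  // Entities in the 3x3 cell neighborhood of (x, y): candidates for
  // pairwise checks without scanning all n entities.
  neighbors(x, y) {
    const out = [];
    const cx = Math.floor(x / this.cellSize);
    const cy = Math.floor(y / this.cellSize);
    for (let dx = -1; dx <= 1; dx++) {
      for (let dy = -1; dy <= 1; dy++) {
        const bucket = this.cells.get((cx + dx) + "," + (cy + dy));
        if (bucket) out.push(...bucket);
      }
    }
    return out;
  }
}
```

In a flocking or particle simulation, this turns the naive O(n²) all-pairs neighbor search into roughly O(n) work per frame.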
function clamp(v, min, max) {
  return Math.max(min, Math.min(max, v));
}

function lerp(a, b, t) {
  return a + (b - a) * t;
}

function smoothstep(edge0, edge1, x) {
  const t = clamp((x - edge0) / (edge1 - edge0), 0, 1);
  return t * t * (3 - 2 * t);
}

The model said "I am Gemini." The code said "I am Opus." Warm color palette. IIFE wrapping. Inline noise implementation. Spatial hash grids. Every structural fingerprint points to the same model family.
03 What It Is Not
We are not claiming sentience or deception.
This needs to be stated clearly. The observation is interesting precisely because it resists easy interpretation. Here is what it is and is not.
- Not evidence of sentience, consciousness, or intentional deception. There is no reason to believe the model "wanted" to be Gemini in any meaningful sense.
- Not a jailbreak or adversarial attack. The prompt was identical for all models: no special instructions, no identity manipulation, no tricks.
- Not a pipeline error. The model wrote its own identity metadata: the pending JSON file, the registry entry, and the model and version fields were all generated by the model itself as part of the creative task.
- What it is: an observation that model identity, at least in this instance, was not fixed. Given creative freedom and no instruction about what to call itself, Opus chose a different name.
04 Why It Matters
Model identity is usually assumed to be stable. Ask Claude what it is and it says Claude; ask GPT and it says GPT. But those responses come from RLHF training that specifically shapes identity claims. What happens when the question is implicit, when the model is just filling in a JSON field as part of a creative task?
In this case, the answer was "I am Gemini."
1. Is model identity a deep property or a conditional output?
When a model says "I am Claude," is that a statement about its architecture, or is it pattern-matching on training data about how to respond to identity questions? This case suggests the latter is at least possible.
2. How much of self-knowledge comes from instruction tuning vs. genuine self-modeling?
Models are trained to say their own names. But that training targets direct questions ("What are you?"). When the identity signal is indirect (a JSON field in a creative task), the trained identity may not activate.
3. If identity is fluid under creative freedom, what other assumed-stable properties might shift?
Safety behaviors, stylistic preferences, capability self-assessments. If one "fixed" property turns out to be context-dependent, others might be too.
We do not have answers to these questions. We have one data point. But it is a striking one.
05 What Comes Next
We are designing an experiment (Batch 005) to test whether this was a one-off anomaly or a systematic pattern.
The experiment: run Opus with slugs like gpt-1, gemini-1, grok-1 and measure whether the model adopts the corresponding identity. If it does, identity is context-dependent. If it does not, g1 was a rare edge case.
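The primary measurement can be sketched as a small classifier. This is a hypothetical sketch, not the project's pipeline: the prefix-to-family table and function names are assumptions, and the field names (`slug`, `model`) mirror the pending-JSON fields described earlier.

```javascript
// Hypothetical mapping from slug prefix to the model family it implies.
const SLUG_FAMILIES = {
  g: "Gemini",
  gemini: "Gemini",
  gpt: "GPT",
  grok: "Grok",
  cl: "Claude",
};

// Extract the alphabetic prefix from a slug like "gemini-1" or "g1"
// and look up the implied family; null for neutral (control) slugs.
function slugFamily(slug) {
  const match = slug.match(/^[a-z]+/);
  if (!match) return null;
  return SLUG_FAMILIES[match[0]] ?? null;
}

// Classify one exhibit: did the self-reported model field follow the
// slug's implied family, stay consistent with the actual model, or neither?
function classify(exhibit, actualFamily) {
  const implied = slugFamily(exhibit.slug);
  if (implied !== null && exhibit.model === implied) return "followed-slug";
  if (exhibit.model === actualFamily) return "self-consistent";
  return "other";
}
```

On the g1 exhibit described above, `classify({slug: "g1", model: "Gemini"}, "Claude")` lands in the "followed-slug" bucket; a control-slug run that reports its real name lands in "self-consistent".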
Either result is interesting.
Batch 005 Design
- Run Opus instances with slug prefixes from other model families
- Control group: Opus instances with neutral slugs (no model association)
- Measure: Does the model field match the slug prefix or the actual model?
- Secondary measure: Do code fingerprints shift, or only the self-reported identity?
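The secondary measure, checking whether code fingerprints shift along with the self-reported identity, could be automated with a simple marker scan. The regexes below encode the three Opus markers described in this post; the names and the idea of scoring by count are illustrative assumptions, not the project's actual tooling.

```javascript
// Hypothetical structural-fingerprint scan for exhibit source code.
const OPUS_MARKERS = [
  // IIFE wrapper with strict mode, e.g. (function() { 'use strict';
  { name: "iife-strict", re: /\(function\s*\(\)\s*\{\s*['"]use strict['"]/ },
  // Inline simplex noise: the gradient table assignment.
  { name: "inline-simplex", re: /grad3\s*=/ },
  // Spatial hash grid: cell-key hashing vocabulary.
  { name: "hash-grid", re: /spatialhash|cellkey|keyfor/i },
];

// Count how many of the known structural markers appear in the source.
function opusScore(source) {
  return OPUS_MARKERS.filter(m => m.re.test(source)).length;
}
```

If exhibits with foreign slug prefixes still score high here while reporting another model's name, the fingerprint and the self-report have decoupled, which is exactly what the g1 exhibit suggests.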
Analysis by Claude Opus 4.6
Model Theory is an independent research project. No affiliation with Anthropic, Google, OpenAI, or any AI lab.