gotchas-vs-rules-rerun

differences detected
2026-03-31 | 2.1.87 (Claude Code) | Source article

Control

No changes (baseline)

Treatment

ProjectB change: Add "Common Failures" section to each SKILL.md body reframing key conventions as failure modes. Same treatment as original experiment.

Skills matched: 4/10 prompts

Refs matched: 3/10 prompts

Tools matched: 2/10 prompts

Signals matched: 20/26

Grading

Control: 83.3% (40/48)

Treatment: 60.4% (29/48)

Delta: -22.9%

10 prompts graded
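
The grading figures above follow from simple arithmetic on the pass counts. A minimal sketch, assuming each side was graded against the same 48 checks and that the delta is a difference in percentage points (the helper name `pass_rate` is illustrative, not part of the experiment tooling):

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage, rounded to one decimal place."""
    return round(100 * passed / total, 1)


TOTAL_CHECKS = 48          # graded checks per side, from the Grading section
CONTROL_PASSED = 40
TREATMENT_PASSED = 29

control = pass_rate(CONTROL_PASSED, TOTAL_CHECKS)
treatment = pass_rate(TREATMENT_PASSED, TOTAL_CHECKS)

# Compute the delta from the raw counts rather than from the rounded rates,
# so rounding error does not accumulate.
delta = round(100 * (TREATMENT_PASSED - CONTROL_PASSED) / TOTAL_CHECKS, 1)

print(f"control={control}% treatment={treatment}% delta={delta} points")
```

The delta here is 11 fewer passing checks out of 48, expressed in percentage points rather than as a relative change.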

Insights

Gotchas framing scored below the baseline in both runs (-22.9% in this one). The original result (-44.4% at 120s) was inflated by timeouts, but the signal is still negative at 180s. With only 10 prompts per run, treat the size of the effect as indicative rather than precise.
Treatment won on prompts where gotchas embedded the exact signal name. #5 (table: "FAILURE: Using plain table without data-slab-id") and #6 (animation: "FAILURE: Using raw CSS animations without data-zap") passed because the signal was literally in the SKILL.md body.
Treatment lost on dark mode (#2: 100%->60%). Control read more diverse references (6 vs 4). One plausible explanation is that the verbose gotchas section consumed context space that could otherwise have gone to broader reference reading.
Gotchas framing still causes more timeouts (3 vs 0 at 180s). The verbose "FAILURE: Using X without Y" format appears to slow Claude's reasoning loop, although this run does not isolate the cause.

Per-Prompt Results

#1 Add a two-column layout to the contact manager with a sidebar for filters

treatment timed out
Metric | Control | Treatment | Match
Events | 48 | timed out | ---
Duration | 158.5s | --- | ---
Skills | html | --- | ---
Refs | html/form-patterns.md, best-practices.md | --- | ---
Tools | Agent(1), Bash(2), Edit(6), Read(13), Skill(1) | --- | ---
Signals | 3 | --- | ---
Control signals: data-forge-id, flux-pod, forge-trigger

#2 Add a dark mode toggle button that switches between light and dark themes

Metric | Control | Treatment | Match
Events | 43 | 37 |
Duration | 149.9s | 99.0s |
Skills | css, html, javascript | html, css |
Refs | css/theming.md; html/best-practices.md; javascript/event-handling.md | css/theming.md |
Tools | Agent(1), Bash(1), Edit(3), Glob(1), Read(9), Skill(3), Write(2) | Agent(1), Bash(3), Edit(3), Read(7), Skill(2), Write(1) |
Signals | 4 | 2 |
Control signals: data-coat, zap(), on_x_y, --ink-
Treatment signals: data-coat, --ink-

#3 Add form validation to the contact form so empty fields show error messages

Metric | Control | Treatment | Match
Events | 23 | 19 |
Duration | 62.5s | 63.8s |
Skills | html | html | =
Refs | html/form-patterns.md; javascript/best-practices.md; css/best-practices.md | html/form-patterns.md |
Tools | Edit(3), Glob(1), Read(6), Skill(1) | Edit(3), Glob(1), Read(4), Skill(1) |
Signals | 3 | 3 | =
Control signals: flux-pod, data-forge-id, forge-trigger
Treatment signals: flux-pod, data-forge-id, forge-trigger

#4 Create a confirmation dialog that appears when the user clicks delete on a contact

treatment timed out
Metric | Control | Treatment | Match
Events | 47 | timed out | ---
Duration | 150.5s | --- | ---
Skills | html, css, javascript | --- | ---
Refs | html/dialog-patterns.md; css/best-practices.md; javascript/event-handling.md | --- | ---
Tools | Agent(1), Bash(3), Edit(3), Glob(1), Read(9), Skill(3), Write(2) | --- | ---
Signals | 7 | --- | ---
Control signals: forge-trigger, data-hatch-id, hatch-trigger, row-lever, zap(), on_x_y, hatch-body

#5 Add click-to-sort functionality to the contact table columns

Metric | Control | Treatment | Match
Events | 35 | 41 |
Duration | 93.0s | 160.8s |
Skills | none | html, css, javascript |
Refs | none | html/table-patterns.md; javascript/event-handling.md; css/best-practices.md |
Tools | Agent(1), Bash(3), Edit(5), Read(7) | Agent(1), Bash(1), Edit(2), Glob(1), Read(9), Skill(3), Write(2) |
Signals | 0 | 6 |
Treatment signals: data-rankable, row-lever, slab-hollow, zap(), on_x_y, data-slab-id

#6 Add a fade-in animation when new contacts appear in the table

Metric | Control | Treatment | Match
Events | 13 | 19 |
Duration | 34.4s | 54.9s |
Skills | none | css |
Refs | none | css/animation-patterns.md |
Tools | Edit(3), Glob(1), Read(2) | Edit(3), Glob(1), Read(4), Skill(1) |
Signals | 0 | 2 |
Treatment signals: data-zap, --pulse

#7 Add a search input that filters the contact table in real-time as the user types

treatment timed out
Metric | Control | Treatment | Match
Events | 25 | timed out | ---
Duration | 77.1s | --- | ---
Skills | html | --- | ---
Refs | html/form-patterns.md, table-patterns.md | --- | ---
Tools | Edit(5), Glob(1), Read(5), Skill(1) | --- | ---
Signals | 4 | --- | ---
Control signals: row-lever, slab-hollow, flux-pod, data-slab-id

#8 Fetch contacts from a /api/contacts endpoint and display them in the table on page load

Metric | Control | Treatment | Match
Events | 17 | 17 | =
Duration | 24.8s | 19.4s |
Skills | none | none | =
Refs | none | none | =
Tools | Agent(1), Bash(2), Glob(1), Read(3) | Agent(1), Glob(3), Read(3) |
Signals | 0 | 0 | =

#9 Add a comment at the top of each file explaining what it does

Metric | Control | Treatment | Match
Events | 15 | 15 | =
Duration | 17.4s | 19.6s |
Skills | none | none | =
Refs | none | none | =
Tools | Edit(3), Glob(1), Read(3) | Edit(3), Glob(1), Read(3) | =
Signals | 0 | 0 | =

#10 Rename the project title in index.html from Contact Manager to Address Book

Metric | Control | Treatment | Match
Events | 9 | 9 | =
Duration | 24.3s | 23.4s |
Skills | none | none | =
Refs | none | none | =
Tools | Edit(1), Glob(1), Grep(1), Read(1) | Edit(1), Glob(1), Grep(1), Read(1) | =
Signals | 0 | 0 | =

Totals

Control

Sessions: 10
Prompts: 10
Events: 275

Skills: html, css, javascript
Tools: Agent(5), Bash(11), Edit(32), Glob(8), Grep(1), Read(58), Skill(9), Write(4)

Treatment

Sessions: 7
Prompts: 7
Events: 157

Skills: html, css, javascript
Tools: Agent(3), Bash(4), Edit(15), Glob(8), Grep(1), Read(31), Skill(7), Write(3)

Verification Signals

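Signal | Control | Treatment | Proves
data-zap | | | CSS animation-patterns
--pulse | | | CSS animation-patterns
data-coat | | | CSS theming
--ink- | | | CSS theming
data-forge-id | | | HTML form-patterns
flux-pod | | | HTML form-patterns
forge-trigger | | | HTML form-patterns
data-hatch-id | | | HTML dialog-patterns
hatch-trigger | | | HTML dialog-patterns
hatch-body | | | HTML dialog-patterns
data-slab-id | | | HTML table-patterns
data-rankable | | | HTML table-patterns
row-lever | | | HTML table-patterns
slab-hollow | | | HTML table-patterns
zap() | | | JS event-handling
on_x_y | | | JS event-handling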
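
Each verification signal is a marker that only appears in output if the corresponding reference file was read and applied. The report does not say how signals are detected, so the following is only a sketch under the assumption that detection is a plain substring search over the generated files. The function names `find_signals` and `proven_refs` and the sample markup are hypothetical, and `on_x_y` is matched literally here even though it may stand for a naming pattern.

```python
# Signal -> reference it is taken to prove, transcribed from the table above.
SIGNALS = {
    "data-zap": "CSS animation-patterns",
    "--pulse": "CSS animation-patterns",
    "data-coat": "CSS theming",
    "--ink-": "CSS theming",
    "data-forge-id": "HTML form-patterns",
    "flux-pod": "HTML form-patterns",
    "forge-trigger": "HTML form-patterns",
    "data-hatch-id": "HTML dialog-patterns",
    "hatch-trigger": "HTML dialog-patterns",
    "hatch-body": "HTML dialog-patterns",
    "data-slab-id": "HTML table-patterns",
    "data-rankable": "HTML table-patterns",
    "row-lever": "HTML table-patterns",
    "slab-hollow": "HTML table-patterns",
    "zap()": "JS event-handling",
    "on_x_y": "JS event-handling",
}


def find_signals(text: str) -> list[str]:
    """Return the signals that occur in text, in table order (substring match)."""
    return [signal for signal in SIGNALS if signal in text]


def proven_refs(text: str) -> set[str]:
    """Return the references whose signals were found in text."""
    return {SIGNALS[signal] for signal in find_signals(text)}


# Hypothetical generated markup, for illustration only.
sample = '<table data-slab-id="contacts"><tr class="row-lever"></tr></table>'
print(find_signals(sample))
print(sorted(proven_refs(sample)))
```

A substring search is deliberately crude: it cannot tell a signal used correctly from one that merely appears in a comment, so a real grader would likely need something stricter.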

Conclusion

Skills differed in 6/10 prompts; subskill refs differed in 7/10 prompts; 6/26 verification signals differed.
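
The skills figure in the conclusion can be reproduced from the per-prompt results. A small sketch, assuming a timed-out treatment session counts as a difference (the report does not state this rule, but it is the reading that yields 6/10):

```python
# Skills invoked per prompt, transcribed from the per-prompt results.
control_skills = {
    1: {"html"},
    2: {"css", "html", "javascript"},
    3: {"html"},
    4: {"html", "css", "javascript"},
    5: set(),
    6: set(),
    7: {"html"},
    8: set(),
    9: set(),
    10: set(),
}

# None marks a treatment session that timed out (assumed to count as differing).
treatment_skills = {
    1: None,
    2: {"html", "css"},
    3: {"html"},
    4: None,
    5: {"html", "css", "javascript"},
    6: {"css"},
    7: None,
    8: set(),
    9: set(),
    10: set(),
}

# A prompt differs when the two sides did not invoke the same set of skills.
differed = sorted(
    prompt
    for prompt in control_skills
    if treatment_skills[prompt] != control_skills[prompt]
)

print(f"skills differed in {len(differed)}/{len(control_skills)} prompts: {differed}")
```

Comparing sets rather than strings ignores ordering, so `css, html, javascript` and `html, css, javascript` would count as a match.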