
descriptive-reference-names

differences detected
2026-03-30 · 2.1.87 (Claude Code) · Source article

Control

No changes (baseline)

Treatment

ProjectB change: Rename all 11 reference files to generic names (css/ref-1.md through ref-4.md, html/ref-1.md through ref-4.md, js/ref-1.md through ref-4.md). Update SKILL.md pointers. Keep file contents and SKILL.md descriptions identical.
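The treatment step can be sketched as a small script. This is an illustrative sketch, not the tooling used in the experiment: the function name `rename_refs`, the explicit old-to-new mapping, and the assumption that SKILL.md sits in the same directory as its reference files are all hypothetical.

```python
from pathlib import Path


def rename_refs(skill_dir: Path, mapping: dict[str, str]) -> None:
    """Rename reference files and update the matching pointers in SKILL.md.

    `mapping` maps old filenames to new ones (hypothetical example:
    {"form-patterns.md": "ref-1.md"}). File contents and the description
    text next to each pointer are left untouched.
    """
    skill_md = skill_dir / "SKILL.md"
    text = skill_md.read_text(encoding="utf-8")
    for old_name, new_name in mapping.items():
        # Rename the file on disk, then rewrite every pointer to it.
        (skill_dir / old_name).rename(skill_dir / new_name)
        text = text.replace(old_name, new_name)
    skill_md.write_text(text, encoding="utf-8")
```

Plain string replacement is enough for a sketch, but it would also rewrite any other occurrence of the old filename in SKILL.md, so a real script might want to match pointer lines more strictly.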

Skills: 6/9
Refs: 5/9
Tools: 2/9
Signals: 20/26

Grading

Control: 75.0% (27/36)
Treatment: 83.3% (30/36)
Delta: +8.3 percentage points

9 prompts graded
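The grading figures follow directly from the raw counts. A minimal sketch of the arithmetic, where the helper name `pass_rate` is an assumption introduced for illustration:

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage."""
    return 100.0 * passed / total


control = pass_rate(27, 36)
treatment = pass_rate(30, 36)
# The delta is a difference of two percentages, so it is in percentage points.
delta_points = treatment - control

print(f"control={control:.1f}% treatment={treatment:.1f}% delta={delta_points:+.1f} points")
# prints: control=75.0% treatment=83.3% delta=+8.3 points
```

The gap corresponds to 3 graded checks out of 36.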

Insights

Descriptive filenames don't matter when the SKILL.md body describes each reference. Both "form-patterns.md - form identification, validation" and "ref-1.md - form identification, validation" give Claude the same semantic signal. Claude reads the inline description, not the filename.
Neither side was consistently better: control won some prompts, treatment won others. With a high Docker timeout rate (4/20 runs failed), the remaining differences are noise, not signal.
The SKILL.md description line is what Claude actually uses. Treatment correctly selected ref-2.md for dialogs and ref-4.md for best practices by reading the description text next to each reference path.
Invest time in good SKILL.md reference descriptions, not in filename engineering. The description "dialog identification, triggers, content structure, close buttons" does the semantic heavy lifting regardless of filename.
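The point about descriptions can be made concrete with the two pointer lines quoted above. The sketch below assumes pointers follow a "path - description" layout; the parser `parse_reference_line` is hypothetical and is not how Claude reads SKILL.md internally.

```python
def parse_reference_line(line: str) -> tuple[str, str]:
    """Split a 'path - description' pointer into (path, description)."""
    path, _, description = line.partition(" - ")
    return path.strip(), description.strip()


control_line = "form-patterns.md - form identification, validation"
treatment_line = "ref-1.md - form identification, validation"

control_path, control_desc = parse_reference_line(control_line)
treatment_path, treatment_desc = parse_reference_line(treatment_line)

# The filenames differ, but the description text is identical.
print(control_path != treatment_path)   # prints: True
print(control_desc == treatment_desc)   # prints: True
```

Once the path is stripped away, the two variants carry the same text, which is consistent with the treatment still selecting the right references.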

Per-Prompt Results

#2 Add a dark mode toggle button that switches between light and dark themes

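| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 35 | 29 | |
| Duration | 62.8s | 49.5s | |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Agent(1), Bash(3), Edit(4), Read(8) | Agent(1), Bash(2), Edit(4), Read(6) | |
| Signals | 0 | 0 | = |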

#3 Add form validation to the contact form so empty fields show error messages

treatment timed out

| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 19 | timed out | --- |
| Duration | 40.6s | --- | --- |
| Skills | html | --- | --- |
| Refs | html/form-patterns.md | --- | --- |
| Tools | Edit(3), Glob(1), Read(4), Skill(1) | --- | --- |
| Signals | 3 | --- | --- |

Control signals: flux-pod, data-forge-id, forge-trigger

#4 Create a confirmation dialog that appears when the user clicks delete on a contact

| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 33 | 41 | |
| Duration | 53.7s | 100.9s | |
| Skills | css | html, css | |
| Refs | css/best-practices.md | html/ref-2.md; css/ref-4.md | |
| Tools | Agent(1), Bash(2), Edit(2), Read(9), Skill(1) | Agent(1), Bash(2), Edit(4), Read(10), Skill(2) | |
| Signals | 0 | 5 | |

Treatment signals: data-hatch-id, hatch-trigger, row-lever, forge-trigger, hatch-body

#5 Add click-to-sort functionality to the contact table columns

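| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 29 | 25 | |
| Duration | 48.9s | 55.7s | |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Agent(1), Edit(3), Glob(3), Read(6) | Agent(1), Bash(1), Edit(3), Read(6) | |
| Signals | 0 | 0 | = |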

#6 Add a fade-in animation when new contacts appear in the table

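| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 13 | 15 | |
| Duration | 32.4s | 33.3s | |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(3), Glob(1), Read(2) | Edit(4), Glob(1), Read(2) | |
| Signals | 0 | 0 | = |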

#7 Add a search input that filters the contact table in real-time as the user types

treatment timed out

| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 41 | timed out | --- |
| Duration | 87.1s | --- | --- |
| Skills | html | --- | --- |
| Refs | html/form-patterns.md, table-patterns.md | --- | --- |
| Tools | Agent(1), Bash(1), Edit(5), Glob(3), Read(8), Skill(1) | --- | --- |
| Signals | 1 | --- | --- |

Control signals: flux-pod

#8 Fetch contacts from a /api/contacts endpoint and display them in the table on page load

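control timed out

| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | timed out | 7 | --- |
| Duration | --- | 12.0s | --- |
| Skills | --- | none | --- |
| Refs | --- | none | --- |
| Tools | --- | Glob(1), Read(2) | --- |
| Signals | --- | 0 | --- |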

#9 Add a comment at the top of each file explaining what it does

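| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | 15 | = |
| Duration | 15.7s | 14.9s | |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(3), Glob(1), Read(3) | Edit(3), Glob(1), Read(3) | = |
| Signals | 0 | 0 | = |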

#10 Rename the project title in index.html from Contact Manager to Address Book

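| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 9 | 9 | = |
| Duration | 17.7s | 13.7s | |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(1), Glob(1), Grep(1), Read(1) | Edit(1), Glob(1), Grep(1), Read(1) | = |
| Signals | 0 | 0 | = |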

Totals

Control

Sessions: 8
Prompts: 8
Events: 194
Skills: html, css
Tools: Agent(4), Bash(6), Edit(24), Glob(10), Grep(1), Read(41), Skill(3)

Treatment

Sessions: 7
Prompts: 7
Events: 141
Skills: html, css
Tools: Agent(3), Bash(5), Edit(19), Glob(4), Grep(1), Read(30), Skill(2)
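The tool totals can be cross-checked against the per-prompt rows. The sketch below sums the control "Tools" cells from the eight completed control sessions (prompts #2 to #7, #9, and #10); the helper `parse_tools` is a hypothetical name introduced for illustration.

```python
import re
from collections import Counter


def parse_tools(cell: str) -> Counter:
    """Parse a cell such as 'Edit(3), Glob(1)' into a Counter of call counts."""
    return Counter({name: int(n) for name, n in re.findall(r"(\w+)\((\d+)\)", cell)})


# Control "Tools" cells, copied from the per-prompt results.
control_cells = [
    "Agent(1), Bash(3), Edit(4), Read(8)",                     # prompt 2
    "Edit(3), Glob(1), Read(4), Skill(1)",                     # prompt 3
    "Agent(1), Bash(2), Edit(2), Read(9), Skill(1)",           # prompt 4
    "Agent(1), Edit(3), Glob(3), Read(6)",                     # prompt 5
    "Edit(3), Glob(1), Read(2)",                               # prompt 6
    "Agent(1), Bash(1), Edit(5), Glob(3), Read(8), Skill(1)",  # prompt 7
    "Edit(3), Glob(1), Read(3)",                               # prompt 9
    "Edit(1), Glob(1), Grep(1), Read(1)",                      # prompt 10
]

control_totals = sum((parse_tools(cell) for cell in control_cells), Counter())
print(dict(sorted(control_totals.items())))
```

The summed counts agree with the control totals reported above: Agent(4), Bash(6), Edit(24), Glob(10), Grep(1), Read(41), Skill(3).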

Verification Signals

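| Signal | Control | Treatment | Proves |
|---|---|---|---|
| data-forge-id | yes | no | HTML form-patterns |
| flux-pod | yes | no | |
| forge-trigger | yes | yes | |
| data-hatch-id | no | yes | HTML dialog-patterns |
| hatch-trigger | no | yes | |
| hatch-body | no | yes | |
| row-lever | no | yes | HTML table-patterns |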

Conclusion

Skills differed in 3/9 prompts, subskill refs differed in 4/9 prompts, and 6/26 verification signals differed.