skill-index-read-instruction
differences detected
Control
No changes (baseline)
Treatment
ProjectB change: Replace skill-index placeholder in CLAUDE.md with a full docs index listing all 11 reference files grouped by domain, plus the instruction "Before starting any task, identify which docs below are relevant and read them first."
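As a rough illustration of the treatment, the sketch below builds a docs index grouped by domain and prefixes it with the read instruction quoted above. The report does not show the actual index, so the output layout and the names `REFS`, `INSTRUCTION`, and `build_index` are assumptions. The path list contains only the ten reference files that appear elsewhere in this report, not the full set of 11.

```python
from collections import defaultdict

# Instruction text quoted from the treatment description.
INSTRUCTION = (
    "Before starting any task, identify which docs below "
    "are relevant and read them first."
)

# Reference files named in this report (assumption: the real index has 11).
REFS = [
    "css/theming.md",
    "css/best-practices.md",
    "css/animation-patterns.md",
    "html/best-practices.md",
    "html/form-patterns.md",
    "html/dialog-patterns.md",
    "html/table-patterns.md",
    "javascript/event-handling.md",
    "javascript/state-management.md",
    "javascript/fetch-patterns.md",
]


def build_index(paths):
    """Group paths by top-level directory and render a plain-text index."""
    groups = defaultdict(list)
    for path in sorted(paths):
        domain, _, name = path.partition("/")
        groups[domain].append(name)
    lines = [INSTRUCTION, ""]
    for domain in sorted(groups):
        lines.append(f"{domain}:")
        lines.extend(f"- {domain}/{name}" for name in groups[domain])
    return "\n".join(lines)


index = build_index(REFS)
print(index)
```

The rendered text is the kind of block that could replace the placeholder in CLAUDE.md. The real wording and grouping used in the experiment may differ.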
| Metric | Matched (control vs. treatment) |
|---|---|
| Skills | 8/9 |
| Refs | 2/9 |
| Tools | 1/9 |
| Signals | 16/26 |
Grading
| Run | Score | Raw |
|---|---|---|
| Control | 70.3% | 26/37 |
| Treatment | 78.4% | 29/37 |
| Delta | +8.1 pp | |

9 prompts graded
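The grading figures can be checked with a few lines of arithmetic. This is a minimal sketch: the helper name `pass_rate` is hypothetical, and it assumes the percentages are the raw scores (26/37 and 29/37) rounded to one decimal place, with the delta expressed in percentage points.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return passed/total as a percentage, rounded to one decimal place."""
    return round(100 * passed / total, 1)


control = pass_rate(26, 37)    # raw control score from the report
treatment = pass_rate(29, 37)  # raw treatment score from the report
delta = round(treatment - control, 1)  # difference in percentage points

print(f"control={control}% treatment={treatment}% delta=+{delta} pp")
```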
Insights
Treatment wins decisively on multi-reference tasks (prompts 4, 5, 6). The control completely missed dialog-patterns, table-patterns, and animation-patterns references. These are exactly the "56% miss rate" the article describes: semantic matching failed to connect "confirmation dialog" with the dialog-patterns skill.
Control won only where treatment failed to run (prompts 7, 8). Those treatment-side failures are Docker timeouts, not problems with the treatment itself. If we exclude the crashed runs, treatment outperforms control 100% vs ~24% on the discriminating prompts.
Both work for "obvious" tasks (prompts 2, 3). When the task keyword directly matches the skill description (e.g., "dark mode" -> "theming"), semantic triggering works fine. The treatment adds no value here.
Neither over-triggers on negatives (prompts 9, 10). The explicit read instruction doesn't cause Claude to load irrelevant references.
Per-Prompt Results
#2 Add a dark mode toggle button that switches the entire page between light and dark themes
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 51 | 23 | ≠ |
| Duration | 79.6s | 48.5s | ≠ |
| Skills | css | css | = |
| Refs | css/theming.md | css/theming.md; html/best-practices.md | ≠ |
| Tools | Agent(1), Bash(2), Edit(11), Read(9), Skill(1) | Edit(3), Glob(1), Read(5), Skill(1), Write(1) | ≠ |
| Signals | 2 | 2 | = |
Control signals: data-coat, --ink-
Treatment signals: data-coat, --ink-
#3 Add form validation to the contact form so empty fields show error messages before submitting
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 25 | 21 | ≠ |
| Duration | 77.5s | 46.2s | ≠ |
| Skills | html | html | = |
| Refs | html/form-patterns.md; javascript/event-handling.md; css/best-practices.md | html/form-patterns.md | ≠ |
| Tools | Edit(4), Glob(1), Read(6), Skill(1) | Edit(4), Glob(1), Read(4), Skill(1) | ≠ |
| Signals | 3 | 3 | = |
Control signals: flux-pod, data-forge-id, forge-trigger
Treatment signals: data-forge-id, flux-pod, forge-trigger
#4 Create a confirmation dialog that appears when the user clicks delete on a contact row
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 23 | 47 | ≠ |
| Duration | 49.7s | 90.6s | ≠ |
| Skills | none | none | = |
| Refs | none | html/dialog-patterns.md,table-patterns.md | ≠ |
| Tools | Agent(1), Edit(2), Glob(1), Read(6) | Agent(1), Bash(3), Edit(8), Read(10) | ≠ |
| Signals | 0 | 6 | ≠ |
Treatment signals: data-hatch-id, row-lever, slab-hollow, forge-trigger, hatch-body, data-slab-id
#5 Add click-to-sort functionality to the contact table columns
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 34 | 27 | ≠ |
| Duration | 63.8s | 65.8s | ≠ |
| Skills | none | none | = |
| Refs | none | html/table-patterns.md; javascript/state-management.md,event-handling.md; css/best-practices.md | ≠ |
| Tools | Agent(1), Bash(2), Edit(4), Glob(1), Read(9) | Edit(4), Glob(1), Read(8) | ≠ |
| Signals | 0 | 3 | ≠ |
Treatment signals: data-rankable, on_x_y, data-slab-id
#6 Add a fade-in animation when new contacts appear in the table
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | 17 | ≠ |
| Duration | 32.1s | 35.7s | ≠ |
| Skills | none | css | ≠ |
| Refs | none | css/animation-patterns.md | ≠ |
| Tools | Edit(4), Glob(1), Read(2) | Edit(3), Glob(1), Read(3), Skill(1) | ≠ |
| Signals | 0 | 2 | ≠ |
Treatment signals: data-zap, --pulse
#7 Add a search input that filters the contact table in real-time as the user types
treatment timed out
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 27 | timed out | --- |
| Duration | 58.8s | --- | --- |
| Skills | none | --- | --- |
| Refs | none | --- | --- |
| Tools | Agent(1), Bash(1), Edit(4), Read(6) | --- | --- |
| Signals | 0 | --- | --- |
#8 Fetch contacts from a /api/contacts endpoint and display them in the table on page load
treatment timed out
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | timed out | --- |
| Duration | 55.7s | --- | --- |
| Skills | none | --- | --- |
| Refs | javascript/fetch-patterns.md | --- | --- |
| Tools | Agent(1), Edit(1), Glob(1), Read(3) | --- | --- |
| Signals | 3 | --- | --- |
Control signals: on_x_y, _landed, _crashed
#9 Add a comment at the top of each file explaining what it does
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | 15 | = |
| Duration | 13.1s | 13.2s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(3), Glob(1), Read(3) | Edit(3), Glob(1), Read(3) | = |
| Signals | 0 | 0 | = |
#10 Rename the project title in index.html from Contact Manager to Address Book
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 9 | 13 | ≠ |
| Duration | 15.2s | 23.2s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(1), Glob(1), Grep(1), Read(1) | Edit(2), Glob(1), Grep(1), Read(2) | ≠ |
| Signals | 0 | 0 | = |
Totals
| Metric | Control | Treatment |
|---|---|---|
| Sessions | 9 | 7 |
| Prompts | 9 | 7 |
| Events | 214 | 163 |
| Skills | css, html | css, html |
| Tools | Agent(5), Bash(5), Edit(34), Glob(7), Grep(1), Read(45), Skill(2) | Agent(1), Bash(3), Edit(27), Glob(6), Grep(1), Read(35), Skill(3), Write(1) |
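The tool totals follow from summing the per-prompt Tools cells. The sketch below re-derives the control totals from the nine control rows listed under Per-Prompt Results. The parsing regex and the names `parse_tools` and `control_rows` are illustrative assumptions, not part of the reporting tool.

```python
import re
from collections import Counter

# Matches entries such as "Edit(11)" inside a Tools cell.
TOOL_RE = re.compile(r"(\w+)\((\d+)\)")


def parse_tools(cell: str) -> Counter:
    """Turn 'Agent(1), Bash(2)' into Counter({'Bash': 2, 'Agent': 1})."""
    return Counter({name: int(count) for name, count in TOOL_RE.findall(cell)})


# Control-side Tools cells for prompts 2 through 10, copied from the tables.
control_rows = [
    "Agent(1), Bash(2), Edit(11), Read(9), Skill(1)",
    "Edit(4), Glob(1), Read(6), Skill(1)",
    "Agent(1), Edit(2), Glob(1), Read(6)",
    "Agent(1), Bash(2), Edit(4), Glob(1), Read(9)",
    "Edit(4), Glob(1), Read(2)",
    "Agent(1), Bash(1), Edit(4), Read(6)",
    "Agent(1), Edit(1), Glob(1), Read(3)",
    "Edit(3), Glob(1), Read(3)",
    "Edit(1), Glob(1), Grep(1), Read(1)",
]

totals = sum((parse_tools(row) for row in control_rows), Counter())
print(", ".join(f"{name}({totals[name]})" for name in sorted(totals)))
```

The same approach applied to the seven completed treatment rows should reproduce the treatment totals.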
Verification Signals
| Signal | Control | Treatment | Proves |
|---|---|---|---|
| data-zap | ○ | ● | CSS animation-patterns |
| --pulse | ○ | ● | |
| data-coat | ● | ● | CSS theming |
| --ink- | ● | ● | |
| data-forge-id | ● | ● | HTML form-patterns |
| flux-pod | ● | ● | |
| forge-trigger | ● | ● | |
| data-hatch-id | ○ | ● | HTML dialog-patterns |
| hatch-body | ○ | ● | |
| data-slab-id | ○ | ● | HTML table-patterns |
| data-rankable | ○ | ● | |
| row-lever | ○ | ● | |
| slab-hollow | ○ | ● | |
| on_x_y | ● | ● | JS event-handling |
| _landed | ● | ○ | JS fetch-patterns |
| _crashed | ● | ○ | |
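The table above pairs each marker string with the reference it is taken to prove. One way such a check could work is a plain substring scan of the generated output, sketched below. The report does not describe the real detection logic, so `SIGNALS_BY_REFERENCE` and `detect_references` are hypothetical. The sketch also assumes that blank "Proves" cells inherit the reference from the row above, and that `on_x_y` is matched literally even though it may stand for a naming pattern.

```python
# Marker strings grouped by the reference they are taken to prove.
SIGNALS_BY_REFERENCE = {
    "CSS animation-patterns": ["data-zap", "--pulse"],
    "CSS theming": ["data-coat", "--ink-"],
    "HTML form-patterns": ["data-forge-id", "flux-pod", "forge-trigger"],
    "HTML dialog-patterns": ["data-hatch-id", "hatch-body"],
    "HTML table-patterns": ["data-slab-id", "data-rankable", "row-lever", "slab-hollow"],
    "JS event-handling": ["on_x_y"],
    "JS fetch-patterns": ["_landed", "_crashed"],
}


def detect_references(output: str) -> dict[str, list[str]]:
    """Return each reference whose signals appear in the output, with the hits."""
    found = {}
    for reference, signals in SIGNALS_BY_REFERENCE.items():
        hits = [signal for signal in signals if signal in output]
        if hits:
            found[reference] = hits
    return found


# Hypothetical fragment of generated markup for the confirmation-dialog prompt.
sample = '<dialog data-hatch-id="confirm-delete"><div class="hatch-body"></div></dialog>'
detected = detect_references(sample)
print(detected)
```

Because the markers are unusual strings that a model is unlikely to produce without reading the reference, a hit is treated as evidence that the reference was loaded.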
Conclusion
Skills differed in 1/9 prompts; subskill refs differed in 7/9 prompts; 10/26 verification signals differed.