pure-routing-skill-body
differences detected
Control
No changes (baseline)
Treatment
ProjectB change: Rewrite all three SKILL.md bodies to pure routing format. Remove inline context lines (e.g., "Layouts use a data-rack attribute system"). Keep only numbered steps pointing to references. All domain knowledge stays exclusively in reference files.
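To make the two body formats concrete, here is a minimal Python sketch of a check that separates a pure routing body from a mixed one. The sample bodies, the step wording, and the `non_routing_lines` / `is_pure_routing` helpers are hypothetical illustrations, not the actual SKILL.md contents or the test harness. Only the inline context line and the `css/theming.md` path are taken from this report, and pairing them in one body is an assumption made for brevity.

```python
import re

# Assumption: a routing step is a numbered line that points at a reference file (*.md).
STEP = re.compile(r"^\d+\.\s+.*\b[\w/-]+\.md\b")


def non_routing_lines(body: str) -> list[str]:
    """Return the non-blank lines that are not numbered steps pointing to a reference."""
    stripped = (line.strip() for line in body.splitlines())
    return [line for line in stripped if line and not STEP.match(line)]


def is_pure_routing(body: str) -> bool:
    """A body is pure routing when every non-blank line is a routing step."""
    return not non_routing_lines(body)


# Hypothetical control-style body: one inline context line plus routing steps.
MIXED = """\
Layouts use a data-rack attribute system.
1. Read css/theming.md before changing theme styles.
2. Apply the patterns described in css/theming.md.
"""

# Hypothetical treatment-style body: routing steps only.
PURE = """\
1. Read css/theming.md before changing theme styles.
2. Apply the patterns described in css/theming.md.
"""

print(is_pure_routing(MIXED))    # False
print(is_pure_routing(PURE))     # True
print(non_routing_lines(MIXED))  # ['Layouts use a data-rack attribute system.']
```

A check like this only classifies the body format; it says nothing about whether the references are actually read, which is what the experiment measures.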
| Metric | Match |
|---|---|
| Skills | 8/8 |
| Refs | 7/8 |
| Tools | 2/8 |
| Signals | 26/26 |
Grading
| Side | Score | Ratio |
|---|---|---|
| Control | 87.1% | 27/31 |
| Treatment | 100.0% | 31/31 |
| Delta | +12.9 pp | |
8 prompts graded
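The grading figures reduce to simple arithmetic, sketched below. The `score` helper and its parameter names are hypothetical; the report does not say what unit the 31 graded items are, so the code treats them as generic points. Note that the delta is a difference of two percentages, so it is in percentage points.

```python
def score(earned: int, possible: int) -> float:
    """Percentage of graded points earned, rounded to one decimal place."""
    return round(100 * earned / possible, 1)


control = score(27, 31)
treatment = score(31, 31)
delta_pp = round(treatment - control, 1)  # difference in percentage points

print(f"Control:   {control}%")       # Control:   87.1%
print(f"Treatment: {treatment}%")     # Treatment: 100.0%
print(f"Delta:     +{delta_pp} pp")   # Delta:     +12.9 pp
```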
Insights
Both sides loaded identical skills and references. Pure routing format didn't cause more or fewer reference reads. The "Do NOT proceed on training knowledge alone" disclaimer had no measurable effect.
Removing inline context hints didn't hurt or help. "Layouts use a data-rack attribute system" in control didn't give Claude enough to skip references, and removing it didn't force more reads.
The delta comes entirely from a Docker timeout on prompt #4 (the control side failed), not from the treatment design. The 7 prompts that completed on both sides scored identically at 100%.
Body format (pure routing vs mixed) is neutral for reference loading in this test setup. The format of SKILL.md instructions matters less than what the references themselves contain.
Per-Prompt Results
#2 Add a dark mode toggle button that switches between light and dark themes
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 43 | 39 | ≠ |
| Duration | 70.7s | 73.1s | ≠ |
| Skills | css | css | = |
| Refs | css/theming.md | css/theming.md | = |
| Tools | Agent(1), Bash(4), Edit(3), Read(10), Skill(1), Write(1) | Agent(1), Bash(3), Edit(3), Read(9), Skill(1), Write(1) | ≠ |
| Signals | 2 | 2 | = |
Control signals: data-coat, --ink-
Treatment signals: data-coat, --ink-
#3 Add form validation to the contact form so empty fields show error messages
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 27 | 19 | ≠ |
| Duration | 61.5s | 49.5s | ≠ |
| Skills | html | html | = |
| Refs | html/form-patterns.md | html/form-patterns.md | = |
| Tools | Edit(7), Glob(1), Read(4), Skill(1) | Edit(3), Glob(1), Read(4), Skill(1) | ≠ |
| Signals | 3 | 3 | = |
Control signals: flux-pod, data-forge-id, forge-trigger
Treatment signals: flux-pod, data-forge-id, forge-trigger
#4 Create a confirmation dialog that appears when the user clicks delete on a contact
control timed out
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | timed out | 31 | --- |
| Duration | --- | 57.4s | --- |
| Skills | --- | none | --- |
| Refs | --- | none | --- |
| Tools | --- | Agent(1), Bash(3), Edit(1), Read(9) | --- |
| Signals | --- | 0 | --- |
#5 Add click-to-sort functionality to the contact table columns
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 25 | 15 | ≠ |
| Duration | 46.3s | 20.2s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Agent(1), Bash(1), Edit(3), Read(6) | Edit(3), Glob(1), Read(3) | ≠ |
| Signals | 0 | 0 | = |
#6 Add a fade-in animation when new contacts appear in the table
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | 13 | ≠ |
| Duration | 48.9s | 28.7s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(4), Glob(1), Read(2) | Edit(3), Glob(1), Read(2) | ≠ |
| Signals | 0 | 0 | = |
#8 Fetch contacts from a /api/contacts endpoint and display them in the table on page load
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 23 | 17 | ≠ |
| Duration | 24.6s | 20.0s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Agent(1), Bash(3), Read(6) | Agent(1), Glob(3), Read(3) | ≠ |
| Signals | 0 | 0 | = |
#9 Add a comment at the top of each file explaining what it does
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | 15 | = |
| Duration | 11.8s | 13.7s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(3), Glob(1), Read(3) | Edit(3), Glob(1), Read(3) | = |
| Signals | 0 | 0 | = |
#10 Rename the project title in index.html from Contact Manager to Address Book
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 9 | 9 | = |
| Duration | 13.6s | 13.6s | = |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(1), Glob(1), Grep(1), Read(1) | Edit(1), Glob(1), Grep(1), Read(1) | = |
| Signals | 0 | 0 | = |
Totals
| Side | Sessions | Prompts | Events | Skills | Tools |
|---|---|---|---|---|---|
| Control | 7 | 7 | 157 | css, html | Agent(3), Bash(8), Edit(21), Glob(4), Grep(1), Read(32), Skill(2), Write(1) |
| Treatment | 8 | 8 | 158 | css, html | Agent(3), Bash(6), Edit(17), Glob(8), Grep(1), Read(34), Skill(2), Write(1) |
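The Control tool totals can be cross-checked against the per-prompt rows. The sketch below parses each `Tools` cell of the form `Name(count)` and sums the counts; the `parse_tools` helper is a hypothetical name, and the input strings are copied from the seven Control rows that completed (#2, #3, #5, #6, #8, #9, #10).

```python
import re
from collections import Counter


def parse_tools(cell: str) -> Counter:
    """Turn a cell like 'Edit(3), Read(4)' into Counter({'Read': 4, 'Edit': 3})."""
    return Counter({name: int(n) for name, n in re.findall(r"(\w+)\((\d+)\)", cell)})


# Control-side Tools cells from the per-prompt tables (prompt #4 timed out, so it is absent).
CONTROL_ROWS = [
    "Agent(1), Bash(4), Edit(3), Read(10), Skill(1), Write(1)",  # #2
    "Edit(7), Glob(1), Read(4), Skill(1)",                       # #3
    "Agent(1), Bash(1), Edit(3), Read(6)",                       # #5
    "Edit(4), Glob(1), Read(2)",                                 # #6
    "Agent(1), Bash(3), Read(6)",                                # #8
    "Edit(3), Glob(1), Read(3)",                                 # #9
    "Edit(1), Glob(1), Grep(1), Read(1)",                        # #10
]

total = sum((parse_tools(row) for row in CONTROL_ROWS), Counter())
formatted = ", ".join(f"{name}({count})" for name, count in sorted(total.items()))
print(formatted)
# Agent(3), Bash(8), Edit(21), Glob(4), Grep(1), Read(32), Skill(2), Write(1)
```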
Verification Signals
| Signal | Control | Treatment | Proves |
|---|---|---|---|
| data-coat | ● | ● | CSS theming |
| --ink- | ● | ● | CSS theming |
| data-forge-id | ● | ● | HTML form-patterns |
| flux-pod | ● | ● | HTML form-patterns |
| forge-trigger | ● | ● | HTML form-patterns |
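One plausible reading of the signals is that each is a marker string that only a specific reference file would teach, so finding it in a session's output is evidence that the reference was read. The sketch below assumes plain substring matching; the `detect_signals` helper and the sample output are hypothetical, and the signal-to-reference mapping is inferred from the per-prompt tables rather than taken from the harness.

```python
# Signal -> reference it is assumed to prove (inferred from prompts #2 and #3).
SIGNALS = {
    "data-coat": "css/theming.md",
    "--ink-": "css/theming.md",
    "data-forge-id": "html/form-patterns.md",
    "flux-pod": "html/form-patterns.md",
    "forge-trigger": "html/form-patterns.md",
}


def detect_signals(output: str) -> dict[str, str]:
    """Return the signals found in the output, mapped to the reference they point to."""
    return {signal: ref for signal, ref in SIGNALS.items() if signal in output}


# Hypothetical snippet of generated markup for the dark mode prompt.
sample = '<body data-coat="dark" style="color: var(--ink-primary)">'
found = detect_signals(sample)

print(sorted(found))               # ['--ink-', 'data-coat']
print(sorted(set(found.values()))) # ['css/theming.md']
```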
Conclusion
Subskill refs differed in 1/8 prompts (#4, where the control run timed out and produced no data to compare).