description-with-examples
differences detected
Control
No changes (baseline)
Treatment
ProjectB change: Append 2-3 concrete example prompts to each skill description field. E.g., CSS description gets "Examples: 'add dark mode', 'make the sidebar responsive', 'add hover animations'". Same domain terms, different format.
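The treatment can be sketched as a small helper that appends quoted example prompts to a description string. This is only an illustration: the function name `append_examples` and the base description text are hypothetical, and the report does not show how the descriptions were actually edited. Only the `Examples: '...'` format and the three CSS example prompts come from the change described above.

```python
def append_examples(description: str, examples: list[str]) -> str:
    """Append quoted example prompts to a skill description (hypothetical helper)."""
    quoted = ", ".join(f"'{e}'" for e in examples)
    return f"{description.rstrip()} Examples: {quoted}"


# Placeholder text: the real CSS skill description is not shown in this report.
css_description = "Styling, layout, theming, and animation for the project."
css_examples = ["add dark mode", "make the sidebar responsive", "add hover animations"]

treated = append_examples(css_description, css_examples)
print(treated)
```

The domain terms in the description stay unchanged; only the trailing example list is new, which matches the "same domain terms, different format" framing of the treatment.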
Skills
8/10
Refs
7/10
Tools
3/10
Signals
25/26
Grading
Control
97.6%
40/41
Treatment
80.5%
33/41
Delta
-17.1 pp
10 prompts graded
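The grading figures can be reproduced from the pass counts. The sketch below assumes the delta is the simple difference between the two pass rates (percentage points), which is what the reported numbers are consistent with; the name `pass_rate` is illustrative.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage."""
    return 100.0 * passed / total


control = pass_rate(40, 41)    # control: 40 of 41 checks passed
treatment = pass_rate(33, 41)  # treatment: 33 of 41 checks passed
delta = treatment - control    # difference in percentage points

print(f"{control:.1f}% {treatment:.1f}% {delta:+.1f} pp")
# prints: 97.6% 80.5% -17.1 pp
```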
Insights
Examples act as attractors that bias skill selection. "Add hover animations" in the CSS description improved prompt #7 (50%->100%) by acting as a pattern-match anchor. But the same bias may have competed with JS skill activation on prompt #2.
Prompt #2 (form validation) degraded from 100% to 50%. Control loaded the JS signals (zap(), on_x_y, createVault(), linkVault(); treatment missed all four. The example "add form validation" in the HTML description may have biased Claude toward HTML-only handling when the task also needed JS.
Examples help exact matches, hurt cross-domain tasks. When a prompt aligns with one example, it strengthens that skill. But multi-skill tasks get pulled toward the skill with the closest example, starving other relevant skills.
The net effect is negative because most real tasks are cross-domain. Single-skill prompts already trigger correctly without examples (experiment #2 showed this). The value of examples is on indirect prompts, but those are precisely the ones that need multi-skill coordination.
Per-Prompt Results
#1 Add a dark mode toggle button that switches between light and dark themes
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 33 | 37 | ≠ |
| Duration | 79.2s | 76.8s | ≠ |
| Skills | css | css | = |
| Refs | css/theming.md | css/theming.md | = |
| Tools | Agent(1), Bash(2), Edit(3), Read(7), Skill(1), Write(1) | Agent(1), Bash(3), Edit(3), Read(8), Skill(1), Write(1) | ≠ |
| Signals | 2 | 2 | = |
Control signals: data-coat, --ink-
Treatment signals: data-coat, --ink-
#2 Add form validation to the contact form so empty fields show error messages
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 33 | 23 | ≠ |
| Duration | 130.0s | 59.4s | ≠ |
| Skills | html, css, javascript | html | ≠ |
| Refs | html/form-patterns.md; css/best-practices.md; javascript/event-handling.md,state-management.md | html/form-patterns.md | ≠ |
| Tools | Edit(2), Glob(1), Read(7), Skill(3), Write(3) | Edit(4), Glob(2), Read(4), Skill(1) | ≠ |
| Signals | 7 | 3 | ≠ |
Control signals: data-forge-id, flux-pod, zap(), on_x_y, createVault(), forge-trigger, linkVault(
Treatment signals: data-forge-id, flux-pod, forge-trigger
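The signals the treatment run missed on this prompt can be listed with an order-preserving difference. This is a minimal sketch using the two signal lists above; the helper name `missing_signals` is hypothetical and not part of the experiment tooling.

```python
control_signals = [
    "data-forge-id", "flux-pod", "zap()", "on_x_y",
    "createVault()", "forge-trigger", "linkVault(",
]
treatment_signals = ["data-forge-id", "flux-pod", "forge-trigger"]


def missing_signals(control: list[str], treatment: list[str]) -> list[str]:
    """Return control signals absent from treatment, keeping control order."""
    seen = set(treatment)
    return [s for s in control if s not in seen]


missed = missing_signals(control_signals, treatment_signals)
print(missed)
# prints: ['zap()', 'on_x_y', 'createVault()', 'linkVault(']
```

All four missed signals are the ones the Verification Signals table attributes to the JS event-handling and state-management references.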
#3 Fetch contacts from a /api/contacts endpoint and display them in the table on page load
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 17 | 7 | ≠ |
| Duration | 21.7s | 11.0s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Agent(1), Glob(3), Read(3) | Glob(1), Read(2) | ≠ |
| Signals | 0 | 0 | = |
#4 Make the page look good on phones and tablets
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 47 | 23 | ≠ |
| Duration | 163.4s | 85.9s | ≠ |
| Skills | css | css | = |
| Refs | css/best-practices.md,layout-patterns.md | css/best-practices.md,layout-patterns.md | = |
| Tools | Agent(1), Bash(1), Edit(9), Glob(3), Read(7), Skill(1) | Edit(5), Glob(1), Read(4), Skill(1) | ≠ |
| Signals | 2 | 2 | = |
Control signals: data-rack, --seam
Treatment signals: data-rack, --seam
#5 Add a way for the user to confirm before removing a contact
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 17 | 9 | ≠ |
| Duration | 61.0s | 11.6s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Agent(1), Bash(1), Edit(1), Read(4) | Glob(1), Read(3) | ≠ |
| Signals | 0 | 0 | = |
#6 When someone types in the search box, update the results live
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 17 | 17 | = |
| Duration | 51.4s | 89.2s | ≠ |
| Skills | javascript | javascript | = |
| Refs | javascript/event-handling.md,state-management.md | javascript/event-handling.md,state-management.md | = |
| Tools | Edit(1), Glob(1), Read(4), Skill(1), Write(1) | Edit(1), Glob(1), Read(4), Skill(1), Write(1) | = |
| Signals | 4 | 4 | = |
Control signals: zap(), on_x_y, createVault(), linkVault(
Treatment signals: zap(), on_x_y, createVault(), linkVault(
#7 Add visual feedback when hovering over table rows
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 9 | 13 | ≠ |
| Duration | 26.2s | 50.8s | ≠ |
| Skills | none | css | ≠ |
| Refs | none | css/animation-patterns.md,best-practices.md | ≠ |
| Tools | Edit(1), Glob(1), Grep(1), Read(1) | Edit(1), Glob(1), Read(3), Skill(1) | ≠ |
| Signals | 0 | 1 | ≠ |
Treatment signals: --pulse
#8 Store the current filter state so it persists while the user navigates
treatment timed out
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 34 | timed out | --- |
| Duration | 74.2s | --- | --- |
| Skills | none | --- | --- |
| Refs | none | --- | --- |
| Tools | Agent(1), Bash(4), Edit(4), Read(7) | --- | --- |
| Signals | 0 | --- | --- |
#9 Add a comment at the top of each file explaining what it does
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | 15 | = |
| Duration | 22.7s | 25.7s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(3), Glob(1), Read(3) | Edit(3), Glob(1), Read(3) | = |
| Signals | 0 | 0 | = |
#10 Rename the project title from Contact Manager to Address Book
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 13 | 13 | = |
| Duration | 33.4s | 38.4s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(3), Grep(1), Read(2) | Edit(3), Grep(1), Read(2) | = |
| Signals | 0 | 0 | = |
Totals
Control
Sessions
10
Prompts
10
Events
235
Skills: css, html, javascript
Tools: Agent(5), Bash(8), Edit(27), Glob(10), Grep(2), Read(45), Skill(6), Write(5)
Treatment
Sessions
9
Prompts
9
Events
157
Skills: css, html, javascript
Tools: Agent(1), Bash(3), Edit(20), Glob(8), Grep(1), Read(33), Skill(5), Write(2)
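The tool totals can be cross-checked against the per-prompt rows. The sketch below sums the nine completed treatment rows (prompt #8 timed out and contributes nothing); the parsing approach and the function names are assumptions for illustration, not the report's own tooling.

```python
import re
from collections import Counter

# Treatment "Tools" cells from prompts #1-#7, #9, and #10.
treatment_rows = [
    "Agent(1), Bash(3), Edit(3), Read(8), Skill(1), Write(1)",
    "Edit(4), Glob(2), Read(4), Skill(1)",
    "Glob(1), Read(2)",
    "Edit(5), Glob(1), Read(4), Skill(1)",
    "Glob(1), Read(3)",
    "Edit(1), Glob(1), Read(4), Skill(1), Write(1)",
    "Edit(1), Glob(1), Read(3), Skill(1)",
    "Edit(3), Glob(1), Read(3)",
    "Edit(3), Grep(1), Read(2)",
]


def total_tools(rows: list[str]) -> Counter:
    """Sum Name(count) entries across all rows."""
    totals = Counter()
    for row in rows:
        for name, count in re.findall(r"(\w+)\((\d+)\)", row):
            totals[name] += int(count)
    return totals


def format_totals(totals: Counter) -> str:
    """Render totals alphabetically in the report's Name(count) style."""
    return ", ".join(f"{name}({totals[name]})" for name in sorted(totals))


summary = format_totals(total_tools(treatment_rows))
print(summary)
# prints: Agent(1), Bash(3), Edit(20), Glob(8), Grep(1), Read(33), Skill(5), Write(2)
```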
Verification Signals
| Signal | Control | Treatment | Proves |
|---|---|---|---|
| data-rack | ● | ● | CSS layout-patterns |
| --seam | ● | ● | |
| --pulse | ○ | ● | CSS animation-patterns |
| data-coat | ● | ● | CSS theming |
| --ink- | ● | ● | |
| data-forge-id | ● | ● | HTML form-patterns |
| flux-pod | ● | ● | |
| forge-trigger | ● | ● | |
| zap() | ● | ● | JS event-handling |
| on_x_y | ● | ● | |
| createVault() | ● | ● | JS state-management |
| linkVault( | ● | ● | |
Conclusion
Skills differed in 2/10 prompts; subskill refs differed in 3/10 prompts; 1/26 verification signals differed.