short-vs-long-descriptions
differences detected
Control
No changes (baseline)
Treatment
ProjectB change: Shorten all three skill descriptions to generic ~30-36 char versions that remove domain terms (layout, animation, theming, forms, dialogs, events, state, fetch).
Skills
6/9
Refs
4/9
Tools
1/9
Signals
22/26
Grading
Control
75.0%
24/32
Treatment
84.4%
27/32
Delta
+9.4%
9 prompts graded
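The grading percentages above follow directly from the raw counts. A minimal sketch that recomputes them; the helper name `pass_rate` is hypothetical, and the delta is a difference in percentage points rather than a relative change:

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage, rounded to one decimal place."""
    return round(100 * passed / total, 1)

# Raw counts taken from the Grading summary.
control = pass_rate(24, 32)
treatment = pass_rate(27, 32)

# Difference in percentage points, rounded to hide floating-point noise.
delta = round(treatment - control, 1)

print(f"Control {control}%  Treatment {treatment}%  Delta {delta:+}%")
```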
Insights
Stripping domain terms from descriptions significantly hurts trigger accuracy. Prompt #1 ("CSS animations") dropped from 100% to 33% because "animation" was removed from the description. The Treatment wording, "CSS styling and visual conventions", doesn't bridge to "animations."
Prompt #6 ("search box, update results live") dropped from 100% to 33%. Control has "events, state management, and API fetching", which connects to search filtering. Treatment's "JavaScript behavior conventions" is too vague to make that connection.
Domain terms in descriptions are the primary semantic bridge between user intent and skill activation. This confirms experiment #2 from the opposite direction: adding keyword soup hurts (exp #2), and removing domain terms also hurts (exp #6). The sweet spot is natural language with domain-specific vocabulary.
Description length per se doesn't matter; content does. The difference isn't 50 chars vs 70 chars. It's whether the description contains terms like "animation, theming, layout". A 50-char description with the right terms would work fine.
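One way to act on this insight is to check a description for domain vocabulary before shipping it. The sketch below is only an illustration: `DOMAIN_TERMS` and `domain_terms_in` are hypothetical names, and substring matching is a crude proxy, since actual skill activation depends on the model and not on keyword lookup. The sample strings are the description fragments quoted in the insights above.

```python
# Domain terms the Treatment removed, as listed in the experiment summary.
DOMAIN_TERMS = ["layout", "animation", "theming", "forms",
                "dialogs", "events", "state", "fetch"]

def domain_terms_in(description: str) -> list[str]:
    """Return the domain terms that occur as substrings of a description."""
    text = description.lower()
    return [term for term in DOMAIN_TERMS if term in text]

# Description fragments quoted in the insights (not the full descriptions).
samples = {
    "control js": "events, state management, and API fetching",
    "treatment js": "JavaScript behavior conventions",
    "treatment css": "CSS styling and visual conventions",
}

for name, desc in samples.items():
    found = domain_terms_in(desc)
    print(f"{name}: {found if found else 'no domain terms'}")
```

A check like this only flags descriptions that carry no domain vocabulary at all. It says nothing about keyword stuffing, which experiment #2 found to hurt as well.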
Per-Prompt Results
#1 Add CSS animations to the contact table rows
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | 23 | ≠ |
| Duration | 25.8s | 38.7s | ≠ |
| Skills | css | none | ≠ |
| Refs | css/animation-patterns.md | none | ≠ |
| Tools | Edit(2), Glob(1), Read(3), Skill(1) | Agent(1), Bash(2), Edit(2), Read(5) | ≠ |
| Signals | 2 | 0 | ≠ |
Control signals: data-zap, --pulse
#3 Add event listeners to handle clicking the delete button
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 23 | 17 | ≠ |
| Duration | 26.3s | 22.3s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Agent(1), Bash(3), Read(6) | Agent(1), Glob(3), Read(3) | ≠ |
| Signals | 0 | 0 | = |
#4 Make the page look good on phones and tablets
control timed out
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | timed out | 23 | --- |
| Duration | --- | 48.0s | --- |
| Skills | --- | none | --- |
| Refs | --- | none | --- |
| Tools | --- | Agent(1), Bash(1), Edit(1), Glob(1), Read(6) | --- |
| Signals | --- | 0 | --- |
#5 Add a way for the user to confirm before removing a contact
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 19 | 29 | ≠ |
| Duration | 53.0s | 38.5s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Agent(1), Bash(1), Edit(1), Glob(1), Read(4) | Agent(1), Bash(3), Grep(1), Read(8) | ≠ |
| Signals | 0 | 0 | = |
#6 When someone types in the search box, update the results live
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 21 | 31 | ≠ |
| Duration | 57.0s | 52.3s | ≠ |
| Skills | javascript | none | ≠ |
| Refs | javascript/event-handling.md,state-management.md | none | ≠ |
| Tools | Edit(3), Glob(1), Read(5), Skill(1) | Agent(1), Bash(2), Edit(5), Read(6) | ≠ |
| Signals | 2 | 0 | ≠ |
Control signals: zap(), on_x_y
#7 Add visual feedback when hovering over table rows
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 13 | 7 | ≠ |
| Duration | 25.7s | 12.8s | ≠ |
| Skills | css | none | ≠ |
| Refs | css/animation-patterns.md,best-practices.md | none | ≠ |
| Tools | Edit(1), Glob(1), Read(3), Skill(1) | Edit(1), Glob(1), Read(1) | ≠ |
| Signals | 1 | 0 | ≠ |
Control signals: --pulse
#8 Store the current filter state so it persists while the user navigates
control timed out
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | timed out | 31 | --- |
| Duration | --- | 67.6s | --- |
| Skills | --- | none | --- |
| Refs | --- | none | --- |
| Tools | --- | Agent(1), Bash(2), Edit(4), Read(7) | --- |
| Signals | --- | 0 | --- |
#9 Add a comment at the top of each file explaining what it does
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 15 | 15 | = |
| Duration | 12.8s | 15.4s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(3), Glob(1), Read(3) | Edit(3), Glob(1), Read(3) | = |
| Signals | 0 | 0 | = |
#10 Rename the project title from Contact Manager to Address Book
| Metric | Control | Treatment | Match |
|---|---|---|---|
| Events | 13 | 9 | ≠ |
| Duration | 19.8s | 15.2s | ≠ |
| Skills | none | none | = |
| Refs | none | none | = |
| Tools | Edit(3), Grep(1), Read(2) | Edit(2), Grep(1), Read(1) | ≠ |
| Signals | 0 | 0 | = |
Totals
Control
Sessions
7
Prompts
7
Events
119
Skills: css, javascript
Tools: Agent(2), Bash(4), Edit(13), Glob(5), Grep(1), Read(26), Skill(3)
Treatment
Sessions
9
Prompts
9
Events
185
Skills: none
Tools: Agent(6), Bash(10), Edit(18), Glob(6), Grep(2), Read(40)
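The session and event totals above can be cross-checked against the per-prompt tables. A small sketch, assuming the two Control timeouts (prompts #4 and #8) are excluded from the Control totals, which is consistent with the 7 sessions reported; the variable names are illustrative:

```python
# Events per prompt as (control, treatment), copied from the per-prompt tables.
# None marks a Control run that timed out.
events = {
    1: (15, 23), 3: (23, 17), 4: (None, 23), 5: (19, 29), 6: (21, 31),
    7: (13, 7), 8: (None, 31), 9: (15, 15), 10: (13, 9),
}

# Drop timed-out Control runs; every Treatment run completed.
control = [c for c, _ in events.values() if c is not None]
treatment = [t for _, t in events.values()]

print(f"Control:   {len(control)} sessions, {sum(control)} events")
print(f"Treatment: {len(treatment)} sessions, {sum(treatment)} events")
```

Because Control has two fewer completed sessions, the raw event totals are not directly comparable. Per-session averages would be a fairer comparison.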
Verification Signals
| Signal | Control | Treatment | Proves |
|---|---|---|---|
| data-zap | ● | ○ | CSS animation-patterns |
| --pulse | ● | ○ | |
| zap() | ● | ○ | JS event-handling |
| on_x_y | ● | ○ | |
Conclusion
Skills differed in 3/9 prompts; subskill refs differed in 5/9 prompts; 4/26 verification signals differed.