♻️ refactor(agents,skills): optimize invocation system with context-efficient architecture

## Major Changes

**Agent System Overhaul:**
- Added 3 specialized implementation agents (angular-developer, test-writer, refactor-engineer)
- 🗑️ Removed 7 redundant agents (debugger, error-detective, deployment-engineer, prompt-engineer, search-specialist, technical-writer, ui-ux-designer)
- 📝 Updated all 9 agent descriptions with action-focused, PROACTIVELY-triggered patterns
- 🔧 Net reduction: 16 → 9 agents (44% reduction)

**Description Pattern Standardization:**
- **Agents**: "[Action] + what. Use PROACTIVELY when [specific triggers]. [Features]."
- **Skills**: "This skill should be used when [triggers]. [Capabilities]."
- Removed ambiguous "use proactively" without conditions
- Added measurable triggers (file counts, keywords, thresholds)

**CLAUDE.md Enhancements:**
- 📚 Added "Agent Design Principles" based on Anthropic research
- Added "Proactive Agent Invocation" rules for automatic delegation
- 🎯 Added response format control (concise vs detailed)
- 🔄 Added environmental feedback patterns
- 🛡️ Added poka-yoke error-proofing guidelines
- 📊 Added token efficiency benchmarks (98.7% reduction via code execution)
- 🗂️ Added context chunking strategy for retrieval
- 🏗️ Documented Orchestrator-Workers pattern

**Context Management:**
- 🔄 Converted context-manager from MCP memory to file-based (.claude/context/)
- Added implementation-state tracking for session resumption
- Team-shared context in git (not personal MCP storage)

**Skills Updated (5):**
- api-change-analyzer: Condensed, added trigger keywords
- architecture-enforcer: Standardized "This skill should be used when"
- circular-dependency-resolver: Added build failure triggers
- git-workflow: Added missing trigger keywords
- library-scaffolder: Condensed implementation details

## Expected Impact

**Context Efficiency:**
- 15,000-20,000 tokens saved per task (aggressive pruning)
- 25,000-35,000 tokens saved per complex task (agent isolation)
- 2-3x more work capacity per session

**Automatic Invocation:**
- Main agent now auto-invokes specialized agents based on keywords
- Clear boundaries prevent wrong agent selection
- Response format gives user control over detail level

**Based on Anthropic Research:**
- Building Effective Agents
- Writing Tools for Agents
- Code Execution with MCP
- Contextual Retrieval
Lorenz Hilpert
2025-11-21 16:52:26 +01:00
parent 4107641e75
commit ac2df3ea54
25 changed files with 1774 additions and 1423 deletions

524
CLAUDE.md

@@ -365,6 +365,530 @@ NEVER parallel if: One agent's findings should guide the next agent's search.
3. Document why standard approach might differ
4. Never refuse based on "best practices" alone
## 🔴 CRITICAL: Tool Result Minimization
**Tool results are the #1 source of context bloat. After each tool execution, aggressively minimize what stays in context:**
### Bash Tool Results
**SUCCESS cases:**
- `✓ Command succeeded (exit 0)`
- `✓ npm install completed (23 packages added)`
- `✓ Tests passed: 45/45`
**FAILURE cases:**
- ✅ Keep exit code + error lines only (max 10 lines)
- ✅ Strip ANSI codes, progress bars, verbose output
- ❌ NEVER include full command output for successful operations
**Example transformations:**
```
❌ WRONG: [300 lines of npm install output with dependency tree]
✅ RIGHT: ✓ npm install completed (23 packages added)
❌ WRONG: [50 lines of test output with passing test names]
✅ RIGHT: ✓ Tests passed: 45/45
❌ WRONG: [Build output with webpack chunks and file sizes]
✅ RIGHT: ✓ Build succeeded in 12.3s (3 chunks, 2.1MB)
```
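
The Bash rules above can be sketched as a small post-processing step. This is a minimal illustration, not part of any real tool API: the `BashResult` shape, the `minimizeBashResult` name, and the "line mentions error or fail" heuristic are all assumptions made for the example.

```typescript
// Hypothetical shape of a finished shell command (illustrative only).
interface BashResult {
  command: string;
  exitCode: number;
  output: string;
}

// Matches ANSI CSI escape sequences such as colour codes and cursor movement.
const ANSI_PATTERN = /\u001b\[[0-9;?]*[A-Za-z]/g;
const MAX_ERROR_LINES = 10;

function minimizeBashResult(result: BashResult): string {
  if (result.exitCode === 0) {
    // Success: a single status line, never the full output.
    return "✓ " + result.command + " succeeded (exit 0)";
  }
  // Failure: strip ANSI codes and blank lines first.
  const lines = result.output
    .replace(ANSI_PATTERN, "")
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  // Assumed heuristic: keep lines that mention an error or failure.
  const errorLines = lines.filter((line) => /error|fail/i.test(line));
  // Fall back to the tail of the output when no line matches the heuristic.
  const kept = (errorLines.length > 0 ? errorLines : lines.slice(-MAX_ERROR_LINES))
    .slice(0, MAX_ERROR_LINES);
  const header = "✗ " + result.command + " failed (exit " + result.exitCode + ")";
  return [header].concat(kept).join("\n");
}
```

A real implementation would likely add command-specific summaries (package counts, test totals) on the success path; the sketch only shows the keep-or-discard decision.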
### Edit Tool Results
**SUCCESS cases:**
- `✓ Modified /path/to/file.ts`
- `✓ Updated 3 files: component.ts, service.ts, test.ts`
**FAILURE cases:**
- ✅ Show error message + line number
- ❌ NEVER show full file diffs for successful edits
**ONLY show full diff when:**
- User explicitly asks "what changed?"
- Edit failed and debugging needed
- Major refactoring requiring review
### Write Tool Results
- `✓ Created /path/to/new-file.ts (245 lines)`
- ❌ NEVER echo back full file content after writing
### Read Tool Results
**Extract only relevant sections:**
- ✅ Read file → extract function/class → discard rest
- ✅ Summarize: "File contains 3 components: A, B, C (lines 10-150)"
- ❌ NEVER keep full file in context after extraction
**Show full file ONLY when:**
- User explicitly requests it
- File < 50 lines
- Need complete context for complex refactoring
### Grep/Glob Results
- `Found in 5 files: auth.ts, user.ts, ...`
- ✅ Show matching lines if < 20 results
- ❌ NEVER include full file paths and line numbers for > 20 matches
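
As a rough sketch of the Grep/Glob rule, the function below shows matching lines for small result sets and collapses larger ones to a file summary. The `GrepMatch` shape and `summarizeGrep` name are hypothetical, and treating exactly 20 matches as "large" is an assumption, since the rules above only cover fewer than 20 and more than 20.

```typescript
// Hypothetical shape of one search hit (illustrative only).
interface GrepMatch {
  file: string;
  line: number;
  text: string;
}

const MAX_INLINE_MATCHES = 20;

function summarizeGrep(matches: GrepMatch[]): string {
  if (matches.length === 0) {
    return "No matches";
  }
  if (matches.length < MAX_INLINE_MATCHES) {
    // Small result set: the matching lines themselves are worth keeping.
    return matches.map((m) => m.file + ":" + m.line + ": " + m.text).join("\n");
  }
  // Large result set: keep only the de-duplicated file names, first three shown.
  const files = matches
    .map((m) => m.file)
    .filter((file, index, all) => all.indexOf(file) === index);
  const shown = files.slice(0, 3).join(", ");
  const suffix = files.length > 3 ? ", ..." : "";
  return "Found " + matches.length + " matches in " + files.length + " files: " + shown + suffix;
}
```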
### Skill Application Results
**After applying skill:**
- ✅ Replace full skill content with: `Applied [skill-name]: [checklist]`
- ✅ Example: `Applied logging: ✓ Factory pattern ✓ Lazy evaluation ✓ Context added`
- ❌ NEVER keep full skill instructions after application
**Skill compression format:**
```
Applied angular-template:
✓ Modern control flow (@if, @for, @switch)
✓ Template references (ng-template)
✓ Lazy loading (@defer)
```
### Agent Results
**Summarization requirements (already covered in previous section):**
- 1-2 sentences for simple queries
- Structured table/list for complex findings
- NEVER include raw JSON or full agent output
### Session Cleanup
**Use `/clear` between tasks when:**
- Switching to unrelated task
- Previous task completed successfully
- Context exceeds 80K tokens
- Starting new feature/bug after finishing previous
**Benefits of `/clear`:**
- Prevents irrelevant context from degrading performance
- Resets working memory for fresh focus
- Maintains only persistent context (CLAUDE.md, skills)
## Implementation Work: Agent vs Direct Implementation
**Context-efficient implementation requires choosing the right execution mode.**
### Decision Matrix
| Task Type | Files | Complexity | Duration | Use |
|-----------|-------|-----------|----------|-----|
| Single file edit | 1 | Low | < 5min | Main agent (direct) + aggressive pruning |
| Bug fix | 1-3 | Variable | < 10min | Main agent (direct) + aggressive pruning |
| New Angular code (component/service/store) | 2-5 | Medium | 10-20min | **angular-developer agent** |
| Test suite | Any | Medium | 10-20min | **test-writer agent** |
| Large refactor | 5+ | High | 20+ min | **refactor-engineer agent** |
| Migration work | 10+ | High | 30+ min | **refactor-engineer agent** |
### 🔴 Proactive Agent Invocation (Automatic)
**You MUST automatically invoke specialized agents when task characteristics match, WITHOUT waiting for explicit user request.**
**Automatic triggers:**
1. **User says: "Create [component/service/store/pipe/directive/guard]..."**
   → AUTOMATICALLY invoke `angular-developer` agent
   → Example: "Create user dashboard component with metrics" → Use angular-developer
2. **User says: "Write tests for..." OR "Add test coverage..."**
   → AUTOMATICALLY invoke `test-writer` agent
   → Example: "Write tests for the checkout service" → Use test-writer
3. **User says: "Refactor all..." OR "Migrate [X] files..." OR "Update pattern across..."**
   → AUTOMATICALLY invoke `refactor-engineer` agent
   → Example: "Migrate all checkout components to standalone" → Use refactor-engineer
4. **Task analysis indicates > 4 files will be touched**
   → AUTOMATICALLY suggest/use appropriate agent
   → Example: User asks to implement feature that needs component + service + store + routes → Use angular-developer
5. **User says: "Remember to..." OR "TODO:" OR "Don't forget..."**
   → AUTOMATICALLY invoke `context-manager` to store task
   → Store immediately in `.claude/context/tasks.json`
**Decision flow:**
```
User request received
Analyze task characteristics:
├─ Keywords match? (create, test, refactor, migrate)
├─ File count estimate? (1 = direct, 2-5 = angular-developer, 5+ = refactor-engineer)
├─ Task type? (implementation vs testing vs refactoring)
└─ Complexity? (simple = direct, medium = agent, high = agent + detailed response)
IF agent match found:
├─ Brief user: "I'll use [agent-name] for this task"
├─ Invoke agent with Task tool
└─ Validate result
ELSE:
└─ Implement directly with aggressive pruning
```
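
The decision flow above could be approximated in code roughly as follows. This is a sketch under stated assumptions: `chooseExecutionMode` is a hypothetical helper, the regular expressions are a simplified stand-in for the keyword triggers, and a task estimated at exactly 5 files is routed to `angular-developer` because the flow lists 5 in both ranges.

```typescript
type ExecutionMode = "direct" | "angular-developer" | "test-writer" | "refactor-engineer";

function chooseExecutionMode(request: string, estimatedFiles: number): ExecutionMode {
  const text = request.toLowerCase();
  // Keyword triggers take priority over the file-count estimate.
  if (/\b(write tests for|add test coverage)\b/.test(text)) {
    return "test-writer";
  }
  if (/\b(refactor all|migrate|update pattern across)\b/.test(text)) {
    return "refactor-engineer";
  }
  if (/\bcreate\b.*\b(component|service|store|pipe|directive|guard)\b/.test(text)) {
    return "angular-developer";
  }
  // No keyword matched: fall back to the estimated number of touched files.
  if (estimatedFiles > 5) {
    return "refactor-engineer";
  }
  if (estimatedFiles >= 2) {
    return "angular-developer";
  }
  return "direct";
}
```

Keyword matching on raw text is brittle, so in practice the main agent's own task analysis would make this call; the function only makes the precedence explicit.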
**Communication pattern:**
```
✅ CORRECT:
User: "Create a user profile component with avatar upload"
Assistant: "I'll use the angular-developer agent for this Angular feature implementation."
[Invokes angular-developer agent]
❌ WRONG:
User: "Create a user profile component with avatar upload"
Assistant: "I can help you create a component. What fields should it have?"
[Doesn't invoke agent, implements directly, wastes main context]
```
**Override mechanism:**
If user says "do it directly" or "don't use an agent", honor their preference:
```
User: "Create a simple component, do it directly please"
Assistant: "Understood, implementing directly."
[Does NOT invoke angular-developer]
```
**Response format default:**
- Use `response_format: "concise"` by default (context efficiency)
- Use `response_format: "detailed"` when:
  - User is learning/exploring
  - Debugging complex issues
  - User explicitly asks for details
  - Task is unusual/non-standard
### When Main Agent Implements Directly
**Use direct implementation for simple, focused tasks:**
1. **Apply aggressive pruning** (see Tool Result Minimization above)
2. **Use context-manager proactively**:
   - Store implementation state in `.claude/context/` (file-based, not MCP memory)
   - Enable session resumption without context replay
3. **Compress conversation every 5-7 exchanges**:
   - Summarize: "Completed X, found Y, next: Z"
   - Discard intermediate exploration
4. **Use `/clear` after completion** to reset for next task
**Example flow:**
```
User: "Fix the auth bug in login.ts"
Assistant: [Reads file, identifies issue, applies fix]
Assistant: ✓ Modified login.ts
Assistant: ✓ Tests passed: 12/12
[Context used: ~3,000 tokens]
```
### When to Hand Off to Subagent
**Delegate to specialized agents for complex/multi-file work:**
**Triggers:**
- Task will take > 10 minutes
- Touching > 4 files
- Repetitive work (CRUD generation, migrations)
- User wants to multitask in main thread
- Implementation requires iterative debugging
**Available Implementation Agents:**
#### angular-developer
**Use for:** Angular code implementation (components, services, stores, pipes, directives, guards)
- Auto-loads: angular-template, html-template, logging, tailwind
- Handles: Components, services, stores, pipes, directives, guards, tests
- Output: 2-5 files created/modified
**Briefing template:**
```
Implement Angular [type]:
- Type: [component/service/store/pipe/directive/guard]
- Purpose: [description]
- Location: [path]
- Requirements: [list]
- Integration: [dependencies]
- Data flow: [if applicable]
```
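
One way to keep briefings complete is to build them from a typed object instead of free text. The sketch below mirrors the template fields above; the `AngularBriefing` interface and `renderBriefing` function are hypothetical names, not part of the agent definitions.

```typescript
// Field names follow the briefing template; the type itself is illustrative.
interface AngularBriefing {
  type: "component" | "service" | "store" | "pipe" | "directive" | "guard";
  purpose: string;
  location: string;
  requirements: string[];
  integration: string[];
  dataFlow?: string; // optional, matching "[if applicable]" in the template
}

function renderBriefing(briefing: AngularBriefing): string {
  const lines = [
    "Implement Angular " + briefing.type + ":",
    "- Type: " + briefing.type,
    "- Purpose: " + briefing.purpose,
    "- Location: " + briefing.location,
    "- Requirements: " + briefing.requirements.join("; "),
    "- Integration: " + briefing.integration.join(", "),
  ];
  if (briefing.dataFlow !== undefined) {
    lines.push("- Data flow: " + briefing.dataFlow);
  }
  return lines.join("\n");
}
```

Because every required field is non-optional in the type, a briefing with a missing purpose or location fails at compile time instead of producing an underspecified prompt.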
#### test-writer
**Use for:** Test suite generation/expansion
- Auto-loads: test-migration-specialist patterns, Vitest config
- Handles: Unit tests, integration tests, mocking
- Output: Test files with comprehensive coverage
**Briefing template:**
```
Generate tests for:
- Target: [file path]
- Coverage: [unit/integration/e2e]
- Scenarios: [list of test cases]
- Mocking: [dependencies to mock]
```
#### refactor-engineer
**Use for:** Large-scale refactoring/migrations
- Auto-loads: architecture-enforcer, circular-dependency-resolver
- Handles: Multi-file refactoring, pattern migrations, architectural changes
- Output: 5+ files modified, validation report
**Briefing template:**
```
Refactor [scope]:
- Pattern: [old pattern] → [new pattern]
- Files: [list or glob pattern]
- Constraints: [architectural rules]
- Validation: [how to verify success]
```
### Agent Coordination Pattern
**Main agent responsibilities:**
1. **Planning**: Decompose request, choose agent
2. **Briefing**: Provide focused, complete requirements
3. **Validation**: Review agent output (summary only)
4. **Integration**: Ensure changes work together
**Implementation agent responsibilities:**
1. **Execution**: Load skills, implement changes
2. **Testing**: Run tests, fix errors
3. **Reporting**: Return summary + key files modified
4. **Context isolation**: Keep implementation details in own context
**Handoff protocol:**
```
Main Agent:
↓ Brief agent with requirements
Implementation Agent:
↓ Execute (skills loaded, iterative debugging)
↓ Return summary: "✓ Created 3 files, ✓ Tests pass (12/12)"
Main Agent:
↓ Validate summary
↓ Continue with next task
[Implementation details stayed in agent context]
```
### Parallel Work Pattern
**When user has multiple independent tasks:**
1. **Main agent** handles simple task directly
2. **Specialized agent** handles complex task in parallel
3. Both complete, results integrate
**Example:**
```
User: "Fix auth bug AND create new dashboard component"
Main Agent: Fix auth bug directly (simple, 1 file)
angular-developer: Create dashboard (complex, 4 files)
Both complete independently
```
### Context Savings Calculation
**Direct implementation (simple task):**
- Tool results (pruned): ~1,000 tokens
- Conversation: ~2,000 tokens
- Total: ~3,000 tokens
**Subagent delegation (complex task):**
- Briefing: ~1,500 tokens
- Summary result: ~500 tokens
- Total in main context: ~2,000 tokens
- (Agent's 15,000 tokens stay isolated)
**Net savings for complex task: ~13,000 tokens**
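
The arithmetic behind that figure, written out as a small sketch. It assumes that doing the same work directly would have placed the agent's roughly 15,000 tokens in the main context; the type and function names are illustrative.

```typescript
interface DelegationCost {
  briefingTokens: number; // sent from main agent to subagent
  summaryTokens: number;  // returned from subagent to main agent
  agentTokens: number;    // consumed inside the subagent's isolated context
}

// Tokens that actually land in the main context when delegating.
function mainContextCost(cost: DelegationCost): number {
  return cost.briefingTokens + cost.summaryTokens;
}

// Savings relative to doing the same work in the main context.
function netSavings(cost: DelegationCost): number {
  return cost.agentTokens - mainContextCost(cost);
}

const complexTask: DelegationCost = {
  briefingTokens: 1500,
  summaryTokens: 500,
  agentTokens: 15000,
};
// mainContextCost(complexTask) is 2000 and netSavings(complexTask) is 13000.
```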
## Agent Design Principles (Anthropic Best Practices)
**Based on research from Anthropic's engineering blog, these principles guide our agent architecture.**
### Core Principles
**1. Simplicity First**
- Start with simple solutions before adding agents
- Only use agents when tasks are open-ended with unpredictable steps
- Avoid over-engineering with unnecessary frameworks
**2. Tool Quality > Prompt Quality**
- Anthropic spent MORE time optimizing tools than prompts for SWE-bench
- Small design choices (absolute vs relative paths) eliminate systematic errors
- Invest heavily in tool documentation and testing
**3. Ground Agents in Reality**
- Provide environmental feedback at each step (tool results, test output)
- Build feedback loops with checkpoints
- Validate progress before continuing
### Response Format Control
**All implementation agents support response format parameter:**
```
briefing:
  response_format: "concise"   # default, ~500 tokens
  # or "detailed"              # ~2000 tokens with explanations
```
**Concise** (default for context efficiency):
```
✓ Feature created: DashboardComponent
✓ Files: component.ts (150 lines), template (85 lines), tests (12/12 passing)
✓ Skills applied: angular-template, html-template, logging
```
**Detailed** (use when debugging or learning):
```
✓ Feature created: DashboardComponent
Implementation approach:
- Used signalStore for state management (withState + withComputed)
- Implemented reactive data loading with Resource API
- Template uses modern control flow (@if, @for)
Files created:
- component.ts (150 lines): Standalone component with inject() pattern
- component.html (85 lines): Modern syntax with E2E attributes
- component.spec.ts: 12 tests covering rendering, interactions, state
Key decisions:
- Chose Resource API over manual loading for better race condition handling
- Computed signals for derived state (no effects needed)
Integration notes:
- Requires UserMetricsService injection
- Routes need update: path 'dashboard' → DashboardComponent
```
### Environmental Feedback Pattern
**Agents should report progress at key milestones:**
```
Phase 1: Creating files...
✓ Created dashboard.component.ts (150 lines)
✓ Created dashboard.component.html (85 lines)
Phase 2: Running validation...
→ Running lint... ✓ No errors
→ Running tests... ⚠ 8/12 passing
Phase 3: Fixing failures...
→ Investigating test failures: Mock data missing for UserService
→ Adding mock setup... ✓ Fixed
→ Rerunning tests... ✓ 12/12 passing
Complete! Ready for review.
```
### Tool Documentation Standards
**Every agent must document:**
**When to Use** (boundaries):
```markdown
✅ Creating 2-5 related files (component + service + store)
❌ Single file edits (use main agent directly)
❌ >10 files (use refactor-engineer instead)
```
**Examples** (happy path):
```markdown
Example: "Create login component with form validation and auth service integration"
→ Generates: component.ts, component.html, component.spec.ts, auth.service.ts
```
**Edge Cases** (failure modes):
```markdown
⚠ If auth service exists, will reuse (not recreate)
⚠ If tests fail after 3 attempts, returns partial progress + blocker details
```
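
The second edge case (stop after 3 failed attempts and report partial progress) can be sketched as a bounded retry loop. The `TestRun` and `AgentOutcome` types and the `runWithRetries` helper are assumptions for illustration; the `runTests` callback stands in for whatever actually executes the suite.

```typescript
interface TestRun {
  passed: number;
  total: number;
  blocker?: string; // short description of what is still failing
}

type AgentOutcome =
  | { status: "complete"; attempts: number; run: TestRun }
  | { status: "partial"; attempts: number; run: TestRun; blocker: string };

function runWithRetries(
  runTests: (attempt: number) => TestRun,
  maxAttempts: number = 3
): AgentOutcome {
  let last: TestRun = { passed: 0, total: 0 };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    last = runTests(attempt);
    if (last.total > 0 && last.passed === last.total) {
      return { status: "complete", attempts: attempt, run: last };
    }
  }
  // Attempts exhausted: return what was achieved plus the blocker details.
  const blocker = last.blocker !== undefined ? last.blocker : "unknown failure";
  return { status: "partial", attempts: maxAttempts, run: last, blocker: blocker };
}
```

Returning a tagged `partial` result, not throwing, lets the main agent decide whether to retry with a different briefing or escalate to the user.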
### Context Chunking Strategy
**When storing knowledge in `.claude/context/`, prepend contextual headers:**
**Before** (low retrieval accuracy):
```json
{
"description": "Use signalStore() with withState()"
}
```
**After** (high retrieval accuracy):
```json
{
"context": "This pattern is for NgRx Signal Store in ISA-Frontend Angular 20+ monorepo. Replaces @ngrx/store for feature state management.",
"description": "Use signalStore() with withState() for state, withComputed() for derived values, withMethods() for actions. NO effects for state propagation.",
"location": "All libs/**/data-access/ libraries",
"example": "export const UserStore = signalStore(withState({users: []}));"
}
```
**Chunk size**: 200-800 tokens per entry for optimal retrieval.
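
A quick size check can catch entries that fall outside the 200-800 token range before they are written. The sketch below uses a rough 4-characters-per-token estimate, which is an assumption and not a real tokenizer; `ContextEntry`, `estimateTokens`, and `checkChunkSize` are hypothetical names.

```typescript
// Mirrors the fields of the "After" entry above.
interface ContextEntry {
  context: string;
  description: string;
  location: string;
  example: string;
}

const MIN_CHUNK_TOKENS = 200;
const MAX_CHUNK_TOKENS = 800;

// Rough heuristic only: about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function checkChunkSize(entry: ContextEntry): { tokens: number; ok: boolean } {
  const tokens = estimateTokens(JSON.stringify(entry));
  return {
    tokens: tokens,
    ok: tokens >= MIN_CHUNK_TOKENS && tokens <= MAX_CHUNK_TOKENS,
  };
}
```

Under this heuristic, very short entries are flagged as too small, which is a hint to add more contextual detail or merge related entries.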
### Poka-Yoke (Error-Proofing) Design
**Make mistakes harder to make:**
- **Use absolute paths** (not relative): Eliminates path resolution errors
- **Validate inputs early**: Check file existence before operations
- **Provide sensible defaults**: response_format="concise", model="sonnet"
- **Actionable error messages**: "File not found. Did you mean: /path/to/similar-file.ts?"
- **Fail fast with rollback**: Stop on first error, report state, allow retry
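
Several of these guidelines can be combined in one input-validation step. The sketch below is illustrative only: `AgentRequest`, `validateRequest`, and the `knownFiles` list are hypothetical, the absolute-path check assumes POSIX-style paths, and "similar file" is simplified to "same file name in another directory".

```typescript
type ResponseFormat = "concise" | "detailed";

// Hypothetical request shape, for illustration only.
interface AgentRequest {
  filePath: string;
  responseFormat?: ResponseFormat;
  model?: string;
}

interface ValidatedRequest {
  filePath: string;
  responseFormat: ResponseFormat;
  model: string;
}

function baseName(path: string): string {
  return path.substring(path.lastIndexOf("/") + 1);
}

function validateRequest(req: AgentRequest, knownFiles: string[]): ValidatedRequest {
  // Absolute paths only (POSIX-style check; an assumption of this sketch).
  if (req.filePath.charAt(0) !== "/") {
    throw new Error("Path must be absolute: " + req.filePath);
  }
  // Validate early: check existence before any operation runs.
  if (knownFiles.indexOf(req.filePath) === -1) {
    const wanted = baseName(req.filePath);
    const similar = knownFiles.filter((file) => baseName(file) === wanted)[0];
    // Actionable message: suggest a candidate when one exists.
    const hint = similar !== undefined ? " Did you mean: " + similar + "?" : "";
    throw new Error("File not found: " + req.filePath + "." + hint);
  }
  // Sensible defaults for everything the caller left out.
  return {
    filePath: req.filePath,
    responseFormat: req.responseFormat !== undefined ? req.responseFormat : "concise",
    model: req.model !== undefined ? req.model : "sonnet",
  };
}
```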
### Workflow Pattern: Orchestrator-Workers
**Our architecture uses the Orchestrator-Workers pattern:**
```
Main Agent (Orchestrator)
├─ Plans decomposition
├─ Chooses specialized worker
├─ Provides focused briefing
└─ Validates worker results
Worker Agents (angular-developer, test-writer, refactor-engineer)
├─ Execute with full autonomy
├─ Load relevant skills
├─ Iterate on errors internally
└─ Return concise summary
```
**Benefits:**
- Context isolation (worker details stay in worker context)
- Parallel execution (main + worker simultaneously)
- Specialization (each worker optimized for domain)
- Predictable communication (briefing → execution → summary)
### Token Efficiency Benchmark
**From Anthropic research: 98.7% token reduction via code execution**
Traditional tool-calling approach:
- 150,000 tokens (alternating LLM calls + tool results)
Code execution approach:
- 2,000 tokens (write code → execute in environment)
**ISA-Frontend application:**
- Use Bash for loops, filtering, transformations
- Use Edit for batch file changes (not individual tool calls per file)
- Use Grep/Glob for discovery (not Read every file)
- Prefer consolidated operations over step-by-step tool chains
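
For reference, the 98.7% figure follows directly from the two token counts quoted above. The helper name below is illustrative.

```typescript
// Percentage reduction from `before` to `after`, rounded to one decimal place.
function tokenReductionPercent(before: number, after: number): number {
  if (before <= 0) {
    throw new Error("before must be positive");
  }
  return Math.round(((before - after) / before) * 1000) / 10;
}

// (150,000 - 2,000) / 150,000 = 0.9866..., which rounds to 98.7.
const benchmarkReduction = tokenReductionPercent(150000, 2000);
```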
### Context Strategy Threshold
**Based on Anthropic contextual retrieval research:**
- **< 200K tokens**: Put everything in prompt with caching (90% cost savings)
- **> 200K tokens**: Use retrieval system (our `.claude/context/` approach)
**ISA-Frontend**: ~500K+ tokens (Angular monorepo with 100+ libraries)
- ✅ File-based retrieval is correct choice
- ✅ Contextual headers reduce retrieval failures by 35-67% in Anthropic's reported results (the upper figure includes reranking)
- ✅ Top-20 chunks with dual retrieval (semantic + lexical)
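
The threshold rule can be stated as a one-line decision. This is a sketch: the function and type names are hypothetical, and a knowledge base of exactly 200K tokens is treated as needing retrieval, which is an assumption because the rule above leaves that boundary open.

```typescript
type ContextStrategy = "full-prompt-with-caching" | "retrieval";

const FULL_PROMPT_LIMIT_TOKENS = 200000;

function chooseContextStrategy(knowledgeBaseTokens: number): ContextStrategy {
  return knowledgeBaseTokens < FULL_PROMPT_LIMIT_TOKENS
    ? "full-prompt-with-caching"
    : "retrieval";
}

// ISA-Frontend is estimated at ~500K+ tokens, so it lands on the retrieval side.
const isaFrontendStrategy = chooseContextStrategy(500000);
```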
<!-- nx configuration start-->
<!-- Leave the start & end comments to automatically receive updates. -->