Target Workflow: test-coverage-reporter
Source report: #5627
AIC per run: ~161.6 (proxy cost; token telemetry unavailable for Copilot engine)
Action minutes per run: ~15 (successful run)
Failure rate: 50% (1 of 2 recent runs failed — run 28313083143)
LLM turns: N/A (not reported by Copilot engine)
Current Configuration
| Setting | Value |
| --- | --- |
| Tools loaded | None (github: false, bash: false) |
| Safe-output tools | create_discussion, missing_tool, missing_data, noop |
| Network groups | github only |
| Engine / model | Copilot / summarization (haiku / gpt-5-mini / gemini-flash-lite) |
| Pre-agent steps | 9 steps (npm install, build, test:coverage, 6× data extraction) |
| Prompt body size | ~2,200 chars |
| Pre-built discussion template step | ✅ exists, but its output is never referenced in the prompt |
Root Cause: Pre-built Body Is Computed but Discarded
The Pre-build discussion template step (lines 202–254 of .github/workflows/test-coverage-reporter.md) builds a complete formatted discussion body (steps.discussion-template.outputs.DISCUSSION_BODY) with 6 of the 8 required sections:
- Overall Coverage table (from steps.coverage-table)
- Security-Critical Path Status (from steps.critical-gaps)
- Coverage Table (from steps.coverage-table)
- Function Audit (from steps.func-audit)
- Recent Source Changes (from steps.recent-changes)
However, steps.discussion-template.outputs.DISCUSSION_BODY is never referenced in the agent prompt. The agent receives only the 5-line COVERAGE_GAPS_BRIEF and is asked to "write a complete coverage discussion" with all 8 sections from scratch, using data it cannot access because no bash or GitHub tools are loaded.
This forces the model to hallucinate sections (coverage percentages, file names, function lists, etc.) from a 5-line summary, likely causing multiple inference turns and inflating runtime.
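A check along the following lines could catch this class of problem before a run. It is a minimal sketch, not part of the workflow: `findUnreferencedOutputs` is a hypothetical helper, and the sample prompt string is invented for illustration. It reports which step-output expressions never appear in the prompt text.

```javascript
// Hypothetical helper: report which step outputs a prompt never references.
function findUnreferencedOutputs(promptText, outputRefs) {
  return outputRefs.filter((ref) => {
    // Escape regex metacharacters in the reference (the dots, mainly).
    const escaped = ref.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    // Match the expression form with optional whitespace inside the braces.
    const pattern = new RegExp('\\$\\{\\{\\s*' + escaped + '\\s*\\}\\}');
    return !pattern.test(promptText);
  });
}

// Invented sample prompt that references only the brief, as described above.
const prompt =
  'Write a complete coverage discussion using:\n' +
  '${{ steps.coverage-gaps-brief.outputs.COVERAGE_GAPS_BRIEF }}';

const missing = findUnreferencedOutputs(prompt, [
  'steps.discussion-template.outputs.DISCUSSION_BODY',
  'steps.coverage-gaps-brief.outputs.COVERAGE_GAPS_BRIEF',
]);

console.log(missing); // [ 'steps.discussion-template.outputs.DISCUSSION_BODY' ]
```

Run against the real prompt body, a non-empty result would flag a pre-step whose output is computed but never consumed.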
Recommendations
1. Pass the Pre-Built Discussion Body to the Agent Prompt
Estimated savings: ~75 AIC/run (~47%) — reduces agent from multi-turn to 1-turn
Replace the current agent prompt body in .github/workflows/test-coverage-reporter.md (lines 259–301) with:
## Pre-Computed Discussion Body
All coverage data was computed in pre-steps. The discussion body below is ready — do not change any numbers or section headings.
${{ steps.discussion-template.outputs.DISCUSSION_BODY }}
## Your Task
Append these two sections to the body above, then call `create_discussion` with the complete combined body (pre-built sections + the two new sections below).
### 🔎 Notable Findings
Write 2–4 bullet points based strictly on the coverage gaps below. Reference specific file names and percentages from the body above.
${{ steps.coverage-gaps-brief.outputs.COVERAGE_GAPS_BRIEF }}
### 🎯 Recommendations
Write 3 prioritized items (High / Medium / Low) tied to the findings above. Each item must name a specific file and suggest a concrete action (e.g., "add tests for the X branch in `src/host-iptables.ts`").
After appending both sections, call `create_discussion` once with the full body.
This gives the agent all section data already formatted, reducing its task to appending ~200 words and calling a tool (1 LLM turn).
2. Pre-Compute Notable Findings and Recommendations in a Shell Step
Estimated savings: additional ~50 AIC/run, cumulative ~78% total with Rec 1
Add a pre-step after coverage-gaps-brief that generates findings and recommendations deterministically:
- name: Generate findings and recommendations
  id: findings
  run: |
    {
      echo "FINDINGS_AND_RECS<<EOF"
      node -e "
        const fs = require('fs');
        const d = JSON.parse(fs.readFileSync('coverage/coverage-summary.json','utf8'));
        // Substrings that mark a file as security-critical
        const SEC = ['docker-manager','host-iptables','squid-config','domain-patterns','cli'];
        const findings = [];
        const recs = [];
        // Files under 80% statement coverage, security-critical first, then lowest coverage first
        const gaps = Object.entries(d)
          .filter(([k]) => k !== 'total')
          .map(([k, v]) => ({
            file: k.replace(process.cwd() + '/', ''),
            stmts: v.statements.pct,
            isSec: SEC.some(s => k.includes(s))
          }))
          .filter(r => r.stmts < 80)
          .sort((a, b) => (b.isSec - a.isSec) || (a.stmts - b.stmts));
        gaps.slice(0, 4).forEach(g => {
          const tag = g.isSec ? '🔴 **[SECURITY]**' : g.stmts < 50 ? '🟠 **[CRITICAL]**' : '🟡 **[LOW]**';
          findings.push('- ' + tag + ' \`' + g.file + '\`: ' + g.stmts + '% statement coverage');
        });
        if (!findings.length) findings.push('- ✅ All files meet the 80% coverage threshold');
        const priorities = ['High', 'Medium', 'Low'];
        gaps.slice(0, 3).forEach((g, i) => {
          const action = g.isSec
            ? 'Add unit tests for \`' + g.file + '\` (security-critical path at ' + g.stmts + '% coverage)'
            : 'Increase test coverage for \`' + g.file + '\` from ' + g.stmts + '% to 80%';
          recs.push('- **' + priorities[i] + '**: ' + action);
        });
        if (!recs.length) recs.push('- **Low**: Maintain current coverage levels across all files');
        console.log('### 🔎 Notable Findings\n');
        findings.forEach(f => console.log(f));
        console.log('\n### 🎯 Recommendations\n');
        recs.forEach(r => console.log(r));
      " 2>/dev/null || printf '### 🔎 Notable Findings\n\n- Coverage data unavailable\n'
      # printf is used for the fallback because plain echo in bash prints \n literally
      echo "EOF"
    } >> "$GITHUB_OUTPUT"
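Because the ranking logic lives inside an inline `node -e` string, it is hard to test in place. The sketch below pulls the same filter-and-sort rules into a standalone function; `rankGaps` and the sample data are hypothetical, while the `SEC` list and the 80% threshold mirror the step above.

```javascript
// Substrings that mark a file as security-critical (copied from the step above).
const SEC = ['docker-manager', 'host-iptables', 'squid-config', 'domain-patterns', 'cli'];

// Hypothetical extraction of the step's ranking logic into a testable function.
function rankGaps(summary, threshold = 80) {
  return Object.entries(summary)
    .filter(([file]) => file !== 'total')
    .map(([file, v]) => ({
      file,
      stmts: v.statements.pct,
      isSec: SEC.some((s) => file.includes(s)),
    }))
    .filter((r) => r.stmts < threshold)
    // Security-critical files first, then lowest coverage first.
    .sort((a, b) => (b.isSec - a.isSec) || (a.stmts - b.stmts));
}

// Invented sample in the shape of coverage-summary.json.
const sample = {
  total: { statements: { pct: 70 } },
  'src/logger.ts': { statements: { pct: 40 } },
  'src/host-iptables.ts': { statements: { pct: 75 } },
  'src/utils.ts': { statements: { pct: 95 } },
};

console.log(rankGaps(sample).map((g) => g.file));
// [ 'src/host-iptables.ts', 'src/logger.ts' ]
```

Note that the security check is a plain substring match, so the `'cli'` entry would also tag a path such as `src/client.ts`. Whether that is intended depends on the repository layout.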
Then simplify the agent prompt to a single-call instruction:
Call `create_discussion` once with this exact body — do not modify it:
${{ steps.discussion-template.outputs.DISCUSSION_BODY }}
${{ steps.findings.outputs.FINDINGS_AND_RECS }}
---
*Generated by test-coverage-reporter workflow. Trigger: `${{ github.event_name }}`*
This reduces the agent to one deterministic LLM turn: parse prompt → call tool → done.
3. Add Fast-Fail Gate After Coverage Step
Estimated savings: Eliminates ~161.6 AIC waste on failed runs (50% of recent runs)
The 50% failure rate (run 28313083143 failed at 4 min, conclusion: failure) wastes action minutes without producing any value. Add a gate step immediately after Run coverage:
- name: Verify coverage data
  id: coverage-gate
  run: |
    # Abort before the extraction steps if the coverage run produced no summary
    if [ ! -f coverage/coverage-summary.json ]; then
      echo "::error::Coverage data missing; aborting"
      exit 1
    fi
    echo "Coverage data verified: $(wc -c < coverage/coverage-summary.json) bytes"
This fails the job immediately before the 6 downstream extraction steps and the AWF agent start, saving ~10+ minutes of wasted runner time per failed run.
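The gate above only checks that the file exists. If a stricter check is wanted, a sketch like the following could also reject a truncated or malformed summary. `checkCoverageSummary` is a hypothetical name, and the assumption that a usable summary has a numeric `total.statements.pct` comes from how the other steps read the file.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical validator: returns null when the summary is usable, else a reason string.
function checkCoverageSummary(filePath) {
  if (!fs.existsSync(filePath)) return 'file missing';
  let data;
  try {
    data = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  } catch (err) {
    return 'invalid JSON';
  }
  const pct = data && data.total && data.total.statements && data.total.statements.pct;
  if (typeof pct !== 'number') return 'missing total.statements.pct';
  return null;
}

// Demo against temporary files so the sketch runs anywhere.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cov-gate-'));
const good = path.join(dir, 'good.json');
const bad = path.join(dir, 'bad.json');
fs.writeFileSync(good, JSON.stringify({ total: { statements: { pct: 81.2 } } }));
fs.writeFileSync(bad, '{ truncated');

console.log(checkCoverageSummary(good));                         // null
console.log(checkCoverageSummary(bad));                          // invalid JSON
console.log(checkCoverageSummary(path.join(dir, 'absent.json'))); // file missing
```

In a workflow step, a non-null result would be printed as an `::error::` message followed by a non-zero exit.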
4. Evaluate Removing the Push Trigger
Estimated savings: ~161.6 AIC per avoided duplicate run
The workflow triggers on both schedule: daily and push: branches: [main], paths: [src/**/*.ts]. For a reporting workflow (no code changes), the push trigger can cause same-day duplicate runs when source files land on main.
# Consider removing this trigger block:
on:
  push:
    branches: [main]
    paths:
      - 'src/**/*.ts'
Keep workflow_dispatch: for on-demand runs. The daily schedule is sufficient for trend reporting.
5. Consolidate Node.js Pre-Step Calls
Estimated savings: ~1–2 action minutes per run
Steps coverage-table, critical-gaps, func-audit, and coverage-gaps-brief all open coverage/coverage-summary.json in separate node -e "..." processes. Merge these into one step that writes all four GITHUB_OUTPUT entries in a single Node.js invocation.
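One possible shape for the merged step is sketched below: parse the summary once, build each section, and append all outputs in a single write using the `name<<DELIMITER` multiline syntax. The helper names, the `COVERAGE_TABLE` output name, and the section contents are assumptions for illustration; only `COVERAGE_GAPS_BRIEF` appears in the workflow as described.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const crypto = require('crypto');

// Format one multiline output in the name<<DELIMITER form.
function formatOutput(name, value, delimiter) {
  return `${name}<<${delimiter}\n${value}\n${delimiter}\n`;
}

// Hypothetical consolidation: append every section to the output file in one write.
function writeOutputs(outputFile, sections) {
  // A random delimiter avoids collisions with section content.
  const delimiter = 'EOF_' + crypto.randomBytes(8).toString('hex');
  const payload = Object.entries(sections)
    .map(([name, value]) => formatOutput(name, value, delimiter))
    .join('');
  fs.appendFileSync(outputFile, payload);
}

// Stand-in for JSON.parse(fs.readFileSync('coverage/coverage-summary.json', 'utf8')).
const summary = { total: { statements: { pct: 72.5 } } };

// In the workflow the target would be process.env.GITHUB_OUTPUT; a temp file is used here.
const target = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'gh-out-')), 'output.txt');
writeOutputs(target, {
  COVERAGE_TABLE: '| Metric | Pct |\n| --- | --- |\n| Statements | ' + summary.total.statements.pct + ' |',
  COVERAGE_GAPS_BRIEF: 'Statements at ' + summary.total.statements.pct + '%',
});

console.log(fs.readFileSync(target, 'utf8'));
```

Downstream references would then point at the single merged step's id rather than at the four separate step ids.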
Expected Impact
| Metric | Current | Projected (Rec 1+2) | Savings |
| --- | --- | --- | --- |
| AIC / run | ~161.6 | ~35 | ~78% |
| Action minutes / run | ~15 | ~5 | ~67% |
| Agent LLM turns | multi-turn | 1 | ~80%+ |
| Failed-run AIC waste | ~161.6 | ~0 (with Rec 3) | 100% |
| Runs/week (estimate) | ~10 | ~7 (with Rec 4) | ~30% |
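The savings percentages in the table follow from the current and projected values. A small sketch of the arithmetic, with `savingsPct` as a hypothetical helper name:

```javascript
// Percentage reduction from current to projected, rounded to one decimal place.
function savingsPct(current, projected) {
  return Math.round(((current - projected) / current) * 1000) / 10;
}

console.log(savingsPct(161.6, 35)); // 78.3  (AIC / run)
console.log(savingsPct(15, 5));     // 66.7  (action minutes / run)
console.log(savingsPct(10, 7));     // 30    (runs / week)
```

All inputs are the report's own estimates, so the results carry the same uncertainty as those estimates.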
Implementation Checklist
- Add ${{ steps.discussion-template.outputs.DISCUSSION_BODY }} to the agent prompt body (Rec 1)
- Add a findings pre-step with rule-based generation of both missing sections (Rec 2)
- Add a coverage-gate step after Run coverage to fail fast on missing data (Rec 3)
- Evaluate removing the push: trigger (Rec 4)
- Merge coverage-table, critical-gaps, func-audit, coverage-gaps-brief into one Node.js step (Rec 5, optional)
- Run gh aw compile .github/workflows/test-coverage-reporter.md
- Run npx tsx scripts/ci/postprocess-smoke-workflows.ts (if applicable)
Generated by Daily Copilot Token Optimization Advisor · 62.2 AIC · ⊞ 6.7K