⚡ Copilot Token Optimization 2026-06-28 - Test Coverage Reporter #5628

@github-actions

Target Workflow: test-coverage-reporter

Source report: #5627
AIC per run: ~161.6 (proxy cost; token telemetry unavailable for Copilot engine)
Action minutes per run: ~15 (successful run)
Failure rate: 50% (1 of 2 recent runs failed — run 28313083143)
LLM turns: N/A (not reported by Copilot engine)

Current Configuration

| Setting | Value |
| --- | --- |
| Tools loaded | None (github: false, bash: false) |
| Safe-output tools | create_discussion, missing_tool, missing_data, noop |
| Network groups | github only |
| Engine / model | Copilot / summarization (haiku / gpt-5-mini / gemini-flash-lite) |
| Pre-agent steps | 9 steps (npm install, build, test:coverage, 6× data extraction) |
| Prompt body size | ~2,200 chars |
| Pre-built discussion template step | ✅ exists, but its output is never referenced in the prompt |

Root Cause: Pre-built Body Is Computed but Discarded

The Pre-build discussion template step (lines 202–254 of .github/workflows/test-coverage-reporter.md) builds a complete, formatted discussion body (steps.discussion-template.outputs.DISCUSSION_BODY) covering 6 of the 8 required sections, including:

  • Overall Coverage table (from steps.coverage-table)
  • Security-Critical Path Status (from steps.critical-gaps)
  • Coverage Table (from steps.coverage-table)
  • Function Audit (from steps.func-audit)
  • Recent Source Changes (from steps.recent-changes)

However, steps.discussion-template.outputs.DISCUSSION_BODY is never referenced in the agent prompt. The agent receives only the 5-line COVERAGE_GAPS_BRIEF and is asked to "write a complete coverage discussion" with all 8 sections from scratch. It cannot access the underlying data, because no bash or GitHub tools are loaded.

This forces the model to hallucinate sections (coverage percentages, file names, function lists, etc.) from a 5-line summary, likely causing multiple inference turns and inflating runtime.
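
The root cause above (an output that is computed but never interpolated into the prompt) can be checked mechanically. The following is a minimal sketch, not part of the workflow: `findUnreferencedOutputs`, the `produced` list, and the sample `prompt` text are all hypothetical and only illustrate the idea.

```javascript
// Sketch: list step outputs that are produced but never interpolated into the prompt.
// `producedOutputs` and `promptBody` are hypothetical inputs, not the real workflow file.
function findUnreferencedOutputs(producedOutputs, promptBody) {
  // Matches ${{ steps.<id>.outputs.<NAME> }} with optional inner whitespace.
  const pattern = /\$\{\{\s*(steps\.[\w-]+\.outputs\.\w+)\s*\}\}/g;
  const referenced = new Set();
  for (const match of promptBody.matchAll(pattern)) {
    referenced.add(match[1]);
  }
  return producedOutputs.filter((name) => !referenced.has(name));
}

// Illustrative inputs that mirror the situation described above.
const produced = [
  'steps.discussion-template.outputs.DISCUSSION_BODY',
  'steps.coverage-gaps-brief.outputs.COVERAGE_GAPS_BRIEF',
];
const prompt = [
  'Write a complete coverage discussion.',
  '${{ steps.coverage-gaps-brief.outputs.COVERAGE_GAPS_BRIEF }}',
].join('\n');

console.log(findUnreferencedOutputs(produced, prompt));
```

With these sample inputs the only name returned is the DISCUSSION_BODY output, which is the gap this issue describes.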

Recommendations

1. Pass the Pre-Built Discussion Body to the Agent Prompt

Estimated savings: ~75 AIC/run (~47%) — reduces agent from multi-turn to 1-turn

Replace the current agent prompt body in .github/workflows/test-coverage-reporter.md (lines 259–301) with:

## Pre-Computed Discussion Body

All coverage data was computed in pre-steps. The discussion body below is ready — do not change any numbers or section headings.

${{ steps.discussion-template.outputs.DISCUSSION_BODY }}

## Your Task

Append these two sections to the body above, then call `create_discussion` with the complete combined body (pre-built sections + the two new sections below).

### 🔎 Notable Findings

Write 2–4 bullet points based strictly on the coverage gaps below. Reference specific file names and percentages from the body above.

${{ steps.coverage-gaps-brief.outputs.COVERAGE_GAPS_BRIEF }}

### 🎯 Recommendations

Write 3 prioritized items (High / Medium / Low) tied to the findings above. Each item must name a specific file and suggest a concrete action (e.g., "add tests for the X branch in `src/host-iptables.ts`").

After appending both sections, call `create_discussion` once with the full body.

This gives the agent all section data already formatted, reducing its task to appending ~200 words and calling a tool (1 LLM turn).

2. Pre-Compute Notable Findings and Recommendations in a Shell Step

Estimated savings: additional ~50 AIC/run, cumulative ~78% total with Rec 1

Add a pre-step after coverage-gaps-brief that generates findings and recommendations deterministically:

- name: Generate findings and recommendations
  id: findings
  run: |
    {
      echo "FINDINGS_AND_RECS<<EOF"
      node -e "
        const fs = require('fs');
        const d = JSON.parse(fs.readFileSync('coverage/coverage-summary.json','utf8'));
        const SEC = ['docker-manager','host-iptables','squid-config','domain-patterns','cli'];
        const findings = [];
        const recs = [];

        const gaps = Object.entries(d)
          .filter(([k]) => k !== 'total')
          .map(([k, v]) => ({
            file: k.replace(process.cwd() + '/', ''),
            stmts: v.statements.pct,
            isSec: SEC.some(s => k.includes(s))
          }))
          .filter(r => r.stmts < 80)
          .sort((a, b) => (b.isSec - a.isSec) || (a.stmts - b.stmts));

        gaps.slice(0, 4).forEach(g => {
          const tag = g.isSec ? '🔴 **[SECURITY]**' : g.stmts < 50 ? '🟠 **[CRITICAL]**' : '🟡 **[LOW]**';
          findings.push('- ' + tag + ' \`' + g.file + '\`: ' + g.stmts + '% statement coverage');
        });
        if (!findings.length) findings.push('- ✅ All files meet the 80% coverage threshold');

        const priorities = ['High', 'Medium', 'Low'];
        gaps.slice(0, 3).forEach((g, i) => {
          const action = g.isSec
            ? 'Add unit tests for \`' + g.file + '\` — security-critical path at ' + g.stmts + '% coverage'
            : 'Increase test coverage for \`' + g.file + '\` from ' + g.stmts + '% to 80%';
          recs.push('- **' + priorities[i] + '**: ' + action);
        });
        if (!recs.length) recs.push('- **Low**: Maintain current coverage levels across all files');

        console.log('### 🔎 Notable Findings\n');
        findings.forEach(f => console.log(f));
        console.log('\n### 🎯 Recommendations\n');
        recs.forEach(r => console.log(r));
      " 2>/dev/null || printf '### 🔎 Notable Findings\n\n- Coverage data unavailable\n'
      echo "EOF"
    } >> "$GITHUB_OUTPUT"

Then simplify the agent prompt to a single-call instruction:

Call `create_discussion` once with this exact body — do not modify it:

${{ steps.discussion-template.outputs.DISCUSSION_BODY }}
${{ steps.findings.outputs.FINDINGS_AND_RECS }}

---
*Generated by test-coverage-reporter workflow. Trigger: `${{ github.event_name }}`*

This reduces the agent to one deterministic LLM turn: parse prompt → call tool → done.
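
Because the rules in the findings pre-step are deterministic, they can also be expressed as a pure function and unit-tested outside the workflow. The sketch below restates the same rules (80% threshold, security-critical files first, at most 4 findings and 3 recommendations). The function name `buildFindingsAndRecs`, the `rootDir` parameter, and the sample data are assumptions made for illustration; the input shape is assumed to match the coverage-summary.json structure the step above reads.

```javascript
// Sketch of the findings/recommendations rules as a pure function.
// Assumed input shape: { total, '<path>': { statements: { pct } } }.
const SECURITY_PATHS = ['docker-manager', 'host-iptables', 'squid-config', 'domain-patterns', 'cli'];
const THRESHOLD = 80;

function buildFindingsAndRecs(summary, rootDir = '') {
  const gaps = Object.entries(summary)
    .filter(([file]) => file !== 'total')
    .map(([file, metrics]) => ({
      file: rootDir ? file.replace(rootDir + '/', '') : file,
      stmts: metrics.statements.pct,
      isSec: SECURITY_PATHS.some((name) => file.includes(name)),
    }))
    .filter((gap) => gap.stmts < THRESHOLD)
    // Security-critical files first, then lowest coverage first.
    .sort((a, b) => (Number(b.isSec) - Number(a.isSec)) || (a.stmts - b.stmts));

  const findings = gaps.slice(0, 4).map((gap) => {
    const tag = gap.isSec ? '🔴 **[SECURITY]**' : gap.stmts < 50 ? '🟠 **[CRITICAL]**' : '🟡 **[LOW]**';
    return `- ${tag} \`${gap.file}\`: ${gap.stmts}% statement coverage`;
  });
  if (findings.length === 0) findings.push('- ✅ All files meet the 80% coverage threshold');

  const priorities = ['High', 'Medium', 'Low'];
  const recs = gaps.slice(0, 3).map((gap, i) => {
    const action = gap.isSec
      ? `Add unit tests for \`${gap.file}\` (security-critical path at ${gap.stmts}% coverage)`
      : `Increase test coverage for \`${gap.file}\` from ${gap.stmts}% to ${THRESHOLD}%`;
    return `- **${priorities[i]}**: ${action}`;
  });
  if (recs.length === 0) recs.push('- **Low**: Maintain current coverage levels across all files');

  return ['### 🔎 Notable Findings', '', ...findings, '', '### 🎯 Recommendations', '', ...recs].join('\n');
}

// Made-up sample data for illustration only.
const sample = {
  total: { statements: { pct: 70 } },
  '/repo/src/logger.ts': { statements: { pct: 40 } },
  '/repo/src/host-iptables.ts': { statements: { pct: 75 } },
  '/repo/src/utils.ts': { statements: { pct: 95 } },
};
console.log(buildFindingsAndRecs(sample, '/repo'));
```

One thing the sketch makes visible: the security check is a substring match, so a short entry such as 'cli' also matches any path that merely contains those letters (a hypothetical client.ts, for example). A stricter match on file basenames may be worth considering.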

3. Add Fast-Fail Gate After Coverage Step

Estimated savings: Eliminates ~161.6 AIC waste on failed runs (50% of recent runs)

The 50% failure rate (run 28313083143 failed at 4 min, conclusion: failure) wastes action minutes without producing any value. Add a gate step immediately after Run coverage:

- name: Verify coverage data
  id: coverage-gate
  run: |
    if [ ! -f coverage/coverage-summary.json ]; then
      echo "::error::Coverage data missing — aborting"
      exit 1
    fi
    echo "Coverage data verified: $(wc -c < coverage/coverage-summary.json) bytes"

This fails the job immediately before the 6 downstream extraction steps and the AWF agent start, saving ~10+ minutes of wasted runner time per failed run.
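
The shell gate above only checks that the file exists. If a stricter check is wanted, a small Node.js validation could also confirm that the file parses and contains a numeric total. This is a sketch under assumptions: `checkCoverageSummary` is a hypothetical helper, and the `total.statements.pct` field is assumed from the shape the other pre-steps read.

```javascript
// Sketch: validate that coverage output exists, parses, and has a usable `total` entry.
const fs = require('fs');
const os = require('os');
const path = require('path');

function checkCoverageSummary(filePath) {
  if (!fs.existsSync(filePath)) {
    return { ok: false, reason: `missing file: ${filePath}` };
  }
  let summary;
  try {
    summary = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  } catch (err) {
    return { ok: false, reason: `invalid JSON: ${err.message}` };
  }
  const pct = summary && summary.total && summary.total.statements && summary.total.statements.pct;
  if (typeof pct !== 'number') {
    return { ok: false, reason: 'no numeric total.statements.pct' };
  }
  return { ok: true, reason: `total statement coverage ${pct}%` };
}

// Demo against throwaway files in a temp directory (illustrative values only).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cov-gate-'));
const good = path.join(dir, 'coverage-summary.json');
fs.writeFileSync(good, JSON.stringify({ total: { statements: { pct: 82.5 } } }));
console.log(checkCoverageSummary(good));
console.log(checkCoverageSummary(path.join(dir, 'absent.json')));
```

In a workflow step, the caller would set `process.exitCode = 1` when `ok` is false so the job stops before the extraction steps run.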

4. Evaluate Removing the Push Trigger

Estimated savings: ~161.6 AIC per avoided duplicate run

The workflow triggers on both schedule: daily and push: branches: [main], paths: [src/**/*.ts]. Because this is a reporting workflow that makes no code changes, the push trigger can cause same-day duplicate runs whenever source files land on main.

# Consider removing this trigger block:
on:
  push:
    branches: [main]
    paths:
      - 'src/**/*.ts'

Keep workflow_dispatch: for on-demand runs. The daily schedule is sufficient for trend reporting.

5. Consolidate Node.js Pre-Step Calls

Estimated savings: ~1–2 action minutes per run

Steps coverage-table, critical-gaps, func-audit, and coverage-gaps-brief all open coverage/coverage-summary.json in separate node -e "..." processes. Merge these into one step that writes all four GITHUB_OUTPUT entries in a single Node.js invocation.
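
A consolidated step might look roughly like the sketch below: parse the summary once, then emit every output using the multiline `NAME<<DELIMITER` syntax that GITHUB_OUTPUT accepts. The builder functions and output names here are placeholders, since the real formatting logic of each existing step is not shown in this issue. Note that outputs are keyed by step id, so any `steps.<id>.outputs.*` references would need to point at the merged step.

```javascript
// Sketch: build several multiline step outputs from one parsed summary.
const crypto = require('crypto');

// GitHub Actions multiline output syntax: NAME<<DELIM, value lines, DELIM.
function formatOutput(name, value, delimiter = `EOF_${crypto.randomBytes(8).toString('hex')}`) {
  if (value.split('\n').includes(delimiter)) {
    throw new Error(`delimiter collides with a line of output "${name}"`);
  }
  return `${name}<<${delimiter}\n${value}\n${delimiter}\n`;
}

// Hypothetical builders; stand-ins for the logic of the existing separate steps.
const builders = {
  COVERAGE_TABLE: (summary) => `Total statements: ${summary.total.statements.pct}%`,
  COVERAGE_GAPS_BRIEF: (summary) =>
    Object.keys(summary)
      .filter((k) => k !== 'total' && summary[k].statements.pct < 80)
      .join('\n') || 'none',
};

function buildAllOutputs(summary) {
  return Object.entries(builders)
    .map(([name, build]) => formatOutput(name, build(summary)))
    .join('');
}

// Made-up data; in the workflow the result would be appended to the file
// named by process.env.GITHUB_OUTPUT instead of printed.
const demoSummary = {
  total: { statements: { pct: 70 } },
  'src/a.ts': { statements: { pct: 40 } },
};
console.log(buildAllOutputs(demoSummary));
```

A random delimiter per output avoids the (unlikely) case where a generated line equals a fixed `EOF` marker and truncates the value.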

Expected Impact

| Metric | Current | Projected (Rec 1+2) | Savings |
| --- | --- | --- | --- |
| AIC / run | ~161.6 | ~35 | ~78% |
| Action minutes / run | ~15 | ~5 | ~67% |
| Agent LLM turns | multi-turn | 1 | ~80%+ |
| Failed-run AIC waste | ~161.6 | ~0 (with Rec 3) | 100% |
| Runs/week (estimate) | ~10 | ~7 (with Rec 4) | ~30% |
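
The percentage savings in the table follow directly from the Current and Projected figures. As a quick arithmetic check (a sketch; `savingsPct` is just an illustrative helper name), the numeric rows can be recomputed like this:

```javascript
// Recompute the "Savings" column from the Current and Projected columns above.
function savingsPct(current, projected) {
  return Math.round(((current - projected) / current) * 100);
}

const rows = [
  { metric: 'AIC / run', current: 161.6, projected: 35 },
  { metric: 'Action minutes / run', current: 15, projected: 5 },
  { metric: 'Runs/week (estimate)', current: 10, projected: 7 },
];
for (const row of rows) {
  console.log(`${row.metric}: ~${savingsPct(row.current, row.projected)}%`);
}
// Prints ~78%, ~67%, and ~30% for the three rows.
```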

Implementation Checklist

  • Add ${{ steps.discussion-template.outputs.DISCUSSION_BODY }} to the agent prompt body (Rec 1)
  • Update the agent task to "append Notable Findings + Recommendations to pre-built body, then call create_discussion" (Rec 1)
  • Add the findings pre-step with rule-based generation of both missing sections (Rec 2)
  • Simplify agent prompt to a single-call instruction once Rec 2 is in place (Rec 2)
  • Add coverage-gate step after Run coverage to fail fast on missing data (Rec 3)
  • Evaluate removing the push: trigger (Rec 4)
  • Consolidate coverage-table, critical-gaps, func-audit, coverage-gaps-brief into one Node.js step (Rec 5, optional)
  • Recompile: gh aw compile .github/workflows/test-coverage-reporter.md
  • Post-process: npx tsx scripts/ci/postprocess-smoke-workflows.ts (if applicable)
  • Verify CI passes on PR
  • Compare AIC on new run vs 161.6 baseline

Generated by Daily Copilot Token Optimization Advisor · 62.2 AIC · ⊞ 6.7K
