GitHub Slashes Agent Workflow Token Spend up to 62% with Daily Audits and MCP Pruning

Relevance: 9.8
Score Breakdown
  • technical depth: 9
  • novelty: 9
  • actionability: 9
  • community: 8
  • strategic: 7
  • personal: 10

Scored daily by a customisable AI persona to surface the most relevant engineering leadership news.

GitHub's token-cost optimization techniques for agentic workflows are directly actionable and novel.

2026-05-29 AI/ML infoq.com
Summary

GitHub cut token usage in its agentic CI workflows by up to 62% by pruning unused Model Context Protocol (MCP) tools, replacing MCP calls with gh CLI commands, and deploying daily audit and optimization agents. To normalize cost across models, the team uses an Effective Tokens (ET) metric that weights output tokens 4× and cache reads 0.1×, then applies a model multiplier (Haiku 0.25×, Sonnet 1.0×, Opus 5.0×). Two agents shipped in the gh-aw CLI, the Daily Token Usage Auditor and the Daily Token Optimiser, surfaced that removing unused MCP tools cut per-call context by 8–12 KB in smoke-test workflows. Pruning was ineffective, however, where tool manifests made up only a small fraction of overall context, as in the Community Attribution workflow.
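The ET weighting described above can be sketched in a few lines of Python. This is an illustrative reading of the summary, not GitHub's implementation: the output and cache-read weights and the model multipliers come from the text, while the weight of 1.0 for uncached input tokens, the choice to multiply the weighted sum by the model multiplier, and all names (`Usage`, `effective_tokens`) are assumptions.

```python
from dataclasses import dataclass

# Weights stated in the summary.
OUTPUT_WEIGHT = 4.0
CACHE_READ_WEIGHT = 0.1
MODEL_MULTIPLIER = {"haiku": 0.25, "sonnet": 1.0, "opus": 5.0}

# Assumption: the summary does not state a weight for uncached input tokens.
INPUT_WEIGHT = 1.0


@dataclass
class Usage:
    """Token counts for one model call (hypothetical record shape)."""
    model: str
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int = 0


def effective_tokens(usage: Usage) -> float:
    """Weight each token class, then scale by the model multiplier.

    Assumption: the multiplier applies to the whole weighted sum.
    """
    weighted = (
        INPUT_WEIGHT * usage.input_tokens
        + OUTPUT_WEIGHT * usage.output_tokens
        + CACHE_READ_WEIGHT * usage.cache_read_tokens
    )
    return MODEL_MULTIPLIER[usage.model] * weighted


if __name__ == "__main__":
    for model in ("haiku", "sonnet", "opus"):
        call = Usage(model, input_tokens=1000, output_tokens=200, cache_read_tokens=5000)
        print(model, effective_tokens(call))
```

Under these assumptions, the same call costs 20 times more ET on Opus than on Haiku, which is what lets one number compare workflows that run on different models.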

Key Takeaways
  • Implement a daily audit-and-optimize agent loop that tracks token usage via a normalized metric (ET), prunes unused MCP tools, and replaces expensive MCP calls with pre-downloaded CLI commands to cut agent workflow costs by up to 62%.
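The pruning step of such an audit can be sketched as a comparison between the tools a workflow declares and the tools its runs actually called. The sketch below is hypothetical throughout: the manifest shape, the tool names, the call log, and the 40,000-byte context size are invented for illustration and do not reflect the gh-aw CLI or any real MCP server. It also estimates what share of the context the unused tools occupy, since the summary notes that pruning does little when the manifest is a small fraction of overall context.

```python
import json


def unused_tool_bytes(manifest: dict, called_tools: list) -> dict:
    """Map each declared-but-never-called tool to the byte size of its schema.

    `manifest` maps tool name -> JSON-serialisable schema (hypothetical shape).
    """
    used = set(called_tools)
    return {
        name: len(json.dumps(schema).encode("utf-8"))
        for name, schema in manifest.items()
        if name not in used
    }


def pruning_share(unused: dict, total_context_bytes: int) -> float:
    """Fraction of per-call context that removing the unused tools would save."""
    if total_context_bytes <= 0:
        raise ValueError("total_context_bytes must be positive")
    return sum(unused.values()) / total_context_bytes


# Hypothetical data: three declared tools, only one ever called.
manifest = {
    "list_issues": {"description": "List issues in a repository"},
    "create_issue": {"description": "Open a new issue"},
    "search_code": {"description": "Search code across repositories"},
}
called = ["list_issues", "list_issues"]

unused = unused_tool_bytes(manifest, called)
share = pruning_share(unused, total_context_bytes=40_000)

if __name__ == "__main__":
    print(sorted(unused), f"{share:.2%}")
```

A low share is a signal to look elsewhere for savings, such as prompt size or the data fetched per call, rather than at the tool manifest.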
Why it matters

For engineers building LLM agent pipelines in CI, this offers a field-tested pattern (proxy-level token observability plus automated optimization agents) for systematically reducing runaway token costs without sacrificing functionality.

Author

Mark Silvester