
Open-Source Coding Agents: One Ties Sonnet, One Won't Listen

Relevance: 6.8
Score breakdown: technical depth 7, novelty 7, actionability 6, community 6, strategic 6, personal 9


A comparison of open-source coding agents is highly relevant to AI/ML and developer tools.

AI/ML dev.to
Summary

Open-source coding agents GLM 5.2 and MiniMax M3 now match or beat Claude Sonnet 4.6 on quality across 1,000 real coding tasks. GLM 5.2 scores 91.9 overall versus Sonnet's 90.8 and costs $0.289 per task versus Sonnet's $0.296. Qwen3.7-Plus is the cheapest at $0.068 per task, but it scores lowest at 82.2 overall and struggles with instruction-following. Adding skill-based context raises every model's score by about 20 points, mostly by improving instruction-following rather than task completion.
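
The quality and cost trade-off in these figures can be checked with a little arithmetic. The sketch below uses only the numbers quoted in the summary. MiniMax M3 is left out because no score or cost is given for it here. The "cost per score point" metric and the names `RESULTS` and `cost_per_point` are assumptions made for illustration, not something the article defines.

```python
# Figures copied from the summary above. MiniMax M3 is omitted because the
# summary gives no score or per-task cost for it.
RESULTS = {
    "GLM 5.2": {"score": 91.9, "cost_per_task": 0.289},
    "Claude Sonnet 4.6": {"score": 90.8, "cost_per_task": 0.296},
    "Qwen3.7-Plus": {"score": 82.2, "cost_per_task": 0.068},
}


def cost_per_point(model: str) -> float:
    """Dollars per task divided by overall score (a hypothetical metric)."""
    result = RESULTS[model]
    return result["cost_per_task"] / result["score"]


# Highest overall score and lowest per-task cost among the three models.
best_quality = max(RESULTS, key=lambda m: RESULTS[m]["score"])
cheapest = min(RESULTS, key=lambda m: RESULTS[m]["cost_per_task"])

# Gap between GLM 5.2 and Sonnet on the overall score.
score_gap = RESULTS["GLM 5.2"]["score"] - RESULTS["Claude Sonnet 4.6"]["score"]

# Print models from cheapest to most expensive per score point.
for model in sorted(RESULTS, key=cost_per_point):
    print(f"{model}: ${cost_per_point(model):.5f} per score point")
```

A ratio like this favors Qwen3.7-Plus on price alone, but it says nothing about the instruction-following problems the summary reports, so it is best read next to the raw scores rather than in place of them.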

Author

Tessl
