Open-Source Coding Agents: One Ties Sonnet, One Won't Listen

AI/ML dev.to

Summary

Across 1,000 real coding tasks, the open-source coding agents GLM 5.2 and MiniMax M3 now match or beat Claude Sonnet 4.6 on quality. GLM 5.2 scores 91.9 overall against Sonnet's 90.8 and costs slightly less, at $0.289 per task against $0.296. Qwen3.7-Plus is the cheapest at $0.068 per task, but it scores lowest at 82.2 overall and struggles with instruction-following. Adding skill-based context raises every model's score by about 20 points, mostly by improving instruction-following rather than task completion.
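To put the cost and quality figures side by side, here is a minimal Python sketch that uses only the numbers quoted in the summary. It makes two assumptions that are not from the article: that each model's per-task cost is an average that can be multiplied across all 1,000 tasks, and that "points per dollar" (overall score divided by per-task cost) is a useful value metric. The function names are hypothetical, and MiniMax M3 is left out because the summary gives no score or cost for it.

```python
# Figures copied from the summary; MiniMax M3 is omitted because the
# summary does not state its overall score or per-task cost.
RESULTS = {
    "GLM 5.2": {"score": 91.9, "cost_per_task": 0.289},
    "Claude Sonnet 4.6": {"score": 90.8, "cost_per_task": 0.296},
    "Qwen3.7-Plus": {"score": 82.2, "cost_per_task": 0.068},
}
TASKS = 1000  # size of the task set mentioned in the summary


def total_cost(cost_per_task, tasks=TASKS):
    """Estimated spend for the full task set (assumes a flat average cost)."""
    return cost_per_task * tasks


def points_per_dollar(score, cost_per_task):
    """Illustrative value metric (an assumption, not the article's metric)."""
    return score / cost_per_task


def summarize(results):
    """Return (name, score, total cost, points per dollar), best value first."""
    rows = [
        (
            name,
            r["score"],
            total_cost(r["cost_per_task"]),
            points_per_dollar(r["score"], r["cost_per_task"]),
        )
        for name, r in results.items()
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for name, score, cost, value in summarize(RESULTS):
        print(f"{name:<18} score={score:5.1f}  "
              f"cost/1000 tasks=${cost:7.2f}  points per dollar={value:7.1f}")
```

On this illustrative metric Qwen3.7-Plus ranks first despite its lower score, which is why the instruction-following caveat matters: a cheap run only pays off if its output is usable.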

Author

Tessl