
Anthropic Unveils Claude Sonnet 4.5: A Game-Changer in Software Engineering

Claude Sonnet 4.5 sustains 30+ hours of focus and adds new features for longer tasks. It posts leading scores on coding and computer-use benchmarks, and it is now available through the Anthropic API and apps, AWS Bedrock, Google Cloud Vertex AI, and GitHub Copilot.



Anthropic has unveiled Claude Sonnet 4.5, a significant step forward in end-to-end software engineering and practical computer use. The model shows marked improvements on coding tasks, with teams reporting that it stays focused on complex, multi-step work for more than 30 hours.

The release arrives alongside updates to Claude Code: 'Checkpoints' for saving progress and reverting to an earlier state, an updated terminal interface, and a native VS Code extension. Together, these enhancements let agents handle longer contexts and manage more complex tasks autonomously over long sessions.

The model also shows substantial gains across common reasoning and math evaluations. On the OSWorld-Verified benchmark, it leads at 61.4%, reflecting stronger tool control and UI manipulation for browser and desktop tasks. On SWE-bench Verified, it reached 78.2% accuracy with a 1M-context setting, and 82.0% under a higher-compute setting with parallel sampling.

Claude Sonnet 4.5 also shows significant improvements in code quality, security vulnerability detection, and domain-specific knowledge (finance, law, medicine, STEM). It is released under Anthropic's AI Safety Level 3 (ASL-3) protections, with strengthened defenses against prompt injection.

Claude Sonnet 4.5 is now available through the Anthropic API and apps, AWS Bedrock, Google Cloud Vertex AI, and GitHub Copilot. With its enhanced capabilities and real-world performance, it is positioned to set a new benchmark among coding models for software engineering and practical computer use.
