AWS Ups Its Agentic AI Game with Anthropic Claude 3.5 Sonnet Update

Following yesterday's agentic AI moves from fellow cloud giants Microsoft and Google, Amazon Web Services (AWS) upped its own autonomous AI game with new capabilities for its Amazon Bedrock service, which now offers an updated Claude 3.5 Sonnet model from partner Anthropic.

"You now have access to an upgraded Claude 3.5 Sonnet model that builds upon its predecessor's strengths, offering even more intelligence at the same cost," AWS said in an Oct. 22 blog post. "Claude 3.5 Sonnet continues to improve its capability to solve real-world software engineering tasks and follow complex, agentic workflows."

The latter is one of the hottest areas of AI right now, as evidenced by those cloud giant moves reported in "Microsoft and Google Turn to Autonomous AI Agents."

Autonomous AI agents can perform tasks or solve problems independently with minimal, non-continuous human intervention. They operate based on predefined goals, using AI techniques such as machine learning, natural language processing, or reinforcement learning to interact with their environment, gather information, and make decisions.

Agentic abilities were highlighted in Anthropic's own post today announcing the Claude 3.5 Sonnet update and Claude 3.5 Haiku: "The updated Claude 3.5 Sonnet shows wide-ranging improvements on industry benchmarks, with particularly strong gains in agentic coding and tool use tasks. On coding, it improves performance on SWE-bench Verified from 33.4% to 49.0%, scoring higher than all publicly available models -- including reasoning models like OpenAI o1-preview and specialized systems designed for agentic coding. It also improves performance on TAU-bench, an agentic tool use task, from 62.6% to 69.2% in the retail domain, and from 36.0% to 46.0% in the more challenging airline domain."

[Figure: Benchmarks (source: Anthropic).]
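
For readers who want to compare the quoted scores directly, the short sketch below computes each gain in percentage points and as a relative improvement. The scores are the ones in Anthropic's statement above; the names `scores`, `gains`, and `summary` are made up for this illustration.

```python
# Scores quoted by Anthropic, as (before, after) percentages.
scores = {
    "SWE-bench Verified": (33.4, 49.0),
    "TAU-bench retail": (62.6, 69.2),
    "TAU-bench airline": (36.0, 46.0),
}


def gains(before, after):
    """Return (percentage-point gain, relative gain in percent), each rounded to 1 decimal."""
    points = round(after - before, 1)
    relative = round((after - before) / before * 100, 1)
    return points, relative


summary = {name: gains(*pair) for name, pair in scores.items()}
for name, (points, relative) in summary.items():
    print(f"{name}: +{points} points ({relative}% relative)")
```

By this arithmetic, the SWE-bench Verified jump is the largest of the three on both measures.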

That Claude 3.5 Haiku model, meanwhile, is "coming soon" to Amazon Bedrock, the fully managed AI cloud service.

AWS touted the ability of the new Claude to use computers by itself, calling it "a new frontier in AI interaction."

That is explained further: "Instead of restricting the model to use APIs, Claude has been trained on general computer skills, allowing it to use a wide range of standard tools and software programs. In this way, applications can use Claude to perceive and interact with computer interfaces. Software developers can integrate this API to enable Claude to translate prompts (for example, 'find me a hotel in Rome') into specific computer commands (open a browser, navigate this website, and so on)."

Software developers, meanwhile, can now use three new integrated tools providing "a virtual set of hands to operate a computer":

  • Computer tool -- This tool receives a screenshot and a goal as input and returns a description of the mouse and keyboard actions that should be performed to achieve that goal. For example, this tool can ask to move the cursor to a specific position, click, type, and take screenshots.
  • Text editor tool -- Using this tool, the model can ask to perform operations like viewing file contents, creating new files, replacing text, and undoing edits.
  • Bash tool -- This tool returns commands that can be run on a computer system to interact at a lower level as a user typing in a terminal.
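
To make that division of labor concrete, here is a minimal sketch of the application side: the model only asks for an action, and the developer's code decides how to carry it out. This is not the Anthropic or Amazon Bedrock API. The request format, the `FakeMachine` class, and the `run_action` function are all hypothetical, and the "machine" merely records what it is asked to do.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMachine:
    """Hypothetical stand-in for a real desktop: it records requests, nothing more."""
    log: list = field(default_factory=list)
    files: dict = field(default_factory=dict)


def run_action(machine, action):
    """Carry out one model-requested action and return a result for the model."""
    tool, name = action["tool"], action.get("action")
    if tool == "computer":
        # Mouse and keyboard requests: move, click, type, screenshot.
        machine.log.append((name, action.get("args", {})))
        return "done"
    if tool == "text_editor" and name == "create":
        machine.files[action["path"]] = action["text"]
        return "created"
    if tool == "text_editor" and name == "view":
        return machine.files[action["path"]]
    if tool == "text_editor" and name == "replace":
        text = machine.files[action["path"]]
        machine.files[action["path"]] = text.replace(action["old"], action["new"])
        return "replaced"
    if tool == "bash":
        # A real application would run this in a sandbox; the sketch only records it.
        machine.log.append(("bash", action["command"]))
        return "recorded"
    raise ValueError(f"unsupported action: {action!r}")


# Assumed request shapes, invented for this example.
machine = FakeMachine()
requests = [
    {"tool": "computer", "action": "click", "args": {"x": 120, "y": 48}},
    {"tool": "text_editor", "action": "create", "path": "notes.txt", "text": "hotel in Rome"},
    {"tool": "text_editor", "action": "replace", "path": "notes.txt", "old": "Rome", "new": "Milan"},
    {"tool": "bash", "command": "ls"},
]
results = [run_action(machine, request) for request in requests]
```

In a real integration, each result (including fresh screenshots) would be sent back to the model so it can choose its next step. Keeping execution in the application's own code is what lets developers decide which requests are allowed to run.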

Anthropic, though it teamed with Amazon much like OpenAI teamed with Microsoft, also works with Google and said its new models will be available on Google Cloud as well: "The upgraded Claude 3.5 Sonnet is now available for all users. Starting today, developers can build with the computer use beta on the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. The new Claude 3.5 Haiku will be released later this month."

AWS provides more information on its "Anthropic's Claude in Amazon Bedrock" site.

About the Author

David Ramel is an editor and writer at Converge 360.