Catch Up with What's New for AI in AWS in 2025

Things slow down in the tech industry over the holiday season, so here's a catch-up of AI news announced by Amazon Web Services (AWS) since Nov. 20, 2024, around the time many people started their seasonal vacations. The announcements range from updates to the Amazon Q dev tool to support for latency-optimized models.

Amazon Q Business Is Now SOC Compliant
Amazon Q Business, AWS's generative AI-powered assistant, has achieved SOC (System and Organization Controls) compliance as of Dec. 20, 2024. This certification covers SOC 1, 2, and 3, enabling customers to use Amazon Q Business for applications that require SOC compliance.

Key points of this announcement include:

  • Amazon Q Business can now be used for SOC-compliant tasks within enterprise systems
  • The certification provides insight into AWS's security processes and controls for protecting customer data
  • AWS maintains SOC compliance through rigorous third-party audits
  • The compliance applies to all AWS Regions where Amazon Q Business is available

This certification enhances Amazon Q Business's capability to handle sensitive enterprise data while maintaining high security and compliance standards. More info here.

Amazon Bedrock Agents, Flows, and Knowledge Bases Now Support Latency Optimized Models

These components of Amazon's generative AI platform, which enable developers to build sophisticated AI applications, now support latency-optimized models through the SDK, as announced on Dec. 23, 2024. The update gives AI applications built with Amazon Bedrock tooling faster response times and improved responsiveness.

Key features of this update include:

  • Support for latency-optimized versions of Anthropic's Claude 3.5 Haiku model and Meta's Llama 3.1 405B and 70B models
  • Reduced latency without compromising accuracy compared to standard models
  • Utilization of purpose-built AI chips like AWS Trainium2 and advanced software optimizations
  • Immediate integration into existing applications without additional setup or model fine-tuning

This enhancement is particularly beneficial for latency-sensitive applications such as real-time customer service chatbots and interactive coding assistants. The latency-optimized inference support is available in the US East (Ohio) Region via cross-region inference and can be accessed through the Amazon Bedrock SDK using a runtime configuration. More info here.
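As a rough sketch of what the runtime configuration mentioned above might look like, the snippet below builds a request for the Bedrock Runtime Converse API that asks for the latency-optimized variant of a model. The model ID, region, and prompt are illustrative assumptions, not values from the announcement; check the Bedrock console for the IDs available in your account.

```python
# Hedged sketch: requesting latency-optimized inference via the
# Bedrock Runtime Converse API's performance configuration.

def build_converse_request(model_id, prompt):
    """Build the kwargs for a bedrock-runtime converse() call,
    requesting the latency-optimized variant of the model."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        # This runtime configuration selects latency-optimized
        # inference when the model offers it.
        "performanceConfig": {"latency": "optimized"},
    }

request = build_converse_request(
    "us.anthropic.claude-3-5-haiku-20241022-v1:0",  # assumed model ID
    "Summarize our return policy in one sentence.",
)

# With AWS credentials configured, the request would be sent like this:
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-2")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])
```

Keeping the request construction separate from the network call makes it easy to toggle between standard and optimized latency per request.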

AWS Neuron Introduces Support for Trainium2 and NxD Inference
AWS released Neuron 2.21, introducing several significant updates to its AI infrastructure. The AWS Neuron SDK now supports model training and deployment across Trn1, Trn2, and Inf2 instances, available in various AWS Regions and instance types.

Highlights include:

  • Support for AWS Trainium2 chips and Amazon EC2 Trn2 instances, including the trn2.48xlarge instance type and Trn2 UltraServer
  • Introduction of NxD Inference, a PyTorch-based library integrated with vLLM for simplified deployment of large language and multi-modality models
  • Launch of Neuron Profiler 2.0 (beta) with enhanced capabilities and support for distributed workloads
  • Support for PyTorch 2.5
  • Llama 3.1 405B model inference support on a single trn2.48xlarge instance using NxD Inference
  • Updates to Deep Learning Containers and AMIs, with support for new model architectures like Llama 3.2, Llama 3.3, and Mixture-of-Experts (MoE) models
  • New inference features including FP8 weight quantization and flash decoding for speculative decoding in Transformers NeuronX
  • Additional training examples and features, such as support for HuggingFace Llama 3/3.1 70B on Trn2 instances and DPO support for post-training model alignment
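To give a feel for the vLLM integration that NxD Inference provides, here is a hedged sketch of what serving a model on a Trainium instance might involve. The model name, core count, and other settings are illustrative assumptions, not documented values; the actual vLLM constructor call is shown commented out because it requires a Neuron-enabled vLLM build running on Trainium hardware.

```python
# Hedged sketch: settings one might hand to vLLM's LLM() constructor
# for a Trainium deployment via the NxD Inference integration.

def neuron_serving_config(model_id, cores):
    """Collect illustrative vLLM settings for a Neuron backend."""
    return {
        "model": model_id,             # assumed Hugging Face model ID
        "device": "neuron",            # selects vLLM's Neuron backend
        "tensor_parallel_size": cores, # shard across NeuronCores
        "max_model_len": 2048,
    }

cfg = neuron_serving_config("meta-llama/Llama-3.1-405B-Instruct", 64)

# On a trn2.48xlarge with a Neuron-enabled vLLM build installed:
# from vllm import LLM, SamplingParams
# llm = LLM(**cfg)
# outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
# print(outputs[0].outputs[0].text)
```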

More info here.

Llama 3.3 70B Now Available on AWS via Amazon SageMaker JumpStart
AWS made Meta's Llama 3.3 70B model available through Amazon SageMaker JumpStart as of Dec. 26, 2024. This large language model offers a balance of high performance and computational efficiency, making it suitable for cost-effective AI deployments.

Key features of Llama 3.3 70B include:

  • Enhanced attention mechanism for reduced inference costs
  • Training on approximately 15 trillion tokens
  • Extensive supervised fine-tuning and Reinforcement Learning from Human Feedback (RLHF)
  • Comparable output quality to larger Llama versions with fewer resources
  • Inference operations that are nearly five times more cost-effective, according to Meta

Customers can deploy Llama 3.3 70B using either the SageMaker JumpStart user interface or programmatically via the SageMaker Python SDK. SageMaker AI's advanced inference capabilities optimize both performance and cost efficiency for deployments.
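For the programmatic route, a deployment might look roughly like the sketch below, which uses the SageMaker Python SDK's JumpStart interface. The model ID and instance type are assumptions for illustration; look up the exact JumpStart model ID in the catalog for your Region. The deploy call itself is commented out because it requires SageMaker permissions and provisions billable infrastructure.

```python
# Hedged sketch: deploying Llama 3.3 70B through SageMaker JumpStart
# using the SageMaker Python SDK.

def jumpstart_deploy_plan(model_id, instance_type):
    """Gather the arguments for a JumpStartModel(...).deploy() call."""
    return {
        "model_id": model_id,           # assumed JumpStart catalog ID
        "instance_type": instance_type, # assumed GPU instance type
    }

plan = jumpstart_deploy_plan(
    "meta-textgeneration-llama-3-3-70b-instruct",
    "ml.p4d.24xlarge",
)

# With SageMaker permissions in place:
# from sagemaker.jumpstart.model import JumpStartModel
# model = JumpStartModel(model_id=plan["model_id"])
# predictor = model.deploy(instance_type=plan["instance_type"],
#                          accept_eula=True)
# print(predictor.predict({"inputs": "Explain RLHF briefly."}))
```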

The model is available in all AWS Regions where Amazon SageMaker AI is supported. More information is here and in a separate blog post.

Amazon Q Developer Is Now Available in Amazon SageMaker Code Editor IDE
The general availability of Amazon Q Developer in Amazon SageMaker Studio Code Editor was the first AWS AI announcement of 2025, posted yesterday. "SageMaker Studio customers now get generative AI assistance powered by Q Developer right within their Code Editor (Visual Studio Code - Open Source) IDE," AWS said. "With Q Developer, data scientists and ML engineers can access expert guidance on SageMaker features, code generation, and troubleshooting. This allows for more productivity by eliminating the need for tedious online searches and documentation review, and ensuring more time delivering differentiated business value."

Key features and benefits of Amazon Q Developer in SageMaker Studio Code Editor include:

  • Expert guidance on SageMaker features
  • Code generation tailored to user needs
  • In-line code suggestions and conversational assistance
  • Step-by-step troubleshooting guidance
  • Chat capability for discovering and learning SageMaker features

This integration aims to enhance productivity for data scientists and ML engineers by:

  • Eliminating the need for extensive documentation review
  • Accelerating the model development lifecycle
  • Streamlining code editing, explanation, and documentation processes
  • Providing efficient error resolution

Amazon Q Developer is now available in all commercial AWS Regions where SageMaker Studio is supported. The feature is accessible to both Amazon Q Developer Free Tier and Pro Tier users, with pricing varying by tier.

The addition of Amazon Q Developer to SageMaker Studio Code Editor represents a significant step in AWS's efforts to integrate generative AI capabilities into its machine learning development environment, potentially transforming the workflow for data scientists and ML engineers, the company said.

About the Author

David Ramel is an editor and writer at Converge 360.
