Amazon.com Inc. has launched Amazon Nova, a next-generation suite of foundation models (FMs) designed to deliver state-of-the-art intelligence across a wide range of tasks with industry-leading price performance. Available through Amazon Bedrock, Amazon Nova includes multiple models aimed at transforming AI capabilities for developers and businesses. The launch introduces Amazon Nova Micro, a high-speed text-to-text model with ultra-low latency and affordability, as well as Amazon Nova Lite, Nova Pro, and Nova Premier—multimodal models capable of processing text, images, and video to generate text. In addition, Amazon has unveiled two creative models: Amazon Nova Canvas, which generates studio-quality images, and Amazon Nova Reel, designed for high-quality video generation from text and images.
Rohit Prasad, SVP of Amazon Artificial General Intelligence, stated, “With over 1,000 generative AI applications in motion, our new Amazon Nova models aim to address the challenges faced by application builders, providing exceptional intelligence and content generation capabilities while improving latency, cost-effectiveness, and customization.”
Amazon Nova models are built for speed and efficiency, with all models offering multilingual and multimodal support, including the ability to process text, images, and videos in over 200 languages. Amazon Nova Micro, Nova Lite, and Nova Pro support context lengths up to 300K tokens, enabling the processing of up to 30 minutes of video. The models are also designed to be cost-effective, with Amazon Nova Micro, Lite, and Pro priced at least 75% lower than other high-performing models in their respective classes. In addition to being fast and affordable, these models are easy to integrate with Amazon Bedrock, a fully managed service that allows users to access various FMs through a single API.
To enhance model accuracy, Amazon Nova supports custom fine-tuning, allowing developers to use proprietary data to tailor model responses. Additionally, Amazon Nova models support distillation, enabling knowledge transfer from larger models to smaller, more efficient ones that retain high accuracy while being faster and cheaper to operate.
Amazon Nova also introduces creative capabilities with its Canvas and Reel models. Amazon Nova Canvas generates high-quality images from text and image prompts, offering advanced controls for customization such as color scheme and layout. The model has been shown to outperform other image generators like OpenAI DALL-E 3 and Stable Diffusion in side-by-side human evaluations. Meanwhile, Amazon Nova Reel enables the generation of high-quality videos for content creation, including marketing, advertising, and training, with controls for visual style, pacing, and camera movement. It will soon support videos up to two minutes in length.
Looking ahead, Amazon plans to release the Amazon Nova Speech-to-Speech model in early 2025. This model will revolutionize conversational AI with low-latency, human-like interactions by understanding streaming speech input and interpreting verbal and non-verbal cues. Additionally, Amazon is developing a multimodal-to-multimodal model capable of processing text, images, audio, and video inputs and generating outputs across these modalities. This groundbreaking model will be available in mid-2025, further expanding the potential of AI-driven applications.
With Amazon Nova, businesses and developers can unlock new possibilities in AI-driven content generation, customization, and innovation, while benefiting from enhanced performance and cost-efficiency.