Google releases major updates for Gemini models

Google has announced significant updates to its Gemini models, aimed at making advanced AI capabilities more accessible and cost-effective for developers worldwide. Two new production-ready models, Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, feature major improvements in speed and performance.

Key Updates:

  • Price reductions: 64% on input tokens, 52% on output tokens, and 64% on incremental cached tokens for Gemini 1.5 Pro.
  • Increased rate limits: the rate limit for 1.5 Flash has doubled to 2,000 RPM, and for 1.5 Pro it has nearly tripled to 1,000 RPM.
  • Improved speed: Gemini models now offer 2x faster output and 3x lower latency, making it easier for developers to implement high-performance AI in real time.
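To make the pricing change concrete, here is a minimal sketch that applies the announced percentage reductions. Note that the base prices below are made-up placeholders for illustration; only the 64% and 52% figures come from the announcement.

```python
def discounted(old_price: float, reduction_pct: float) -> float:
    """Apply a percentage price reduction to a per-million-token price."""
    return old_price * (1 - reduction_pct / 100)

# Hypothetical old prices (USD per 1M tokens) -- placeholders, not Google's rates.
new_input_price = discounted(3.50, 64)    # input tokens: 64% cheaper
new_output_price = discounted(10.50, 52)  # output tokens: 52% cheaper
```

A 64% cut means paying 36% of the old price, so the savings compound quickly for high-volume workloads.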

The Gemini 1.5 series is designed for a broad range of tasks, including text, code, and multimodal applications. These models can handle large inputs like 1,000-page PDFs and hour-long videos, offering enhanced performance in key areas:

  • A 7% improvement in the MMLU-Pro benchmark, which evaluates AI comprehension.
  • A 20% improvement in complex math tasks such as MATH and HiddenMath.
  • Better results for visual understanding and Python code generation.

Responding to developer feedback, Gemini 1.5 models now generate more concise output – about 5-20% shorter than previous versions. This is especially useful for summarization and information extraction, reducing overall costs while maintaining clarity and accuracy.

The new models come with updated safety filters, allowing developers to customize them based on specific needs. Default filters have been adjusted to balance user instruction compliance and safety.
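As a rough sketch of what customizing those filters looks like, the public Gemini API accepts per-category safety settings. The category and threshold names below follow the API's documented enums, but verify the exact values against the official reference before relying on them.

```python
def make_safety_settings(threshold: str = "BLOCK_MEDIUM_AND_ABOVE") -> list[dict]:
    """Build a safety_settings list applying one threshold to common harm categories."""
    categories = [
        "HARM_CATEGORY_HARASSMENT",
        "HARM_CATEGORY_HATE_SPEECH",
        "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "HARM_CATEGORY_DANGEROUS_CONTENT",
    ]
    return [{"category": c, "threshold": threshold} for c in categories]

# Relax filtering to block only high-probability harmful content.
settings = make_safety_settings("BLOCK_ONLY_HIGH")
```

Passing such a list alongside a generation request overrides the defaults, which is what "customize them based on specific needs" amounts to in practice.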

An improved experimental version, Gemini-1.5-Flash-8B-Exp-0924, has also been released. This model includes significant upgrades in both text and multimodal capabilities and is available via Google AI Studio and the Gemini API.

These latest improvements make the Gemini 1.5 models faster, more cost-effective, and better suited for a wide range of applications. Developers can access these models for free via Google AI Studio, while larger organizations and Google Cloud customers can leverage them through Vertex AI.
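For developers getting started via the Gemini API, here is a minimal sketch of a `generateContent` request body for the new models. It follows the general shape of the public REST API (contents, parts, generationConfig) but builds the payload locally without sending it; field names should be checked against the official API reference.

```python
def build_generate_request(model: str, prompt: str, max_output_tokens: int = 256) -> dict:
    """Build a generateContent-style request body for a Gemini 1.5 model."""
    return {
        "model": model,
        "body": {
            "contents": [{"parts": [{"text": prompt}]}],
            "generationConfig": {"maxOutputTokens": max_output_tokens},
        },
    }

req = build_generate_request(
    "gemini-1.5-flash-002",
    "Summarize this document in three bullet points.",
)
```

Swapping the model string to "gemini-1.5-pro-002" targets the Pro model; everything else in the request stays the same.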

For more details, visit the Google Developers Blog.
