The Gemini 2.5 models were developed as a family of hybrid reasoning models, offering strong performance while also optimizing for cost and speed. The 2.5 Pro and Flash models are now stable and generally available. Additionally, 2.5 Flash-Lite, the most cost-efficient and fastest 2.5 model to date, is being introduced in preview.
Making 2.5 Flash and 2.5 Pro generally available
Informed by user feedback, stable versions of Gemini 2.5 Flash and Pro are now available, so developers can build production applications with confidence. Many developers and organizations have already been using these versions in production for several weeks.
Introducing Gemini 2.5 Flash-Lite
A preview of the new Gemini 2.5 Flash-Lite, the most cost-efficient and fastest model in the 2.5 family, is also now available. Developers can start building with the preview immediately, and feedback is encouraged.
Gemini 2.5 Flash-Lite outperforms 2.0 Flash-Lite across coding, math, science, reasoning, and multimodal benchmarks. It excels at high-volume, latency-sensitive tasks such as translation and classification, with lower latency than both 2.0 Flash-Lite and 2.0 Flash across a broad sample of prompts. It retains the key Gemini 2.5 capabilities: a controllable ‘thinking’ budget, integration with tools such as Google Search and code execution, multimodal input, and a 1 million-token context window.
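As an illustration of the controllable thinking budget, the sketch below builds a `generateContent` request against the public Gemini REST API with thinking disabled (`thinkingBudget: 0`), the setting suited to latency-sensitive work like classification. The exact preview model identifier is an assumption here; check Google AI Studio for the current name.

```python
import json
import os
import urllib.request

# Endpoint pattern follows the public Gemini REST API. The model name is an
# assumption for illustration; verify the current identifier in AI Studio.
API_URL = ("https://generativelanguage.googleapis.com/v1beta/models/"
           "gemini-2.5-flash-lite:generateContent")


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent payload.

    thinking_budget=0 turns 'thinking' off for low-latency calls; a larger
    budget allows the model to spend more tokens reasoning before answering.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


def classify(prompt: str, api_key: str) -> str:
    """Send a non-thinking request and return the model's text response."""
    payload = json.dumps(build_request(prompt, thinking_budget=0)).encode()
    req = urllib.request.Request(
        f"{API_URL}?key={api_key}",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    if key:  # only call the API when a key is configured
        print(classify("Classify the sentiment: 'The battery life is superb.'", key))
```

Raising `thinkingBudget` (or omitting `thinkingConfig` entirely) trades latency for reasoning quality on harder prompts, which is the adjustable behavior the 2.5 family exposes.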
Further details regarding the Gemini 2.5 family of models are available in the latest Gemini technical report.

The Gemini 2.5 Flash-Lite preview is now available in Google AI Studio and Vertex AI, alongside the stable versions of 2.5 Flash and Pro. Both 2.5 Flash and Pro are also accessible in the Gemini app, and custom versions of 2.5 Flash-Lite and Flash have been brought to Search.


