Last week, we launched an experimental, updated version of Gemini 1.5 Pro (0801) that ranked #1 on the LMSYS leaderboard for both text and multimodal queries. We were so excited by the immediate response to this model that we raised the limits for testing with it. We will have more updates soon.
Today, we’re announcing a series of improvements across AI Studio and the Gemini API:
1.5 Flash is our most popular Gemini model among developers building high-volume, low-latency use cases such as summarization, categorization, multimodal understanding, and more. To make this model even more affordable, as of August 12 we’re reducing the input price by 78% to $0.075/1 million tokens and the output price by 71% to $0.30/1 million tokens for prompts under 128K tokens, with the reductions carried through to the >128K-token tier and to context caching pricing. With these prices and tools like context caching, developers should see major cost savings when building with Gemini 1.5 Flash’s long context and multimodal capabilities.
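To illustrate how context caching compounds these savings, here is a minimal sketch using the Python SDK; the model version string, file name, and TTL are illustrative assumptions, not part of this announcement:

```python
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")  # assumes a Gemini API key from AI Studio

# Upload a long document once via the File API (file name is hypothetical).
transcript = genai.upload_file(path="long_transcript.txt")

# Cache the long context so repeated prompts don't resend it in full;
# cached input tokens are billed at the reduced caching rate.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",  # illustrative version string
    system_instruction="Answer questions about the provided transcript.",
    contents=[transcript],
    ttl=datetime.timedelta(minutes=30),
)

# Build a model bound to the cache and query it repeatedly.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("Summarize the key decisions discussed.")
print(response.text)
```

Each subsequent call against the cache pays the full input price only for the new prompt tokens, which is where long-context workloads see the largest savings.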
We’re expanding language understanding for both Gemini 1.5 Pro and Flash to cover more than 100 languages, so developers across the globe can now prompt and receive outputs in the language of their choice. This should eliminate the “language” block finish reason that the Gemini API previously returned for unsupported languages.
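As a quick sketch of what this looks like in practice (the model name and prompt are illustrative), a prompt in any supported language is now an ordinary generate_content call, and the finish reason can be inspected to confirm the response wasn’t blocked:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

# Prompt in any of the 100+ supported languages; the model replies in kind.
response = model.generate_content("Explique a computação quântica em termos simples.")

# With expanded language coverage, this should report a normal stop,
# not a "language" block.
print(response.candidates[0].finish_reason)
print(response.text)
```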
Google Workspace users can now access Google AI Studio by default, without needing to enable any additional settings, unlocking frictionless access for millions of users. Account admins will still have control to manage AI Studio access.
We have now rolled out Gemini 1.5 Flash text tuning to all developers via the Gemini API and Google AI Studio. Tuning lets developers customize base models and improve performance on a task by providing the model with additional data. This can shrink the context size of prompts, reduce latency and, in some cases, cost, while also increasing the model’s accuracy on the task.
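Here is a hedged sketch of what text tuning can look like with the Python SDK; the tuning source model name, hyperparameters, model ID, and training pairs are all illustrative assumptions:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Kick off a tuning job from a Flash base model with (input, output) pairs.
operation = genai.create_tuned_model(
    source_model="models/gemini-1.5-flash-001-tuning",  # illustrative name
    training_data=[
        {"text_input": "The food was cold and late.", "output": "negative"},
        {"text_input": "Friendly staff, great value.", "output": "positive"},
        # ... more examples; a modest labeled set can replace a long few-shot prompt
    ],
    id="my-sentiment-flash",  # hypothetical tuned-model ID
    epoch_count=5,
    batch_size=4,
    learning_rate=0.001,
)

# Tuning runs server-side; block until it finishes, then query the result.
tuned = operation.result()
model = genai.GenerativeModel(model_name=tuned.name)
print(model.generate_content("Arrived on time and delicious.").text)
```

Because the task examples are baked into the tuned model, the runtime prompt no longer needs to carry few-shot examples, which is where the context-size, latency, and cost reductions come from.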
Our developer documentation is core to the experience of building with the Gemini API. We recently shipped a series of improvements: updated content, better navigation, a refreshed look and feel, and a revamped API reference.
We have many more improvements to the documentation coming soon, so please continue to send us feedback!
The Gemini API and AI Studio now support PDF understanding through both text and vision. If your PDF includes graphs, images, or other non-text visual content, the model uses its native multimodal capabilities to process the PDF. You can try this out in Google AI Studio or via the Gemini API.
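For instance, a PDF can be uploaded through the File API and prompted directly; a minimal sketch with the Python SDK follows (the file name and prompt are illustrative):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Upload the PDF via the File API (file name is hypothetical). Pages with
# charts or images go through the model's native multimodal path rather
# than plain text extraction.
report = genai.upload_file(path="quarterly_report.pdf")

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    [report, "Summarize this report and describe what the charts show."]
)
print(response.text)
```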
Over the last few weeks, we have released many improvements to AI Studio, including overhauled keyboard shortcuts, drag-and-drop support for images in the UI, ~50% faster loading times, prompt suggestions, and much more!
Developers are at the heart of all our work on the Gemini API and Google AI Studio, so keep building and sharing your feedback with us via the Gemini API Developer Forum.