The Gemini Deep Research Agent is now available to developers via the Interactions API. Powered by Gemini 3.0 Pro, it autonomously plans, executes, and synthesizes multi-step research tasks.
Gemini is Google's largest and most capable AI model. Built from the ground up to be multimodal, it can generalize across, understand, operate on, and combine different types of information, including text, images, audio, video, and code.
Gemini 2.5 Computer Use is a new specialized model from Google that enables AI agents to interact with graphical user interfaces. It takes a screenshot and a user goal as input and generates actions, such as clicks and typing, to automate tasks on websites and apps.
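
The screenshot-in, action-out cycle described above can be sketched as a simple loop. This is a minimal illustration, not the actual Gemini API: `Action`, `run_agent`, and `scripted_model` are hypothetical names, and the model is replaced by a scripted stand-in so the example runs without any network access or SDK.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """A hypothetical UI action; the real model's action schema may differ."""
    kind: str       # "click", "type", or "done"
    x: int = 0      # click coordinates (used when kind == "click")
    y: int = 0
    text: str = ""  # text to enter (used when kind == "type")


# Assumed model interface: (goal, screenshot bytes, past actions) -> next action.
ModelFn = Callable[[str, bytes, List[Action]], Action]


def run_agent(
    goal: str,
    model: ModelFn,
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeatedly show the model the current screen and apply its next action."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = model(goal, take_screenshot(), history)
        history.append(action)
        if action.kind == "done":
            break  # the model signals that the goal is reached
        execute(action)  # perform the click or keystrokes in the UI
    return history


def scripted_model(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for a real model: click a field, type the goal, then stop."""
    script = [
        Action("click", x=120, y=48),
        Action("type", text=goal),
        Action("done"),
    ]
    return script[min(len(history), len(script) - 1)]
```

In a real agent, `take_screenshot` and `execute` would be backed by a browser automation tool and `model` by a call to the hosted model. The `max_steps` cap is a design choice in this sketch that keeps a confused model from looping indefinitely.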