https://www.youtube.com/watch?v=obtjaOy4blw Channel: Rob the AI guy

This video introduces three features recently launched for Google Gemini:

  1. Scheduled Actions: Users can now schedule tasks directly within Gemini’s web app. The feature is under “Settings & Help” in the bottom-left corner, then “Scheduled actions.” To create a scheduled action, prompt Gemini, e.g., “Please create me a scheduled action: Send me a daily summary of my calendar, to-do’s and important unread emails at 8 AM.” Gemini then uses its scheduler.schedule tool to set up the recurring action. Users can pause or edit a scheduled action, including changing its frequency (daily, weekly, monthly) and time. Because a scheduled action can only draw on connected services, the video stresses enabling the Google Workspace apps (Gmail, Calendar, Docs, Drive, Keep, Tasks) and other Google services (Flights, Hotels, Maps, YouTube, YouTube Music, OpenStax) in Gemini’s “Apps” settings. An example mockup shows a daily summary containing calendar events, a to-do list, and important unread emails.

  2. Imagen 4 & AI Studio for Image Generation: Google has integrated Imagen 4 (and Imagen 4 Ultra) into its free AI Studio platform (aistudio.google.com) for image generation. To access it, navigate to “Generate Media” and select “Imagen.” The video highlights the new Imagen 4 Ultra model for its speed, saying it generates hyper-realistic images significantly faster than competitors like ChatGPT (DALL-E 3). Users describe the image they want and Imagen generates it, with options to download, copy, export to Drive, and adjust the aspect ratio.

  3. Gemini CLI (Command Line Interface): This is an open-source AI agent that lets users interact with Gemini directly from their terminal. It’s aimed at developers and technical users, for tasks like writing code, debugging, and automating workflows. Key capabilities include: querying and editing large codebases (with a 1 million token context window); generating new applications from PDFs or sketches using Gemini’s multimodal capabilities; automating operational tasks like querying pull requests or handling complex rebases; connecting to other services (like Imagen, Veo, and Lyria for media generation) via MCP servers; and grounding queries with the Google Search tool. Examples provided include writing a Discord bot, summarizing Git changes, exploring codebases, implementing code drafts, automating workflows, and interacting with the system (e.g., converting images, organizing PDF invoices).

The video also briefly mentions new Gemma models (Gemma 3n E2B and Gemma 3n E4B) available in AI Studio; these are open models that are typically run locally for experimentation.
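
To make the scheduling options in item 1 concrete, here is a minimal Python sketch of how a recurring action with a frequency (daily, weekly, monthly), a time, and a pause switch could compute its next run. It is purely illustrative: the `ScheduledAction` class and its fields are hypothetical names, not Gemini's scheduler.schedule tool or its data model, and the day on which `next_run` is called is treated as the anchor day for weekly and monthly schedules.

```python
import calendar
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional


@dataclass
class ScheduledAction:
    """Hypothetical stand-in for one scheduled action (not Gemini's model)."""
    prompt: str
    frequency: str          # "daily", "weekly", or "monthly"
    run_at: time            # time of day the action should fire
    paused: bool = False

    def next_run(self, now: datetime) -> Optional[datetime]:
        """Return the first run strictly after `now`, or None while paused."""
        if self.paused:
            return None
        candidate = datetime.combine(now.date(), self.run_at)
        if candidate > now:
            return candidate  # today's slot has not passed yet
        if self.frequency == "daily":
            return candidate + timedelta(days=1)
        if self.frequency == "weekly":
            return candidate + timedelta(weeks=1)
        if self.frequency == "monthly":
            year = candidate.year + candidate.month // 12
            month = candidate.month % 12 + 1
            # Clamp the day so that e.g. Jan 31 rolls over to Feb 28/29.
            day = min(candidate.day, calendar.monthrange(year, month)[1])
            return candidate.replace(year=year, month=month, day=day)
        raise ValueError(f"unknown frequency: {self.frequency}")


summary = ScheduledAction(
    prompt="Send me a daily summary of my calendar, to-do's and unread emails",
    frequency="daily",
    run_at=time(8, 0),
)
print(summary.next_run(datetime(2025, 7, 1, 9, 30)))  # 2025-07-02 08:00:00
```

Pausing an action in this sketch simply sets `paused = True`, so `next_run` returns `None` until it is resumed; editing the frequency or time changes the corresponding field.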
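
The 1 million token context window mentioned for Gemini CLI can be put in perspective with a rough size check. The Python sketch below estimates whether a codebase would fit, assuming about 4 characters per token. That ratio is a common rule of thumb and an assumption here, not Gemini's actual tokenizer, and the function names and the `reserve` parameter are hypothetical.

```python
from pathlib import Path

CONTEXT_WINDOW_TOKENS = 1_000_000  # figure quoted in the video
CHARS_PER_TOKEN = 4                # rough heuristic (assumption); real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def estimate_codebase_tokens(root: str, suffixes=(".py", ".md")) -> int:
    """Sum the estimated tokens of all files under `root` with the given suffixes."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += estimate_tokens(path.read_text(encoding="utf-8", errors="ignore"))
    return total


def fits_in_context(token_count: int, reserve: int = 50_000) -> bool:
    """Check the estimate against the window, leaving `reserve` tokens for the reply."""
    return token_count + reserve <= CONTEXT_WINDOW_TOKENS
```

For example, `fits_in_context(estimate_codebase_tokens("."))` gives a quick yes/no for the current directory. Under this heuristic, 1 million tokens corresponds to roughly 4 MB of source text.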
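
The “summarizing Git changes” example for Gemini CLI could be scripted along the following lines. This is a hedged sketch, not the workflow shown in the video: it assumes the CLI is installed on the PATH as `gemini` and accepts a `-p` flag to run a single prompt non-interactively, as its documentation describes at the time of writing. The helper names and the 20,000-character truncation limit are hypothetical choices.

```python
import shutil
import subprocess
from typing import List, Optional

MAX_DIFF_CHARS = 20_000  # arbitrary cap (assumption) to keep the prompt argument small


def build_gemini_command(prompt: str) -> List[str]:
    """Build the argv for a one-shot prompt; assumes the `-p` flag is available."""
    return ["gemini", "-p", prompt]


def summarize_git_changes(repo_dir: str = ".") -> Optional[str]:
    """Ask Gemini CLI to summarize the last commit's diff, or return None if unavailable."""
    if shutil.which("gemini") is None or shutil.which("git") is None:
        return None  # required tools are not installed
    diff = subprocess.run(
        ["git", "diff", "HEAD~1"],
        cwd=repo_dir, capture_output=True, text=True, check=False,
    ).stdout
    if not diff:
        return None  # nothing to summarize (or not a Git repository)
    prompt = "Summarize these Git changes:\n" + diff[:MAX_DIFF_CHARS]
    result = subprocess.run(
        build_gemini_command(prompt),
        cwd=repo_dir, capture_output=True, text=True, check=False,
    )
    return result.stdout if result.returncode == 0 else None
```

Calling `summarize_git_changes()` inside a repository returns the model's summary as text, or `None` when either tool is missing, there is no diff, or the CLI exits with an error.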

The speaker emphasizes the importance of embracing AI to stay competitive in the evolving job market and encourages viewers to explore his AI Automation School for personalized feedback and help with AI workflows.