Google has introduced Whisk, an AI-powered image generation tool that lets users create images by combining three visual elements: a subject, a scene, and a style.
Unlike tools that rely on text prompts alone, Whisk lets users supply or generate images to guide the creative process, which supports rapid visual exploration and remixing of ideas.
Whisk analyzes the images a user provides to understand their content and context. It then uses Google’s latest Imagen 3 model, together with the Gemini model’s visual understanding capabilities, to generate new images that blend the specified elements.
This approach gives users a more intuitive, interactive way to visualize concepts without writing detailed text prompts.
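The flow described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Google's implementation: `describe_image` and `generate_image` are hypothetical stand-ins for the Gemini understanding step and the Imagen 3 generation step, and the prompt template is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WhiskInputs:
    """References to the three input images (hypothetical structure)."""
    subject: str
    scene: str
    style: str


def build_prompt(inputs: WhiskInputs, describe_image: Callable[[str], str]) -> str:
    """Turn each input image into text, then merge the texts into one prompt."""
    subject = describe_image(inputs.subject)
    scene = describe_image(inputs.scene)
    style = describe_image(inputs.style)
    # Assumed template: the real system's prompt format is not public here.
    return f"{subject}, set in {scene}, rendered in the style of {style}"


def whisk(
    inputs: WhiskInputs,
    describe_image: Callable[[str], str],
    generate_image: Callable[[str], str],
) -> str:
    """Image understanding step followed by image generation step."""
    return generate_image(build_prompt(inputs, describe_image))


# Stub stand-ins so the sketch runs without any model access.
def fake_describe(image_ref: str) -> str:
    return f"<description of {image_ref}>"


def fake_generate(prompt: str) -> str:
    return f"[image generated from: {prompt}]"


if __name__ == "__main__":
    example = WhiskInputs("robot.png", "beach.png", "watercolor.png")
    print(whisk(example, fake_describe, fake_generate))
```

Passing the two model steps in as functions keeps the sketch independent of any particular SDK. In a real setup they would wrap calls to whichever image understanding and image generation services are available.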
Users can upload their own images or use AI-generated suggestions provided by Whisk. The tool accepts multiple images for each category (subject, scene, and style), enabling a wide range of creative combinations.
Users can also refine generated images by adding text details or directly editing the underlying prompts, so that the output better matches their vision.
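To get a feel for how quickly the options multiply, the short sketch below enumerates every subject, scene, and style pairing and applies a text refinement to a prompt. How Whisk itself handles multiple images in a category is not described here. The one-image-per-category pairing, the helper names `all_combinations` and `refine`, and the comma-joined refinement are all assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple


def all_combinations(
    subjects: Sequence[str],
    scenes: Sequence[str],
    styles: Sequence[str],
) -> List[Tuple[str, str, str]]:
    """Every (subject, scene, style) triple, one pick per category."""
    return list(product(subjects, scenes, styles))


def refine(prompt: str, detail: str) -> str:
    """Append an extra text detail to a prompt; blank details are ignored."""
    detail = detail.strip()
    return f"{prompt}, {detail}" if detail else prompt


if __name__ == "__main__":
    combos = all_combinations(
        ["robot", "cat"], ["beach", "city street"], ["watercolor"]
    )
    for subject, scene, style in combos:
        base = f"{subject} in a {scene}, {style} style"
        print(refine(base, "wearing a red scarf"))
```

With two subjects, two scenes, and one style, this yields four candidate prompts, which illustrates why the tool suits quick exploration more than precise editing.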
Google emphasizes that Whisk is designed for rapid visual ideation rather than precise editing. While the tool may not always produce perfect results, it offers editing capabilities to address inaccuracies and encourages users to experiment and iterate on their ideas.
Currently, Whisk is available as an experiment through Google Labs in the United States. Users interested in exploring this tool can access it at labs.google/whisk. Google plans to expand availability to more countries in the future.
In addition to Whisk, Google has announced Veo 2, an advanced video generation model with an improved understanding of cinematography. Veo 2 is initially available on VideoFX and is expected to be integrated into YouTube Shorts and other products in the coming year.
These releases reflect Google’s continued investment in AI-driven creative tools that give users new ways to visualize and express their ideas.