AI image generators have been captivating (and occasionally unsettling) us for several years, with notable advancements in tools like OpenAI’s DALL-E 3, Google’s Imagen, and Adobe Firefly. As this technology evolves, we have a growing number of ways to steer the results toward what we want. Recently, Google Labs unveiled Whisk, a new tool that lets users upload images as references instead of relying solely on text prompts.
Google Labs’ Whisk: Image Generation from Existing Images
For those living in the United States, Whisk is now available as part of Google Labs’ “experiment in generative AI,” according to Google’s official blog. With Whisk, users can guide image creation by supplying images as reference points. The platform prompts you for three essential elements: subject, scene, and style. Whisk then combines these to generate a new image.
Whisk utilizes Imagen 3, which is Google’s latest model for image generation.
While Whisk introduces a novel way to incorporate images, it doesn’t eliminate text prompts entirely. You still have the option to craft prompts for the three categories or include a general note as needed. After Whisk’s initial image generation, you can also make adjustments. For example, if your first attempt results in a vintage holiday card featuring a cat in the snow, you might be inspired to add snowflakes to the final design.
Every time you generate or modify an image in Whisk’s categories, the platform automatically creates a detailed written description. This feature makes it easy to adjust or edit an existing image simply by modifying the text provided.
If you’re struggling with creativity, the platform offers a fun feature: you can hit a dice icon to randomize your visual components. If you want more elaborate images, you can also upload multiple subject, scene, or style references.
Once you’re satisfied with your creation, you have the option to either save it on the platform or download it for personal use.
Is Whisk Worth It?
Amid the plethora of sophisticated AI image generation tools available to enhance photographs or create “original” artwork, Google’s new offering might initially seem like just another gimmick. However, the unique way Whisk integrates image references into its generation process makes it a valuable tool in both creative and professional contexts.
Imagine you’re tasked with preparing a pitch deck and need to gather images similar to ones you already have. Rather than struggling to articulate your vision with words, you can simply upload the reference file and provide a brief text description of how you want your new image to differ.
Google stresses that Whisk is designed for exploration rather than precision editing. While other software may be better suited for fine-tuning, Whisk excels in the brainstorming phase:
“We built it for rapid visual exploration, not pixel-perfect edits. It’s about exploring ideas in new and creative ways, allowing you to work through dozens of options and download the ones you love.”
Let’s be real: finding the right words can be a daunting task. I often find myself fumbling for suitable descriptors. This is where Whisk truly shines, making it an incredible tool for those moments when it’s easier to say, “I want an image that looks like this.”