Google unveils Whisk, a creative image tool powered by Gemini

December 17, 2024 11:49 PM PST | By Invezz

The tech industry’s generative AI race just got more competitive as Google launched Whisk, a tool designed to create unique images from user-uploaded photos.

Unveiled through Google Labs, Whisk allows users in the US to remix subjects, styles, and settings into new visuals using images, rather than text, as prompts.

It builds on Google DeepMind’s AI advancements, showcasing Gemini and Imagen 3 technologies.

The move highlights Google’s focus on delivering accessible AI tools while competing against OpenAI’s suite of consumer products, including the text-to-video generator Sora.

What is Whisk and how does it work?

Whisk offers a new take on AI-powered creativity.

Users can upload images representing subjects, settings, or styles.

The platform processes these inputs using Gemini, Google’s AI foundation model launched in December 2023, which generates captions for the content.

These captions feed into DeepMind’s Imagen 3, a text-to-image generator.
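The caption-then-generate flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: `WhiskInputs`, `caption_image`, `build_prompt`, and `generate_image` are hypothetical names, and the two model calls are replaced by offline stubs so the example runs without any API access.

```python
from dataclasses import dataclass


@dataclass
class WhiskInputs:
    """Paths to the three user-supplied images (hypothetical structure)."""
    subject: str
    setting: str
    style: str


def caption_image(path: str) -> str:
    """Hypothetical stand-in for the Gemini captioning step.

    A real system would send the image to a multimodal model. Here the
    file name is turned into a fake caption so the sketch runs offline.
    """
    name = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return name.replace("_", " ")


def build_prompt(inputs: WhiskInputs) -> str:
    """Merge the three captions into a single text prompt."""
    subject = caption_image(inputs.subject)
    setting = caption_image(inputs.setting)
    style = caption_image(inputs.style)
    return f"{subject}, set in {setting}, in the style of {style}"


def generate_image(prompt: str) -> dict:
    """Hypothetical stand-in for a text-to-image model such as Imagen 3."""
    return {"prompt": prompt, "image": None}  # no real image is produced


inputs = WhiskInputs(
    subject="photos/red_fox.png",
    setting="photos/snowy_forest.jpg",
    style="photos/enamel_pin.jpg",
)
result = generate_image(build_prompt(inputs))
print(result["prompt"])
```

Because the image model only ever sees the text captions, never the original pixels, a pipeline of this shape would be expected to preserve the gist of each input rather than its exact details.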

Unlike traditional photo editors, Whisk focuses on creative exploration rather than pixel-perfect results.

It allows users to remix categories—such as turning an image into a plushie toy, enamel pin, or sticker—by adjusting inputs or incorporating text to guide specific details.

Google emphasises that the outputs capture the “essence” of a subject, meaning some variations, such as changes to hairstyle or skin tone, may occur.

DeepMind’s Nobel Prize-winning expertise underpins Whisk

Whisk leverages cutting-edge developments from DeepMind, the AI division Google acquired in 2014.

Two DeepMind researchers, Demis Hassabis and John Jumper, shared the 2024 Nobel Prize in Chemistry for their AI-based work on protein structure prediction.

This underscores the lab’s reputation for pushing technological boundaries, which now extends to creative applications like Whisk.

Whisk also positions Google as a leader in consumer-friendly AI.

While Gemini’s image-generation feature faced criticism earlier in 2024 for producing historically inaccurate images, Whisk aims to avoid similar pitfalls by focusing on abstract, exploratory outputs rather than exact replicas.

AI innovation spurs rivalry among tech giants

Google’s unveiling of Whisk highlights its broader strategy to dominate AI-driven consumer products.

The competition is fierce, with OpenAI recently introducing Sora, a text-to-video generator.

Google aims to solidify its advantage by integrating Whisk with Gemini’s capabilities and Imagen 3, signalling a shift toward dynamic, multi-modal AI tools.

Dan Ives, an equity analyst at Wedbush Securities, views Whisk as part of Google’s “treasure chest” of 2025 offerings, alongside its collaboration with Samsung and Qualcomm on a new Android operating system.

These initiatives demonstrate Google’s effort to maintain an edge in the highly lucrative and competitive AI landscape.

Generative AI tools like Whisk have captured public imagination but also faced scrutiny.

For instance, Gemini’s earlier issues with historically inaccurate image outputs raised concerns about AI reliability.

Whisk seeks to navigate these challenges by focusing on imaginative, user-directed creations.

As Google continues to refine its offerings, the tool’s initial rollout as a website for US users will provide a critical testbed for future updates and iterations.

Google’s AI ambitions

Whisk’s debut signals a broader evolution in how AI is used for consumer creativity.

By focusing on user-friendly interfaces and integrating advanced technologies like Gemini, Google aims to democratise access to generative AI.

However, the competition remains intense, with rival platforms pushing the boundaries of what AI can achieve.


