When Google Lens was introduced in 2017, the search feature pulled off something that would have seemed like science fiction not long before: point your phone's camera at an object and Google Lens identifies it, provides context, and even lets you buy it. It was a new way to search, one that didn't require awkwardly typing out a description of whatever was right in front of me.
Lens also showed how Google plans to use machine learning and AI tools to make its search engine work in every conceivable way. Visual search in Lens is evolving alongside Google's broader push to use generative AI models to summarize information in response to text searches. And the company says Lens, which currently powers about 20 billion searches per month, will support even more ways to search, including video search and multimodal search.
Another tweak to Lens surfaces more shopping context in your results. Unsurprisingly, shopping is one of the key use cases for Lens; Amazon and Pinterest also offer visual search tools designed to drive more purchases. If you searched for your friend's sneakers in the old Google Lens, you might have seen a carousel of similar products. Google says the updated version of Lens will show more direct links to purchases, customer reviews, publisher reviews, and comparison shopping tools.
Lens search is now multimodal, a hot word in AI, meaning you can search with a combination of video, images, and voice input. Instead of pointing your smartphone camera at an object, tapping a focus point on the screen, and waiting for the Lens app to return a result, you can point the camera and ask questions at the same time: "Is that a cloud?" "What brand are those sneakers? Where can I buy them?"
Lens will also begin working on real-time video capture, taking the tool a step beyond identifying objects in still images. If you have a broken record player or a blinking light on a malfunctioning appliance at home, you can shoot a quick video with Lens and get repair tips in an AI-generated overview.
The feature, first announced at I/O, is considered experimental and is available only to users who have opted in to Google's Search Labs, says Rajan Patel, an 18-year Googler and a cofounder of Lens. The other Google Lens features, voice mode and the enhanced shopping experience, are rolling out more broadly.
This capability, which Google calls "video understanding," is interesting for a few reasons. It currently works only with video captured in real time, but if Google extends it to recorded videos, whether in your personal camera roll or in a massive database like Google's, entire repositories of video could become taggable and overwhelmingly shoppable.
The second consideration is that this Lens feature shares some characteristics with Google's Project Astra, which is expected to become available later this year. Like Lens, Astra uses multimodal input to interpret the world around you through your phone. As part of an Astra demo this spring, the company also showed off a prototype pair of smart glasses.
Separately, Meta just made headlines with its long-term vision for augmented reality, which involves people wearing glasses that can intelligently interpret the world around them and display holographic interfaces. Google, of course, already tried to make this future a reality with Google Glass (which uses fundamentally different technology than Meta's latest pitch). Will Lens's new features, combined with Astra, be a natural progression toward a new kind of smart glasses?