Google’s Genie world model can now simulate real streets using Street View


We’ve all brought up the Street View feature on Google Maps to show a friend what our childhood home looked like, or dropped that little person icon on the streets of Paris to see if we’ve booked a hotel in a cool neighborhood. Imagine being able to do that, but in a more immersive and interactive way that allows you to simulate the street and its environs, and even do things like adjust the weather or see what it would look like in a “day after tomorrow” scenario.

This is one of the goals of Google’s latest integration. Starting today, Google DeepMind is connecting Street View to Project Genie, its general-purpose world model that can generate diverse interactive environments. The new feature was launched during the Google I/O developer conference.

“It’s really powerful for both the agent (and bot) use case and for humans to play with, and that’s always been Genie’s thesis,” Jack Parker Holder, a research scientist on DeepMind’s open-endedness team, told TechCrunch.

He gave the example of a new robot being deployed in London, which rarely sees the sun. Parker Holder says Genie can simulate those rare occasions when the sun glares off a Victorian residence, so the robot is not thrown off when that happens.

“At the same time, you might say, ‘I’m going to go to New York City, but not at this time of year. It’s going to snow. I want to see what this block looks like in the snow,’” he continued.

Google has been collecting Street View data for 20 years via camera-equipped cars and individuals wearing “Trekker” backpacks. The tech giant has collected north of 280 billion images across 110 countries and seven continents.

“With Street View, we have images from a huge amount of the world,” Parker Holder said. “You can imagine how powerful it is to combine this rich source of information and real-world data with the ability to simulate worlds.”

Google released its latest world model, Genie 3, as a research preview last August and opened access to the tool to Google AI Ultra subscribers in the U.S. in January, allowing customers to create interactive game worlds from text prompts or images. The goal is to use Genie in educational experiments, games, and robotics training.

Genie 3 already helps power one of Waymo’s simulators, which trains its self-driving cars for “extremely rare events” such as hurricanes or chance encounters with elephants. Adding Street View data could help Waymo prepare to launch in more cities around the world.

Waymo has its own simulator, which it is relying on to expand its reach to 11 U.S. cities and test its AI driver in several more. The difference from Genie is that Waymo’s simulation is all from the car’s point of view, says Parker Holder. Street View in Genie makes it possible not only to simulate a world tied to a real place, but also to shift the point of view to other types of agents, such as a human or a robot.

Google is rolling out Street View in Genie to some Ultra users in the U.S. starting today, with wider access to follow over time. Ultra users worldwide will get access over the next few weeks, according to the company.

The researchers’ goal is to put this new capability into as many hands as possible, according to Diego Rivas, product manager at DeepMind. He cautioned that Street View in particular and Genie in general are still an experiment, so there is a lot to improve in terms of accuracy.

In the samples the Google team showed me, including an underwater simulation of a neighborhood I used to live in, the results were impressive and recognizable, but still video-game quality rather than photorealistic. The models are also not yet physics-aware, which means they do not understand cause and effect. For example, in a simulation of a woman running through a snow-covered Joshua Tree, she ran straight through cacti and shrubs.

Compare that, for example, with Google’s Nano Banana image generator — which can now generate perfect text in infographics — or its video generator Veo — which understands that paper boats drift on water currents, smoke wafts through the air, and fabric drapes over shapes.

Physics is not encoded in these models; they learn it intuitively over time through passive observation, as a living organism would.

“I think for this type of model, video could take anywhere from six to 12 months in terms of resolution and quality, so I think it’s something we’ll figure out,” Parker Holder said.

Genie can’t yet create a true street reconstruction, said Jonathan Herbert, a Google Maps manager who started on the Street View team as an intern 12 years ago. He believes the real breakthrough is the AI’s spatial continuity. If you rotate 360 degrees, the AI correctly remembers and simulates the environment behind you. From there, the model can build a new environment on top of it.

“We’ve been thinking for a long time about how we build the best, richest model of the world on top of Street View data,” Herbert said. “It has certainly been our idea to use map data in new ways and for new types of AI research for a very long time.”

