OpenAI is letting some users try a new ChatGPT feature that uses artificial intelligence to operate a web browser to book flights, buy groceries, search for deals, and do many other online tasks.
The new tool, called Operator, is an AI agent: it relies on an AI model trained on both text and images to interpret commands and work out how to use a web browser to execute them. OpenAI claims it can automate many everyday and work tasks.
OpenAI's Operator follows competing offerings from both Google and Anthropic, which have already announced and demonstrated agents able to use the web. AI agents are widely viewed as the next evolutionary stage for artificial intelligence after chatbots, and many companies have already jumped on the hype train by promoting them. In most cases, the capabilities of these programs are very limited, and they simply use a language model to automate things that would normally be done using regular programs.
“AI is evolving from this tool that can answer your questions to a tool that is also able to take action in the world, executing complex, multi-step workflows,” says Peter Welinder, VP of Product at OpenAI. “We’ll see a huge impact on people’s productivity, but also on the quality of work people are able to get done.”
OpenAI acknowledges that giving ChatGPT access to a web browser introduces new risks, and says Operator may sometimes misbehave. It says it has implemented several new safeguards and plans to gradually expand Operator’s capabilities.
The plan is to learn from how people use the tool, say Welinder and Yash Kumar, product and engineering lead for OpenAI’s Computer-Using Agent. They acknowledge that the tool could make unwanted reservations or purchases, but say a lot of work has gone into making sure it asks before doing anything risky. “It will come back to me and ask for confirmation before taking steps that may be irreversible,” says Kumar.
Today, OpenAI also released a new “system card” outlining issues that may occur with Operator. These include the possibility that it misunderstands commands or deviates from what the user requested, that it is misused by users, or that it is targeted by cybercriminals.
“It also poses an incredible amount of safety challenges,” Kumar says. “Because your attack vector area and danger vector area increase dramatically.”
Operator will initially be available as a “research preview” to ChatGPT users who have a Pro account, which costs $200 per month. The company says it plans to expand access but is rolling out the tool slowly because it will inevitably make some mistakes along the way.
In several demos, Operator showed the potential for AI to take a more active role as a web assistant. The tool consists of a remote web browser and a chat window for communicating with the user.
At WIRED’s request, Operator was asked to book an Amtrak train ride from New Haven to Washington, D.C. It went to the correct site, correctly entered the necessary information to show the schedule, and then asked for further instructions. If a user is logged in to Amtrak’s website, or to a browser profile that has stored credit card information, Operator would be able to go ahead and book a ticket, although it is designed to ask for permission first.
Kumar asked Operator to reserve a table at Beretta restaurant in San Francisco. The program navigated to the OpenTable website, found the right restaurant, and looked up availability before asking what to do next. OpenAI says it has partnered with a number of popular sites, including OpenTable, to ensure Operator runs smoothly on them.
The new tool is based on OpenAI’s GPT-4o AI model, which can interpret a browser and web page visually as well as read written text. The tool includes additional training designed to help it understand how to perform tasks online. OpenAI will also make its Computer-Using Agent available through its application programming interface (API).