(Reuters) – Generative artificial intelligence heavyweight OpenAI on Thursday previewed an AI agent that can carry out tasks on the web for users, as it seeks to enhance its chatbot amid intensifying competition.
The tool, called Operator, is powered by a model that allows it to interact with on-screen buttons, menus and text fields.
“This capability marks the next step in AI development, allowing models to use the same tools humans rely on daily and opening the door to a vast range of new applications,” the company said in a blog post.
Operator can perform a variety of tasks, like creating to-do lists or assisting with vacation planning. It also takes user input once it decides a task is complete, and it seeks confirmation before some actions, such as entering login details on a website.
The tool is currently available to Pro users in the U.S. as a research preview, the Microsoft-backed startup said.
Agents, which are systems that can execute actions such as making purchases and scheduling meetings without direct human intervention, are now at the forefront of companies’ AI agenda.
OpenAI competitor Perplexity launched an agent-based assistant for Android devices earlier on Thursday. This assistant can book dinner reservations, hail rides on apps and set reminders, among other tasks.
Last year, Apple incorporated Apple Intelligence into its voice assistant, Siri. In a partnership with OpenAI, the iPhone maker also introduced the use of ChatGPT, with user permission.
While such agents had long eluded researchers, the emergence of step-by-step reasoning approaches like those used in OpenAI’s o1 model could make such tasks possible, business executives told Reuters in December.
(Reporting by Arsheeya Bajwa in Bengaluru; Editing by Alan Barona)