'Self-Operating' AI Teaches Itself to Control Computers

Original article by: VentureBeat
Published on: November 6, 2024

Powered by GPT-4V, a new open-source AI framework takes screenshots as input and outputs mouse clicks and keyboard commands, just as a human would. Its developers see it as a major step toward sophisticated AI agents that replace conventional computer interfaces.

Late nights with his newborn daughter led OthersideAI developer Josh Bickett to a breakthrough idea for an AI system that can operate a computer on its own. As Bickett told VentureBeat, “I’ve been enjoying time with my four-week-old daughter, but I also had a little time and this idea kind of came to me because I saw different demos of GPT-4 vision. The thing we’re working on now can actually happen with GPT-4 vision.”

With his daughter in one arm, Bickett sketched out the basic framework. OthersideAI CEO Matt Shumer recognized its huge potential. “This is a milestone toward getting the equivalent of a self-driving car but for a computer,” Shumer said. “We have the sensors now. We have the LIDAR systems. Next we build the intelligence.”

The framework takes screenshots as input and outputs mouse clicks and keyboard commands, just as a human operator would. With more advanced AI models plugged in, it could let users handle all of their computer interactions through conversational commands.
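The screenshot-in, action-out loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `Action`, `parse_action`, and `operate`, and the JSON reply format, are all assumptions made for this example. Screen capture, the vision model call, and input control are passed in as plain functions, so the sketch runs without any GUI or API access.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step chosen by the model (hypothetical format, not the project's)."""
    kind: str                   # "click", "type", or "done"
    x: Optional[int] = None     # pixel coordinates, used by "click"
    y: Optional[int] = None
    text: Optional[str] = None  # characters to enter, used by "type"


def parse_action(reply: str) -> Action:
    """Turn the model's JSON reply into an Action, rejecting unknown kinds."""
    data = json.loads(reply)
    kind = data.get("kind")
    if kind == "click":
        return Action("click", x=int(data["x"]), y=int(data["y"]))
    if kind == "type":
        return Action("type", text=str(data["text"]))
    if kind == "done":
        return Action("done")
    raise ValueError(f"unsupported action: {kind!r}")


def operate(
    objective: str,
    capture: Callable[[], bytes],            # returns a screenshot as image bytes
    ask_model: Callable[[str, bytes], str],  # sends objective + screenshot, returns JSON
    execute: Callable[[Action], None],       # performs the mouse or keyboard action
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: look at the screen, ask the model for one action, perform it."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()
        action = parse_action(ask_model(objective, screenshot))
        if action.kind == "done":
            break  # the model reports that the objective is complete
        execute(action)
        history.append(action)
    return history
```

In a real setup, `capture` would presumably wrap a screenshot library, `ask_model` a call to a vision-capable model such as GPT-4V, and `execute` a mouse and keyboard automation library. The `max_steps` cap is a design choice of this sketch, meant to keep an unreliable model from acting indefinitely.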

As Shumer said, “Once this thing is sufficiently reliable, it is going to be your computer. It is going to be your interface to the digital world.” Different specialized models may emerge for speed, complex tasks, and enterprise or consumer use. The goal is models that can take over tedious tasks so “somebody who can barely use a computer from the beginning can do it.”

Bickett believes the open-source framework will fuel worldwide experimentation, though realizing the full vision will require immense resources. AI company Imbue, for example, secured $150 million to build a platform for developing models that can reason, an ability Imbue CEO Kanjun Qiu called “the core blocker to agents that work really well.”

The self-operating framework points toward an era in which sophisticated AI agents replace conventional computer interfaces with ordinary language. Late nights may spark ideas, but it will take focused work to realize the vision of computers that “just work” for anyone, anywhere.
