The future of AI isn’t in the cloud; it’s on your device


When I open Anthropic’s Claude app on my phone and ask it something — say, “Tell me a story about a mischievous cat” — a lot happens before the result (The Great Tuna Heist) appears on my screen.

My request is sent to the cloud — a computer in a big data center somewhere — to be run through Sonnet 4.5, Claude’s large language model. The model assembles a plausible response using advanced predictive text, drawing on the massive amount of data it was trained on. That response is then routed back to my iPhone, where it appears word by word, line by line, on my screen. It has traveled hundreds, if not thousands, of miles and passed through multiple computers on its journey to and from my little phone. And it all happens in seconds.


This system works well if what you are doing is low risk and speed is not really an issue. I can wait a few seconds to read my little story about Whiskers and his misadventure in the kitchen cupboard. But not every AI task is like this. Some of them require tremendous speed. If an AI device is going to alert someone to an object in its path, it can’t wait a second or two.

Other requests require more privacy. I don’t care if the cat’s story passes through dozens of computers owned by people and companies I don’t know and probably don’t trust. But what about my health information, or my financial data? I might want to keep a tighter lid on that.




Speed and privacy are two main reasons why technology developers are increasingly shifting AI processing away from massive corporate data centers and onto personal devices like phones, laptops and smartwatches. There are also cost savings: there’s no big data center operator to pay. And on-device models can work offline.

But making this shift possible requires better hardware and more efficient — and often more specialized — AI models. The convergence of these two factors will ultimately shape how fast and smooth your experience is on devices like your phone.


Mahadev Satyanarayanan, known as Satya, is a professor of computer science at Carnegie Mellon University. He has long researched what’s known as edge computing — the concept of handling data processing and storage as close to the actual user as possible. The ideal model for edge computing, he says, is the human brain, which doesn’t offload tasks like vision, recognition, speech or reasoning to any kind of “cloud.” It all happens right there, “on the device.”

“Here’s the problem: it took nature a billion years for us to evolve,” he told me. “We don’t have a billion years to wait. We’re trying to do it in five or ten years at most. How can we speed up development?”

You can speed things up with better, faster, smaller AI running on better, faster, smaller hardware. As we’re already seeing with the latest apps and devices, including those we saw at CES 2026, that shift is well underway.

AI is probably running on your phone right now

On-device AI is nothing new. Remember when, in 2017, you could unlock your iPhone for the first time by holding it in front of your face? That facial recognition technology used an on-device neural engine — not generative AI like Claude or ChatGPT, but a more basic, task-specific AI.

Today’s iPhones use a more powerful and versatile on-device AI model. It has about 3 billion parameters — the individual weights the language model uses to compute the probability of what comes next. That’s relatively small compared with the large, general-purpose models most AI chatbots run on. DeepSeek-R1, for example, has 671 billion parameters. But the on-device model isn’t meant to do everything. Instead, it’s designed for specific tasks such as summarizing messages. Like the facial recognition technology that unlocks your phone, these are things that can’t rely on an internet connection to reach a model in the cloud.
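To see why parameter counts matter so much for what fits on a phone, some back-of-the-envelope math helps. The sketch below is illustrative only: the bytes-per-parameter figures are common assumptions (4-bit quantization on-device, 16-bit precision in the cloud), not measured numbers for any specific model.

```python
def model_size_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight-storage size of a model, in gigabytes."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# A ~3B-parameter model quantized to 4 bits (0.5 bytes) per weight
on_device = model_size_gb(3, 0.5)     # → 1.5 GB: fits in a phone's memory

# A 671B-parameter model at 16-bit (2 bytes) per weight
cloud = model_size_gb(671, 2.0)       # → 1342 GB: needs data center hardware

print(f"~{on_device:.1f} GB on-device vs ~{cloud:.0f} GB in the cloud")
```

The roughly thousandfold gap in memory footprint, before even counting the computation per token, is why small models live on phones and frontier models live in data centers.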

Apple has expanded its on-device AI capabilities — dubbed Apple Intelligence — to include visual recognition features, such as letting you search for things you’ve taken a screenshot of.

On-device AI models are everywhere. Google’s Pixel phones run the company’s Gemini Nano model on its Tensor G5 chip. The model powers features like Magic Cue, which surfaces information from emails, messages and more — right when you need it — without you having to dig for it manually.

Makers of phones, laptops, tablets and the chips inside them are building devices with AI in mind. But it goes beyond that: think about smartwatches and glasses, which offer far less space than even the thinnest phones.

“The system challenges are completely different,” said Vinesh Sukumar, head of artificial intelligence and machine learning at Qualcomm. “Can I do all of this on all devices?”

Right now, the answer is usually no. The workaround is fairly straightforward: when a request exceeds the on-device model’s capabilities, the task is offloaded to a cloud-based model. But depending on how that handoff is managed, it can undermine one of the main benefits of on-device AI: keeping all your data in your hands.
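The on-device-first, cloud-fallback pattern described above can be sketched in a few lines. This is a minimal illustration, not any vendor’s actual API: the function names, the token threshold and the opt-in flag are all hypothetical.

```python
LOCAL_LIMIT_TOKENS = 2048  # assumed capacity of the small on-device model

def run_local(prompt: str) -> str:
    """Stand-in for the small model that runs entirely on the device."""
    return f"[on-device model] response to: {prompt[:30]}"

def run_cloud(prompt: str) -> str:
    """Stand-in for the large model running in a data center."""
    return f"[cloud model] response to: {prompt[:30]}"

def answer(prompt: str, user_allows_cloud: bool) -> str:
    # Small requests never leave the device.
    if len(prompt.split()) <= LOCAL_LIMIT_TOKENS:
        return run_local(prompt)
    # Larger requests are handed off only with explicit user consent,
    # which is the safeguard Sukumar describes below.
    if user_allows_cloud:
        return run_cloud(prompt)
    return "Request too large for on-device AI; cloud use was declined."
```

The privacy question is decided entirely at that second branch: who controls `user_allows_cloud`, and what data crosses the boundary when it’s true.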

On-device AI is more private and secure

Experts repeatedly point to privacy and security as key benefits of on-device AI. In the cloud, data flies in every direction, exposing more points of vulnerability. If it stays on an encrypted phone or laptop, it’s much easier to secure.

The data used by AI models on your devices can include things like your preferences, browsing history, or location information. While all of this is necessary for AI to personalize your experience based on your preferences, it’s also the kind of information you might not want falling into the wrong hands.

“What we are pushing for is to ensure that the user has access and is the sole owner of that data,” Sukumar said.

Apple Intelligence has given Siri a new look on the iPhone. (Numi Prasarn/CNET)

There are several different ways that data offloading can be handled to protect your privacy. One key factor is that you have to give permission for this to happen. Sukumar said Qualcomm’s goal is to ensure people are informed and have the ability to say no when a model reaches the point of offloading to the cloud.

Another approach — which can work alongside asking the user’s permission — is to ensure that any data sent to the cloud is handled securely and only temporarily. Apple, for example, uses technology it calls Private Cloud Compute: offloaded data is processed only on Apple’s own servers, only the minimum data required for the task is sent, and none of it is stored or made accessible to Apple.

Artificial intelligence without the cost of artificial intelligence

AI models that run on-device have an advantage for both app developers and users: the ongoing cost of running them is essentially nothing. There’s no cloud services company to pay for energy and computing power. It’s all on your phone. Your pocket is the data center.

That’s what led Charlie Chapman, developer of the noise machine app Dark Noise, to use the Apple Foundation Models framework for a tool that lets you generate a mix of sounds. The on-device AI model doesn’t create new sounds; it selects from existing sounds and volume levels to build a single mix.

Since the AI runs on-device, there’s no ongoing cost while you make your mixes. For a small developer like Chapman, that means less risk tied to the size of his app’s user base. “If some influential person randomly posts about this and I get an incredible number of free users, it doesn’t mean I’m suddenly going bankrupt,” Chapman said.


The lack of ongoing costs for on-device AI lets small, repetitive tasks like data entry be automated without huge bills or computing contracts, Chapman said. The downside is that on-device models vary from device to device, so developers have to do more work to make sure their apps run on different hardware.

The more AI tasks are handled on consumer devices, the less AI companies will spend on building the massive data centers that have every major tech company scrambling for cash and computer chips. “The infrastructure cost is very huge,” Sukumar said. “If you really want to increase volume, you don’t want to have that cost burden.”

For content creators using AI to edit video or photos, running these models on your own hardware has the benefit of avoiding expensive cloud-based subscription or usage fees. At CES, we saw how dedicated computers and specialized hardware, like Nvidia’s DGX Spark, can run intensive video generation models such as LTX-2.

The future is all about speed

Especially when it comes to devices like glasses, watches and phones, a lot of the real benefit of AI and machine learning looks nothing like the chatbot that created the cat story at the beginning of this article. It’s things like object recognition, navigation and translation. That requires more specialized models and hardware — but it also requires greater speed.

Satya, the Carnegie Mellon professor, has been researching the different uses of AI models and whether they can work accurately and quickly enough using on-device models. When it comes to classifying images of objects, current technology works very well – it is able to provide accurate results within 100 milliseconds. “Five years ago, we couldn’t get that kind of accuracy and speed,” he said.

A screenshot cropped from video captured with the Oakley Meta Vanguard AI glasses shows exercise metrics taken from a paired Garmin watch. (Vanessa Hand Orellana/CNET)

But for four other tasks — object detection, real-time segmentation (recognizing objects and their shapes), activity recognition and object tracking — the work still needs to be offloaded to a more powerful computer elsewhere.

“I think in the next five years or so, it’s going to be very exciting as device vendors continue to try to make mobile devices more AI tuned,” Satya said. “At the same time, we also have the AI algorithms themselves becoming more powerful, more accurate and more compute-intensive.”

The opportunities are huge. In the future, Satya said, devices may be able to use computer vision to alert you to uneven pavement before you trip, or to remind you who you’re talking to and provide context about your past interactions with them. These kinds of things will require more specialized AI and more specialized hardware.

“These will appear,” Satya said. “We can see them on the horizon, but they’re not there yet.”


