Gurman: Apple Working on On-Device LLM for Generative AI Features

ipedro · Apr 21, 2024

xalea said:
Without it being able to access the internet for information, I'm not sure how useful it would be on an iOS device. I mean, better than nothing I suppose, but don't most people want up-to-date information when using their mobile devices?

An offline LLM means that Siri itself will be processed on device as it has been doing in a limited way for a couple of iOS generations now, but it can still make use of the web to fetch requests. It doesn't need to send your voice to the cloud to process, saving round trip time and preserving privacy.

A recent machine learning model Apple published with the capability to recognize apps' UI and how to use them, gives us a good indication of how this will work. Siri (offline) will be able to act on your behalf to use apps (online).

User:
Siri, I'd like to have dinner with my girlfriend tonight at that place I walked by last week with the red umbrella. I took a photo of it.

Siri:

Code:

Looks at your Photos (offline), finds the red umbrella, geotagged, 
locates the restaurant in Apple Maps (online), 
brings up OpenTable (online) cues up a reservation after looking at your Calendar (offline).

That was Sandro's on College St. I've found you a reservation for 2 at 8pm. You get off work at 5, that should give you enough time to get home, get ready and head over to Sandro's. Ana's schedule also shows her free. Would you like me to book it?

User:
Yes... no, wait, can we do 8:30 instead? I'd like to get a bottle of wine, does Sandro's have a corking fee?

Siri:

Code:

Looks up OpenTable (online) to see if there's an 8:30pm reservation. 
Looks up Sandro's website (online) and searches for a corking fee. 
Looks up Apple Maps for nearby wine stores, 
finds one that's on the Ritual app (online) so they can can bag your wine for pickup, 
finds that you've ordered 2 different wines via Ritual.

Sandro's has a $6 corking fee. I found you an 8:30 reservation. Would you like Mateus Rose or Wolf Blass - Yellow Label Sauvignon? I can reserve it at the Wine Cellar, a short walk from Sandro's.

User:
Let's do Mateus. Go ahead and book the reservation please.

Siri:

Code:

Goes to OpenTable, places the reservation on your behalf. 
Goes to Ritual, orders a bottle of Mateus Rose for pickup at 8pm. 
Adds an event in your calendar with directions to the Wine Cellar 
and another at 8:30pm with directions from there to Sandro's. 
Creates a calendar invite for your girlfriend.

All set! Your Mateus Rose will be ready for pickup at 8pm, Sandro's at 8:30 on College St. and I've sent an invite to Ana.

Local Siri processing without having to access the internet will enable free flowing conversations without a delay. Having the ability to recognize how to use apps on your phone will be the online component. You already use those apps online. I suspect Google will be one of them, to allow Siri to get current information from the internet, using Gemini and returning those answers the same way it can return search results today, but with the capability to make use of them to get you an answer and read them back to you.

Apple's advantage beyond building silicon custom made for its native Siri, is that it has the largest App Store with virtually unlimited potential (there's an app for everything). Give Siri the capability to understand how to use apps (like the model Apple just published) and you can imagine how far this can go.

coolfactor · Apr 21, 2024

Carrotstick said:
If Apple truly cared about privacy, their cloud AI would also be in-house.

This just shows that Apple is behind in cloud based AI

Why does it matter if Apple was first? Once Apple's genAI solutions rolls out, the world keep spinning and we all forget that Apple wasn't the first.

It doesn't matter who was first, only who is best.

coolfactor · Apr 21, 2024

xalea said:
Without it being able to access the internet for information, I'm not sure how useful it would be on an iOS device. I mean, better than nothing I suppose, but don't most people want up-to-date information when using their mobile devices?

On-device processing of requests doesn't mean that it can't access the internet for answers. Is that how you interpreted that?

Server-based processing means that your voice-encoded request is sent over the internet, processed on the server, and the response sent back. That's how older versions of Siri worked, but many requests are now processed on-device.

PeLaNo · Apr 21, 2024

purplerainpurplerain said:
Whoever wrote that must be so ignorant if they don’t understand that an LLM like Gemini running on device would make the device unusable, hot and dead in 30 minutes. You can only have a custom cut down LLM made for smart phones if you want fully on device.

Didn't the Apple AI team publish the paper about having LLM so compact that it can run on a limited resources device?

2312.11514.pdf (arxiv.org)

Tagbert · Apr 21, 2024

TVreporter said:
Someone tell me why I should want or need generative AI?

#OldManYellsatCloud

Go to Bing and ask Copilot?

szw-mapple fan · Apr 21, 2024

xalea said:
Without it being able to access the internet for information, I'm not sure how useful it would be on an iOS device. I mean, better than nothing I suppose, but don't most people want up-to-date information when using their mobile devices?

I don’t think on device means the model is cut off from the internet. Rather than having the LLM running on a cloud server for everything like chatGPT Apple is trying to have more of the processing done on device with the more computationally intensive applications (like image generation for example) done on the server. It’s actually how Siri and a lot of other iOS AI features works right now. Things like searching the web, media controls, face recognition in Photos etc. are processed purely on device.

Beautyspin · Apr 21, 2024

a m u n said:
This is an example of misinformation.

Apple operates its own data centers and integrates Google Cloud and Amazon AWS for various purposes. These external cloud services may be utilized to enhance Apple’s infrastructure during peak usage periods or for specific functionalities.

Did things change?

https://www.cnbc.com/2018/02/26/apple-confirms-it-uses-google-cloud-for-icloud.html

Jumpthesnark · Apr 21, 2024

ipedro said:
I suspect Google will be one of them, to allow Siri to get current information from the internet, using Gemini and returning those answers the same way it can return search results today, but with the capability to make use of them to get you an answer and read them back to you.

All that's needed is for Apple to make the asks anonymous. Because in the scenario you describe it all sounds great except for the inevitable Google-sends-me-ads-for-that-wine, the way it is now.

If it can work the way you say (offloading to the internet only the most necessary search requests, all the rest of the processing is done on device), that would be great.

My second concern, as some others have mentioned, is how much of a database are we seriously supposed to have resident on our phone? Because the "large" part of an LLM means we'd need a lot more storage if we're carrying that around with us.

Beautyspin · Apr 21, 2024

purplerainpurplerain said:
Whoever wrote that must be so ignorant if they don’t understand that an LLM like Gemini running on device would make the device unusable, hot and dead in 30 minutes. You can only have a custom cut down LLM made for smart phones if you want fully on device.

Gemini nano runs on device. Pixel 8 and Pro run as well as Samsung S24 Gemini Nano and most of the AI work from the phone. These devices also use both on device and off device AI capabilities.

Surprise! Google will let the Pixel 8 run on-device AI after all

Guess it figured out those “hardware limitations.”

www.theverge.com

Beautyspin · Apr 21, 2024

HobeSoundDarryl said:
So higher RAM for this AI to use or rolling out some kind of AI RAM in iCloud with a forever rent component? If "onboard" means on board the computer or iDevice, doesn't RAM have to be big enough for it?

Pixel 8 runs Gemini Nano, Google's on device AI LLM. It has 8 GB RAM. Since Apple has been touting that its phones have the fastest chips, I am guessing that older devices should also get these on device AI capabilities. If Pixel 8 can do it, I am sure Apple would like to say that iPhone 11 (or whatever old model it sees fit) could run it showing how powerful their chips are.

Beautyspin · Apr 21, 2024

ocnitsa said:
If they weren’t behind, they’d be behind like the others who’ve poisoned their LLMs through training via Reddit and Twitter. They are learning from others who were too quick to market, I hope, and building something from the ground up predicated on Apple values. Temporarily using others is a necessary evil.

Apple values? You mean gouging the customers? How does that translate to AI?

HobeSoundDarryl · Apr 21, 2024

Beautyspin said:
Pixel 8 runs Gemini Nano, Google's on device AI LLM. It has 8 GB RAM. Since Apple has been touting that its phones have the fastest chips, I am guessing that older devices should also get these on device AI capabilities. If Pixel 8 can do it, I am sure Apple would like to say that iPhone 11 (or whatever old model it sees fit) could run it showing how powerful their chips are.

...OR... Apple being Apple, tell everyone that they need to buy latest iPhones to use all of these amazing/magical/"only Apple can..." new features. One option potentially makes Apple Inc no new money. The other makes them lots of new money. Which will Apple choose???

Beautyspin · Apr 21, 2024

CarAnalogy said:
They released a paper on how to shrink subsets to be used on-device. If it's pre-trained in the cloud on the data types present on the iPhone, especially easy in this case since the iPhone silos information according to Photos, Reminders, Calendars, etc. it should be possible to use that on-device for basic functionality.

They could make a smarter pre-trained Siri that could actually do useful things.

But I'm not getting my hopes up.

Every company that is dabbling in AI releases paper's left and right. They are just scholarly articles and that does not mean there are working models for them. They are not doing anything earth shattering. Google already has Gemini Nano that runs on device (Pixel 8 and Pro as well as Samsung S24 use it). Since their chips are more powerful, the models may be able to do better in real world.

Beautyspin · Apr 21, 2024

ipedro said:
An offline LLM means that Siri itself will be processed on device as it has been doing in a limited way for a couple of iOS generations now, but it can still make use of the web to fetch requests. It doesn't need to send your voice to the cloud to process, saving round trip time and preserving privacy.

A recent machine learning model Apple published with the capability to recognize apps' UI and how to use them, gives us a good indication of how this will work. Siri (offline) will be able to act on your behalf to use apps (online).

User:
Siri, I'd like to have dinner with my girlfriend tonight at that place I walked by last week with the red umbrella. I took a photo of it.

Siri:

Code:

Looks at your Photos (offline), finds the red umbrella, geotagged, locates the restaurant in Apple Maps (online), brings up OpenTable (online) cues up a reservation after looking at your Calendar (offline).

That was Sandro's on College St. I've found you a reservation for 2 at 8pm. You get off work at 5, that should give you enough time to get home, get ready and head over to Sandro's. Ana's schedule also shows her free. Would you like me to book it?

User:
Yes... no, wait, can we do 8:30 instead? I'd like to get a bottle of wine, does Sandro's have a corking fee?

Siri:

Code:

Looks up OpenTable (online) to see if there's an 8:30pm reservation. Looks up Sandro's website (online) and searches for a corking fee. Looks up Apple Maps for nearby wine stores, finds one that's on the Ritual app (online) so they can can bag your wine for pickup, finds that you've ordered 2 different wines via Ritual.

Sandro's has a $6 corking fee. I found you an 8:30 reservation. Would you like Mateus Rose or Wolf Blass - Yellow Label Sauvignon? I can reserve it at the Wine Cellar, a short walk from Sandro's.

User:
Let's do Mateus. Go ahead and book the reservation please.

Siri:

Code:

Goes to OpenTable, places the reservation on your behalf. Goes to Ritual, orders a bottle of Mateus Rose for pickup at 8pm. Adds an event in your calendar with directions to the Wine Cellar and another at 8:30pm with directions from there to Sandro's. Creates a calendar invite for your girlfriend.

All set! Your Mateus Rose will be ready for pickup at 8pm, Sandro's at 8:30 on College St. and I've sent an invite to Ana.

Local Siri processing without having to access the internet will enable free flowing conversations without a delay. Having the ability to recognize how to use apps on your phone will be the online component. You already use those apps online. I suspect Google will be one of them, to allow Siri to get current information from the internet, using Gemini and returning those answers the same way it can return search results today, but with the capability to make use of them to get you an answer and read them back to you.

Apple's advantage beyond building silicon custom made for its native Siri, is that it has the largest App Store with virtually unlimited potential (there's an app for everything). Give Siri the capability to understand how to use apps (like the model Apple just published) and you can imagine how far this can go.

Nice depiction. However, all I could think of the entire time I was reading this was so many ways that Siri could screw it up spectacularly.

Jaisah · Apr 21, 2024

Carrotstick said:
If Apple truly cared about privacy, their cloud AI would also be in-house.

This just shows that Apple is behind in cloud based AI

It's all speculation for now. I am holding out hope that Apple will be using it's own AI for a lot of things and it'll only be using Gemini for "google search" type requests like "How do I make Carbonara" and then Apple's local LLM will respond with an answer that it gathered from Gemini i.e. using Gemini like a tool within Siri. I assume that there will be a few of these "tools" built into Siri as we have already heard rumours about being able to edit images with voice prompts, create GIFs from still images using prompts etc. I'm hopeful that we will see something unique from Apple because if they just take Gemini and integrate that into Siri then it'll be a huge disappointment.

Torty · Apr 21, 2024

Beautyspin said:
Pixel 8 runs Gemini Nano, Google's on device AI LLM. It has 8 GB RAM. Since Apple has been touting that its phones have the fastest chips, I am guessing that older devices should also get these on device AI capabilities. If Pixel 8 can do it, I am sure Apple would like to say that iPhone 11 (or whatever old model it sees fit) could run it showing how powerful their chips are.

I think perhaps iPhone 15P will be supported. 15 and 14p couldn’t even run latest Apple promoted games so those are already obsolete.

svish · Apr 21, 2024

Looking to hear all about it at WWDC. Expecting many of the features to be limited to the new iPhones and Macs that will be released this year.

dampfnudel · Apr 22, 2024

Carrotstick said:
If Apple truly cared about privacy, their cloud AI would also be in-house.

This just shows that Apple is behind in cloud based AI

Apple doesn’t seem to be doing in-house that well these days. Siri, their modem project, their car project, maybe a few other canned in-house projects that were never leaked or disclosed.

purplerainpurplerain · Apr 22, 2024

PeLaNo said:
Didn't the Apple AI team publish the paper about having LLM so compact that it can run on a limited resources device?

2312.11514.pdf (arxiv.org)

The subject was ‘Gemini’ running on device.

purplerainpurplerain · Apr 22, 2024

Beautyspin said:
Pixel 8 runs Gemini Nano, Google's on device AI LLM. It has 8 GB RAM. Since Apple has been touting that its phones have the fastest chips, I am guessing that older devices should also get these on device AI capabilities. If Pixel 8 can do it, I am sure Apple would like to say that iPhone 11 (or whatever old model it sees fit) could run it showing how powerful their chips are.

Nano is very cutdown and probably mostly just a bunch of if else statements designed around the built in apps.

macfacts · Apr 22, 2024

wigby said:
According to this rumor, their AI's LLMs are on device which is safer than any cloud AI.

Apple has its users scared of their own shadow with their talk of privacy. There is no "danger" or embarrassment or shame if someone knows what color you like.

StrollerEd · Apr 22, 2024

ipedro said:
An offline LLM means that Siri itself will be processed on device as it has been doing in a limited way for a couple of iOS generations now, but it can still make use of the web to fetch requests. It doesn't need to send your voice to the cloud to process, saving round trip time and preserving privacy.

A recent machine learning model Apple published with the capability to recognize apps' UI and how to use them, gives us a good indication of how this will work. Siri (offline) will be able to act on your behalf to use apps (online).

User:
Siri, I'd like to have dinner with my girlfriend tonight at that place I walked by last week with the red umbrella. I took a photo of it.

Siri:

Code:

Looks at your Photos (offline), finds the red umbrella, geotagged, locates the restaurant in Apple Maps (online), brings up OpenTable (online) cues up a reservation after looking at your Calendar (offline).

That was Sandro's on College St. I've found you a reservation for 2 at 8pm. You get off work at 5, that should give you enough time to get home, get ready and head over to Sandro's. Ana's schedule also shows her free. Would you like me to book it?

User:
Yes... no, wait, can we do 8:30 instead? I'd like to get a bottle of wine, does Sandro's have a corking fee?

Siri:

Code:

Looks up OpenTable (online) to see if there's an 8:30pm reservation. Looks up Sandro's website (online) and searches for a corking fee. Looks up Apple Maps for nearby wine stores, finds one that's on the Ritual app (online) so they can can bag your wine for pickup, finds that you've ordered 2 different wines via Ritual.

Sandro's has a $6 corking fee. I found you an 8:30 reservation. Would you like Mateus Rose or Wolf Blass - Yellow Label Sauvignon? I can reserve it at the Wine Cellar, a short walk from Sandro's.

User:
Let's do Mateus. Go ahead and book the reservation please.

Siri:

Code:

Goes to OpenTable, places the reservation on your behalf. Goes to Ritual, orders a bottle of Mateus Rose for pickup at 8pm. Adds an event in your calendar with directions to the Wine Cellar and another at 8:30pm with directions from there to Sandro's. Creates a calendar invite for your girlfriend.

All set! Your Mateus Rose will be ready for pickup at 8pm, Sandro's at 8:30 on College St. and I've sent an invite to Ana.

Local Siri processing without having to access the internet will enable free flowing conversations without a delay. Having the ability to recognize how to use apps on your phone will be the online component. You already use those apps online. I suspect Google will be one of them, to allow Siri to get current information from the internet, using Gemini and returning those answers the same way it can return search results today, but with the capability to make use of them to get you an answer and read them back to you.

Apple's advantage beyond building silicon custom made for its native Siri, is that it has the largest App Store with virtually unlimited potential (there's an app for everything). Give Siri the capability to understand how to use apps (like the model Apple just published) and you can imagine how far this can go.

Mateus Rose?

OK, I'm out ...

sbsyoel · Apr 22, 2024

a m u n said:
This is an example of misinformation.

Apple operates its own data centers and integrates Google Cloud and Amazon AWS for various purposes. These external cloud services may be utilized to enhance Apple’s infrastructure during peak usage periods or for specific functionalities.

Said by Apple:

Your Apple iCloud data is now stored on Google servers—surprised?

Apple also stores data with Amazon S3.

arstechnica.com

TheOldChevy · Apr 22, 2024

That's the right approach. Mistral/LLama3 and other LLM models are already efficiently used on a simple MacBook M1, I can easily imagine that these will fit in next generation iPhones and be the perfect interface to a next-gen Siri.

Jackbequickly · Apr 22, 2024

Unless Siri gets some brains, more speed will do nothing. She can not accomplish some of the most simple of tasks!

Gurman: Apple Working on On-Device LLM for Generative AI Features

macrumors 603

macrumors 604

macrumors 604

macrumors regular

macrumors 603

macrumors 68040

macrumors 65816

macrumors 65816

macrumors 65816

macrumors 65816

macrumors 65816

macrumors G5

macrumors 65816

macrumors 65816

macrumors newbie

macrumors 65816

macrumors G3

macrumors 601

macrumors 6502a

macrumors 6502a

macrumors 601

macrumors 6502a

macrumors newbie

macrumors 6502

macrumors 68030

Our Staff