
zoomp

macrumors regular
Original poster
Aug 20, 2010
215
363
I was wondering the other day what Apple is planning behind the closed doors of its R&D department when it comes to AI.

Could it be that Apple will combine its powerful and efficient Silicon SoCs with large language models (LLMs) like GPT-4 to create a secure, privacy-preserving user experience?

The main concept is to perform local semantic indexing of user data on Apple devices using the dedicated ML cores in Apple Silicon chips, such as the M1's Neural Engine. This local semantic indexing would be designed to maintain user privacy by creating a representation of the data that can be safely used without exposing raw information. Techniques like differential privacy or federated learning could be employed to achieve this.
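To make the idea concrete, here is a rough sketch of what local semantic indexing could look like in software terms. I'm using the open-source sentence-transformers library and a generic small embedding model purely as stand-ins; whatever Apple would actually run on the Neural Engine is their own, and the file paths here are just placeholders.

```python
# Minimal sketch: build a local semantic index of user documents.
# Assumes `pip install sentence-transformers`; the model name and the
# ~/Notes folder are placeholders, not anything Apple actually ships.
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small model, runs fine on-device

docs = [p.read_text() for p in Path("~/Notes").expanduser().glob("*.txt")]
index = model.encode(docs, normalize_embeddings=True)  # shape: (n_docs, embedding_dim)

# The raw text never needs to leave the device; only these dense vectors
# (or whatever is derived from them) would participate in any cloud step.
np.save("local_semantic_index.npy", index)
```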

Once the semantic index or representation is created, it could be sent to an LLM hosted on Apple's servers for processing. The LLM would generate a response based on the user's data without ever having direct access to the raw data, further preserving privacy.
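And a hedged sketch of the query half of the pipeline: the question is embedded locally, the most relevant material is selected on-device, and only that minimal, derived context is posted to the server-side LLM. Whether what crosses the wire is the embedding itself, an anonymized representation, or short excerpts like below is exactly the open design question; the endpoint URL and payload shape here are invented for illustration.

```python
# Sketch of the query path, continuing from the index built above.
# The endpoint URL and JSON payload shape are hypothetical.
from pathlib import Path

import numpy as np
import requests
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = [p.read_text() for p in Path("~/Notes").expanduser().glob("*.txt")]
index = np.load("local_semantic_index.npy")        # (n_docs, dim), unit-normalized

def ask(question: str, k: int = 3) -> str:
    q = model.encode([question], normalize_embeddings=True)[0]
    best = np.argsort(index @ q)[::-1][:k]          # cosine similarity via dot product
    excerpts = [docs[i][:500] for i in best]        # only short, relevant excerpts leave the device
    resp = requests.post(
        "https://llm.example.invalid/v1/answer",    # hypothetical server-side LLM endpoint
        json={"question": question, "context": excerpts},
        timeout=30,
    )
    return resp.json()["answer"]
```

Usage would be something like `ask("When does my passport expire?")`, with the server never seeing more than the question and a few hundred characters of locally chosen context.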

To ensure a smooth and efficient user experience, the Apple Silicon SoC's advantages could be leveraged, such as unified memory architecture, specialized ML cores, energy efficiency, and tight integration with Apple's software ecosystem.

This approach has the potential to offer a balanced solution that maintains user privacy while still providing the benefits of LLMs, such as powerful natural language understanding and generation capabilities.

I would love to hear your thoughts on this idea. Do you think it's a viable solution for maintaining privacy while still leveraging the power of LLMs? Are there any potential pitfalls or challenges that you see in implementing this approach? Could this scale down to an iPhone?
 

Tagbert

macrumors 603
Jun 22, 2011
5,609
6,551
Seattle
I think it is likely that Apple will eventually integrate LLMs into Siri, Spotlight, and some other areas, but I expect them to approach it cautiously, and I don't expect to see any significant enhancements until next year's OS versions. This year, if they do anything, it would likely be very small scale. One rumor suggested LLM-powered jokes in Siri. I hope for just a little more than that but would not be surprised if that is all.

I would expect them to try to keep it local and push that as a privacy feature.
 

dgdosen

macrumors 68030
Dec 13, 2003
2,761
1,401
Seattle
Are you thinking about local inference? As my brain explodes from what ChatGPT can and will do, I see how inference becomes more costly due to the size of the model. That means users might want more GPU (and CPU?) power on the 'edge' to get the full benefit from these models.

That also makes me think having more GPU power will be more important for local inference. That'd give customers (other than gamers and content creators) a better reason to buy AS machines with more GPUs (or whatever they want to stick on a Max chip...)

I think Apple could be in a position to help 'best enable' working with these future iterations of LLMs - in ways other than Siri and Spotlight - and yes touting 'security' would be a selling point.
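To put rough numbers on why model size drives that hardware requirement, here's some back-of-the-envelope arithmetic (generic parameter counts and byte widths, ignoring the KV cache and activations):

```python
# Rough memory-footprint arithmetic for local LLM inference.
# Weights only: parameters x bytes-per-parameter.
def weight_gb(n_params_billion: float, bytes_per_param: float) -> float:
    return n_params_billion * 1e9 * bytes_per_param / 2**30

for n in (7, 13, 65):
    print(f"{n}B params: fp16 ≈ {weight_gb(n, 2):.1f} GB, "
          f"4-bit ≈ {weight_gb(n, 0.5):.1f} GB")
# 7B:  fp16 ≈ 13.0 GB, 4-bit ≈ 3.3 GB
# 13B: fp16 ≈ 24.2 GB, 4-bit ≈ 6.1 GB
# 65B: fp16 ≈ 121.1 GB, 4-bit ≈ 30.3 GB
```

Which is exactly why unified memory on the bigger AS chips starts to look attractive for local inference.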
 

zoomp

macrumors regular
Original poster
Aug 20, 2010
215
363
The role of the Neural Engine in local inference cannot be overstated. As LLMs become more complex, having specialized hardware like the Neural Engine becomes crucial for running ML tasks efficiently on edge devices. This not only ensures smoother user experiences but also helps conserve battery life in mobile devices like iPhones and iPads.

GPUs can also be more efficient for certain workloads, but the NE could be Apple's hidden card.
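For anyone who wants to poke at the NE today, Core ML already lets you ask for it when converting a model. A minimal sketch with coremltools; the toy model is just a stand-in, and there's no guarantee every layer actually gets scheduled onto the ANE rather than the CPU:

```python
# Sketch: convert a (toy) PyTorch model to Core ML and request the Neural Engine.
# Requires `pip install torch coremltools`; prediction runs on macOS.
import torch
import coremltools as ct

net = torch.nn.Sequential(torch.nn.Linear(128, 256), torch.nn.ReLU(),
                          torch.nn.Linear(256, 10)).eval()
example = torch.randn(1, 128)
traced = torch.jit.trace(net, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=example.shape)],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.CPU_AND_NE,   # prefer CPU + Neural Engine, skip the GPU
)
mlmodel.save("tiny_classifier.mlpackage")

out = mlmodel.predict({"x": example.numpy()})  # runs through Core ML's own scheduler
print(list(out.keys()))
```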
 
  • Like
Reactions: Tagbert

Rnd-chars

macrumors regular
Apr 4, 2023
247
232
Everything you described seems feasible, zoomp, but I wonder how much of it they would prefer (and are capable of) keeping local (which means valuable offline experiences) versus still private and effectively tokenized in the cloud (allowing for the same capabilities through any Siri and Internet connected device).

Rather than having a personalized LLM, it may make sense for them to start with a portable/local generalized LLM with Apple-specific tunings (e.g., extended accessibility, Xcode and app generation, "transport me to the deck of Picard's Starship Enterprise" using my XR headset, etc.). This would help from a time-to-market perspective and is already feasible using CPU processing only (which would obviously only get better with NPU and GPU support). I suspect the initial set of use cases would be content generation and task completion, which could be an extremely compelling product differentiator for them.
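For what it's worth, the CPU-only part is already something you can try today with open models. A hedged sketch using the llama-cpp-python bindings; the model path is a placeholder for whatever quantized weights you have locally, and obviously none of this is anything Apple ships:

```python
# Sketch: CPU-only local inference with a quantized open model.
# Requires `pip install llama-cpp-python` and a locally downloaded,
# quantized model file (the path below is a placeholder).
from llama_cpp import Llama

llm = Llama(model_path="./models/7b-quantized.bin", n_ctx=2048, n_threads=8)

out = llm(
    "Summarize why on-device inference helps privacy:",
    max_tokens=128,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```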
 

name99

macrumors 68020
Jun 21, 2004
2,235
2,067
Isn't Apple Neural Engine the same as Nvidia Tensor Cores? What advantages does Neural Engine have over Tensor Cores? Is the neural engine API public?

While we wait for Apple to release its LLM, we can try this:
Performance-wise, not much. The ANE right now is primarily about power-saving compared to using the GPU for inference. But that could definitely change once AI becomes settled enough that certain usage patterns are both common and amenable to specialized hardware paths which are not of generic GPU interest. This may be the case with language inference, for example.

The extent to which this power-saving matters depends on how often you expect to be using AI. It might seem like not much of an issue if you see AI as the occasional Siri query; it's much more of an issue if you see AI as running continually while you are using a camera, or while a game is playing, or in the background seeing if a workout (and of what type) has started, or suchlike...

Remember always this is a huge space. Training is not inference. Language is not vision. Image generation is not image recognition. etc etc.
nV are optimizing for one set of things (based on their core skills and customer base); Apple are optimizing for a somewhat different set of things (likewise based on their core skills and customer base).
 

floral

macrumors 65816
Jan 12, 2023
1,010
1,230
Earth
R&D dpt
Silicon SoCs with large language models (LLMs) like GPT-4
local semantic indexing
ML cores in Apple Silicon chips
Techniques like differential privacy or federated learning
Once the semantic index or representation is created, it could be sent to an LLM
unified memory architecture
specialized ML cores
powerful natural language understanding and generation capabilities.
It's official, these AI chatbot shenanigans have evolved beyond my understanding. *o*
 

dgdosen

macrumors 68030
Dec 13, 2003
2,761
1,401
Seattle
LLMs on Apple's NE and unified memory get touched on a bit in this podcast - especially near the end:


Too bad the M3 Pro/Max isn't on the roadmap until 2024 :(
 
  • Like
Reactions: Xiao_Xi

Xiao_Xi

macrumors 65832
Oct 27, 2021
1,502
930
nV are optimizing for one set of things (based on their core skills and customer base); Apple are optimizing for a somewhat different set of things (likewise based on their core skills and customer base).
Out of curiosity, what things are Apple and nVidia focusing on?
 

name99

macrumors 68020
Jun 21, 2004
2,235
2,067
Out of curiosity, what things are Apple and nVidia focusing on?
nV is focussing on stuff that sells to the datacenter and academic departments that can buy large nV installations; and trickles down to smaller devices.

Apple is focussed on stuff that operates on battery, and helps consumers; and trickles up to larger devices.

For example nvLink is a large, expensive chunk of nV designs that's not present on any Apple design (certainly not right now); closest is the UltraFusion link on Max, but that's a kinda different beast.
 

zoomp

macrumors regular
Original poster
Aug 20, 2010
215
363
The other day I watched an interview with the CEO of Stability.ai about how he believes in personal models.

As we move toward a future with personalized models on edge devices, Apple's Neural Engine (NE) could play a critical role. The NE can function as a secure, private, and efficient long-term memory, addressing one of the limitations of current large language models (LLMs). This advantage positions Apple at the forefront of balancing powerful AI capabilities with user privacy and security.

Imagine an assistant that can remember everything you have told it, in a secure and private way, and could act on that information. How useful would that be?
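Something like this, in very rough software terms, reusing the same local-embedding idea from earlier in the thread. Library and model choices are placeholders, and a real assistant would also need encryption at rest and cross-device sync:

```python
# Sketch: a tiny on-device "long-term memory" for an assistant.
# Everything stays local; library and model choices are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

class AssistantMemory:
    def __init__(self) -> None:
        self.model = SentenceTransformer("all-MiniLM-L6-v2")
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def remember(self, text: str) -> None:
        """Store something the user said, as text plus its embedding."""
        self.texts.append(text)
        self.vectors.append(self.model.encode(text, normalize_embeddings=True))

    def recall(self, query: str, k: int = 3) -> list[str]:
        """Return the k stored memories most relevant to the query."""
        q = self.model.encode(query, normalize_embeddings=True)
        scores = np.array(self.vectors) @ q
        return [self.texts[i] for i in np.argsort(scores)[::-1][:k]]

memory = AssistantMemory()
memory.remember("My partner is allergic to peanuts.")
memory.remember("I park in lot B on Tuesdays.")
print(memory.recall("What should I avoid cooking for dinner?"))
```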
 
Last edited:

Xiao_Xi

macrumors 65832
Oct 27, 2021
1,502
930
Apple's Neural Engine (NE) could play a critical role. The NE can function as a secure, private, and efficient long-term memory, addressing one of the limitations of current large language models (LLMs).
Why use the NE and not the GPU? NE is only faster than GPU when the Mx SoC has a weak GPU.
[attached chart comparing NE and GPU inference performance across Mx SoCs]


Apple also needs to improve the performance and stability of the macOS/Metal backends for the most popular deep learning frameworks, as they are still effectively in beta.
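For example, PyTorch's Metal (MPS) backend works, but some ops are still unimplemented and performance is uneven. A minimal example of opting into it on Apple Silicon:

```python
# Sketch: running PyTorch on the Apple GPU via the Metal (MPS) backend.
# Requires a recent PyTorch build with MPS support on Apple Silicon.
import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(f"Using device: {device}")

x = torch.randn(4096, 4096, device=device)
w = torch.randn(4096, 4096, device=device)
y = (x @ w).relu().sum()
print(y.item())  # some ops still fall back to the CPU or are unimplemented on MPS
```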
 
Last edited:

name99

macrumors 68020
Jun 21, 2004
2,235
2,067
The other day I watched an interview with the CEO of Stability.ai about how he believes in personal models.

As we move toward a future with personalized models on edge devices, Apple's Neural Engine (NE) could play a critical role. The NE can function as a secure, private, and efficient long-term memory, addressing one of the limitations of current large language models (LLMs). This advantage positions Apple at the forefront of balancing powerful AI capabilities with user privacy and security.

Imagine an assistant that can remember everything you have told it, in a secure and private way, and could act on that information. How useful would that be?
That sounds like a great theory. (And I agree that an important missing piece of existing LLMs is personalization; part of what makes a great assistant great is that the assistant knows not just the general theory of how many people like food X vs food Y, but what YOU SPECIFICALLY like, and so on for movies, gift recommendations, writing style, etc.)
BUT
it only works if there is a way to extract learned data from a device to transfer it to another device AND to share the (constantly evolving) model across all my devices.
Right now Apple has this stuff so silo'd that this is a constant problem. As far as I can tell, CarPlay forgets my common driving routes every time I upgrade my phone; or as a second example, typing correction does not seem to be learned across my devices.

Until Apple solves this, personalization will be a useless gimmick. I'm not interested in a setup where, if I perform a task on my iPad I get one type of result, whereas the same task on my Mac gives a very different result because the models on each have picked up very different training...
 

dgdosen

macrumors 68030
Dec 13, 2003
2,761
1,401
Seattle
That sounds like a great theory. (And I agree that an important missing piece of existing LLMs is personalization; part of what makes a great assistant great is that the assistant knows not just the general theory of how many people like food X vs food Y, but what YOU SPECIFICALLY like, and so on for movies, gift recommendations, writing style, etc.)
BUT
it only works if there is a way to extract learned data from a device to transfer it to another device AND to share the (constantly evolving) model across all my devices.
Right now Apple has this stuff so silo'd that this is a constant problem. As far as I can tell, CarPlay forgets my common driving routes every time I upgrade my phone; or as a second example, typing correction does not seem to be learned across my devices.

Until Apple solves this, personalization will be a useless gimmick. I'm not interested in a setup where, if I perform a task on my iPad I get one type of result, whereas the same task on my Mac gives a very different result because the models on each have picked up very different training...
Isn't that what something like a 'retrieval-plugin' is for?


Does Apple 'need' to be part of this other than for providing the hardware punch?
 

zoomp

macrumors regular
Original poster
Aug 20, 2010
215
363
Emad Mostaque touches on these ideas at around 1h03 of this interview.


He believes Apple is waiting for most of its users to be on SoCs with NEs before upgrading Siri to Jarvis level.
 
  • Like
Reactions: dgdosen

quarkysg

macrumors 65816
Oct 12, 2019
1,232
820
Why use the NE and not the GPU? NE is only faster than GPU when the Mx SoC has a weak GPU.
I would think power consumption is one of the key factors. It looks to me like the AS NE is geared more toward edge computing. Deep learning training is probably relegated to servers, at least for now.
 
  • Like
Reactions: Xiao_Xi

dgdosen

macrumors 68030
Dec 13, 2003
2,761
1,401
Seattle
Do you think Apple will miss the opportunity to create a paid service as Microsoft has done?
Of course not! Apple could package all this in some kind of next version of Swift/Siri Playgrounds... They (and everyone else) are getting embarrassed by OpenAI/Microsoft beating them to the punch.
 

zoomp

macrumors regular
Original poster
Aug 20, 2010
215
363
I think I just had an epiphany watching a video about the rumored upcoming glasses at WWDC. It sounds like a terrible idea BUT:

As we look towards the future of personalized models and natural language-based interactions with computers, AR glasses, like the ones rumored to be unveiled by Apple, could serve as the perfect edge device. Combining Apple's expertise in hardware and software integration with LLM-driven interactions, users could benefit from powerful AI capabilities while maintaining privacy through local processing on the device.

Integrating LiDAR and cameras into the AR glasses would enable multi-modal interactions, providing accurate real-time spatial awareness, object recognition, and depth perception. This would unlock a range of context-aware applications, such as navigation, education, entertainment, and accessibility, that understand the user's environment and deliver personalized, relevant assistance.

By combining visual, spatial, and natural language inputs with LLMs, Apple's AR glasses have the potential to revolutionize the way we interact with technology and our surroundings, creating a powerful and versatile platform for the future of computing.

Just as our fingers were once the best way to interact with a device (remember SJ introducing the iPhone?), what better way to interact with a "computer" than our own voice?
 