
MapleBeercules

Cancelled
Nov 9, 2023
127
157
Comparing M3 to anything right now is a bad comparison...
AMD 7000 was built on a known processor architecture; the M3 was built on a new architecture with terrible results both in yields and in quality. When TSMC switches over to N3E and Apple makes a chip on that node, it will smoke anything AMD has on their roadmap for years to come.

N3B, which all M3 processors are based on, is pure crap. In fact, Apple is the only customer who accepted any N3B products; every other company turned down N3B because it was pure crap.
 

MRMSFC

macrumors 6502
Jul 6, 2023
343
353
Comparing M3 to anything right now is a bad comparison...
AMD 7000 was built on a known processor architecture; the M3 was built on a new architecture with terrible results both in yields and in quality. When TSMC switches over to N3E and Apple makes a chip on that node, it will smoke anything AMD has on their roadmap for years to come.

N3B, which all M3 processors are based on, is pure crap. In fact, Apple is the only customer who accepted any N3B products; every other company turned down N3B because it was pure crap.
As a resident Apple Silicon fanboy;

It’s perfectly valid to compare two current competing products with each other. What you’ve suggested is like what another user suggested but in reverse (that comparing Intel v. Apple Silicon isn’t fair because of process node).

We should hold every product to the same standards.
 

name99

macrumors 68020
Jun 21, 2004
2,283
2,139
Statisticians use the standard deviation, not the percentage, to establish whether two points are significantly different or not.
Seriously dude?
OK, let me be very clear. When I say noise I mean there is nothing TECHNICALLY interesting in such small differences.
If you find such differences fascinating for whatever reason, go for it.

But don't be surprised when other people are simply UNINTERESTED in your trumpeting such numbers. They're not interesting for clarifying tech differences between designs. They're not interesting for deciding to buy a new machine.
 
  • Haha
Reactions: Xiao_Xi

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,532
956
HotChips just uploaded the presentations from the last conference.

AMD made two presentations that may be interesting for this thread.
- AMD Next Generation “Zen 4” Core and 4th Gen AMD EPYC™ 9004 Server CPU

- AMD Ryzen 7040 Series Mobile Processor
 

leman

macrumors Core
Oct 14, 2008
19,319
19,336

For fastest performance, you want your data to be in cache. Cache usually works in small blocks (e.g. 64 bytes). This means that if you are processing a sequential list of data elements, only the first element of each 64-byte block will actually result in a DRAM access: the entire cache block is loaded, and the subsequent n elements can be read from fast cache.

Even better, if you know that you will be processing a bunch of such elements, it makes sense for the CPU to load the relevant blocks into the cache before you get to processing them, which reduces waiting times. A while ago we had dedicated prefetch hints for this (e.g. before doing a large memory read you could tell the CPU that you were about to do it, prompting it to start loading the data from DRAM into the cache). Nowadays this is done with automatic prefetchers that try to learn your access pattern and prefetch data accordingly.

Detecting linear access is simple (e.g. if the CPU sees that you have accessed N consecutive locations in memory, it can assume you are doing array processing and start fetching ahead). Apple goes one step further and also prefetches indirect accesses: if you are processing an array of pointers, it will detect that and start loading the data at the subsequent pointer addresses.
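To make the access patterns above concrete, here is a minimal C sketch, not anything taken from Apple's documentation. The function names and `PREFETCH_DISTANCE` are made up for illustration, and `__builtin_prefetch` is the GCC/Clang builtin, so this assumes one of those compilers. The first loop is the linear case, the second is the array-of-pointers case, and the third shows what an old-style explicit hint looks like. Whether the hint helps at all on a given chip is something you would have to measure.

```c
#include <stddef.h>
#include <stdint.h>

/* Linear walk: with 64-byte lines and 4-byte elements, roughly one
   access in 16 touches a new cache line; a stride prefetcher can
   run ahead of the loop. */
uint64_t sum_linear(const uint32_t *data, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Indirect walk: the pointer array itself is read linearly, but the
   values it points at can live anywhere in memory. */
uint64_t sum_indirect(const uint32_t *const *ptrs, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += *ptrs[i];
    return total;
}

/* Hypothetical look-ahead distance, chosen arbitrarily for the sketch. */
#define PREFETCH_DISTANCE 8

/* Same indirect walk with an explicit software hint for the element
   PREFETCH_DISTANCE iterations ahead (read access, high locality). */
uint64_t sum_indirect_hinted(const uint32_t *const *ptrs, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(ptrs[i + PREFETCH_DISTANCE], 0, 3);
        total += *ptrs[i];
    }
    return total;
}
```

All three functions return the same sum for the same data; the only difference is how the memory traffic looks to the prefetchers.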
 
  • Like
Reactions: MRMSFC and Xiao_Xi

Sydde

macrumors 68030
Aug 17, 2009
2,557
7,059
IOKWARDI
Nowadays this is done with automatic prefetches that try to learn your access pattern and prefetch data accordingly.

There is in fact still an instruction for explicit memory prefetch, because maybe it is sometimes still needed. I am not seeing any flush/invalidate instructions (though with all the layers of caching, that would be somewhat fraught); perhaps those are effected through MSR?
 

name99

macrumors 68020
Jun 21, 2004
2,283
2,139
There is in fact still an instruction for explicit memory prefetch, because maybe it is sometimes still needed. I am not seeing any flush/invalidate instructions (though with all the layers of caching, that would be somewhat fraught); perhaps those are effected through MSR?
The DC instruction has modifiers that will perform a wide range of cache maintenance operations.

Prefetching on Apple chips is in fact extremely sophisticated. I'd be surprised to see a real-world example of a data pattern that's both predictable enough for SW prefetch to be worthwhile, but isn't caught by one of the many Apple hardware prefetchers.
 
  • Like
Reactions: Xiao_Xi

Sydde

macrumors 68030
Aug 17, 2009
2,557
7,059
IOKWARDI
I'd be surprised to see a real-world example of a data pattern that's both predictable enough for SW prefetch to be worthwhile, but isn't caught by one of the many Apple hardware prefetchers.

Well, look at DC ZVA: the program tells the processor, I am going to fill this whole line with stuff, so zero it out and don't bother to load it. That is just excellent.
 

name99

macrumors 68020
Jun 21, 2004
2,283
2,139
Well, look at DC ZVA: the program tells the processor, I am going to fill this whole line with stuff, so zero it out and don't bother to load it. That is just excellent.
If you are calling DC ZVA a *prefetch* instruction then I'm out of this conversation.
You're obviously more interested in "winning" debate games by playing stupid word tricks than in understanding technology.
 

Sydde

macrumors 68030
Aug 17, 2009
2,557
7,059
IOKWARDI
If you are calling DC ZVA a *prefetch* instruction then I'm out of this conversation.
You're obviously more interested in "winning" debate games by playing stupid word tricks than in understanding technology.
No, it is not a prefetch, it is a do-not-fetch, because the program only wants to write. It saves the fetch cycle that would normally happen when a program starts writing stuff. Of course, AS might well have that in their memory optimization logic, so that a program would not need to issue the instruction at all. In fact, I would not at all be surprised if the other designs, including x86, have it as well.
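For anyone who wants to see what that looks like in practice, here is a rough sketch, assuming an AArch64 target and GCC/Clang inline assembly. `dczva_block_size` and `zero_region` are names I made up. The block size is read from the `DCZID_EL0` register, whose DZP bit says whether DC ZVA is prohibited for the current exception level. On any other architecture, or when DC ZVA is prohibited, the sketch simply falls back to `memset`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
/* Returns the DC ZVA block size in bytes, or 0 if DC ZVA is prohibited. */
static size_t dczva_block_size(void) {
    uint64_t dczid;
    __asm__ volatile("mrs %0, dczid_el0" : "=r"(dczid));
    if (dczid & 0x10)                  /* DZP bit set: not allowed */
        return 0;
    return (size_t)4 << (dczid & 0xF); /* BS field = log2(size in 4-byte words) */
}
#endif

/* Zero len bytes at p. Whole aligned blocks are zeroed with DC ZVA where
   available, so the old contents never need to be fetched; the unaligned
   head and tail go through memset. */
void zero_region(void *p, size_t len) {
    assert(p != NULL || len == 0);
#if defined(__aarch64__)
    size_t bs = dczva_block_size();
    if (bs != 0) {
        unsigned char *cur = (unsigned char *)p;
        unsigned char *end = cur + len;
        /* Bytes needed to reach the next block boundary. */
        size_t head = (bs - ((uintptr_t)cur % bs)) % bs;
        if (head > len)
            head = len;
        memset(cur, 0, head);
        cur += head;
        while ((size_t)(end - cur) >= bs) {
            __asm__ volatile("dc zva, %0" : : "r"(cur) : "memory");
            cur += bs;
        }
        memset(cur, 0, (size_t)(end - cur));
        return;
    }
#endif
    memset(p, 0, len);
}
```

In real code you would normally just call `memset` and let the C library decide when DC ZVA is worth using; the point here is only to show the "zero the line without loading it" step explicitly.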
 