
Freeangel1

Suspended
Jan 13, 2020
1,191
1,753
At some point shrinking a chip becomes a negative thing.

I find it hard to believe that a 2nm chip will still beat out a 5 or 4 nm chip.

Maybe in power efficiency usage.

But at some point these super small chips will function like Intel Atom chips.

All low power consumption but NO more high output, high frequency heavy hitting power for intensive tasks.

We shall see but I really think I'm right on this one.
 

Bug-Creator

macrumors 68000
May 30, 2011
1,763
4,688
Germany
I find it hard to believe that a 2nm chip will still beat out a 5 or 4 nm chip.

But at some point these super small chips will function like Intel Atom chips.

Intel's Atom chips were made on whatever node was current at the time, starting with 45nm, so that's not a thing.

Going smaller means:
- you can put more transistors on a chip with the same size and power consumption, which can either be used to add more specialised functions or just more of the same cores. Neither will do much for single-threaded work, but both do a whole lot more for everything else

- shorter distances make it easier to keep things in sync, allowing you to either do more complicated functions in 1 cycle or to up the clock

Just remember how the PIV struggled to get anything done at its (for the time) insane clocks and how these same clockspeeds are now the base for low power cores.
 

leman

macrumors Core
Oct 14, 2008
19,213
19,102
a) it would require design changes that would go against what is best for laptops and they use the same chips for both

b) it would have used much more power and cooling to the point where the Ultra would not have been possible in the Studio (at that size) and Apple does care a lot about these things

c) the extra heat and power would have made it impossible to run all cores that fast all the time negating much of the benefits in multithreaded applications

d) M1.... is still 1st gen, so it is quite possible that Apple will push up the clocks later on or with systems that have a bigger thermal envelope (like the next MacPro)

I think you summed my argument very neatly for me. Making a chip that runs a high clock will likely diminish the energy efficiency at any clock. Which is exactly how it works: Intel/AMD design for high clock, high latency, high power consumption, Apple designs for low clock, low latency, low power consumption. The effective throughput is comparable.

Furthermore, I think the b) and c) arguments are flawed: why can't Apple do what Intel does and introduce a more dynamic clock spread, e.g. stay with the 3.0ghz in multicore operation and use a higher peak clock if only a few cores are active? That wouldn't compromise the cooling capacity of Apple desktops (they are more than capable of dissipating an eventual 20-30W of single-core peak power) and would still allow for high multicore performance at slightly lower clocks. So I am not convinced. A much more likely explanation is that the M1 series peaks out at 3.26ghz because that is the literal limit: it simply can't go any faster than this and still operate error-free. After all, it is not just about power consumption, but about synchronising chip internals.

This leaves the argument d) — M1 is a first generation, based on a mobile phone chip, so it likely lacks the inherent scalability. This is essentially the argument that Apple will be able to build a chip that can have both very high power efficiency and relatively high clock scalability. Is it possible? Until now, nobody managed anything like this. We see either chips that go fast (but are less efficient) or efficient chips that can't go too fast. Maybe Apple can do it — after all, they are the only ones that make a 3ghz chip that is as fast as state of the art 5ghz competitor chips. But if they manage to do it, it won't be because ARM is better than x86, but because Apple engineers are better than Intel/AMD ones. Again, the best that other ARM designers (not Apple) can do is match Intel's modern Atom cores at the same power consumption.

So yeah, I stay cautiously optimistic, but I do not think the case is closed and done. That said, I agree with you that x86 is a boring, ugly ISA and I would like to see it retired. ARMv8+SVE is currently arguably the best developed, best rounded ISA, and I quite like it. RISC-V is yet untested and I remain sceptical about its viability for general-purpose computing, but we will see. Ultimately however, these are all the same family of ISAs which use the same basic ideas and philosophy, minor execution details notwithstanding. I would like to see some new ideas (like the Mill architecture) that will hopefully allow new advances in performance.
 

leman

macrumors Core
Oct 14, 2008
19,213
19,102
Guess as to 5nm/4nm/3nm?

Wouldn't dare to guess. Don't have a good understanding of that industry. If I read the reports correctly, it's unlikely that more advanced nodes will be ready this year...
 

Kazgarth

macrumors 6502
Oct 18, 2020
303
836
At some point shrinking a chip becomes a negative thing.

I find it hard to believe that a 2nm chip will still beat out a 5 or 4 nm chip.

Maybe in power efficiency usage.

But at some point these super small chips will function like Intel Atom chips.

All low power consumption but NO more high output, high frequency heavy hitting power for intensive tasks.

We shall see but I really think I'm right on this one.
I don't see how that is true, unless you have hidden data that you want to share with us?

If we take Ryzen for example, it's been able to push higher and higher frequencies with each node shrink.

Ryzen 1000 (14nm) max clock speed 3.6 GHz
Ryzen 2000 (12nm) max clock speed 4.3 GHz
Ryzen 3000 (7nm) max clock speed 4.7 GHz
Ryzen 5000 (7nm+) max clock speed 4.9 GHz
Ryzen 7000 (5nm) max clock speed 5.5 GHz

How is that a negative thing?
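To put numbers on the list above, here is a minimal Python sketch that works out the clock gain between each listed generation. The clock figures are copied from the post as written and are not independently verified; the names `ryzen_clocks` and `clock_gains` are just illustrative.

```python
# Max clock speeds (GHz) as listed in the post above; not independently verified.
ryzen_clocks = [
    ("Ryzen 1000", "14nm", 3.6),
    ("Ryzen 2000", "12nm", 4.3),
    ("Ryzen 3000", "7nm", 4.7),
    ("Ryzen 5000", "7nm+", 4.9),
    ("Ryzen 7000", "5nm", 5.5),
]


def clock_gains(rows):
    """Return (from, to, delta_ghz, percent) for each consecutive pair of rows."""
    gains = []
    for (prev_name, _, prev_ghz), (name, _, ghz) in zip(rows, rows[1:]):
        delta = ghz - prev_ghz
        gains.append(
            (prev_name, name, round(delta, 2), round(100 * delta / prev_ghz, 1))
        )
    return gains


for prev_name, name, delta, pct in clock_gains(ryzen_clocks):
    print(f"{prev_name} -> {name}: +{delta} GHz ({pct}%)")
```

With these figures every step comes out positive, which is the point the post is making: none of the listed shrinks cost clock speed.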
 

T'hain Esh Kelch

macrumors 603
Aug 5, 2001
6,334
7,208
Denmark
Only a 15% uplift in ST in Cinebench R23 vs a 5950X. Considering Zen 2 to Zen 3 had bigger uplift and AMD waited 2 years to deliver Zen 4.

Keep in mind this is on TSMC 5nm.
It is worth noting that Zen3 had an 11% uplift in Cinebench over Zen2, but a 19% IPC increase. And AMD says 15%+, not 15%. This could very well be a play to see how Intel reacts, and then a 20%+ IPC increase. And rumors also say we'll get up to 24 cores. It will be interesting to see how Lisa plays it out.
 
  • Like
Reactions: Xiao_Xi

mi7chy

macrumors G4
Oct 24, 2014
10,495
11,155
15% isn't bad, and AMD isn't as power hungry as Intel, so it is more balanced and still faster than Apple AS. Coming from the Ryzen 5000 series, the 7000X3D series with its fat cache is more interesting, along with the RDNA3 iGPU.

Blender CPU BMW scene desktop (lower better):

1m20s - Theoretical AMD 7950X 16CPU (CPU)
1m35.21s - AMD 5950X with core boost and no PBO (CPU Blender 3.0)
1m43s - M1 Ultra 20CPU 64GPU (CPU Blender 3.1)
1m50s - M1 Ultra 20CPU 48GPU (CPU Blender 3.1)

Blender CPU BMW scene laptop (lower better):

3m20s - Theoretical AMD 7800H 8CPU (CPU)
3m55.81s - AMD 5800H 8CPU base clock, no boost and no PBO (CPU Blender 3.0)
4m11s - M1 Pro 10CPU (CPU Blender 3.1 alpha)
5m51.06s - MBA M1 8CPU (CPU Blender 3.0)
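For anyone who wants to compare these render times as ratios, a rough Python sketch follows. It only converts the desktop times quoted above into seconds and divides by the 5950X result; the "Theoretical" row is the poster's projection rather than a measurement, and the helper names (`to_seconds`, `relative_to`) are made up for this example.

```python
import re

# Render times copied from the post above (lower is better). The "Theoretical"
# row is the poster's own projection, not a measurement.
desktop = {
    "Theoretical AMD 7950X": "1m20s",
    "AMD 5950X": "1m35.21s",
    "M1 Ultra 20CPU 64GPU": "1m43s",
    "M1 Ultra 20CPU 48GPU": "1m50s",
}


def to_seconds(stamp):
    """Convert a string such as '1m35.21s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def relative_to(times, baseline):
    """Return each render time as a multiple of the baseline entry's time."""
    base = to_seconds(times[baseline])
    return {name: to_seconds(stamp) / base for name, stamp in times.items()}


for name, ratio in relative_to(desktop, "AMD 5950X").items():
    print(f"{name}: {ratio:.2f}x the 5950X render time")
```

A ratio below 1 means a faster render than the 5950X, above 1 a slower one. The same two helpers work unchanged on the laptop list.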
 
  • Like
Reactions: Wizec

deconstruct60

macrumors G5
Mar 10, 2009
12,309
3,900
Only a 15% uplift in ST in Cinebench R23 vs a 5950X. Considering Zen 2 to Zen 3 had bigger uplift

Who actually runs production rendering workloads strictly single threaded?

AMD didn't spend tons of time trying to super-optimize for single-thread drag racing benchmarks? Not really a huge 'problem'.

They are running in the 5.4-5.5GHz range without doing any overclocking at all.
https://videocardz.com/newz/amd-confirms-ryzen-7000-5-5-ghz-demo-did-not-involve-overclocking

No memory overclocking with EXPO. There were rumors that overclocking was a substantive design focus for this iteration. ( that won't be an upside for the laptop space , but AMD did Ryzen 6000 on TSMC N6 for that. )


For the hyper focused ST drag racing crowd with a hefty liquid cooler (or cryogenic fueled cooler) it is going to be more than 15%.

It isn't a laptop , simple GUI app optimized design. SIMD and multithreading got more weighting on this iteration. To go completely unidimensional on single threaded drag racing probably misses most of the upgrades.

The Ryzen 7000 also now has an iGPU present by default. Again not for the same motivations as Apple's laptop focus , but for some fixed function ( e.g., video de/encode) and as a parallel compute resource.


and AMD waited 2 years to deliver Zen 4.

Keep in mind this is on TSMC 5nm.

"2 years" is skipping over the Ryzen 6000 laptop focused iteration. AMD is moving in multiple directions and across a wider front than they were 3-4 years ago. AMD is producing like 3-4x as many SoC CPU packages as Apple. The 7000 likely will get followed with a slightly different focused 8000 series

The notion that the 7000's series ST performance isn't 'fast enough to get highly useful' work done for 90+ % of the population is weak. The vast majority of folks out there have > 1 thread workloads at this point. ( the folks reading macrumors in a modern web browser are blowing past one thread. ) Does most everyone need more than 8 cores ? Not really. However, we are largely past the stage where one thread is the nominal norm.
 

exoticSpice

Suspended
Original poster
Jan 9, 2022
1,242
1,951
It is worth noting that Zen3 had an 11% uplift in Cinebench over Zen2, but a 19% IPC increase. And AMD says 15%+, not 15%. This could very well be a play to see how Intel reacts, and then a 20%+ IPC increase. And rumors also say we'll get up to 24 cores. It will be interesting to see how Lisa plays it out.
AMD had confirmed that 15% uplift included IPC.
 

exoticSpice

Suspended
Original poster
Jan 9, 2022
1,242
1,951
Who actually runs production rendering workloads strictly single threaded?
I know that. What I meant was IPC was pretty low for Zen 4 as AMD said that >15% figure included IPC and clocks.
The Ryzen 7000 also now has an iGPU present by default. Again not for the same motivations as Apple's laptop focus , but some fixed function ( e.g., video de/encode) and as a parallel compute resource.
Not new. Intel had them on their desktop CPUs for a while.
"2 years" is skipping over the Ryzen 6000 laptop focused iteration
Which was again not a huge upgrade performance wise. I still can't buy 6000 U series laptops in my country. I can buy 12th gen Intel's P and H series though, both in-store and online.
The notion that the 7000's series ST performance isn't 'fast enough to get highly useful' work done for 90+ % of the population is weak.
I am not saying that. 90+% of the population don't need latest CPUs. In terms of going forward I believe AMD's cadence is slowing down.
 

exoticSpice

Suspended
Original poster
Jan 9, 2022
1,242
1,951
15% isn't bad and while not as power hungry as Intel so AMD is more balanced and faster than Apple AS. Coming from Ryzen 5000 series the 7000X3D series with fat cache is more interesting along with RDNA3 iGPU.

Blender CPU BMW scene desktop (lower better):

1m20s - Theoretical AMD 7950X 16CPU (CPU)
1m35.21s - AMD 5950X with core boost and no PBO (CPU Blender 3.0)
1m43s - M1 Ultra 20CPU 64GPU (CPU Blender 3.1)
1m50s - M1 Ultra 20CPU 48GPU (CPU Blender 3.1)

Blender CPU BMW scene laptop (lower better):

3m20s - Theoretical AMD 7800H 8CPU (CPU)
3m55.81s - AMD 5800H 8CPU base clock, no boost and no PBO (CPU Blender 3.0)
4m11s - M1 Pro 10CPU (CPU Blender 3.1 alpha)
5m51.06s - MBA M1 8CPU (CPU Blender 3.0)
Compared to previous Zen releases it is. Let's just wait and see.

Zen 4 and Meteor Lake are where the battle is at. Apple's SoCs are efficiency at all costs, which is not suited for desktop.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,309
3,900
I know that. What I meant was IPC was pretty low for Zen 4 as AMD said that >15% figure included IPC and clocks.

It is a percentage. As the base number gets higher the threshold of 10% goes higher.

2.9GHz ... 10% 0.29GHz ==> 3.19
3.9GHz ... 10% 0.39GHz ==> 4.29
4.9GHz ... 10% 0.49GHz ==> 5.39
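The three lines above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the post; the function name `absolute_gain` is made up for the example.

```python
def absolute_gain(base_ghz, percent):
    """Absolute clock increase (GHz) that a given percentage represents at a base clock."""
    return base_ghz * percent / 100


# Same base clocks as in the post: a fixed 10% needs more raw GHz as the base rises.
for base in (2.9, 3.9, 4.9):
    gain = absolute_gain(base, 10)
    print(f"{base:.1f} GHz + 10% = {base + gain:.2f} GHz (+{gain:.2f} GHz)")
```

The percentage stays fixed while the raw GHz needed to hit it keeps growing with the base.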


The code being executed isn't getting more parallel. So have to artificially inject instruction level parallelism to go 'wider' once get past point have exploited most of the parallelism present. To keep the core size down AMD probably didn't go wider still so it is mostly clock. but the clock gain here is larger than previous ones in raw terms.

5950X 4.9 --> 7950X 5.5 : +0.6GHz (Zen 3 -> Zen 4)
3950X 4.7 --> 5950X 4.9 : +0.2GHz (Zen 2 -> Zen 3)
TR 2950X 4.4 --> 3950X 4.7 : +0.3GHz (Zen+ -> Zen 2)
TR 1950X 4.0 --> TR 2950X 4.4 : +0.4GHz (Zen -> Zen+)



Zen 2 -> Zen 3 was more of an IPC bump. Once take that, then don't necessarily get that again later relatively easily. Especially if not growing the core substantively bigger ( or hiding much bigger growth under a large fab shrink). (L2 cache grew but no mention so far if the L3 grew. Still 'stuck' with just two memory controllers . DDR5 but also bumped up the clocks. So no new 'headroom' on bandwidth there; the size of memory request queue is likely the same. )


Intel went wider with their "big" cores (Golden Cove) in Gen 12 , but also much bigger than what AMD is doing. A 6nm IO die probably costs more. The 5nm wafers are probably costing more (for now). So I doubt AMD was in the mood to just throw money at a bigger CPU core chiplet die just to win some tech porn, single threaded benchmarks.



Not new. Intel had them on their desktop CPUs for a while.

It is 'new' for the mainstream Ryzen desktop. They dumped iGPUs from the package transistor budget to chase higher core counts. So part of the issue here is that 'transistor budget' is being spent on iGPU. go from 0 to 10 is an 'infinity' percent increase. They spent 'treasure' on non CPU cores. ( Apple is probably mostly doing the same thing with M2 ).


Which was again not a huge upgrade performance wise. I still buy 6000 U series laptops in my country. I can 12th gen Intel's P and H series though both instore and online.

Not a large upgrade battery performance wise? Surely you jest. Is a laptop user's absolute #1 top priority going to be maximum possible single thread performance or battery life ? In most cases it is the latter ; especially if it will be used detached from a power socket for a significant amount of time per day.


I am not saying that. 90+% of the population don't need latest CPUs. In terms of going forward I believe AMD's cadence is slowing down.

Again law of large numbers in a percentages context.

Company A grows revenues $500M on a $10B revenue base 5% increase.
Company B grows revenues $100M on a $1B revenue base 10% increase.

who actually brought in more new money? The folks at Company A are snoozing all day at work?
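The Company A / Company B comparison, spelled out as a small Python sketch. The figures are the ones in the post (in $M); the dictionary and function names are only for illustration.

```python
# (revenue base, added revenue) in $M, taken from the example in the post.
companies = {
    "Company A": (10_000, 500),
    "Company B": (1_000, 100),
}


def growth_percent(base, added):
    """Revenue growth as a percentage of the starting base."""
    return 100 * added / base


for name, (base, added) in companies.items():
    print(f"{name}: +${added}M on ${base}M = {growth_percent(base, added):.0f}% growth")

# Rank by absolute new money rather than by percentage.
most_new_money = max(companies, key=lambda name: companies[name][1])
print(f"Most new money: {most_new_money}")
```

Company B wins on percentage, Company A on dollars, which is why a lower percentage on a larger base isn't a sign of slowing down by itself.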


AMD is taking turns on what they focus each iteration on. Steady , methodical improvements that are not high risk. This is a bigger clock leap than they have taken on any of the previous iterations. After a "biggest clock" jump they'll likely switch focus to something else next. ( besides the Zen4C cores which is likely yet another dimension they are fleshing out over time. )

For the Epyc package configurations these "Zen 4 " chiplets probably are winners for the next 1-2 years. Those are far fatter margins for AMD and help with their balance sheet. ( 6 years or so ago AMD was borrowing money to keep the lights on and spend some focused money on R&D. ). If the Bergamo/Zen4C SoC are competitive against Ampere's ARM SoCs that will help too in the higher margin space.

In the desktop space, I doubt we are seeing all the synergies between Ryzen 7000 and RX 7000 GPUs laid out in the open yet. ( These (especially in the Ryzen 3/5 zone ) will certainly pair better with the RX 6500 stuff released earlier this year. ).

AMD's slide mentions some AI Acceleration which hasn't been presented well so far. ( decent chance that is not a 'low' 15% increase. )


Also could be a later twist if 3D cache version 2.0 doesn't require the same level of clock hit that the first version required.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,309
3,900
. But if they manage to do it, it won't be because ARM is better than x86, but because Apple engineers are better than Intel/AMD ones. Again, the best that other ARM designers (not Apple) can do is match Intel's modern Atom cores at the same power consumption.


Errrrr. Really? Ampere isn't whipping all of AMD/Intel offerings , but it does most certainly beat some non-Atom based models ( EPYC 7742 is not an "Atom core") .


(and the next page of per core performance under load. Can point at the 8-thread max gap at the top, but they are also shipping packages with an order of magnitude more cores on a larger process node than Apple is currently (or will for the intermediate future) . It doesn't make them 'inept' designers... it is folks working with a different set of design constraints. )

This whole notion that Apple's designers are just "way smarter" and everyone else is a bumbler who couldn't make the cool list should stop. It isn't true. The Neoverse designs are not hopelessly behind.

[ No versus AMD/Intel metrics here but Graviton 3 does better than Graviton 2.

https://www.daemonology.net/blog/2022-05-23-FreeBSD-Graviton-3.html

]
 
  • Like
Reactions: psychicist

leman

macrumors Core
Oct 14, 2008
19,213
19,102
Errrrr. Really? Ampere isn't whipping all of AMD/Intel offerings , but it does most certainly beat some non-Atom based models ( EPYC 7742 is not an "Atom core") .

I am talking about single-core performance, not aggregated throughput for multi-core monsters. Ampere scores 5.2/6.1 in SPEC, Intel atom (Gracemont) scores 5.25/7.66: https://www.anandtech.com/show/1704...hybrid-performance-brings-hybrid-complexity/7

EPYC hardly represents peak single core performance of AMD CPUs as it’s clocked painfully low to allow multiple cores packed on a die. I mean, EPYC 7742 is Zen2 at max 3.4ghz, same CPU cores in other products go up to 4.7ghz. And Zen3 is faster yet.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,309
3,900
I am talking about single-core performance, not aggregated throughput for multi-core monsters. Ampere scores 5.2/6.1 in SPEC, Intel atom (Gracemont) scores 5.25/7.66: https://www.anandtech.com/show/1704...hybrid-performance-brings-hybrid-complexity/7

Testing an older design against the newest from Intel isn't necessarily revealing. Qualcomm X2 gets a 4.82.


The same die reflowed on TSMC N7 is 10% faster.

https://www.anandtech.com/show/1739...n-1-moving-to-tsmc-for-more-speed-lower-power

Don't have the confirming benchmarks yet but 1.1 * 4.82 = 5.3

Since 5.25 - 5.2 was such a huge difference , I guess .05 over is a big deal.

Throwing "Atom" on Gracemont is a stretch. It is not what 'Atom' was for a long time. Gracemont is meant to be a better Skylake.

The X2 and Neoverse cores are related. X2 have a slightly bigger area budget allocated to them. The notion that the other folks are not smart or talented enough to figure it out is highly dubious. The porting from the Samsung to TSMC N4 is indicative that there are substantive other issues involved than just the core designer talent.

If another 10% dropped in on the X2 with a N4-N3 move, that would put it at 5.8 with minor adjustments to architectural design . 5.8 is in the same zone as the AMD 5800 HS. Zero changes. No bigger cache , nothing.
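The back-of-the-envelope projection above is just compounding a 10% uplift. A short Python sketch of it follows, with the caveat that the 4.82 starting score and the 10% per-move uplift are taken from the post, the second step is speculation by the poster, and `project_score` is an invented name.

```python
def project_score(score, uplift, steps):
    """Compound a fractional uplift (e.g. 0.10) over a number of process moves."""
    return score * (1 + uplift) ** steps


x2_score = 4.82  # SPEC figure quoted in the post for the X2

# One step is the Samsung -> TSMC port, two steps adds a hypothetical N4 -> N3 move.
for steps in (1, 2):
    projected = project_score(x2_score, 0.10, steps)
    print(f"after {steps} x 10%: {projected:.1f}")
```

Rounded to one decimal, the two steps reproduce the 5.3 and 5.8 figures used in the post.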

Apple really has not demonstrated that they can out hustle everyone on an even playing field ( same process node , same design constraints ).

[ And it is shipping. Gracemont Granite Ridge is where?

https://www.techpowerup.com/270718/...-24-core-processor-features-pcie-4-0-and-ddr5

NOTE: this is 2020 ... Intel didn't switch to "Intel 7" "Intel 4" until 2021. What Intel shipped was something that probably was initially targeted at Intel 4 and backported to Intel 7 . That is why it is "not so small". It is only "small" relative to the Golden Cove (Gen12) P core. It would be more like a classic Atom core on Intel 4. ]


EPYC hardly represents peak single core performance of AMD CPUs as it’s clocked painfully low to allow multiple cores packed on a die. I mean, EPYC 7742 is Zen2 at max 3.4ghz, same CPU cores in other products go up to 4.7ghz. And Zen3 is faster yet.

It is a big core that got beat. AMD is doing a Zen4C .... if everyone else was so hugely lame why are they putting effort there? It is because they are not as lame as you're making them out to be. On an even playing field AMD (and Intel ) will have issues.

The instruction sets are not the big hang up here, nor is the talent at the respective camps. If you hog-tie one of the groups with limited transistor budgets , design constraints , and/or cost limitations you'll get differences in performance.
 
  • Like
Reactions: psychicist

leman

macrumors Core
Oct 14, 2008
19,213
19,102
What problems do you foresee RISC-V having?

Sorry, I should have been more clear. RISC-V is definitely viable, it's just I am not sure why it would be any better for general purpose computing than ARM. The big advantage of RISC-V is its open nature, which allows small teams on tight budgets to experiment and innovate. But one can't make a competitive general-purpose CPU without significant investments in talent and resources, and a company that can pull it off won't have any problems with purchasing an ARM license. And of course, RISC-V still lacks basic functionality like low-latency, flexible SIMD. And while vendors can add whatever they want, the fragmented nature of RISC-V extensions means lower overall quality of compilers and common toolchains. So... why not just take a proven ISA with working, high quality tooling?
 

BigPotatoLobbyist

macrumors 6502
Dec 25, 2020
301
155
As for the ISA debates: I'm sure a relatively open ISA like RISC-V - probably a revised version - has potential to "win out" as long as standards are formed but as of this very moment RISC-V itself has deficiencies and there is not a high performance RISC-V core. Even SiFive's "Horse Creek" core on Intel 4 is targeting Arm ca75/76-tier performance with their next core targeting the A78, and the former core - the P550, still has yet to be released. And that's before we arrive at software, or standards for the extensions, and as leman says (I agree) their SIMD has issues.

Finally, an architectural license for Arm V8/V9 can be had for a sum within Arm's reach for a firm with the requisite resources for building a CPU anyways. Ask startups like Ampere - who are going custom with Arm now.

In the long term, I do expect an open-source ISA to "win" with some kind of patent-pooled organization guiding it, similar to Arm but "more open". But I think that while the majority of PC's and servers are still on X64, and Windows specifically hasn't even made the transition to Arm yet (in terms of usage data) it's probably a good idea to slow the roll on RISC-V: The Intel/AMD/M1/Arm killer narratives. Great for IOT/some embedded stuff though.


Also, on technical merits: I'd argue Arm V8 and Arm V9 are undeniably superior to X86_64 and RISC-V both in terms of which ISA offers the lowest overhead to designing high performance CPU's, and likewise for low power where things like X86's decode and memory model incur penalties on efficiency and relative performance. And even beyond that, SVE2 is just awesome relative to AVX-512 or RISC-V's Vector implementations.
 
  • Like
Reactions: Xiao_Xi

Xiao_Xi

macrumors 65816
Oct 27, 2021
1,482
921
I'd argue Arm V8 and Arm V9 are undeniably superior to X86_64 and RISC-V both in terms of which ISA offers the lowest overhead to designing high performance CPU's, and likewise for low power where things like X86's decode and memory model incur penalties on efficiency and relative performance. And even beyond that, SVE2 is just awesome relative to AVX-512 or RISC-V's Vector implementations.
Do you have a link that explains this?
 

TheRealAlex

macrumors 68030
Sep 2, 2015
2,863
2,019
7900X 16 Cores at 5.5Ghz each
32GB DDR5 RAM 6000
RTX 4090
a few M.2 drives
I really hope my Seasonic PRIME 1300 Platinum SSR-1300PD 1300W 80+ Platinum can hold up.
 

Xiao_Xi

macrumors 65816
Oct 27, 2021
1,482
921
It looks like AMD's new CPU will have a bigger jump in performance than AMD had previously stated.
During today’s Financial Analyst Day 2022, AMD clarified that it is targeting an 8 to 10% increase in IPC for the Zen 4 processors
AMD also clarified that Zen 4 processors would have >25% performance-per-watt and >35% overall performance improvements.
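As a rough sanity check on how an IPC gain and a clock gain stack, here is a first-order Python sketch. It assumes single-thread performance scales as IPC times clock, which ignores memory, power limits and sustained-versus-peak clocks. The 8 to 10% IPC range is from the quote above, the 4.9 and 5.5 GHz clocks are the 5950X and 7950X figures mentioned earlier in the thread, and `combined_uplift` is a made-up name.

```python
def combined_uplift(ipc_gain, clock_old_ghz, clock_new_ghz):
    """First-order model: performance scales with IPC multiplied by clock."""
    return (1 + ipc_gain) * (clock_new_ghz / clock_old_ghz) - 1


# IPC range from the AMD quote; clocks are the peak figures cited in the thread.
for ipc_gain in (0.08, 0.10):
    uplift = combined_uplift(ipc_gain, 4.9, 5.5)
    print(f"IPC +{ipc_gain:.0%} with 4.9 -> 5.5 GHz: about +{uplift:.1%}")
```

Treat the result as an optimistic ceiling for single-thread work rather than a prediction, since real benchmarks rarely hold the peak clock.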
 