
ManuelGomes

macrumors 68000
Dec 4, 2014
1,617
354
Aveiro, Portugal
Something's odd in all this.
490 being based on Vega: what of the rest of the lineup? Only one SKU in the 400 series based on Vega? The rest is all Fury?
Either AMD made Vega only for high-end, which is strange, or a new family will come up.
I'm not sure 490 isn't a dual Polaris.
And all Fury cards are Vega.
All possibilities are open.
At least 490 is confirmed to be >256b so either dual Polaris or indeed Vega - maybe Vega Lite/11.
And Vega 10 is Fury of course.
I guess it makes sense indeed.
 

koyoot

macrumors 603
Original poster
Jun 5, 2012
5,939
1,853
Manuel, there will be one Fury-badged GPU. The Big Vega.

The naming scheme is very simple. RX 460, RX 470, RX 480, RX 490, RX Fury. 2 Polaris designs, 2 Vega designs.

AMD did not make Vega only for the high end. Vega will also end up in Raven Ridge APUs. It is exactly the same architecture.

It cannot be simpler. Zen+Vega APUs - whole APU market. Polaris - mainstream discrete GPU market. Vega - high-end discrete GPU market.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
http://www.anandtech.com/show/10578...rs-micro-op-cache-memory-hierarchy-revealed/2

This is the best analysis of the Zen CPU. Now it is all down to core clocks on final silicon.

Anandtech hasn't dug into the UnCore part of the flavors of Zen yet. Perhaps after the Hotchips presentation.


They will have 32 lanes. Zen is a modular architecture. The 8 core part has 2 clusters, each built from 4 cores and 8 MB of L3 cache.

Both clusters need to have their own separate 16 PCIe lanes, for complete scalability.

4 cores, 8 MB L3 cache, 16 PCIe lanes.
8 cores, 16 MB L3 cache, 32 PCIe lanes.

Errrr. no. Here is the die photo of the 8 core model.

Zeppelin_Die_stitched_labelled.png

http://dresdenboy.blogspot.com/2016/08/some-last-chance-pre-hot-chips.html

The PCIe+FCH (south bridge) portion is not replicated per core cluster. The memory channel controller is replicated. The GMI-Link (GMI -- Global Memory Interconnect. Think HyperLink or Intel's QPI links) is replicated. The south bridge is not.


8 cores may have 16 or 32 lanes depending upon what UnCore Southbridge is attached. That is likely why the 32 core model has 32 PCI-e v3 lanes [an MCM package with two of these 8 core dies, each with 16; 16 * 2 = 32]. Therefore, the two package solution of the 32 core model has 64 [which lines up with Anandtech's motherboard layout].
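
The lane arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the 16 lanes per die and the two dies per package are the figures assumed in the post, not confirmed AMD specifications, and the function name is made up for the example.

```python
# Back-of-the-envelope PCIe lane count, following the reasoning in the post.
# ASSUMPTION: each 8 core die exposes 16 PCIe v3 lanes (not confirmed by AMD).
LANES_PER_DIE = 16


def total_lanes(dies_per_package: int, packages: int = 1) -> int:
    """Return the PCIe lane count for a system built from identical dies."""
    return LANES_PER_DIE * dies_per_package * packages


single_die = total_lanes(1)               # one die on its own
one_package = total_lanes(2)              # MCM with two dies
two_packages = total_lanes(2, packages=2) # dual socket system

print(single_die, one_package, two_packages)
```

If the southbridge on the die turns out to expose 32 lanes instead, only `LANES_PER_DIE` changes and the rest of the sums scale with it.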
 

cube

Suspended
May 10, 2004
17,011
4,972
Isn't the Vega for APU smaller than Vega 11?

Given how long it took for AMD to come up with Polaris, I would hope they would bring a lower end discrete Vega design in the near future.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
....
Bandwidth between the CPU and the GPU is not usually a bottleneck. With Zen topping out at 32 cores that is going to be a very big die.

For server APUs they can use a bigger MCM (multi chip module) package like IBM does for Power. For example, a Power 5 MCM with 4 Power5 dies and 4 L3 caches.
Power5.jpg

https://en.wikipedia.org/wiki/POWER5


The overall module is the "APU". So they could put a full sized 32 core die and a med-large GPU + HBM module into the same supersize (relative to a normal desktop socket) module. With a multiple die solution you then need a low latency, high bandwidth link to handle traffic between the dies. But you don't necessarily need the same substrate interposer to span both the GPU+HBM and the CPU die.

The bandwidth (and latency) is a bottleneck if you start to use NUMA, global memory. So if the CPU and GPU are both pulling/pushing data into the HBM memory, that could clog up fairly quickly. Especially on HPC kernels that lean on bisection bandwidth (and not mainstream desktop software).

Again I don't see the relevancy here for a Mac Pro. These "server" APUs are likely aimed at being fundamental building blocks for supercomputers (or mini-supercomputers).... stuff heading for machine rooms, not ultra quiet desktops.

I doubt they have any room left to fit any sort of GPU on there, especially one that has even mainstream performance.

There is no necessity to put everything on one die in a "System on a Chip" (SoC) solution. Many iOS devices have RAM stacked on top. The Intel Iris Pro solutions with eDRAM have the large eDRAM cache in a separate die placed in the same module as the CPU die. Same thing, only a bit bigger.
 

koyoot

macrumors 603
Original poster
Jun 5, 2012
5,939
1,853
Anandtech hasn't dug into the UnCore part of the flavors of Zen yet. Perhaps after the Hotchips presentation.




Errrr. no. Here is the die photo of the 8 core model.

Zeppelin_Die_stitched_labelled.png

http://dresdenboy.blogspot.com/2016/08/some-last-chance-pre-hot-chips.html

The PCIe+FCH (south bridge) portion is not replicated per core cluster. The memory channel controller is replicated. The GMI-Link (GMI -- Global Memory Interconnect. Think HyperLink or Intel's QPI links) is replicated. The south bridge is not.


8 cores may have 16 or 32 lanes depending upon what UnCore Southbridge is attached. That is likely why the 32 core model has 32 PCI-e v3 lanes [an MCM package with two of these 8 core dies, each with 16; 16 * 2 = 32]. Therefore, the two package solution of the 32 core model has 64 [which lines up with Anandtech's motherboard layout].
http://www.planet3dnow.de/vbulletin...95W-TDP-DDR4?p=5110384&viewfull=1#post5110384
Update #1: As "Crashtest" explains in a later posting, the respective Summit Ridge system (w/ Myrtle mainboard) seems to have at least 36 PCIe lanes. According to him, the listed configuration seems to be a bit chaotic. BTW, "Promotory" should actually be written "Promontory".
From the same blog ;).
 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
... to have at least 36 PCIe lanes. According to him, the listed configuration seems to be a bit chaotic. BTW, "Promotory" should actually be written "Promontory".
From the same blog ;).

36 PCIe lanes of what version? x8 lanes of PCI-e v2 isn't as competitive with Intel's offerings as x8 lanes of PCI-e v3. AMD might have leapfrogged the C612 ( X99) chipset a bit with v3 "top-to-bottom" but I suspect they haven't. Note that Intel isn't trying to put the Southbridge into most of their dies yet. There is a trade-off in doing that.
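
To put rough numbers on the v2 versus v3 gap, here is a small Python sketch. It uses the published per-lane signalling rates and line encodings (5 GT/s with 8b/10b for v2, 8 GT/s with 128b/130b for v3) and ignores packet/protocol overhead, so real-world throughput will be somewhat lower. The function and table names are made up for the example.

```python
# Approximate usable bandwidth of a PCIe link, per direction.
# Per-lane rates and encodings: v2 = 5 GT/s with 8b/10b,
# v3 = 8 GT/s with 128b/130b. Protocol (TLP/DLLP) overhead is ignored.
GENERATIONS = {
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
}


def link_bandwidth_gbs(version: int, lanes: int) -> float:
    """Return approximate GB/s in one direction for a link of the given width."""
    gt_per_s, efficiency = GENERATIONS[version]
    return gt_per_s * efficiency * lanes / 8  # 8 bits per byte


for version in (2, 3):
    print(f"x8 PCIe v{version}: {link_bandwidth_gbs(version, 8):.2f} GB/s")
```

So an x8 v3 link carries close to twice what an x8 v2 link does, which is why the lane count alone doesn't settle the comparison.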
 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
What about Thunderbolt?

You mean relevancy for the Mac Pro design? Chuckle.... this thread doesn't seem to be concerned about that.

If there is not enough PCI-e v3 bandwidth to go around, then it is a non starter even before you start to wade into boot/support issues.

Intel is still the sole supplier of Thunderbolt controllers. Not buying them bundled with CPU packages isn't going to be cheaper. Intel probably isn't going to bend over backwards with boot/compatibility issues. Likewise, I'm not sure Thunderbolt has gotten to the critical mass point where AMD cares about being "left out". The USB 3.1 Type C alternate modes of DP + USB gen 2 cover much of what the original TB v1 did. AMD has more than enough drama to fix to get back to being very competitive in the overall market without adding TB to the pile at the moment.

Unless there are more than a couple of other system vendors besides Apple who want TB v3, it doesn't really make a lot of sense for AMD to chase that. Especially if Apple is being Scrooge McDuck and not paying for the whole effort in advance. It is too high a risk at the moment to spend a significant amount of money and lose out in a design bake-off. AMD needs to get healthy and then they may be able to afford that kind of stuff.
 

cube

Suspended
May 10, 2004
17,011
4,972
I am not concerned about the Mac Pro.

I care if a PC motherboard lacks Thunderbolt.
 

koyoot

macrumors 603
Original poster
Jun 5, 2012
5,939
1,853
Nobody here in their right mind thinks that an 8 core, 95W CPU with Haswell/Broadwell level of performance would be suitable for the Mac Pro. Especially with a smaller number of PCIe lanes and dual channel memory.

What cube is asking about is supposedly out of interest for his potential Zen build at home. IIRC, Thunderbolt was supposed to work with any brand, not only Intel, but also AMD, ARM, and... Nvidia.

Back to thread for a second. 95W APUs also available? That is more interesting.
 

Mago

macrumors 68030
Aug 16, 2011
2,789
912
Beyond the Thunderdome
You mean relevancy for the Mac Pro design? Chuckle.... this thread doesn't seem to be concerned about that.

If there is not enough PCI-e v3 bandwidth to go around, then it is a non starter even before you start to wade into boot/support issues.

Intel is still the sole supplier of Thunderbolt controllers. Not buying them bundled with CPU packages isn't going to be cheaper. Intel probably isn't going to bend over backwards with boot/compatibility issues. Likewise, I'm not sure Thunderbolt has gotten to the critical mass point where AMD cares about being "left out". The USB 3.1 Type C alternate modes of DP + USB gen 2 cover much of what the original TB v1 did. AMD has more than enough drama to fix to get back to being very competitive in the overall market without adding TB to the pile at the moment.

Unless there are more than a couple of other system vendors besides Apple who want TB v3, it doesn't really make a lot of sense for AMD to chase that. Especially if Apple is being Scrooge McDuck and not paying for the whole effort in advance. It is too high a risk at the moment to spend a significant amount of money and lose out in a design bake-off. AMD needs to get healthy and then they may be able to afford that kind of stuff.
A trash can thermal core could easily handle 3x APUs; a nMP based on Zen APUs should be a monster.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
A trash can thermal core could easily handle 3x APUs,

No. The inter-socket/package connections that AMD will have (if they could be used with an APU) likely have distance limitations that the current Mac Pro design can't get around. Also, as mentioned elsewhere, the additional RAM DIMMs go where?

The desktop/'single socket' APUs can't really be connected. Having three separate computers inside the Mac Pro case doesn't really buy all that much.

What cube is asking about is supposedly out of interest for his potential Zen build at home.

This forum is Macs > Desktop > Mac Pro not

General PC > Desktop > Home Builder.

If the interest is in the homebuilder topic, then it would probably be more productive to go to a homebuilder forum.



IIRC, Thunderbolt was supposed to work with any brand, not only Intel, but also AMD, ARM, and... Nvidia.

It isn't going to work for "free" with zero effort. AMD is sticking their toe in the water with XConnect.

http://www.anandtech.com/show/10133/amd-xconnect-external-radeons

Pragmatically though, that is driven more so from the AIO card market and not the CPU "division" inside AMD. Intel based laptops/SFF/etc. that can't easily take a card internally do have a TB v3 socket. It is more a driver issue than a hardware/system boot one (the AIO card just gets the transparent 'it is just a PCI-e switch with hot plug-and-play' view of TB from outside the base computer system).


Eventually if AMD starts to get more laptop design wins then perhaps ....
"... . It remains to be seen if laptops with AMD chips get Thunderbolt 3 ports, though it looks likely. ..."
http://www.pcworld.com/article/3051...questions-about-amds-bristol-ridge-chips.html
[As this article points out, there is some other Intel tech that may get some traction: RealSense, Optane, etc. AMD can't ignore all of them just because Intel is behind them.]
But until AMD starts to hear " well you would have made our design bake off, but our laptop needs to have Thunderbolt. " TB isn't going to be a priority.

If the primary traction AMD gets is mainly on the more classic desktop form factor then Thunderbolt is unlikely to be a priority.


Back to thread for a second. 95W APUs also available? That is more interesting.

Interesting from a Mac perspective how? If this thread is not going to be focused on anything Mac then it should be shut down.
 

cube

Suspended
May 10, 2004
17,011
4,972
This is a Zen thread. Shutting down a bit of discussion of the subject in general is intolerance.
 

Zarniwoop

macrumors 65816
Aug 12, 2009
1,036
759
West coast, Finland
And I think a custom APU is the reason why Apple went AMD-only three years ago. Intel is too pricey, and Apple wants a bigger cut from Macs. With a custom APU, a similar concept to what the console makers use, Apple could do the same in the x86-64 world as they've done with Axx chips in the ARM world.

To me it seems that this is pointing to next year and macOS 10.13. Metal should be finished, and OpenCL will reach the 2.x version thanks to the HSA layer. The custom APU revolution will start from desktops: iMacs, a new Mini and maybe the Mac Pro. Then the MacBook Pro, but the MacBook will stay with Intel for some time, unless the iPad Pro has become something useful.
 

koyoot

macrumors 603
Original poster
Jun 5, 2012
5,939
1,853
http://www.anandtech.com/show/10591...art-2-extracting-instructionlevel-parallelism

Part 2 of Anandtech analysis of Zen arch.

http://venturebeat.com/2016/08/18/amds-takes-biggest-jab-at-intel-in-years-with-zen-processor/

One more thing: http://wccftech.com/amd-zen-architecture-hot-chips/#comment-2855691209 one of the comments there.
We'll have to wait for benchmarks, but I'm growing ever more suspicious that Zen's SMT implementation is more like Power8's than anything Intel's produced so far. Intel's approach has been to allow a second thread to use unused CPU resources, but doesn't really over-provision those resources (a single thread can very nearly saturate the whole CPU). On Power8, they can scale up to 8 threads per core (Zen will only do 2), but they make that viable by doubling down on key CPU resources in the first place (Instruction Cache, rename registers, etc.). The end result is that the second SMT thread on Intel increases overall performance by around 15-25%, but on Power8 the second SMT thread can increase overall performance by around 60% in some workloads. In Layman's terms, Power8's 'hyperthreads' are more useful than Intel's.

AMD haven't talked about rename registers yet, but they have revealed that the instruction cache is 64KB per core; perhaps not-so-coincidentally, that's double the size of Skylake's instruction cache, and the same size as Power8's. The L1 Data cache is only 32K in all of these processors, but it's rather odd in processor design to have your instruction cache be twice the size of your L1 data cache -- unless you have a good reason. There are only two reasons I can think of -- either that second thread chews through a lot more instructions than in competing SMT designs, or possibly the uOp Cache can spill to L1. Looking at the slide from HotChips that shows which CPU resources are exclusive, competitively shared, or arithmetically arbitrated has me leaning toward the former, though they might not have overprovisioned CPU resources enough to match Power8 fully. There were also rumors months back about Zen doing some really novel things with SMT, which would seem to back that up.

The implication of that would be that Zen could run at a lower clockspeed than Intel's current Broadwell DE but still match it in overall threaded performance (while perhaps giving up 10-15% single-threaded performance, not clock-normalized). For the mainstream, they could release a quad-core CPU at similar clocks to Skylake, and outperform it in threaded workloads. In gaming workloads, since current consoles make 6-7 threads available to games, a quad-core Zen with 4 hyper-threads giving ~60% additional performance would give a lot better performance than a quad-core i7 with 4 hyper-threads giving ~20% additional performance. In fact, that Zen would have a throughput comparable to 6-7 dedicated cores.

We won't know until someone does an architecture deep-dive or we have benches showing SMT gains much larger than Intel's. But it's looking increasingly likely from what I see.
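
The quad-core comparison in that comment comes down to simple arithmetic, sketched here in Python. The ~60% and ~20% SMT gains are the commenter's guesses (taken from Power8 and Intel figures), not measured Zen numbers, and the model naively assumes the gain applies uniformly to every core. The function name is made up for the example.

```python
def core_equivalents(cores: int, smt_gain: float) -> float:
    """Throughput in 'dedicated core' units, assuming every core's second
    SMT thread adds the same fractional gain (a deliberately naive model)."""
    return cores * (1.0 + smt_gain)


# ASSUMPTION: gains are the speculative figures from the comment above.
zen_quad = core_equivalents(4, 0.60)  # hypothetical Zen quad-core
i7_quad = core_equivalents(4, 0.20)   # quad-core i7 with Hyper-Threading

print(f"Zen quad: {zen_quad:.1f} core-equivalents")
print(f"i7 quad:  {i7_quad:.1f} core-equivalents")
```

That is where the "6-7 dedicated cores" figure comes from: 4 x 1.6 lands at 6.4, against 4.8 for the i7 under the same model.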
 

Mago

macrumors 68030
Aug 16, 2011
2,789
912
Beyond the Thunderdome
No. The inter-socket/package connections that AMD will have (if they could be used with an APU) likely have distance limitations that the current Mac Pro design can't get around. Also, as mentioned elsewhere, the additional RAM DIMMs go where?

The desktop/'single socket' APUs can't really be connected. Having three separate computers inside the Mac Pro case doesn't really buy all that much.
Dec, the technology required to interconnect CPUs is old. Given that AMD foresees those APUs as a key offering for HPC, AMD should already have some provision for it, named Coherent Link (I also read somewhere that they will launch an HPE Moonshot card).

Where do the DIMMs go? At the back of each board, just like you now have the SSD on the back of the GPU; it should be a 360 degree DIMM distribution.

Also, one thing I missed: a side benefit of a multiple APU system is the availability of plenty of PCIe lanes for things like NVMe drives and Thunderbolt (or Thunderbolt-like) interfaces.

Imagine a 96 core Mac Pro with 12 DIMMs, 3 NVMe drives and 10 TB3 ports. It's possible... likely? Unlikely.

Of course, everything said here about a tcMP with Zen is purely theoretical speculation; most likely Apple will follow the single socket way.
 

cube

Suspended
May 10, 2004
17,011
4,972
I expect that since one was traditionally able to build a quite inexpensive quad Opteron system.
 