AMD Discussion

cube · Aug 23, 2016

Is it really so, or is one of them a dual GPU?

ManuelGomes · Aug 23, 2016

Something's odd in all this.
490 being based on Vega: what of the rest of the lineup? Only one SKU in the 400 series based on Vega? The rest is all Fury?
Either AMD made Vega only for high-end, which is strange, or a new family will come up.
I'm not sure 490 isn't a dual Polaris.
And all Fury cards are Vega.
All possibilities are open.
At least 490 is confirmed to be >256b so either dual Polaris or indeed Vega - maybe Vega Lite/11.
And Vega 10 is Fury of course.
I guess it makes sense indeed.

koyoot · Aug 23, 2016

Manuel, there will be one Fury-badged GPU. The Big Vega.

The naming scheme is very simple: RX 460, RX 470, RX 480, RX 490, RX Fury. 2 Polaris designs, 2 Vega designs.

AMD did not make Vega only for the high end. Vega will also end up in Raven Ridge APUs. It is exactly the same architecture.

It cannot be simpler. Zen+Vega APUs: the whole APU market. Polaris: the mainstream discrete GPU market. Vega: the high-end discrete GPU market.

deconstruct60 · Aug 23, 2016

koyoot said:
http://www.anandtech.com/show/10578...rs-micro-op-cache-memory-hierarchy-revealed/2

This is the best analysis of the Zen CPU. Now it is all down to core clocks on final silicon.

Anandtech hasn't dug into the UnCore part of the flavors of Zen yet. Perhaps after the Hotchips presentation.

They will have 32 lanes. Zen is a modular architecture. It has 2 clusters, each built from 4 cores and 8 MB of L3 cache.

Both clusters need to have their own separate 16 PCIe lanes, for complete scalability.

4 cores, 8 MB L3 cache, 16 PCIe lanes.
8 cores, 16 MB L3 cache, 32 PCIe lanes.

Errrr. no. Here is the die photo of the 8 core model.

http://dresdenboy.blogspot.com/2016/08/some-last-chance-pre-hot-chips.html

The PCIe+FCH (south bridge) portion is not replicated per core cluster. The memory channel controller is replicated. The GMI-Link (GMI -- Global Memory Interconnect. Think HyperLink or Intel's QPI links) is replicated. The south bridge is not.

8 cores may have 16 or 32 lanes depending upon what UnCore Southbridge is attached. That is likely why the 32 core model has 32 PCI-e v3 lanes. [ an MCM package with two of these 8 core dies each with 16 ; 16 * 2 = 32 ] Therefore, the two package solution of the 32 core model has 64 [ which lines up with Anandtech's motherboard layout ]
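To make the lane arithmetic above easier to check, here is a rough sketch in Python. The 16 lanes per die, two dies per package and two packages are just the figures assumed in this post, not confirmed AMD specs, and `total_lanes` is a hypothetical helper name made up for illustration.

```python
# Figures below are the assumptions from the post, not confirmed specs.
LANES_PER_DIE = 16     # assumed: one PCIe/southbridge block per 8 core die
DIES_PER_PACKAGE = 2   # assumed: MCM package with two dies
PACKAGES = 2           # assumed: two package (dual socket) system


def total_lanes(lanes_per_die: int, dies_per_package: int, packages: int = 1) -> int:
    """Total PCIe lanes if every die contributes its own lanes and none are shared."""
    return lanes_per_die * dies_per_package * packages


if __name__ == "__main__":
    print("one package :", total_lanes(LANES_PER_DIE, DIES_PER_PACKAGE))
    print("two packages:", total_lanes(LANES_PER_DIE, DIES_PER_PACKAGE, PACKAGES))
```

With those assumptions it gives 32 lanes for one package and 64 for two. If some lanes are consumed by the inter-package links, the real count would be lower.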

cube · Aug 23, 2016

Isn't the Vega for APU smaller than Vega 11?

Given how long it took for AMD to come up with Polaris, I would hope they would bring a lower end discrete Vega design in the near future.

deconstruct60 · Aug 23, 2016

Stacc said:
....
Bandwidth between the CPU and the GPU is not usually a bottleneck. With Zen topping out at 32 cores that is going to be a very big die.

For server APUs they can use a bigger MCM (multi chip module) package like IBM does for Power. For example, a Power 5 MCM with 4 Power5 dies and 4 L3 caches.

https://en.wikipedia.org/wiki/POWER5

The overall module is the "APU". So you could put a full-sized 32 core die and a medium-to-large GPU+HBM module into the same supersize (relative to a normal desktop socket) module. If you have a multiple-die solution then you need a low-latency, high-bandwidth link to handle traffic between the dies. But you also don't necessarily need the same substrate interposer to span both the GPU+HBM and the CPU die.

The bandwidth (and latency) is a bottleneck if you start to use NUMA, global memory. So if the CPU and GPU are both pulling/pushing data into the HBM memory, that could clog up fairly quickly. Especially on HPC kernels that lean on bisection bandwidth (and not mainstream desktop software).

Again I don't see the relevancy here for a Mac Pro. These "server" APUs are likely aimed at being fundamental building blocks for supercomputers (or mini-supercomputers) .... stuff heading for machine rooms, not ultra quiet desktops.

I doubt they have any room left to fit any sort of GPU on there, especially one that has even mainstream performance.

There is no necessity to put everything on one die in a "System on a Chip" (SoC) solution. Many iOS devices have RAM stacked on top. The Intel Iris Pro solutions with eDRAM have the large eDRAM cache in a separate die placed in the same module as the CPU die. Same thing, only a bit bigger.

koyoot · Aug 23, 2016

deconstruct60 said:
Anandtech hasn't dug into the UnCore part of the flavors of Zen yet. Perhaps after the Hotchips presentation.

Errrr. no. Here is the die photo of the 8 core model.

http://dresdenboy.blogspot.com/2016/08/some-last-chance-pre-hot-chips.html

The PCIe+FCH (south bridge) portion is not replicated per core cluster. The memory channel controller is replicated. The GMI-Link (GMI -- Global Memory Interconnect. Think HyperLink or Intel's QPI links) is replicated. The south bridge is not.

8 cores may have 16 or 32 lanes depending upon what UnCore Southbridge is attached. That is likely why the 32 core model has 32 PCI-e v3 lanes. [ an MCM package with two of these 8 core dies each with 16 ; 16 * 2 = 32 ] Therefore, the two package solution of the 32 core model has 64 [ which lines up with Anandtech's motherboard layout ]

http://www.planet3dnow.de/vbulletin...95W-TDP-DDR4?p=5110384&viewfull=1#post5110384
Update #1: As "Crashtest" explains in a later posting, the respective Summit Ridge system (w/ Myrtle mainboard) seems to have at least 36 PCIe lanes. According to him, the listed configuration seems to be a bit chaotic. BTW, "Promotory" should actually be written "Promontory".
From the same blog


cube · Aug 23, 2016

What about Thunderbolt?

deconstruct60 · Aug 23, 2016

koyoot said:
... to have at least 36 PCIe lanes. According to him, the listed configuration seems to be a bit chaotic. BTW, "Promotory" should actually be written "Promontory".
From the same blog.

36 PCIe lanes of what version? x8 lanes of PCI-e v2 isn't as competitive with Intel's offerings as x8 lanes of PCI-e v3. AMD might have leapfrogged the C612 ( X99) chipset a bit with v3 "top-to-bottom" but I suspect they haven't. Note that Intel isn't trying to put the Southbridge into most of their dies yet. There is a trade-off in doing that.
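To put rough numbers on the v2 versus v3 point, here is a minimal sketch. The per-lane signalling rates and line encodings (5 GT/s with 8b/10b for v2, 8 GT/s with 128b/130b for v3) come from the PCIe specs. The helper name `link_bandwidth_gb_s` is made up for illustration, and packet/protocol overhead is ignored, so real throughput is lower.

```python
# Per-lane raw rate (GT/s) and line-encoding efficiency for each PCIe generation.
PCIE_GENERATIONS = {
    2: {"gt_per_s": 5.0, "encoding": 8 / 10},     # 8b/10b encoding
    3: {"gt_per_s": 8.0, "encoding": 128 / 130},  # 128b/130b encoding
}


def link_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Approximate bandwidth per direction in GB/s, ignoring protocol overhead."""
    gen = PCIE_GENERATIONS[generation]
    # GT/s * encoding efficiency = usable Gbit/s per lane; divide by 8 for GB/s.
    per_lane_gb_s = gen["gt_per_s"] * gen["encoding"] / 8
    return per_lane_gb_s * lanes


if __name__ == "__main__":
    for generation in (2, 3):
        bandwidth = link_bandwidth_gb_s(generation, lanes=8)
        print(f"PCIe v{generation} x8: {bandwidth:.2f} GB/s per direction")
```

By this estimate an x8 v2 link is about 4 GB/s per direction and an x8 v3 link is a bit under 8 GB/s, so the same lane count carries nearly twice the bandwidth on v3.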

koyoot · Aug 23, 2016

dec60, just wait and see


deconstruct60 · Aug 23, 2016

cube said:
What about Thunderbolt?

You mean relevancy for the Mac Pro design? Chuckle.... this thread doesn't seem to be concerned about that.

If there is not enough PCI-e v3 bandwidth to go around, then it is a non-starter even before you start to wade into boot/support issues.

Intel is still the sole supplier of Thunderbolt controllers. Not buying them bundled with CPU packages isn't going to be cheaper. Intel probably isn't going to bend over backwards with boot/compatibility issues. Likewise, I'm not sure Thunderbolt has gotten to the critical mass point where AMD cares about being "left out". The USB 3.1 Type C alternative modes of DP+USB gen 2 cover much of what the original TB v1 did. AMD has more than enough drama to fix to get back to being very competitive in the overall market without adding TB to the pile at the moment.

Unless there are more than a couple of other system vendors who want TB v3 who are not Apple, it doesn't really make a lot of sense for AMD to chase that. Especially if Apple is being Scrooge McDuck and not paying for the whole effort in advance. It is too high a risk at the moment to spend a significant amount of money and lose out in a design bake-off. AMD needs to get healthy and then they may be able to afford that kind of stuff.

cube · Aug 23, 2016

I am not concerned about the Mac Pro.

I care if a PC motherboard lacks Thunderbolt.

koyoot · Aug 23, 2016

Nobody here in their right mind thinks that an 8 core, 95W CPU with Haswell/Broadwell level of performance would be suitable for the Mac Pro. Especially with a smaller number of PCIe lanes and dual channel memory.

What cube is asking about is presumably of interest for his potential Zen build at home. IIRC, Thunderbolt was supposed to work with any brand, not only Intel, but also AMD, ARM, and... Nvidia.

Back to thread for a second. 95W APUs also available? That is more interesting.

Mago · Aug 23, 2016

deconstruct60 said:
You mean relevancy for the Mac Pro design? Chuckle.... this thread doesn't seem to be concerned about that.

If there is not enough PCI-e v3 bandwidth to go around, then it is a non-starter even before you start to wade into boot/support issues.

Intel is still the sole supplier of Thunderbolt controllers. Not buying them bundled with CPU packages isn't going to be cheaper. Intel probably isn't going to bend over backwards with boot/compatibility issues. Likewise, I'm not sure Thunderbolt has gotten to the critical mass point where AMD cares about being "left out". The USB 3.1 Type C alternative modes of DP+USB gen 2 cover much of what the original TB v1 did. AMD has more than enough drama to fix to get back to being very competitive in the overall market without adding TB to the pile at the moment.

Unless there are more than a couple of other system vendors who want TB v3 who are not Apple, it doesn't really make a lot of sense for AMD to chase that. Especially if Apple is being Scrooge McDuck and not paying for the whole effort in advance. It is too high a risk at the moment to spend a significant amount of money and lose out in a design bake-off. AMD needs to get healthy and then they may be able to afford that kind of stuff.

A trash can thermal core could easily handle 3x APU; an nMP based on Zen APUs would be a monster.

deconstruct60 · Aug 23, 2016

Mago said:
A trash can thermal core could easily handle 3x APU,

No. The inter-socket/package connections that AMD will have (if they could be used with an APU) likely have distance limitations that the current Mac Pro design can't get around. Also, as mentioned elsewhere, the additional RAM DIMMs go where?

The desktop/'single socket' APUs can't really be connected. Having three separate computers inside the Mac Pro case doesn't really buy all that much.

koyoot said:
What cube is asking about is presumably of interest for his potential Zen build at home.

This forum is Macs > Desktop > Mac Pro, not General PC > Desktop > Home Builder.

If the interest is in the homebuilder topic then it would probably be more productive to go to a homebuilder forum.

IIRC, Thunderbolt was supposed to work with any brand, not only Intel, but also AMD, ARM, and... Nvidia.

It isn't going to work for "free" with zero effort. AMD is sticking their toe in the water with XConnect.

http://www.anandtech.com/show/10133/amd-xconnect-external-radeons

Pragmatically though, that is driven more by the AIO card market and not the CPU "division" inside AMD. Intel based laptops/SFF/etc. that can't easily take a card internally have a TB v3 socket. It is more a driver issue than a hardware/system boot one (the AIO card just gets the transparent 'it is just a PCI-e switch with hot plug-and-play' view of TB from outside the base computer system).

Eventually if AMD starts to get more laptop design wins then perhaps ....
"... . It remains to be seen if laptops with AMD chips get Thunderbolt 3 ports, though it looks likely. ..."
http://www.pcworld.com/article/3051...questions-about-amds-bristol-ridge-chips.html
[ as this article points out, there is some other Intel tech that may get some traction: RealSense, Optane, etc. AMD can't ignore all of it just because Intel is behind it. ]
But until AMD starts to hear " well you would have made our design bake off, but our laptop needs to have Thunderbolt. " TB isn't going to be a priority.

If the primary traction AMD gets is mainly on the more classic desktop form factor then Thunderbolt is unlikely to be a priority.

Back to thread for a second. 95W APUs also available? That is more interesting.

Interesting from a Mac perspective how? If this thread is not going to be focused on anything Mac then it should be shut down.

cube · Aug 24, 2016

This is a Zen thread. Shutting down a bit of discussion of the subject in general is intolerance.

Zarniwoop · Aug 24, 2016

And I think a custom APU is the reason why Apple became AMD-only three years ago. Intel is too pricey, and Apple wants a bigger cut from Macs. With a custom APU, a similar concept to what the console makers use, Apple could do the same in the x86-64 world as they've done with Axx chips in the ARM world.

To me it seems that this is pointing to next year and macOS 10.13. Metal should be finished, and OpenCL will reach version 2.x thanks to the HSA layer. The custom APU revolution will start from desktops: iMacs, a new Mini and maybe the Mac Pro. Then the MacBook Pro, but the MacBook will stay with Intel for some time, unless the iPad Pro has become something useful.

cube · Aug 24, 2016

I don't care about a proprietary Metal API.

Apple should support Vulkan.

Same reason why I prefer OpenCL to CUDA, and OpenGL to DirectX.

koyoot · Aug 24, 2016

http://www.anandtech.com/show/10591...art-2-extracting-instructionlevel-parallelism

Part 2 of Anandtech analysis of Zen arch.

http://venturebeat.com/2016/08/18/amds-takes-biggest-jab-at-intel-in-years-with-zen-processor/

One more thing: http://wccftech.com/amd-zen-architecture-hot-chips/#comment-2855691209 one of the comments there.

We'll have to wait for benchmarks, but I'm growing ever more suspicious that Zen's SMT implementation is more like Power8's than anything Intel's produced so far. Intel's approach has been to allow a second thread to use unused CPU resources, but doesn't really over-provision those resources (a single thread can very nearly saturate the whole CPU). On Power8, they can scale up to 8 threads per core (Zen will only do 2), but they make that viable by doubling down on key CPU resources in the first place (Instruction Cache, rename registers, etc.). The end result is that the second SMT thread on Intel increases overall performance by around 15-25%, but on Power8 the second SMT thread can increase overall performance by around 60% in some workloads. In Layman's terms, Power8's 'hyperthreads' are more useful than Intel's.

AMD haven't talked about rename registers yet, but they have revealed that the instruction cache is 64KB per core; perhaps not-so-coincidentally, that's double the size of Skylake's instruction cache, and the same size as Power8's. The L1 Data cache is only 32K in all of these processors, but it's rather odd in processor design to have your instruction cache be twice the size of your L1 data cache -- unless you have a good reason. There are only two reasons I can think of: either that second thread chews through a lot more instructions than in competing SMT designs, or possibly the uOp Cache can spill to L1. Looking at the slide from HotChips that shows which CPU resources are exclusive, competitively shared, or arithmetically arbitrated has me leaning toward the former, though they might not have overprovisioned CPU resources enough to match Power8 fully. There were also rumors months back about Zen doing some really novel things with SMT, which would seem to back that up.

The implication of that would be that Zen could run at a lower clockspeed than Intel's current Broadwell DE but still match it in overall threaded performance (but perhaps giving up 10-15% single-threaded performance (not clock-normalized)). For the mainstream, they could release a quad-core CPU at similar clocks to Skylake, and outperform it in threaded workloads. In gaming workloads, since current consoles make 6-7 threads available to games, a quad-core Zen with 4 hyper-threads giving ~60% additional performance would give a lot better performance than a quad-core i7 with 4 hyper-threads giving ~20% additional performance. In fact, that Zen would have a throughput comparable to 6-7 dedicated cores.

We won't know until someone does an architecture deep-dive or we have benches showing SMT gains much larger than Intel's. But it's looking increasingly likely from what I see.
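The "comparable to 6-7 dedicated cores" figure in that comment is easy to sanity check with a rough sketch. The ~60% and ~20% SMT gains are the commenter's guesses, not measurements, and `effective_cores` is a hypothetical helper that simply scales core count by the SMT uplift (it assumes equal per-core performance, which is itself a big assumption).

```python
def effective_cores(physical_cores: int, smt_gain: float) -> float:
    """Throughput in 'single-thread core equivalents', given an SMT uplift per core."""
    return physical_cores * (1.0 + smt_gain)


if __name__ == "__main__":
    # SMT gains below are the commenter's speculation, not benchmark results.
    zen_quad = effective_cores(4, smt_gain=0.60)
    i7_quad = effective_cores(4, smt_gain=0.20)
    print(f"quad-core Zen (assumed +60% SMT): {zen_quad:.1f} core equivalents")
    print(f"quad-core i7  (assumed +20% SMT): {i7_quad:.1f} core equivalents")
```

Under those assumptions the Zen quad comes out at about 6.4 core equivalents against about 4.8 for the i7, which is where the 6-7 threads comparison comes from.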

Mago · Aug 24, 2016

deconstruct60 said:
No. The inter-socket/package connections that AMD will have (if they could be used with an APU) likely have distance limitations that the current Mac Pro design can't get around. Also, as mentioned elsewhere, the additional RAM DIMMs go where?

The desktop/'single socket' APUs can't really be connected. Having three separate computers inside the Mac Pro case doesn't really buy all that much.

Dec, the technology required to interconnect CPUs is old. Given that AMD foresees those APUs as a key offering for HPC, AMD should already have some provisions for it, and calls it Coherent Link (also, I read somewhere they will launch an HPE Moonshot card).

Where do the DIMMs go? At the back of each board, just like you now have the SSD on the back of the GPU; it would be a 360 degree DIMM distribution.

Also, I forgot: a side benefit of a multiple-APU system is the availability of plenty of PCIe lanes for things like NVMe drives and Thunderbolt (or Thunderbolt-like) interfaces.

Imagine a 96 core Mac Pro with 12 DIMMs, 3 NVMe drives and 10 TB3 ports. It's possible ... likely? Unlikely.

Of course, everything said here about a tcMP with Zen is purely theoretical speculation; most likely Apple will follow the single socket way.

ManuelGomes · Aug 24, 2016

http://www.anandtech.com/show/10585...hmark-is-zen-actually-2-faster-than-broadwell
3 APUs would look very symmetrical but it's a no go for sure. Maybe when Naples gets an APU with Vega inside...

Mago · Aug 24, 2016

A bit old, but this page confirms 4 socket Zen server CPUs and dual socket APUs

http://vrworld.com/2016/02/12/cern-confirms-amd-zen-high-end-specifications/

Still, not that bad: 32 cores / 64 threads on a dual socket tcMac Pro, with up to 8 DIMM slots and 4/8GB of HBM2 RAM each ...

cube · Aug 24, 2016

I expect that since one was traditionally able to build a quite inexpensive quad Opteron system.

SoyCapitanSoyCapitan · Aug 24, 2016

cube said:
I don't care about a proprietary Metal API.

Apple should support Vulkan.

Same reason why I prefer OpenCL to CUDA, and OpenGL to DirectX.

DX does kill GL though. Not everything corporate sucks.

cube · Aug 24, 2016

SoyCapitanSoyCapitan said:
DX does kill GL though. Not everything corporate sucks.

GL was created by a corporation.

It does not matter if DX has an edge.
