
avkills

macrumors 65816
Jun 14, 2002
1,182
985
rondocap, can you run an Octane Bench (20.2.3) with just one of the Duo cards and then with both? For whatever reason it will not run with the 580X. But I did run it on my PC with the 1080 Ti and got 189.23 overall.
 

rondocap

macrumors 6502a
Original poster
Jun 18, 2011
527
307
I ran Octane X with the Trench benchmark, which should give you a similar idea.

1628786145619.png
 

Grumply

macrumors 6502
Feb 24, 2017
285
193
Melbourne, Australia
Rocket benchmark test, similar to the candle test:

Middle Test, UHD ProRes

W6800x Duo
09 Blur: 59 fps
18 Blur: 51 fps
30 Blur: 31 fps
66 Blur: 15 fps

1 TNR: 59 fps
2 TNR: 39 fps
4 TNR: 20 fps
6 TNR: 14 fps

To compare to the dual W5700x:

2x W5700X

09 Blur: 59 fps
18 Blur: 30 fps
30 Blur: 19 fps
66 Blur: 9 fps

1 TNR: 51 fps
2 TNR: 26 fps
4 TNR: 14 fps
6 TNR: 10 fps

I just ran the same UHD ProRes tests on my new 2x 6800XT build and got these:


2x 6800XT

09 Blur: 59.0
18 Blur: 59.0
30 Blur: 37.0
66 Blur: 18.0

01 TNR: 59.0
02 TNR: 40.0
04 TNR: 21.0
06 TNR: 15.0


And this is what I had with my 2x Radeon VIIs:

2x Radeon VII

09 Blur: 83.0
18 Blur: 44.0
30 Blur: 27.5
66 Blur: 13.0

01 TNR: 68.0
02 TNR: 36.5
04 TNR: 19.0
06 TNR: 13.0


They're interesting numbers. Clearly not all effects in Resolve are created equal: some (like the blur used here) scale fairly predictably with the raw compute power of the GPUs in use, while others (like Temporal Noise Reduction) are a great leveler that humbles all of these cards and narrows the performance gaps between them.
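To put rough numbers on that, here is a minimal Python sketch that compares the cards at the heaviest setting of each effect, using only the fps figures posted above. Comparing only the heaviest setting is my own choice: several of the lighter settings read 59 fps, which may be a playback cap rather than a GPU limit (that is an assumption, not something I measured). The function names are just for illustration.

```python
# Frame rates (fps) copied from the UHD ProRes results in this post.
# Each list runs from the lightest to the heaviest setting:
# Blur 09/18/30/66 and TNR 1/2/4/6.
results = {
    "W6800X Duo":    {"blur": [59, 51, 31, 15],         "tnr": [59, 39, 20, 14]},
    "2x W5700X":     {"blur": [59, 30, 19, 9],          "tnr": [51, 26, 14, 10]},
    "2x 6800XT":     {"blur": [59.0, 59.0, 37.0, 18.0], "tnr": [59.0, 40.0, 21.0, 15.0]},
    "2x Radeon VII": {"blur": [83.0, 44.0, 27.5, 13.0], "tnr": [68.0, 36.5, 19.0, 13.0]},
}


def heaviest_ratio(card, baseline, effect):
    """fps of `card` divided by fps of `baseline` at the heaviest setting."""
    return results[card][effect][-1] / results[baseline][effect][-1]


def summary(baseline="2x W5700X"):
    """Return (card, blur ratio, TNR ratio) rows relative to `baseline`."""
    rows = []
    for card in results:
        blur = round(heaviest_ratio(card, baseline, "blur"), 2)
        tnr = round(heaviest_ratio(card, baseline, "tnr"), 2)
        rows.append((card, blur, tnr))
    return rows
```

Relative to the 2x W5700X, `summary()` puts the heaviest blur between 1.0x and 2.0x across these four setups, while the heaviest TNR only spans 1.0x to 1.5x, which is the "leveler" effect in numbers.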
 

arche3

macrumors 6502
Jul 8, 2020
407
286
Do you have any RED Komodo 6K HQ R3D footage you could export in Premiere Pro to ProRes 4444 HQ?

With a film LUT and a color LUT applied.
 

Grilled Cheese

macrumors member
Aug 5, 2021
62
63
Brilliant. I have to admit that I’m often surprised by how well w5700x GPUs compare against much pricier competition.
A follow up question (from a FCP user with limited GPU knowledge)…

Do you have any tests that demonstrate the performance of GPUs in terms of their ability to render effects (like adding a LUT, blur, colour adjustments etc)?

Alternatively, which common benchmarks are good indicators of this type of performance?

Project export times are less important to me than quick effects rendering to keep the workflow moving along.
 

rondocap

macrumors 6502a
Original poster
Jun 18, 2011
527
307
Yes I do!
Yeah, from my testing the W5700X, and two of them especially, is the best value for the performance, often coming close to the more expensive options.

I am kind of getting the feeling that for video work, the big expensive GPUs really don't offer the same value as their price would indicate. That's why the W6900x is not really a good buy imo.

I am happy with the W6800x Duo though, I think it strikes the right balance of performance vs value.

2 of the W6800x Duos are really meant for 3D rendering, Octane X, etc. They really do very little for the price for video work in most cases.
 

Grumply

macrumors 6502
Feb 24, 2017
285
193
Melbourne, Australia

I think this is a really important point for people to get their heads around. I just slogged through a tedious few hours doing some real world benchmarking between my outgoing Radeon VIIs and my new 6800XTs (using a recent Davinci Resolve project that I found really hard-going on the Radeon VIIs, in terms of smooth real time playback), and the results are not at all what I was expecting (given the difference in raw Metal Compute power between the two options), and not at all what many of the artificial benchmarks (Candle Test etc) would have you believe.

For exporting, the newer cards were a whopping 6% faster, and for general real-time playback in the timeline they were also only 6% faster (averaged across the whole timeline). With certain clips and grades, the difference is larger, but for many (if not most) there's barely a difference at all.

For example, if you remove Temporal Noise Reduction from the colour grade, the 6800XTs speed up a little (comparatively) and yield a 10% advantage in real time playback over the Radeon VIIs. But TNR does bring them very close to each other. So obviously compute power doesn't scale linearly with certain video effects and plugins.

All of which has been rather eye-opening, and has actually got me considering a major revision to my normal colour grading workflow. If a project is going to require any significant amount of noise reduction (as this particular benchmarked project did, to yield cleaner secondary colour keys), then I'm going to do all of that noise reduction in a first pass (with nothing else), and then work on top of the noise-reduced exports.

According to these tests I've run, that should yield a 65% increase in real-time playback speeds with the 6800XTs (58% with the Radeon VIIs), which is enough to get perfectly smooth playback without the need for any caching (and the complexities and hiccups that sometimes come with that in Resolve).
 

rondocap

macrumors 6502a
Original poster
Jun 18, 2011
527
307
Yeah that is very true. I am having to actually chase down the scenarios where for video work the more expensive GPUs make a significant difference. For export, the differences aren't major.

I did notice some playback improvements in Final Cut with 8k raw - plays perfectly now where before I would get some dropped frames. Resolve always played it smoother.

Candle and similar benchmarks do show a more significant difference, but they are not exactly indicative of real-world use. I think if you're doing really heavy grading work, like heavy noise reduction, that's where you may find bigger differences with the bigger GPU setups.

But anyway, having 4 GPUs is really cool - so that makes up for some of it too. lol
 

rondocap

macrumors 6502a
Original poster
Jun 18, 2011
527
307
So here is a weird bottleneck scenario, if anyone has some further insight:

First test:
Sonnet PCIe NVMe card with 4x 970 Evo Plus in RAID 0, which benchmarks at over 6000 MB/s

Both the source file and the destination are on this drive.

Export a 6K RED file to ProRes 422 HQ: 13 minutes, with the memory on all 4 W6800X GPUs maxed out. This is too slow; I've gotten much faster exports with just 2 GPUs.

Write/read speeds stayed in the 200-300 MB/s range.

2nd test:

Same footage, once again with all 4 GPUs utilized, but this time with the source file on the RAID 0 PCIe card and the destination file going to the main Apple SSD.

Write/read speeds climbed to the 500-600 MB/s range, and export time dropped to 9 minutes, which is more in line with what it should be.

So what exactly is causing these disk speed issues? Shouldn't the RAID 0 be able to handle reading and writing simultaneously at this rate, since it's way below even the sustained write speeds of these drives? Is it a PCIe bandwidth issue when accessing one of these Sonnet cards? I wonder if it has something to do with how the Sonnet cards split up the lanes of a single x16 slot. (Yes, it is connected to an x16 slot, and bandwidth is 100% or lower.)
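For anyone who wants to reproduce the two scenarios outside of an NLE, here is a rough Python sketch that reads one file while writing another and reports the throughput of each. It is not what I ran for the numbers above; the function names are made up for illustration, and the OS file cache can inflate the read figure unless the source file is much larger than RAM, so treat the results as a ballpark only.

```python
import os
import threading
import time

CHUNK = 1024 * 1024  # 1 MiB per read/write call


def write_file(path, total_bytes):
    """Write total_bytes of random data to path and return the rate in MB/s."""
    buf = os.urandom(CHUNK)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(CHUNK, total_bytes - written)
            f.write(buf[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually reached the drive
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6


def read_file(path):
    """Read path sequentially and return the rate in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            total += len(chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total / elapsed / 1e6


def concurrent_read_write(src_path, dst_path, write_bytes):
    """Read src_path while writing write_bytes to dst_path at the same time."""
    rates = {}

    def reader():
        rates["read_mb_s"] = read_file(src_path)

    def writer():
        rates["write_mb_s"] = write_file(dst_path, write_bytes)

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rates
```

To mirror the first test, point `src_path` and `dst_path` at the same RAID volume; for the second, keep the source on the RAID and put the destination on the internal SSD. If the same-volume run is much slower here too, the bottleneck is in the storage path rather than in the export itself.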
 

joelypolly

macrumors 6502a
Sep 14, 2003
511
218
Bay Area
The x16 card has a PCIe bridge, I think, so you are probably seeing issues with software RAID, CPU use, and PCIe multiplexing. You'd probably get better performance with a real x16 PCIe SSD backed by hardware.
 

rondocap

macrumors 6502a
Original poster
Jun 18, 2011
527
307
Interesting, that may be it. Can you think of any NVMe drives and an adapter that would not give me these issues? Would a single fast NVMe drive like the 970 Evo, on a single regular non-RAID adapter, perform better?
 

Grumply

macrumors 6502
Feb 24, 2017
285
193
Melbourne, Australia

Drives don't like reading AND writing at the same time. It's why you always want your media on one drive, your cache on another, and your exports going to a third (I also keep my OS on a separate drive as well).

This way you'll get maximum performance on each operation.
 

rondocap

macrumors 6502a
Original poster
Jun 18, 2011
527
307
Yeah, that is how I have it set up, with separate drives, but I wanted to test same-drive operation too, to see where I can find potential bottlenecks.
 

joelypolly

macrumors 6502a
Sep 14, 2003
511
218
Bay Area
Basically, with RAID 0 you have your CPU coordinating reads and writes across 4 drives, each of which has an internal controller coordinating its own reads and writes. For modern NVMe drives this isn't always a good thing, since their firmware is optimized for their own usage. That's why you don't always see 1-to-1 scaling with things like random reads and writes, but you do see it with sequential reads and writes.

I'd probably do what Grumply said and just use them as individual drives since a PCIe x4 drive will exceed your required performance anyways. Personally I have 2 u.2 Micron drives and 1 m.2 ADATA drive that I use separately
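A small Python sketch of the textbook RAID 0 layout may help show why sequential transfers scale across the stripe set while small random ones don't. The 128 KiB stripe size below is only an assumed value for illustration; I don't know what stripe size that particular array uses, and the function names are my own.

```python
def locate(offset, stripe_size, n_drives):
    """Map a logical byte offset to (drive index, byte offset on that drive)."""
    stripe_no = offset // stripe_size          # which stripe the byte falls in
    drive = stripe_no % n_drives               # stripes rotate across the drives
    drive_offset = (stripe_no // n_drives) * stripe_size + offset % stripe_size
    return drive, drive_offset


def drives_touched(offset, length, stripe_size, n_drives):
    """List the drives that a single request of `length` bytes has to hit."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return sorted({s % n_drives for s in range(first, last + 1)})


STRIPE = 128 * 1024  # assumed stripe size, not taken from the setup in this thread

# A 1 MiB sequential read spreads over every drive in a 4-drive set...
big_read = drives_touched(0, 1024 * 1024, STRIPE, 4)
# ...while a 4 KiB random read lands on just one of them.
small_read = drives_touched(4096, 4096, STRIPE, 4)
```

Here `big_read` comes out as all four drives and `small_read` as a single drive, so a small random request gets no parallelism from the array but still pays for the extra coordination layer.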
 

Fastball32

macrumors member
Nov 17, 2011
97
42
Hi,
Thanks Rondocap.
Have you noticed any stability issues with the new 6800 Duos? If so, what is the most stable GPU setup for Big Sur in your opinion?
Old forum posts say that the W5700X was not as stable as Vega, so I'm wondering if there are still driver issues with the W5700X and possibly with the new 6800s. I think you've commented that "funny things" can happen when you mix different GPUs, so I wanted to know if you could expand on that.
Have you had a chance to try Boot Camp with the 6800? I don't know if it is supported in Boot Camp.
Thanks. You've been a real help to the community.
 

IanK MacPro

macrumors member
Jul 6, 2018
68
43
Buckinghamshire, UK
Do you have some instructions or a specific benchmark I can run? I am not too familiar with Redshift, but I did run Octane benchmarks.

The 4 GPUs got 4 seconds in the Chess test

View attachment 1817766
That would be great, thanks. Download the Mac version from here:

instructions are here:


The current leader on the Mac is someone who tested with 2x Vega II Duos.

Unlike Octane, Redshift doesn't scale as linearly once you go beyond 3-4 cards, which is why my 2 old 5.1 Mac Pros, each running dual 1080 Tis, still run really well for what I do. I would love to know how the new Metal cards compare!


Redshift 3.0.45 Metal Benchmark Results


The results will be in minutes; the current relevant results are:


Apple M1 - 37:21

MP 7.1 32 Threads - AMD Radeon Pro Vega 64 - 12:01

MP 7.1 32 Threads - AMD Radeon Pro W5700X 16GB - 10:38

MP 7.1 16 Threads - AMD Radeon VII 16 GB - 08:56

MP 7.1 12 Threads - AMD Radeon RX 5700XT 8GB x 2 - 06:27

MP 7.1 32 Threads - AMD Radeon RX 6800 XT - 06:43

MP 7.1 32 Threads - AMD Radeon Pro W5700X 16GB x 2 - 05:46

MP 7.1 32 Threads - AMD Radeon RX 6900 XT - 05:26

MP 7.1 32 Threads - AMD Radeon Pro Vega II Duo 32GB x 2 - 02:10
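
To make the list above easier to compare, here is a short Python sketch that converts the posted times to seconds and works out relative speedups. It assumes the times are in minutes:seconds, and the shortened GPU labels are mine; the figures themselves are copied from the list.

```python
# Render times copied from the list above, assumed to be minutes:seconds.
times = {
    "Apple M1": "37:21",
    "Radeon Pro Vega 64": "12:01",
    "Radeon Pro W5700X 16GB": "10:38",
    "Radeon VII 16GB": "08:56",
    "Radeon RX 5700XT 8GB x 2": "06:27",
    "Radeon RX 6800 XT": "06:43",
    "Radeon Pro W5700X 16GB x 2": "05:46",
    "Radeon RX 6900 XT": "05:26",
    "Radeon Pro Vega II Duo 32GB x 2": "02:10",
}


def to_seconds(mmss):
    """Convert an 'MM:SS' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup_over(baseline, name):
    """How many times faster `name` finished than `baseline`."""
    return to_seconds(times[baseline]) / to_seconds(times[name])


def ranking():
    """GPU labels sorted from fastest to slowest render time."""
    return sorted(times, key=lambda name: to_seconds(times[name]))
```

For example, `speedup_over("Apple M1", "Radeon Pro Vega II Duo 32GB x 2")` works out to about 17.2x, and adding a second W5700X gives about 1.84x over a single one, so two cards still scale reasonably well in this test.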
 

rondocap

macrumors 6502a
Original poster
Jun 18, 2011
527
307
I used dual W5700X previously, and recently they were very stable.

The W6800X has been stable so far, with no weird issues to report, so the drivers seem good. They could possibly use some performance optimizations for multi-GPU scaling, but that always comes with time.

As for mixing different GPUs, the main problem is that NLEs usually don't utilize them as well as they do two of the same card. You can even get worse performance this way. DaVinci Resolve is a little bit better at using different GPUs, but Final Cut definitely likes two of the same.
 

TrevorR90

macrumors 6502
Oct 1, 2009
378
297
Wow, I had no idea that the M1 was that much slower in applications that rely on Metal. And people on this forum are begging for a silicon Mac Pro?
 