
MacZoltan

macrumors member
Original poster
May 18, 2016
94
9
I have a 2013 Mac Pro, and even when I put a brand-new, tested ECC memory module into one of its memory slots, after 2-3 restarts it reports ECC errors.
I tried swapping the memory module into the other slot, and the ECC error stays for a while, then goes away, then comes back on the same slot.
Any suggestions?
Could it be the CPU having memory channel issues, or is it the memory slot itself?
 

macguru9999

macrumors 6502a
Aug 9, 2006
780
363
Did you solve this? I have an error on slots 1 & 3 but not on 2 & 4. The machine sometimes restarts when they are populated.

 

macguru9999

macrumors 6502a
Aug 9, 2006
780
363
Well, I'll order a 10-core and try it. That's the best bet, and I can always resell it if it's a no go. As the RAM talks directly to the CPU and the slots look OK, it's a fair bet.
 

MacZoltan

macrumors member
Original poster
May 18, 2016
94
9
Good. Don't forget to tighten all 8 fully;
many other videos and guides claim otherwise, and that actually causes memory and CPU problems :)
 
  • Like
Reactions: macguru9999

fiatlux

macrumors 6502
Dec 5, 2007
351
139
I am experiencing the same kind of issues (currently on slots 1&2 but had it on other slots with various combinations of modules).

Is it really likely that the CPU went bad? Isn't a loose contact somewhere more likely? I seem to have recurrent issues with the MP6,1 that I never had with my earlier MP and I was wondering whether it was not just a matter of poor memory slot design with too stiff connectors...
 

mikas

macrumors 6502a
Sep 14, 2017
890
646
Finland
I bought a used 6,1 Mac Pro cylinder (6 core 32/500 D300) in September. I too quickly noticed a lot of ECC errors accumulating on one DIMM, and some on another. No panics or restarts though. They were all recoverable errors.
1606969200835.png

It bugged me though, and because I needed more RAM anyway, I ordered 64GB from eBay. The ECC errors all went away with the RAM change.

So I don't know if they were badly seated (I do think it had factory installed RAM), or DIMMs going bad for some reason - maybe heat, maybe usage and age or whatever.

OT: Now I have this 10 core 2690v2 here ready, waiting for some holiday spare time with my cylinder.
 

fiatlux

macrumors 6502
Dec 5, 2007
351
139

I already replaced/swapped all memory modules and improvements were only temporary in my case.

My Mac currently only reports errors for DIMM1 and has been usable for the day without slow-downs or crashes. But yesterday, I had several crashes, with sometimes modules not recognised at all or even improperly recognised (like 40GB of RAM made of 16+16+8+0 when all four slots were populated with 16GB modules !!!). I even had some cases where the Mac refused to boot and I had to disconnect all cables for a while.

OTOH, it can sometimes run beautifully for days or even weeks at a time. I think I too will wait for some holiday spare time for a full disassembly/cleaning/reassembly, and perhaps a CPU replacement (but I already have the top-of-the-line 2697v2). If that does not cure it, I may end up throwing it through the window (and into the swimming pool) in despair ;-).

If only the new Mac mini supported more memory and an eGPU...
 

fiatlux

macrumors 6502
Dec 5, 2007
351
139
My Mac currently only reports errors for DIMM1 and has been usable for the day without slow-downs or crashes.
I spoke too soon, it crashed on me again. About This Mac now reports two empty RAM slots :mad: Here is the crash report in case there is any hidden clue:

Machine-check capabilities: 0x0000000001000c1d
family: 6 model: 62 stepping: 4 microcode: 1070
signature: 0x306e4
Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
29 error-reporting banks
Processor 0: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 1: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 2: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 3: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 4: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 5: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 6: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 7: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 8: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 9: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 10: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 11: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 12: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 13: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 14: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 15: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 16: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 17: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 18: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 19: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 20: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 21: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 22: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
Processor 23: IA32_MCG_STATUS: 0x0000000000000005
IA32_MC8_STATUS(0x421): 0xfe137f4000010091
IA32_MC8_ADDR(0x422): 0x0000000307ea9580
IA32_MC8_MISC(0x423): 0x00000021507cfc86
IA32_MC14_STATUS(0x439): 0xc81fc50200800091
IA32_MC14_MISC(0x43b): 0xc908440400082000
mp_kdp_enter() timed-out on cpu 19, NMI-ing
mp_kdp_enter() NMI pending on cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 20 21 22 23
mp_kdp_enter() timed-out during locked wait after NMI;expected 24 acks but received 1 after 10960771 loops in 1350000000 ticks
panic(cpu 19 caller 0xffffff8016bf06a9): "Machine Check at 0xffffff8017f07cee, registers:\n" "CR0: 0x000000008001003b, CR2: 0x000070000c91eff8, CR3: 0x000000002aa73000, CR4: 0x00000000001626e0\n" "RAX: 0x0000000000000020, RBX: 0xffffff93eac8c0d0, RCX: 0x0000000000000001, RDX: 0x0000000000000000\n" "RSP: 0xffffffa14c2a3d50, RBP: 0xffffffa14c2a3d80, RSI: 0x0000000000000001, RDI: 0xffffff93eac7a500\n" "R8: 0xffffff93eac80500, R9: 0xffffff93eac8c0d0, R10: 0x0000000000000005, R11: 0x000000002aa73000\n" "R12: 0xffffff8017f1d640, R13: 0xffffff93eac0c6c0, R14: 0x0000000000000006, R15: 0x00000000000007b0\n" "RFL: 0x0000000000000046, RIP: 0xffffff8017f07cee, CS: 0x0000000000000008, SS: 0x0000000000000010\n" "Error code: 0x0000000000000000\n"@/AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/xnu/xnu-7195.50.7/osfmk/i386/trap_native.c:168
Backtrace (CPU 19), Frame : Return Address
0xffffff801695fad0 : 0xffffff8016abc66d
0xffffff801695fb20 : 0xffffff8016bff073
0xffffff801695fb60 : 0xffffff8016bef6aa
0xffffff801695fbb0 : 0xffffff8016a61a2f
0xffffff801695fbd0 : 0xffffff8016abbf0d
0xffffff801695fcf0 : 0xffffff8016abc1f8
0xffffff801695fd60 : 0xffffff80172bee1a
0xffffff801695fdd0 : 0xffffff8016bf06a9
0xffffff801695fec0 : 0xffffff80172bf845
0xffffff801695fed0 : 0xffffff8016a6228f
0xffffffa14c2a3d80 : 0xffffff8017ef9752
0xffffffa14c2a3e60 : 0xffffff8017ef8a92
0xffffffa14c2a3f20 : 0xffffff8016bf1a32
0xffffffa14c2a3f40 : 0xffffff8016ae3db9
0xffffffa14c2a3f80 : 0xffffff8016ae3f88
0xffffffa14c2a3fa0 : 0xffffff8016a6113e
Kernel Extensions in backtrace:
com.apple.driver.AppleIntelCPUPowerManagement(222.0)[18F6E7C9-1CBD-3DEE-AA53-265B695AF1FF]@0xffffff8017ef6000->0xffffff8017f13fff

Process name corresponding to current thread: kernel_task

Mac OS version:
2
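
For anyone trying to read the status values in a report like this one, here is a minimal sketch of a decoder for the IA32_MCi_STATUS words. It assumes the architectural bit layout described in the Intel Software Developer's Manual (VAL in bit 63, UC in bit 61, PCC in bit 57, the MCA error code in bits 15:0, and the memory-controller code pattern 000F 0000 1MMM CCCC). The function name and dictionary keys are made up for illustration, and the layout should be checked against the manual before drawing conclusions.

```python
# Sketch only: field positions are assumed from the architectural
# IA32_MCi_STATUS layout in the Intel SDM; verify before relying on them.

MEMORY_ERROR_TYPES = {
    0: "generic/undefined request",
    1: "memory read",
    2: "memory write",
    3: "address/command",
    4: "memory scrubbing",
}


def decode_mci_status(value: int) -> dict:
    """Split a 64-bit IA32_MCi_STATUS value into its main fields."""
    mca_code = value & 0xFFFF  # bits 15:0, architectural MCA error code
    fields = {
        "valid": bool((value >> 63) & 1),            # VAL
        "overflow": bool((value >> 62) & 1),         # OVER, an earlier error was overwritten
        "uncorrected": bool((value >> 61) & 1),      # UC
        "enabled": bool((value >> 60) & 1),          # EN
        "misc_valid": bool((value >> 59) & 1),       # MISCV
        "addr_valid": bool((value >> 58) & 1),       # ADDRV
        "context_corrupt": bool((value >> 57) & 1),  # PCC
        "model_specific_code": (value >> 16) & 0xFFFF,
        "mca_code": mca_code,
        "memory_error": None,
        "channel": None,
    }
    # Memory-controller errors match 000F 0000 1MMM CCCC (F is a filter bit).
    if (mca_code & 0xEF80) == 0x0080:
        mmm = (mca_code >> 4) & 0x7
        cccc = mca_code & 0xF
        fields["memory_error"] = MEMORY_ERROR_TYPES.get(mmm, "reserved")
        # 0xF means the channel was not specified.
        fields["channel"] = None if cccc == 0xF else cccc
    return fields


if __name__ == "__main__":
    # The two status values that appear in the report above.
    for name, raw in (("MC8", 0xFE137F4000010091), ("MC14", 0xC81FC50200800091)):
        print(name, decode_mci_status(raw))
```

Under those assumptions, both status words carry the MCA code 0x0091, which reads as a memory read error on channel 1, and the MC8 entry is flagged as uncorrected. That would be consistent with a memory channel problem, but it does not by itself distinguish between a module, a slot, or the CPU socket.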
 

ButchMac

macrumors newbie
Dec 15, 2020
3
2
I am having the same issues with my system: a Late 2013 Mac Pro with dual AMD 700 graphics and 4 DIMMs of 16GB each. DIMM 3 was giving an issue first, so I swapped 3 with 4 and the errors followed. I thought, great, just get one new identical stick. But I kept testing, running Resolve Fusion with a huge, memory-intensive setup. I could not get it to use more than 48GB of RAM, and there was still zero 'swap used' according to the Memory tab of Activity Monitor, which would indicate that I am not running out of RAM. But DIMM 4 still had the errors, and then 3 started to have errors. It is really hard to fault-find or replicate. I'm going to go to 4 x 32GB new, all-identical sticks (albeit at a slightly slower speed) and see what happens. Our facility has other identical systems without these issues. I might move the can to the top of my desk (for as much air as possible) and see what happens with the new RAM. I think it's an unfortunate combination of heat and age on the RAM, as I only started to notice these issues when Premiere and Resolve moved to OpenCL and then Metal, though you would think that would only involve graphics card memory. Anyway, I can't wait for the M1 SoC to up its presence in a beast of a machine.
 

fiatlux

macrumors 6502
Dec 5, 2007
351
139
It's a faulty motherboard. I had the same problem.

I am hoping it is a loose contact somewhere (the author of the CPU swap video above mentioned the importance of correctly seating and tightening the CPU to avoid memory channel errors), so I’ll do a full tear down/rebuild with fresh thermal compound.

If that doesn’t work, I’ll declare it a total loss. I’ve spent too much money and time troubleshooting that machine; there’s no way I’ll take a chance on changing the CPU riser card or logic board ($$$).
 

ButchMac

macrumors newbie
Dec 15, 2020
3
2
It's a faulty motherboard. I had the same problem.
I can see your reasoning, but today I had more time available to investigate, so I moved my RAM around all the DIMM slots.
From the start, I have had 2 sticks giving me errors: one has thousands and the other only ever seems to get 50 or fewer.
No matter where I moved them, the erroneous sticks always gave the same results, whichever slots they were in. I then removed the 2 sticks that I think are bad and ran on 32GB of RAM, and all worked well without any errors. This would have to mean my issue is thankfully not the motherboard. I will go to 4 new 32GB sticks and see how I go.
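
The swap test described here boils down to checking whether errors stay with a module or with a slot across several runs. A minimal sketch of that reasoning follows; the function name, the module labels, and the error counts in the example are all hypothetical, and it treats any non-zero ECC count as "faulty", which is a simplification.

```python
from collections import defaultdict


def errors_follow(observations):
    """Classify swap-test results.

    observations: iterable of (module_id, slot, error_count) tuples,
    one per installed module per test run.
    Returns "module", "slot", or "unclear".
    """
    by_module = defaultdict(list)
    by_slot = defaultdict(list)
    for module, slot, errors in observations:
        faulty = errors > 0
        by_module[module].append(faulty)
        by_slot[slot].append(faulty)

    def consistent(groups):
        # Every key is either always faulty or always clean,
        # and at least one key is always faulty.
        tidy = all(all(v) or not any(v) for v in groups.values())
        return tidy and any(all(v) for v in groups.values())

    module_ok = consistent(by_module)
    slot_ok = consistent(by_slot)
    if module_ok and not slot_ok:
        return "module"
    if slot_ok and not module_ok:
        return "slot"
    return "unclear"  # e.g. only one run, or intermittent errors


if __name__ == "__main__":
    # Hypothetical counts: modules C and D show errors wherever they sit.
    runs = [
        ("A", 1, 0), ("B", 2, 0), ("C", 3, 2400), ("D", 4, 35),
        ("A", 3, 0), ("B", 4, 0), ("C", 1, 1900), ("D", 2, 50),
    ]
    print(errors_follow(runs))
```

With a single run the function returns "unclear", since one arrangement cannot separate a bad stick from a bad slot. Intermittent faults that come and go between reboots would also land in "unclear".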
 
  • Like
Reactions: macguru9999

macguru9999

macrumors 6502a
Aug 9, 2006
780
363
If the "good dimms" work in all of the slots equally well you are probably right. I had the opposite problem that all of my dimms were good, some new, but in certain slots they were not recognised, and in others you would get crashes and restarts. Luckily I had another motherboard and when I fitted the 12 core processor and ram to that board and reassembled all of the problems went away.
 

fiatlux

macrumors 6502
Dec 5, 2007
351
139
My experience as well - it does not seem linked to modules but rather memory slots/channels:
  1. Slot 1 gives me loads of ECC errors (like 100,000+/day);
  2. Slot 2 only a couple a day;
  3. Slot 3 is ECC-error free but would occasionally not be recognised at all after a reboot - resulting in 48GB (I have a slightly bent pin in that slot that I partly and carefully fixed with fine tweezers);
  4. Slot 4 seems non problematic.
I could live with 32GB using only slots 2 and 4 I guess, although I would prefer to keep a module in slot 3 as I fear that the risk of short circuit because of the bent pin is bigger with no module in place.

I was also surprised by the impact of going dual-channel instead of quad-channel: in Geekbench, the performance hit is a good 15%! You really want all 4 slots populated if you can.
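
As a rough back-of-the-envelope check on the channel count, theoretical peak memory bandwidth scales linearly with the number of populated channels. The sketch below assumes DDR3-1866 modules (1866 million transfers per second) on 64-bit channels; those figures are assumptions, not taken from the posts above.

```python
# Assumptions: DDR3-1866 (1866 MT/s) and a 64-bit (8-byte) channel width.
TRANSFERS_PER_SECOND = 1866e6
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gb_s(channels: int) -> float:
    """Theoretical peak bandwidth in GB/s for a given channel count."""
    return channels * TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER / 1e9


if __name__ == "__main__":
    for channels in (2, 4):
        print(f"{channels} channels: {peak_bandwidth_gb_s(channels):.1f} GB/s peak")
```

Halving the channels halves the theoretical peak, yet the measured hit is only around 15%, presumably because most benchmark workloads are not limited by memory bandwidth alone.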

I am hoping that this is related to a slightly loose CPU or board connection. The machine has again worked without a crash for almost a week, but I have received the new thermal paste I ordered online and I'll probably do the planned tear down/rebuild tonight. Wish me luck... :confused:
 

fiatlux

macrumors 6502
Dec 5, 2007
351
139
0b071c7aca1b50018a08f584e947323c.jpg

Good news is that disassembly is a lot easier and faster than I feared.
151391959b77ce7532636e4f16dc315e.jpg


Bad news is that part of the grid array is slightly damaged, with bent pins.
Add that to the bent pins in one memory slot, and a previous owner must have been pretty careless during maintenance. Fixing that will be pretty tough. I have good eyes, but even with tiny tweezers and a loupe, that’s damn small!
Any source for cheap CPU riser boards?
 

fiatlux

macrumors 6502
Dec 5, 2007
351
139
A pair of 2.5x reading glasses do wonders for small intricate work. Nice to have on hand.
My elder son took care of it with his soldering station with magnifier. The Mac Pro is back up and running, with still a few ECC errors in bank 1. Hopefully it will be OK for a few more months (years?), until I find an Apple Silicon machine I'd be fully satisfied with (M1 CPU performance would be OK, the GPU would need to at least match my Vega 56, and memory would need to be at least 32GB, preferably user-upgradable).
 

macguru9999

macrumors 6502a
Aug 9, 2006
780
363
That's a pity. I priced the riser board (motherboard) and they were too expensive. I was going to tell you that I had tried a new 12-core CPU on my first riser with no luck, but then a whole spare machine came up which did not boot at all and appeared to have a heat blemish on the CPU. BUT the riser board was undamaged, and when I used it in the original Mac Pro with the new 12-core CPU I had already bought, I ended up with a perfect trash can (2TB SSD/64GB RAM). I sold it for 3800AUD, and I even got 550AUD for the leftover machine as a non-working project! This was pre-Apple Silicon, of course ....
 

fiatlux

macrumors 6502
Dec 5, 2007
351
139

My Mac Pro has now been running fine for 5 days and still has very few ECC errors on bank 1. I don’t think I’ll risk another round of maintenance unless I come across a really cheap MP6,1.
 
  • Like
Reactions: macguru9999