I have been having an issue for a few months: random crashes, mostly while playing resource-intensive games. My first guess was a temp issue, but neither CPU nor GPU temp goes above 70°C. My second guess was the PSU, as I replaced an old GTX 460 with a GTX 780 a few months before the issues started happening and the PSU was a 5-year-old 550W Antec, and I have been told this brand isn't known for the quality of its PSUs.
I replaced it with a Corsair 750W a few weeks ago, and things seemed to go fine, but I got another crash yesterday - one that led to a BSOD / KMODE_EXCEPTION_NOT_HANDLED - and one tonight.
Symptoms of the crashes are the screen going gray or white for a few seconds, then either reporting no input signal or displaying a still of the last rendered image, sometimes slightly corrupted. Sounds keep playing in the background, not looped. Keyboard input seems to still be received, as I managed to reboot the computer a few times by keeping ctrl+alt+del pressed for long enough. Usually if I reboot straight after a crash, the system doesn't stay stable for long and tends to crash again or have the screens turn off for a few milliseconds.
This seems to indicate the GPU just stops working for some reason when the crashes happen, while the rest of the system keeps going on its merry way. I bought the GTX 780 online around February, slightly used but of good quality. There is no warning before the crashes: no FPS drop in game or otherwise, no sudden heat spike in my temp logs. Aside from the crashes, the card performs admirably in benchmarks and in recent games. Drivers are up to date and no overclocking is applied to any part of the rig, but I did start using EVGA Precision to control fan speeds when I thought it was a temp issue, and I even used it to underclock the GPU in case it helped. It doesn't seem to.
I'm kinda running out of ideas, which is why I'm hoping to get some info out of the BSOD crash dump - as most of my crashes don't lead to a BSOD, I'm kinda making sure I don't waste this one. I'm hoping I don't have to write off the GTX 780, but if I have to I'd rather be sure I do actually have to.
In order to rule out various other causes of failure, I'm going to run memtest86+ overnight, and then a few of the other stress tests found around these forums. Can't hurt to have those results at hand before someone asks for them.
Thank you in advance for your feedback.
EDIT: memtest86+ is negative, no errors after the 8th pass.
EDIT2: negative on Prime95 - no warnings/errors in 3h stress test.
EDIT3: solid positive on Furmark. Triggers a crash but unfortunately not another BSOD. CSV log pastebined here. Now running Driver Verifier to see if I'm looking at a driver or hardware issue.
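For anyone who wants to go through a temp log like the CSV one, here is a rough Python sketch of how it could be scanned for sudden temperature jumps between consecutive samples. The column names `time_s` and `gpu_temp_c`, the 5°C threshold, and the sample rows are all made up for illustration; the real log's headers and values will differ.

```python
import csv
import io


def find_spikes(rows, column, max_jump):
    """Return (index, previous, current) for each sample whose value
    rises by more than max_jump compared to the previous sample."""
    spikes = []
    previous = None
    for index, row in enumerate(rows):
        current = float(row[column])
        if previous is not None and current - previous > max_jump:
            spikes.append((index, previous, current))
        previous = current
    return spikes


# Made-up sample data with hypothetical column names, NOT the real log.
sample = io.StringIO(
    "time_s,gpu_temp_c\n"
    "0,52\n"
    "1,55\n"
    "2,58\n"
    "3,71\n"
    "4,72\n"
)
rows = list(csv.DictReader(sample))

# Flag any rise of more than 5 degrees between two consecutive samples.
spikes = find_spikes(rows, "gpu_temp_c", max_jump=5.0)
print(spikes)
```

To use it on a real file, `csv.DictReader(open(path, newline=""))` would replace the in-memory sample, with `column` set to whatever the temperature header in that file is actually called.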