Display freezes, need to reboot - GPU issue relating to nvlddmkm?


  1. Posts : 6
    Windows 10
       #1



    Hi, I've been sent here from the Win11 sister forum, where this thread covers essentially the same issue I'm having, just on Win11.

    I've been having an issue on my 4-year-old Win 10 rig that has been going on intermittently for several months now (sometimes it happens multiple times a day, sometimes once a week, sometimes it goes away for longer, but it always returns!). Below is a list of symptoms and the troubleshooting steps done so far, plus some screenshots and my V2 collector log.

    Any help would be greatly appreciated.


    V2 collector log on Google Drive


    SYMPTOMS
    • Screen freezes regardless of system load (it has happened while gaming, while completely idle, and with only basic apps like a browser + Spotify open)
    • Fans, drives, motherboard etc. all keep running. Other evidence of activity: sometimes sound keeps playing for a while or loops the last few seconds, and sometimes pressing keys still triggers Windows sounds, which suggests parts of the OS are still running and only the graphics are dead(?)
    • Cold reboot needed to recover
    • Sometimes the display won't come back after a cold reboot: both monitors of my dual-monitor setup show "no signal" while fans and other hardware are running and LEDs are on. In that case I usually have to unplug one or both monitors from the GPU (1x HDMI, 1x DisplayPort) and try another cold reboot. I can plug both monitors back in afterwards
    • Sometimes there are a few seconds of stuttering frames and audio before the crash
    • Sometimes a Windows pop-up says that app.exe (e.g. Overwatch, Discord, Firefox) has been blocked from accessing hardware. This doesn't always lead to a full crash, but the affected app will crash
    • nvlddmkm driver errors show up in Event Viewer in some cases, right before the crash or before the Windows pop-up
    • Other times there is no evidence in the logs; the last log entries are usually from a few minutes before the crash
    • While temperatures and power usage don't seem abnormal, the GPU logs show some odd values moments before a crash (e.g. temperature logged as 0, so I assume the machine isn't receiving any data from the sensors); logging resumes normally after a successful reboot. I use Open Hardware Monitor. See the screenshot of the data in question from a crash that happened today (21/01/23).


    Logging data:
    [Screenshot: sensor log data at the time of the crash]

    Eventviewer logs for the same time stamps:
    [Screenshot: Event Viewer at the time of the crash]


    Troubleshooting done so far:


    • Completely reinstalled drivers multiple times, including the use of DDU from safe mode, multiple versions of Nvidia drivers over the last 6+ months
    • Updated drivers for pretty much anything else available
    • Opened up my computer and checked that all parts are seated correctly, dusted off parts (my setup is designed quite well, barely had any dust!)
    • Tried a few combinations of power plugs and power sockets in case of dodgy power setups
    • Changed power options to high performance in case it’s due to power issues
    • Changed timeout + recovery settings in registry as per How to Fix ‘Display Driver nvlddmkm Stopped Responding’ on Windows 10/11
    • Turned off hardware acceleration in various places, incl. Windows overall, browser, Discord and games that use it
    • Reseated GPU
    • Checked that GPU isn’t overclocked
    • Underclocked via MSI Afterburner
    • Temperature checks/logging: doesn't seem to be a temperature issue, although the GPU runs on the warmer side when gaming (Overwatch/Cyberpunk up to 83°C); after lowering graphics settings, Overwatch runs at around 50°C
    • Reinstalled Overwatch as it’s the game I play the most
    • Disabled a few gimmicks in GeForce Experience (ShadowPlay, in-game overlay etc.)
    • Memtests, multiple - came back ok
    • Removed lots of crap/superfluous apps from PC, incl most apps that could interfere with GPU settings and that I no longer need
    • Calculated the PSU capacity required for my setup: I have a 600W BeQuiet Bronze, and estimates vary from 300W upwards, only reaching 600W if I massively overestimate everything just in case. However, this would not explain crashes when the PC is almost idle?
    • sfc /scannow - sometimes came back with errors that resolved themselves in between runs; still did a few Windows repairs via DISM just in case
    • Looked at sensor logs for temperature anomalies, power surges etc. - the only abnormalities are around the GPU sensors. Temperatures were not higher than usual (or definitely not high enough to matter), but temperature logging had stopped / a temperature of 0 was logged a few minutes before the crash (e.g. half an hour of GPU temps around 45°C, then only 0°C for the 5 minutes before the crash, then logging resumed once the reboot was complete) - see screenshot above
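The last check in the list (spotting GPU readings that drop to 0 shortly before a freeze) could be automated rather than eyeballed. Below is a minimal Python sketch, assuming the sensor log can be read as CSV with one timestamp column and one GPU temperature column. The column names `Time` and `GPU Core Temp` and the sample rows are made up for illustration and would need adjusting to whatever the real log file contains.

```python
import csv
import io

def find_zero_runs(rows, time_col="Time", temp_col="GPU Core Temp"):
    """Return (first_timestamp, last_timestamp, count) for each run of 0 readings."""
    runs = []
    current = None  # [start, end, count] of the run in progress, if any
    for row in rows:
        try:
            value = float(row[temp_col])
        except (TypeError, ValueError):
            value = 0.0  # an empty or unparsable reading is treated as a dropout
        if value == 0.0:
            if current is None:
                current = [row[time_col], row[time_col], 1]
            else:
                current[1] = row[time_col]
                current[2] += 1
        elif current is not None:
            runs.append(tuple(current))
            current = None
    if current is not None:
        runs.append(tuple(current))
    return runs

# Made-up sample data; the column names are hypothetical.
SAMPLE = """Time,GPU Core Temp
13:00:00,45
13:00:10,46
13:00:20,0
13:00:30,0
13:10:00,44
"""

sample_runs = find_zero_runs(csv.DictReader(io.StringIO(SAMPLE)))
print(sample_runs)
```

Comparing the timestamps of such runs with the crash times would show whether the sensor dropouts consistently precede the freezes or only coincide with them occasionally.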





    Haven’t tried yet:
    • Pretty much anything BIOS related - I don't feel comfortable with it
    • Flashing the GPU - don't want to risk it
    • Complete clean reinstall of Windows
    • Replacing parts like GPU, PSU or monitors (might test a new monitor, want a new one anyway..)
    Last edited by Nuku2; 21 Jan 2023 at 14:03.


  2. NTN
    Posts : 972
    W10 19045.2546
       #2

    Could you please tell us your specs?
    CPU, memory, GPU of course, OS drive and PSU as a minimum.


  3. Posts : 6
    Windows 10
    Thread Starter
       #3

    Of course - sorry, I thought this was also in the log collector files. I'll also add it to my first post. Here are the base specs, let me know if anything else is missing :) Thanks

    OS: Windows 10 Enterprise, version 10.0.19045 (build 19045)
    Motherboard: Gigabyte B450M DS3H (AMD Socket AM4)
    BIOS: American Megatrends F42h, 18/10/2019
    Storage: 512GB M.2-2280 NVMe PCIe SSD and Seagate Expansion external SSD, 500 GB
    GPU: NVIDIA GeForce RTX 2060 Ventus XS 6GB
    RAM: 2 x Adata XPG Gammix D10 (Tray Bulk) DDR4
    CPU: AMD Ryzen 7 2700X 3.7GHz 8 Core
    PSU: Cooler Master MWE v2 650W 80+ Bronze (I mentioned BeQuiet before, sorry, I misremembered!)
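For a rough sanity check of the PSU sizing with these parts, here is a small Python sketch. The CPU and GPU figures are the published ratings (105 W TDP for the Ryzen 7 2700X, 160 W board power for a reference RTX 2060; the Ventus XS may differ slightly). Every other number, including the 1.3 safety factor, is an assumption for illustration, and rated TDP is not the same as peak draw.

```python
# Rough power budget for the parts listed above.
PSU_WATTS = 650  # Cooler Master MWE v2 650W, from the spec list

# CPU and GPU figures are published ratings; the remaining entries are
# rough assumptions for illustration only.
ESTIMATED_DRAW_WATTS = {
    "Ryzen 7 2700X (TDP)": 105,
    "RTX 2060 (reference board power)": 160,
    "motherboard, RAM, fans (assumed)": 60,
    "NVMe SSD + USB devices (assumed)": 15,
}

def headroom(psu_watts, draws, safety_factor=1.3):
    """Return (total draw, draw padded by the safety factor, spare watts)."""
    total = sum(draws.values())
    padded = total * safety_factor
    return total, padded, psu_watts - padded

total, padded, spare = headroom(PSU_WATTS, ESTIMATED_DRAW_WATTS)
print(f"estimated draw: {total} W, with margin: {padded:.0f} W, spare: {spare:.0f} W")
```

Under these assumptions the supply is not undersized on paper, which fits the estimates of 300W upwards mentioned earlier. A sum like this says nothing about whether an individual PSU is faulty.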


    Edit: Can't seem to edit my first post, eek


  4. NTN
    Posts : 972
    W10 19045.2546
       #4

    https://answers.microsoft.com/en-us/...d-f3bc52ee1d2d

    "A blue screen shows up and says, "video TDR failure nvlddmkm.sys" Sometimes with this game black screen problem happened too."

    How to Fix VIDEO_TDR_FAILURE BSODs | Tom's Hardware

    Sometimes this helps..

    Try these values for:
    TdrDelay
    TdrDdiDelay

    Attachment 384028


  5. Posts : 6
    Windows 10
    Thread Starter
       #5

    [Screenshot: regedit]

    Thanks! I had already created the TdrDelay value a while ago based on similar instructions. I have now added TdrDdiDelay as well and, just in case, recreated the other one, as I had set a different value (it was 10 instead of 20 - I assume the value suggested in your snippet just allows for more recovery time?)

    Does TdrDdiDelay make a big difference? I'm not entirely sure how it differs from TdrDelay.


  6. NTN
    Posts : 972
    W10 19045.2546
       #6

    I just saw that TdrDdiDelay is also recommended. The default value for TdrDelay is 2, and for TdrDdiDelay 5. Someone recommends a value of 60 here, which I think is a bit high. I've had both set to 20 for a while without any problems whatsoever. I forgot to set these values after an upgrade, and the PC froze quite quickly as soon as I took a screenshot, but with these values it has been perfectly stable.

    How to Fix a GPU Driver Crash When Using Unreal Engine | Unreal Engine 5.0 Documentation
    Attachment 384039
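For anyone who would rather not click through regedit, the two timeouts can also be written as a .reg file. Both are DWORD values, in seconds, under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers, and Microsoft's documented name for the second one is `TdrDdiDelay`. The Python sketch below only builds the text of such a file and does not touch the registry itself. The function name `build_tdr_reg` is made up for this example, and 20 is simply the value discussed in this thread, not an official recommendation.

```python
GRAPHICS_DRIVERS_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"

def build_tdr_reg(tdr_delay=20, tdr_ddi_delay=20):
    """Return .reg file text that sets both TDR timeouts (values in seconds)."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{GRAPHICS_DRIVERS_KEY}]",
        # .reg files store DWORDs as 8 hex digits, so 20 becomes 00000014
        f'"TdrDelay"=dword:{tdr_delay:08x}',
        f'"TdrDdiDelay"=dword:{tdr_ddi_delay:08x}',
        "",
    ]
    return "\r\n".join(lines)

reg_text = build_tdr_reg()
print(reg_text)
```

The printed text can be saved to a file ending in .reg and imported with administrator rights. A restart is generally needed before changed TDR values take effect.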
    Last edited by NTN; 22 Jan 2023 at 07:51.


  7. Posts : 6
    Windows 10
    Thread Starter
       #7

    Okay, unfortunately that doesn't seem to have done anything - I had 2 crashes in short succession today while playing Overwatch. The first time the system froze completely, so I had to do a cold reboot; the second time only Overwatch crashed.

    First crash: Nothing in event viewer except for Kernel event saying the reboot after was unexpected
    Second crash: Nvidia driver message as per usual

    [Screenshot: crash message]


  8. Posts : 6
    Windows 10
    Thread Starter
       #8

    Latest developments - I decided it was time again to completely wipe and reinstall the Nvidia drivers just in case, and did that via DDU in safe mode.

    I also got myself a new monitor to replace the older of my two, as I wasn't really happy with it anymore. I didn't really think it would fix anything, but you never know.

    Of course I had another hard crash while playing Overwatch. I noticed sound and graphics stuttering, then an error popped up on screen and a split second later the screen froze completely: couldn't move the cursor, no sound, nothing. No Nvidia/driver crash messages in Event Viewer either.

    [Screenshot: Overwatch error]

    My latest memory dump analysis log (I know very little about it, but maybe it helps?):


    Code:
     !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************
    
    DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
    An attempt was made to access a pageable (or completely invalid) address at an
    interrupt request level (IRQL) that is too high.  This is usually
    caused by drivers using improper addresses.
    If kernel debugger is available get stack backtrace.
    Arguments:
    Arg1: fffff8051f1f99d0, memory referenced
    Arg2: 0000000000000002, IRQL
    Arg3: 0000000000000008, value 0 = read operation, 1 = write operation
    Arg4: fffff8051f1f99d0, address which referenced memory
    
    Debugging Details:
    ------------------
    
    Page 3d1fc not present in the dump file. Type ".hh dbgerr004" for details
    
    KEY_VALUES_STRING: 1
    
        Key  : Analysis.CPU.mSec
        Value: 3671
    
        Key  : Analysis.DebugAnalysisManager
        Value: Create
    
        Key  : Analysis.Elapsed.mSec
        Value: 6682
    
        Key  : Analysis.IO.Other.Mb
        Value: 3
    
        Key  : Analysis.IO.Read.Mb
        Value: 0
    
        Key  : Analysis.IO.Write.Mb
        Value: 7
    
        Key  : Analysis.Init.CPU.mSec
        Value: 1514
    
        Key  : Analysis.Init.Elapsed.mSec
        Value: 11411
    
        Key  : Analysis.Memory.CommitPeak.Mb
        Value: 96
    
        Key  : Bugcheck.Code.DumpHeader
        Value: 0xd1
    
        Key  : Bugcheck.Code.KiBugCheckData
        Value: 0xd1
    
        Key  : Bugcheck.Code.Register
        Value: 0xa
    
        Key  : WER.OS.Branch
        Value: vb_release
    
        Key  : WER.OS.Timestamp
        Value: 2019-12-06T14:06:00Z
    
        Key  : WER.OS.Version
        Value: 10.0.19041.1
    
    
    FILE_IN_CAB:  MEMORY.DMP
    
    BUGCHECK_CODE:  d1
    
    BUGCHECK_P1: fffff8051f1f99d0
    
    BUGCHECK_P2: 2
    
    BUGCHECK_P3: 8
    
    BUGCHECK_P4: fffff8051f1f99d0
    
    READ_ADDRESS:  fffff8051f1f99d0 
    
    IP_IN_PAGED_CODE: 
    nvlddmkm+c999d0
    Page 3d1fc not present in the dump file. Type ".hh dbgerr004" for details
    Page 3d1fc not present in the dump file. Type ".hh dbgerr004" for details
    fffff805`1f1f99d0 ??              ???
    
    BLACKBOXBSD: 1 (!blackboxbsd)
    
    
    BLACKBOXNTFS: 1 (!blackboxntfs)
    
    
    BLACKBOXPNP: 1 (!blackboxpnp)
    
    
    BLACKBOXWINLOGON: 1
    
    PROCESS_NAME:  System
    
    DPC_STACK_BASE:  FFFFF8050D89AFB0
    
    TRAP_FRAME:  fffff8050d89a440 -- (.trap 0xfffff8050d89a440)
    NOTE: The trap frame does not contain all registers.
    Some register values may be zeroed or incorrect.
    rax=fffff8051f1f99d0 rbx=0000000000000000 rcx=ffffb08614d44000
    rdx=00000000ff060080 rsi=0000000000000000 rdi=0000000000000000
    rip=fffff8051f1f99d0 rsp=fffff8050d89a5d8 rbp=ffffb086161c3000
     r8=fffff8050d89a700  r9=0000000000000000 r10=0000000000000000
    r11=fffff8050d89a610 r12=0000000000000000 r13=0000000000000000
    r14=0000000000000000 r15=0000000000000000
    iopl=0         nv up ei pl zr na po nc
    nvlddmkm+0xc999d0:
    Page 3d1fc not present in the dump file. Type ".hh dbgerr004" for details
    Page 3d1fc not present in the dump file. Type ".hh dbgerr004" for details
    fffff805`1f1f99d0 ??              ???
    Resetting default scope
    
    FAILED_INSTRUCTION_ADDRESS: 
    nvlddmkm+c999d0
    Page 3d1fc not present in the dump file. Type ".hh dbgerr004" for details
    Page 3d1fc not present in the dump file. Type ".hh dbgerr004" for details
    fffff805`1f1f99d0 ??              ???
    
    STACK_TEXT:  
    Page 3d1fc not present in the dump file. Type ".hh dbgerr004" for details
    fffff805`0d89a2f8 fffff805`0780d329     : 00000000`0000000a fffff805`1f1f99d0 00000000`00000002 00000000`00000008 : nt!KeBugCheckEx
    fffff805`0d89a300 fffff805`07808ee3     : ffffb086`16ca1590 fffff805`077fb294 4202a05f`20000000 00000000`00000000 : nt!KiBugCheckDispatch+0x69
    fffff805`0d89a440 fffff805`1f1f99d0     : fffff805`1e5fb874 00000000`00000064 fffff805`1e7e0f5c 00000000`01000010 : nt!KiPageFault+0x463
    fffff805`0d89a5d8 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nvlddmkm+0xc999d0
    
    
    SYMBOL_NAME:  nvlddmkm+c999d0
    
    MODULE_NAME: nvlddmkm
    
    IMAGE_NAME:  nvlddmkm.sys
    
    STACK_COMMAND:  .cxr; .ecxr ; kb
    
    BUCKET_ID_FUNC_OFFSET:  c999d0
    
    FAILURE_BUCKET_ID:  AV_CODE_AV_PAGED_IP_nvlddmkm!unknown_function
    
    OS_VERSION:  10.0.19041.1
    
    BUILDLAB_STR:  vb_release
    
    OSPLATFORM_TYPE:  x64
    
    OSNAME:  Windows 10
    
    FAILURE_ID_HASH:  {5e1e0500-a5d5-2e5c-dae5-9b72e39c3ef8}
    
    Followup:     MachineOwner
    ---------
    
    windbg> .hh dbgerr004
    Randomly googling some of the bug IDs and error codes led me to the page below, but I don't understand much of it - some sort of issue with accessing memory? Help please!

    https://learn.microsoft.com/en-us/wi...-less-or-equal
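The four arguments of this bug check can be read fairly mechanically. According to Microsoft's documentation for bug check 0xD1, the third parameter is 0 for a read, 1 for a write, and 2 or 8 for an execute. Here Arg3 is 8 and Arg1 equals Arg4, so the CPU tried to execute an instruction at an address inside nvlddmkm.sys that could not be accessed at that IRQL. The Python sketch below decodes the values pasted above; the helper names are made up for illustration.

```python
import re

# Access types for parameter 3 of bug check 0xD1, per Microsoft's documentation.
ACCESS_TYPES = {0: "read", 1: "write", 2: "execute", 8: "execute"}

def parse_args(text):
    """Extract Arg1..Arg4 from '!analyze -v' output as integers."""
    found = re.findall(r"Arg(\d):\s*([0-9a-fA-F]+)", text)
    return {int(n): int(value, 16) for n, value in found}

def describe_d1(args):
    """Summarise the arguments of a DRIVER_IRQL_NOT_LESS_OR_EQUAL bug check."""
    access = ACCESS_TYPES.get(args[3], "unknown")
    summary = f"{access} of {args[1]:#x} at IRQL {args[2]}"
    if args[1] == args[4]:
        summary += " (faulting address is the instruction pointer itself)"
    return summary

# Argument lines copied from the dump above.
DUMP_EXCERPT = """
Arg1: fffff8051f1f99d0, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000008, value 0 = read operation, 1 = write operation
Arg4: fffff8051f1f99d0, address which referenced memory
"""

d1_args = parse_args(DUMP_EXCERPT)
print(describe_d1(d1_args))
```

This only restates what the dump already says in a more readable form. It does not show whether the driver itself or the underlying hardware (for example RAM) is at fault.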


  9. Posts : 6
    Windows 10
    Thread Starter
       #9

    The logs pointing towards memory access issues made me run another series of memtests, and one of them froze partway through...

    So I decided to go for another component swap on the cheaper side and swapped out my RAM modules. I haven't had any crashes since - a week without crashes now.
    Last edited by Nuku2; 10 Feb 2023 at 15:55.


 
