Random BSODs, multiple stop codes, 90% of the time under system load


  1. Posts : 8
    Windows 10 Pro
       #1


    I built this computer about 3 months ago, and I have been getting random BSODs ever since. Sometimes it's once a day, sometimes more. It usually happens while playing a game, but not always. I have tried every troubleshooting tip thrown my way and have nowhere left to turn. A clean OS install did not solve the issue, and memtest86 returned no errors on a 10 hour run (though I still suspect the RAM). Stress-testing the system does NOT cause it to crash, even at 100% load across the board. The errors are totally random and almost always caused by different processes. I get multiple stop codes, not always the same one.

    The two most common stop codes I receive are IRQL_NOT_LESS_OR_EQUAL and PAGE_FAULT_IN_NONPAGED_AREA
    As far as I can tell, the only single module that has caused more than one crash is tcpip.sys, but I don't know enough about the error logs to delve any further into them, so I am posting here.

    EDIT: Here is a list of all of the things I have tried (that I can currently think of), in no particular order.
    - Updated all possible drivers
    - Clean OS install
    - sfc /scannow
    - DISM /Online /Cleanup-Image /RestoreHealth
    - Changed PSU cables (this solved a separate issue)
    - Ran 10 hour memtest86 (no errors)
    - Used Prime95 + Furmark to load the system to 100% and attempt to force-reproduce issue (nope)

    Attachment 208104

  2. Ztruker's Avatar
    Posts : 13,551
    Windows 10 Pro X64 21H1 19043.1043
       #2

    The latest dump shows a problem with your XFX Radeon Vega64.
    Code:
    BugCheck D1, {276c9e8, 7, 0, fffff80e88eab6fe}
    *** WARNING: Unable to verify timestamp for atikmdag.sys
    *** ERROR: Module load completed but symbols could not be loaded for atikmdag.sys
    Probably caused by : atikmdag.sys ( atikmdag+10b6fe )
    Your current driver is dated 5/15/2018.
    The one on the web site is dated 10/11/2018.

    I suggest completely uninstalling the display drivers using Display Driver Uninstaller (DDU) from WagnardMobile here: Official Download Here. Do this in Safe Mode. Then get the latest driver from here: Radeon™ RX Vega 64 Drivers & Support

    Then do a custom install of only the graphics driver, see if that makes a difference.

    You also have one dump showing memory corruption. This doesn't necessarily mean you have bad RAM as this can be caused by a device driver too but it does indicate a possible RAM problem. If updating the video driver doesn't help, remove half of your RAM, run with 16GB and see how things are. If it still fails swap the installed and removed RAM and test some more. If no problems then one of the removed 8GB DIMMS is defective, so swap RAM until you narrow it down to which is bad.

    The alternative is to run a memory tester. I would still suggest removing half the RAM as testing 32GB takes a long, long time. 16GB can take 12 to 15 hours so overnight is the best time to do it.
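    As a rough planning aid, the 12 to 15 hour figure quoted above for 16GB can be scaled to other RAM sizes. This is a minimal sketch that assumes test time grows linearly with the amount of RAM, which is an assumption and not something the post states. Actual times also depend on RAM speed and the number of passes. The function name estimate_hours is hypothetical.

```python
def estimate_hours(ram_gb, ref_gb=16, ref_hours=(12, 15)):
    """Scale the quoted 12-15 hour figure for 16GB to another RAM size.

    Assumes linear scaling with RAM size (an assumption, not a measured result).
    """
    scale = ram_gb / ref_gb
    low, high = ref_hours
    return (low * scale, high * scale)


for size in (16, 32):
    low, high = estimate_hours(size)
    print(f"{size}GB: roughly {low:.0f} to {high:.0f} hours")
```

    Under that assumption, a full 32GB run would not fit in a single night, which is consistent with the advice to test half the RAM at a time.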

    ===================================================
    Follow this tutorial: MemTest86+ - Test RAM - Windows 10 Forums

    MemTest86+ is a diagnostic tool designed to test Random Access Memory (RAM) for faults. MemTest86+ will verify that:

    • RAM will accept and keep random patterns of data sent to it
    • There are no errors when different parts of memory try to interact
    • There are no conflicts between memory addresses


    MemTest86+ runs from bootable media to isolate the RAM from the system; no other components are taken into account during the test.

    Warning
    MemTest86+ needs to run for at least 8 passes to be anywhere near conclusive; anything less will not give a complete analysis of the RAM.


    If you are asked to run MemTest86+ by a Ten Forums member make sure you run the full 8 passes for conclusive results. If you run less than 8 passes you will be asked to run it again.

    Note
    MemTest86+ has been known to discover errors in RAM in later passes than the eighth pass. This is for information only; if you feel there is a definite problem with the RAM and 8 passes have shown no errors feel free to continue for longer.


    Running 8 passes of MemTest86+ is a long and drawn out exercise and the more RAM you have the longer it will take. It's recommended to run MemTest86+ just before you go to bed and leave it overnight.

    Take a picture when done and post in the forum please.


  3. Posts : 8
    Windows 10 Pro
    Thread Starter
       #3

    I actually already did the DDU wipe of my old drivers and installed the new one, just earlier today. The reason I am using the build dated 05/15/2018 is that it is the latest "full release" driver, and the driver dated 10/11/2018 is a beta driver, which I have not had good luck with in the past. That latest error on the GPU was likely caused by my turning up many of the settings in Radeon Settings, including overvoltage and overclocking. These settings caused system instability (i.e. that last crash), after which I reverted them to default values.

    I ran memtest86 for 5 passes with all 4 DIMMS installed last time I did it, which turned up no errors. I will absolutely run the test with only 16GB installed for 12 passes, and report back when I have done so.

    You only explicitly addressed 2 of my dumps in this thread, are there any other errors in the other ~5 dumps in my logs? If so, what do you recommend regarding those?

    Thanks so much for your replies.


  4. Posts : 8
    Windows 10 Pro
    Thread Starter
       #4

    I ran memtest86 overnight with 2 of my 4 RAM sticks installed. The test completed a total of 12 passes and returned no errors. Tonight I will run the same test with the other 2 sticks and report back.

  5. Ztruker's Avatar
    Posts : 13,551
    Windows 10 Pro X64 21H1 19043.1043
       #5

    100218-9484-01
    BugCheck D1, {ffffe000129b6f00, 2, 0, fffff808871eefbf}
    Probably caused by : tcpip.sys ( tcpip!InetAcquirePort+257 )
    DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)

    100318-8109-01
    BugCheck 50, {ffff8308be7d5170, 0, fffff800d6f71a80, c}
    Could not read faulting driver name
    Probably caused by : ntkrnlmp.exe ( nt!IopXxxControlFile+8c0 )
    PAGE_FAULT_IN_NONPAGED_AREA (50)

    100318-8171-01
    BugCheck A, {48, 2, 0, fffff8036558ff45}
    Probably caused by : memory_corruption ( nt!MiCompleteProtoPteFault+365 )
    IRQL_NOT_LESS_OR_EQUAL (a)

    100618-7859-01
    BugCheck 50, {ffffc78053700188, 0, fffff808600dffbe, 0}
    Could not read faulting driver name
    Probably caused by : dxgkrnl.sys ( dxgkrnl!ADAPTER_RENDER::DdiPresent+c6 )
    PAGE_FAULT_IN_NONPAGED_AREA (50)

    101118-8125-01
    BugCheck 3B, {c0000005, fffff80993c7f10a, ffffea0f3c9b7660, 0}
    Probably caused by : tcpip.sys ( tcpip!InetAcquirePort+3a2 )
    SYSTEM_SERVICE_EXCEPTION (3b)

    101218-7859-01
    BugCheck D1, {276c9e8, 7, 0, fffff80e88eab6fe}
    *** WARNING: Unable to verify timestamp for atikmdag.sys
    *** ERROR: Module load completed but symbols could not be loaded for atikmdag.sys
    *** WARNING: Unable to verify timestamp for win32k.sys
    *** ERROR: Module load completed but symbols could not be loaded for win32k.sys
    Probably caused by : atikmdag.sys ( atikmdag+10b6fe )
    DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)

    As you can see the dumps jump around a bit. Are you overclocking? If so set everything to nominal and see how things are.
    Check memory clocking and voltages as well.
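    To make the "dumps jump around" observation concrete, the summaries above can be tallied by stop code and by blamed module. This is a minimal sketch, assuming the dump text has been pasted into a string. The sample data is abridged from the six dumps listed in this post (symbol offsets trimmed), and the names DUMP_SUMMARY and tally are hypothetical.

```python
import re
from collections import Counter

# Abridged from the six dump summaries in the post above.
DUMP_SUMMARY = """
100218-9484-01
BugCheck D1, {ffffe000129b6f00, 2, 0, fffff808871eefbf}
Probably caused by : tcpip.sys
100318-8109-01
BugCheck 50, {ffff8308be7d5170, 0, fffff800d6f71a80, c}
Probably caused by : ntkrnlmp.exe
100318-8171-01
BugCheck A, {48, 2, 0, fffff8036558ff45}
Probably caused by : memory_corruption
100618-7859-01
BugCheck 50, {ffffc78053700188, 0, fffff808600dffbe, 0}
Probably caused by : dxgkrnl.sys
101118-8125-01
BugCheck 3B, {c0000005, fffff80993c7f10a, ffffea0f3c9b7660, 0}
Probably caused by : tcpip.sys
101218-7859-01
BugCheck D1, {276c9e8, 7, 0, fffff80e88eab6fe}
Probably caused by : atikmdag.sys
"""

# Match the two line types we care about, tolerating leading indentation.
BUGCHECK_RE = re.compile(r"^[ \t]*BugCheck ([0-9A-Fa-f]+),", re.MULTILINE)
CAUSE_RE = re.compile(r"^[ \t]*Probably caused by : (\S+)", re.MULTILINE)


def tally(text):
    """Count how often each stop code and each blamed module appears."""
    codes = Counter(code.upper() for code in BUGCHECK_RE.findall(text))
    modules = Counter(CAUSE_RE.findall(text))
    return codes, modules


codes, modules = tally(DUMP_SUMMARY)
print("Stop codes:", codes.most_common())
print("Modules:", modules.most_common())
```

    Across the six dumps this gives four different stop codes and five different blamed modules, with only tcpip.sys appearing twice. A spread like that is why a single faulty driver is hard to pin down from the dumps alone.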


  6. Posts : 8
    Windows 10 Pro
    Thread Starter
       #6

    Beyond the XMP profile required to run any RAM speeds above 2133 MHz, nothing else is overclocked. My RAM is rated at 3200 MHz, which is the speed it was set to, but when I attempted to boot with only two sticks installed at that speed, the system refused to POST. Therefore, when I ran memtest86 I did so at 2133 MHz. I don't have the tech knowledge to play with RAM timings/voltages manually, so beyond automatically setting XMP, nothing else was changed. If I need to set specific RAM timings/voltages manually, I know where in the BIOS to do that, but not how to determine what the appropriate values are for my specific setup.

    As far as setting all values to nominal, I will do so and see if the system is stable. If this turns out to be the problem, what can I do to have stable RAM at 3200 MHz, if anything? Or am I limited permanently to 2133 MHz?

  7. Ztruker's Avatar
    Posts : 13,551
    Windows 10 Pro X64 21H1 19043.1043
       #7

    I'm not a hardware guy. I've never built my own PC nor have I ever messed with clock speed or system voltages so I really don't know, sorry.

    I'll see if I can get one of the other guys to drop in and help with clocking and voltage questions.

    Edit: Looking at the manual, I did see this: Attachment 208309. Not sure if that means those are the only speeds that are supported for RAM or not. Let me see if I can find someone else to help.


  8. Posts : 8
    Windows 10 Pro
    Thread Starter
       #8

    So, do you think the RAM is the definite cause of my BSODs? If so, do you think RMAing the RAM would solve my problem?

  9. Ztruker's Avatar
    Posts : 13,551
    Windows 10 Pro X64 21H1 19043.1043
       #9

    No, I don't know what the cause is. RAM is a possibility but there is nothing that definitely says so.

    You have 32GB, so run with 16GB for a while and see if you get any BSODs. If not, then one of the removed RAM DIMMs is probably defective. If you do get a BSOD, then swap the removed and installed RAM and test some more. If doing this doesn't make any difference, then it's unlikely RAM is the problem.
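    The half-and-half test above is a small decision procedure, sketched below in Python. The function name interpret_swap_test and the wording of the returned strings are hypothetical. The last branch (first half crashes, second half is stable) is not spelled out in the post and is inferred here as the mirror image of the first case.

```python
def interpret_swap_test(first_half_crashed, second_half_crashed=None):
    """Map the outcome of the two-half RAM test to a next step or conclusion.

    first_half_crashed:  did a BSOD occur with the first 16GB installed?
    second_half_crashed: same question after swapping in the other 16GB,
                         or None if that test has not been run yet.
    """
    if not first_half_crashed:
        # Stable on the first half: suspicion falls on the DIMMs that were removed.
        return "suspect removed DIMMs"
    if second_half_crashed is None:
        # Crashed on the first half: the post says to swap and test again.
        return "swap halves and retest"
    if second_half_crashed:
        # Swapping made no difference: RAM is unlikely to be the cause.
        return "RAM unlikely"
    # Inferred case: only the first half crashes, so it likely holds the bad DIMM.
    return "suspect first-half DIMMs"


print(interpret_swap_test(first_half_crashed=True))
print(interpret_swap_test(first_half_crashed=True, second_half_crashed=True))
```

    Since the BSODs here are intermittent, each "no crash" result only counts after running long enough that a crash would normally have occurred.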

    Since you've already beat on it with Prime95 and FurMark I would suggest trying Driver Verifier as the next step, as follows:
    ===================================================
    Driver Verifier is a diagnostic tool built into Windows 10. It is designed to verify both native Microsoft drivers and third party drivers. Driver Verifier's verification process involves putting heavy stress on drivers with the intention of making bad, outdated, incompatible or misbehaving drivers fail. The required result is a BSOD (Blue Screen of Death) which will generate a crash dump for debugging purposes. Machines exposed to Driver Verifier may run very sluggishly due to the stress being applied to the drivers.

    Driver Verifier - Enable and Disable in Windows 10

    Pay close attention to PART TWO and make sure the correct boxes are checked.

    Warning
    It is not advised to run Driver Verifier for more than 48 hours at a time. Disable Driver Verifier after 48 hours or after receiving a BSOD, whichever happens soonest.

    Always create a Restore Point prior to enabling Driver Verifier so you have a way to recover if it goes haywire. Seldom does but it can happen.

    What we're looking for is a verifier generated BSOD with a mini dump that will tell us what driver caused it. If you get a BSOD, rerun the Beta log collector and upload the resulting zip file.
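    The stop rule above (disable after 48 hours or after the first BSOD, whichever comes first) can be written down as a small check. This is a minimal sketch and does not call Driver Verifier itself. The names disable_by and should_disable are hypothetical, and the start time in the usage lines is an arbitrary example.

```python
from datetime import datetime, timedelta

# Maximum recommended run time from the warning above.
MAX_RUN = timedelta(hours=48)


def disable_by(enabled_at):
    """Return the latest time Driver Verifier should still be enabled."""
    return enabled_at + MAX_RUN


def should_disable(enabled_at, now, bsod_seen=False):
    """True once a BSOD has occurred or the 48 hour window has elapsed."""
    return bsod_seen or now >= disable_by(enabled_at)


# Example with an arbitrary start time (not taken from the thread).
start = datetime(2018, 10, 13, 20, 0)
print("Disable no later than:", disable_by(start))
print(should_disable(start, start + timedelta(hours=5)))
```

    Writing down the time Driver Verifier was enabled makes it easy to stick to the 48 hour limit.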


  10. Posts : 8
    Windows 10 Pro
    Thread Starter
       #10

    Thanks again for your replies. I've encountered a new problem: I cannot get the RAM to boost back to 3200 at all. The system refuses to POST now at anything other than 2133, even though it was "working" at 3200 for months. Anyway, I will run Driver Verifier with the RAM at its stock speed, and report back.


 
Windows 10 Forums is an independent web site and has not been authorized, sponsored, or otherwise approved by Microsoft Corporation. "Windows 10" and related materials are trademarks of Microsoft Corp.

© Designer Media Ltd