BSOD irql not less or equal, AMDppm.sys


  1. ekw
    Posts : 6
    Windows 11 Pro 22H2
       #1



    Hi

    For some time now I have had occasional BSODs with the "irql not less or equal" stop code. It can happen when the PC is idle or while gaming. There is no noticeable pattern.

    The minidumps name amdppm.sys as the problem.

    I have tried every solution I have found on the topic online without result. I think it is a hardware issue and want help to pinpoint which component is broken so I can find a replacement.

    Uploaded dumpfiles to OneDrive: DESKTOP-10PKS30-(2023-06-27_16-45-41)

    The current Windows version is Windows 11 22H2 (22621.1848). The same issue occurred with Windows 10.

    I hope someone has time for a look.

    Thanks,
    Ekw


  2. Posts : 402
    Windows 10 and Windows 11
       #2

    The dumps are pretty much identical and they do point, as you observed, at the amdppm.sys driver. Despite the name, this is a Microsoft-supplied driver, so it's considered to be faultless. One of the things that amdppm.sys is responsible for is putting the processor in a low power state (to save power when idle), and I've seen many CPUs that are unstable at lower power states. Your system log (originally in Norwegian, which makes it hard for me to read) shows that you are having unrecoverable hardware errors...
    Code:
    Event[11034]
      Log Name: System
      Source: Microsoft-Windows-WHEA-Logger
      Date: 2023-06-25T17:56:59.6030000Z
      Event ID: 18
      Task: N/A
      Level: Error
      Opcode: Info
    
      Keyword: N/A
      User: S-1-5-19
      User Name: NT AUTHORITY\LOCAL SERVICE
      Computer: DESKTOP-10PKS30
      Description: 
    An unrecoverable hardware error has occurred.
    
    Reported by component: Processor Core
    Error Source: Machine Check Exception
    Error Type: Cache Hierarchy Error
    Processor APIC ID: 0
    
    The details view of this entry contains further information.
    A good way to tell whether this is a low power CPU problem is to change the power option profile to High Performance. This profile disables the low power states for the CPU, so if it doesn't BSOD with this power profile, it's almost certainly a CPU that's unstable at lower power states (C-States).


  3. Posts : 41,701
    windows 10 professional version 1607 build 14393.969 64 bit
       #3

    The installed RAM is running at a speed higher than the speeds supported by the CPU.

    Please turn off XMP.

    Indicate whether or not the installed RAM is on the Qualified Vendor List (QVL).


    Please modify the default language to English during the troubleshooting so that log files can be scanned and read.


    https://www.tenforums.com/tutorials/...dows-10-a.html


    Change Display Language in Windows 10


  4. ekw
    Posts : 6
    Windows 11 Pro 22H2
    Thread Starter
       #4

    Hi and thanks for answering.

    I have now changed to High Performance in the power options. Will see if the BSOD persists.

    As for the RAM, G.Skill F4-3600C16D-32GTZNC, I have checked both the motherboard (Asus B550-E Gaming) and CPU manufacturer sites, and it is on the QVL and compatible with XMP/D.O.C.P. So I guess that should be fine as is?

    Sorry for the Norwegian language in the files, forgot about that. I have changed to English and uploaded new zip file to Onedrive.
    DESKTOP-10PKS30-(2023-06-28_16-57-07)

    Thanks,
    Ekw


  5. Posts : 41,701
    windows 10 professional version 1607 build 14393.969 64 bit
       #5

    Please run Tuneup Plus and post a share link into this thread using OneDrive, Dropbox, or Google Drive.

    https://www.tenforums.com/attachment...p_plus_log.bat

    Batch files for use in BSOD debugging

    For any new BSOD post a new V2 share link into the newest post.


  6. ekw
    Posts : 6
    Windows 11 Pro 22H2
    Thread Starter
       #6

    Hi,
    Had a couple of BSODs tonight. This was while the power option was set to High Performance.

    I did run Tuneup Plus. I think these are the same solutions I have tried before, only manually in cmd. Uploaded the report and V2 log to OneDrive.


    DESKTOP-10PKS30-(2023-06-28_21-42-28)

    Tuneup_log 2023-06-28 at 21-38-28

    Thanks,
    Ekw


  7. Posts : 41,701
    windows 10 professional version 1607 build 14393.969 64 bit
       #7

    Please read this link on Windows Driver Verifier (WDV):

    Enable and Disable Driver Verifier in Windows 10.


    Learn the methods to recover from using the tool by booting into safe mode and running one or more of these commands:

    verifier /reset

    verifier /bootmode resetonbootfail


    Make a new restore point:

    Create System Restore Point in Windows 10


    Start WDV with these settings:
    a) Test all non-Microsoft drivers
    b) Test no Microsoft drivers
    c) Start the customized tests with the three settings displayed in the TF link

    Plan to run WDV for approximately 24 - 48 hours continuously after the last WDV BSOD.

    If there is no immediate BSOD, open an administrative command prompt and type or copy and paste:

    verifier /querysettings

    Post a share link into this thread using OneDrive, Dropbox, or Google Drive.

    For any BSOD post a new V2 share link into the newest post.


  8. Posts : 402
    Windows 10 and Windows 11
       #8

    I'll leave you with @zbook for now, but there is something really interesting in all your dumps. I don't know whether it's significant but it's very unusual. I'll explain.

    Here's a typical call stack (all your dumps are the same)...
    Code:
    8: kd> knL
     # Child-SP          RetAddr               Call Site
    00 fffff884`bfeef258 fffff806`604418a9     nt!KeBugCheckEx
    01 fffff884`bfeef260 fffff806`6043cf34     nt!KiBugCheckDispatch+0x69
    02 fffff884`bfeef3a0 fffff906`82683dfb     nt!KiPageFault+0x474
    03 fffff884`bfeef538 fffff806`82683c4b     0xfffff906`82683dfb
    04 fffff884`bfeef540 fffff806`82689b07     amdppm!ReadGenAddr+0x1f
    05 fffff884`bfeef570 fffff806`826818d3     amdppm!C2Idle+0x87
    06 fffff884`bfeef5a0 fffff806`602dd5ba     amdppm!AcpiCStateIdleExecute+0x23
    07 fffff884`bfeef5d0 fffff806`602dcf61     nt!PpmIdleExecuteTransition+0x42a
    08 fffff884`bfeefa10 fffff806`60431034     nt!PoIdle+0x361
    09 fffff884`bfeefc00 00000000`00000000     nt!KiIdleLoop+0x54
    You read these stacks from the bottom up. You can see that the bugcheck happened in frame 3 where a function identified only as 0xfffff906`82683dfb is called. That means that we don't have any symbols for the function at this address. Immediately following is the page fault that leads to the BSOD. We can be fairly sure already then, that the call to 0xfffff906`82683dfb was in error.
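
    Spotting a frame like that by eye gets tedious across several dumps. As a rough sketch only (the regular expression and the function name unsymbolized_frames are my own, and it assumes the stack text has been pasted in exactly as the debugger printed it), a few lines of Python can pick out every frame whose call site is a bare address rather than a module!function name:

```python
import re

# Stack text copied from the knL output above.
STACK = """\
00 fffff884`bfeef258 fffff806`604418a9     nt!KeBugCheckEx
01 fffff884`bfeef260 fffff806`6043cf34     nt!KiBugCheckDispatch+0x69
02 fffff884`bfeef3a0 fffff906`82683dfb     nt!KiPageFault+0x474
03 fffff884`bfeef538 fffff806`82683c4b     0xfffff906`82683dfb
04 fffff884`bfeef540 fffff806`82689b07     amdppm!ReadGenAddr+0x1f
05 fffff884`bfeef570 fffff806`826818d3     amdppm!C2Idle+0x87
06 fffff884`bfeef5a0 fffff806`602dd5ba     amdppm!AcpiCStateIdleExecute+0x23
07 fffff884`bfeef5d0 fffff806`602dcf61     nt!PpmIdleExecuteTransition+0x42a
08 fffff884`bfeefa10 fffff806`60431034     nt!PoIdle+0x361
09 fffff884`bfeefc00 00000000`00000000     nt!KiIdleLoop+0x54
"""

# Columns: frame number (hex), Child-SP, RetAddr, call site.
FRAME = re.compile(r"^\s*([0-9a-f]{2})\s+(\S+)\s+(\S+)\s+(\S+)\s*$")


def unsymbolized_frames(text):
    """Return (frame_number, address) for frames whose call site is a bare address."""
    hits = []
    for line in text.splitlines():
        match = FRAME.match(line)
        if not match:
            continue  # header line or anything that is not a stack frame
        call_site = match.group(4)
        if call_site.startswith("0x"):
            address = int(call_site.replace("`", ""), 16)
            hits.append((int(match.group(1), 16), address))
    return hits


for number, address in unsymbolized_frames(STACK):
    print(f"frame {number}: no symbol for {address:#x}")
```

    For this stack it reports only frame 3, the same frame singled out above.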

    Displaying the details of frame 4 (the one before the problem frame) we see...
    Code:
    8: kd> .frame /r 4
    04 fffff884`bfeef540 fffff806`82689b07     amdppm!ReadGenAddr+0x1f
    rax=0000000000000000 rbx=0000000000000000 rcx=ffff820f88df2948
    rdx=0000000000000414 rsi=ffffa50172151180 rdi=ffff820f88df2948
    rip=fffff80682683c4b rsp=fffff884bfeef540 rbp=ffff820f89e62010
     r8=0000000000000000  r9=0000000000000414 r10=ffff820f88df2948
    r11=0000000000000048 r12=00000000ffffffff r13=000000000970010a
    r14=0000000000000000 r15=0000000000000000
    iopl=0         nv up di ng nz na po nc
    cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00040086
    amdppm!ReadGenAddr+0x1f:
    fffff806`82683c4b 488bd0          mov     rdx,rax
    Notice the RIP register value (0xfffff80682683c4b), this is the instruction pointer and (as expected) it's pointing at the amdppm!ReadGenAddr+0x1f function. There is nothing amiss with this call, and if we disassemble the first few instructions we can see it doing useful work...
    Code:
    8: kd> u amdppm!ReadGenAddr+0x1f
    amdppm!ReadGenAddr+0x1f:
    fffff806`82683c4b 488bd0          mov     rdx,rax
    fffff806`82683c4e 4885c0          test    rax,rax
    fffff806`82683c51 7422            je      amdppm!ReadGenAddr+0x49 (fffff806`82683c75)
    fffff806`82683c53 418a4a02        mov     cl,byte ptr [r10+2]
    fffff806`82683c57 84c9            test    cl,cl
    fffff806`82683c59 7403            je      amdppm!ReadGenAddr+0x32 (fffff806`82683c5e)
    fffff806`82683c5b 48d3ea          shr     rdx,cl
    fffff806`82683c5e 418a4a01        mov     cl,byte ptr [r10+1]
    Now though, if we look at the details of frame 3 (the one with the strange function call)...
    Code:
    8: kd> .frame /r 3
    03 fffff884`bfeef538 fffff806`82683c4b     0xfffff906`82683dfb
    rax=0000000000000000 rbx=0000000000000000 rcx=ffff820f88df2948
    rdx=0000000000000414 rsi=ffffa50172151180 rdi=ffff820f88df2948
    rip=fffff90682683dfb rsp=fffff884bfeef538 rbp=ffff820f89e62010
     r8=0000000000000000  r9=0000000000000414 r10=ffff820f88df2948
    r11=0000000000000048 r12=00000000ffffffff r13=000000000970010a
    r14=0000000000000000 r15=0000000000000000
    iopl=0         nv up di ng nz na po nc
    cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00040086
    fffff906`82683dfb ??              ???
    You can see from the (???) characters that the memory location 0xfffff906`82683dfb is invalid - there's nothing at that location. That's why we get the page fault and BSOD.

    Now look at the RIP register in this frame. It's also pointing (as expected) at address 0xfffff906`82683dfb, which is why we've tried to execute code at this address - that's what the RIP register does, it points to the address of the next instruction to be executed.

    But now compare the two RIP values: 0xfffff806`82683c4b and 0xfffff906`82683dfb. Look at the high 4 bytes of each; normally these would be the same for all calls between functions in the same driver (like amdppm.sys). But here, one is 0xfffff906 and the other is 0xfffff806.

    If we now look at whatever is at the location 0xfffff806`82683dfb (i.e. swapping that 9 for what probably should be an 8) we see...
    Code:
    8: kd> ln 0xfffff806`82683dfb
    Browse module
    Set bu breakpoint
    
    (fffff806`82683da0)   amdppm!ReadIoMemRaw+0x5b   |  (fffff806`82683e08)   amdppm!ReadIoMemRawEx
    We can see that it's another function within the same amdppm.sys driver. I'm left wondering whether there has been a bit flip that caused the RIP register to point way outside the driver code. Remember that 8 in binary is B1000 whilst 9 is B1001, so the two values differ in a single bit.
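
    The single-bit claim is easy to check with a bit of arithmetic. A minimal sketch in Python, using only the two addresses from the dump output above (the variable names are mine): XOR the faulting address with the plausible one and list which bit positions differ.

```python
# Addresses taken from the dump output quoted above.
faulting_rip = 0xFFFFF90682683DFB   # frame 3: nothing is mapped at this address
plausible_rip = 0xFFFFF80682683DFB  # resolves to amdppm!ReadIoMemRaw+0x5b

# XOR leaves a 1 only in the bit positions where the two values differ.
diff = faulting_rip ^ plausible_rip
flipped_bits = [i for i in range(64) if (diff >> i) & 1]

print(f"xor          = {diff:#018x}")
print(f"bits flipped = {flipped_bits}")
```

    The XOR comes out as 0x0000010000000000, so exactly one bit (bit 40, counting from 0 at the low end) separates the two addresses, which is what a hardware bit flip would look like.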

    I can't explain why this bit flip might be happening. It could be RAM, but it's strange that it happens consistently and in the same memory location - although the amdppm.sys module is loaded in different memory locations each time, that bit flip is the same in all dumps.

    I will think some more on this, but for now I'll leave you with @zbook.

    - - - Updated - - -

    I know I said I'd wait, but I spent most of yesterday afternoon talking over your dumps with an acknowledged BSOD dump analysis guru, someone who knows infinitely more than I do (and I'm no dummy). He agrees that your problem looks to be a hardware bit flip. In the five dumps you provided, three fail on logical processor 9 and the other two fail on logical processor 8 (of 24 logical processors). That points very strongly at the CPU being responsible. Another big clue that this is likely to be a CPU problem is that it only happens in amdppm.sys, which is only called when a processor goes idle. Part of the job of amdppm.sys is to look for useful work (processing a DPC queue, for example) and another part is to modify the power state of the processor to use less power and produce less heat. I still believe that this bit flip is happening when the CPU enters a lower power state; as I mentioned, I've seen other CPUs become unstable at lower power states.

    What you want to try and do is to stop the CPU entering any low power states. I'm not very familiar with AMD CPUs, but when you switched to the High Performance power option was that the default Windows High Performance option or an AMD option? Please try the Windows High Performance option if you used something else earlier.

    You might also try disabling C-States in the BIOS. The C-States are the various power states the CPU can be in; disabling them (usually) means that it stays in a high power mode.

    You might also try changing the Processor Power Management settings in whatever power option profile you use so that the minimum processor state is 99% and the Maximum processor state is also 99%. In theory this should also stop C-State switching.
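
    If you'd rather script that than click through the Power Options dialog, here is a rough sketch in Python. It only builds and prints the powercfg command lines by default. The aliases SCHEME_CURRENT, SUB_PROCESSOR, PROCTHROTTLEMIN and PROCTHROTTLEMAX are what I'd expect powercfg /aliases to list, so please check them on your machine first. The function name and the --apply switch are made up for this example, and it only sets the plugged-in (AC) values.

```python
import subprocess
import sys


def processor_state_commands(percent=99):
    """Build powercfg command lines that pin min and max processor state (AC power)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    value = str(percent)
    return [
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT",
         "SUB_PROCESSOR", "PROCTHROTTLEMIN", value],
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT",
         "SUB_PROCESSOR", "PROCTHROTTLEMAX", value],
        # Re-activate the current scheme so the new values take effect.
        ["powercfg", "/setactive", "SCHEME_CURRENT"],
    ]


if __name__ == "__main__":
    for command in processor_state_commands(99):
        print(" ".join(command))
        # Hypothetical --apply switch: only run the commands when asked to, on Windows.
        if sys.platform == "win32" and "--apply" in sys.argv:
            subprocess.run(command, check=True)
```

    Run it from an administrative prompt if you use --apply. Without the switch it just prints the three commands so you can review them or paste them in by hand.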

    With all respect to @zbook, I'm certain that this is hardware; the System log errors I highlighted earlier confirm this. My BSOD dump guru also thinks that Driver Verifier is unlikely to find anything, but by all means try it.

    I'm not a hardware expert at all, but there may be some possibility of modifying CPU voltages perhaps, to mitigate this problem. Others who know more about this than I may be able to help.

    The bottom line, I am certain, is that your CPU is unstable at low power (when idle). I hesitate to recommend a new CPU but I really think that's the best long-term solution.

    Again, apologies to @zbook for jumping back in so soon.


  9. ekw
    Posts : 6
    Windows 11 Pro 22H2
    Thread Starter
       #9

    Hi again,

    I did run Windows Driver Verifier for about 26 hours. No BSOD and results looked fine, I forgot to screenshot the result for you.

    The High Performance option I used is the one within Windows power settings.

    I have now disabled C-States in BIOS. Will see if the BSOD persists.

    Thanks to you guys I now know that it's the CPU that causes the problems and might need replacement.

    Thanks,
    Ekw

    - - - Updated - - -

    Hi

    I set the C-States back to auto. I ran into some "page_fault_in_nonpaged_area" BSODs while they were disabled.

    I will get the CPU replaced.

    Thank you for your service.

    Ekw


  10. Posts : 41,701
    windows 10 professional version 1607 build 14393.969 64 bit
       #10

    WDV was used to potentially confirm that there were no misbehaving drivers.

    Which customized tests were used?

    No WDV share link results were posted.




    Many of the BSOD bugchecks were D1; others were WHEA 124.

    WHEA 124 typically blames the CPU (Intel or AMD).

    Unfortunately this debugging blame is too often incorrect.

    Swap testing confirmation is needed so that time and money are well spent.

    Microsoft recommends using WDV for BSOD D1 bugchecks when the routine commands fail to display misbehaving drivers.

    https://learn.microsoft.com/en-us/wi...-less-or-equal




    There was no hardware swap confirmation yet.

    There may be a problem with the motherboard.

    It's speculation until there is a swap confirmation.



    It's best to swap test before replacing so that you've got a confirmation of the malfunctioning hardware component.

    If you're not able to get a confirmation then make sure that you've got an acceptable return option.
    Last edited by zbook; 30 Jun 2023 at 21:33.


 
