Mystery shut down--confirm diagnosis, advice on next steps


  1. Posts : 35
    Windows 10 Pro for Workstations
       #1


    This is an update to an ~year old post of mine which describes rare but super annoying mystery shutdowns.

    Since that post, mystery shutdowns have happened 9 times: on 2019-10-20, 2019-10-27, 2019-11-20, 2019-11-21, 2019-12-01, 2019-12-06, 2019-12-25, 2020-03-29, and 2020-06-06 (yesterday morning). That is about once per month on average, with the frequency dropping to about once every 3 months so far in 2020. Still, it happened again yesterday, and I have got to find the culprit...
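
    As a quick check on that frequency estimate, here is a minimal Python sketch that computes the number of days between consecutive shutdowns. It uses only the dates listed above; the variable names are illustrative.

```python
from datetime import date

# Shutdown dates exactly as listed in the post above.
shutdowns = [
    date(2019, 10, 20), date(2019, 10, 27), date(2019, 11, 20),
    date(2019, 11, 21), date(2019, 12, 1), date(2019, 12, 6),
    date(2019, 12, 25), date(2020, 3, 29), date(2020, 6, 6),
]

# Days between each consecutive pair of shutdowns.
gaps = [(later - earlier).days for earlier, later in zip(shutdowns, shutdowns[1:])]
mean_gap = sum(gaps) / len(gaps)

print("gaps in days:", gaps)
print("mean gap in days:", mean_gap)
```

    The two gaps that end in 2020 are far longer than any of the 2019 gaps, which matches the impression that the shutdowns have become rarer.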

    One suggestion to my original post was to see if the shutdown happens in Safe Mode or if I do a clean boot. I am not sure if that is helpful, since the problem is so sporadic.

    Instead, surely somewhere in Microsoft's Byzantine event logs, there must be a clue as to what is going on. Since my original post, I have gotten better at capturing events and analysing them.

    This time, ultimately using Windbg on a dump file, I am fairly sure that I have narrowed the issue down to my PCI driver, especially HDAudBus.

    To keep this initial post short and readable, the details of why I concluded that are in my next reply to this post ("REPLY #1: why I think that PCI, especially HDAudBus, is the culprit"). I would appreciate it if you pore through that and let me know whether you agree with my conclusion. I have another reply after that, in which I discuss, for completeness, some other things I observed but which I guess are not issues ("REPLY #2: a mystery logon, Client License Service").

    If I am correct that it is my PCI driver, especially HDAudBus, what do I do next about this?

    Is this something that I need to report to Dell or Microsoft?

    My hardware has been stable for over a year (it's a Dell Precision 7530 laptop). It is running Windows 10 Pro for Workstations, updated to the Windows 10 May 2020 Update (version 2004) three days ago. I run Dell's SupportAssist frequently, and always update all my drivers, BIOS, etc. I ran it today, and there is nothing to install except an optional "STMicroelectronics Free Fall Data Protection Driver". For some reason, this software refuses to install: it remains stuck on "Validating..." for a long time, so I abort.

    Finally, this link advises the following steps after a bugcheck: 1-Driver verifier and 2-Memtest. (This was also part of dalchina's answer to my original post.) I ran a memory test (Windows Memory Diagnostic, mdsched.exe) as described here and it found no memory problems. My computer is currently running Driver Verifier, but it is ONLY stress testing HDAudBus. I have seen no shutdown so far. I will let it run for up to 2 days.


  2. Posts : 39,994
    windows 10 professional version 1607 build 14393.969 64 bit
       #2

    Please see BSOD posting instructions:
    BSOD - Posting Instructions


  3. Posts : 35
    Windows 10 Pro for Workstations
    Thread Starter
       #3

    REPLY #1: why I think that PCI, especially HDAudBus, is the culprit

    Here are the events from Event Viewer --> Windows Logs --> System that are closest in time to, and bracket, yesterday morning's shutdown (in descending time order):

    Code:
    Level       Date and Time          Source                      EventID    Task Category    Description [copied from text area]
    Error       2020-06-06 08:53:15    WER-SystemErrorReporting    1001       None             The computer has rebooted from a bugcheck.  The bugcheck was: 0x0000009f (0x0000000000000003, 0xffff9201a1d2a360, 0xffff908f4f447ba0, 0xffff9201ab1a7820). A dump was saved in: C:\WINDOWS\MEMORY.DMP. Report Id: 6347cfd0-84d0-4c7c-90e2-73d0ef53c4f4.
    Error       2020-06-06 08:53:13    Service Control Manager     7023       None             The Intel(R) PROSet/Wireless Zero Configuration Service service terminated with the following error: %%2147770990
    Error       2020-06-06 08:53:11    EventLog                    6008       None             The previous system shutdown at 12:20:28 AM on 6/6/2020 was unexpected.
    Critical    2020-06-06 08:52:58    Kernel-Power                  41       (63)             The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
    Error       2020-06-06 00:19:02    Service Control Manager     7009       None             A timeout was reached (30000 milliseconds) while waiting for the Client License Service (ClipSVC) service to connect.

    Event #5 above is the Error that was logged yesterday morning at 00:19:02 my local time, presumably just before the shutdown. Its message: A timeout was reached (30000 milliseconds) while waiting for the Client License Service
    Does anyone know if this Error is so severe that it would shut down my computer? Or is it a red herring?

    The first 4 events above occurred after I noticed the shutdown and then tried to use my computer at 08:53. I am particularly interested in events #1 and #2.

    #2 whose message is
    The Intel(R) PROSet/Wireless Zero Configuration Service service terminated with the following error: %%2147770990
    appears to have occurred after I started my computer, so I do not think that it is the cause of the shutdown.
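
    For readers who want to look up that error number, here is a minimal Python sketch that converts it to hexadecimal and splits it into bit fields. It ASSUMES the number after "%%" is a standard 32-bit HRESULT (severity in bit 31, facility in bits 16-26, code in bits 0-15); the event itself does not say so, and the meaning of the facility-specific code is defined by the service, so nothing about its meaning is inferred here. The function name is hypothetical.

```python
def split_hresult(value: int) -> dict:
    """Split a 32-bit value into HRESULT bit fields (assumed layout)."""
    return {
        "hex": f"0x{value:08X}",
        "failure": bool((value >> 31) & 0x1),  # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,     # bits 16-26: facility code
        "code": value & 0xFFFF,                # bits 0-15: facility-specific code
    }

# The number after "%%" in event 7023 above.
fields = split_hresult(2147770990)
print(fields)
```

    The hexadecimal form is usually the more useful one to search for.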

    #1 seems more relevant, since its message
    The computer has rebooted from a bugcheck...A dump was saved in: C:\WINDOWS\MEMORY.DMP...
    seems to indicate that a bugcheck was occurring around the time of the mystery shutdown.
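
    The bugcheck code and its four parameters can be pulled out of that event message mechanically. The following is a minimal Python sketch, assuming the message always has the "The bugcheck was: CODE (P1, P2, P3, P4)" shape seen in event 1001 above; the lookup table holds only the one name that the tools quoted later in this thread report for 0x9F.

```python
import re

# Message text copied from event 1001 above (cut off after the parameters).
message = (
    "The computer has rebooted from a bugcheck.  The bugcheck was: "
    "0x0000009f (0x0000000000000003, 0xffff9201a1d2a360, "
    "0xffff908f4f447ba0, 0xffff9201ab1a7820)."
)

# Capture the bugcheck code and the comma-separated parameter list.
match = re.search(r"bugcheck was: (0x[0-9a-fA-F]+) \(([^)]*)\)", message)
code = int(match.group(1), 16)
params = [int(p, 16) for p in match.group(2).split(", ")]

# Name as reported by the tool output quoted in this thread.
KNOWN_CODES = {0x9F: "DRIVER_POWER_STATE_FAILURE"}

print(f"bugcheck 0x{code:X}: {KNOWN_CODES.get(code, 'unknown')}")
for index, value in enumerate(params, start=1):
    print(f"  parameter {index}: 0x{value:016X}")
```

    Having the code and first parameter as plain numbers makes it easier to compare this crash against earlier ones.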

    The only way I know to examine that bugcheck is to look at the memory dump files.

    I first installed BlueScreenView, and here is its summary output from its top panel:
    Code:
    060620-15203-01.dmp	2020-06-06 00:21:00	DRIVER_POWER_STATE_FAILURE	0x0000009f	00000000`00000003	ffff9201`a1d2a360	ffff908f`4f447ba0	ffff9201`ab1a7820	ntoskrnl.exe	ntoskrnl.exe+3dda20					x64	ntoskrnl.exe+3dda20					C:\WINDOWS\Minidump\060620-15203-01.dmp	6	15	19041	3,376,532	2020-06-06 08:53:15
    The complete output from BlueScreenView (the process in the stack) is in the attached file minidump_BlueScreenView.txt.

    I next installed WhoCrashed, and here is its somewhat more detailed output:
    Code:
    On Sat 2020-06-06 00:21:00 your computer crashed or a problem was reported
    crash dump file: C:\WINDOWS\Minidump\060620-15203-01.dmp
    This was probably caused by the following module: ntoskrnl.exe (nt+0x3DDA20) 
    Bugcheck code: 0x9F (0x3, 0xFFFF9201A1D2A360, 0xFFFF908F4F447BA0, 0xFFFF9201AB1A7820)
    Error: DRIVER_POWER_STATE_FAILURE
    file path: C:\WINDOWS\system32\ntoskrnl.exe
    product: Microsoft® Windows® Operating System
    company: Microsoft Corporation
    description: NT Kernel & System
    Bug check description: This bug check indicates that the driver is in an inconsistent or invalid power state. A device object has been blocking an IRP for too long a time. 
    This is likely to be caused by a hardware problem. 
    The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.
    That level of detail is still inadequate. Ascribing the error to ntoskrnl.exe, the low level NT kernel, is worthless.

    So, I next decided to install Windbg from here and I analysed the file C:\WINDOWS\MEMORY.DMP using these instructions. (I later found better instructions for Windbg on this forum here.)

    The complete Windbg results are attached in the file Windbg.txt. Here are select lines from that file that I think are the most relevant:
    Code:
    A driver has failed to complete a power IRP within a specific time.
    ...
    IMAGE_NAME:  pci.sys
    
    MODULE_NAME: pci
    
    FAULTING_MODULE: fffff8002e260000 pci
    ...
    PROCESS_NAME:  java.exe
    ...
    FAILURE_BUCKET_ID:  0x9F_3_HDAudBus_IMAGE_pci.sys
    Here is my interpretation of all this evidence: it looks like some driver is the cause of the mystery shutdown, with Windbg indicating that it is something to do with PCI, maybe in particular with Microsoft's HDAudBus.
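
    To make it explicit which names appear in that failure bucket, here is a minimal Python sketch that splits the FAILURE_BUCKET_ID string. The layout bugcheck_arg1_driver_IMAGE_image is an ASSUMPTION inferred from this one example, not a documented format, and the function name is hypothetical.

```python
def split_bucket(bucket: str) -> dict:
    """Split a bucket ID, ASSUMING <bugcheck>_<arg1>_<driver>_IMAGE_<image>."""
    head, image = bucket.split("_IMAGE_", 1)
    bugcheck, arg1, driver = head.split("_", 2)
    return {
        "bugcheck": int(bugcheck, 16),  # e.g. 0x9F
        "arg1": int(arg1, 16),          # first bugcheck parameter
        "driver": driver,               # driver named in the bucket
        "image": image,                 # image file named in the bucket
    }

info = split_bucket("0x9F_3_HDAudBus_IMAGE_pci.sys")
print(info)
```

    Both HDAudBus and pci.sys show up in the bucket, which is consistent with the reading above, though the bucket name alone does not prove which of them is at fault.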

    The only thing that puzzles me is Windbg reporting "PROCESS_NAME: java.exe". Yes, I was letting a Java program run overnight, and it was definitely active at the time of the crash. But I do not see how that Java program was specifically involved with PCI or HDAudBus. Maybe this is a red herring.

    I would love all your feedback concerning this analysis. Did I nail the culprit, or am I missing something?


  4. Posts : 39,994
    windows 10 professional version 1607 build 14393.969 64 bit
       #4

    Please run the log collectors: V2 and DM
    Then upload folders/files directly into this thread.


  5. Posts : 35
    Windows 10 Pro for Workstations
    Thread Starter
       #5

    REPLY #2: a mystery logon, Client License Service

    I find Microsoft's Event Viewer to be extremely frustrating to use, especially if I want to see all the events in sequence.

    So, I used FullEventLogView to get that event stream.

    The events from FullEventLogView that are closest in time to, and bracket, yesterday morning's shutdown are in the attached file FullEventLogView.txt.

    What concerns me are the last 2 events in that file. Select details:
    Code:
    2020-06-06 00:18:32.132	3053	4672	Undefined	Security	Microsoft-Windows-Security-Auditing	"Special privileges assigned to new logon.
    ...
    Privileges:		SeAssignPrimaryTokenPrivilege
    ...
    			SeImpersonatePrivilege
    			SeDelegateSessionUserImpersonatePrivilege"		Special Logon (12548)	Audit Success	996	7376	DESKTOP-A1VR7HL		
    
    2020-06-06 00:18:32.132	3052	4624	Undefined	Security	Microsoft-Windows-Security-Auditing	"An account was successfully logged on.
    ...
    Logon Information:
    	Logon Type:		5
    	Restricted Admin Mode:	-
    	Virtual Account:		No
    	Elevated Token:		Yes
    
    Impersonation Level:		Impersonation
    ...
    Process Information:
    	Process ID:		0x3d0
    	Process Name:		C:\Windows\System32\services.exe
    
    Network Information:
    	Workstation Name:	-
    	Source Network Address:	-
    	Source Port:		-
    
    Detailed Authentication Information:
    	Logon Process:		Advapi  
    	Authentication Package:	Negotiate
    	Transited Services:	-
    	Package Name (NTLM only):	-
    	Key Length:		0
    This information initially scared me, since I know that I myself did not log in at that time! Was it malware? Something that I should be concerned about?

    But I now think, as shown in the details above, that it was Microsoft's own services.exe and/or Advapi which caused the logon. Logon Type 5 denotes a service logon, which fits that reading.

    Furthermore, when I look at all of my events for IDs 4672 and 4624, I see tons of them every day that look just like the above.

    So, I assume that this is not an issue.
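
    One way to back up the "tons of them every day" observation is to count those events per day. The following is a minimal Python sketch; it ASSUMES each event in the export starts with a line of the form "date time, record number, event ID" separated by tabs, as in the excerpts quoted above, and it skips the description lines in between. The sample list stands in for the real file, and the function name is hypothetical.

```python
import re
from collections import Counter

# ASSUMPTION: an event starts with "<date> <time>\t<record>\t<event id>\t".
EVENT_START = re.compile(r"^(\d{4}-\d{2}-\d{2}) [\d:.]+\t\d+\t(\d+)\t")

def count_events(lines, wanted_ids=(4624, 4672)):
    """Count events per (date, event ID) for the requested IDs."""
    counts = Counter()
    for line in lines:
        found = EVENT_START.match(line.lstrip(" "))
        if found and int(found.group(2)) in wanted_ids:
            counts[(found.group(1), int(found.group(2)))] += 1
    return counts

# Small illustrative sample built from the two events quoted above.
sample = [
    "2020-06-06 00:18:32.132\t3053\t4672\tUndefined\tSecurity\t...",
    "Privileges:\t\tSeAssignPrimaryTokenPrivilege",
    "2020-06-06 00:18:32.132\t3052\t4624\tUndefined\tSecurity\t...",
    "\tLogon Type:\t\t5",
]
counts = count_events(sample)
print(counts)
```

    Passing the lines of the real export instead of the sample would give the daily totals; a day with the usual number of these events right before a shutdown would support treating them as routine.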

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Another thing to be aware of in that file FullEventLogView.txt is that there were apparently some issues related to the Client License Service, such as this event:
    Code:
    2020-06-06 00:19:02.147	862	7009	Error	System	Service Control Manager	A timeout was reached (30000 milliseconds) while waiting for the Client License Service (ClipSVC) service to connect.			Classic	976	15204	DESKTOP-A1VR7HL

    My guess is that this too is not what caused the shutdown.

    - - - Updated - - -

    zbook said:
    Please run the log collectors: V2 and DM
    Then upload folders/files directly into this thread.
    Do they record information that I would want to remain private?

    Is it OK to let them run for months on end? (I am only seeing these mystery shutdowns once every few months.)

    If you look at the replies that I posted under my original post, in which I give details, I have attachments with stuff like the Windbg report. I would think that that is what you would ultimately want anyways, right?
    Last edited by up2trix; 07 Jun 2020 at 19:08.


  6. Posts : 39,994
    windows 10 professional version 1607 build 14393.969 64 bit
       #6

    See Ten Forums BSOD forum:
    https://www.tenforums.com/bsod-crashes-debugging/

    There are no personal files collected.


  7. Posts : 35
    Windows 10 Pro for Workstations
    Thread Starter
       #7

    I turned off Driver Verifier just now.

    As I mentioned in my first post, I set it yesterday to ONLY stress test HDAudBus.

    After a restart last night, my computer seemed to behave normally. A process that I run every night completed successfully this morning, for example.

    But when I tried to use it this morning, bad and weird things happened:

    1) first thing this morning, I tried to hop on a zoom meeting from a web link using Brave browser. Zoom refused to open; never seen that before. I tried to open it a second time, no joy. I then tried to exit Brave. 2 of my 3 windows closed, but the 3rd one, which had the link to the zoom meeting, would not close even after waiting ~2 minutes.

    2) tried to open Task Manager to kill Brave, but Task Manager's GUI did not completely draw itself. Tried to close Task Manager by clicking on its window's X, but that did not work; its window on the top said "Not Responding".

    3) after waiting ~2 more minutes and neither Brave nor Task Manager had died, I used the Windows Start menu to shut down the computer.

    4) upon restart, the only app I initially tried to get running was my VPN. That app started Not Responding as well, and when I tried to open Task Manager to kill it, same Not Responding in Task Manager as last time. Once again I had to shut down the computer.

    5) this time, it took a while to shut down. Even after Windows was seemingly down (my screen was black), my laptop's power button LED was still lit, like it was still active. Eventually I hit that power button briefly, and it went unlit. Waited ~10 seconds, hit power again, and it started to boot, but it remained stuck in BIOS on the Dell logo, never presenting me with the BitLocker PIN login. After > 1 minute, I hit the power button to stop this boot. Tried again, this time it booted normally.

    6) got into Windows, and first thing I did was open a cmd shell and run Verifier and delete all its settings.

    7) restarted once again, and now my computer seems to be fine once more...


 
