Category Archives: Troubleshooting

Windows Resiliency Initiative Includes Quick Machine Recovery

It’s that time of year again, when MS meetings and conferences — Ignite 2024, in this case — heat things up with future promises and new idea campaigns. Yesterday’s Windows Experience Blog from David Weston (MS VP Enterprise & OS Security) is a case in point. Entitled Windows security and resiliency: Protecting your business, it asserts that a new Windows Resiliency Initiative includes Quick Machine Recovery as a key capability. Very interesting!

Explaining Windows Resiliency Initiative Includes Quick Machine Recovery

This new initiative “takes four areas of focus” as its goal — namely (all bullet points quoted verbatim from the afore-linked blog post, except for my [bracketed] commentary):

  • Strengthen reliability based on learnings from the incident we saw in July. [CrowdStrike kernel mode error took down 8.5M Windows PCs.]
  • Enabling more apps and users to run without admin privileges.
  • Stronger controls for what apps and drivers are allowed to run.
  • Improved identity protection to prevent phishing attacks.

The first and arguably most impactful preceding item is what led MS to its announcement of Quick Machine Recovery. Here’s how Weston explains it:

This feature will enable IT administrators to execute targeted fixes from Windows Update on PCs, even when machines are unable to boot, without needing physical access to the PC. This remote recovery will unblock your employees from broad issues much faster than what has been possible in the past. Quick Machine Recovery will be available to the Windows Insider Program community in early 2025.

In other words, this new feature should enable what savvy administrators previously had to do using OOB access to affected machines, via KVMs smart enough to bootstrap machines otherwise unable to boot.

Great Addition: How’s the Execution?

IMO this is something MS should’ve built into Windows long ago. I’m curious to see how (and how well) it works. I’m also curious to see if it will be available for Windows 10 as well as 11. Only time will tell, but I’ll be all over this when it hits Insider Builds early next year. Good stuff — I hope!!


Forced 24H2 Upgrade Throws BSOD

I couldn’t help myself: I HAD to try it. On the Lenovo ThinkStation P3 Ultra, I used the Windows 11 Installation Assistant to bring on the newest version. Alas, my forced 24H2 upgrade throws BSOD with error code 0x85 SETUP_FAILURE. Quick research found an MS Learn article on that very topic. Unfortunately, it says only that “a fatal error occurred during setup” and suggests unplugging peripherals and trying again, but provides no real repair advice. You can see my iPhone BSOD photo, skews and all, as the lead-in graphic here.

Bad Cess As Forced 24H2 Upgrade Throws BSOD

Please note: even though the BSOD text reads in part “We’ll restart for you,” I had to toggle the power button to bring the P3 Ultra back to life. Sigh: looks like its Intel i9-13900 CPU is subject to some documented issues. Indeed, I just found an Intel Community post that says if Turbo Boost is enabled in the BIOS, the PC can crash during the Windows 11 upgrade process.

So I visited Settings > System > Recovery > Advanced Startup and then entered the BIOS. Sure enough, Turbo Mode was enabled, so I disabled same. Now, I’m running the Installation Assistant again. It zoomed through download and verification phases, so the files from the original download were obviously still present. Now it’s doing the GUI install portion …

Is the 2nd Try Charmed, or Doomed?

It took about 10-15 minutes for GUI install to complete. Turning off Turbo Mode notably slows things down. The post-GUI install went much slower, though: it zoomed up to 71% in 5-8 minutes, then took the better part of an hour to work its way to completion and OOBE.

But I’ve now got a working 24H2 installation on the ThinkStation P3 Ultra,  as you can see in the next screencap. It shows Lenovo Vantage device info, above which I’ve positioned Winver output. Then I had to go back into the BIOS and turn Turbo Mode back on. With Turbo Mode restored, the system runs very much faster.

Winver 24H2 in front, Lenovo Vantage Device Details in back.

Now, I have to ask: is this disable/enable in BIOS looming over all future upgrades, or is it just a one-time 24H2 thing? As the clue that pointed me toward this fix came from 22H2, it’s probably not a one-time thing. Another thing for me to remember, in that case…

And isn’t that just the way things go from time to time, here in Windows-World? You betcha!

 


BIOS Update Demands Cable Switch

Whoa: this time, things got just a little bit TOO interesting. I’ve got a Lenovo P360 Ultra ThinkStation on loan, and a BIOS update came through today (to version S0JKT2AA). But when I tried to install the update, the usual BIOS flash screens did not come up after a reboot. It wasn’t until I swapped the graphics cable, from a full-size DP to full-size DP cable to one that runs from full-size DP (monitor) to mini DP (PC), that the splash screen showed up at boot and the BIOS flash ran through to completion. Thus, the BIOS update demands cable switch to succeed. Go figure!

How Did I Figure Out That BIOS Update Demands Cable Switch?

By watching the post-reboot behavior on-screen, I realized it wasn’t showing me what it was supposed to. Basically, the screen stayed black post-restart until the lock screen for Windows 11 appeared. I knew I was supposed to see the boot-up splash screen (which reads “Lenovo” in white letters on a black background on this device). But instead: nada.

So on a whim, I brought down the video & power cables box from atop my bookshelves. Then, I grabbed a full-size DisplayPort to mini-DP cable and used it to replace the full-size DP to full-size DP I was currently using. Immediately thereafter, I got a splash screen and the BIOS update started processing. It took a while, but it eventually ground through to a successful update.

What About those Intel Graphics?

The next item of business was to get the built-in Intel graphics (UHD Graphics 770) updated. After a handful of failed attempts to get the Lenovo version to run, I visited the Intel DSA (Driver & Support Assistant) and installed that version instead. It worked. You can see the results for my final — and entirely welcome — update check using the Lenovo Commercial Vantage tool as the lead-in graphic above.

That was a wild ride. But indeed, that’s the way things go in Windows-World far too often, based on my current level of interest vs. fatigue. Today, fatigue wins out. Sigh.


MX Error Provokes Outlook Account Fix

Ever since Microsoft pushed an Outlook update in late September, Outlook hasn’t let me access my primary email account. Something about handling of DNS info related to mail servers changed, and not for the better. Simply put, the configuration I’d been using to ingest incoming email and send outgoing email quit working. But when I checked the DreamHost config recommendations, my settings agreed with them. Despite repeated fix attempts, account setup kept foundering because of a reset to some whacko domain I never heard or read about, namely: smtp.mailchannels.net. This morning I had an astonishingly positive encounter with Microsoft 365 chat support, during which an MX error provokes Outlook account fix. Buckle up: this is going to take some explaining…

How an MX Error Provokes Outlook Account Fix

Outlook is obviously reading from MX records for the domain names it runs into. The only way I can get my home account (the one for this very website, in fact) to work is by over-riding both incoming and outgoing mail server values that the lookup process finds on its own.

It gets worse. If I tell Outlook to repair itself, it overwrites my over-rides with those selfsame values again. Fortunately, I’ve now got all this stuff memorized and I know how to fix it. But it wasn’t until we tried and failed to use my domain name (edtittel.com) for the mail servers that the inbuilt Outlook facility started reading the right MX records. Only then was I able to use those for the email host instead of whatever Outlook was dredging out of the MX records it finds on my behalf. Sigh.
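To make the override logic concrete, here’s a minimal Python sketch of the comparison I end up doing by hand. Everything in it is an assumption for illustration: the function names are my own, and the host names are example.* placeholders, not my real records or DreamHost’s actual settings. The one hard fact it leans on is that the MX record with the lowest preference value is the one mail systems try first.

```python
from typing import List, Tuple

def primary_mx(records: List[Tuple[int, str]]) -> str:
    """Return the host of the MX record with the lowest preference value."""
    return min(records, key=lambda rec: rec[0])[1]

def normalize(host: str) -> str:
    """Lower-case a host name and drop the trailing dot DNS tools often show."""
    return host.rstrip(".").lower()

def needs_override(discovered_host: str, documented_host: str) -> bool:
    """True when an automatic lookup disagrees with the provider's documented server."""
    return normalize(discovered_host) != normalize(documented_host)

# Hypothetical MX answers as (preference, host) pairs
records = [(20, "mx2.example.net."), (10, "mx1.example.net.")]
discovered = primary_mx(records)

# Hypothetical server name taken from a provider's configuration page
documented = "mail.example.com"

print(discovered, needs_override(discovered, documented))
```

If the check comes back True, that’s the cue to type in the provider’s documented values rather than accept what the lookup proposes.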

Automation Had Best Be Correct…

I understand that MS is just trying to help by automating the mail server lookups and name assignments. That’s terrif, as long as they get those lookups right. But as I’ve just learned, over-riding errors in such lookups can get excessively interesting.

Shoot! I couldn’t even get email to work in Outlook until I figured out I should ignore its findings and insist on what the provider’s configuration page told me I should use. What’s interesting is that’s what was in there in the first place, and quit working late last month. I wasn’t able to get back into the fold, however, until I tried my own domain name, at which point the error trail finally located workable MX records.

Go figure! That’s what keeps me on the edge of my seat, and makes Windows-World an always-interesting place to work and live.


Uncovering Create Dev Drive Gotcha

Yesterday, I blogged about a real, but apparently small, performance boost for ReFS volumes vis-a-vis NTFS ones. While I was undertaking that testing, I switched a USB4 NVMe from NTFS to ReFS to keep everything else the same. I’m pretty sure that’s the best way to isolate file system differences because port, cable, enclosure and drive all remained the same. Along the way, I found myself uncovering “Create Dev Drive” gotcha. Let me explain.

Uncovering Create Dev Drive Gotcha:
Two Create Dev Drive Buttons

If you attach an unallocated drive to a Windows PC, then navigate to System > Storage > Disks & Volumes, you’ll see the “Create Dev Drive” button at the top of that UI pane, as shown in the lead-in graphic. But that’s not the only one: should you scroll down to said unallocated drive, you can invoke a different Create Dev Drive button by clicking on the down-caret for “Create volume” like so:

Here’s the gotcha: if you use the upper Create Dev Drive button, everything works as it should. But if you use the lower one, the create operation fails every time, and reports it fails because the drive is read-only. Here’s what the Settings UI looks like after that error:

Something odd and interesting apparently happens when you use this button instead. I’m reporting this to Feedback Hub. Here’s that link, if you’d care to upvote: Create a dev drive button doesn’t work.

Clean-Up and Fix

Once you do this to yourself, you need to clean things up before you can set things up correctly using (only) the upper “Create Dev Drive” button. You must open Disk Management and delete the RAW volume you’ll now see there (right-click, and select Delete Volume… from the pop-up menu). Then you can return to Settings > System > Storage > Disks & volumes and do it right this time. Enjoy!

One more thing: the Dev Home app is a great place to get started when creating an ReFS volume. It does the Settings navigation for you and drops you right where you want to be. Just remember: it only works when you select the upper, general “Create Dev Drive” button, NOT the lower, device-specific “Create Dev Drive” button. I have no idea why this is so, but that’s the way it seems to work at present. Mysteries like these are what keep me forever fascinated with the wrinkles in Windows-World.

 


Chkdsk /f Fixes DISM Issues

Here’s an interesting item. As part of routine maintenance, I ran DISM /online /cleanup-image /analyzecomponentstore on the P16 Mobile Workstation this morning. Imagine my surprise when it threw  “Error 2; The system cannot find the file specified.” at about 80% complete. I’d never run across this one before. But a Google Search soon revealed that this happens when DISM encounters a corrupted entry in the component store. MS Answers also reported that, nearly always, chkdsk /f fixes DISM issues of this kind. So that’s what I tried: as you can see from the lead-in graphic, it worked!

How Chkdsk /f Fixes DISM Issues

With the /f switch, chkdsk repairs the file system errors it finds, if it can. That has me wondering if sfc /scannow might not have had the same salutary effect. I think that’s at least possible, so I’ll have to try it next time around. The only catch is that repairs to the C: drive (especially for the kinds of files that DISM cares about) can’t run while Windows has that volume in use. That means scheduling the check and repair for the next boot-up, where it runs early in startup before the OS takes over operation of the PC.

Thus, I had to reboot the P16, and watch the check run as a pre-boot task (large white text against a black screen). Here’s a capture from inside a Hyper-V VM (otherwise, it’s challenging to grab boot-time screens from Windows).

Once that repair had completed, I was able to run the previously inoperative DISM command without trouble. Every now and then, one gets lucky in Windows-World. This time, the repair worked just like it was supposed to. Good stuff!
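For anyone who’d like to script the “DISM throws Error 2, so try chkdsk /f” decision, here’s a small Python sketch. Treat it as an illustration under stated assumptions: the helper names are mine, and the pattern that picks out the error number is a guess at the message layout (it accepts both “Error: 2” and the “Error 2;” form quoted above). It only suggests the command; it doesn’t run anything.

```python
import re
from typing import List, Optional

# Assumption: the failure text contains "Error: 2" or "Error 2" somewhere.
# Hex-style codes (for example "Error: 0x...") are deliberately not matched.
_ERROR_RE = re.compile(r"\bError:?\s*(\d+)\b")

def dism_error_code(output: str) -> Optional[int]:
    """Return the decimal error number found in DISM output, or None."""
    match = _ERROR_RE.search(output)
    return int(match.group(1)) if match else None

def next_step(output: str, drive: str = "C:") -> List[str]:
    """Suggest a follow-up command line for the given DISM output."""
    if dism_error_code(output) == 2:
        # Error 2 ("cannot find the file specified"): try a disk repair,
        # which gets scheduled for the next boot on the system drive.
        return ["chkdsk", drive, "/f"]
    return []

sample = "Error 2; The system cannot find the file specified."
print(next_step(sample))
```

After the reboot and repair, the same DISM command gets another try.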

 


Morning Black Screen Recalls Pending Reboot

On September 10, NVIDIA released its Game-Ready driver, version 561.09. At the end of its install, it asked for a reboot. “Oh yeah,” I thought, “I’ll do that later.” It’s happening a LOT later than I planned, nearly 8 days on. If you look at the uptime info in the lead-in graphic you’ll see I’ve somehow managed no reboots since then. But, for the last two days this PC’s monitors have stayed dark when I’ve tried to wake it up first thing in the morning. Alas, that morning black screen recalls pending reboot, which I apparently MUST do (soon).

Note: I’ve been able to bring the desktop back from the black screen state on each of the past two days by striking CTRL-ALT-DEL at the keyboard, then canceling out of the Security Options screen that pops up. Good thing to know, in case this ever happens to you.

How Morning Black Screen Recalls Pending Reboot

Normally, when I click a mouse button or hit a keyboard key when my PC is sleeping, it starts right up. Both yesterday and today, though, I got black screens on both monitors with no cursor. Experience informs me that this is 95+% likely caused by a graphics driver issue. And as I think about it, I dimly recall installing 561.09 last week, then never following up with a reboot. If you do the math on the uptime field from WinFetch in the lead-in graphic, the PC was last rebooted on September 9, around 3:09 PM (thanks timeanddate). Thus, it hasn’t been rebooted since the GPU driver got updated.
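The uptime math is simple enough to sketch in Python: subtract the uptime from the current time to get the last boot, then compare that with the driver install time. The specific values below are hypothetical (I picked a “now” and an uptime that land on September 9 at 3:09 PM, and assumed 2024 for the year and noon for the install); the function names are mine as well.

```python
from datetime import datetime, timedelta

def last_boot_time(now: datetime, uptime: timedelta) -> datetime:
    """Last boot is simply the current time minus the reported uptime."""
    return now - uptime

def reboot_pending(boot: datetime, driver_installed: datetime) -> bool:
    """True when the driver went in after the most recent boot."""
    return driver_installed > boot

# Hypothetical readings, chosen only to illustrate the arithmetic
now = datetime(2024, 9, 18, 9, 0)
uptime = timedelta(days=8, hours=17, minutes=51)
driver_installed = datetime(2024, 9, 10, 12, 0)

boot = last_boot_time(now, uptime)
print(boot.isoformat(), reboot_pending(boot, driver_installed))
```

A True result says the same thing the black screens did: the driver is newer than the last boot, so a restart is overdue.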

I’ve also noticed graphics running a bit slower and jerkier lately, too. It all adds up: I should’ve remembered to reboot the same day I updated the NVIDIA graphics driver. It may be too late to go back, but it’s not too late to reboot right now. And sure enough, when I do, no more black screens on startup, nor after waking from sleep (which I forced from Power > Shutdown > Sleep through the Start Menu to check).

Go figure. I should know better. This not-so-gentle reminder does the trick to help me remember this time. Isn’t that just the way things sometimes go in Windows-World?


Choosing USB Power Ports Properly

I should have known. I put the Lenovo ThinkPad X12 Gen 1 hybrid tablet back into service yesterday. Indeed, I had a ThinkPad Universal Thunderbolt 4 Dock sitting right next to the device. “No problem,” I thought to myself, “I’ll hook up to one of these USB-C ports and I won’t need to rustle up its 65W brick.” Wrong! Just as it’s essential to choose USB-C ports for their bandwidth ratings when attaching storage devices, ditto for choosing USB power ports properly when seeking a charge. Let me explain…

Why Choosing USB Power Ports Properly Matters

I show the rear view of the TB4 dock in the lead-in graphic. Turns out that only the TB4-rated ports (the leftmost block of two, to the right of the DC power input connector) deliver more than 10W via USB-C. The others are limited to 10 Gbps for data as well, while TB4 gets the coveted 40 Gbps rating.

I knew things were off when the BIOS told me that the PSU wasn’t delivering an acceptable amount of power as the X12 started booting up. “Doh!” I reflected, “it’s important to read the fine print on the USB-C connectors to make sure they have the power lightning bolt and plug into those.” And sure enough,  if I zoom in on the detail on the two left-most USB-C ports on the back, the lightning bolt is pretty visible on each one.

Left: lightning bolt above; right: below. 100W available from each one, as per specs.
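The check I should have made before plugging in boils down to one comparison, sketched here in Python. The wattages come from the numbers above (100W for the TB4 ports, no more than 10W for the others, and a 65W brick for the X12); the port labels and function name are made up for illustration.

```python
from typing import Dict, List

def ports_able_to_charge(ports: Dict[str, int], required_watts: int) -> List[str]:
    """Return the port labels whose rated power meets the device's requirement."""
    return [label for label, watts in ports.items() if watts >= required_watts]

# Illustrative labels; wattages as described for the dock above
dock_ports = {
    "TB4 USB-C (lightning bolt)": 100,
    "other USB-C": 10,
}

# The X12 normally charges from a 65W brick
print(ports_able_to_charge(dock_ports, 65))
```

Only the lightning-bolt ports make the cut, which matches what the BIOS warning was telling me.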

It pays to check before plugging into USB-C. If you can’t see or don’t know, it never hurts to RTM. As soon as I figured out what I was doing, in fact, it all made sense. Just another perfect day in Windows-World, right? Cheers!


StartMenuExperienceHost.exe Knocks ReliMon Over

When searching for Windows blog topics, I occasionally drop in on Reliability Monitor (aka ReliMon). FYI, it’s actually a special version of the more general-purpose Performance Monitor (PerfMon). This morning, I saw what I can only describe as a bad-to-worse stability index chart. See the lead-in graphic. Upon examination, I concluded that StartMenuExperienceHost.exe knocks ReliMon over with daily errors. Ouch!

Handling StartMenuExperienceHost.exe Knocks ReliMon Over

Digging into the details, I see this element present every day (multiple times on some days) for 16 of the past 17 days. That’s a new personal record for me, and it’s interesting. Why? Because this system hasn’t been giving me any obvious trouble, repeat SMEH errors notwithstanding. (Hope that abbreviation is obvious…)

So naturally I went looking for enlightenment about SMEH and the related MoBEX error that occurs for each instance in the detail page. Unsurprisingly, I found a registry hack to address the issue at TenForums.com from well-known VIP member Samuria. Apparently, it involves a well-known permissions inheritance issue for values inside the

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders

key. I’ve applied the fix Samuria recommends, and will observe ReliMon over coming days to see if it helps.

The Enduring Value of Internet Community

Though one must exercise caution in picking up and running with fixes from the Internet, there are gradations of trust and merit in play, too. Because I’ve been an active member of TenForums for years and have seen many, many useful tips from Samuria over that entire interval, I’m comfortable with following his advice. That makes this a “safe fix” IMO. But if you have a recent backup handy, and know how to restore it, you can always get back to where you started. That’s my fallback position, and I’m sticking with it. Cheers!

I’ll keep you posted as I see if this helps … or not. Stay tuned!

Sept 13 follow-up #1: No dice, but…

I got a comment from fellow TenForums VIP OldNavyGuy that told me two things: he tried the reghack and it didn’t work for him. He also built a new user profile and moved over to that, then killed the old one. He reports that did away with the ongoing torrent of StartMenuExperienceHost.exe errors. I’ll try it sometime, and see.


Intel 13-14 Gen CPU Issues Unfolding

In tech news over the past 2-3 weeks, there’s been some serious CPU stuff revealed. As updated in this recent Windows Central item, PCs with Intel’s 13th and 14th generation CPUs (Raptor Lake and Raptor Lake Refresh, respectively) are prey to a microcode bug. Units with a TDP of 65W or greater can run excessive voltage under some conditions. This can cause crashes and BSODs. On July 26, in fact, Tom’s Hardware reported a scary observation. It said “13th Generation Raptor Lake processors have a return rate [4X] higher than … the previous generation”  (copy abbreviated). There’s the basis for my claim to see Intel 13th-14th Gen CPU issues unfolding.

What Intel 13-14 Gen CPU Issues Unfolding Means

If you’ve got PCs or laptops with such CPUs inside, you’ll need to keep an eye on them. Intel plans to issue a microcode fix sometime soon. When it’s available, you’ll want to schedule that update sooner rather than later. I’d also recommend that owners think about  underclocking as a form of insurance against possible problems that normal voltage level operations might otherwise cause.

Indeed, for those with 13th Gen Raptor Lake devices, you’ve been dodging trouble for some time now. The already-cited Tom’s Hardware story mentions that “the first sporadic reports of CPU crashing errors surfaced in December 2022 and grew to a crescendo by the end of 2023.” You’ve been warned!

For more info on underclocking, this wikiHow Tech story “Underclock Your Computer Hardware: 2 Easy Ways” looks like a good place to start.

No Raptor Lake Exposure Here…

I have to chuckle as I report that the PCs and laptops at Chez Tittel aren’t subject to this reported exposure. Because its worst-case consequences could require replacing a CPU, that’s a very, very good thing. I was concerned about my workhorse test PC, a well-equipped Lenovo P16 Mobile Workstation Gen 1. But a quick trip to CPU-Z (which you can use on your PCs to suss out relevant details) showed it running an Alder Lake 12th Gen Intel CPU. I was totally relieved to see that this morning (see lead-in graphic).
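If you’d rather triage a list of machines than open CPU-Z on each one, the rule of thumb above (13th or 14th Gen, TDP of 65W or greater) can be sketched in Python. This is a heuristic, not an Intel tool: the function names are mine, the generation is guessed from the model number’s leading digits, and you supply the TDP yourself. It will misread some older mobile parts with four-digit numbers (an i7-1065G7 would come back as generation 1), and the model strings in the example are simply sample inputs.

```python
import re
from typing import Optional

# Matches Core i3/i5/i7/i9 model numbers such as "i9-13900" or "i7-12800HX"
_MODEL_RE = re.compile(r"i[3579]-(\d{4,5})")

def core_generation(model: str) -> Optional[int]:
    """Guess the Intel Core generation from a model string (heuristic)."""
    match = _MODEL_RE.search(model)
    if not match:
        return None
    digits = match.group(1)
    # Five-digit numbers carry a two-digit generation; four-digit, one digit
    return int(digits[:2]) if len(digits) == 5 else int(digits[0])

def possibly_affected(model: str, tdp_watts: int) -> bool:
    """Apply the reported criteria: 13th/14th Gen with a TDP of 65W or more."""
    return core_generation(model) in (13, 14) and tdp_watts >= 65

# Sample inputs; the TDP values are supplied by the caller
print(possibly_affected("Intel Core i9-13900", 65))
print(possibly_affected("Intel Core i7-12800HX", 55))
```

Anything flagged here is a candidate for the microcode update once Intel ships it; anything not flagged still deserves a look in CPU-Z if the model string is unusual.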
