CrowdStrike Outage July 2024 - Technical Summary of What Happened
An update to CrowdStrike Falcon Sensor on 19 July 2024 caused an estimated 8.5 million Windows machines to crash and show a BSOD (Blue Screen of Death). A fix was pushed out later that day, but machines that had already gone down had to be fixed manually (as of this writing). Here’s a technical summary of what happened, based on Dave Plummer’s video below:
- Falcon Sensor runs in kernel mode. Drivers that run in kernel mode need to pass WHQL (Windows Hardware Quality Labs) certification.
- Falcon Sensor is designed so that its behavior can be updated through a separate file called a Channel File. This Channel File, essentially a configuration file, is updated multiple times a day and does not go through the WHQL process.
- On the ill-fated day, CrowdStrike sent out a Channel File that was invalid. (We’ll probably learn how it got through once they publish their findings.)
- Falcon Sensor read the new Channel File and, because it was invalid, hit an error. Because the Sensor runs in kernel mode, that error took down the whole system with a BSOD instead of just one program.
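To make the last point concrete, here is a minimal Python sketch of the kind of check a program can run on untrusted configuration data before using it. The file layout (a 4-byte signature, a 2-byte field count, then fixed-width fields) and every name in it are hypothetical and unrelated to the real Channel File format. The point is only that a malformed file gets rejected with an ordinary error, and the last known-good configuration stays in place.

```python
import struct

MAGIC = b"CFG1"                 # hypothetical file signature
FIELD_SIZE = 4                  # hypothetical fixed width of each field, in bytes
HEADER = struct.Struct("<4sH")  # signature + little-endian field count


class InvalidConfig(ValueError):
    """Raised when a config blob fails validation."""


def parse_config(blob: bytes, expected_fields: int) -> list[int]:
    """Validate a config blob and return its fields as integers."""
    if len(blob) < HEADER.size:
        raise InvalidConfig("file too short for header")
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise InvalidConfig("bad signature")
    if count != expected_fields:
        raise InvalidConfig(f"expected {expected_fields} fields, got {count}")
    body = blob[HEADER.size:]
    # Check the length up front so no field is ever read past the end.
    if len(body) != count * FIELD_SIZE:
        raise InvalidConfig("body length does not match field count")
    return [value for (value,) in struct.iter_unpack("<I", body)]


def load_or_keep_previous(blob: bytes, previous: list[int],
                          expected_fields: int) -> list[int]:
    """Fall back to the last known-good config if the new one is invalid."""
    try:
        return parse_config(blob, expected_fields)
    except InvalidConfig:
        return previous
```

In user-mode code a missed check like this usually ends with one crashed process. In a kernel-mode driver the same mistake can bring down the operating system, which is why validation of this sort matters so much there.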
Remedy
For the affected machines, only a manual fix is available. The steps are:
- Boot the machine into Safe Mode.
- Delete the offending file(s) matching C-00000291*.sys in the C:\Windows\System32\drivers\CrowdStrike directory.
- Reboot the machine.
- On devices with BitLocker encryption, the recovery key is also needed.
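The delete step can be sketched in Python as below. This is an illustration only: it assumes a Python interpreter is available on the affected machine (often not the case in Safe Mode), it uses the C-00000291*.sys file pattern and the C:\Windows\System32\drivers\CrowdStrike directory from CrowdStrike's published remediation guidance, and the function name and `dry_run` flag are my own. The manual steps above remain the reference.

```python
from pathlib import Path

# File pattern of the faulty Channel File, per CrowdStrike's remediation guidance.
BAD_PATTERN = "C-00000291*.sys"


def remove_bad_channel_files(directory: Path, dry_run: bool = True) -> list[Path]:
    """Return the matching files in `directory`; delete them unless dry_run is set."""
    matches = sorted(directory.glob(BAD_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Assumed default location; adjust if Windows is installed elsewhere.
    target = Path(r"C:\Windows\System32\drivers\CrowdStrike")
    for found in remove_bad_channel_files(target, dry_run=True):
        print(f"would delete {found}")
```

The script defaults to a dry run so that it only lists what it would delete. Pass `dry_run=False` once the list looks right.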
Prevention
There’s a setting in Falcon Sensor to delay updates by up to 3 months. Unfortunately, the default is to update immediately.
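The idea behind a delayed update can be sketched as a simple age check: an update is applied only once it has been out for a chosen number of days. This shows the general technique only. The function and parameter names are hypothetical and do not correspond to Falcon's actual policy settings.

```python
from datetime import date, timedelta


def is_update_due(released: date, delay_days: int, today: date) -> bool:
    """True once an update has aged at least `delay_days` since its release."""
    return today >= released + timedelta(days=delay_days)
```

With `delay_days=0` every update is due on its release day, which mirrors an immediate-update default. A larger value gives problems time to surface on other machines first.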
Video
[Embedded video: Dave Plummer’s explanation of the outage]
Postscript
Here’s an interesting way an Australian tax firm used barcode scanners to enter the long BitLocker keys! https://www.theregister.com/2024/07/25/crowdstrike_remediation_with_barcode_scanner/
Reference
CrowdStrike July 19 Outage Updates Page:
https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/