Contents
- 1 What is CrowdStrike?
- 2 Single Point of Failure: Global IT Outage Underscores Over-Reliance on Microsoft by Corporates
- 3 A Single Point of Failure: Sending the Wrong Message to Hackers
- 4 How Microsoft Failed: “Largest IT Outage in History” a Huge Risk Going into the Future
- 5 How CrowdStrike Failed
- 6 Microsoft Blog Post: “8.5 Million Windows Devices Affected”
- 7 CrowdStrike CEO: “A Fix Has Been Deployed”
- 8 The Human Cost: Thousands Stranded in Airports
- 9 Possible Solutions Going Forward
- 10 Has the IT Outage been Fixed?
- 11 The Lesson: Stay Prepared, Don’t Rely on IT Service Providers Too Much
Yesterday’s global IT outage, now reported to have been caused by a faulty CrowdStrike update for Windows machines, has underscored how heavily corporations rely on Microsoft. Such heavy dependence on a single system is dangerous because it creates a single point of failure that hackers could potentially use to bring down corporate systems globally. We look into the failures the outage exposed and ask how such vulnerability can be avoided in the future.
What is CrowdStrike?
CrowdStrike is a cybersecurity company that provides endpoint security software, including threat detection and regular security updates, to businesses worldwide, many of which run it on Microsoft Windows machines. The company is one of the giants of the cybersecurity sector, and we wonder how such a big company could allow a faulty update to reach its customers’ Windows machines without the necessary checks.
Single Point of Failure: Global IT Outage Underscores Over-Reliance on Microsoft by Corporates
Perhaps the most dangerous thing to emerge from yesterday’s global IT outage is the realization that a single update, such as the faulty CrowdStrike update being blamed, could bring down corporate systems the world over. Because of a small fault in that one update, supermarkets, banks, hospitals, and even airlines were crippled, and their staff were left scrambling to find and fix the problem.
A Single Point of Failure: Sending the Wrong Message to Hackers
When a huge number of the world’s corporate entities rely on a single system to run their operations and do business, they are all exposed to serious danger, because that system becomes a single point of failure. What’s even worse, the global IT outage tells hackers that they can target a single system and have corporate systems go down the world over.
What hackers see in events such as yesterday’s outage is that corporations everywhere are over-reliant on Microsoft systems and suites such as Microsoft 365. To them, this is a huge opportunity: they now know where to aim if they want to bring down corporate systems worldwide.
How Microsoft Failed: “Largest IT Outage in History” a Huge Risk Going into the Future
Yesterday’s outage was termed the “largest IT outage in history”, and this sends shivers down my spine. The fact that we are in 2024 and a small fault in an update can cause the largest IT outage in history is quite shocking. How on earth can a company such as Microsoft not have protections in place to prevent something like this from happening?
The fix going forward is that Microsoft needs to have better policies to roll back defective drivers and not just raw dog risky updates to customers.
Crowdstrike will likely promote their code safety officer to put in code sanitization tools that will catch this automatically.
— Zach Vorhies / Google Whistleblower (@Perpetualmaniac) July 19, 2024
Even more, how can corporations the world over put all their trust, belief, and operations in this one suite or company [Microsoft]? How can hospitals, banks, supermarkets, and airlines bet on this one horse and foolishly place all their eggs in one basket?
How CrowdStrike Failed
As I understand it, CrowdStrike failed to put adequate checks on its code in place and pushed out a faulty update despite knowing very well that its updates could affect millions of devices. Microsoft, for its part, should have done better and not simply allowed a security program to load faulty code without doing its own vetting.
And Crowdstrike will likely take a hard look at rewriting their system driver from what it currently is, C++ to a more modern language like Rust, which doesn’t have this problem.
— Zach Vorhies / Google Whistleblower (@Perpetualmaniac) July 19, 2024
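One common form the checks discussed in this section can take is a staged rollout: push the update to a small group of machines first, and only continue if they stay healthy. The sketch below is purely illustrative and is not how CrowdStrike or Microsoft actually ship updates. The function name `staged_rollout`, the `apply_update` callback, the stage sizes, and the 2% failure threshold are all assumptions I made up for the example.

```python
from typing import Callable, Dict, List, Sequence


def staged_rollout(
    hosts: List[str],
    apply_update: Callable[[str], bool],  # hypothetical: returns True if the host is healthy afterwards
    stages: Sequence[float] = (0.01, 0.10, 1.0),  # assumed wave sizes: 1%, 10%, then everyone
    max_failure_rate: float = 0.02,  # assumed tolerance per wave
) -> Dict[str, object]:
    """Push an update in growing waves and stop at the first unhealthy wave."""
    updated: List[str] = []
    failed: List[str] = []
    done = 0  # number of hosts already attempted

    for fraction in stages:
        # Cumulative number of hosts that should have been attempted after this wave.
        target = max(done, min(len(hosts), max(1, int(len(hosts) * fraction))))
        wave = hosts[done:target]

        wave_failed: List[str] = []
        for host in wave:
            if apply_update(host):
                updated.append(host)
            else:
                wave_failed.append(host)
        failed.extend(wave_failed)
        done = target

        # Halt before the next, larger wave if too many hosts in this wave broke.
        if wave and len(wave_failed) / len(wave) > max_failure_rate:
            return {"status": "halted", "updated": updated,
                    "failed": failed, "untouched": hosts[done:]}

    return {"status": "complete", "updated": updated,
            "failed": failed, "untouched": hosts[done:]}
```

With 1,000 machines and an update that breaks every host, this sketch stops after the first small wave and leaves the rest untouched, which is the whole point: a bad update hurts a handful of machines instead of millions.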
Microsoft Blog Post: “8.5 Million Windows Devices Affected”
Microsoft, via a blog post, has said: “We currently estimate that CrowdStrike’s update affected 8.5 million Windows devices, or less than one percent of all Windows machines.” Can you imagine that? 8.5 million devices, belonging to companies and individuals alike, were brought down by a sloppy update. The post, from David Weston, a cybersecurity executive at Microsoft, went on to say: “While the percentage was small, the broad economic and societal impacts reflect the use of CrowdStrike by enterprises that run many critical services.” Critical services that could be targeted, I must say.
CrowdStrike CEO: “A Fix Has Been Deployed”
CrowdStrike CEO George Kurtz said, “This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed.” The thing is, though, this was an attack on the confidence people will have in IT systems such as Microsoft Azure, and in the companies behind them, going forward.
The Human Cost: Thousands Stranded in Airports
We could talk about all the monetary losses, but I want to talk about the human suffering that has resulted from the outage. First, hundreds if not thousands of airline passengers have spent the night in airports. These are families and individuals, including children, the elderly, and the sick, who have been forced to camp out in airports just because the airlines did not have backups in place.
A Shame that Even Hospitals Lacked Alternative Systems or Analogue Backups
Second, the same can be said for hospitals, where patients’ records could potentially have been compromised. Had it been a hack, millions of patients’ health records could have been leaked, all because hospitals were not cybersecurity-conscious enough to know that you never rely on a single system in IT. Moreover, you should never be without an analogue backup of everything. There could even be a global electricity outage caused by a solar storm, for example. Would that mean people won’t be treated because the hospital cannot operate without its digital systems?
Possible Solutions Going Forward
Interoperable Alternative Systems
I think that, first and foremost, every business, individual, hospital, or airline using the Microsoft operating system (OS) and products such as Microsoft Azure and 365 should immediately sign up for an alternative program or OS alongside the Microsoft one. With an additional tech service provider whose system is interoperable with Microsoft’s, they can switch to the other system when one goes down and reconcile the data later.
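To make the idea of switching systems and reconciling later more concrete, here is a minimal sketch. It assumes two record systems that can each store a key and a value. The class names `RecordStore` and `FailoverWriter` and the `Unavailable` error are hypothetical, invented only for this illustration; they are not part of any Microsoft or third-party product.

```python
class Unavailable(Exception):
    """Raised by a store that is currently down (hypothetical error type)."""


class RecordStore:
    """Stand-in for any record system, e.g. the main one or the alternative one."""

    def __init__(self, name):
        self.name = name
        self.records = {}
        self.online = True

    def save(self, key, value):
        if not self.online:
            raise Unavailable(self.name)
        self.records[key] = value


class FailoverWriter:
    """Writes to the primary system and falls back to the secondary during an outage."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.pending = []  # keys that so far exist only on the secondary

    def save(self, key, value):
        try:
            self.primary.save(key, value)
        except Unavailable:
            # Primary is down: keep working on the secondary and remember the key.
            self.secondary.save(key, value)
            self.pending.append(key)

    def reconcile(self):
        """Copy records saved during the outage back to the primary; return how many."""
        replayed = 0
        while self.pending:
            key = self.pending[0]
            # If the primary is still down this raises, and the key stays pending.
            self.primary.save(key, self.secondary.records[key])
            self.pending.pop(0)
            replayed += 1
        return replayed
```

In a real deployment, reconciliation is the hard part, since both systems may have changed the same record. This sketch sidesteps that by only replaying records that were written while the primary was down.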
System Backups and Analogue Backups
Second, every business and entity should always have backups of their systems, and backups of those backups stored in remote locations, maybe even on a cloud server, to ensure they have redundancy when one copy is lost or unreachable.
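A backup is only useful if the copy is actually intact, so it helps to verify each copy as it is made. The sketch below copies a file to several locations and checks each copy against the original using a SHA-256 checksum. The function names `sha256_of` and `back_up` are my own, and the destinations are assumed to be ordinary folders (for example a mounted remote drive); uploading to a cloud provider would need that provider's own tools.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: List[Path]) -> List[Path]:
    """Copy source into each destination folder and return the copies that verified."""
    expected = sha256_of(source)
    verified = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        copy = directory / source.name
        shutil.copy2(source, copy)  # copies the contents and file metadata
        if sha256_of(copy) == expected:
            verified.append(copy)
    return verified
```

If the returned list is shorter than the list of destinations, at least one backup did not match the original and should not be trusted.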
Third, every entity should have an analogue backup of its data, records, and files. For instance, a hospital should keep analogue patient health records that it can use to deliver care when the digital system is down or when it is faced with something like an electrical outage. An analogue backup could also be an analogue system of operations that can be used without computers and later reconciled with the digital system once normalcy is restored.
Has the IT Outage been Fixed?
Unfortunately, even now there are businesses, banks, supermarkets, airlines, and hospitals still trying to recover from the outage. What happened is that CrowdStrike’s faulty update forced computers to crash and shut down in a way that meant they could not easily be started again. Here’s an X thread explaining that:
Crowdstrike Analysis:
It was a NULL pointer from the memory unsafe C++ language.
Since I am a professional C++ programmer, let me decode this stack trace dump for you. pic.twitter.com/uUkXB2A8rm
— Zach Vorhies / Google Whistleblower (@Perpetualmaniac) July 19, 2024
Essentially, according to the thread, code in CrowdStrike’s driver tried to read from an invalid memory address. Because the driver runs at the kernel level, that crash took down Windows itself, leaving each affected computer displaying an error screen and unable to start up normally. These computers are also difficult to fix remotely, so the companies affected will need their IT administrators to work out how to get each machine booting again.
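The general defence against this kind of bug is to check that a reference is valid before using it and to fall back to something safe when it is not. The snippet below shows the idea in Python and is only an analogy, not CrowdStrike’s code. Python would raise an error here rather than crash the whole machine, whereas an unchecked read inside a C++ kernel driver can crash the operating system. The names `read_entry` and `load_rule` and the `"allow"` default are made up for the example.

```python
from typing import Optional, Sequence


def read_entry(table: Sequence[str], index: int) -> Optional[str]:
    """Return table[index], or None when the index points outside the table."""
    if 0 <= index < len(table):
        return table[index]
    return None


def load_rule(table: Sequence[str], index: int, default: str = "allow") -> str:
    """Look up a rule, falling back to a safe default instead of failing."""
    entry = read_entry(table, index)
    if entry is None:
        # The reference is invalid: use the default rather than crash.
        return default
    return entry
```

For example, `load_rule(["block", "scan"], 1)` returns `"scan"`, while `load_rule(["block", "scan"], 20)` returns the default because entry 20 does not exist.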
The Lesson: Stay Prepared, Don’t Rely on IT Service Providers Too Much
I think it is such a shame that so many entities, including Microsoft itself, were so unprepared for something like this. The good thing, though, is that the vulnerabilities have been exposed, and we hope that both CrowdStrike and Microsoft will do better next time. Corporate entities should not simply wait for them to do better, however. They should bolster their own systems first and put mechanisms in place to ensure that they do not fall victim to the failures of their IT service providers.