Software failures hit hard in 2025, affecting banks, airlines, retailers, and even government systems. These weren't small hiccups. They were large-scale outages, security breaches, and system breakdowns that caused real financial and reputational damage. In this blog, we'll walk through the biggest incidents of the year and uncover the lessons they offer for teams looking to strengthen their quality practices.
1. Cloudflare software bug knocks thousands of websites offline, twice

In November 2025, Cloudflare, one of the most critical internet infrastructure providers, suffered a major global outage that knocked thousands of websites offline for a few hours. Platforms like X (formerly Twitter), ChatGPT, Spotify, Canva, Uber, and League of Legends all went down, leaving users frustrated and businesses scrambling.
The cause? A software bug triggered by a configuration change. Even tools that track outages struggled, since they rely on Cloudflare too. For context, Cloudflare helps manage and protect traffic for 20% of the web and handles trillions of requests daily. While the outage only lasted a few hours, the ripple effect was huge, highlighting just how much of the internet depends on a handful of core providers.
Then in December 2025, Cloudflare stumbled again. A configuration change related to its Web Application Firewall (WAF) briefly disrupted traffic to services including LinkedIn, Zoom, and Shopify, prompting another round of user frustration and executive apologies.
For companies and users alike, it was a clear reminder: even the web's giants aren't immune to hiccups, and downtime anywhere in the chain can mean chaos everywhere.
2. Starlink outage affects thousands of users worldwide
In July 2025, Starlink experienced a major service outage that left tens of thousands of users from Europe, the United States, Africa, Asia, and Australia without internet access for several hours. The disruption spread rapidly across multiple regions, with outage trackers showing a sharp spike in reports from both residential and business users who rely on Starlink as a primary or backup connectivity option.
According to Michael Nicolls, vice president of Starlink Engineering, "The outage was due to failure of key internal software services that operate the core network."
The outage even prompted an apology from senior executives, including founder Elon Musk, who wrote that he was "Sorry for the outage".
Even though the outage was relatively short, the impact was immediate. Many rural businesses and remote teams depend on satellite internet as their only reliable connection, meaning any downtime results in stalled operations, delayed communications, and lost productivity. For companies using Starlink as part of their redundancy strategy, the incident highlighted an uncomfortable reality: even cutting-edge satellite networks aren't immune to cascading software issues.
3. Taco Bell AI drive-through system crashes because of an edge case

In 2025, Taco Bell tried to speed up drive-through service with an AI voice ordering system, but it didn't exactly go as planned. Customers quickly ran into funny and frustrating glitches, from misheard orders to the AI misunderstanding simple requests.
Things went viral when someone ordered 18,000 cups of water as a prank and crashed the system. Clips of the AI getting orders wrong and of frustrated customers spread across social media, forcing Taco Bell to pause and rethink its rollout.
The takeaway? Edge cases matter in software testing. AI systems might handle 95% of normal interactions perfectly, but unusual or unexpected inputs, like prank orders or uncommon phrasing, can break the system. Taco Bell learned the hard way that manual testing and human oversight are still key. And for businesses experimenting with artificial intelligence, it's a reminder that though tech can impress, it doesn't always get it right.
Proper testing for edge cases helps companies catch problems before they reach real customers, saving time, money, and viral embarrassment.
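To make the edge-case point concrete, here is a minimal sketch of the kind of guard and test cases a team might put in front of an automated ordering flow. It is not Taco Bell's system: the function name `review_order_item`, the `OrderDecision` type, and the limit of 25 per item are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical limit; a real system would tune this per menu item.
MAX_QUANTITY_PER_ITEM = 25


@dataclass
class OrderDecision:
    accepted: bool
    reason: str


def review_order_item(item: str, quantity: int) -> OrderDecision:
    """Accept ordinary requests and flag unusual ones for a human."""
    if not item.strip():
        return OrderDecision(False, "empty item name")
    if quantity < 1:
        return OrderDecision(False, "quantity must be at least 1")
    if quantity > MAX_QUANTITY_PER_ITEM:
        return OrderDecision(False, "quantity above limit, escalate to staff")
    return OrderDecision(True, "ok")


# Edge cases are exercised alongside the normal path.
cases = [("water", 2), ("water", 18000), ("taco", 0), ("  ", 1)]
results = [review_order_item(item, qty) for item, qty in cases]
for (item, qty), decision in zip(cases, results):
    print(f"{item!r} x{qty}: accepted={decision.accepted} ({decision.reason})")
```

The design choice worth noting is that an out-of-range order is not silently dropped. It is rejected with a reason that routes it to a person, which keeps human oversight in the loop for exactly the inputs the automation handles worst.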
4. McDonald's AI hiring bot leaves applicants' personal data exposed
In June 2025, McDonald's faced a major software embarrassment when its AI-powered hiring platform, known as McHire, which uses a chatbot named Olivia to screen job applicants, was found to have serious security flaws that could have exposed personal data from millions of applicants. Security researchers discovered that the admin interface was protected by laughably weak login credentials like "123456" for both username and password, which let them into backend systems with no further authentication safeguards.
This security flaw could have exposed personal data from more than 64 million job applicants, including names, emails, and chat histories.
There's no evidence that anyone maliciously exploited the flaw, and the system was patched quickly. McDonald's emphasized that the third-party AI provider had overlooked basic security safeguards.
This incident highlights an important lesson for any company deploying AI or automation: good intentions aren't enough without strong security and quality controls. According to S&P Global Market Intelligence, the number of companies abandoning AI initiatives surged from 17% in 2024 to 42% in 2025, largely due to rising costs, data quality challenges, and security concerns. Cutting corners on basics like password policies, API protections, and authentication, especially in tools that collect personal data, can turn innovative systems into serious liabilities.
At a time when AI is increasingly integrated into core business functions, robust testing, security audits, and threat modeling are essential to prevent the dark side of automation from becoming a real-world problem.
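As one example of the basic checks mentioned above, the sketch below flags credentials that a security audit would normally reject. It is an illustration only and says nothing about how McHire was built: the function `credential_problems`, the short deny-list, and the 12-character minimum are assumptions, and a real audit would use a far larger list of known-bad passwords.

```python
# Illustrative deny-list; real audits use much larger lists.
COMMON_PASSWORDS = {"123456", "password", "admin", "qwerty", "letmein"}
MIN_LENGTH = 12  # assumed policy value


def credential_problems(username: str, password: str) -> list[str]:
    """Return a list of policy violations; an empty list means none found."""
    problems = []
    if password.lower() in COMMON_PASSWORDS:
        problems.append("password is on the common-password list")
    if password.lower() == username.lower():
        problems.append("password matches username")
    if len(password) < MIN_LENGTH:
        problems.append(f"password shorter than {MIN_LENGTH} characters")
    return problems


weak = credential_problems("123456", "123456")
strong = credential_problems("hiring-admin", "correct horse battery staple")
print("weak:", weak)
print("strong:", strong)
```

A check like this could run in a deployment pipeline or a scheduled audit so that default or trivially guessable admin logins are caught before anyone outside the company finds them.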
5. Major cyberattack costs Marks and Spencer £300 million
In April 2025, British retail giant Marks & Spencer (M&S) was hit by a major cyberattack that brought online orders and some in-store systems to a standstill. Hackers, linked to the group Scattered Spider, gained access through an IT help desk and deployed ransomware, forcing M&S to pause online clothing and homeware sales for weeks.
The financial hit was huge. The company warned the breach could slash up to £300 million from operating profits, and its share price dipped as the disruption played out. Beyond lost sales, the attack exposed customer data and revealed just how vulnerable modern retail is to digital threats.
According to Cybersecurity Ventures, global ransomware damage costs in 2025 are expected to reach $57 billion.
The lesson learnt here is that even established retailers can be blindsided when IT systems fail. Strong cybersecurity, thorough testing, and careful monitoring of supplier networks aren't just nice-to-haves. They're essential for keeping business running and customers safe.
6. Barclays IT glitch locks customers out of accounts on payday
Between January 31 and February 2, 2025, Barclays, one of the UK's biggest banks, suffered a major IT glitch that left thousands of customers unable to access their accounts. The timing couldn't have been worse: it coincided with payday and tax deadlines. People reported frozen balances, failed payments, and long queues at ATMs, leaving them unable to pay bills, buy groceries, or check their balances.
The disruption was so widespread that Barclays paid £7.5 million in compensation. Many eligible customers received payouts of up to £100 each, depending on how they were impacted.
The Barclays outage was more than an inconvenience. It showed how deeply people's daily lives now depend on reliable software. When banking systems fail, it's not just frustration. It's missed payments, delayed plans, and real financial consequences. That's why robust fintech and banking software testing, thorough monitoring, and strong contingency plans are essential for critical systems like financial services.
7. Collins Aerospace cyberattack disrupts European air travel

In September 2025, a cyberattack on Collins Aerospace disrupted check-in, boarding, and baggage systems at major European airports, including London Heathrow, Brussels, Berlin, and Dublin. Airlines had to abandon automated kiosks and revert to manual processes, leaving passengers stuck in long lines and causing flight delays and cancellations.
The attack targeted Collins' MUSE passenger processing software, used by airlines to manage electronic check-ins and boarding. With the system offline, staff scrambled to keep operations moving, highlighting just how dependent modern air travel is on complex software systems.
The incident involved ransomware, reminding the aviation industry that even critical infrastructure is vulnerable. For travelers, it was a frustrating few days of delays and chaos, and for airlines, a costly lesson in cybersecurity and system resilience.
8. Shopify Cyber Monday outage
During Cyber Monday 2025, one of the busiest online shopping days of the year, Shopify, the e-commerce platform behind millions of online stores, suffered a major outage that left thousands of merchants unable to access critical tools. Reports spiked mid-morning as shop owners found they couldn't log into their dashboards, point-of-sale systems, or admin panels just as sales activity was peaking.
Although storefronts and checkout pages generally continued working, the login and backend issues prevented merchants from updating prices, managing orders, adjusting inventory, or responding to customers, essentially freezing operations during a day that generated an estimated $14.2 billion in global online sales.
Shopify traced the problem to a login authentication system failure and worked through the day to restore access.
9. Optus emergency call system crash
In September 2025, Australian telco Optus suffered a catastrophic outage that blocked access to Triple Zero (000) emergency services for around 13 hours during a routine firewall upgrade.
Optus admitted warnings weren't escalated, and its monitoring systems failed to flag the problem. CEO Stephen Rue apologised and confirmed the outage was the result of human error: the wrong process plan was selected for the upgrade. Regulators demanded better testing, stronger safeguards, and real-time monitoring to prevent a repeat.
The outage also affected general telecom services, leaving thousands of customers unable to make routine calls. It was a stark reminder that even routine upgrades can turn critical infrastructure into a disaster if testing and oversight fall short.
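One way to picture the "testing and oversight" that a routine change needs is a post-change gate: after the upgrade, probe the critical path several times and treat any failure as a reason to roll back and escalate. The sketch below is a generic illustration under stated assumptions, not a description of Optus's tooling. The function `verify_after_change` and the stand-in probes are hypothetical, and a real probe would exercise the actual service.

```python
from typing import Callable


def verify_after_change(
    probe: Callable[[], bool],
    attempts: int = 5,
    max_failures: int = 0,
) -> bool:
    """Run a critical-path probe repeatedly; False means too many failures."""
    failures = 0
    for _ in range(attempts):
        try:
            ok = probe()
        except Exception:
            # A probe that raises counts as a failed check.
            ok = False
        if not ok:
            failures += 1
    return failures <= max_failures


def healthy_probe() -> bool:
    return True  # stand-in for a real end-to-end check


def broken_probe() -> bool:
    raise ConnectionError("critical path unreachable")  # simulated failure


healthy = verify_after_change(healthy_probe)
broken = verify_after_change(broken_probe)
if not broken:
    print("Post-change check failed: roll back and escalate to on-call")
```

Defaulting `max_failures` to zero is deliberate for a critical path. On a service like emergency calling, a single failed probe after a change is worth a human look.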
What can we learn from the biggest software failures in 2025?
If there's one thing 2025 made obvious, it's that software failures aren't just technical issues. They're business risks. Lost sales, frustrated customers, and damaged trust all add up pretty fast. The good news? Most of these incidents were preventable with stronger software testing, better monitoring, and more realistic real-world validation. Whether you're an enterprise business or a fast-growing startup, investing early in quality assurance isn't optional anymore. It's how you protect your team, your revenue, and your reputation.
Want fewer surprises and smoother releases?
Quality shouldn't wait until something breaks. Get in touch with quality assurance experts and ensure you have all the checks you need to avoid costly surprises.




