AI is everywhere, solving real-world problems but also bringing significant risks and uncertainties. Inspired by the ideas in the book The Coming Wave, this article focuses on how to build fault-tolerant systems for AI, specifically LLMs. Using concepts like FMEA and failover switches, we’ll explore how to handle AI failures effectively and prepare for unexpected challenges, including the possibility of rogue AI. Let’s dive into creating a robust "Kill Switch" for AI.
A few months ago, a friend recommended a book to me—The Coming Wave by Mustafa Suleyman and Michael Bhaskar. For context, Mustafa Suleyman is the co-founder of DeepMind, a company at the forefront of artificial intelligence. In the book, he argues that we are on the cusp of a transformative revolution driven by AI. Alongside its enormous potential, Suleyman warns of the risks it brings: misuse, inequality, and threats to social stability.
Reading this wasn’t just eye-opening; it was unsettling. The risks he described painted a chilling picture of what could go wrong, and, quite frankly, it scared the crap out of me. Thankfully, the book also outlines policy solutions to mitigate these dangers.
Here’s the thing: I’m not a policymaker, an influential CEO, or a billionaire philanthropist. I’m just another software architect and engineer. So, the question that tormented me was—what can I do about these risks? This question sparked an idea: a "Kill Switch." What if we could design mechanisms to disable AI when it goes rogue? I’ll leave it there for now—let your imagination run wild (stream a Hollywood movie, if that helps).
Over the past year, like many of you, I’ve been watching the endless “AI show.” It’s everywhere—"AI this, AI that, even AI toilets." Honestly, it’s exhausting.
So, let me get real for a moment. I decided to experiment with GenAI to help me write this article. And what did I get? A few catchy phrases here and there, but not much else. Even the image I generated looks clumsy and templated. Sure, I can hear what you’re thinking: “By the time you’re done writing, a better model will come out.” You’re probably right—the AI landscape is evolving at breakneck speed. For now, this is what we have.
You might fall on either side of the argument—“AI will take over everything” versus “Relax, we said the same about Google.” As a software architect and engineer, I’m less interested in debating and more focused on what’s actionable. My question is simple: how can I use this tool to make my life easier or my work more impactful?
Let’s park that thought for now and dig deeper into what it all means.
Despite its flaws, GenAI—especially large language models (LLMs)—offers tremendous value for businesses. It’s no surprise that if you’re a software engineer or architect, you may already have been asked (or will be soon) to design systems that leverage these capabilities.
No system is fault-tolerant unless you intentionally design it to be. This is a universal truth in software development. Even Google, with all its expertise, reminds us of this in their SRE handbook. (Fun fact: my machine-learning-enabled phone still takes better pictures than anything I can generate with GenAI!) I’m such a show-off, aren’t I?
So, how do we ensure systems are tolerant to faults? Two critical practices come to mind: Failure Mode and Effects Analysis (FMEA) and failover switches.
You probably see where I’m heading with this. To make AI—particularly LLMs—work in real-world systems, we could apply these same principles. While GenAI spans a vast array of applications, I’ll focus on LLMs for now, as they have the potential to penetrate every corner of our systems, making fault tolerance not just a best practice but a necessity.
Traditionally, FMEA involves identifying all system components, predicting potential points of failure, creating backup plans, and testing the system under failure conditions. So, how can we apply this to systems with large language models (LLMs)?
As a starting point, let’s imagine a system like the one below and try to build an FMEA for it.
If we think of an LLM as just another system component, it seems straightforward: you make a request, and if you don’t get the correct response, it’s a failure. The tricky part is defining an “incorrect response,” especially when failures can have significant customer or business impact.
To address this, we need a tailored approach. Here are some potential steps to measure and manage LLMs:
- Define measurable metrics for what counts as a correct (and acceptable) response in your context.
- Monitor those metrics in real time as the system runs.
- Set thresholds and act promptly when they are breached.
While this isn’t a comprehensive guide, the goal is to establish a framework for measuring LLM performance, monitoring it in real-time, and acting promptly to mitigate failures. When the system detects an issue based on these metrics, it’s time to use alternate systems or approaches—just as in traditional FMEA.
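To make this a little more concrete, here is a minimal sketch of what such a detection step could look like. The metric names, thresholds, and the `LlmMetrics` and `detect_failure` helpers are assumptions invented for this illustration, not part of any particular library; a real system would plug in whatever quality signals your evaluation pipeline actually produces.

```python
from dataclasses import dataclass

# Hypothetical thresholds; tune these for your own use case.
MAX_LATENCY_SECONDS = 5.0
MIN_QUALITY_SCORE = 0.7   # e.g. from an evaluator model or heuristic checks
MAX_ERROR_RATE = 0.05     # rolling share of malformed or refused responses


@dataclass
class LlmMetrics:
    """A snapshot of the signals tracked per LLM call (or per rolling window)."""
    latency_seconds: float
    quality_score: float   # 0..1, however your evaluation pipeline defines it
    error_rate: float      # 0..1, share of failed calls in the current window


def detect_failure(metrics: LlmMetrics) -> list[str]:
    """Return the list of breached checks; an empty list means the LLM looks healthy."""
    breaches = []
    if metrics.latency_seconds > MAX_LATENCY_SECONDS:
        breaches.append("latency")
    if metrics.quality_score < MIN_QUALITY_SCORE:
        breaches.append("quality")
    if metrics.error_rate > MAX_ERROR_RATE:
        breaches.append("error_rate")
    return breaches


# Example: a slow, low-quality window should be flagged on both counts.
print(detect_failure(LlmMetrics(latency_seconds=7.2, quality_score=0.4, error_rate=0.01)))
# ['latency', 'quality']
```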
Handling Failures with Switches
In traditional systems, component failures are typically managed through automatic switchover to backup systems or by presenting meaningful alternatives to users. Unfortunately, with LLMs, failure isn’t always binary—it’s often based on multiple metrics. This makes it wiser to implement controlled mechanisms like switches.
Here’s how it works: the metrics defined during the FMEA feed the switch, and when a threshold is breached, requests are routed away from the LLM to an alternative, such as a backup model, a simpler rule-based path, or a graceful degradation message. The resulting system operates with precision, ensuring that failures—whether minor or critical—are managed effectively. Potentially, the system shown above could look like this:
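As an illustration of the idea, here is one way such a switch could be sketched. The route names, the `FailoverSwitch` class, and the routing rules are all assumptions made up for this example; in practice they would be wired to the metrics from your FMEA and to whatever alternatives your system actually has.

```python
from enum import Enum


class Route(Enum):
    PRIMARY_LLM = "primary_llm"        # normal operation
    FALLBACK_MODEL = "fallback_model"  # smaller or older model known to be stable
    RULE_BASED = "rule_based"          # deterministic, non-AI path
    DISABLED = "disabled"              # no AI involvement at all


class FailoverSwitch:
    """Routes traffic based on breached checks; an operator can override at any time."""

    def __init__(self) -> None:
        self.manual_override: Route | None = None

    def decide(self, breaches: list[str]) -> Route:
        if self.manual_override is not None:
            return self.manual_override      # a human decision always wins
        if not breaches:
            return Route.PRIMARY_LLM
        if breaches == ["latency"]:
            return Route.FALLBACK_MODEL      # minor degradation: switch models
        return Route.RULE_BASED              # quality or error problems: drop AI for now


switch = FailoverSwitch()
print(switch.decide([]))                  # Route.PRIMARY_LLM
print(switch.decide(["latency"]))         # Route.FALLBACK_MODEL
switch.manual_override = Route.DISABLED   # the "single flick" discussed below
print(switch.decide([]))                  # Route.DISABLED
```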
Now, let’s revisit the "naughty AI" problem. Suppose the machines start to rise and a rogue AI threatens humanity (yes, I’m looking at you, Skynet). This is where the ultimate "Kill Switch" comes in. With a single flick, my team member—let’s call him "John Connor," son of "Sarah Connor"—can shut it all down.
Okay, maybe it’s a cheeky nod to science fiction, but the message is clear: we need robust systems to handle not just today’s challenges with GenAI and other AI models but also the unexpected twists that may arise in the future.
In essence, the same principles of fault tolerance and failover apply. Whether it’s handling minor glitches or preventing a rogue AI apocalypse, a well-designed system ensures resilience and control.
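And for the titular idea itself, a deliberately blunt sketch: one global flag, checked before every AI call, that a human can flip regardless of what the automated switches decide. The flag name and the environment-variable mechanism are assumptions chosen only to keep the example short; a shared configuration entry or a feature-flag service would work just as well.

```python
import os

# A global kill switch: a single flag, read before every LLM call.
KILL_SWITCH_FLAG = "AI_KILL_SWITCH"


def ai_allowed() -> bool:
    """Return False once an operator has flipped the global kill switch."""
    return os.environ.get(KILL_SWITCH_FLAG, "off").lower() != "on"


def call_llm(prompt: str) -> str:
    if not ai_allowed():
        # Fall back to a non-AI path instead of calling the model.
        return "AI features are temporarily unavailable."
    return f"(response from the model for: {prompt})"  # placeholder for a real call
```

The key design choice is that the check is dumb and independent: it does not rely on the model, the metrics, or anything else that might be misbehaving at the very moment you need it most.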
I know this is a simplified view of a much larger and more complex problem. There are still critical, unanswered questions that don’t have straightforward solutions.
These are not easy questions to answer, but they are worth asking. As engineers, architects, and stakeholders in this AI revolution, we must remain vigilant and responsible, ensuring that while we innovate, we also build systems capable of safeguarding against unintended consequences.