AI and Us: What I’m Working On These Days

Last year, I joined the communications team at the Machine Intelligence Research Institute. Before this, I spent several hundred hours introducing lay audiences to the basic science of modern artificial intelligence. Below, I’ll share what I’ve been working on and thinking about these past few years, and why I think it matters to all of us. All opinions are, of course, my own.

Humanity seems to be on track to build a machine that is smarter than we are. It seems likely this will occur in my lifetime – maybe even before the decade is out. One of the leading AI labs just announced a $500 billion project with this exact goal. You don’t rustle up that kind of money unless you’re serious. Will they end up with a superintelligence? I don’t know, but the field’s been moving fast lately, the progress has been genuinely striking, and the technology no longer seems decades away.

The problem is, nobody knows what’s going on inside the minds they’re building. Modern AIs are (to oversimplify terribly) enormous arrays of numbers that act like giant coffee filters for data. Data is poured through the layers, and the filters are tweaked millions of times until the coffee starts looking right. For complicated mathematical reasons, this process works; we have AIs that can read and summarize War and Peace in seconds, or solve problems that stump many human experts, or convincingly simulate a human face and voice and chat with you on Skype (yes, really). But nobody knows how the filters work; nobody understands exactly what’s going on inside these giant inscrutable arrays, and nobody can predict how they’ll behave without testing them first. 
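To make the filter-tweaking image a bit more concrete, here’s a toy sketch in plain Python (everything in it is invented for illustration, and real systems are billions of times larger). The “model” is just two numbers, and training means nudging them over and over in whatever direction makes the output look more right:

```python
import random

# A "model" is just an array of numbers. Real AIs have billions; we use two.
w, b = random.random(), random.random()

# The data we "pour through" it: we want the model to learn y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(10)]

def loss(w, b):
    # How wrong the current numbers are, averaged over the data.
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

# Tweak the numbers thousands of times, always in the direction
# that makes the answers a little less wrong.
for _ in range(5000):
    eps, lr = 1e-6, 0.01
    grad_w = (loss(w + eps, b) - loss(w, b)) / eps
    grad_b = (loss(w, b + eps) - loss(w, b)) / eps
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # ends up close to 2.0 and 1.0
```

Even in this two-number toy, notice that nothing in the loop “understands” the task; the numbers just drift until the outputs match. Scale that up to billions of numbers and the result works – but nobody can read off from the numbers *why* it works.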

(For those more technically inclined: This is in contrast to more traditional algorithms. Before machine learning took off, a skilled programmer could look at a code segment and roughly predict what it does and why. A very skilled programmer could even predict the places it’d get stuck and the errors it’d throw. They could say “This code block implements MergeSort, but there’s an error on line 12 that will cause it to mess up if the input contains a percent sign”. This kind of precision is more or less impossible with current machine learning methods. You just have to run them and see what happens.) 
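For contrast, here is the kind of traditional code that parenthetical is talking about – an ordinary merge sort, written in Python as an illustrative stand-in for any classical algorithm. A careful reader can trace exactly what it does on any input and predict in advance where a bug would bite; there is no analogous way to “read” a trained model’s weight arrays:

```python
def merge_sort(items):
    # Classic divide-and-conquer: split the list, sort each half, merge.
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves by repeatedly taking the smaller head.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 1, 4, 2, 3]))  # prints [1, 2, 3, 4, 5]
```

Every branch here can be reasoned about without running it – for instance, you can predict that changing `<=` to `<` would still sort correctly but reorder equal elements. Nobody can make predictions of that kind about the inside of a modern AI.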

Consequently, no one knows how to robustly steer or even verify what a modern AI actually wants. The state-of-the-art plan for aligning a superintelligent AI with human values is, more or less, “ask a slightly less competent machine how to build a smarter one that is good.” How do we align the weaker machine? Nobody knows that either.

This is a terrifying state of affairs. On our current path, the AI we get will almost certainly have goals that are incompatible with our survival. It will probably be inclined to fight us to get what it wants – not because it’s Bad and Evil, but because we’re in the way. If you call forth a mind that’s more powerful than you are, and you don’t have exquisitely detailed knowledge and control over exactly what outcomes it’s aimed at and why, then no matter how strict the summoning, the thing that answers is not your friend.

And, because it’s smarter than we are, it will win. It’s not a question of “misuse”. You cannot safely “use” a mind that considers you an obstacle to its plans. You cannot “capture its value”. You do not “have” a hostile superintelligence. It has you. There are no do-overs, no second chances. You just lose.

I repeat: the AI will win. Not the US. Not China. Not the Democrats or the Republicans, not the Luddites or the technocrats, not the wealthy or the poor. Not OpenAI or Anthropic or Meta or DeepMind or DeepSeek or anyone else who is, at this very moment, frantically attempting to summon a mysterious god on a flimsy leash. Everyone dies. It’s over. The end.

That is a catastrophe I dearly hope to avert.

And so I find myself, normally an ardent supporter of every technology under the sun, compelled to shout from the rooftops: we must not do this thing. We are not ready. Trying will kill us all. 

One day, perhaps, after decades of careful study, humanity might be prepared to call forth something vastly more competent than we are. But right now, it’s clear that we aren’t. If humanity is to survive the coming decades, we have to recognize the blatantly suicidal course we’re on and stop it in its tracks. In practical terms, we need a global halt to training runs of cutting-edge machine-learning models, enforced by national regulation and international treaty alike. How do we get there from here? I’m not sure, but for the sake of everyone I know and love, I’m damn well going to try.

As a middle-class American without much influence on the gears that move the world, it’s been hard for me to wrap my brain around this problem. But I have some ideas for those who, like me, want to make a difference.

  • Think about it. No, seriously. Spend what time you can spare to look into the topic. This website has a decent overview, and some ideas for action as well.
  • See for yourself what a modern AI can do. What does it take to challenge it? Then think about what it means that no one understands exactly why it says the things it does.
  • Talk it over with someone close to you.
  • Ask yourself: Is there something you, personally, can do in five minutes that seems promising? Do it.
  • Elected officials sincerely care about the opinions of the constituents whose votes their jobs depend on. That’s probably you. Let them know what you think.
  • Set a reminder for a convenient time; when it triggers, think about the problem again, and do one more thing.
  • Got a question about all this? Ask me! 
  • See also: https://aisafety.info/how-can-i-help
