Category: Everything’s computer

  • We absolutely do know that Waymos are safer than human drivers

    The Argument readers are invited to a conversation between Jerusalem Demsas and Brink Lindsey, senior vice president at the Niskanen Center, about his new book, The Permanent Problem, on Wednesday, Jan. 21, from 4 p.m. to 6 p.m. at the Johns Hopkins University Bloomberg Center.

    Event page

    America’s abundance movement has focused on regulatory reform and housing policy — necessary fights, but perhaps insufficient ones. This conversation will explore the deeper diagnosis offered in The Permanent Problem: that our crisis isn’t just one of scarcity but of meaning.

    Join us for a conversation about why the abundance movement may need to expand its ambitions: from making things affordable to rebuilding the communities and shared purposes that make abundance worth having.

    Reserve your spot.

In a recent article in Bloomberg, David Zipper argued that “We Still Don’t Know if Robotaxis Are Safer Than Human Drivers.” Big if true! In fact, I’d been under the impression not only that Waymos are safer than humans, but that the evidence to date suggests they are staggeringly safer, with somewhere between 80% and 90% lower risk of serious crashes.

“We don’t know” sounds like a modest claim. But in this case, where it refers to something we do in fact know, about an effect size that is extremely large, it’s a really big claim.

    It’s also completely wrong. The article drags its audience into the author’s preferred state of epistemic helplessness by dancing around the data rather than explaining it. And Zipper got many of the numbers wrong; in some cases, I suspect, as a consequence of a math error.

    There are things we still don’t know about Waymo crashes. But we know far, far more than Zipper pretends. I want to go through his full argument and make it clear why that’s the case.

    In many places, Zipper’s piece relied entirely on equivocation between “robotaxis” — that is, any self-driving car — and Waymos. Obviously, not all autonomous vehicle startups are doing a good job. Most of them have nowhere near the mileage on the road to say confidently how well they work.

    But fortunately, no city official has to decide whether to allow “robotaxis” in full generality. Instead, the decision cities actually have to make is whether to allow or disallow Waymo, in particular.

And there is a lot of data available about Waymo specifically. If the thing you want to do is help policymakers make good decisions, you would want to discuss the safety record of Waymos, the specific cars that policymakers are considering allowing on their roads.1

    Imagine someone writing “we don’t know if airplanes are safe — some people say that crashes are extremely rare, and others say that crashes happen every week.” And when you investigate this claim further, you learn that what’s going on is that commercial aviation crashes are extremely rare, while general aviation crashes — small personal planes, including ones you can build in your garage — are quite common.

    It’s good to know that the plane that you built in your garage is quite dangerous. It would still be extremely irresponsible to present an issue with a one-engine Cessna as an issue with the Boeing 737 and write “we don’t know whether airplanes are safe — the aviation industry insists they are, but my cousin’s plane crashed just three months ago.”

    The safety gap between, for example, Cruise2 and Waymo is not as large as the safety gap between commercial and general aviation, but collapsing them into a single category sows confusion and moves the conversation away from the decision policymakers actually face: Should they allow Waymo in their cities?

    Zipper’s first specific argument against the safety of self-driving cars is that while they do make safer decisions than humans in many contexts, “self-driven cars make mistakes that humans would not, such as plowing into floodwater3 or driving through an active crime scene where police have their guns drawn.” The obvious next question is: Which of these happens more frequently? How does the rate of self-driving cars doing something dangerous a human wouldn’t compare to the rate of doing something safe a human wouldn’t?

This obvious question went unasked because the answer would make the rest of Zipper’s piece pointless. As I’ll explain below, Waymo’s self-driving cars put people in harm’s way something like 80% to 90% less often than humans for a wide range of possible ways of measuring “harm’s way.”

    The “we don’t know” argument relies on indifference to the numbers: sometimes they’re better, sometimes they’re worse; therefore it is impossible to say whether, on the whole, they are better or worse.
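The arithmetic here is worth making concrete. A sketch with purely hypothetical, illustrative numbers (not Waymo’s actual data) shows why “AVs make some mistakes humans never would” is compatible with a large net safety gain: what matters is the total rate, not the existence of novel failure modes.

```python
# Toy illustration with made-up numbers: even if an AV has failure
# modes humans lack (floodwater, crime scenes), its *total* crash
# rate can still be far below the human baseline.

human_rate = 5.0          # hypothetical human crashes per million miles
av_humanlike_rate = 0.5   # AV crashes of kinds humans also have
av_novel_rate = 0.25      # AV-only mistakes humans would not make

av_total = av_humanlike_rate + av_novel_rate
reduction = 1 - av_total / human_rate

print(f"AV total: {av_total} crashes per million miles")
print(f"Relative reduction vs. human baseline: {reduction:.0%}")
# With these toy numbers, the AV is 85% safer overall, despite
# a nonzero rate of mistakes no human would ever make.
```

The point of the sketch is only that “sometimes better, sometimes worse” is an empirical question about rates, which the available mileage data can answer.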

  • I can’t stop yelling at Claude Code

    They say that if you really want to know a person’s character, you should observe how she treats her servants. On Christmas vacation, I realized I didn’t like what this said about me.

    I was at my parents’ house; my oldest daughter was playing board games with her grandparents, and my youngest was trying to befriend my sister’s cat. My wife was napping, my brother was cooking, and I was yelling at Claude.

    I was, to be clear, incredibly impressed by Claude on the whole — specifically Claude Code running Opus 4.5, Anthropic’s command-line “agent,” a large language model that can build websites and do projects for you. I had absentmindedly pitched it an idea one day earlier, and now we had a functioning website and several hours of playable content. Working with Claude was like having an eager, responsive, literally indefatigable development team on tap — on Christmas Eve! I had never felt so powerful.

    I wasn’t in love with what it was doing to me.

    In college, I was once told that the really hard part of programming was knowing, in sufficient detail, what you wanted the computer to do. This was not my experience of programming.

    In my experience of programming, the really hard part was figuring out which packages weren’t installed or weren’t updated or were in the wrong folder, causing the test we’d done in class to completely fail to work in the same way on my own computer. The next really hard part was Googling everything the debugger spat out to find an explanation of how to make it go away.

    I hated programming. I studied it because at my university, it seemed like everyone studied it; I studied it because it was where the good jobs were; I studied it because I was envious of what my friends who could program could do, the way they could sit down and tap at their keyboard for half an hour and make a website — a kludgy, messy website, but a website nevertheless — come to life.

    If I worked 10 times as hard as in all my other classes combined, I could get good grades, but I never felt the magic. Programming was an unending drudgery of package installations, looking up how libraries worked, figuring out why the code still didn’t work, fixing it, and then finding that the code still didn’t work for new, frustrating reasons.

    Claude Code solves all of that. Programming, now, really is just a matter of knowing in sufficient detail what you want the computer to do; no small matter, but a meaningful one, a fun one, an important one. Coding, a task that mostly tested my frustration tolerance, had been turned into writing, a task that I can barely be induced to stop doing when my drafts are already way too long.

    Now, 99% of the time, it feels like magic. The remaining 1% is absolutely maddening.

    This isn’t a totally new feeling: a feeling of frustration somewhere between hitting your printer when it isn’t working and yelling at a puppy for peeing on the couch. But I can tell, using Claude Code, that it is going to be a big part of my life going forward, and I don’t want “yelling at the printer” to be a big part of my personality.

    I was inspired to try out Claude Code by the insistence of some people I respect that “This Is It. AGI — that is, general artificial intelligence, variously defined but often meaning artificial intelligence that can do everything humans can do on a computer — is Here.”

    I knew that Claude Code wasn’t going to be AGI. But I will say this: A lot of the time, it feels like it is. That is, if you happen to run across the kind of problems that Claude Code is really good at solving, instead of a bunch of the kinds of problems it’s really bad at.

    And precisely because it is so good most of the time, when it’s incredibly dumb, it is maddening in a way that I’ve never found in any previous AI system. When a toddler gets a multiplication problem wrong, it’s not maddening; it’s kind of cool they were attempting multiplication at all.

    But Claude Code is good enough that it’s easy to start to relate to it as, well, an almost-human employee, with an element of how you relate to a critical household appliance. You send it specs, it builds them. You ask questions, it answers them.

    Some part of your brain starts to rely on a new affordance. I can delegate tasks to Claude. I can whisper things and see them spring to life full-formed.

    And then, sometimes, you can’t.

  • AI predictions for 2026: The flood is coming

    A good forecaster doesn’t start with the future; she starts with the past. I wanted to answer the question “What should we expect from AI in 2026?” and so I began to reflect on what had happened this last year. If you’re interested in doing some forecasting of your own, stop reading for a moment and jot down a few notes (or comment below) about how you think AI changed in the last year.

    Most people can’t name a single thing that changed from the beginning to the end of the year, even while the technology improved massively.

The best widely available general-purpose image model on Jan. 1, 2025, was Midjourney V6.1. The best today, I’d argue, is Google’s Gemini 3 Pro Image Preview (branded as Nano Banana Pro, and available to test out for free). Here are some prompts that I think showcase the differences (I copied these straight from my conversations with the chatbots, so please excuse the typos):

    “photorealistic image of four children sitting on a couch gathered around a laptop, watching something. some of them are sitting on the back of the couch so as to get a good view. their postures are varied and absurd but they’re all engrossed in what they’re watching”

    Here’s a response from the version of Midjourney available Jan. 1, 2025:

  • Don’t get fancy with your labor market fixes for AI

The only thing emerging faster than the AI arms race in America is the debate over what to do about the workers whom AI will displace.

    Many of the loudest voices are also pitching some of the most ambitious ideas. OpenAI CEO Sam Altman is interested in implementing a universal basic income (UBI) — an experimental idea that would require not only a gigantic budget but also massive new infrastructure to develop and implement. UBI sounds like a simple solution, but any new program — particularly one aimed at universality — would need to make decisions about eligibility, implementation, and oversight that are complex and politically contentious.

    While politicians and tech optimists are out there pitching programs that are tough to build consensus around — or overly specific to the point of being unworkable — we already have a boring, scalable solution to the problem of mass job loss.

    It’s called unemployment insurance (UI).

    The best economic policy is flexible, automatic, and responsive to a range of shocks. UI already exists in all 50 states and has a strong track record of responding to technological disruption in the labor market.

We don’t yet have a clear sense of exactly how big AI’s labor-market disruptions will be, or who would be most immediately affected. But if we bolster a long-running program that already serves as a social safety net for the unemployed, the economy can endure whatever shocks AI might deliver.

    Why build UBI when we already have UI?