I don’t know who first said this. But Nathan @labenz repeats it often. He’s right to do so.
Over the past few years, Joe Carlsmith has published several blog posts that nicely articulate views I had also arrived at, for similar reasons, before he published them.[1] My own thinking has certainly been influenced by him, but on non-naturalist realism, deep atheism and AI existential risk, and a few other topics in AI and metaethics, I was definitely there-ish before he published. But: I had not written up these views in anything approaching the quality of his blog posts, and I’d have found it hard to do so, even with great effort.
What should I make of the fact that one of the best contemporary philosophers is on a similar path on some topics? On the one hand, this is gratifying and encouraging: this is some evidence that (a) my views are correct and (b) that I “have what it takes” to develop my own, somewhat novel views on important topics at the vanguard.
On the other hand, it makes me think “Joe has it covered, and will do a better job than me”. This pushes on my long-running concern that spending time on moral philosophy and futurism—which I am constantly drawn to—is mostly self-indulgence on my part; that going “all in” on this stuff would mean falling short of my “be useful” aspiration. If I went “all in”, I think it’s 90%+ likely that I’d top out as “good”, but not “world class”. And, on the face of it, the returns to being merely “good” are pretty low.
Much better, plausibly, to keep the philosophy as a passionate side-project. It feeds into my work as an “ethical influencer”, which is one way of thinking about the main impact of my career so far. Plausibly this role—perhaps mixed with some more “actually do the thing” periods—is my sweet spot in the global portfolio.
[1] To be clear: Joe also has a lot of fantastic posts containing many, many “fresh to me” ideas and insights. I read everything he writes.
Holden Karnofsky: One of the reasons I’m so interested in AI safety standards is because kind of no matter what risk you’re worried about, I think you hopefully should be able to get on board with the idea that you should measure the risk, and not unwittingly deploy AI systems that are carrying a tonne of the risk, before you’ve at least made a deliberate informed decision to do so. And I think if we do that, we can anticipate a lot of different risks and stop them from coming at us too fast. “Too fast” is the central theme for me.
You know, a common story in some corners of this discourse is this idea of an AI that’s this kind of simple computer program, and it rewrites its own source code, and that’s where all the action is. I don’t think that’s exactly the picture I have in mind, although there’s some similarities.
The kind of thing I’m picturing is maybe more like a months-to-years time period from getting sort of near-human-level AI systems — and what that means is definitely debatable and gets messy — to just very powerful ones that are advancing science and technology really fast. And then in science and technology — at least on the less bottlenecked fronts — you get a huge jump. So I think my view is at least somewhat more moderate than Eliezer’s, and at least has somewhat different dynamics.
But I think both points of view are talking about this rapid change. I think without the rapid change, a) things are a lot less scary generally, and b) it is harder to justify a lot of the stuff that AI-concerned people do to try and get out ahead of the problem and think about things in advance. Because I think a lot of people complain about this discourse that it’s really hard to know the future, and all this stuff we’re talking about — what future AI systems are going to do and what we have to do about it today — it’s very hard to get that right. It’s very hard to anticipate what things will be like in an unfamiliar future.
When people complain about that stuff, I’m just very sympathetic. I think that’s right. And if I thought that we had the option to adapt to everything as it happens, I think I would in many ways be tempted to just work on other problems, and in fact adapt to things as they happen and we see what’s happening and see what’s most needed. And so I think a lot of the case for planning things out in advance — trying to tell stories of what might happen, trying to figure out what kind of regime we’re going to want and put the pieces in place today, trying to figure out what kind of research challenges are going to be hard and do them today — I think a lot of the case for that stuff being so important does rely on this theory that things could move a lot faster than anyone is expecting.
I am in fact very sympathetic to people who would rather just adapt to things as they go. I think that’s usually the right way to do things. And I think many attempts to anticipate future problems are things I’m just not that interested in, because of this issue. But I think AI is a place where we have to take the explosive progress thing seriously enough that we should be doing our best to prepare for it.
Rob Wiblin: Yeah. I guess if you have this explosive growth, then the very strange things that we might be trying to prepare for might be happening in 2027, or incredibly soon.
Holden Karnofsky: Something like that, yeah. It’s imaginable, right? And it’s all extremely uncertain because we don’t know. In my head, a lot of it is like there’s a set of properties that an AI system could have: being able to do roughly everything humans are able to do to advance science and technology, or at least able to advance AI research. We don’t know when we’ll have that. One possibility is we’re like 30 years away from that. But once we get near that, things will move incredibly fast. And that’s a world we could be in. We could also be in a world where we’re only a few years from that, and then everything’s going to get much crazier than anyone thinks, much faster than anyone thinks.
See also: HK on PASTA.
The impressive performance of recent language models across a wide range of tasks suggests that they possess a degree of abstract reasoning skills. Are these skills general and transferable, or specialized to specific tasks seen during pretraining? To disentangle these effects, we propose an evaluation framework based on “counterfactual” task variants that deviate from the default assumptions underlying standard tasks. Across a suite of 11 tasks, we observe nontrivial performance on the counterfactual variants, but nevertheless find that performance substantially and consistently degrades compared to the default conditions. This suggests that while current LMs may possess abstract task-solving skills to a degree, they often also rely on narrow, non-transferable procedures for task-solving. These results motivate a more careful interpretation of language model performance that teases apart these aspects of behavior.
We examine whether substantial AI automation could accelerate global economic growth by about an order of magnitude, akin to the economic growth effects of the Industrial Revolution. We identify three primary drivers for such growth: 1) the scalability of an AI labor force restoring a regime of increasing returns to scale, 2) the rapid expansion of an AI labor force, and 3) a massive increase in output from rapid automation occurring over a brief period of time. Against this backdrop, we evaluate nine counterarguments, including regulatory hurdles, production bottlenecks, alignment issues, and the pace of automation. We tentatively assess these arguments, finding most are unlikely deciders. We conclude that explosive growth seems plausible with AI capable of broadly substituting for human labor, but high confidence in this claim seems currently unwarranted. Key questions remain about the intensity of regulatory responses to AI, physical bottlenecks in production, the economic value of superhuman abilities, and the rate at which AI automation could occur.
See also: Sam Hammond’s critical discussion.
And note that even those most bullish on explosive growth typically put its probability at only about 1/3 before 2100.