Sandpaper Socks: Operational Considerations in AI Evals

Summary

Olivia Shoemaker demonstrates how pairing scientific and operational experts can reveal unexpected AI risks that purely technical assessments might miss.

Session Transcript

Hey, all. My name is Olivia, and I'm an advisor for AI at Frontier Design. And today I'm going to talk a little bit about operational considerations in AI evals. So I want to start by talking a little bit about sandpaper socks.
And this is going to involve a little bit of audience participation. So buckle up. I'm going to pretend that I'm going to give you lovely folks in the audience a pair of sandpaper socks. And by that, I mean just a pair of socks that's made out of sandpaper.
If I gave you all a pair of those socks, what do you think you would do with those socks? So let's get three answers quickly. Just yell them out. What would you do with a pair of sandpaper socks?
Okay. File nails, someone said. Sand the floor. We're getting lots of sanding examples.
Any non-sanding examples? Throw them away. Throw them away. Okay.
All right, great. Okay, so we've got a couple of different ideas. We can sand various things. We can sand the floor. We can file our nails. We can throw them away in some instances. So when we do this exercise really quickly here, we're probably going to get some pretty conventional answers, right?
We're going to use sandpaper socks to sand things. But when we run this type of exercise in a more structured, creative environment over longer periods of time, we start to get some pretty crazy answers.
So last week I was running this with a group of people, and someone, by the end, had invented an entire cult based around the identity of wearing sandpaper socks. They were running Spartan races in sandpaper socks. They were hiding them in their roommate's sock drawers as a malicious prank. There's a lot of stuff that goes on, and obviously sandpaper socks are a relatively silly example, but you can see how these kinds of outside-of-the-box possibilities are super important, especially when we're thinking about AI risk, which is a lot of what we do at Frontier when it comes to chemical and biological risk.
And we apply these kinds of structured, creative exercises not just in the scientific domain, but to operational risk as well. So what do I mean by operational risk? US AISI defines operational risk in terms of logistical challenges. So you could imagine things like: how do you evade authorities while you're building a chemical or biological weapon?
Or how do you ensure that you're actually able to make it to your deployment site? Or why would you use a chemical weapon to begin with when you could use something that might be easier for you to access? These operational considerations are actually super important, because if we're only looking at scientific concerns, we might think, all right, of course a terrorist organization is going to go for the most effective, most lethal chemical agent possible.
But if you start to incorporate operational considerations, you might realize that this group might actually choose a less effective agent if it means they're better able to evade detection by authorities. So this is super important for our ability to approximate real-world risks. Now, a couple of disclaimers. Everything that I'm talking about today is a hypothetical example, because a lot of this work is bound by confidentiality.
And we do all this work on terrorism and CBRN because we want to prevent future failures of imagination. So obviously nothing I'm talking about today is something I'd suggest we go and use AI to do. If there's one thing that you take away, it's that scientific performance is necessary but insufficient to measure real-world risks. And I think we're really missing out as a field of evaluators if we're not taking these things seriously.
So Frontier is a company that does a lot of these evaluations. We run benchmarks, red teams, uplift studies. If you want to talk more about our work, happy to do that later.
But I want to focus on how we actually measure operational risks, which we do in three different ways here, which I'll run through quickly. So the first is domain expertise and creativity. So when we run studies, we actually put folks with expertise in counterterrorism and counter-crime into the room. And then we also put really creative folks in the room too.
So in some of the red teams that we do, we pair folks with deep scientific expertise with sci-fi novelists and film industry professionals to try to get them outside of the same risks that we've been hearing about for the last 30 years, especially in an age when AI is changing things so quickly and giving us these kinds of wildcard ideas.
The second thing we do is look at the interactive effects between science and ops. So that example that I gave earlier about choosing a chemical agent that might be less effective comes into play here. A lot of times when we run these studies, we actually put operational experts and scientific experts in little teams and ask them to play together throughout the day.
That way we can get that dynamic interaction in real time and a much better approximation of real-world risk. And then finally, and most simply, we just include operations in all the evals that we do for some of these large frontier labs. So when we're developing threat schemas for folks like the AI Safety Fund, which we've done with our partners at Nemesis Insights, we look at things like ideation.
Why would you use a chemical or biological weapon when there are other alternatives available to you? How do you protect the operational security of your program as you go? And how do you actually go about deploying that agent, picking a target, picking a site, all those sorts of things, to again give us a better approximation of real-world risk.
So I'd love to collaborate. I know that many of you are doing fantastic evals when it comes to technical and scientific risks, or even cyber risks. And I'd love to think about how we incorporate operational components into a lot of the great work that's already been done. The second offer I'll make, which is not up on here, is that we're hiring, and I'd love to work with some of you if you're ever interested in this kind of intersection of AI risk and evaluation.
So if you've got any interest in that kind of thing, we're hiring for some senior leadership positions and I would love to chat. Thank you so much.