Governance Challenges of Biological AI Models
Summary
Session Transcript
I think Austin was being very modest before. I will say that he is one of the few people in Washington who is both extremely smart on policy and politics. If you want to get anything done in DC, please partner with SeedAI. I'm Melissa Hopkins. I'm Health Security Policy Advisor at the Johns Hopkins Center for Health Security. I'm a lawyer by training, and I worked for two years in the US Congress, most recently as a tech fellow specialising in AI and biotech for a senior member representing Silicon Valley.

I'm just going to show you, really quickly, the training compute of biological AI models over the years. You can see that the top model recently is ESM3. That's at about 10^24 FLOP, which is just under the 10^25 FLOP threshold at which the EU AI Act automatically qualifies a model as a general-purpose AI model with systemic risk. There are about 360 biological AI models that we're aware of, and this is from Epoch AI (shout out to Epoch).

My main point today is, first, just as a general principle, to future-proof your legislative text. Legislative text is very slow to change, but technology is very fast to change. That means, generally, try not to put numbers in your legislation: do not put FLOP thresholds, if you can help it, and do not put parameter counts in your legislative text. It's perfectly fine to put them in your regulations, because you can update your regulations fairly quickly.

Also, be mindful of what kinds of models your legislation will apply to under the definitions that you use. We found that biological AI models fall within the scope of the definitions that many policies and guidance documents are using, but the things those documents then require are not necessarily appropriate, or should not be applied, to biological AI models in those specific ways.
I think as you're scoping policies, if you can carve out exemptions or tailor your policies to be mindful of three key differences between biological AI models and LLMs, you'll be able to future-proof your policies in appropriate ways.

The first key difference between biological AI models, or BAIMs as we call them, and classic LLMs or other generative AI models is information hazards when red-teaming. When you're red-teaming an LLM, you're red-teaming information that already exists. When you're red-teaming a BAIM, you're red-teaming to see whether it can create novel information. There are information hazards inherent in that which you might want to caution policymakers about when they advocate for red-teaming.

The second is that there are accident risks, biological lab accident risks, from red-teaming BAIMs. That comes from the fact that it's actually hard to validate that a biological AI model poses inherent risks without doing wet-lab validation. I'll have to look this up for my colleagues. Raise your hand, colleagues in the back. Thanks.

For instance, my colleagues at CHS said that one can imagine designing new, comparatively harmless viral proteins with foundation BAIMs and then optimizing their fitness and immune evasion capacity with narrow BAIMs, which would be a concerning capability if it happened within a single evaluation or model. You could imagine having to take that output into a wet lab to be able to verify it. That same example also illustrates conjunction, or linked-model, risks. It's very easy to take an LLM, do one single evaluation of it, and say yes or no, is it giving me a bad output?
It's much harder to evaluate a biological AI model on a single output, because you have to link it with other models to see whether its output, combined with another model's output, is what creates the harm. You'll have to be aware of that when you're crafting policies, because it might take multiple evaluations to actually identify the risk from a biological AI model compared to an LLM.

Finally, for biological AI models, we think it might be appropriate to identify risks earlier in the AI life cycle, at pre-development rather than pre-deployment. It might not be appropriate to ask a biological AI model developer to do a pre-deployment evaluation if they've already done a pre-development risk assessment. Be aware of those kinds of differences when you're crafting legislation, and we'll be in good shape for future-proofing. Thank you.

[Applause.]