Security for AI with Confidential Computing

Summary

Mike Bursell highlights the role of attestation in AI security, explaining how confidential computing uses Trusted Execution Environments to ensure data integrity and confidentiality, and how the Confidential Computing Consortium collaborates with industry partners to enhance protection.

Session Transcript

If you're here to talk about attestation and security for AI, that's good; you're in the right place. If you're not, well, attestation is the name of the talk.
Okay, I'm scaring people off. A couple of people coming in, I think. The name of the talk in the schedule is attestation, and that is absolutely what I'm focusing on.
Okay, so first of all, that's me. I am the exec director of the Confidential Computing Consortium.
Who has heard of confidential computing? Okay, so half the people have heard of confidential computing. That's a start. What I want to talk about is, firstly, why we care about this. Secondly, a very brief intro to confidential computing, the joy of attestation, and then go on to AI.
So luckily, AI is secure. Well, it's secure, or AI is... I just want to give up at this point and say we haven't got a statement we can use. So I want to give an example of why that is. Has anyone seen this picture? It's quite a well-known picture in the software security world; it's the supply chain.
Your software supply chain generally looks like this. Nothing particular to do with AI, but this is kind of what it looks like. It's used because if you care about the security of the software that comes out at the end here, the software you're using, then you need to care about all the other pieces, because if the supply chain's security is breached at any point, you've got a problem. The idea is that each of these points gives an opportunity for someone to break that security.
Each time you're doing anything, and each time you're transferring something, there's an opportunity for security vulnerabilities. So what's this got to do with AI? This is kind of how I think of AI, and if you disagree or I'm wrong, then please talk to me afterwards. But let's go with this for now. You've got a training model, and you add a couple of data sets to it, and out of that pops an inference engine. You have a user, and she, in this case, can ask questions of the inference engine and hopefully will get a sensible answer.
Is that a fair enough way of looking at the world? Let's pretend. Let's think about what this looks like if we think about the security case. The first thing is, what if someone messes with the data sets? You've got a problem, right? Or they could in fact mess with the training model. That'll give you a problem. Or they might mess with it whilst it's being transferred to the inference engine. Or they could in fact mess with it whilst it's being used as an inference engine. Or they could mess with the question that's being asked by the user, or they could mess with the answer that's being given to the user. The point being that we have lots of opportunities for bad things to happen in your AI supply chain, if we're going to call it that—the life cycle of an AI. You have betrayed my tiny trust. I'm very sad. A picture of a sad cat is appropriate here.
So confidential computing, what is it? How will it help? What will it do?
Let's just step back a bit to how the world works. Most of the computing world these days works like this: everything is a workload. Very few things have their own dedicated machine that they run on all the time. It does happen, but most workloads, most applications we're running, look like this: often in the cloud, but they might be on a machine you own and control. You can think of the world a bit like this: the big dotted line is the system, the computer, and then there's a host operating system and all of those nice bits and pieces, and there are one or more workloads. The workload we care about is the one on the right, which is in yellow, and apologies if you have color blindness issues, but it's the one on the right we care about. That's our workload, and we want to protect that.
Confidential Computing aims to help us with this. I'm not going to go into a huge amount of detail, but it's about the protection of data in use by performing computation in a hardware-based and attested Trusted Execution Environment. The phrase hardware-based is really important here. We do not trust software because people can mess with software. They can mess with hardware a bit, but it's much, much more difficult. So, hardware-based. We'll come to the word attested in a minute because that's very important as well.
A Trusted Execution Environment, you can basically think of it as capabilities in a chip which allow you to protect (typically encrypt) that data so that it is only ever decrypted when it's on the chip. This means that other workloads can't look into it or mess with it. Even the computer itself that's running it, the operating system, the admins, the hypervisors, the container runtimes, can't mess with it either. This means that now you can have assurances about the confidentiality and integrity of your workload. No one can look in, and no one can mess with it. But there's a problem, and this is where attestation comes in.
Let's say I have a workload that I want to run, and I say to Phil here, will you run my workload for me in a Trusted Execution Environment because I don't trust you not to mess with it? And he says, yes, of course, I've done that. I've put that workload in a Trusted Execution Environment. It's fine now. The problem is I can't trust him. The whole point is I didn't trust him in the first place, so why would I trust him to have given me the right answer? He could have just pretended it's in one of these environments and protected by hardware. This is where attestation helps us. There are a number of models for how you can use attestation. This is one of the simplest, and I'll go through it pretty quickly.
Here, the nice blue box is the idea of a Trusted Execution Environment; this is a protected area, protected by the hardware. In it, we have the application—in this case, it might be a training model—and some data (the yellow bit). That is inside the Trusted Execution Environment. At the bottom, we have a CPU or GPU, and in the future, that could be an NPU, DPU, or whatever you want it to be, and we have a user on the bottom left. The user wants to have strong cryptographic assurances that what is running in there and the data that's in there is what they think it is.
Here's what happens: the application says to the chip running it, would you please measure me? By measure, I mean take a cryptographic hash of every bit in the memory that's being run. We’ve now got a hash of that, so we know exactly what that looks like. What the CPU or GPU then does is it signs that with some keys which can be traced back to Intel, AMD, Nvidia, or whoever was the creator of that chip. So we now have this cryptographic measurement which says, okay, we've measured what was in there, and we've signed that. We’ve sent that to an attestation service. That might be a third party or it might be you running it, but the point is, the attestation service can look at what's in that signed quote and check: a) that it was signed correctly by a trusted party in this context and b) that it is what we think it is. If it is, then you get a big tick for the user, and the user is happy. They know they have cryptographic assurances that what is running in that TEE is what they think it is, and that it is running on valid hardware, and that it’s all being correctly set up. That measuring process is quite complex and involves things like checking the microcode and the firmware, as well as all of the stuff that’s relevant in the stack.
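To make that flow a little more concrete, here is a minimal sketch in Python of the measure, sign, and verify steps just described. It is purely illustrative: the key, the workload bytes, and the expected reference value are all stand-ins, and real TEE evidence (SEV-SNP, TDX, and similar quote formats) carries far more structure than a bare hash and signature.

```python
# Minimal, illustrative sketch of the measure / sign / verify flow.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# --- Inside the chip: measure the workload and sign the measurement ---
chip_key = Ed25519PrivateKey.generate()   # stands in for the vendor-rooted signing key

def measure_and_sign(workload_memory: bytes):
    measurement = hashlib.sha384(workload_memory).digest()  # "would you please measure me?"
    quote = chip_key.sign(measurement)                      # signed by the chip
    return measurement, quote

# --- Attestation service: check the signature and the expected value ---
def verify_quote(measurement: bytes, quote: bytes, expected: bytes) -> bool:
    try:
        chip_key.public_key().verify(quote, measurement)    # a) signed by a trusted party?
    except InvalidSignature:
        return False
    return measurement == expected                          # b) is it what we think it is?

workload = b"training model + data sets loaded into the TEE"
expected = hashlib.sha384(workload).digest()   # reference value known in advance
measurement, quote = measure_and_sign(workload)
print("attested:", verify_quote(measurement, quote, expected))  # -> attested: True
```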
We've now got a really nice thing here: some really nice properties that we can be assured of. But it allows us to do a lot more than that, because that thing down there, that measurement that we created and signed, is basically a certificate; what really is a certificate other than a signed piece of information, in this case a hash? You can start doing some really nice and useful and cool things with certificates. You can use it for TLS, for instance, when you want to build up a channel to talk over on the network. That's a useful thing. You can use it to prove identity, also very useful, and to prove uniqueness. You can extend it, or use it for signing the output of an application, things like that as well. As well as that, of course, you've got these confidentiality and integrity assurances. We've got a bunch of primitives, a bunch of use cases that we can use this certificate for, which come completely out of the fact that we've got this attestation capability.
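As an illustration of the signing and channel-binding ideas, here is a hedged sketch of a common pattern: generate a key pair inside the TEE, bind a hash of the public key into the quote's user data, and then sign application output (or a TLS key) with it. The names and layout are assumptions made for the example, not any particular vendor's quote format.

```python
# Illustrative only: binding an application key to the attestation so that
# signed outputs (or a TLS key) can be tied back to the attested workload.
# Real quote formats have a dedicated "report data" field for this purpose.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Key pair generated *inside* the TEE; a hash of the public key goes into
# the signed quote as user/report data when the workload asks to be measured.
app_key = Ed25519PrivateKey.generate()
pub_bytes = app_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
report_data = hashlib.sha256(pub_bytes).digest()  # bound into the attestation quote

# Later, the workload signs its output with that attested key.
answer = b"inference result for the user's query"
signed_answer = app_key.sign(answer)

# A relying party who has verified the quote (and therefore report_data)
# can check that this answer really came from the attested workload.
app_key.public_key().verify(signed_answer, answer)  # raises InvalidSignature if tampered
```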
Now, attestation is a word that is used quite loosely within computing and supply chain security. Here, we're using it very specifically—sometimes this is called remote attestation because, again, I don't trust Phil to do that; the verification needs to be done separately. Sometimes it's called hardware-based attestation as well. But this is a very specific use case here.
So let's try and combine these properties and see what happens. Let’s think about all those things we want to do. We want confidentiality and integrity throughout all of the training and inference steps because we want to know that what goes in is not being messed with and what we’re using is not being messed with and that other people can't look at it. You may want to check that other people can't look into your training model as well, lots of good reasons for that. We'd also like to be able to check that what comes out as an answer can be directly tracked back to the initial data sets. That's a very useful property to have as well. Also, I'd like to be sure when I ask questions of the inference engine, no one can mess with that. Those are all those different things I had, the skulls that I had.
So, let's look at how this might work. First of all, we are going to run our training model and put the data sets into a Trusted Execution Environment, and we create an attestation certificate based on that. That's a signed hash, so we know exactly what was in there. What we've just done is protect the confidentiality and integrity of both those data sets and of the training model. These are different properties, which is why I've listed them separately. There are four properties there really, but they're different things you might care about for different reasons.
We can now transfer that into the inference engine. If we embed that certificate into the inference engine, we can now be assured that that inference engine has come from, has been based on, those data sets and that training model. We've now got provenance all the way back to those data sets and that training model. We can also be sure that we have integrity, and we could even use this for communications confidentiality, but I haven't shown that here.
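A rough sketch of that provenance chaining, under the same illustrative assumptions as the earlier snippets: the training-stage certificate (a signed set of measurements) is embedded in the inference engine, and the engine is then measured as a whole inside its own TEE, so the second certificate covers the embedded provenance record too. The record layout below is invented purely for the example.

```python
# Invented record layout, purely to illustrate the provenance chain.
import hashlib, json

def measurement(blob: bytes) -> str:
    return hashlib.sha384(blob).hexdigest()

training_cert = {                      # produced and signed in the training TEE
    "training_model": measurement(b"training model image"),
    "data_sets": [measurement(b"data set A"), measurement(b"data set B")],
}

inference_engine = {                   # the training certificate travels with the engine
    "weights": measurement(b"inference engine weights"),
    "provenance": training_cert,
}

# Measuring the engine plus its embedded provenance record gives the second
# certificate: anything it answers can now be traced back to the inputs.
inference_measurement = measurement(json.dumps(inference_engine, sort_keys=True).encode())
print("inference engine measurement:", inference_measurement[:16], "...")
```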
Now, having done that, because we're running that in a TEE, we can also create a certificate for that. We now have another certificate that we've created—a second certificate, I suppose—which shows that that inference engine has not been messed with, has not been looked at: the integrity and confidentiality of that. We've now hit another of the properties we were trying to get to. Last but not least, we use that certificate also for protecting the communications between the user and the inference engine.
We have now just managed to create a system which meets all of those properties, or provides all the properties that we were looking for before. We are protecting the confidentiality and integrity of those data sets. We're protecting the confidentiality and integrity of the training model. We are tracking the inference engine's provenance, and therefore, because we are protecting the integrity here, we can be sure we can track the provenance of the answers all the way back to the training model and the data sets. And we are protecting the confidentiality and integrity of that query and response information.
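Pulling the illustrative pieces together, a user-side check might look roughly like this. None of these parameter names correspond to a real verifier API; they simply mirror the properties listed above.

```python
# Rough, assumption-laden sketch of the checks a client could make before
# trusting an answer; each condition maps to one of the properties above.
def user_trusts_answer(inference_quote_verified: bool,
                       provenance: dict,
                       expected_training_model: str,
                       expected_data_sets: set,
                       channel_key_matches_quote: bool) -> bool:
    return (
        inference_quote_verified                                 # engine runs in a genuine, measured TEE
        and provenance["training_model"] == expected_training_model
        and set(provenance["data_sets"]) == expected_data_sets   # answers trace back to known inputs
        and channel_key_matches_quote                            # query/response channel bound to the TEE
    )
```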
By using confidential computing, which is hardware-based, attested Trusted Execution Environments, we've built a whole bunch of very useful properties into this end-to-end system. I've gone through that rather quicker than I thought, so we'll have lots of time for questions if people want, but first just a little bit about the Confidential Computing Consortium.
We are a community focused on projects. We have 13, soon to be 14, open-source projects—I’ll show you some of them in a moment—securing data in use and accelerating the adoption of confidential computing through open collaboration. That’s important because we are part of the Linux Foundation. We have been around for five and a half years. These are our premier members. We have over 50 members total. These are obviously some big names here. We have lots of startups. We have a bunch of academic institutions who are also members of the Confidential Computing Consortium.
Here are some of the—these are all the existing open-source projects. There will be another one added probably in the next week or two. They do a variety of things. Some of them are easy frameworks to let you pick up and play with this. Veraison, for instance, is an attestation framework. Remember, I showed you that attestation service. Someone's got to run that, and this is the capability to allow you to do that. We have others as well, providing things like environments in which you can run your application. Some are doing things with particular chipsets, and others are more general.
For instance, Islet, over there on the left, was donated by Samsung last year, and that allows you to use it for cryptographic wallets, cryptocurrency-wallet-type things. We've got a variety of things, and one of the latest has got the coolest logo, the little ManaTEE, which I do like. I've given a number of presentations where I riff on TEE and tea, because I'm British, and I decided not to do that today. Lucky you.
We'd love to have you—if you are here at this talk, then the organization of which you're part should probably be a member of the Confidential Computing Consortium. It's pretty much as simple as that. If you work for a startup with under 100 employees, then it is free for the first year. You need to be a member of the Linux Foundation, which I think costs $5K a year, but the involvement and the subscription to the CCC is free.
We do a number of things. We have technical committees working on things from Linux kernel patches, some very low-level ones, up to issues like governance, risk, and compliance. We're looking to start maybe a new one on workload identity. I would love to have a special interest group or committee around AI. We'll probably have one starting on Web3 fairly soon, so there are all those sorts of things.
We also speak at conferences, sponsor conferences, and have booths. If you're a member of the consortium, you're welcome to come onto the booth and get involved in all those sorts of things as well. We're also starting a job board. If you're interested in that, please come talk to me, or just go to confidentialcomputing.io.
We have, I think, possibly—well, we're running a bit late, so we have up to 10 minutes for questions. Or if I've given the perfect presentation and there are no—don't laugh like that—and there are no questions, that's also fine. But any questions? Go on, that's fine. I don't mind.
[Audience question]:
What is the Consortium's current thinking on where we're at with current state-of-the-art models, which are models that need to be trained and served across multiple accelerators, so your TEE boundary needs to be extended across multiple chips or accelerators? How do the current tooling and projects under your consortium address the current need of where we are at with scale?
[Mike Bursell]:
There are a couple of things that need to happen to make that work, frankly. One is standards, and there are a number of standards efforts happening to help with that. PCIe is one area, and there are a couple of things going on there. The CCC isn't a standards-making body, but a number of the people who are involved in standards like that are also CCC members and will fairly frequently give us updates on what's going on. So I know there's some good stuff going on there.
At least one of the SPDM-related efforts is aimed directly at that. Part of the interesting thing is, if I go back to this picture from way back, sort of an hour and a half ago it feels like, this one here. In this case, we just have a single TEE, but there may well be occasions where you actually want, on the same machine, to have some stuff on the CPU and some stuff on the GPU. How do you allow the applications in those two different processes to have some level of trust and attestation and data sharing between them?
There are standards being created around that, and indeed, software to do that. If you're interested, we'd love to see you there to get involved with some of that work. That's ongoing work. It's getting more mature, and we track it and are involved, but we're not driving it directly. Thank you. Nice, detailed question. I like that one.