Exploring Cooperation & Alignment

Summary

Kalesha Bullard highlights the need for redefining AGI beyond individual capabilities and introduces coalitional skill games as a tool to understand collective AI agency.

Session Transcript

Hi. My name is Kalesha Bullard. I am a research scientist at DeepMind. Actually, that wasn't quite my talk title; my talk title is Exploring Cooperation and Alignment. This is a quote from a recent position paper that articulates levels of Artificial General Intelligence, an important and sometimes controversial concept in computing research used to describe AI systems that are at least as capable as humans at most tasks. The key thing I want to highlight here is that AGI is typically defined in terms of individual agency. There is an opportunity to recognize that in biology, a lot of agency comes through collectives.
Here are two examples of human collectives: construction workers come together to build huge buildings and, at a larger scale, really complex structures that increasingly enable other humans. This is planned cooperation: architects design the complex structure in advance. Collective agency also appears in biology, where it emerges rather than being planned. These are important examples. On the left, ants build a bridge for other ants to walk over. On the right, immune cells launch a coordinated attack to kill a cancer cell in the body. What I want to highlight here is that when we come together as collectives, new affordances, new tasks, new skills, and new agency are born from that. This is important for us to capture when talking about Artificial General Intelligence, and in particular Artificial Super Intelligence.
People are already starting to talk about this in the news. We believe we'll have AI agents able not only to do things collectively, but to do things in ways that surpass human intelligence. It becomes important to figure out how to capture that.
Some key questions. When we're able to produce, whether planfully or emergently, collective affordances and capabilities beyond the level of any individual agent, do we analogously begin to move beyond AGI towards Artificial Super Intelligence? That question is more philosophical, but good to think about.
And what I want to build on here is: how can we align this highly capable collective in terms of values and outcomes when agents have different individual skills or capabilities, and may also have multiple objectives they need to satisfy?
I'd like to discuss one potential framework that I've used in some of my research, which can help structure the way we think about cooperation and collective agency.
Cooperative game theory is an entire branch of game theory, less well known, that deals with effectively partitioning agents into collectives. The underlying premise is that you get more value from agents in collectives than from using them individually. The game formalism I want to highlight here is coalitional skill games. You have tasks, skills, and players. Every player has a subset of the skills. Every task has a skill requirement. And importantly, a coalition can only perform a task if it collectively has the skills necessary for that task. So if a task requires skills one and two, the coalition has to cover those skills. The coalition's value or reward is the combined reward of all the tasks it can perform.
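To make the formalism concrete, here is a minimal sketch in Python. The encoding is illustrative: the player names, skill names, task names, and rewards are hypothetical, not the ones from the talk, and the characteristic function simply sums the rewards of every task whose requirement the coalition covers.

```python
from typing import Dict, FrozenSet

# Hypothetical coalitional skill game: each player has a skill set,
# each task has a skill requirement and a scalar reward.
player_skills: Dict[str, FrozenSet[str]] = {
    "a1": frozenset({"s1"}),
    "a2": frozenset({"s2", "s3"}),
    "a3": frozenset({"s1", "s3"}),
}
task_requirements: Dict[str, FrozenSet[str]] = {
    "t1": frozenset({"s1", "s2"}),
    "t2": frozenset({"s1"}),
    "t3": frozenset({"s2", "s3"}),
}
task_rewards: Dict[str, float] = {"t1": 5.0, "t2": 1.0, "t3": 3.0}

def coalition_value(coalition: FrozenSet[str]) -> float:
    """Characteristic function: a coalition earns the reward of every
    task whose skill requirement is covered by the union of its
    members' skills."""
    pooled = frozenset().union(*(player_skills[p] for p in coalition))
    return sum(
        reward
        for task, reward in task_rewards.items()
        if task_requirements[task] <= pooled
    )

print(coalition_value(frozenset({"a1", "a2"})))  # covers t1, t2, t3 -> 9.0
```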
Here's an example. You have these different abstract agents, and S sub i denotes each agent's skill set; the colors indicate agents with matching skill sets. You have three tasks that these agents need to perform. You don't need to pay attention to what the tasks are; what matters is that each task requires a certain skill set, and each task's reward is some combination of its importance and complexity. One way you could partition these agents is to have every agent working alone, which is the idea of individual agency. The challenge is that, if you look at the coalitions and tasks, task 2 is the only task any of those singleton coalitions can do; three of them can do it, and a lot of player skills go unused. If you put all the agents in one coalition, it's a bit better, but what we really want is to partition the agents into teams that make optimal use of their skill sets. What I want you to note here, in terms of biological systems, is that even there, agents have different skills or roles and are leveraged differently. So when we talk about partitioning agents into the best structures, it's really some optimal combination of coordination and distribution of the tasks.
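Finding that optimal team structure can be written down directly, if inefficiently, as a search over all partitions of the agents. Below is a hedged sketch that builds on the `coalition_value` function above; the exhaustive enumeration is my own illustration, not an algorithm from the talk, and note that in this simple scoring a task's reward can be counted by more than one coalition.

```python
from typing import FrozenSet, Iterator, List

def partitions(players: List[str]) -> Iterator[List[FrozenSet[str]]]:
    """Enumerate every way to split the players into disjoint
    coalitions (the coalition structures of the game)."""
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    for structure in partitions(rest):
        # Add `first` to each existing coalition in turn...
        for i, coalition in enumerate(structure):
            yield structure[:i] + [coalition | {first}] + structure[i + 1:]
        # ...or place `first` in a new singleton coalition.
        yield structure + [frozenset({first})]

def best_structure(players: List[str]) -> List[FrozenSet[str]]:
    """Exhaustive search; feasible only for a handful of agents, since
    the number of partitions (the Bell number) grows very quickly."""
    return max(
        partitions(players),
        key=lambda s: sum(coalition_value(c) for c in s),
    )

print(best_structure(list(player_skills)))
```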
If you recall this (I don't actually expect you to recall it, but just as a reference): what I'm effectively saying is that if you now add in some kind of cost, say you want to trade off the effectiveness of tasks against a safety cost, you can add a multi-objective function here. So the key things to note are this idea of cooperation and multi-agency presenting novel, unexplored challenges for AI alignment; it is as critical for AGI and ASI as it is for biology. Cooperative game theory offers a formalism for partitioning and distributing agents so as to best leverage them, while also enabling them to coordinate to make the most of them. Thank you. [Applause]
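As a rough illustration of that multi-objective idea, one simple option is a scalarised trade-off between a coalition's task reward and a safety cost. Everything in this sketch is an assumption of mine rather than a construction from the talk: the weighted-sum form, the default weight, and the toy cost that treats larger coalitions as harder to oversee.

```python
from typing import Callable, FrozenSet

def multiobjective_value(
    coalition: FrozenSet[str],
    safety_cost: Callable[[FrozenSet[str]], float],
    weight: float = 0.5,  # assumed trade-off weight, not from the talk
) -> float:
    """Weighted-sum scalarisation of task effectiveness vs. safety."""
    return weight * coalition_value(coalition) - (1.0 - weight) * safety_cost(coalition)

# Toy safety cost: assume larger coalitions are harder to oversee.
def size_cost(coalition: FrozenSet[str]) -> float:
    return float(len(coalition))

print(multiobjective_value(frozenset({"a1", "a2"}), size_cost))  # 0.5*9.0 - 0.5*2.0 = 3.5
```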