Initial doc here: https://docs.google.com/document/d/1dPVqv86puhmHCAfJjmt1cHJMjJOz-expVqsYckTiRAc/edit
Overall plan
We build coordination structures that allow
- Parties with power over AI development to form a coalition to build it together, safely and cooperatively. The structure is such that there are strong incentives for them to do this.
- Excess from the cooperation goes to an impartial benefit fund that tries to do “good stuff” with it, including good stuff that the parties-with-power-over-AI don’t care about so much.
I think that various bits of mechanism design, philosophy, etc. can help this sort of plan go better.
Impartial benefit
This is construed fairly broadly - the idea is that this goes to good stuff that doesn’t happen by default. This could include:
- Giving other currently-alive humans a stake in the future (e.g. people who own no capital and are not citizens of any country - making sure that they have some say over what happens in the future).
- Any UBI-like thing should be Universal Basic Capital rather than Universal Basic Income - this resolves Malthusian dynamics, because it internalizes the tradeoff of creating more copies/children but giving each of them fewer resources (h/t Eric Drexler). A toy comparison is sketched after this list.
- Building some structure that’s more trying to pursue some impartial good / the most morally valuable use of resources
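Here is a toy comparison of the two schemes, assuming a hypothetical lineage that can create copies/children. All numbers are illustrative placeholders; the point is just that under UBI total lineage resources grow with the number of copies, while under UBC they don’t, so the cost of creating extra copies is internalized.

```python
# Toy comparison of Universal Basic Income vs Universal Basic Capital under
# lineage growth. All numbers are illustrative placeholders, not estimates.

def ubi_lineage_total(copies: int, income_per_person: float) -> float:
    """Under UBI, each copy/child draws a fresh income stream, so total
    lineage resources scale with the number of copies (Malthusian incentive)."""
    return copies * income_per_person

def ubc_lineage_total(copies: int, capital_stake: float, return_rate: float) -> float:
    """Under UBC, the lineage holds one fixed capital stake; more copies just
    split the same returns, so the total is independent of the number of copies."""
    return capital_stake * return_rate

for copies in (1, 2, 10):
    print(copies,
          ubi_lineage_total(copies, income_per_person=100.0),
          ubc_lineage_total(copies, capital_stake=2000.0, return_rate=0.05))
```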
One way of framing this question is: what guidance could we hand to Anthropic’s Long-Term Benefit Trust about how any benefits from Anthropic’s activities should be handled?
❓ Open questions
Intuitively, I kind of like how this line of thinking splits things up - the first question below is basically economics, the second is fairly empirical, the third is philosophy, and the fourth is engineering. I could imagine these being parcelled up and passed on to specialists who could make progress on them.
1. Mechanism design to find a way of building the coalition with desirable properties
- I’d say the desirable properties are:
- the Shapley properties (a toy Shapley-value computation is sketched at the end of this question)
- Plus the property that it’s always good for all players if an additional player joins the mechanism
- Or maybe they’re more like:
- Each player to always be (much) better off from joining the coalition
- All coalition members to benefit from allowing additional players into the coalition
- Subject to the above, as much “excess” as possible that can be devoted to more impartially valuable concerns
- The mechanism to be fair-seeming / principled, in order to facilitate coordination
- [Probably other things]
- [Note that, if we expect there to be a lot of gains from trade here, then we maybe have a bunch of wiggle room, and we can use a mechanism that works if there are lots of gains from trade but not in the general case. This makes this easier.]
- Maybe in reality you can just handle a lot of this by political bargaining, but I think that figuring out a bit of the theory might be helpful, to inform what bargains might make sense; and because the theoretically optimal setups might provide some Schelling point/justification for taking particular positions, thus making coordination a bit easier.
- [Maybe step 1 on this is figuring out what the desiderata for the mechanism are, and then step 2 is working on the mechanism]
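As a concrete reference point for the Shapley properties mentioned above, here is a minimal sketch that computes Shapley values for a toy three-player game. The player names and the characteristic-function numbers are made up for illustration, not estimates; the point is just that the resulting payoffs sum to the grand coalition’s value (efficiency) and track each player’s marginal contributions.

```python
# Minimal Shapley-value sketch for a toy coalition game. The players and the
# characteristic function v() are made-up placeholders, not real estimates.
from itertools import permutations

players = ["labA", "labB", "chipmaker"]  # hypothetical parties

def v(coalition: frozenset) -> float:
    """Toy characteristic function: value created by each coalition.
    Superadditive, so cooperation creates surplus."""
    table = {
        frozenset(): 0.0,
        frozenset({"labA"}): 4.0,
        frozenset({"labB"}): 3.0,
        frozenset({"chipmaker"}): 2.0,
        frozenset({"labA", "labB"}): 9.0,
        frozenset({"labA", "chipmaker"}): 8.0,
        frozenset({"labB", "chipmaker"}): 7.0,
        frozenset({"labA", "labB", "chipmaker"}): 15.0,
    }
    return table[coalition]

def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    values = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        current = frozenset()
        for p in order:
            values[p] += v(current | {p}) - v(current)
            current = current | {p}
    return {p: total / len(orders) for p, total in values.items()}

phi = shapley_values(players, v)
print(phi)                                        # per-player payoff
print(sum(phi.values()), v(frozenset(players)))   # efficiency: payoffs sum to v(N)
```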
2. Initial estimates of what the payoffs could be for players
- Initial estimates of rough inputs into that mechanism (e.g. building off Epoch analysis, looking at market shares, looking at how critical various bits of the supply chain are)
- Initial estimates of expected value of the future (e.g. building off random FHI stuff)
- Then combining these into a sense of what the upper limit on the budget for player rewards would be (a toy combination is sketched below).
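A back-of-envelope sketch of how those inputs might combine into an upper bound on the player-reward budget. Every number, name, and weight here is a placeholder standing in for the real estimates, not an actual estimate.

```python
# Back-of-envelope sketch of combining rough inputs into an upper bound on
# player rewards. Every number here is a placeholder, not an actual estimate.

# Placeholder: total expected value created if the coalition goes well, in
# arbitrary units (this is where Epoch/FHI-style estimates would feed in).
total_expected_value = 1000.0

# Placeholder weights for how critical each party is (market share,
# supply-chain position, etc.), normalized to sum to 1.
criticality = {"labA": 0.4, "labB": 0.3, "chipmaker": 0.3}

# Fraction of the surplus reserved for the impartial benefit fund.
impartial_share = 0.5

reward_budget = total_expected_value * (1 - impartial_share)
upper_limits = {p: reward_budget * w for p, w in criticality.items()}

print("reward budget:", reward_budget)
print("per-player upper limits:", upper_limits)
```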
3. What would be the best thing to do with the altruistic benefit fund?
- This is a question about how best to set up the split / mechanism or whatever for the excess - e.g. should it be one-human-one-share, or handed over to philosopher kings, etc.
- This is where I think some of the discussion from Tuesday’s session fits in.
- But I also imagine that this includes mechanism design, trying to think through what a reflection process should look like.
4. How do you establish trust, and guarantee that the mechanism will be carried out?
- It’s really, really important that this doesn’t devolve into something totalitarian, sclerotic, or risk-averse. This plan could be net-negative if one of those things happens.
- Try to see if people like Vitalik could work on this.