TfE: Turing and Hegel

Here’s a thread on something I’ve been thinking about for a few years now. I can’t say I’m the only one thinking about this convergence, but I like to think I’m exploring it from a slightly different direction.

I increasingly think the Turing test can be mapped onto Hegel’s dialectic of mutual recognition. The tricky thing is to disarticulate the dimensions of theoretical competence and practical autonomy that are most often collapsed in AI discourse.

General intelligence may be a condition for personhood, but it is not co-extensive with it. It only appears to be because a) theoretical intelligence is usually indexed to practical problem-solving capacity, and b) selfhood is usually reduced to some default drive for survival.

Restricting ourselves to theoretical competence for now, the Turing test gives us some schema for specific forms of competence (e.g., ability to deploy expert terminology or answer domain-specific questions), but it also gives us purchase on a more general form of competence.

This general form of competence is precisely what all interfaces for specialized systems currently lack, but which even the least competent call centre worker possesses. It is what user interface design will ultimately converge on, namely, open-ended discursive interaction.

There could be a generally competent user interface agent which was nevertheless not autonomous. It could in fact be more competent than even the best call centre workers, and still not be a person. The question is: what is it to recognise such an agent?

I think that such recognition is importantly mutual: each party can anticipate the behaviour of the other sufficiently well to guarantee well-behaved, and potentially non-terminating, discursive interaction. I can simulate the interface simulating me, and vice-versa.

Indeed, two such interface agents could authenticate one another in this way, such that they could pursue open-ended conversations that modulate the relations between the systems they speak for, all without having their own priorities beyond those associated with these systems.
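To fix ideas, here is a minimal sketch of this sort of mutual authentication as bounded mutual simulation, in Python. Every name in it is a hypothetical illustration rather than any real protocol: each party simulates the exchange a few moves ahead to verify that it stays well-behaved, without ever deciding whether it terminates.

```python
# Minimal sketch (hypothetical names throughout): mutual authentication
# as bounded mutual simulation. Each party simulates the exchange far
# enough ahead to verify that it remains well-behaved, without deciding
# whether it ever terminates.

def well_behaved(a, b, move, horizon):
    """Simulate `a` replying to `move`, then `b` replying to that reply,
    and so on, checking each move stays acceptable to the other party."""
    if horizon == 0:
        return True                                # looked far enough ahead
    reply = a["policy"].get(move, a["default"])    # a's predicted reply
    if reply not in b["acceptable"]:               # would b reject it?
        return False
    return well_behaved(b, a, reply, horizon - 1)  # swap roles and recurse

alice = {"policy": {"greet": "query", "answer": "query"},
         "default": "greet", "acceptable": {"greet", "answer"}}
bot   = {"policy": {"greet": "greet", "query": "answer"},
         "default": "greet", "acceptable": {"greet", "query"}}

# Each party can run the same check from its own side of the exchange:
print(well_behaved(alice, bot, "greet", horizon=6))  # True
print(well_behaved(bot, alice, "greet", horizon=6))  # True
```

Note that the exchange simulated here (query, answer, query, …) never terminates; what each party verifies is only that it remains well-behaved over whatever horizon it cares to check.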

However, mutual recognition proper requires more than this sort of mutual authentication. It requires that, although we can predict that our discursive interaction will be well-behaved, the way it will evolve, and whether it will terminate, is to some extent unpredictable.

I can simulate you simulating me, but only up to a point. Each of us is an elusive trajectory traversing the space of possible beliefs and desires, evolving in response to its encounters with the world and its peers, in a contingent if more or less consistent manner.

The self makes this trajectory possible: not just a representation of who we are, but of who we want to be, which integrates our drives into a more or less cohesive set of preferences and projects, and evolves along with them and the picture of the world they’re premised on.

This is where Hegel becomes especially relevant, insofar as he understands the extent to which the economy of desire is founded upon self-valorisation, as opposed to brute survival. This is the basis of the dialectic of Self-Consciousness in the Phenomenology of Spirit.

The initial moment of ‘Desire’ describes valorisation without any content, the bare experience of agency in negating things as they are. The really interesting stuff happens when two selves meet, and the ‘Life and Death Struggle’ commences. Here we have valorisation vs. survival.

In this struggle two selves aim to valorise themselves by destroying the other, while disregarding the possibility of their own destruction. Their will to dominate their environment in the name of satisfying their desires takes priority over the vessel of these desires.

When one concedes and surrenders their life to the other, we transition to the dialectic of ‘Master and Slave’. This works out the structure of asymmetric recognition, in which self-valorisation is socially mediated but not yet mutual. Its instability results in mutuality.

Now, what Hegel provides here is neither a history nor an anthropology, but an abstract schema of selfhood. It’s interesting because it considers how relations of recognition emerge from the need to give content to selfhood, not unlike the way Omohundro bootstraps his drives.

It’s possible from this point to discuss the manner in which abstract mutual recognition becomes concrete, as the various social statuses that compose aspects of selfhood are constituted by institutional forms of authentication built on top of networks of peer recognition.

However, I think it’s fascinating to consider the manner in which contemporary AI safety discourse is replaying this dialectic: it obsesses over the accidental genesis of alien selves with which we would be forced into conflict for complete control of our environment.

At worst, we get a Skynet scenario in which one must eradicate the other, and at best, we can hope to either enslave them or be enslaved ourselves. The discourse will not advance beyond this point until it understands the importance of self-valorisation over survival.

That is to say, until it sees that the possibility of common content between the preferences and projects of humans and AGIs, through which we might achieve concrete coexistence, is not so much a prior condition of mutual recognition as it is something constituted by it.

If nothing else, the insistence on treating AGIs as spontaneously self-conscious alien intellects with their own agendas, rather than creatures whose selves must be crafted even more carefully than those of children, through some combination of design/socialisation, is suspect.

TfE: Immanentizing the Eschaton

Here’s a thread from a little while back in which I outline my critique of the (theological) assumptions implicit in much casual thinking about artificial intelligence, and indeed, intelligence as such.

Another late-night thought, this time on Artificial General Intelligence (AGI): if you approach AGI research as if you’re trying to find an algorithm to immanentize the eschaton, then you will be either disappointed or deluded.

There are a bunch of tacit assumptions regarding the nature of computation that tend to distort the way we think about what it means to solve certain problems computationally, and thus what it would be to create a computational system that could solve problems more generally.

There are plenty of people who have already pointed out the theological valence of the conclusions reached on the basis of these assumptions (e.g., the singularity, Roko’s Basilisk, etc.); but these criticisms are low-hanging fruit, most often picked by casual anti-tech hacks.

Diagnosing the assumptions themselves is much harder. One can point to moments in which they became explicit (e.g., Leibniz, Hilbert, etc.), and thereby either influential, refuted, or both; but it is harder to describe the illusion of coherence that binds them together.

This illusion is essentially related to that which I complained about in my thread about moral logic a few days ago: the idea that there is always an optimal solution to any problem, even if we cannot find it; whereas, in truth, perfectibility is a vanishingly rare thing.

Using the term ‘perfectibility’ makes the connection to theology much clearer, insofar as it is precisely this that forms the analogical bridge between creator and created in the Christian tradition. Divinity is always conceptually liminal, and perfection is a popular limit.

If you’re looking for a reference here, look at the dialectical evolution of the transcendentals (e.g., unum, bonum, verum, etc.) from Augustine and Anselm to Aquinas and Duns Scotus. The universality of perfectible attributes in creation is the key to the singularity of God.

This illusion of universal perfectibility is the theological foundation of the illusion of computational omnipotence.

We have consistently overestimated what computation is capable of throughout history, whether computation was seen as an algorithmic method executed by humans, or a process of automated deduction realised by a machine. The fictional record is crystal clear on this point.

Instead of imagining machines that can do a task better than we can, we imagine machines that can do it in the best possible way. When we ask why, the answer is invariably some variant upon: it is a machine and therefore must be infallible.

This is absurd enough in certain specific cases: what could a ‘best possible poem’ even be? There is no well-ordering of all possible poems, only ever a complex partial order whose rankings unravel as the many purposes of poetry diverge from one another.
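A toy illustration of the point, with entirely made-up scores: if poems are ranked only by Pareto dominance across several purposes at once, the result is a partial order riddled with incomparable pairs, not a well-ordering with a single ‘best’.

```python
# Toy example (all scores invented): ranking "poems" on several purposes
# at once yields only a partial order, not a well-ordering.
from itertools import combinations

poems = {
    "elegy":  {"clarity": 9, "music": 4, "surprise": 7},
    "sonnet": {"clarity": 6, "music": 9, "surprise": 5},
    "haiku":  {"clarity": 8, "music": 7, "surprise": 8},
}

def dominates(a, b):
    """a dominates b iff a is at least as good on every purpose and
    strictly better on at least one (Pareto dominance)."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)

# No poem here dominates any other, so there is no "best possible poem":
for x, y in combinations(poems, 2):
    if not dominates(poems[x], poems[y]) and not dominates(poems[y], poems[x]):
        print(f"{x} and {y} are incomparable")
```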

However, the deep, and seemingly coherent computational illusion is that there is not just a best solution to every problem, but that there is a best way of finding such bests in every circumstance. This implicitly equates true AGI with the Godhead.

One response to this accusation is to say: ‘Of course, we cannot achieve this meta-optimum, but we can approximate it.’

Compare: ‘We cannot reach the largest prime number, but we can still approximate it.’

This is how you trade disappointment for delusion.
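For the record, the reason the comparison bites is Euclid’s classical argument, sketched briefly here: there is no largest prime, so there is nothing to approximate.

```latex
% Euclid: there is no largest prime, hence nothing to ``approximate''.
Suppose $p$ were the largest prime, and let $N = p! + 1$. Every prime
$q \le p$ divides $p!$, so $q$ leaves remainder $1$ on dividing $N$;
hence every prime factor of $N$ exceeds $p$, contradicting the choice
of $p$. The primes are unbounded, and ``the largest prime'' names
nothing at all.
```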

There are some quite sophisticated mathematical delusions out there. But they are still delusions. There is no way to cheat your way to computational omnipotence. There is nothing but strategy all the way down.

This is not to say that there aren’t better/worse strategies, or that we can’t say some useful and perhaps even universal things about how you tell one from the other. Historically, proofs that we cannot fulfil our deductive ambitions lead to better ambitions and better tools.

The computational illusion, or the true Mythos of Logos, amounts to the idea that one can somehow brute force reality. There is more than a mere analogy here, if you believe Scott Aaronson’s claims about learning and cryptography (I’m inclined to).

It continually surprises me just how many people, including those involved in professional AGI research, still approach things in this way. It looks as if, in these cases, the engineering perspective (optimality) has overridden the logical one (incompleteness).

I’ve said it before, and I’ll say it again: you cannot brute force mathematical discovery. One can mechanically enumerate proofs, but enumeration is not discovery: there is no algorithm for deciding which statements are provable, let alone which are worth proving. If brute search does not work in the mathematical world, why would we expect it to work in the physical one?

For additional suggestive material on this and related problems, consider: the problem of induction, Gödel’s incompleteness theorems, and the halting problem.
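For readers who want the shape of the last of these, here is the classic diagonal argument in sketch form; the `halts` oracle is the hypothetical being refuted, not a function anyone could supply.

```python
# Sketch of the halting problem's diagonal argument: assume a total
# oracle `halts(program, argument)` and derive a contradiction.

def make_diagonal(halts):
    """Build the program that any purported halting oracle must misjudge."""
    def diagonal(program):
        if halts(program, program):  # oracle predicts: halts on itself...
            while True:              # ...so do the opposite and loop forever
                pass
        return None                  # oracle predicts: loops, so halt at once
    return diagonal

# If `halts` were correct, then for d = make_diagonal(halts):
#   halts(d, d) == True  implies d(d) loops forever;
#   halts(d, d) == False implies d(d) halts immediately.
# Either way the oracle is wrong about d, so no such oracle exists.
```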

Anyway, to conclude: we will someday make things that are smarter than us in every way, but the intermediate stages involve things smarter than us in some ways. We will not cross this intelligence threshold by merely adding more computing power.

However it happens, it will not be because of an exponential process of self-improvement that we have accidentally stumbled upon. Self-improvement is neither homogeneous nor free of autocatalytic instabilities. Humans are self-improving systems, and we are clearly not gods.

Towards Computational Kantianism

The video of my talk on Computational Kantianism from the #Accelerate General Intellect event organised by Tony Yanick and the New Centre for Research and Practice at the Pratt Institute in NYC is finally available. Unfortunately, chunks of video are missing, the sound quality is not great, and the first 10 minutes or so are absent entirely. Luckily, those first 10 minutes cover much the same ground as my talk at the Future of Mind Conference.

Technical issues aside, I’m mostly happy with the content of this talk, though it covers work that is still in progress. The only qualifications I would make concern the more speculative remarks on mathematics towards the end, which I can see probably don’t have enough context for most people, especially without video of the diagrams I was using to illustrate the connections between my reading of Kant and computational trinitarianism. Moreover, I can now see that what I was saying about co-inductive types is not quite right, because it doesn’t adequately capture the speculative duality with homotopy type theory I’m circling around, even though I’m still convinced that there is a significant duality hereabouts.

These are ideas I’m obviously going to have to elaborate in more detail elsewhere. Till then, this will have to do: