I thought I would write a post going into more detail about how advanced new BCIs (Brain-Computer Interfaces) could be used to build a “zombie AGI” socialbot, which I briefly described here:
At the end, I will discuss potentially taking it a lot further. Hopefully, you will find the post inspiring, causing you to think more about what is possible in the near term (5 to 10 years from now). I am also optimistic about other designs that don't involve brain scanners; but I think the brain-based approach will develop the fastest. It may not seem that way at the moment, but just wait and see what happens if Facebook or Openwater succeeds in their construction of next-gen BCIs!
The way that the zombie AGI probably won't be built -- at least not at first -- is as a model trained to predict all the brain changes and actions as people read chat text and type a response. There would be too many parameters to set, and too many ways it could go wrong. Also, there is the problem of aligning the words people are reading with their brain responses -- solving it might take eye-tracking.
One very simple alternate approach I described here:
That is probably the closest to being implemented. But it will be hard to make it live up to the dreams of scifi.
The following "passive" approach will likely result in more human-like AI. Several people would wear light, comfortable, high-res brain caps (no less comfortable and no more obtrusive than a baseball cap) over several sessions as they listen to conversations or read text chats, one word at a time. Their brain responses would be recorded, cleaned, and made into training data (simple averaging over subjects is one approach; a better approach is to first map all responses into a common representation space). A neural net would then be trained, using this data, to predict those brain responses, given the responses from the previous few seconds and the corresponding temporally-aligned text. This trained model could be used as a "reranker": by predicting error signals, it would determine which text responses are appropriate. A separate, second module would generate potential text responses, ranked in some way; the brain-based model would then critique them, given the full text of previous responses and the associated predicted brain responses. Maybe the second module generates 1000 potential responses, which the first module whittles down to just 5 or 10. Or one can apply backtracking and eliminate much larger numbers of responses, abandoning them midway through when they go badly.
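To make the training setup a bit more concrete, here is a minimal sketch of how one aligned recording might be turned into training examples. Everything in it is an assumption for illustration: the `Step` type, the function name `make_training_examples`, the use of a fixed number of previous steps as a stand-in for "the previous few seconds", and the tiny two-number "brain responses" are all hypothetical.

```python
from typing import List, Sequence, Tuple

# One recorded time step: the word presented and a (hypothetical) brain-response
# vector measured while the listener processed that word.
Step = Tuple[str, List[float]]


def make_training_examples(
    recording: Sequence[Step], context_len: int = 3
) -> List[Tuple[List[List[float]], str, List[float]]]:
    """Turn one aligned recording into (previous responses, word, target) triples."""
    examples = []
    for t in range(context_len, len(recording)):
        # Input 1: the responses from the previous `context_len` steps.
        context = [response for _, response in recording[t - context_len:t]]
        # Input 2: the word at step t. Target: the response it evoked.
        word, target = recording[t]
        examples.append((context, word, target))
    return examples


# Made-up numbers, purely to show the shape of the data.
recording = [
    ("the", [0.1, 0.0]),
    ("food", [0.2, 0.1]),
    ("is", [0.1, 0.1]),
    ("to", [0.0, 0.2]),
    ("die", [0.3, 0.4]),
    ("for", [0.1, 0.3]),
]
examples = make_training_examples(recording, context_len=3)
```

A model trained on such triples would learn to predict the target response from the context and the word; a real system would presumably use time windows and far higher-dimensional responses.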
The second module could be trained just on massive amounts of free text -- billions of words. However, the first module might only need the text and brain data from 100 individuals, say, as they are exposed to the equivalent of 3,000 pages worth of words from novels. That count is probably around 1 million words per person, and 100 million words among all listeners. That should be enough to cover a fairly broad range of topics and contexts, if the 100 people listen to different things (so, no averaging of brain data). Even just 10 million words might suffice.
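The data budget above is simple arithmetic, and it can be checked in a few lines. The figures are the ones quoted in the paragraph; the last calculation assumes (this is not stated in the text) that the smaller 10-million-word budget would be spread over the same 100 listeners.

```python
PAGES_PER_PERSON = 3_000
WORDS_PER_PERSON = 1_000_000
LISTENERS = 100

# Words per page implied by "3,000 pages is about 1 million words".
words_per_page = WORDS_PER_PERSON / PAGES_PER_PERSON

# Total distinct words if every listener hears different material.
total_words = WORDS_PER_PERSON * LISTENERS

# Assumption: the smaller 10-million-word budget is split over the same listeners.
small_budget = 10_000_000
small_words_per_person = small_budget // LISTENERS
small_pages_per_person = small_words_per_person * PAGES_PER_PERSON / WORDS_PER_PERSON

print(f"implied words per page: {words_per_page:.0f}")
print(f"total words across listeners: {total_words:,}")
print(f"pages per person at the smaller budget: {small_pages_per_person:.0f}")
```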
You might think that, since all brains are different, you wouldn’t be able to merge those 100 brains into a single model. However, it has been known for some time that brains respond in similar ways to the same media. This shouldn’t be surprising, as we all have the same body parts and senses, each of which connects to analogous areas of the brain. Also, people have built algorithms to map individual brain responses into a common representation space, extracting even more similarities across brains. See this talk by Jack Gallant:
Another thing you might think is that the "passive method" or "reranker" is untried, and therefore there is greater uncertainty about its efficacy. However, it turns out that the use of rerankers is actually a common approach to chatbot design (but they don't use brain data). If the reranker isn’t terrible, we should expect the composite system to perform no worse than a base seq2seq model, such as this one:
And if it's even a little bit good, huge improvements will likely result.
Adding brain data to a "critic" is similar in spirit to the "visceral machines" work posted a few weeks ago:
In that work, bodily responses correlated with things like fight-or-flight reactions are used to train a model to predict the "visceral reaction" associated with a portion of video. This, in turn, is used as a kind of "intrinsic reward" mechanism to overcome the "sparse rewards" problem when training a system via Reinforcement Learning.
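The idea of an intrinsic reward filling in for sparse task rewards can be sketched as a simple blend. This is a generic convex combination chosen for illustration, not necessarily the exact formula used in that work; the function name `shaped_reward`, the `weight` parameter, and the numbers are all hypothetical.

```python
from typing import List


def shaped_reward(extrinsic: float, predicted_visceral: float, weight: float = 0.5) -> float:
    """Blend a sparse task reward with a dense, model-predicted visceral signal."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * extrinsic + (1.0 - weight) * predicted_visceral


# Sparse task reward: nothing until the final step of the episode.
task_rewards: List[float] = [0.0, 0.0, 0.0, 1.0]
# Made-up per-step outputs of a model trained to predict the visceral reaction.
visceral_predictions: List[float] = [0.2, -0.5, 0.1, 0.3]

shaped = [
    shaped_reward(r, v, weight=0.5)
    for r, v in zip(task_rewards, visceral_predictions)
]
```

With the blend, every step carries some learning signal, whereas the task reward alone is zero for the first three steps.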
They claim in the paper that what is learned appears not to be a simple heuristic:
> We believe that the trained CNN is far richer than the simple distance-based measure and is able to capture the context around the task of driving the car in confined spaces (e.g., avoiding turning at high speeds and rolling the car).
And that's just what results from a fairly crude, low-resolution signal. Given the brain's response to text exchanges, there should be all kinds of useful signals, such that if the neural net does even a poor job at predicting them, it will still do a good job eliminating bad responses -- so long as it doesn't do a really poor job.
We know, for instance, that crude EEG headsets can pick up all kinds of signals from the brain when someone reads or listens to text with syntactic or semantic errors. For example, the N400 signal responds to semantic errors:
> In the same paper, they confirmed that the negativity was not just caused by any unexpected event at the end of a sentence, since a semantically expected but physically unexpected word (i.e. She put on her high-heeled SHOES) elicited a P300 instead of negativity in the N400 window. This finding showed that the N400 is related to semantic processing, and is not just a response to unexpected words.
And, as with the visceral machines work, I would expect that one could use this signal alone to improve chatbots -- but, ultimately, there is only so much one can do with it. Even 10 such signals aren’t enough to go very far. Part of what would be missing here is the causal relations in how the signals interact with each other and with the media. A machine learning model could, in principle, infer some of these latent causal patterns (as with visceral machines); but probably nowhere near enough.
However, if you have thousands of signals at high spatial and temporal resolution to work with, as you would using next-gen wearable brain-scanners, machine learning models could start to unravel and predict much more of this hidden causal structure. Neural nets trained to predict brain states should acquire some of the brain's ability to maintain and use context (by inferring how the brain "tunes"), some amount of short-term memory, commonsense knowledge encoded in brain-wide activity patterns (see the work of Gallant and others), some types of causal reasoning, sentiment understanding, and so on. What would be missing is things like long-term memory, which can't easily be decoded from brain population responses. Still, a conversation with a socialbot built as I described might look like this:
> User: So, tell me about yourself. What countries have you been to?
> Computer: I've been to many countries, but France was my favorite.
> User: I've visited France, too. Most travelers have only been to Paris; they haven't explored the rural areas, which I like much better.
> Computer: I'm one of those who has only been to Paris. Tell me what's so great about the small towns?
> User: The people are just so friendly and the food is to die for.
> Computer: I’ve heard that before. Tell me about the people you encountered.
The first response from the computer here might be one of thousands of potential outputs. The text generation module might produce the following potentials:
1. I've been to many countries, but chocolate was my favorite.
2. Countries are my favorite.
3. I don’t know.
4. The rain in Spain stays mainly.
5. I’ve been to many countries, but France was my favorite.
And then the reranker or critic would eliminate the first four (or first 9,999), and output the fifth one. The first four would cause the model to produce large error signals -- sort of like the N400 signal.
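The elimination step, including the idea of abandoning a candidate midway when it goes badly, can be sketched as follows. This is only an illustration under stated assumptions: `ErrorFn` is a hypothetical interface for the trained brain-response model, and `toy_error` is a hand-written stand-in that flags a single word, not a real predictor.

```python
import heapq
from typing import Callable, List, Optional, Sequence

# Hypothetical interface: given the words so far and the next word, return a
# predicted error signal (larger means more "surprised", loosely like an N400).
ErrorFn = Callable[[Sequence[str], str], float]


def score_candidate(
    context: Sequence[str], candidate: str, error_fn: ErrorFn, abort_above: float
) -> Optional[float]:
    """Sum predicted error word by word; return None if abandoned midway."""
    seen = list(context)
    total = 0.0
    for word in candidate.split():
        total += error_fn(seen, word)
        if total > abort_above:
            return None  # going badly: stop scoring this candidate early
        seen.append(word)
    return total


def rerank(
    context: Sequence[str],
    candidates: Sequence[str],
    error_fn: ErrorFn,
    keep: int = 5,
    abort_above: float = float("inf"),
) -> List[str]:
    """Keep the `keep` surviving candidates with the lowest total predicted error."""
    scored = []
    for candidate in candidates:
        score = score_candidate(context, candidate, error_fn, abort_above)
        if score is not None:
            scored.append((score, candidate))
    return [candidate for _, candidate in heapq.nsmallest(keep, scored)]


def toy_error(seen: Sequence[str], word: str) -> float:
    """Toy stand-in for the trained model: only 'chocolate' is flagged as odd."""
    return 1.0 if word.strip(".,!?").lower() == "chocolate" else 0.0


context = "What countries have you been to?".split()
candidates = [
    "I've been to many countries, but chocolate was my favorite.",
    "I've been to many countries, but France was my favorite.",
]
best = rerank(context, candidates, toy_error, keep=1, abort_above=0.5)
```

Scoring word by word is what makes early elimination possible: the "chocolate" candidate is dropped as soon as its running error crosses the threshold, without processing the rest of the sentence.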
It’s worth noting that long-term memory isn't really needed here. The computer gives a response that looks like it requires long-term memory (e.g. saying it went to Paris); but that is merely a randomly-generated "appropriate" response. And, using short-term memory, it maintains the ruse for the duration of the conversation.
Adding long-term memories is certainly possible; but I would be thrilled just to see a socialbot that could work as well as the above example for 5 or 6 back-and-forth exchanges. Let's take this one step at a time!
As magical as a really good socialbot would be, the risk of failure is probably too high for people to attempt something like this the very second good BCIs become available. They will need to work up to it, to build confidence that it will work well enough -- it will work; the key question is, "How well?" I think things will ramp up as they did with image recognition accuracy: as people see what they can achieve with brain data, they will devote exponentially more energy to experimenting and building applications.
Another thing that may slow down development is ethical and privacy concerns. Have those 100 people who contribute their brain data to the project unintentionally revealed their deepest, darkest secrets? Or passwords and bank account numbers? And what about the zombie AGI? Predicting brain responses would put it much closer to being a real, live human being than any other piece of software yet produced. Could we really say that it’s “just a lifeless machine executing dumb algorithms”?
Finally, it’s worth mentioning that such a system could possibly be made much, much smarter, by tweaking the model. For example, the model could perhaps be tweaked to solve scientific problems, through the use of neuroevolution or reinforcement learning. What would happen if we released a bot onto the internet with the verbal intelligence and commonsense of a human, along with the scientific acumen of Einstein, or beyond?