The purpose of this posting, and the others in this thread, is to lay out what I think is the human enhancement potential of next-generation Brain Computer Interfaces. Specifically, I’m thinking of BCIs that have very high spatial and temporal resolution, and that can scan the brain at depth. I am not interested here in:
* EEG BCIs. These are great for some things, though they are limited.
* BCIs that write to the brain. These will have many uses; I’m just not interested in them for this posting.
I foresee four broad categories of how BCIs will enhance humans:
1. They will use our brain patterns to show us things we never knew could be deduced from what we know. This will allow us to extend human intelligence in various ways, and will also accelerate our ability to learn new things.
2. They will greatly improve a computer’s ability to read our intentions and disambiguate. This will make virtual assistants more accurate and engaging, for example.
3. They will allow us to transfer our thoughts to the computer -- e.g. our inner voices will be transcribed; our intentions will be interpreted (“robot, clean the floor”); the images in our heads can be transferred to the screen.
4. They will allow us to indirectly enhance ourselves, by training machine learning algorithms to automate tasks.
I have written extensively about #4, so I will focus on the other three in this note. The next several posts will address these in order. All will involve the use of statistical algorithms and machine learning, and I foresee progress similar to what we have seen with speech recognition, image recognition, and machine translation -- it will start slow, and then will rapidly improve. The remainder of this posting will address #1:
I. Computers will use our brain patterns to show us things we never knew could be deduced from what we know.
Here is something rather amazing: just by looking at how words co-occur in the English language, it is possible to determine the distance between cities of Europe, and also determine their longitude and latitude coordinates, with high accuracy.
I suppose I need to explain this a little bit. When people talk about the cities of Europe, they mention nearby cities together more often than cities that are far apart. This is obviously not the case in all contexts -- for example, if we’re talking about “major world powers”, you might find that “London” and “Berlin” are mentioned together more often than “London” and some other city in England. But across a broad enough set of contexts, word co-occurrence statistics of city names can be used to localize them on a map.
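To make this concrete, here is a minimal sketch of how such a reconstruction could work, assuming numpy and scikit-learn. The co-occurrence counts below are invented for illustration; a real study would harvest them from a large text corpus:

```python
# Sketch: recovering a rough map of European cities purely from
# co-occurrence counts. The counts below are invented for illustration;
# a real study would harvest them from a large text corpus.
import numpy as np
from sklearn.manifold import MDS

cities = ["London", "Paris", "Berlin", "Madrid", "Rome"]

# Hypothetical symmetric co-occurrence counts: nearby cities are
# mentioned together more often than distant ones.
cooc = np.array([
    [ 0, 90, 60, 40, 30],
    [90,  0, 70, 60, 50],
    [60, 70,  0, 30, 55],
    [40, 60, 30,  0, 45],
    [30, 50, 55, 45,  0],
], dtype=float)

# Turn similarity (high co-occurrence) into dissimilarity (distance).
dissim = 1.0 / (1.0 + cooc)
np.fill_diagonal(dissim, 0.0)

# Multidimensional scaling embeds the cities in 2-D so that embedded
# distances approximate the dissimilarities -- i.e., a map.
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissim)

for city, (x, y) in zip(cities, coords):
    print(f"{city:8s} ({x:+.3f}, {y:+.3f})")
```

The recovered layout is only defined up to rotation and reflection, which is just what you’d expect: co-occurrence counts know about distances, not about which way is north.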
It’s not just cities. This has been tried with parts of the body! I kid you not. Here is a paper about it:
Recent literature has shown that perceptual information, such as geographical locations, modalities, and iconicity, is encoded in language. The current paper extended these findings by addressing the question whether language encodes (literally) embodied information: whether statistical linguistic frequencies can explain the relative location of different parts of the body.
The takeaway message here is that statistical methods are very powerful, and can be used to make deductions about things that are “implicit”, “hidden”, or “encrypted”. In point of fact, it was statistical methods that helped Britain crack the German Enigma code during World War II, and that may well have been a deciding factor in winning the war.
We can apply this same kind of mindset to the data generated by high-resolution BCIs recording our brains as we interact with the world. When we hear a word mentioned, our brains respond with a certain regularity; and that regularity can be characterized through the use of statistical methods (e.g. MVPA, or “Multi-Voxel Pattern Analysis”).
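As a toy illustration of the MVPA idea, here is a sketch using synthetic data in place of real voxel patterns; the point is just that a cross-validated classifier can tell whether a word-specific regularity exists:

```python
# Minimal MVPA-style decoding sketch: cross-validated classification of
# which of two words a subject heard, from synthetic stand-in "voxel"
# patterns. Above-chance accuracy = a reliable word-specific regularity.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 500

# Each word evokes a noisy version of its own characteristic pattern.
templates = rng.normal(size=(2, n_voxels))      # one template per word
labels = rng.integers(0, 2, size=n_trials)      # which word on each trial
patterns = templates[labels] + rng.normal(scale=3.0,
                                          size=(n_trials, n_voxels))

scores = cross_val_score(LinearSVC(), patterns, labels, cv=5)
print("decoding accuracy per fold:", scores.round(2))
```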
You might think that everybody’s brain is different, and that in different contexts our brain responds differently, making this kind of analysis impossible. However:
* When we hear or see a word, at least about a third of our brain pattern is the same across contexts. That may sound like a lot is being lost, but it’s still enough to read off some generic, context-independent information about that word; and if we can decode how the other two-thirds of the brain pattern represent context, then we can extract even finer details of meaning.
* Although different people have different brains, there are still substantial similarities in brain patterns across individuals. There are also algorithms that map responses from one brain to another, making it possible to match even more of the features shared by different brains (a sketch of one such method follows this list). It’s remarkable how much of this information is shared by different individuals -- individuals who grew up in different towns or cities, who had different cultural upbringings, and who even spoke different languages!
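Here is a minimal sketch of one such alignment method -- orthogonal Procrustes, which as I understand it is the core step in hyperalignment-style approaches -- run on synthetic data:

```python
# Sketch of brain-to-brain mapping via orthogonal Procrustes alignment.
# Two subjects "view" the same stimuli; we learn a rotation taking
# subject B's response space onto subject A's. All data are synthetic.
import numpy as np

rng = np.random.default_rng(1)
n_stimuli, n_features = 100, 50

A = rng.normal(size=(n_stimuli, n_features))            # subject A
true_R = np.linalg.qr(rng.normal(size=(n_features, n_features)))[0]
B = A @ true_R + rng.normal(scale=0.1, size=A.shape)    # subject B

# Orthogonal Procrustes: the rotation minimizing ||B R - A|| has a
# closed-form solution via the SVD of B^T A.
U, _, Vt = np.linalg.svd(B.T @ A)
R = U @ Vt

print("mean error before alignment:", np.abs(B - A).mean().round(3))
print("mean error after alignment: ", np.abs(B @ R - A).mean().round(3))
```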
Ok, so we have powerful statistical methods, and we have brains that exhibit similar patterns when people hear the same words -- even in different contexts. What can we do with that?
Well, maybe we can apply the statistics to brain data to make surprising deductions about the contents of our knowledge, just like with the longitude and latitude example above. For example, maybe we can feed our brain responses to the words “New York” into an algorithm, and out will pop estimates of the city’s location, population, size, and other details. Something like this has been shown for statistical analysis of free text, so why not brain patterns?
Now, one could easily make such an algorithm that just memorizes the answers, by programming in a “lookup table” of cities and their various properties. That is not the kind of algorithm I’m talking about. I’m talking about an algorithm that can be spelled out in just a few lines of code. This algorithm will have some learnable parameters, whose values are set by feeding in “training examples” of (city, brain response, attributes) triples; but the number of examples will be tiny compared to the number of cities the algorithm will work on. The total amount of information contained in the lines of code and the training data will be very small compared to the amount of information we can extract from the brain when a very large number of city names are presented to an individual.
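To make the “few lines of code” claim concrete, here is a sketch of such an algorithm: a ridge regression trained on a handful of (brain response, attributes) pairs and tested on held-out cities. Everything here is a synthetic stand-in; the point is only that the parameter count and training set stay small while generalization extends to unseen cities:

```python
# Sketch of the "few lines of code" algorithm: ridge regression from
# brain-response vectors to city attributes (say latitude, longitude,
# log-population). Nothing city-specific is hard-coded, and the
# training set is deliberately tiny.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_cities, n_features, n_train = 300, 200, 30

# Working assumption: each city's brain response is a noisy linear
# image of its attributes.
attributes = rng.normal(size=(n_cities, 3))        # lat, lon, log-pop
mixing = rng.normal(size=(3, n_features))
responses = attributes @ mixing + rng.normal(scale=0.5,
                                             size=(n_cities, n_features))

# Train on 30 cities; predict attributes for the remaining 270.
model = Ridge(alpha=1.0).fit(responses[:n_train], attributes[:n_train])
pred = model.predict(responses[n_train:])
for k, name in enumerate(["latitude", "longitude", "log-population"]):
    r = np.corrcoef(pred[:, k], attributes[n_train:, k])[0, 1]
    print(f"held-out correlation, {name}: {r:.2f}")
```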
The success of such an algorithm, just like the success of the ones that do the same with statistics applied to free text, should not be taken to indicate that the information about cities is explicitly “contained in the human brain”. Rather, the human brain contains associational information that can be translated into good estimates of location, population, size, and so on.
It would be a neat parlor trick to do this for cities, but that wouldn’t be very practical, as we can look up that information very easily. So what would be some good applications?
For a start, consider this one: when you hear the name of someone you have known for years, certain parts of your brain light up, reflecting your knowledge of how they speak and walk, whether their skin is pale and sickly, their psychological state, and so on. A computer given access to that information could attempt to predict whether they have a major illness like Alzheimer’s or cancer; and it would probably be right far above chance level -- accurate enough to make a good suggestion that they see a specialist.
Another example: let’s say the BCI is light and compact enough that you can wear it basically all the time. You lie down on the couch to watch TV, and as you listen to the news, the BCI scans your brain responses. When you hear certain words, a program analyzes those responses. The program knows the typical response of someone who correctly understands the meaning of a word; and, as I said earlier, some portion of that response is shared across individuals, and across most contexts. So, if your understanding of a word is wrong enough to cross a certain error threshold, the program could alert you by stating the correct definition or meaning. It wouldn’t pipe up for casual misunderstandings, only for the really egregious ones -- something like the logic sketched below.
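Here is a sketch of what that alerting logic might look like. The decoded vectors, reference vectors, threshold value, and definitions table are all hypothetical stand-ins:

```python
# Sketch of the alerting logic: compare the meaning decoded from the
# wearer's brain with a reference vector for the word, and speak up only
# past a threshold. Vectors, threshold, and definitions are hypothetical.
import numpy as np

ALERT_THRESHOLD = 0.6   # tuned so casual fuzziness stays silent

def cosine_distance(u, v):
    return 1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def maybe_correct(word, decoded_vec, reference_vecs, definitions):
    """Alert only if the decoded meaning is far from the consensus one."""
    if cosine_distance(decoded_vec, reference_vecs[word]) > ALERT_THRESHOLD:
        return f"By the way, '{word}' means: {definitions[word]}"
    return None  # close enough -- stay quiet

# Toy demo with 3-D stand-in vectors:
refs = {"dimension": np.array([1.0, 0.0, 0.0])}
defs = {"dimension": "the number of coordinates needed to specify a point"}
print(maybe_correct("dimension", np.array([0.1, 1.0, 0.2]), refs, defs))
```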
You can take that further, using ideas similar to the ones I described above regarding cities: it’s known how to use word co-occurrence statistics to determine the temporal order of events:
In several computational studies we demonstrated that the chronological order of days, months, years, and the chronological sequence of historical figures can be predicted using language statistics. In fact, both the leaders of the Soviet Union and the presidents of the United States can be ordered chronologically based on the cooccurrences of their names in language. An experiment also showed that the bigram frequency of US president names predicted the response time of participants in their evaluation of the chronology of these presidents.
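Here is a sketch of the text-side version of that trick: embed the co-occurrence matrix on a line, and read the order off the axis. The counts below are invented for illustration:

```python
# Sketch: names close in time co-occur more, so embedding the
# co-occurrence matrix on a line should string the names out in
# (possibly reversed) chronological order. Counts are invented.
import numpy as np
from sklearn.manifold import MDS

names = ["Lincoln", "T. Roosevelt", "FDR", "Kennedy", "Reagan"]
cooc = np.array([
    [ 0, 50, 30, 20, 10],
    [50,  0, 55, 30, 15],
    [30, 55,  0, 60, 35],
    [20, 30, 60,  0, 65],
    [10, 15, 35, 65,  0],
], dtype=float)

dissim = 1.0 / (1.0 + cooc)
np.fill_diagonal(dissim, 0.0)

# One-dimensional embedding; sorting along the axis recovers the order
# (the statistics can't tell past from future, so it may come out reversed).
pos = MDS(n_components=1, dissimilarity="precomputed",
          random_state=0).fit_transform(dissim).ravel()
print([names[i] for i in np.argsort(pos)])
```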
And I’m guessing something similar can be done for brain responses. So, for example, after a BCI scans your brain when you hear “Teddy Roosevelt”, an algorithm might look at your neural representations, and determine that you think he was president in the early 1800s, instead of the early 1900s. After making that determination with high confidence, the computer might chime in with, “Teddy Roosevelt was the U.S. president from 1901 to 1909”. You might then say, “Wow! How did it know I didn’t know that!?”
I’m a believer in a certain “stoplight” theory of problems with learning new skills or areas of study: I think most of the roadblocks people face when progressing to the next level in their education are based on some small set of misconceptions or flaws. It’s like getting from point A to point B in a busy city: most of the time is spent stuck at stoplights or in snarled traffic (depending on the city and time of day).
For example, when learning to play the piano or the game of chess, there are often subtle flaws in one's playing or strategy that one doesn't notice at first, and that take time to overcome. An observant teacher can help, but not everyone has access to the best teachers. Maybe, for example, a novice chess player spends too much time focused on the periphery of the board, and not enough time focusing on the center -- a BCI and the right software might detect that flaw, by comparing the player's neural representations with those of a master.
Another example: while learning about modern physics, some people may read, in a popular press article, about “extra dimensions”. If they are like most people, they will think “dimension” refers to “parallel universe”, instead of “number of coordinates” (or the size of a basis). That one misconception could prevent them from learning any more about the subject. Pile up several more misconceptions like that, and there is almost no chance they will progress much further in their understanding.
Could we find those misconceptions right as they arise? I think in some cases we can. I’ve already mentioned how this would work for factual knowledge about geography and history, and about the correct use of the word "dimension"; but it probably also works for pinning down the conceptualizations for doing science. Here is a paper that points in that direction:
In the paper, researchers looked at the brain response patterns of different kinds of students (undergraduate and graduate) as they were presented with physics terms like “momentum” and “electric field”. They found certain similarities -- and differences -- in the responses. With much higher resolution BCIs, I think algorithms could be built to probe the finer aspects of neural representations of scientific knowledge. If someone’s early understanding is at variance with that of experts in the field, then a computer could tell them so (and even generate natural-language descriptions of their misconceptions).
The very same methods could enable computers to help scientists solve problems. For example, if a researcher is thinking about a potential approach to nuclear fusion, their brain activity may be similar to that of other researchers in different fields when they think about specific problems. The computer could alert the fusion specialist to that work, and perhaps the solution transfers.
Maybe as the researcher thinks about fusion, he or she forms a dynamic, mental sculpture, representing the wavering magnetic fields and plasma currents inside a reactor. The sculpture may not be purely visual, but could involve body motions (dance; hands cupped around currents), sounds, or mere fleeting glimpses of fields in the void. The thought patterns behind that mind-sculpture might be similar to ones that pop into the heads of engineers working on the design of high-temperature engines, for example. Perhaps if the fusion researcher knew precisely which research paper in the engine design literature to look at, they could transfer some of the ideas to their field.
This is the kind of analogizing that is common in the sciences, though people rarely notice that they are making analogies.
A more down-to-earth example of how to use BCIs to analogize would be the following: suppose you want to know the equivalent, in some city you will be visiting, of a particular experience from back home. Maybe you are looking for the equivalent in city B of a jazz club you have experienced in city A. Or maybe you are looking for a piece of jazz music that gives you the same sense of wonder and chills-down-the-arms-and-legs as a particular modern classical piece.
How could it be done? Through the use of “brain vectors”. You can do something similar to what is described here:
The basic method that applies to “word vectors” should also apply to vectors generated from brain data using a BCI.
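Here is a sketch of that analogy arithmetic (the trick behind “king - man + woman ≈ queen”) applied to hypothetical “brain vectors”. The venue names and vectors are invented; in reality the vectors would be decoded from BCI recordings:

```python
# Sketch of analogy-by-vector-arithmetic applied to hypothetical
# brain vectors. Venue names and vectors are invented stand-ins.
import numpy as np

def most_similar(query, catalog):
    """Return the catalog item whose vector best matches the query."""
    def cos(u, v):
        return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return max(catalog, key=lambda name: cos(query, catalog[name]))

rng = np.random.default_rng(3)
city_A, city_B = rng.normal(size=8), rng.normal(size=8)
smoky_jazz = rng.normal(size=8)     # the "feel" of your favorite club

# Hypothetical venues in city B, each a city component plus a "feel".
catalog = {
    "Blue Note (city B)":   city_B + smoky_jazz,
    "Opera House (city B)": city_B + rng.normal(size=8),
    "Dive Bar (city B)":    city_B + rng.normal(size=8),
}

# club-in-A - city A + city B ~ its counterpart in city B
favorite_club_in_A = city_A + smoky_jazz
query = favorite_club_in_A - city_A + city_B
print(most_similar(query, catalog))     # -> "Blue Note (city B)"
```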
The examples I have given here are only a snapshot of the possibilities for this type of human enhancement by BCIs. Stay tuned for the next posting, which will be about how BCIs will allow computers to understand us better (e.g. disambiguate).