I have recently been skimming papers on using BCIs and other physiological signals to annotate text and other media for retrieval; for example, this work, which appeared in Nature's Scientific Reports back in December 2016:
In experiments, participants were asked to read Wikipedia documents about a selection of topics while their EEG was recorded. Based on the prediction of word relevance, the individual’s search intent was modeled and successfully used for retrieving new relevant documents from the whole English Wikipedia corpus.
(Here, they are annotating the text a user is reading for a single use, not annotating large amounts of text for later retrieval; but it's a similar idea.)
Unfortunately, the signals used were very noisy, and low in information content. What would it be like, I wondered, if we had high temporal and spatial resolution BCIs to help annotate everything? This little note is a meditation on that eventuality:
For one thing, it should be noted that we won't even need advanced AI or machine learning to extract a lot of benefit from those BCI signals! Most of the benefit could come from very simple indexing algorithms, along with some very simple "linear decoding" methods. Maybe some AI would be needed for speech recognition and query understanding -- but no complex reasoning engines, video understanding, object recognition, or machine-reading algorithms would be necessary. If such BCIs had been available circa the year 2000, say, they would have led to much better-performing search engines than we have today.
For example, here is how BCIs could be used to annotate the web for much smarter retrieval. Take Wikipedia as a starting point, and get a few people to read articles while wearing BCIs (as in the research paper above). Multiple people would read the same articles while their eyes are tracked (so we know which words they are reading), and their brain activity would later be averaged across readers, to increase robustness and accuracy. Every single word would then be annotated with a multidimensional "brain word vector" that indicates not only what the person thinks the word means generically, but also what it means in context, the mood and sentiment associated with the word, time and place, and many other things. So, for example, if they were to read The Wizard of Oz (the story, not the Wiki article) and came across the name "Dorothy", they might think,
girl, young, naive, mid-West, Kansas, good, dog,
and many other things besides. So much goes through a person's mind that it would take forever just to write it all down for even a few pages of text; it would therefore cost companies like Google or Facebook a fortune to pay people to annotate text at that level of detail "by hand" (without BCIs).
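The pooling-and-averaging step described above is about as simple as signal processing gets. A minimal sketch (the data layout -- one array of per-word vectors per reader -- is invented for illustration):

```python
import numpy as np

def average_word_vectors(readings):
    """Pool per-word "brain vectors" from multiple readers of the same
    article and average them word-by-word to reduce noise.
    `readings` is a list of arrays, one per reader, each of shape
    (num_words, vector_dim) -- a hypothetical data layout."""
    stacked = np.stack(readings)   # (num_readers, num_words, vector_dim)
    return stacked.mean(axis=0)    # (num_words, vector_dim)

# Toy sizes: three readers, a 5-word passage, 4-dimensional vectors.
readers = [np.random.randn(5, 4) for _ in range(3)]
annotations = average_word_vectors(readers)
print(annotations.shape)  # (5, 4)
```

With real BCI data one would want outlier rejection and per-reader normalization before averaging, but a plain mean already captures the "multiple readers increase robustness" idea.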
Not only would BCIs make the annotation effortless, but they would capture subtle shades of meaning that we aren't even consciously aware of -- shades that word clouds can't capture, some of which would be "subsymbolic".
Other media could be annotated similarly. Every second of film in movies would have an associated 10,000-dimensional brain vector, representing many different facets of the story. Using eye-tracking, there would even be vectors associated with individual actors and objects in the story -- what do people think of the actors' tears? The dancing? The background? Every single detail would be annotated, with no effort on the part of the viewers providing the annotation.
It would be so easy to annotate (with BCIs) images, films, Wikipedia, novels, news articles, and so forth, that people might even do it for free, as a public service, just as volunteers maintain Wikipedia.
Some lawyers may also BCI-annotate legal opinions; and scientists may BCI-annotate scientific papers.
Once all this annotation data is collected, some very, very simple algorithms could be used to look things up with incredible specificity. For example, say you want an article that can be described as:
A piece that superficially appears to be skeptical of global warming, but that seems to have the ulterior motive of convincing the reader that global warming is real -- the kind of piece that would convince a denialist.
It would be very difficult to find this using a traditional search engine, but it could plausibly be found using all those BCI annotations. First, you would tell Google Assistant, say, what kind of article you were looking for; it would scan your brain (as you are speaking or writing your request) and extract from your brain vector the key features of the article you wanted; then it would simply look through its vast trove of articles, matching the subject area based on your words, along with the subtler shades of meaning based on your brain vector. A fraction of a second later you would get exactly the kind of article you were looking for.
You could do the same with videos and images -- not only could you find a video that is kind of in the ballpark of what you were looking for, but you could find an exact match, along with the specific three-second clip within the video that is relevant, and even the specific objects in context within that clip that you were seeking.
Having the web be neurally annotated would also take question-answering to a whole other level. For example, if you wanted to know who the villain in The Wizard of Oz is, with every word neurally annotated, it would be easy to pick out "The Wicked Witch of the West"; that wouldn't even require anything fancy. But systems today can do that already. You could ask much more complicated questions, like, "What was Dorothy's reaction upon first meeting the Scarecrow?" -- and the system might be able to pick out "sad" or "frightened" or something as the answer, from brain scan data. This would require interpreting the brain-vector annotations in the right way, but that should be a lot less complicated than the daunting task of building a machine-reading system that can make deep inferences about text.
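To see how little machinery such a QA lookup might need, here is a toy nearest-neighbour sketch (the vectors, words, and dimensions are all hypothetical): return the word whose neural annotation best matches the question's brain vector.

```python
import numpy as np

def answer(question_vec, annotated_words):
    """Pick the word whose neural annotation best matches the question's
    brain vector (plain dot-product nearest neighbour -- no reasoning
    engine, no machine reading). `annotated_words` is a list of
    (word, brain_vector) pairs; all data here is hypothetical."""
    return max(annotated_words,
               key=lambda wv: float(question_vec @ wv[1]))[0]

# Toy 2-d example: the question vector points along a "fear" axis,
# so the word annotated most strongly with fear wins.
words = [("happy",      np.array([1.0, 0.0])),
         ("frightened", np.array([0.1, 1.0]))]
print(answer(np.array([0.0, 1.0]), words))  # frightened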
And why stop at text, images, videos, and audio recordings? We could neurally annotate the whole world! -- stores, houses, city streets, forests -- everything! Two obvious applications would be to home robots and self-driving cars:
Home robots face a number of problems: they don't necessarily know how to grasp objects, whether an object is trash, whether it is delicate and shouldn't be handled, and so on. If people wore BCIs (and eye-trackers) for even just a few minutes as they wandered around their homes, the contents of those homes could be neurally annotated; the robots would then know what they should be doing. They would know that this needs to be put in the trash, and that needs to be left where it is.
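Once the annotations exist, the decoding on the robot's side could be crude. A sketch (the dimension names and thresholds are invented for illustration): project each object's averaged brain vector onto a few named dimensions and apply thresholds.

```python
def handling_rule(annotation):
    """Turn a neurally derived annotation for a household object into a
    handling rule. `annotation` maps a few hypothetical decoded
    dimensions -- "trash", "delicate" -- to scores in [0, 1]."""
    if annotation["trash"] > 0.5:
        return "put in trash"
    if annotation["delicate"] > 0.5:
        return "leave untouched"
    return "leave in place"

# An empty wrapper scores high on "trash"; grandma's vase on "delicate":
print(handling_rule({"trash": 0.9, "delicate": 0.1}))  # put in trash
print(handling_rule({"trash": 0.1, "delicate": 0.8}))  # leave untouched
```

The point is the division of labour: the hard part (knowing what an object is and how it should be treated) comes from the human annotators' brains, leaving only trivial logic for the robot.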
As to self-driving cars: as people drove their cars while wearing advanced BCIs, the entire landscape could be annotated. Every sign could be passively identified and recorded from brain data (at least the signs a driver looks at -- but since data from multiple drivers would be pooled, at least a few annotators are sure to notice each sign); the road lanes and free space could be identified; confusing landmarks could be flagged (e.g. ones with reflective surfaces); so could the visceral sense of which areas are "dangerous" and call for extra attention to the road, and bad parts of the road to watch out for (e.g. potholes); and so on.
Many of these can probably already be picked out by existing algorithms with high accuracy. The brain data could be used to "check their work": whenever the BCI annotations disagree with what the self-driving companies have in their files, the labels could be checked by a human and updated. Hopefully the number of errors would be small enough that not many checks would be needed.
This "check their work" idea also applies to moving objects in the environment. As the car moves, it may make lots of identification errors that would never force a handover to the human driver -- and so would never get caught. Maybe the car sees a plastic bag in a different lane and mistakes it for a rock that needs to be avoided; since the bag is in another lane, the car doesn't react, and the driver never notices the error. Or maybe a child in a costume, holding a mirror, crosses the street ahead; the car identifies the child as an obstacle and correctly slows down, but doesn't register a "pedestrian". Using BCIs, all these silent errors would get caught -- and that should help improve the safety of the cars quite efficiently.
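The check itself is nothing more than a diff over two label sets. A minimal sketch (the label format is invented): compare the pooled BCI-derived labels with the car's own labels for the same tracked objects, and queue every mismatch for human review.

```python
def flag_discrepancies(bci_labels, car_labels):
    """Both arguments map object ids to label strings (a hypothetical
    format). Return the objects on which the pooled BCI annotation and
    the car's perception system disagree, for human review."""
    shared = bci_labels.keys() & car_labels.keys()
    return {obj: (bci_labels[obj], car_labels[obj])
            for obj in shared if bci_labels[obj] != car_labels[obj]}

# The car called object 7 a "rock"; pooled brain data says "plastic bag":
print(flag_discrepancies({7: "plastic bag", 8: "car"},
                         {7: "rock", 8: "car"}))
# {7: ('plastic bag', 'rock')}
```

Crucially, this catches exactly the silent errors described above: disagreements that never changed the car's behaviour and so would otherwise leave no trace.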
I'm sure there are hundreds more scenarios like the ones I've listed. As I've written before, I'm excited by what BCIs will mean for the advance of AI; but BCIs will have a huge impact even without much contact with advanced AI. Even very simple algorithms applied to mass neural annotations of text, video, images, audio, the home, and the rest of the world, will unlock whole new realms of possibility.