Just this year we got chatbots that can do a pretty good job in short conversations. I discussed it here, and also later in that thread: https://www.futureti...-like/?p=270632
I hate to guess what 10 more years of refinement will do for that.
Already, these so-called "language models" like GPT-2 and Megatron can do a fair bit of "improvising", as I discussed here: https://www.futureti...-like/?p=271299
(I recommend clicking that link and reading it. It's totally insane!)
It just can't do it perfectly reliably.
Here is what I would guess is true: suppose a person engages in an online chat with GPT-2, and a human picks out the best of 25 responses the program generates. More than 95% of the time, that best response is an almost perfect thing to say in the situation. That means all that separates us from hard-to-distinguish-from-human chatbots is a sufficiently good "critic" module that can pick the best-of-25 automatically.
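To make the idea concrete, here is a minimal sketch of that best-of-N-plus-critic loop. Everything here is a toy placeholder -- `generate_candidates` stands in for sampling replies from a language model like GPT-2, and `critic_score` stands in for a trained model that would rate relevance and coherence; the dummy heuristic is purely illustrative.

```python
def generate_candidates(prompt, n=25):
    """Stand-in for sampling n candidate replies from a language model."""
    return [f"candidate reply {i} to: {prompt}" for i in range(n)]

def critic_score(prompt, reply):
    """Toy critic. A real critic would be a trained model scoring how
    relevant and coherent the reply is; this dummy heuristic just
    prefers replies whose length is close to the prompt's length."""
    return -abs(len(reply) - len(prompt))

def best_of_n(prompt, n=25):
    """Generate n candidates, then let the critic pick the best one --
    the 'best-of-25' selection step described above."""
    candidates = generate_candidates(prompt, n)
    return max(candidates, key=lambda reply: critic_score(prompt, reply))
```

The point of the sketch is that the generation side already exists; the open problem is making `critic_score` good enough to match a human picker.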
I base this on several pieces of evidence I've seen -- e.g. an article in The Economist, and also the performance of that DialoGPT chatbot when selecting from 16 candidate responses.
This is not to say the system has no limits. It does; but in most casual conversations -- even ones that require a little bit of logic -- you would not be able to find them.
For example, let's say you were to test it out:
User: What is the first word, alphabetically, in this sentence?
Chatbot: "alphabetically"
User: How the $%#$^@ did you do that! NO WAY someone programmed you to answer that question!
That's what it will look like. Somehow, it knew to sort the words of the sentence and land on "alphabetically".
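For the record, the computation the chatbot would effectively be performing is trivial to write down explicitly. This is just a sketch of the puzzle itself, not of how a language model arrives at the answer; the helper name is my own.

```python
import re

def first_word_alphabetically(sentence):
    """Extract the words (letters only, case-insensitive) and return
    the one that comes first in alphabetical order."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sorted(words)[0]

question = "What is the first word, alphabetically, in this sentence?"
print(first_word_alphabetically(question))  # -> alphabetically
```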
To find its limits, you are going to have to try harder:
User: Sort the words in this sentence, and then write down the fourth word in that list.
Chatbot: "in" (appears twice)
User: Again, how the #%@%& did you do that!!!! Is somebody pulling my leg? Is there a human on the other end?
Still not good enough. Try even harder:
User: Sort the words in this sentence, write the list in reverse order, and then write down the nth word, where n is 2 times 3.
User: Ah, hah! Gotcha! You're just a dumb machine!
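For reference, here is what the two harder challenges actually require. Again, this is only a sketch of the puzzles themselves -- the function name is mine, and I'm treating "words" as runs of letters, which is one reasonable reading of the instructions.

```python
import re

def sorted_words(sentence):
    """Alphabetically sorted list of the words (letters only) in a sentence."""
    return sorted(re.findall(r"[a-z]+", sentence.lower()))

# Second challenge: the fourth word of the alphabetized list.
q2 = "Sort the words in this sentence, and then write down the fourth word in that list."
words = sorted_words(q2)
print(words[3], words.count(words[3]))  # -> in 2   (matching the chatbot's answer)

# Third challenge: alphabetize, reverse the list, then take the nth word, n = 2 * 3.
q3 = ("Sort the words in this sentence, write the list in reverse order, "
      "and then write down the nth word, where n is 2 times 3.")
n = 2 * 3
print(list(reversed(sorted_words(q3)))[n - 1])  # -> times
```

The third challenge stacks three operations (sort, reverse, compute n, index) on top of each other, which is exactly the kind of multi-step reasoning that trips up current language models.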
But that's only if they remain limited to shallow chains of reasoning, as language models are today. In 10 years, who knows how far they will have come?
Perhaps you think you can prove its in-humanness by asking it questions that require commonsense reasoning? That probably won't work either -- it will have so much "world knowledge", encoded as statistical relations among word patterns, that it will answer your questions correctly about as often as an above-average-intelligence human.
User: Suppose you set a dictionary, an apple, a bible, and a copy of War and Peace on a table. How many books are on the table?
Chatbot: Three. What a stupid question!
Maybe you can ask it philosophical questions to catch it out? Again, that won't work -- it will answer those perfectly well. See the DialoGPT examples in that thread above; the chatbot can handle these types of questions, too.
Maybe ask it to write (bad) poetry, like a limerick? Again, it will spin out as much poetry as you are willing to read -- and it will be as good as an above-average human poet.
To tell that the thing isn't human in 2029, you're going to have to pay very close attention and ask very subtle questions. Kurzweil may well be proved right by 2029, and machines will pass an official Turing Test. 10 years is a long time, and I could see enough of the gaps being filled in by then to prove him right.
I recently took a test to see how well I can spot GPT-2-generated headlines -- I forget the website it was on; might have been vox.com. I got pretty much all of them right, and the site said I scored above the 99th percentile among people who took it. I see that as confirmation that I am very good at discriminating the subtleties of machine-generated versus human-generated text.
And here is what this ability to "parse" machine-generated text is telling me: it's improving at a frightening pace, and people who think it won't pose a danger (e.g. acting as an online troll army) are seriously deluded. You'd better pray there are adequate safeguards in place by 2029 -- maybe even 2025!