Technically, I guess, the Deep Learning revolution goes further back than 2012 (or 2011); but the first big success that got everyone's attention, and led large numbers of people to take neural nets seriously again, was its win at the ImageNet competition. Shortly after that, there was explosive progress on a whole host of other computer vision tasks and on speech recognition. People said, "But NLP is still far out of reach of these pattern-recognizers." Then there were word2vec (not technically DL) and Collobert & Weston (from 2011, but not immediately appreciated). NLP people then said, "You can't cram the meaning of a whole %&!$# sentence into a single $&!#* vector." Now there has been massive progress on many of those NLP tasks that seemed out of reach, with Google's BERT being the most recent system to take a crack at them. Those 6 years also brought Atari-playing systems, AlphaGo and AlphaZero, Dota 2 bots, photorealistic image synthesis, music synthesis, and the beginnings of dexterous robots.
Where will it end?
People are still skeptical of Deep Learning's applicability to the really hard NLP tasks, like Winograd Schema problems and so-called commonsense reasoning; but they were skeptical in the past, too, and were shown to be wrong. They say things like, "Deep Learning is just pattern-matching. You need something better than that to do actual reasoning."
To be sure, the progress hasn't been steady. Over the past 6 years it has come in waves. Someone notices Deep Learning is good in some area, and then people work like gangbusters to push it to its limits (or the limits of available resources). We're seeing that right now with the mid-level NLP tasks, and it's beginning to happen with robotics.
I think the skeptics are right about the really hard NLP stuff at the moment, and probably also the really hard problems in robotics. However, as I've said before, there is an awful lot of implicit information about the world locked up in unstructured text, and there is a lot of room for reinforcement learning to improve robotics -- it just needs a lot more compute. I wouldn't be surprised if some not-too-distant future method finds a way to exploit all of this to solve these really hard problems.
I also think that if we had a lot of good-quality human brain data to work with, machine learning engineers would find ways to quickly exploit it to accelerate the progress ever further -- at an even faster clip than the already hurried pace of ordinary Deep Learning progress. But even without that brain data, it still looks to me like existing data sources will take us pretty far, pretty fast.
What will this look like in the next 6 years? At the rate we're going, I'd say we will see things like: virtual assistants on par with the Knowledge Navigator; video synthesis from scratch that is indistinguishable from a movie clip; photorealistic 3D VR scenes one can walk around in (as good as the best 2D GAN images); conversational socialbot agents that you could easily mistake for human on the phone over a few conversation rounds (so long as you don't probe their limits); almost-perfect machine translation of newspaper and magazine articles (we are almost there now, but in 6 years it will work so well you will almost never notice its limitations); neural-net models that can generate a short essay that would fool most humans (descriptive essays, and maybe even ones that make an argument or two); robots that can cook and clean, at least as demonstration models; massively improved self-driving cars; and many more.
Deployment of many of these things in the real world may have to wait a few more years, until all the kinks have been worked out and the systems are deemed safe. It's like what we see with self-driving cars: crazy-impressive demonstrations... but then years of work to make sure they can handle most of the edge cases.
And 12 years from now?
That's anybody's guess. I think we might at least see many of the 6-year demonstrations turned into mass consumer products. For example, in 12 years we might have house robots available to consumers that can clean tables, pick up stuff off the floor and put it away, straighten the bookshelves, and prepare simple meals.
Of course, those same robots will also be cleaning up hotel rooms, so what will the hotel workers do? And what about all the other workers displaced by self-driving cars, machine translators, virtual assistants that can do many of the tasks of secretaries, and so on?
Note: There are a lot of caveats and qualifications that I should probably add to many of the things I wrote above, but life is too short to spend time on that. For example, if you say that an image recognition system achieves "human-level accuracy", the qualification might look like, "On this particular test, with this particular list of object categories, drawn from a particular distribution, with a particular set of humans, who have had a particular amount of time to prepare for the test, the ML system achieved a higher score." Adding that is a sure way to make the writing boring.
This post is a reaction to seeing Google's recent BERT system. The leaps in performance seemed to come "out of the blue". The skeptics have stepped up their game on Twitter, though. They are right at the moment... but that could quickly change. And when it does, they'll move on to something else -- e.g., "But it needs insane amounts of data to do what a human does. Just because you can solve Winograd Schema problems with trillions of words of data doesn't mean it's really solving them like a human. In fact, it's not really solving them at all -- just more pattern-matching." Gary Marcus will talk about how neural nets still don't have "systematicity" (reviving a very old complaint). His goal is basically to get ML people to accept all the scholarship that he, Fodor, and others produced back in the 80s and 90s.