
Assuming compute power devoted to the largest AI project doubles every 3.5 months, what kind of progress can we expect in the near-future?

Tags: AI, Deep Learning

18 replies to this topic

#1
starspawn0

    Member

  • Members
  • 505 posts
Recall this posting by OpenAI from earlier today:

https://blog.openai....ai-and-compute/
 

We’re releasing an analysis showing that since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.5 month-doubling time (by comparison, Moore’s Law had an 18-month doubling period). Since 2012, this metric has grown by more than 300,000x (an 18-month doubling period would yield only a 12x increase). Improvements in compute have been a key component of AI progress, so as long as this trend continues, it’s worth preparing for the implications of systems far outside today’s capabilities.


Let's assume the analysis is correct (there aren't a lot of datapoints). What can we say about the near-future progress in AI?

I would say that there are two basic kinds of material limitations to machine learning progress (i.e. we are excluding theoretical limitations from the discussion):

* Data limitations -- that is, it's hard to find enough data to train models.

* Computation limitations -- people have long known how to make progress on a problem, and the data are plentiful; there just isn't enough computing power.


Data-limited problems include things like Machine Reading Comprehension and Video Question-Answering. I think plentiful data will soon arrive with advanced BCIs, enabling serious progress on those problems. Computation-limited problems, on the other hand, will see serious progress once people believe that vastly more computing power will get them the results they are looking for -- i.e. once their estimate of the risk of failure improves. The 3.5-month doubling time reflects a combination of improvements on the hardware side, falling costs, and a growing understanding of the risk profile of throwing more computing power at problems.

So what kinds of problems are computation-limited, but not data-limited? Here are a few:

* Video synthesis. There is plenty of video data -- petabytes, in fact. It's just computationally-demanding to train large models to generate video, so the models used currently probably don't absorb enough world knowledge to do a good job.

* Text synthesis. Again, there is lots and lots of text out there, but models are nowhere near large enough to absorb it all, to produce more coherent text.

* Robots completing complex tasks in virtual environments (which can hopefully be transferred to the real world). You can train robots endlessly in virtual environments; but to train a large model, with lots of capabilities, you need lots and lots of training sessions, and hence lots of computing power is needed.

* Playing board games or even videogames like Starcraft II -- basically any game with definite rules that can be run many times.

* Learning to browse the web and complete a task, by using keyboard and mouse, given a text description of what to do. Endless amounts of training data can be generated automatically.

* Solving programming competition problems.

* Solving math competition problems that require generating a short (but hard-to-find) proof.


And there are lots more.

Many of these you can see would have enormous economic value (e.g. the video synthesis and robot tasks, if they could be transferred to the real world).

I would say that if the growth in computing power thrown at the most popular problems continues unabated, we will see some scarily impressive progress on many of these in the next 3 years -- "scary" in the same sense as people's reaction when AlphaGo beat the top players at Go; most people never saw it coming.
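For a rough sense of scale, here is a back-of-envelope sketch (the 3.5-month figure is the only input taken from the OpenAI post; the rest is hypothetical arithmetic):

```python
# Hedged back-of-envelope: what a fixed doubling time implies over a given horizon.

def growth_factor(months, doubling_time_months=3.5):
    """Multiplicative increase in compute after `months` at a fixed doubling time."""
    return 2 ** (months / doubling_time_months)

# Next 3 years at the 3.5-month trend:
print(f"{growth_factor(36):,.0f}x")        # ~1,250x

# Same horizon at a Moore's-Law-style 18-month doubling, for contrast:
print(f"{growth_factor(36, 18):.0f}x")     # 4x
```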
  • Casey and Yuli Ban like this

#2
TranscendingGod

    2020's the decade of our reckoning

  • Members
  • 1,838 posts
  • Location: Georgia

Shoot, I didn't even know what Go was before that fateful event. That rate of resource addition is so mind-boggling that at this point I'm going to go with Vernor Vinge's original prediction of 2023 for the Singularity. Haha, or barring that, at least we will accomplish what Peter Weyland promised and create "cybernetic individuals" (see the "TED Talk 2023" YouTube video, a tie-in with the Prometheus movie).


  • Casey and starspawn0 like this

The growth of computation is doubly exponential.


#3
funkervogt

    Member

  • Members
  • 347 posts

From the prehistoric days of 1997:

 

 

''It may be a hundred years before a computer beats humans at Go -- maybe even longer,'' said Dr. Piet Hut, an astrophysicist at the Institute for Advanced Study in Princeton, N.J., and a fan of the game. ''If a reasonably intelligent person learned to play Go, in a few months he could beat all existing computer programs. You don't have to be a Kasparov.''...But winning the $1.4 million prize promised by the Ing foundation to a program that beats a human champion may be an impossible dream. The offer expires in the year 2000. Go programmers are hoping it will be extended for another century or two.

https://www.nytimes....cient-game.html

 

From the ancient days of 2006:

 

 

A very rough estimate might be that the evaluation function [for computer Go] is, at best, 100 times slower than chess, and the branching factor is four times greater at each play; taken together, the performance requirements for a chess-like approach to Go can be estimated as 10^27 times greater than that for computer chess. Moore's Law holds that computing power doubles every 18 months, so that means we might have a computer that could play Go using these techniques sometime in the 22nd century.

https://www.theguard...chnologysection
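Just to check the article's own arithmetic (a quick sketch using only the numbers in the quote):

```python
import math

# Bridging a 10^27 performance gap purely via 18-month Moore's-Law doublings,
# as the 2006 article assumes.
doublings = math.log2(1e27)          # ~90 doublings
years = doublings * 1.5              # 18 months per doubling -> ~135 years
print(f"~{years:.0f} years from 2006 -> ~{2006 + years:.0f}")
# Roughly the 2140s, i.e. "sometime in the 22nd century" -- which is why 2016 came as such a shock.
```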


  • Casey, Yuli Ban and starspawn0 like this

#4
starspawn0

    Member

  • Members
  • 505 posts
There seems to be some debate online about what this new analysis by OpenAI means. I think it's pretty clear what it means. It doesn't mean we're beating Moore's Law on the hardware side -- it means that Moore's Law is still puttering along, that chips are getting cheaper, and, perhaps most important, that people are willing to use more compute (at ever-increasing cost) as they grow more confident they can get it to produce good results. That last factor can be just as important as technological improvements. It won't last forever, of course, but it could still keep going strong for a few more years.

One of the best comments I have come across is by Salesforce's Stephen Merity:

https://twitter.com/...897276118290437

Being guarded against hype is important - but being cognizant of how quickly the landscape may change in our field of endeavor (and the industries and communities tied to it) is equally important.


People who have had their skeptic high-beams on too bright for too long, ready to pounce on any little error in a popular science article about Deep Learning, should be careful about dismissing this.
  • Casey and Yuli Ban like this

#5
starspawn0

    Member

  • Members
  • 505 posts
Text synthesis is getting better, and may be one of the first to see "scary" progress in the next 3 years:

https://arxiv.org/abs/1805.06064
 

With two series of Turing tests, where the human judges are asked to distinguish the system-generated abstracts from human-written ones, our system passes Turing tests by junior domain experts at a rate up to 30% and by non-expert at a rate up to 80%.


Writing an essay or short story might be one of those seemingly out-of-reach tasks that turns out to be doable, given enough computing power.

Let's think about it: it turns out there are only a small number of different types of stories:

https://www.theguard...es-ever-written

And there are only a small number of acceptable ways to write them. A probability distribution over narrative chains, along with latent composition laws (rules for recombining chains to produce new ones), can be learned. Some amount of common sense can be learned from free text, too. Language modeling already works well on individual sentences. A set of "editor" modules could check the grammar, coherence, and logic. Put it all together, and I could see a story-writing system that produces creative short stories becoming possible in the next couple of years -- the whole thing would be trained on massive amounts of free text and also large numbers of short stories. The outputs would fool most humans, and people will complain, "But does it really know what it's writing?"
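To make the shape of that pipeline concrete, here is a minimal sketch with every learned component stubbed out (the chains, the composition rule, and the "editors" below are all placeholders, not real models):

```python
import random

# Generate-then-edit sketch: sample narrative chains, recombine them, expand to prose,
# and keep only drafts that pass the "editor" checks. All components are toy stubs.

NARRATIVE_CHAINS = [
    ["a stranger arrives", "old secrets surface", "the town takes sides"],
    ["an experiment fails", "the lab is sealed", "one researcher stays behind"],
]

def sample_chain():
    """Stand-in for sampling from a learned distribution over narrative chains."""
    return random.choice(NARRATIVE_CHAINS)

def compose(chain_a, chain_b):
    """Stand-in for a learned composition law that recombines two chains."""
    cut = len(chain_a) // 2
    return chain_a[:cut] + chain_b[cut:]

def draft_story(chain):
    """Stand-in for a language model expanding an event chain into prose."""
    return " ".join(f"Then {event}." for event in chain)

def editors_accept(text):
    """Stand-in for editor modules checking grammar, coherence, and length."""
    return len(text.split()) > 5 and text.endswith(".")

def generate_story(max_tries=10):
    for _ in range(max_tries):
        story = draft_story(compose(sample_chain(), sample_chain()))
        if editors_accept(story):
            return story
    return None

print(generate_story())
```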
  • Yuli Ban likes this

#6
Alislaws

    Democratic Socialist Materialist

  • Members
  • 1,246 posts
  • Location: London

It would be neat to be able to go to a website or similar and fill in a form where you pick various elements, and it writes you a custom story.

 

If you can do a coherent story with narrative structure etc., then it would theoretically be possible to have videogames with procedurally generated plotlines, side missions, and so on -- which could allow for infinitely variable but highly detailed virtual worlds, right?

 

This might even be easier, because the storylines would be created inside a fully designed world that the computer system can understand more readily than the real one. The storylines could also (eventually) adapt on the fly to player actions.

 

Also you could play D&D with a computer DM. 


  • Yuli Ban and starspawn0 like this

#7
Yuli Ban

    Born Again Singularitarian

  • Moderators
  • 19,063 posts
  • Location: New Orleans, LA

Might you write a cohesive post for /r/MediaSynthesis listing all the things we might expect in the next five to ten years if this rate of advancement continues?


  • Casey and starspawn0 like this

And remember my friend, future events such as these will affect you in the future.


#8
starspawn0

    Member

  • Members
  • 505 posts

I'll pass.  But this piece might be good for your forum:

 

https://www.bloomber...about-two-weeks


  • Yuli Ban likes this

#9
funkervogt

    Member

  • Members
  • 347 posts

What do you think of this claim?

 

 

That seems like such an odd unit of measure to use Petaflops/s days. The dimensional analysis would suggest that the seconds would just cancel out from the top and bottom. I suggest a better unit of measure would be BED- (human) brain equivalent days. Human brain can process roughly 1 exaflop per second. So, if you had a human working away for 24 hours you would have 1 BED. Thus, AlphaGo Zero achieved 10 BED. I will just add some vague claims to, ah, intellectual property for this idea.

 
This is really awesome! What I am seeing here is that the increased high CPU capability is being quickly translated into enhanced AI functionality. It seems clear to me that over the next 5 years there should be a dramatic change in the range of artificial behaviors. If all that is needed is to add CPU that is essentially already baked in. We are now on a countdown to the first wave of AI that should roll out over the next 2-3 years.
 
I also wonder how Google could access so many flops. Wonder whether they might have found a way to use PCs CPUs during searches. How many flops/s would a billion PCs give you?

http://infoproc.blog...l#disqus_thread


  • Casey and Yuli Ban like this

#10
starspawn0

    Member

  • Members
  • 505 posts
As to the units used: petaflop/s-days is one of the standard measures in Deep Learning papers. The seconds don't actually cancel -- it's a rate (petaflops per second) sustained over a number of days, i.e. a total amount of computation.
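Spelled out (assumptions: peta = 1e15, exa = 1e18, one day = 86,400 seconds; the 1 exaflop/s brain is just the quoted commenter's figure):

```python
# Convert petaflop/s-days into raw operation counts.
PFS_DAY_IN_FLOP = 1e15 * 86_400                 # ~8.64e19 floating-point operations
print(f"1 pfs-day ≈ {PFS_DAY_IN_FLOP:.2e} flop")

# Under the assumed 1 exaflop/s brain, one "brain-equivalent day" (BED) is:
BED_IN_FLOP = 1e18 * 86_400                     # ~8.64e22 flop
print(f"1 BED ≈ {BED_IN_FLOP / PFS_DAY_IN_FLOP:,.0f} pfs-days")   # 1,000 pfs-days
```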

As to the rest of it: I am unsure about intelligent behavior rolling out. I just think if things keep going as they have, then we'll soon see lots more big tasks solved -- like video synthesis and robots that act in virtual environments in very complex ways.

Some skeptics on the web have pointed out how some of the big successes achieved with large amounts of compute were quickly superseded by smarter training methods and orders of magnitude less compute. To that I would say that that first big success probably is the reason those better models were found. A big success shows that it's possible to solve the problem using no more than a certain amount of resources; and that then spurs people on to find more efficient solutions. This is exactly what happens in the other sciences. The first solution to a problem is often messy and very complicated; and then later solutions are much cleaner and simpler.

People really underestimate -- or don't even consider -- the problem of risk estimates. Show it won't be a waste of time to pursue a particular approach, and whole teams of people will try it.

Google's Machine Translation work started with them trying a few experiments using comparatively small amounts of data, and small neural nets. Seeing how successful it was, and how the success seemed to scale with the size of the network, they finally chose to run much, much larger experiments -- and were successful. Without that initial signal showing how success scales with compute, they would not have even attempted to go further. They needed to be sure that their efforts wouldn't be a waste of time.

The story is the same with AlphaGo, and many of the other experiments.
  • Casey and Yuli Ban like this

#11
funkervogt

    Member

  • Members
  • 347 posts

Some skeptics on the web have pointed out how some of the big successes achieved with large amounts of compute were quickly superseded by smarter training methods and orders of magnitude less compute. To that I would say that that first big success probably is the reason those better models were found. A big success shows that it's possible to solve the problem using no more than a certain amount of resources; and that then spurs people on to find more efficient solutions. This is exactly what happens in the other sciences. The first solution to a problem is often messy and very complicated; and then later solutions are much cleaner and simpler.

That's the process I've always thought AGI would go through. The first one will be a hot mess, running off some massive server farm with a huge electricity bill, frequently crashing, and with code so long and convoluted that fixing it or even understanding how it works will be nearly impossible. Over time, it will grow more efficient in every way, until there are AGIs that could run on the hardware we have in 2018.


  • Yuli Ban likes this

#12
starspawn0

    Member

  • Members
  • 505 posts

I'd like to add some things to this thread that I neglected to include when I wrote the OP:

First is a Tweet by OpenAI's co-founder Ilya Sutskever (who was a student of Hinton's and worked at Google Brain):

 

https://mobile.twitt...798116593483776
 

fun fact: 2 million years ago, biological evolution reached a similar conclusion re utility of making certain neural networks larger.


He shows a graph of brains getting much larger (and much smarter) over a very short period of time.

I recall he said in a talk recently that when they trained their Dota 2 bots, they saw continued improvement the more compute they threw at it. In fact, the improvement was "exponential" -- like it was getting exponentially smarter -- in the sense that the ELO or equivalent ratings continued to climb linearly with training time. One can interpret this as "exponential improvement", as small changes in the ELO score translate into "exponentially more likely to win" (if I recall correctly, a score of x+100 means you are twice as likely to win against someone with a score of x).
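For reference, here is the standard Elo arithmetic behind that reading (a sketch; whatever rating OpenAI actually used may differ in detail):

```python
# Standard Elo expected-score formula: a fixed rating gap maps to a fixed win
# probability, so a linearly climbing rating means exponentially growing win odds.
def expected_score(r_a, r_b):
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

for gap in (0, 100, 200, 400):
    p = expected_score(1000 + gap, 1000)
    print(f"+{gap:>3} Elo: win prob {p:.2f}, win odds {p / (1 - p):.2f}")
# A +100 gap multiplies the win odds by 10**(100/400) ≈ 1.78 -- close to, if not
# exactly, the "twice as likely" figure recalled above.
```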

Sutskever seems to be a believer in taking simple algorithms people worked out decades ago and simply adding "MOAR compute!"

....

Another interesting thing -- and I recall seeing Sutskever speculate on this before in a talk -- is that there is now some empirical evidence that if you train multiple agents in an environment using Deep Reinforcement Learning (and probably also Neuroevolution), the individual agents learn to model the others. It's like saying that agents naturally evolve a kind of "empathy" or "social understanding":

https://arxiv.org/abs/1805.06020

I recall Sutskever saying that this might be a path to AGI: simply create a sophisticated enough environment, with lots of different agents, and let them evolve to complete certain tasks. The agents will learn to model the intentions of the other agents, and a social mind will emerge naturally.

Perhaps they will acquire a kind of long-term memory with episodic-procedural-declarative-iconic features, reasoning, language, and many other aspects of human intelligence. To be useful to humans, one would have to guide this evolution, so that the AIs learn to speak in English or some other human language, I suppose.

Maybe rudimentary versions of this experiment are being tried right now in the servers at OpenAI...
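As a toy illustration of the bare multi-agent setup described above -- and emphatically not the method of the linked paper, which models other agents explicitly -- here is a sketch of two independent learners whose rewards depend on each other's actions; they end up co-adapting:

```python
import random

# Two independent tabular learners in a repeated 2-action coordination game.
ACTIONS = [0, 1]
PAYOFF = {(0, 0): 1.0, (1, 1): 1.0, (0, 1): 0.0, (1, 0): 0.0}  # reward for matching

q = [[0.0, 0.0], [0.0, 0.0]]   # q[agent][action]: running value estimates
eps, lr = 0.1, 0.1             # exploration rate, learning rate

for step in range(5_000):
    acts = []
    for agent in range(2):
        if random.random() < eps:                        # explore
            acts.append(random.choice(ACTIONS))
        else:                                            # exploit current estimate
            acts.append(max(ACTIONS, key=lambda a: q[agent][a]))
    reward = PAYOFF[(acts[0], acts[1])]
    for agent in range(2):
        a = acts[agent]
        q[agent][a] += lr * (reward - q[agent][a])       # simple running-average update

print(q)   # both agents typically converge on the same action, i.e. they co-adapt
```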


  • Yuli Ban and Alislaws like this

#13
starspawn0

    Member

  • Members
  • 505 posts
Comment I saw on Twitter:

https://twitter.com/...615227018866688
 

As an example, the @Salesforce Research team when given pitches about the compute capabilities of upcoming custom neural hardware had no lack of ideas with what to do with excess compute. Nearly no task is perfectly solid in the face of infinite compute :p


There is no shortage of ideas to try where researchers have at least some confidence that more compute would improve the quality of results.

I also think BCIs will produce mounds of high-value data in a couple years.

So, Deep Learning is not going to go away anytime soon -- it will probably go on for at least 10 more years.
  • Casey and Yuli Ban like this

#14
starspawn0

    Member

  • Members
  • 505 posts
Zaremba (OpenAI) from 2 days ago:

https://mobile.twitt...245233150259201
 

It’s insane that we have observed 5 orders of magnitude increase in compute for AI over last 5 years (x100000). How many more orders is about to come ?


Just think about it: if what is deployed is already sufficient to do so well at Dota 2 in team play, what would 100,000x the compute enable? Surely with some tweaks (to handle sparse rewards) these methods will crack Starcraft 2 in the very near future. What else? See my list of possibilities above.
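A quick consistency check (back-of-envelope only): does a 3.5-month doubling time over 5 years actually give about 5 orders of magnitude?

```python
import math

months = 5 * 12
factor = 2 ** (months / 3.5)                 # 3.5-month doubling over 5 years
print(f"{factor:,.0f}x (~{math.log10(factor):.1f} orders of magnitude)")
# ~145,000x, a bit over 5 orders -- consistent with Zaremba's figure.
```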
  • Yuli Ban likes this

#15
tomasth

    Member

  • Members
  • 66 posts

If Thomas Sterling's speculation [https://www.top500.o...-architectures/] is right, "we can reach yottaflops by 2030".

 

If companies in 2030 only have access to half of that (0.5 × 10^24 flop/s), one day of use works out to 0.5 × 10^9 pfs-days -- a bit over 5 orders of magnitude above AlphaGo Zero, according to the chart.
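Spelled out (the AlphaGo Zero figure is an approximate read of the OpenAI chart, roughly 2e3 pfs-days):

```python
# Rough check of the numbers above (yotta = 1e24, peta = 1e15).
available_flops = 0.5e24                       # half a yottaflop/s machine in 2030
pfs_days_per_day = available_flops / 1e15      # 5e8 pfs-days for one day of use
alphago_zero_pfs_days = 2e3                    # approximate chart value
print(f"{pfs_days_per_day:.1e} pfs-days per day, "
      f"~{pfs_days_per_day / alphago_zero_pfs_days:.1e}x AlphaGo Zero")
# ~2.5e5x, i.e. a bit over 5 orders of magnitude above AlphaGo Zero.
```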


  • Yuli Ban likes this

#16
funkervogt

    Member

  • Members
  • 347 posts

The best estimates of the human brain's computation converge on the tens of petaflops, give or take one order of magnitude. 

 

The world's best supercomputer, "Summit," is capable of 200 petaflops and cost $400 - $600 million to build. 
https://qz.com/13015...etaflop-summit/

 

Depending on whose estimates you choose to accept, supercomputer MIPS/$ increases by an order of magnitude every 5 - 16 years. 

https://aiimpacts.or...t-of-computing/

 

Optimistically, computers that match upper-level estimates of one human brain's computation will cost single-digit millions of dollars to build by 2028. Pessimistically, they won't get that cheap until 2050. 

 

The significance of the "single-digit millions of dollars" benchmark is that it puts computers with the same computational capabilities as a human brain within reach of midsized businesses and second-tier college Computer Science departments, which will allow an enormous amount of experimentation building AGI. Big entities like militaries, spy agencies and Google will collectively have tens or hundreds of thousands of them. 

 

We'll soon reach the point when hardware and high costs will no longer be impediments to building AGI. As a matter of fact, just one order of magnitude improvement in MIPS/$ (e.g. - Summit drops in price from $400 - $600 million to $40 - $60 million) could be enough to enable a "breakout" in AGI research. Google, the DoD, and the Chinese government could easily spare that much money for such a project. 
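The arithmetic behind that 2028/2050 range, spelled out (just restating the post's own assumptions, not an independent forecast):

```python
import math

summit_cost = 500e6                                     # ~$400-600M, take the midpoint
target_cost = 5e6                                       # "single-digit millions"
orders_needed = math.log10(summit_cost / target_cost)   # = 2 orders of magnitude

for label, years_per_order in [("optimistic", 5), ("pessimistic", 16)]:
    print(f"{label}: ~{2018 + orders_needed * years_per_order:.0f}")
# optimistic: ~2028, pessimistic: ~2050
```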



#17
tomasth

    Member

  • Members
  • 66 posts
Given that assumption about the human brain's computation, and the fact that we now have supercomputers with that much computation, we are already at the point where hardware and high costs are no longer impediments to building AGI.

#18
funkervogt

    Member

  • Members
  • 347 posts

Why better, cheaper hardware matters:

 

 

While I think that better, more fully-fleshed-out theories of mind would be helpful, I don’t think he has correctly identified the core reasons why we don’t have human-level AGI yet.

 
The main reason, I think, is simply that our hardware is far weaker than the human brain. It may actually be possible to create human-level AGI on current computer hardware, or even the hardware of five or ten years ago. But the process of experimenting with various proto-AGI approaches on current hardware is very slow, not just because proto-AGI programs run slowly, but because current software tools, engineered to handle the limitations of current hardware, are complex to use.
 
With faster hardware, we could have much easier to use software tools, and could explore AGI ideas much faster. Fortunately, this particular drag on progress toward advanced AGI is rapidly diminishing as computer hardware exponentially progresses.
 
Another reason is an AGI funding situation that’s slowly rising from poor to sub-mediocre. Look at the amount of resources society puts into, say, computer chip design, cancer research, or battery development. AGI gets a teeny tiny fraction of this. Software companies devote hundreds of man-years to creating products like word processors, video games, or operating systems; an AGI is much more complicated than any of these things, yet no AGI project has ever been given nearly the staff and funding level of projects like OS X, Microsoft Word, or World of Warcraft.

http://www.kurzweila...nt-have-agi-yet



#19
starspawn0

    Member

  • Members
  • 505 posts

The situation has changed since Goertzel wrote that opinion -- in fact, the OP is basically saying the exact opposite.  
 
Experimenting with AGI models until you get it right is an incrementalist approach. The way headline-making systems are built now is that there is some low-level experimenting to set parameters and pick the right combination of well-known building blocks, followed by very rapid scaling-up of the idea; the experimental phase is affordable for most labs. AlphaZero, and other systems, came about when people suddenly threw far more compute at the problem -- far above trend. The reason they did this is that they had very good reasons to suspect it would pay off. They probably had a mix of empirical and theoretical evidence. Very strong theoretical evidence for the viability of an AGI approach would cause a rich company to devote very large sums of money to purchasing the needed compute; but wishy-washy treatises (such as Goertzel and Kurzweil have written) would not.

 
The era we are in right now is partly "faith based" -- in the sense that people believe Deep Learning will solve a problem if you just scale up the compute -- and partly risk-analysis based -- i.e. people estimate the risk of failure, and conclude that they can't lose if they magnify the compute 100x.  The combination of these two things is driving people to throw large amounts of compute at AI problems.
 
As to the funding situation: people say that they are working on narrow AI, and use watered-down language to describe what they are doing; but, in reality, there is a lot more AGI research going on than people realize. You could say that Yann LeCun is an AGI researcher, though he would maybe deny it. He has said before that his long-term goal is to build "rat AI" -- AI that works as well as a rat. And commonsense reasoning is a current research topic he is focusing on -- that's really a large part of AGI.
 
....
 
An aside about Goertzel:  the machine learning community used to be merely amused by him; but after the way the Sophia robot has been portrayed (as some kind of sentient AI), their opinion of him has fallen:
 
https://www.facebook...159946457470538
 

More BS from the (human) puppeteers behind Sophia.

Many of the comments would be good fun if they didn't reveal the fact that many people are being deceived into thinking that this (mechanically sophisticated) animatronic puppet is intelligent. It's not. It has no feeling, no opinions, and zero understanding of what it says. It's not hurt. It's a puppet.

In case there is any doubt, let me be totally clear: this tweet was typed by a person who has read my post. No AI whatsoever was involved.

Here is an example of comment to the tweet (there are many like it): "Don’t take it personal Sophia. Humans like @ylecun and many others make such remarks out of ignorance. I love you, Sophia."

People are being deceived.
This is hurtful.


Others think SingularityNet is similarly dishonest:

https://twitter.com/...437022616879104

Read Delip Rao's Twitter thread. Delip was a scientist and engineer at Amazon a few years back, and helped usher in the first iteration of Alexa. He is also a consultant who helps startups with their AI (I think that's what he does now, anyway).

My opinion of Goertzel is: he is a smart guy, who is basically good; but blinded by ambition. He wants AGI to come to be so badly that he will do anything. He also seems to have definite ideas about how to do it (as does his friend Peter Voss) -- but I don't think any of them will be successful. It's not going to come from some super-high-IQ libertarian holed-up somewhere in the mountains (not suggesting Goertzel is libertarian, but that's the profile of the kind of person people like to throw money at; it's almost like a Bene Gesserit mystery religion crafted decades ago by Ayn Rand, that pays off here in the present, if you say the right things); it's going to come from people with strong connections to the academic community. SingularityNet DOES seem to have a few academics behind the project; but it's still very fringe.

I suppose there are much dumber ways to throw one's money down the toilet (like gambling).






