
GPT-3 as Proto-AGI [Or at least AXI]

artificial intelligence deep learning artificial neural network OpenAI GPT-2 GPT-3 AGI AXI NLP language model

5 replies to this topic

Yuli Ban


    Born Again Singularitarian

  • Moderators
  • 21,025 posts
  • Location: New Orleans, LA

I recently came across this brief LessWrong discussion:

What should we expect from GPT-3?

When it will appear? (My guess is 2020).
Will it be created by OpenAI and will it be advertised? (My guess is that it will not be publicly known until 2021, but other companies may create open versions before it.)
How much data will be used for its training and what type of data? (My guess is 400 GB of text plus illustrating pictures, but not audio and video.)
What it will be able to do? (My guess: translation, picture generation based on text, text generation based on pictures – with 70 per cent of human performance.)
How many parameters will be in the model? (My guess is 100 billion to trillion.)
How much compute will be used for training? (No idea.)

At first, I'd have been skeptical. But then Starspawn0 brought this to my attention:
GPT-2 trained on ASCII-art appears to have learned how to draw Pokemon characters— and perhaps it has even acquired some rudimentary visual/spatial understanding
The guy behind this actually commented on the /r/MediaSynthesis post:

OMG I forgot I never did do a blog writeup for this. But this person almost did it for me lol.
https://iforcedabot....o-draw-pokemon/ just links to my tweets. Need more time in my life.
This whole thing started because I wanted to make *movies* with GPT-2, but I really wanted color and full pictures, so I figured I should start with pictures and see if it did anything at all. I wanted the movie 'frames' to have the subtitles in the frame, and I really wanted the same model to draw both the text and the picture so that they could at least in theory be related to each other. I'm still not sure how to go about turning it into a full movie, but it's on the list of things to try if I get time.

I think for movies, I would need a much smaller and more abstract ASCII representation, which makes it hard to get training material. It would have to be like, a few single ASCII letters moving across the screen. I could convert every frame from a movie like I did the pokemon but it would be absolutely huge -- a single Pokemon can use a LOT of tokens, many use up more than the 1024 token limit even (generated over multiple samples, by feeding the output back in as the prompt.)
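The trick mentioned above (going past the 1024-token limit by feeding the output back in as the prompt) can be sketched roughly like this. This is only an illustration under assumptions: `generate_long`, `sample`, `keep`, and `toy_sample` are hypothetical names, and `toy_sample` is a stand-in that just counts upward, not a real GPT-2 call.

```python
from typing import Callable, List


def generate_long(
    sample: Callable[[List[int], int], List[int]],
    prompt: List[int],
    total_tokens: int,
    context_limit: int = 1024,
    keep: int = 256,
) -> List[int]:
    """Produce `total_tokens` new tokens in several samples.

    Each sample sees only the last `keep` tokens produced so far, so the
    prompt plus the new chunk never exceeds `context_limit`.
    """
    if not 0 < keep < context_limit:
        raise ValueError("keep must be between 1 and context_limit - 1")
    output = list(prompt)
    produced = 0
    while produced < total_tokens:
        context = output[-keep:]  # tail of the output becomes the next prompt
        budget = min(context_limit - len(context), total_tokens - produced)
        chunk = sample(context, budget)
        if not chunk:  # the model stopped early; avoid looping forever
            break
        output.extend(chunk)
        produced += len(chunk)
    return output[len(prompt):]


def toy_sample(context: List[int], n: int) -> List[int]:
    """Hypothetical stand-in for a model: continues counting from the last token."""
    start = context[-1]
    return [start + i + 1 for i in range(n)]


result = generate_long(toy_sample, [0], 10, context_limit=4, keep=2)
print(result)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The catch is that anything older than the last `keep` tokens is invisible to the next sample, which is presumably why a big ASCII picture can drift as it gets longer.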

Finally, I've also heard that GPT-2 is easily capable of generating code or anything text-based, really. It's NLP's ImageNet moment.
This made me think.

"Could GPT-2 be used to write music?"
If it were trained on enough data, it would gain a rough understanding of how melodies work and could then be used to generate the skeleton for music. It already knows how to generate lyrics and poems, so the "songwriting" aspect is not beyond it. 
But if I fed enough sheet music into it, then theoretically it ought to create new music as well. It would even theoretically be able to output that music, at least in the form of MIDI files (generating a raw waveform is possible in principle, but far beyond its current abilities). 
And once I thought of this, I realized that GPT-2 is essentially a very, very rudimentary proto-AGI. It's just a language model, yes, but that brings quite a bit with it. If you understand natural language, you can meaningfully create data— and data & maths is just another language. If GPT-2 can generate binary well enough, it can theoretically generate anything that can be seen on the internet. 
But GPT-2 is too weak. Even GPT-2 Large. What we'd need to put this theory to the test is the next generation: GPT-3. 
This theoretical GPT-3 is GPT-2 + much more data. 
Now when I say that it's a proto-AGI, I don't mean to say that it's part of a spectrum that will lead to AGI with enough data. I only use "proto-AGI" because my created term, "artificial expert intelligence", never took off and thus most people have no idea what that is. 
But "artificial expert intelligence" or AXI is exactly what GPT-2 is and a theoretical GPT-3 would be. 

Artificial Expert Intelligence: Artificial expert intelligence (AXI), sometimes referred to as “less-narrow AI”, refers to software that is capable of accomplishing multiple tasks in a relatively narrow field. This type of AI is new, having become possible only in the past five years due to parallel computing and deep neural networks.

At the time I wrote that, the only AI I could think of that qualified was DeepMind's AlphaZero, which I was never fully comfortable with, but the more I learn about GPT-2, the more it feels like the "real deal." 
An AXI would be a network that works much like GPT-2/GPT-3, using a root capability (like NLP) to do a variety of tasks. GPT-3 may be able to generate images and MIDI files, something it wasn't explicitly made to do and which sounds like an expansion beyond merely predicting the next word in a sequence (even though that's still fundamentally what it does). More importantly, there ought to still be limitations. You couldn't use GPT-2 for tasks completely unrelated to natural language processing, like predicting protein folding for example, and it will never gain its own agency. In that regard, it's not AGI and never will be; AGI is something even further beyond it. 
It's like the difference between a line (ANI), a square (AXI), and a tesseract (AGI). 
GPT-2 is "weak AXI" since nothing it does comes close to human-level competence at tasks (not even the full version). GPT-3 might become par-human at a few certain things, like holding short conversations or generating passages of text. It will be so convincing that it will start freaking people out and make some wonder if OpenAI has actually done it. A /r/SubSimulatorGPT3 would be virtually indistinguishable from an actual subreddit, with very few oddities and glitches. It will be the first time that a neural network is doing magic, rather than the programmers behind it being so amazingly competent. And it may even be the first time that some seriously consider AGI as a possibility for the near future.


Who knows! Maybe if GPT-2 had the entire internet as its parameters, it would be AGI as well as the internet becoming intelligent. But at the moment, I'll stick to what we know it can do and its likely abilities in the near future. 


I suppose one reason why it's also hard to gauge just how capable GPT-2 Large is comes down to the fact that so few people have access to it. One guy remade it, but he decided not to release it. As far as I can tell, it's just because he talked with OpenAI and some others and decided to respect their decision, rather than for some more romantic reason (i.e. "he saw just how powerful GPT-2 really was"). And even if he had released it, it was apparently "significantly worse" than OpenAI's original network (his 1.5 billion parameter version was apparently weaker than OpenAI's 117 million parameter version). So for right now, only OpenAI and whomever they shared the original network with know the full scope of GPT-2's abilities, however far or limited they really are. We can only guess based on GPT-2 Small and GPT-2 Medium. 



Nevertheless, I can at least confidently state that GPT-2 is the most general AI on the planet at the moment (as far as we know). There are very good reasons for people to be afraid of it, though they're all because of humans rather than the AI itself. And I, for one, am extremely excited to see where this goes while also being amazed that we've come this far. 

And remember my friend, future events such as these will affect you in the future.




  • Members
  • 1,371 posts

The estimates from Turchin (a Russian transhumanist) aren't unreasonable, given that Brockman publicly said OpenAI wants to scale up GPT-2 by a few orders of magnitude, near-term, using more data, and given how low the training and compute requirements of GPT-2 are (meaning there is room to scale).


As impressive as the outputs of these models are, however, they are still going to be pretty limited.  Basically, they have a time-limit to how much "thinking" they can do per output.  In a conversation, humans are similarly limited -- we can't say "give me an hour to think on it" in the middle of a conversation, then take that hour, and resume speaking.  


The model also has text-length limits -- though, presumably that can be fixed with a better model.


Finally, there are limits to what is learnable, given training time and example constraints.  A complicated algorithm with no clues how it works is going to take a very long time to learn.  Language, fortunately, is hierarchically-organized, which makes learning easier:  you can first learn how word pairs co-occur.  Then you can learn how combinations of words combine to form a sentence.  Then, how sentences combine to form a paragraph -- and paragraphs to form stories.  For each level, there are statistical clues in the corpus indicating the unwritten rules of composition; and these, the machine learning algorithms can pick up on.
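To make the lowest level of that hierarchy concrete (learning how word pairs co-occur), here is a minimal sketch. It is a toy bigram counter, not how GPT-2 actually works; the function names and the one-line corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def bigram_counts(text):
    """Count how often each word is immediately followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for first, second in zip(words, words[1:]):
        counts[first][second] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never followed."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat slept"
counts = bigram_counts(corpus)
print(most_likely_next(counts, "the"))  # cat
```

Even this crude statistic picks up a "rule" from the corpus without being told one; the higher levels (sentences, paragraphs, stories) need far more context than a single preceding word, which is where the larger models come in.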


Brain data will probably allow us to go much further.  Those hard-to-learn algorithms the brain uses will be "exposed" in the brain scan data.  Perhaps I will do another post on this.




  • Members
  • 43 posts

Who knows! Maybe if GPT-2 had the entire internet as its parameters, it would be AGI as well as the internet becoming intelligent.

Uh... are you sure you'd want that?





Let's not forget Tay.




  • Members
  • 255 posts
Even brain-trained models will be far from human-level competence, even after covering more areas and getting bigger training runs, right up until they're way past it. There's no alarm clock, as Eliezer Yudkowsky says.

Yuli Ban


I almost forgot: here's a tighter, edited version:

GPT-3 as Proto-AGI (or AXI)

What exactly should GPT-3 be able to do? That, I cannot answer because I’m not fully aware of the full breadth of GPT-2, but the knowledge that it and MuseNet are fundamentally the same network trained on different data sets suggests to me that a theoretical 100B parameter version ought to be able to do at least the following:

  • Reach roughly 90% accuracy on either the Winograd Schema Challenge or the WNLI
  • Generate upwards of 1,000 to 2,000 words of coherent, logical text based on a short prompt
  • Increase the accuracy of its output by adding linked resources from which it can immediately draw/spin/summarize
  • Generate extended musical pieces
  • Generate low-resolution images, perhaps even short gifs
  • Translate between languages, perhaps even figuring out context better than Google Translate
  • Understand basic arithmetic
  • Generate usable code
  • Caption images based on the data presented
  • Generate waveforms rather than just MIDIs
  • Gain a rudimentary understanding of narrative (i.e. A > B > C)

All this and perhaps even more from a single network. Though it’s probable we’ll get more specialized versions (like MuseNet), the basic thing will be a real treat.

I myself don’t understand the specifics, so I can’t say that GPT-X will be able to use language modeling to learn how to play an Atari video game, but I can predict that it may be able to create an Atari-tier video game some time next decade. Any data-based tasks can be automated by an agent such as GPT-X, and this includes things like entertainment and news. It’s the purest form of “synthetic media”.



When I say that we'll get more specialized versions, I'm referring to transformers in general. GPT-2 and MuseNet aren't the same thing; they just use the same architecture. 


Yuli Ban


This is INSANE. This throwaway theory recently got a massive kick in the rear!

"A Very Unlikely Chess Game" [GPT-2 for generating chess transcripts]

Black is GPT-2. Its excuse is that it’s a text prediction program with no concept of chess. As far as it knows, it’s trying to predict short alphanumeric strings like “e2e4” or “Nb7”. Nobody told it this represents a board game. It doesn’t even have a concept of 2D space that it could use to understand such a claim. But it still captured my rook! Embarrassing!
Backing up: last year, I wrote GPT-2 As Step Toward General Intelligence, where I argued that the program wasn’t just an essay generator, it was also kind of a general pattern-recognition program with text-based input and output channels. Figure out how to reduce a problem to text, and you can make it do all kinds of unexpected things.
Friend-of-the-blog Gwern Branwen has been testing the limits of this idea.
Last month, I asked him if he thought GPT-2 could play chess. I wondered if he could train it on a corpus of chess games written in standard notation (where, for example, e2e4 means “move the pawn at square e2 to square e4”). There are literally millions of games written up like this. GPT-2 would learn to predict the next string of text, which would correspond to the next move in the chess game. Then you would prompt it with a chessboard up to a certain point, and it would predict how the chess masters who had produced its training data would continue the game – ie make its next move using the same heuristics they would.
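The "chess as text prediction" idea described in the quote can be sketched with a much cruder model. To be clear, this is not GPT-2 and not what was actually done: it is a toy frequency table over the last couple of moves, the four games are invented for illustration, and `train` and `predict_next` are hypothetical names.

```python
from collections import Counter, defaultdict


def train(games, context=2):
    """Count which move follows each recent-move history (of length 0..context)."""
    model = defaultdict(Counter)
    for game in games:
        moves = game.split()
        for i, move in enumerate(moves):
            for k in range(min(i, context) + 1):
                history = tuple(moves[i - k:i])
                model[history][move] += 1
    return model


def predict_next(model, moves_so_far, context=2):
    """Predict the next move string, backing off to shorter histories if needed."""
    moves = moves_so_far.split()
    history = tuple(moves[max(0, len(moves) - context):])
    while True:
        if history in model:
            return model[history].most_common(1)[0][0]
        if not history:
            return None
        history = history[1:]  # drop the oldest move and try again


# Invented games in coordinate notation (e2e4 = piece on e2 moves to e4).
games = [
    "e2e4 e7e5 g1f3 b8c6",
    "e2e4 e7e5 g1f3 g8f6",
    "e2e4 c7c5 g1f3 d7d6",
    "d2d4 d7d5 c2c4 e7e6",
]
model = train(games)
print(predict_next(model, "e2e4 e7e5"))  # g1f3
```

Like the setup in the quote, nothing here knows the strings are chess moves, so it will happily predict an illegal move if the statistics point that way. The surprising part of the GPT-2 result is how far plain sequence prediction gets with a vastly bigger model and millions of real games.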


What the fuck!!


so I can’t say that GPT-X will be able to use language modeling to learn how to play an Atari video game

Holy shit! I was actually WRONG here! Sort of. It's only chess, not a video game. But still!!!!

And you can even play against GPT-2. It's not particularly good, but the fact that this natural language generator can LEARN TO PLAY A GAME, something it DEFINITELY wasn't programmed to do is... just wow.

