
24/7 Media [Possibly by 2025?]

media synthesis, synthetic media, 24/7, artificial intelligence, GAN, deep learning, 2025, 2030

3 replies to this topic

#1
Yuli Ban

    Born Again Singularitarian

  • Moderators
  • 21,076 posts
  • Location: New Orleans, LA

This concept started in funkervogt's thread

Far Beyond DADABots | The never-ending movies of tomorrow
 
I've decided to expand on it further by explaining what we need here: https://www.reddit.c...he_neverending/
 
Here's what we need to make a rudimentary 24/7 movie:

  • Novel video synthesis. By this, I mean "a generative network produces full-motion video that is not directly based on an existing piece of data." That excludes deepfakes: they work by transferring one face to another. That excludes style transfer: making a pre-existing video look like a Van Gogh painting or pixel art doesn't count. It has to be novel, like ThisPersonDoesNotExist is for human faces. As far as I know, novel video synthesis remains at least a few good papers away. Needs another year or two.
  • Text-to-image and text-to-video synthesis. We have text-to-image (TTI) models, but they are rudimentary indeed, and text-to-video synthesis is utterly experimental at best. It might be best described as "where novel image synthesis was in 2014" (back when GANs generated fuzzy, ethereal black-and-white images of human faces, a very far cry from ThisPersonDoesNotExist). Might need two or more years.
  • Superior natural language generation abilities. NLG is actually quite a bit more advanced than some people presume. Networks like Transformer-LM, XLNet, and Baidu's ERNIE excel at semantic sentence-pair understanding, showing that they can derive meaning and understanding from at least a short paragraph of text. GPT-2 scores around 70% on the Winograd Schema Challenge (which tests an AI's capacity for commonsense reasoning; humans reliably score 92% to 95%), and Baidu's latest ERNIE model scores 90.1%. This is fantastic evidence of commonsense reasoning in one area of natural language processing, and it tells me that SOTA language models can indeed generate text that makes sense. Of course, the Winograd Schema Challenge is more about working out what a sentence means when its referent is not immediately clear (still a skill essential to proper NLU), so being as good as a human at resolving a confusing sentence's ambiguous subject isn't going to lead to perfectly coherent scripts tomorrow. What's more, I don't believe the SOTA models are available for public use the way GPT-2 is (a minimal GPT-2 sketch follows after this list). But that's beside the point, because we're discussing what ought to be possible a few years from now. Capable of coherent scripts, as long as you're referring to SOTA natural language models.
  • Audio synthesis. We're already capable of generating speech that almost perfectly matches a human, and we can generate raw waveforms for music as well (that is to say, computers can 'imagine' the sounds of instruments rather than play MIDI files). With further work, text-to-speech ought to reach a level that's close to indistinguishable from natural speech. This is all possible today.
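 
As an aside on that NLG point: the released GPT-2 weights really are publicly usable. Here is a minimal, purely illustrative sketch of continuing a script-like prompt, assuming the Hugging Face transformers package; the model size, prompt, and sampling settings below are placeholders, not anything from the thread:

# Minimal sketch: continue a script-like prompt with the publicly released GPT-2 weights.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# A script-like prompt; anything here is purely illustrative.
prompt = "INT. DINER - NIGHT. A man in a gray coat sits alone and says:"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sampled continuation; top-k / top-p sampling keeps the text varied but mostly on-topic.
output_ids = model.generate(
    input_ids,
    max_length=120,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))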

Of course, for the first 24/7 movies, we won't need scripts that are necessarily coherent, nor will we need video synthesis networks that can generate an infinite amount of detail. What I can foresee is something like a YouTube stream run by a generative adversarial network with some simple instructions: "take this endlessly-prompted script and generate video from it." It might only use the last couple of sentences of the script as the prompt for the next generated part of the script, which will greatly reduce its long-term coherency. However, it will still function (a rough sketch of this loop follows below).
This, I can absolutely see being done by 2022 at the latest. We're but a few papers away from a team demonstrating this live.
And yes, it will definitely be surreal and likely overly literal. The novel video generator might also break on ambiguous phrases like "the man takes off."
By 2025, considering the rate at which compute is increasing (which means larger models trained on more data, which means greater accuracy and more competent outputs), it would be bizarre if we couldn't do a surrealist "indie" movie.
And yes, I will hold to the claim that it will become coherent by 2030.
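 
For concreteness, here is a rough sketch of the loop described above: keep only the last couple of sentences of the script as the prompt for the next segment, and hand each segment to a text-to-video model. Note that generate_text could be GPT-2 as sketched earlier, while text_to_video and stream_clip are hypothetical placeholders, since no such public model or streaming hook exists yet:

# Sketch of the 24/7 loop: a short sliding window of script text prompts the next segment.
import re
from collections import deque

CONTEXT_SENTENCES = 3   # tiny context window, hence the weak long-term coherency noted above

def run_forever(generate_text, text_to_video, stream_clip, seed="A man walks into a diner."):
    # Keep only the most recent few sentences of the endless script as the next prompt.
    recent = deque(re.split(r"(?<=[.!?])\s+", seed), maxlen=CONTEXT_SENTENCES)
    while True:
        prompt = " ".join(recent)
        segment = generate_text(prompt)    # e.g. the GPT-2 sketch above
        clip = text_to_video(segment)      # hypothetical: no such public model exists yet
        stream_clip(clip)                  # hypothetical: push the rendered clip to the stream
        for sentence in re.split(r"(?<=[.!?])\s+", segment):
            if sentence:
                recent.append(sentence)    # slide the window forward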

 

 

One of the guys from DADAbots commented and revealed that they were doing something like a really rudimentary version of this: 


And remember my friend, future events such as these will affect you in the future.


#2
Erowind

    Anarchist without an adjective

  • Members
  • 1,207 posts

The first person to Twitch-stream an entertaining enough model has the potential to get as big as TwitchPlaysPokemon did, especially if they could figure out some chat integration where chat could prompt the model with new datasets.
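 
Purely as an illustrative sketch of that chat-integration idea: Twitch chat is plain IRC, so collecting candidate prompts from viewers could look roughly like this (the nick, channel, and OAuth token below are made-up placeholders, not a real bot):

# Read Twitch chat over IRC and yield each viewer message as a candidate prompt.
import socket

HOST, PORT = "irc.chat.twitch.tv", 6667
NICK, CHANNEL = "eternal_movie_bot", "#eternal_movie"   # hypothetical account and channel
TOKEN = "oauth:..."                                     # a real Twitch OAuth token goes here

def read_chat_prompts():
    # Connect and join the channel over Twitch's IRC interface.
    sock = socket.create_connection((HOST, PORT))
    sock.sendall(f"PASS {TOKEN}\r\nNICK {NICK}\r\nJOIN {CHANNEL}\r\n".encode())
    buffer = ""
    while True:
        buffer += sock.recv(2048).decode("utf-8", errors="ignore")
        *lines, buffer = buffer.split("\r\n")
        for line in lines:
            if line.startswith("PING"):
                sock.sendall(b"PONG :tmi.twitch.tv\r\n")  # keep the connection alive
            elif "PRIVMSG" in line:
                yield line.split(":", 2)[-1]              # the viewer's message text

Each yielded message is only a candidate prompt; how the movie actually uses it is a separate problem.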



#3
Alislaws

    Democratic Socialist Materialist

  • Members
  • 2,103 posts
  • Location: London

Quote (Erowind): "The first person to Twitch-stream an entertaining enough model has the potential to get as big as TwitchPlaysPokemon did, especially if they could figure out some chat integration where chat could prompt the model with new datasets."

That was my first thought as well: if you could hook the chat or comments on the stream into the eternal movie to influence it, that would be hilarious.

 

So you turn up and there's some very odd and disjointed war-movie thing going on, so you get a bunch of friends to start talking about Santa in the chat, and before long Santa shows up in the movie, though how and where would depend on the AI and would very likely not make sense.

 

You'd need some filter to stop people just typing "boobs" in the chat a million times until your incredible pioneering AI-driven eternal movie is reduced to a synthesised compilation of imaginary topless women. Possibly all with swastika tattoos, if past internet-influenced AI experiments are anything to go by.
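 
A crude, illustrative sketch of that kind of filter (the blocklist and thresholds are made up): drop blocklisted words and cap how much weight one repeated message can carry, so a spammed phrase can't dominate the prompt:

# Pick the most popular chat message that survives a blocklist, with repeats capped.
from collections import Counter

BLOCKLIST = {"boobs"}        # illustrative only; extend as needed
MAX_REPEATS = 3              # one phrase counts at most this many times per voting window

def pick_prompt(chat_messages):
    counts = Counter()
    for msg in chat_messages:
        text = msg.strip().lower()
        if text and not any(word in text for word in BLOCKLIST):
            counts[text] += 1
    if not counts:
        return None
    # Cap repeats so spamming the same line a million times adds no extra weight.
    capped = {text: min(n, MAX_REPEATS) for text, n in counts.items()}
    return max(capped, key=capped.get)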



#4
Yuli Ban

    Born Again Singularitarian

  • Moderators
  • 21,076 posts
  • Location: New Orleans, LA

Another thing: novel video synthesis already exists.

 

[Two example GIFs of GAN-generated video clips]

 

But that's from summer 2019. It reminds me of image synthesis from around 2015.

I can absolutely see a "This Gif Does Not Exist" later this year, but it'll take much larger training sets and more powerful models before we get something passable and capable of being done 24/7.


And remember my friend, future events such as these will affect you in the future.





