Abstract
Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs. Adapting this approach to 3D synthesis would require large-scale datasets of labeled 3D assets and efficient architectures for denoising 3D data, neither of which currently exist. In this work, we circumvent these limitations by using a pretrained 2D text-to-image diffusion model to perform text-to-3D synthesis. We introduce a loss based on probability density distillation that enables the use of a 2D diffusion model as a prior for optimization of a parametric image generator. Using this loss in a DeepDream-like procedure, we optimize a randomly-initialized 3D model (a Neural Radiance Field, or NeRF) via gradient descent such that its 2D renderings from random angles achieve a low loss. The resulting 3D model of the given text can be viewed from any angle, relit by arbitrary illumination, or composited into any 3D environment. Our approach requires no 3D training data and no modifications to the image diffusion model, demonstrating the effectiveness of pretrained image diffusion models as priors.
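The optimization loop the abstract describes (later named Score Distillation Sampling) can be sketched on a toy problem. Everything below is illustrative, not DreamFusion's actual setup: the "pretrained diffusion model" is a hypothetical closed-form denoiser whose noise prediction pulls toward a fixed target, the "rendering" is a 1-D vector optimized directly instead of a NeRF, and the σ²(t) weighting is just one common choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "target" the toy diffusion model believes in;
# stands in for what a text-conditioned model would score highly.
TARGET = np.full(16, 0.5)

def alpha_sigma(t):
    # Simple cosine-style noise schedule on t in (0, 1),
    # with alpha^2 + sigma^2 = 1.
    return np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)

def eps_hat(x_t, t):
    # Stand-in for the frozen diffusion model's noise prediction:
    # the optimal denoiser if all probability mass sat on TARGET.
    alpha, sigma = alpha_sigma(t)
    return (x_t - alpha * TARGET) / sigma

def sds_grad(x, n_samples=64):
    # Score distillation: grad ~ E_{t,eps}[ w(t) * (eps_hat(x_t, t) - eps) ],
    # with the model's Jacobian omitted. Here dx/dtheta is the identity
    # because theta parameterizes the "rendering" x directly.
    g = np.zeros_like(x)
    for _ in range(n_samples):
        t = rng.uniform(0.02, 0.98)
        eps = rng.standard_normal(x.shape)
        alpha, sigma = alpha_sigma(t)
        x_t = alpha * x + sigma * eps       # forward-diffuse the rendering
        w = sigma ** 2                      # one common weighting choice
        g += w * (eps_hat(x_t, t) - eps)
    return g / n_samples

# Optimize a randomly initialized "rendering" by gradient descent,
# analogous to optimizing NeRF parameters so renderings score well.
x = rng.standard_normal(16)
for _ in range(200):
    x -= 0.1 * sds_grad(x)
```

Under these toy assumptions the update reduces to pulling `x` toward `TARGET`; in the real method the gradient instead flows through differentiable volume rendering into the NeRF's weights, while the diffusion model stays frozen.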
Synthetic Media & Generative AI News and Discussions
Re: Synthetic Media & Deepfakes News and Discussions
And remember my friend, future events such as these will affect you in the future
Re: Synthetic Media & Deepfakes News and Discussions
Big corporations like Meta basically HAVE to show off this tech using either "cute animals wearing funny clothes" or "landscapes and abstract objects" because they don't want to invite controversy early by showing off humans, because journalists are watching this technology like a hawk, primed to ask the question "But what about the potential for abuse?" Which to be fair, is a good question to ask.
Alas, wait until Stability releases THEIR text to video AI in the coming months to see a less filtered version.
Re: Synthetic Media & Deepfakes News and Discussions
^ I would actually like to see "abstract" prompts.
Re: Synthetic Media & Deepfakes News and Discussions
HOLY SH....OOZIES.
"Imagen Video": Google announces video version of Imagen (Ho et al 2022)
I AM SPEECHLESS
Abstract
We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models. We describe how we scale up the system as a high definition text-to-video model including design decisions such as the choice of fully-convolutional temporal and spatial super-resolution models at certain resolutions, and the choice of the v-parameterization of diffusion models. In addition, we confirm and transfer findings from previous work on diffusion-based image generation to the video generation setting. Finally, we apply progressive distillation to our video models with classifier-free guidance for fast, high quality sampling. We find Imagen Video not only capable of generating videos of high fidelity, but also having a high degree of controllability and world knowledge, including the ability to generate diverse videos and text animations in various artistic styles and with 3D object understanding.
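The v-parameterization the abstract mentions comes from Salimans & Ho's progressive distillation work: instead of predicting the noise ε or the clean sample x₀, the network predicts v = αₜ·ε − σₜ·x₀, from which both can be recovered. The identities can be checked numerically; the cosine schedule below is an illustrative assumption, not Imagen Video's actual schedule.

```python
import numpy as np

def v_target(x0, eps, alpha, sigma):
    # v-parameterization target: v = alpha_t * eps - sigma_t * x0,
    # assuming a variance-preserving schedule with alpha^2 + sigma^2 = 1.
    return alpha * eps - sigma * x0

def recover_x0_eps(x_t, v, alpha, sigma):
    # Inverting the definitions gives:
    #   x0  = alpha * x_t - sigma * v
    #   eps = sigma * x_t + alpha * v
    return alpha * x_t - sigma * v, sigma * x_t + alpha * v

rng = np.random.default_rng(1)
x0 = rng.standard_normal(8)           # toy "clean" sample
eps = rng.standard_normal(8)          # Gaussian noise
t = 0.3
alpha, sigma = np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)

x_t = alpha * x0 + sigma * eps        # noisy sample at timestep t
v = v_target(x0, eps, alpha, sigma)
x0_rec, eps_rec = recover_x0_eps(x_t, v, alpha, sigma)
# x0_rec and eps_rec match x0 and eps exactly (up to float error)
```

One reason this parameterization helps in cascades like Imagen Video: at high noise levels an ε-prediction carries little information about x₀, whereas v interpolates between predicting noise and predicting the sample, which stabilizes distillation and super-resolution stages.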
Re: Synthetic Media & Deepfakes News and Discussions
https://imagen.research.google/video/hdvideos/4.mp4
Yuli Ban wrote: ↑Wed Oct 05, 2022 6:23 pm HOLY SH....OOZIES.
"Imagen Video": Google announces video version of Imagen (Ho et al 2022)
I AM SPEECHLESS
https://imagen.research.google/video/hdvideos/5.mp4
https://imagen.research.google/video/hdvideos/9.mp4
https://imagen.research.google/video/hdvideos/46.mp4
https://imagen.research.google/video/hdvideos/50.mp4
Re: Synthetic Media & Deepfakes News and Discussions
So if I got it right:
- Make-A-Video: Best image quality
- Phenaki: Longest video length
- Imagen Video: Can render legible text/phrases and text animations within its videos