Google AI and DeepMind News and Discussions

MelanieWi1l · Post by **MelanieWi1l** » Tue May 25, 2021 1:09 pm

Performance and generalization, yes, but a very important aspect is also HOW it does things. So here are some additional details about what the paper covers:
Those AlphaGo-like agents used to have an external game simulator (to process the move they actually commit to playing) and an internal simulator (to process the moves they explore during search when deciding on the best option). A game simulator does several things:
-it prevents you from playing illegal moves.
-it gives you the rules' response to a move, for example removing the stones your move actually captures.
-it gives you a terminal status: "this is a win, loss, or draw state".
The breakthrough of MuZero is that it is given NO internal simulator of the game. It has to learn its own representation of the game state and of the game dynamics for "playing in its head during search" (it still has an external simulator to process the moves it commits to playing, which is therefore called only once per move in a training game).
So basically MuZero could:
-read illegal moves as a possibility.
-process the dynamics of its move badly (for example, it could forget to capture the stones its move actually captures, as long as this happens in its head during search).
-miss that the game has ended (keep reading past a terminal state of the game).
So it's basically what every beginner in chess goes through at the beginning of their learning process. Yet it still manages to learn a near-perfect, optimized model of the game, and uses it to master the game itself.
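
To make the contrast concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the toy counting game, the class names `RealSimulator` and `LearnedModel`, and the helper `imagine` are not from the paper, and the "learned" functions are hard-coded placeholders rather than trained networks. The only thing borrowed from MuZero is the split into a representation, a dynamics and a prediction function.

```python
# Toy game (hypothetical): a move adds 1 or 2 to a counter; reaching 5 ends the game.

class RealSimulator:
    """External simulator: it knows the rules exactly."""
    TARGET = 5

    def legal_moves(self, state):
        return [m for m in (1, 2) if state + m <= self.TARGET]

    def step(self, state, move):
        # Refuses illegal moves and reports whether the game is over.
        if move not in self.legal_moves(state):
            raise ValueError("illegal move")
        nxt = state + move
        return nxt, nxt == self.TARGET


class LearnedModel:
    """Stand-in for the functions MuZero learns; placeholders only, nothing is trained."""

    def represent(self, observation):
        # observation -> hidden state
        return float(observation)

    def dynamics(self, hidden, move):
        # (hidden state, move) -> (next hidden state, reward)
        # Note: no legality check and no terminal flag.
        return hidden + move, 0.0

    def predict(self, hidden):
        # hidden state -> (policy, value)
        return {1: 0.5, 2: 0.5}, 0.0


def imagine(model, observation, moves):
    """Roll a sequence of moves forward 'in the agent's head'."""
    hidden = model.represent(observation)
    for move in moves:
        hidden, _reward = model.dynamics(hidden, move)
    return hidden


if __name__ == "__main__":
    sim = RealSimulator()
    try:
        sim.step(4, 2)  # would overshoot the target
    except ValueError as err:
        print("real simulator:", err)
    # The model happily reads past the end of the game.
    print("imagined state:", imagine(LearnedModel(), 4, [2, 2]))
```

The real simulator rejects the overshooting move, while the imagined rollout accepts it and keeps going, which is exactly the kind of mistake listed above. In training, the learned functions would be pushed toward agreeing with what the real game actually did.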

Yuli Ban · Post by **Yuli Ban** » Mon Jun 28, 2021 9:34 pm

DeepMind AGI paper adds urgency to ethical AI
(An overview of where we are with AI in June 2021)

We are not ready for artificial general intelligence

Despite assurances from stalwarts that AGI will benefit all of humanity, there are already real problems with today’s single-purpose narrow AI algorithms that call this assumption into question. According to a Harvard Business Review story, when AI examples from predictive policing to automated credit scoring algorithms go unchecked, they represent a serious threat to our society. A recently published survey by Pew Research of technology innovators, developers, business and policy leaders, researchers, and activists reveals skepticism that ethical AI principles will be widely implemented by 2030. This is due to a widespread belief that businesses will prioritize profits and governments will continue to surveil and control their populations. If it is so difficult to enable transparency, eliminate bias, and ensure the ethical use of today’s narrow AI, then the potential for unintended consequences from AGI appears astronomical.
And that concern is just for the actual functioning of the AI. The political and economic impacts of AI could result in a range of possible outcomes, from a post-scarcity utopia to a feudal dystopia. It is possible, too, that both extremes could co-exist. For instance, if wealth generated by AI is distributed throughout society, this could contribute to the utopian vision. However, we have seen that AI concentrates power, with a relatively small number of companies controlling the technology. The concentration of power sets the stage for the feudal dystopia.

Perhaps less time than thought

The DeepMind paper describes how AGI could be achieved. Getting there is still some ways away, from 20 years to forever, depending on the estimate, although recent advances suggest the timeline will be at the shorter end of this spectrum and possibly even sooner. I argued last year that GPT-3 from OpenAI has moved AI into a twilight zone, an area between narrow and general AI. GPT-3 is capable of many different tasks with no additional training, able to produce compelling narratives, generate computer code, autocomplete images, translate between languages, and perform math calculations, among other feats, including some its creators did not plan. This apparent multifunctional capability does not sound much like the definition of narrow AI. Indeed, it is much more general in function.
Even so, today’s deep-learning algorithms, including GPT-3, are not able to adapt to changing circumstances, a fundamental distinction that separates today’s AI from AGI. One step towards adaptability is multimodal AI that combines the language processing of GPT-3 with other capabilities such as visual processing. For example, based upon GPT-3, OpenAI introduced DALL-E, which generates images based on the concepts it has learned. Using a simple text prompt, DALL-E can produce “a painting of a capybara sitting in a field at sunrise.” Though it may have never “seen” a picture of this before, it can combine what it has learned of paintings, capybaras, fields, and sunrises to produce dozens of images. Thus, it is multimodal and is more capable and general, though still not AGI.

Yuli Ban · Post by **Yuli Ban** » Fri Jul 02, 2021 11:53 pm

Yuli Ban · Post by **Yuli Ban** » Mon Jul 05, 2021 12:53 am

[2106.13884] Multimodal Few-Shot Learning with Frozen Language Models

Starspawn0's comments: DeepMind. I think I might have posted the tweet thread to this before. It's *amazing*. It's like the kind of thing you would expect from GPT-4 -- super-fast / few-shot learning of new visual-and-text combined skills.

So what do we expect from GPT-4?? We might expect it to have few-shot capability, whereby you can show it an image, and then teach it a new task on-the-fly. For example, maybe it's an analogy task: {image} is to X as Y is to ....? [fill in the blank], and it quickly learns to output Z (where Z is the correct answer). Or, you can maybe teach it to play chess -- you show it a board, and say, "white to move," and it gives a decent move. Maybe you need to give it a few examples, first, so that it gets the idea of what you want it to do -- just like the few-shot learning in GPT-3; except here it's with text and images combined.
What's missing is the image synthesis. That's what OpenAI's DALL-E is all about. If you combine what DALL-E can deliver with the model in this DeepMind paper, and then scale it way, way up, you'll have something mind-blowing. So, take that chess example: instead of you always supplying the board for it to decide the next move, it could also generate the board! A sufficiently powerful version of this would literally allow you to create a chess game on-the-fly, just by giving it a few examples.
You could even make up a whole new board game, and teach it how to play with some examples, and then it would maybe do a passable, amateur-level job as your opponent -- and would even generate subsequent game boards for you.
Just think of the business applications. You could show it some graphs and ask if there is anything that "stands out", and it might generate a paragraph or two -- and it would use its world-knowledge about other companies, industries, supply chains, and so on, to give a plausible answer.
Or maybe you're a student in a chemistry class. You took some hand-written notes about some of the molecules the teacher drew at the board. You could show it one of your drawings, and ask it some questions about it. Maybe you made a mistake, and ask it to correct -- and it will do that, similar to doing "grammar correction".

Addendum: Take a look at the example in Figure 1. It's amazing that it knew to map Macaulay Culkin's scream pose to a scream emoji. Look also at Figure 4 -- it learns on the fly.
I haven't read it through that deeply yet, but it doesn't seem they are revealing what language model they used -- I could be totally wrong, though. They say, on page 13 in A.2:

The pretrained transformer language model we used has a GPT-like architecture [29]. It consists of a series of identical residual layers, each comprised of a self-attention operation followed by a positionwise MLP. The only deviation from the architecture described as GPT-2 is the use of relative position encodings [36]. Our seven billion parameter configuration used 32 layers, with each hidden layer having a channel dimensionality of 4096 hidden units. The attention operations use 32 heads each with key/value size dimensionality of 128, and the hidden layer of each MLP had 16384 hidden units. The 400 million parameter configuration used 12 layers, 12 heads, hidden dimensionality of 1536, and 6144 units in the MLP hidden layers.

They trained their own GPT-2??
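
As a rough sanity check on the quoted sizes, the parameter counts can be estimated from the layer count and widths alone. This is a back-of-the-envelope sketch, not the paper's own accounting: the function name `transformer_params` is made up here, and it assumes four square attention projections per layer (query, key, value, output) plus two MLP matrices, ignoring biases, layer norms, embeddings and the relative position encodings.

```python
def transformer_params(layers, d_model, d_ff):
    """Approximate weight count of the residual layers only (assumptions above)."""
    attention = 4 * d_model * d_model  # Q, K, V and output projections
    mlp = 2 * d_model * d_ff           # the two linear maps of the positionwise MLP
    return layers * (attention + mlp)


big = transformer_params(layers=32, d_model=4096, d_ff=16384)
small = transformer_params(layers=12, d_model=1536, d_ff=6144)

print(f"large config: {big:,} (~{big / 1e9:.2f}B)")
print(f"small config: {small:,} (~{small / 1e6:.0f}M)")
```

That comes out at about 6.44 billion and 340 million, so the residual layers account for most of the quoted "seven billion" and "400 million". The remainder would presumably sit in the parts this estimate leaves out. The head arithmetic is also consistent for the large model: 32 heads times a key/value size of 128 gives the 4096 channel width.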

Yuli Ban · Post by **Yuli Ban** » Fri Jul 09, 2021 2:45 am

DeepMind uses AI to tackle neglected deadly diseases

Artificial intelligence is to be used to tackle the most deadly parasitic diseases in the developing world, tech company DeepMind has announced.

The London-based, Alphabet-owned lab will work with the Drugs for Neglected Diseases initiative (DNDi) to treat Chagas disease and leishmaniasis.

Scientists spend years in laboratories mapping protein structures.

But last year, DeepMind's AlphaFold program was able to achieve the same accuracy in a matter of days.

Yuli Ban · Post by **Yuli Ban** » Fri Jul 09, 2021 2:47 am

Google’s Supermodel: DeepMind Perceiver is a step on the road to an AI machine that could process anything and everything

Arguably one of the premier events that has brought AI to popular attention in recent years was the invention of the Transformer by Ashish Vaswani and colleagues at Google in 2017. The Transformer led to lots of language programs such as Google's BERT and OpenAI's GPT-3 that have been able to produce surprisingly human-seeming sentences, giving the impression machines can write like a person.
Now, scientists at DeepMind in the U.K., which is owned by Google, want to take the benefits of the Transformer beyond text, to let it revolutionize other material including images, sounds and video, and spatial data of the kind a car records with LiDAR.
The Perceiver, unveiled this week by DeepMind in a paper posted on arXiv, adapts the Transformer with some tweaks to let it consume all those types of input, and to perform on the various tasks, such as image recognition, for which separate kinds of neural networks are usually developed.

Ozzie guy · Post by **Ozzie guy** » Fri Jul 09, 2021 3:45 am

Yuli Ban wrote: ↑Fri Jul 09, 2021 2:47 am Google’s Supermodel: DeepMind Perceiver is a step on the road to an AI machine that could process anything and everything

I can't help but think this is proto-AGI, although I am sure it isn't, as you are not hyping it up as such.

"The Perceiver, unveiled this week by DeepMind in a paper posted on arXiv, adapts the Transformer with some tweaks to let it consume all those types of input, and to perform on the various tasks, such as image recognition, for which separate kinds of neural networks are usually developed."

I assumed this is proto-AGI, as it is a transformer that can consume multiple types of input and do multiple tasks. What is it missing?

Even if it isn't one, I think this AI is general in some form, meaning AI with generality is now going to be developed at a compounding rate.

Human-level AGI feels truly near now. It feels like the beginning of the end, before the singularity (or, as you have said, extreme but predictable change, not a singularity).

Yuli Ban · Post by **Yuli Ban** » Fri Jul 09, 2021 5:45 am

It's certainly interesting, but it's still got a ways to go in terms of performance.

Yuli Ban · Post by **Yuli Ban** » Thu Jul 15, 2021 7:30 pm

Researchers match DeepMind’s AlphaFold2 protein folding power with faster, freely available model

DeepMind stunned the biology world late last year when its AlphaFold2 AI model predicted the structure of proteins (a common and very difficult problem) so accurately that many declared the decades-old problem “solved.” Now researchers claim to have leapfrogged DeepMind the way DeepMind leapfrogged the rest of the world, with RoseTTAFold, a system that does nearly the same thing at a fraction of the computational cost. (Oh, and it’s free to use.)

AlphaFold2 has been the talk of the industry since November, when it blew away the competition at CASP14, a virtual competition between algorithms built to predict the physical structure of a protein given the sequence of amino acids that make it up. The model from DeepMind was so far ahead of the others, so highly and reliably accurate, that many in the field have talked (half-seriously and in good humor) about moving on to a new field.

But one aspect that seemed to satisfy no one was DeepMind’s plans for the system. It was not exhaustively and openly described, and some worried that the company (which is owned by Alphabet/Google) was planning on more or less keeping the secret sauce to themselves — which would be their prerogative but also somewhat against the ethos of mutual aid in the scientific world.
