Google's new AI can outperform GPT-4

6th December 2023

Google's DeepMind division has today announced its Gemini multimodal language model, which it claims has advanced "reasoning capabilities" and can outperform GPT-4 on a variety of tasks.

Almost exactly one year ago, OpenAI announced the public release of ChatGPT, a large language model (LLM) based on GPT-3.5 with remarkable conversational and coding abilities, which set a new benchmark for artificial intelligence (AI). It quickly became the fastest-growing consumer software application in history, gaining 100 million users and contributing to OpenAI's valuation of $29 billion.

ChatGPT initially launched as a freely available research preview, and a more advanced version based on GPT-4 followed in March 2023. This featured even greater capabilities, including a longer text input size, more nuanced and helpful responses, better accuracy, and improved safety.

The phenomenal success of ChatGPT triggered an AI "arms race" between various competing firms. Keen to avoid falling behind, Google launched its own chatbot, known as Bard. But this received mixed to negative reviews, being prone to errors and giving lacklustre responses to user prompts.

Google's reputation could now be restored, however, as the company's DeepMind division has today launched Gemini, a family of multimodal LLMs that the company claims will outperform GPT-4 on a variety of industry benchmarks. Gemini 1.0, the first version, is being made available in three model sizes:

Gemini Ultra – the largest and most capable model, for highly complex tasks.
Gemini Pro – the best model for scaling across a wide range of tasks.
Gemini Nano – the most efficient model for on-device tasks.

Gemini is natively multimodal, pre-trained from the start on different modalities, and then fine-tuned with additional multimodal data. This means it can accept multiple input types – not just text, but also images, videos, and audio – and convert them into different outputs.

Take, for example, a series of random images that you need it to describe. Gemini can recognise these and speak in real time. It can turn images into code, identify similarities between images, understand "hybrid" combinations of images, guess movies from images or clips, make sense of unfamiliar environments, and more. It is also multilingual and shows cultural understanding.

Just some of Gemini's creative skills can be seen in the video below.

With a score of 90%, Gemini Ultra is the first AI to outperform human experts on the 57-subject MMLU (Massive Multitask Language Understanding) benchmark. Human expert performance is defined as 89.8%. ChatGPT's premium version, for comparison, achieves 86.4%.

"Our new benchmark approach to MMLU enables Gemini to use its reasoning capabilities to think more carefully before answering difficult questions, leading to significant improvements over just using its first impression," says Demis Hassabis, CEO and Co-Founder of Google DeepMind, in a blog post. "With the image benchmarks we tested, Gemini Ultra outperformed previous state-of-the-art models, without assistance from object character recognition (OCR) systems that extract text from images for further processing. These benchmarks highlight Gemini's native multimodality and indicate early signs of Gemini's more complex reasoning abilities."

Other test results, including stronger mathematical and coding abilities, are shown in the table below.

[Image: Google Gemini vs ChatGPT across benchmark tasks]

Google has worked to ensure safety and security: "We've built Gemini responsibly from the start, incorporating safeguards and working together with partners to make it safer and more inclusive," the company said. "To identify blindspots in our internal evaluation approach, we're working with a diverse group of external experts and partners to stress-test our models across a range of issues."

Gemini Pro and Nano are freely available in Bard and across other apps from today. In six of eight benchmarks, the Pro version outperforms GPT-3.5, making it "the most powerful free chatbot on the market today".

Gemini Ultra, the largest model version, is planned for integration into "Bard Advanced" and will become available to software developers in early 2024.
