
DeepMind has unveiled Gato, a new multi-modal AI system capable of performing more than 600 different tasks.
Gato is arguably the most impressive all-in-one machine learning kit the world’s seen yet.
With a single set of weights, Gato can engage in dialogue, caption images, stack blocks with a real robot arm, outperform humans at playing Atari games, navigate in simulated 3D environments, follow instructions, and more.
On the one hand, the program outperforms a dedicated machine learning program at controlling a Sawyer robot arm that stacks blocks.
On the other hand, many of the image captions it produces are quite poor.
Inspired by progress in large-scale language modeling, DeepMind applied a similar approach toward building a single generalist agent beyond the realm of text outputs.
The agent, which is referred to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy.
Gato is widely considered game-changing. But how?
Gato is not just a transformer but also an agent. Think of it as a transformer combined with an RL agent for multi-task reinforcement learning: one model with the ability to perform many different tasks, hence game-changing.
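How can one transformer handle text, robot readings, and actions at once? The article does not spell this out, but DeepMind's paper describes turning every kind of data into one flat sequence of integer tokens that a single network predicts. The toy sketch below illustrates that idea only. The function names, the tiny word-level vocabulary, and the constants (a 32,000-entry text vocabulary, 1,024 bins, mu-law parameters) are assumptions chosen for illustration, not DeepMind's code.

```python
import math

TEXT_VOCAB_SIZE = 32000  # assumed size of the text vocabulary
NUM_BINS = 1024          # assumed number of bins for continuous values


def mu_law(x, mu=100.0, m=256.0):
    """Squash a real value with mu-law companding (parameters are assumptions)."""
    scaled = math.log(abs(x) * mu + 1.0) / math.log(m * mu + 1.0)
    return math.copysign(scaled, x)


def tokenize_continuous(values):
    """Map real numbers (e.g. joint angles) to token ids placed above the text ids."""
    tokens = []
    for v in values:
        squashed = max(-1.0, min(1.0, mu_law(v)))  # clip to [-1, 1]
        bin_index = min(NUM_BINS - 1, int((squashed + 1.0) / 2.0 * NUM_BINS))
        tokens.append(TEXT_VOCAB_SIZE + bin_index)
    return tokens


def tokenize_text(text, vocab):
    """Toy word-level lookup standing in for a real subword tokenizer."""
    return [vocab[word] for word in text.split()]


def build_sequence(instruction, vocab, observation, action):
    """Concatenate instruction, observation, and action tokens into one sequence."""
    return (
        tokenize_text(instruction, vocab)
        + tokenize_continuous(observation)
        + tokenize_continuous(action)
    )
```

Calling `build_sequence("stack blocks", {"stack": 0, "blocks": 1}, [0.0, 0.5], [-1.0])` returns a single list of five integers. Because text ids and sensor or action ids share one numbering scheme, the same set of weights can be trained to predict the next token no matter which task produced it.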