Gato AI: A Multi-Modal, Multi-Tasker, and a Multi-Embodiment Generalist


DeepMind has unveiled Gato, a multi-modal AI system capable of performing more than 600 different tasks.

Gato is arguably the most impressive all-in-one machine learning system the world has seen yet.

With a single set of weights, Gato can engage in dialogue, caption images, stack blocks with a real robot arm, outperform humans at playing Atari games, navigate in simulated 3D environments, follow instructions, and more.

On the one hand, Gato outperforms a dedicated machine learning program at controlling a robotic Sawyer arm that stacks blocks.

On the other hand, the image captions it produces are in many cases quite poor.

Inspired by progress in large-scale language modeling, DeepMind applied a similar approach toward building a single generalist agent beyond the realm of text outputs.

The agent, referred to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy.


So why is Gato considered game-changing?

Gato is not just a transformer but also an agent: a transformer combined with a reinforcement learning agent, able to carry out many different tasks across many embodiments with a single set of weights. That breadth is what makes it game-changing.
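The key trick that lets one transformer handle text, images, proprioception, and actions is serialization: every modality is converted into tokens in one shared sequence. The sketch below illustrates this idea in Python. The mu-law encoding of continuous values into 1024 bins and the shared-vocabulary offset follow DeepMind's description of Gato, but the specific vocabulary size, function names, and example values here are illustrative assumptions, not DeepMind's actual code.

```python
# Minimal sketch of Gato-style tokenization: flatten mixed-modality
# data into one token sequence a single transformer can model.
# Constants and helper names are illustrative assumptions.

import math

TEXT_VOCAB = 32_000      # assumed text vocabulary size
NUM_BINS = 1024          # continuous values are discretized into 1024 bins

def tokenize_continuous(x: float, mu: float = 100.0, m: float = 256.0) -> int:
    """Mu-law encode a continuous value into [-1, 1], then bin it."""
    y = math.copysign(1.0, x) * math.log(abs(x) * mu + 1.0) / math.log(m * mu + 1.0)
    y = max(-1.0, min(1.0, y))                    # clamp to [-1, 1]
    bin_id = int((y + 1.0) / 2.0 * (NUM_BINS - 1))
    # Offset past the text vocabulary so token ids never collide with text.
    return TEXT_VOCAB + bin_id

def tokenize_timestep(text_tokens, proprioception, actions):
    """Flatten one timestep of mixed-modality data into a single sequence."""
    seq = list(text_tokens)                              # text keeps its own ids
    seq += [tokenize_continuous(v) for v in proprioception]
    seq += [tokenize_continuous(a) for a in actions]
    return seq

# Hypothetical timestep: two text tokens, two joint readings, one action.
seq = tokenize_timestep([17, 942], [0.3, -1.2], [0.05])
```

Because everything ends up as integer tokens in one vocabulary, the same next-token-prediction objective used for language modeling can train dialogue, captioning, and robot control simultaneously.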





