
Google's PaLM-E Generalist Robot Brain Takes Commands

By Ars Technica

March 13, 2023


Researchers at Google and Germany's Technical University of Berlin debuted PaLM-E, described as the largest visual-language model (VLM) ever created.

The multimodal, embodied VLM contains 562 billion parameters and combines vision and language for robotic control. Google claimed that, given a high-level command, it can formulate a plan of action and execute it on the company's mobile robot platform, which is equipped with an arm.

PaLM-E analyzes raw data from the robot's camera without requiring pre-processed scene representations, eliminating the need for human pre-processing or annotation of the data.

Integrating the VLM into the control loop also makes the robot resilient to interruptions during tasks.

PaLM-E encodes continuous observations into a sequence of vectors identical in size to language tokens, so it can "understand" sensor data in the same way it processes language.
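The idea described above can be sketched in a few lines: a learned projection maps vision-encoder features into vectors with the same dimensionality as the language model's token embeddings, so image "tokens" and word tokens can be interleaved in a single input sequence. This is a minimal illustrative sketch, not PaLM-E's actual implementation; the dimensions, the random projection matrix, and the stand-in encoders are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

IMG_FEAT_DIM = 64   # hypothetical vision-encoder output size
TOKEN_DIM = 32      # hypothetical language-model token-embedding size

# Learned projection into the token-embedding space; here a random
# matrix stands in for trained weights.
W_proj = rng.normal(size=(IMG_FEAT_DIM, TOKEN_DIM))

def image_to_tokens(patch_features: np.ndarray) -> np.ndarray:
    """Project per-patch image features to vectors shaped like word tokens."""
    return patch_features @ W_proj

# Four image "patches", each a 64-d feature from the stand-in encoder.
patch_features = rng.normal(size=(4, IMG_FEAT_DIM))
image_tokens = image_to_tokens(patch_features)

# Three word-token embeddings from the stand-in language model.
word_tokens = rng.normal(size=(3, TOKEN_DIM))

# Because the projected vectors match the token-embedding size, they can
# be interleaved with word tokens in one input sequence.
sequence = np.concatenate([word_tokens, image_tokens], axis=0)
print(sequence.shape)  # (7, 32)
```

Once the continuous observations live in the same vector space as the text, the transformer processes the combined sequence with no architectural changes, which is what lets the model "understand" sensor data the same way it processes language.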



Abstracts Copyright © 2023 SmithBucklin, Washington, D.C., USA

