Google sets the bar for AI language models with PaLM
Google’s new large language model (LLM), PaLM (Pathways Language Model), is the first outcome of Pathways, Google’s new AI architecture, which aims to handle many tasks at once, learn new tasks quickly and reflect a better understanding of the world.
PaLM is a massive undertaking with ambitious goals. Although many aspects of PaLM require further evaluation, it represents an important step forward for LLMs. The process of developing and evaluating PaLM is detailed in an arXiv publication and summarized by Google in a blog post.
Google’s publication traces the Pathways philosophy through each step of training PaLM. The model comes in three sizes: PaLM 8B with 8 billion parameters, PaLM 62B with 62 billion parameters and PaLM 540B with 540 billion parameters. Google trained the different sizes to weigh cost against value and to measure the benefits of scale.
The number of parameters matters in LLMs, although more parameters don’t necessarily translate to a better-performing model. By parameter count, PaLM 540B is in the same league as some of the largest LLMs available: OpenAI’s GPT-3 with 175 billion; DeepMind’s Gopher and Chinchilla with 280 billion and 70 billion, respectively; Google’s own GLaM and LaMDA with 1.2 trillion and 137 billion, respectively; and Microsoft and Nvidia’s Megatron-Turing NLG with 530 billion.
The first thing to consider when discussing LLMs, as with any other AI model, is the efficiency of the training process. Even the Googles of the world need to answer this question: “Given a certain quantity of compute, how large of a model should I train in order to get the best possible performance?”
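To make that question concrete, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions that are not taken from Google's publication: the common rule of thumb that training a dense transformer costs about 6 floating-point operations per parameter per training token, and an illustrative ratio of 20 training tokens per parameter. The function names are hypothetical, and the numbers are for illustration only.

```python
import math

# Assumption: training cost C is approximately 6 * N * D FLOPs, where N is
# the parameter count and D is the number of training tokens.
FLOPS_PER_PARAM_TOKEN = 6.0


def training_flops(n_params: float, n_tokens: float) -> float:
    """Estimate total training compute under the 6 * N * D rule of thumb."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


def size_for_budget(compute_flops: float, tokens_per_param: float = 20.0):
    """Split a compute budget into a model size and a token count.

    Assumes a fixed tokens-per-parameter ratio (20 is an illustrative
    default, not a figure from the PaLM publication). With D = r * N and
    C = 6 * N * D, solving for N gives N = sqrt(C / (6 * r)).
    """
    n_params = math.sqrt(
        compute_flops / (FLOPS_PER_PARAM_TOKEN * tokens_per_param)
    )
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    budget = 1e23  # hypothetical compute budget in FLOPs
    n_params, n_tokens = size_for_budget(budget)
    print(f"params: {n_params:.3e}, tokens: {n_tokens:.3e}")
```

Changing `tokens_per_param` shows the trade-off behind the question: for a fixed budget, a larger model must be trained on fewer tokens, and a smaller model can see more data.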