NVIDIA’s AI advance: Natural language processing gets faster and better all the time
Yesterday NVIDIA announced record-breaking developments in machine learning for natural language processing. How and why did it do this, and what does it mean for the world at large?
When NVIDIA announced breakthroughs in language understanding to enable real-time conversational AI, we were caught off guard. We were still trying to digest the proceedings of ACL, one of the biggest research events for computational linguistics worldwide, at which Facebook, Salesforce, Microsoft and Amazon were all present.
While these are two different sets of achievements, they are closely connected. Here is what NVIDIA's breakthrough consists of, and why it matters.
As ZDNet reported yesterday, NVIDIA says its AI platform now holds the records for the fastest training time, the fastest inference, and the largest model of its kind to date. NVIDIA managed to train a large BERT model in 53 minutes, and to have other BERT models produce results in as little as 2.2 milliseconds. But we need to put those numbers in context to understand their significance.
BERT (Bidirectional Encoder Representations from Transformers) is a body of research (a paper, open source code, and pre-trained models) published by researchers at Google AI Language in late 2018. BERT is among a number of recent breakthroughs in natural language processing, and it caused a stir in the AI community by setting state-of-the-art results across a wide variety of natural language processing tasks.
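At its core, BERT is trained to predict masked words using context from both directions, which is what makes it useful across so many tasks. The sketch below illustrates that idea using the Hugging Face transformers library and the "bert-base-uncased" checkpoint; both are illustrative assumptions on our part, not part of NVIDIA's benchmarks, and the example sentence is made up.

```python
# Minimal sketch: BERT filling in a masked word from bidirectional context.
# Assumes the Hugging Face "transformers" library and PyTorch are installed.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Mask one token and let the model predict it from the surrounding words.
text = "NVIDIA trained a large [MASK] model in 53 minutes."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and take the highest-scoring vocabulary entry.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```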
What NVIDIA did was to work with the models Google released (two flavors, BERT-Base and BERT-Large) and its own GPUs to slash the time needed to train the BERT machine learning model and then use it in applications. This is how machine learning works: first there is a training phase, in which the model learns by being shown lots of data, and then an inference phase, in which the trained model processes new data.
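To make the two phases concrete, here is a rough sketch of fine-tuning BERT for sentence classification and then running inference on unseen text. It again uses the transformers library as an assumption, not NVIDIA's benchmark code, and the texts and labels are invented for illustration.

```python
# Training phase vs. inference phase with a BERT classifier (illustrative only).
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Training phase: show the model labelled examples and update its weights.
model.train()
batch = tokenizer(["great talk", "terrible demo"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Inference phase: the trained model processes new, unseen text.
model.eval()
with torch.no_grad():
    new_batch = tokenizer(["record-breaking results"], return_tensors="pt")
    prediction = model(**new_batch).logits.argmax(dim=-1)
print(prediction)
```

In practice, training runs over many batches and epochs, which is why it is measured in minutes or days, while inference is a single forward pass, which is why it is measured in milliseconds.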
NVIDIA used different hardware configurations, each producing different results. It took an NVIDIA DGX SuperPOD, using 92 NVIDIA DGX-2H systems with a total of 1,472 NVIDIA V100 GPUs, 53 minutes to train BERT-Large, while the same task took a single NVIDIA DGX-2 system 2.8 days. The 2.2 millisecond inference result comes from a different setup and a smaller model: BERT-Base running on NVIDIA T4 GPUs with NVIDIA TensorRT.
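For a sense of what "inference latency" actually measures, the sketch below times a single forward pass of BERT-Base in plain PyTorch. This is an illustrative assumption on our part and will not come close to the 2.2 millisecond figure, which depended on TensorRT optimizations running on T4 GPUs.

```python
# Rough illustration of measuring inference latency for one BERT-Base forward pass.
import time
import torch
from transformers import BertTokenizer, BertModel

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").to(device).eval()

inputs = tokenizer("How fast can BERT answer?", return_tensors="pt").to(device)

with torch.no_grad():
    model(**inputs)                      # warm-up pass
    if device == "cuda":
        torch.cuda.synchronize()         # wait for GPU work before timing
    start = time.perf_counter()
    model(**inputs)
    if device == "cuda":
        torch.cuda.synchronize()
    print(f"Latency: {(time.perf_counter() - start) * 1000:.1f} ms")
```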