Deep Learning Software vs. Hardware: NVIDIA releases TensorRT 7 inference software, Intel acquires Habana Labs

The latest release of NVIDIA’s software library brings significant performance improvements, which NVIDIA says enable conversational AI. But Intel is stepping up its game too, acquiring Habana Labs, an AI chip startup that promises top performance at the hardware level.

At GTC China yesterday, NVIDIA made a series of announcements. Some concerned local partners and related achievements, such as powering the likes of Alibaba and Baidu. Partners of this magnitude are bound to generate impressive numbers and turn some heads. Other announcements concerned new hardware.

NVIDIA unveiled Orin, a new system-on-a-chip (SoC) designed for autonomous vehicles and robots, as well as a new software-defined platform powered by the SoC, called Nvidia Drive AGX Orin. This signifies NVIDIA’s interest and progress in domain-specific applications, and an attempt to foster an ecosystem.

But what was perhaps the most interesting part of the announcement in terms of the bigger picture in AI chips was actually software: the new NVIDIA TensorRT 7. TensorRT is inference software that developers can use to deliver applications built on trained deep learning models.

In machine learning (of which deep learning is a branch), there are two key parts to developing applications. The first part is training a model based on existing data, which is the equivalent of developing software in traditional application development. The second part is using the model to process new data, also known as inference, which is the equivalent of deploying software.
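The two phases can be illustrated with a minimal sketch, using a toy one-dimensional linear model fit by gradient descent. The function names `train` and `infer` are purely illustrative, not part of any real framework:

```python
# Phase 1: training -- fit model parameters to existing data
# (the analogue of developing software).
def train(data, epochs=200, lr=0.05):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y   # prediction error on a known example
            w -= lr * err * x       # adjust parameters to reduce the error
            b -= lr * err
    return w, b

# Phase 2: inference -- apply the trained model to new data
# (the analogue of deploying software).
def infer(model, x):
    w, b = model
    return w * x + b

# Train on points lying on y = 2x + 1, then run inference on an unseen input.
model = train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
print(infer(model, 3.0))  # close to 7.0
```

Tools like TensorRT leave phase 1 untouched and optimize only phase 2, which is why a model trained elsewhere can still benefit from them.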

TensorRT works on the inference part. In other words, keeping a previously trained model but switching to TensorRT 7 for inference on NVIDIA GPUs should, by itself, yield a performance boost.

NVIDIA chose to emphasize conversational AI applications, mentioning how inference latency has until now impeded true, interactive engagement. NVIDIA’s press release notes that TensorRT 7 speeds the components of conversational AI by more than 10x compared to when run on CPUs, driving latency below the 300-millisecond threshold considered necessary for real-time interactions.

Read the full article on ZDNet

