“Successful AI adoption requires knowledge and expertise across the full stack — from hardware to software.”
Hemant, can you tell us about your professional background and your current role at NVIDIA?
I am currently the global head of NVIDIA AI Software, leading sales and product marketing for AI software and developer tools. Before I joined NVIDIA in 2021, I worked on establishing roadmaps and cultivating product growth across the semiconductor and software industries, holding leadership positions at Intel, Xilinx and Rambus. For more than 20 years I’ve been growing and establishing multi-hundred-million-dollar businesses.
I’m honored to be a part of this “iPhone moment for AI,” as our founder and CEO Jensen Huang has remarked.
You have diverse experience in both software and hardware. What do you think are some of the most exciting developments happening at the intersection of these two fields, particularly in the areas of cloud computing and machine learning?
Many of the current machine learning and AI applications are, of course, based on large language models (LLMs) and require the highest-performing, most powerful infrastructure to run. Building this out on-premises can be prohibitively expensive, requiring too much of an investment for many smaller enterprises or researchers. The cloud computing model is solving this problem: it opens machine learning and AI development and deployment to a much broader audience. NVIDIA has embraced cloud computing and works closely with every major cloud service provider (CSP) to enable users to access both NVIDIA hardware and software in flexible and cost-effective ways.
It’s also great to see the different ways that enterprises are incorporating the latest AI into their existing offerings so that more users can easily access and use AI-powered products. Microsoft is a good example. They have existing cloud-based products, like their Teams application, with a massive global user base. By pulling NVIDIA Riva technology into Teams, they add advanced speech and translation AI in a way that lets users start benefiting from it without loading a new product. This is another example of the creative ways enterprises are using AI to deliver the benefits of modern technology without massive disruption to the user.
What are some of the biggest challenges the AI industry is facing today, and how’s NVIDIA working to address them?
Good question, and one that underlies our work here at NVIDIA. Our solutions are built to help solve the biggest challenges enterprises face. Two top-of-mind concerns for enterprises right now are 1) how to adopt the latest AI technology, and 2) how to address the reliability and security considerations that come with working with current AI technology.
Enterprises are wondering how to integrate AI into their business model: do they need to add it into an existing workflow and infrastructure, or are they building from the ground up? Successful AI adoption requires knowledge and expertise across the full stack, from hardware to software. Many enterprises are playing catch-up with the lightspeed pace of innovation and are considering what to build today and how to scale for the future. NVIDIA addresses these concerns with our enterprise AI software solution, NVIDIA AI Enterprise, which is designed to accelerate enterprises with 100+ frameworks, pretrained models, and AI workflows. When AI is deployed at enterprise scale, security and API stability are critical to providing continuity and uptime. That’s another reason NVIDIA created this enterprise solution, which comes with enterprise support directly from NVIDIA.
NVIDIA AI software also addresses the second big challenge: how can enterprises keep AI applications built on large language models aligned with a company’s safety and security requirements? We recently introduced an open-source toolkit, NeMo Guardrails, that can be used to easily add programmable guardrails to LLM-based conversational systems.
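To give a feel for what that looks like in practice, here is a minimal sketch using the open-source nemoguardrails Python package. The model choice and the rail definitions are illustrative placeholders rather than a recommended configuration, and an OpenAI API key is assumed to be available in the environment:

```python
# Minimal sketch: adding a programmable guardrail to an LLM-based chatbot with NeMo Guardrails.
# Assumes `pip install nemoguardrails` and OPENAI_API_KEY set; model and rails are placeholders.
from nemoguardrails import LLMRails, RailsConfig

yaml_content = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
"""

colang_content = """
define user ask off topic
  "What do you think about politics?"

define bot refuse off topic
  "I can only help with questions about our products."

define flow
  user ask off topic
  bot refuse off topic
"""

config = RailsConfig.from_content(yaml_content=yaml_content, colang_content=colang_content)
rails = LLMRails(config)

# The guardrails runtime intercepts the conversation and applies the defined flow.
response = rails.generate(messages=[{"role": "user", "content": "What do you think about politics?"}])
print(response["content"])
```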
How does Natural Language Processing work, and what is NVIDIA doing to bring innovations through NLP to the market?
Natural language processing (NLP) is the branch of AI concerned with enabling computers to understand language in much the same way human beings can. At NVIDIA, we’re deeply involved in some of the most interesting research driving innovation in language-related tasks.
Speech AI is a term we coined for two specific subsets of NLP: automatic speech recognition (ASR) and text-to-speech (TTS). While the broader term, conversational AI, encompasses the larger universe of language-based applications, not all conversational AI applications include speech AI. There are hundreds of languages used across the globe, and enterprises building global text-based conversational AI applications are adding speech and translation AI to provide multilingual voice-based interfaces for their applications.

NVIDIA’s AI platform offers a full breadth of solutions to bring the power of AI seamlessly to enterprises, and customers often integrate several of these building blocks to build a full-fledged application. Enterprises offering conversational AI-based applications turn to NVIDIA speech and translation AI solutions, such as the NVIDIA Riva SDK and NVIDIA NeMo, to extend the functionality of an application. NVIDIA Riva enables organizations to build multilingual, fully customizable, real-time conversational AI pipelines with ASR, TTS, and neural machine translation (NMT) capabilities. NVIDIA NeMo offers a suite of solutions to build, customize, and deploy large language models at scale that can be used to provide intelligence to conversational AI pipelines.
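For developers, the entry point is fairly small. The sketch below shows offline transcription with the Riva Python client; it assumes the nvidia-riva-client package is installed and a Riva ASR server is already running at localhost:50051, and the audio file path and its format are placeholders:

```python
# Minimal sketch: offline speech recognition with the Riva Python client.
# Assumes `pip install nvidia-riva-client` and a Riva server at localhost:50051;
# "sample.wav" is a placeholder for a 16 kHz mono PCM WAV file.
import riva.client

auth = riva.client.Auth(uri="localhost:50051")    # connection to the Riva server
asr_service = riva.client.ASRService(auth)

config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    sample_rate_hertz=16000,
    language_code="en-US",                        # Riva supports many other languages
    max_alternatives=1,
    enable_automatic_punctuation=True,
)

with open("sample.wav", "rb") as f:
    audio_bytes = f.read()

response = asr_service.offline_recognize(audio_bytes, config)
for result in response.results:
    print(result.alternatives[0].transcript)
```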
A great customer example is Infosys, a global leader in next-generation digital services and consulting, which has integrated Riva within its conversational AI solution, Cortex, a customer engagement platform.
How can marketers utilize modern NLP models to enhance customer engagement and improve the overall customer experience?
Voice AI is making its way into ever more types of conversational applications powered by language models, such as customer support virtual assistants and chatbots, video conferencing systems, drive-thru food ordering, retail by phone, and media and entertainment. Global organizations including T-Mobile, Deloitte, HPE, Interactions, 1-800-Flowers.com, Quantiphi and Kore.ai have adopted NVIDIA Riva to drive multilingual voice AI efforts. A few of these partner examples are outlined below to show how large enterprises use this technology to improve customer engagement.
- T-Mobile adopted Riva for its T-Mobile Expert Assist—a custom-built call center application that uses AI to transcribe real-time customer conversations and recommend solutions—for 17,000 customer service agents.
- Deloitte supports clients looking to deploy ASR and TTS use cases, such as for order-taking systems in some of the world’s largest quick-order restaurants. It’s also developing chatbot services for healthcare providers that will enable accurate and efficient transcriptions for patient questions and chat summarizations.
- Interactions has integrated Riva with its Curo software platform to create seamless, personalized engagements for customers in a broad range of industries that include telecommunications, as well as for companies such as 1-800-Flowers.com, which has deployed a speech AI order-taking system.
- Quantiphi is a solution-delivery partner that develops closed-captioning solutions using Riva for customers in media and entertainment, including Fox News. It’s also developing digital avatars with Riva for telecommunications and other industries.
How does NVIDIA Riva help businesses and organizations leverage AI technology to improve their marketing strategies?
AI technology for voice, such as speech and translation AI, presents a massive opportunity for enterprises to gain valuable feedback from global customer interactions, enabling critical improvements in products, services, and business processes.
Research is showing us that customers prefer AI-driven voice services over manually entering data. AI allows enterprises to easily convert voice-acquired data into insights and offer recommendations to improve business strategies.
I think one of the biggest challenges today for global enterprises is becoming an AI-based, voice-driven enterprise that can meet the needs of users across hundreds of different languages.
Here at NVIDIA, we have focused our speech and translation AI SDKs on multilingual support, so enterprises can collect and act on voice data in multiple languages throughout the entire customer engagement pipeline.
How can technologies such as edge computing and AI/automation shape the future of cloud computing?
Enterprises are already producing a massive amount of data, which becomes a bottleneck for cloud computing and results in high latency. Edge AI, which combines edge computing and AI, speeds up data processing and reduces latency, enabling businesses to gain real-time insights for decision-making. Training is typically performed in a data center or the cloud, and the trained models or inference engines are deployed at the edge, empowering real-time decisions with no need for cloud connectivity. Some examples are real-time fraud detection in the financial services industry and sensor data processing for autonomous vehicles. Edge AI can also improve privacy and security by keeping sensitive data local or discarding unnecessary data before transferring it to the cloud.
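As a rough, non-NVIDIA-specific sketch of that "train centrally, infer at the edge" pattern, the example below trains a toy PyTorch model as you would in the data center, exports it to ONNX, and then runs it with ONNX Runtime the way an edge device would, with no cloud connectivity needed at inference time. The model, data, and file name are placeholders:

```python
# Minimal sketch of the edge AI pattern: train in the data center, export the model,
# then run inference locally on the edge device. Assumes PyTorch and onnxruntime.
import numpy as np
import torch
import torch.nn as nn

# --- In the data center / cloud: train (or fine-tune) a model, then export it ---
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()
dummy_input = torch.randn(1, 8)
torch.onnx.export(model, dummy_input, "fraud_detector.onnx",
                  input_names=["features"], output_names=["scores"])

# --- On the edge device: load the exported model and score data in real time ---
import onnxruntime as ort

session = ort.InferenceSession("fraud_detector.onnx")
transaction = np.random.randn(1, 8).astype(np.float32)   # e.g., features of one transaction
scores = session.run(None, {"features": transaction})[0]
print("scores:", scores)
```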
What is the biggest problem you or your team is solving this year?
One of the biggest challenges customers face when deploying speech and translation AI is getting the accuracy needed to drive business value. In voice AI there isn’t much room for inaccuracy when asking a machine to engage with your customers; one wrong word can prevent a satisfying, or even understandable, engagement. When building a voice AI customer pipeline, the thing to nail is finding solutions that provide the highest out-of-the-box accuracy available on the market for speech recognition, speech synthesis and speech translation.
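For speech recognition specifically, accuracy is usually quantified as word error rate (WER): the fraction of reference words a system gets wrong through substitutions, deletions, or insertions. Here is a small illustrative sketch in plain Python, not any particular product's implementation:

```python
# Minimal sketch: word error rate (WER), the standard accuracy metric for ASR.
# WER = (substitutions + deletions + insertions) / number of reference words.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution (or match)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("send two dozen roses", "send to dozen roses"))  # 0.25: one word in four is wrong
```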
Is there anything that you’re currently reading, or any favorite books, that you’d recommend?
The practical application of AI, and the law around AI, as well as its capabilities and potential future implications, are topics that preoccupy me these days.
I’m also compelled to recommend NVIDIA’s tech blog as an accessible way to keep up with the latest technology updates, and our corporate blog is excellent for a higher-level overview of trends in AI.
Aggressive and passionate leader with an outstanding track record of growing and establishing multiple multi-hundred-million-dollar businesses.
Unique combination of solutions, software, systems, and semiconductor expertise spanning cloud, machine learning/AI, compute acceleration, storage, networking, HPC, and the semiconductor industry.
20 years of diverse experience in general management (P&L), strategy planning (five-year plans), product management, product and technical marketing, business development (collaboration, M&A), sales, customer management, and engineering.
Since its founding in 1993, NVIDIA has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.