Nvidia's push for AI agents

Good morning. We’re coming to you a little later than usual today with news fresh from Nvidia’s AI Summit in Washington D.C.

If you’re in Florida, here are some online preparation resources from the state regarding Hurricane Milton.

Be safe.

— Ian Krietzberg, Editor-in-Chief, The Deep View

In today’s newsletter:

MBZUAI Research: Multi-lingual diversity in vision models

Source: Created with AI by The Deep View

A significant shortcoming of large language models, and of the generative AI systems built atop that architecture, involves their limited language coverage. These models are centered primarily on English, plus a handful of other major, Western-centric languages.

The result — when it comes to visual question answering (VQA) models — according to some researchers, is a “narrow cultural representation.” 

New research from the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) strives to bridge that gap. 

The details: In a new paper, researchers proposed CVQA, “a novel, large-scale, multilingual, culturally nuanced VQA benchmark that includes a diverse set of languages, including many that are underrepresented and understudied.”

  • The researchers constructed this “culturally diverse” dataset by focusing on “common ground” knowledge, i.e., local dishes, history, shared places, etc. 

  • In applying the CVQA benchmark to a variety of open-source models, the researchers found that they “performed worse when queried in local languages compared to English.”

Why it matters: “We hope that publishing CVQA encourages the AI community to pay more attention to non-English-centric models and benchmarking, thereby advancing progress in multilingual, multimodal research,” the paper reads. 

To learn more about MBZUAI’s research, visit their website.

  • OpenAI funding fuels wave of big AI deals (The Information).

  • Nobel Prize for medicine goes to scientists who discovered microRNA (Semafor).

  • US antitrust case against Amazon to move forward (Reuters).

  • Tensions rise between banks and tech companies over online fraud liability in the UK (CNBC).

  • Hurricane Milton strengthens into a Category 4 as Florida prepares for evacuations (AP News).

If you want to get in front of an audience of 200,000+ developers, business leaders and tech enthusiasts, get in touch with us here.

  • Director, Data Science: American Express, New York, NY (Apply Here).

  • Product Manager, AI: Microsoft, Redmond, WA (Apply Here).

  • DryMerge: A tool to automate repetitive tasks.

  • Task Base: Virtual assistants packaged with AI-powered software.

Supermicro delivering 100k GPUs each quarter

Source: Nvidia

IT and semiconductor firm Supermicro said Monday that it is delivering more than 100,000 GPUs per quarter, an announcement that sent its stock price soaring by about 15%. 

  • The firm’s focus is on its liquid cooling technology, which enables these GPUs to run far more efficiently in the data center. 

  • According to Supermicro, the tech allows “organizations to run larger training models with a smaller data center footprint.”

The context: GPUs have been likened to the picks and shovels of the gold rush; they are the hardware that makes today’s generative AI software possible. It is this fact that has elevated Nvidia to such extreme heights over the past two years. 

Nvidia CEO Jensen Huang recently said that demand for the company’s latest GPU is “insane,” adding: “Everybody wants to have the most and everybody wants to be first.” Elon Musk’s xAI recently deployed a cluster of 100,000 Nvidia H100 GPUs; Microsoft and OpenAI have plans to build a supercluster with millions of the chips. 

A single chip here goes for more than $30,000 … 100,000 per quarter isn’t exactly small change.

Nvidia’s push for AI agents 

Source: Nvidia

Nvidia this week began its Washington D.C. AI Summit.

And it’s all about the company’s newest focus: AI agents, which the chipmaker sees as the natural next step from today’s generative AI starting point. 

It’s not just Nvidia; the entire tech sector has been excited about the idea of “agentic AI” for a while now. But there is a problem, rather common in this field, of definitional confusion. As with other terms, including AGI and AI itself, there is no unified definition of what an AI agent is, or what a system would have to do to be considered one. 

So let’s start there. 

Nvidia last year defined agents as “a system that can use an LLM (Large Language Model) to reason through a problem, create a plan to solve the problem and execute the plan with the help of a set of tools.” 

  • According to IBM, an AI agent “refers to a system or program that is capable of autonomously performing tasks on behalf of a user or another system by designing its workflow and utilizing available tools.”

  • And Amazon’s AWS defines an agent as “a software program that can interact with its environment, collect data and use the data to perform self-determined tasks to meet predetermined goals. Humans set goals, but an AI agent independently chooses the best actions it needs to perform to achieve those goals.”

The common thread here is the LLM that sits at the core of these agentic systems, together with their multi-step functionality. If you think of ChatGPT as essentially a question-answering interface, an agent goes several steps further; imagine a session with an agentic bot in which you bounce around ideas for a vacation, and which concludes, with your approval, with the bot booking flights and a hotel and emailing you a rough itinerary. 
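The reason-plan-execute loop those definitions share can be sketched in a few lines. This is a toy illustration, not any vendor’s API: every name here (TOOLS, plan, run_agent) is hypothetical, and the planner is a hard-coded stand-in for what would really be an LLM call.

```python
# Hypothetical tool registry an agent could draw on. In a real system each
# entry would call an external API; here they return canned strings.
TOOLS = {
    "search_flights": lambda dest: f"flight to {dest} found",
    "book_hotel": lambda dest: f"hotel in {dest} booked",
    "send_email": lambda dest: f"itinerary for {dest} emailed",
}

def plan(goal):
    # In a real agent, an LLM would reason about the goal and decompose it
    # into tool calls. The plan is scripted here to keep the sketch runnable.
    return [("search_flights", goal), ("book_hotel", goal), ("send_email", goal)]

def run_agent(goal, approve=lambda tool: True):
    # Execute the plan step by step, keeping a human in the loop: each tool
    # call runs only if the approve callback says yes.
    results = []
    for tool_name, arg in plan(goal):
        if not approve(tool_name):
            results.append(f"{tool_name}: skipped")
            continue
        results.append(TOOLS[tool_name](arg))
    return results

print(run_agent("Lisbon"))
```

The approval callback is the interesting design choice: it is where “with your approval” lives, and where a production system would insert confirmation prompts before any consequential action like a purchase.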

Now, as always, plenty of questions around the ethics of deployment and actual capability need answering first, but that’s the general idea. 

Nvidia already offers a series of what it calls “NIM Agent Blueprints,” customizable generative AI workflows designed for enterprise use. This week, the chipmaker launched a new agentic blueprint for cybersecurity. 

  • The blueprint’s main goal is “vulnerability detection and resolution,” ordinarily a laborious manual process. 

  • Nvidia said that Deloitte has already adopted this cybersecurity agent. 

The tech giant also announced partnerships with AT&T, the University of Florida and Lowe’s, all of which are developing generative AI-based technologies built on Nvidia’s agentic blueprints and other software offerings. 

Parts of the healthcare sector are also beginning to work with Nvidia’s NIM agents; Nvidia said that researchers at the National Cancer Institute are using generative models, including NIM, for segmenting and annotating 3D CT images. 

In this vein, Nvidia announced the introduction of two new NIM blueprints centered around drug development.

“We are just at the surface of what we can do in the healthcare industry,” Bob Pette, Nvidia’s VP of enterprise platforms, said in a briefing call. 

I could say this about many terms in this sector, but I do not love the word “agent” in this context. As with “AI” itself, it invites an anthropomorphization that the technology doesn’t warrant; “automated workflows” would be more accurate. Issues of bias and hallucination have not been solved, which makes over-hyped terminology dangerous. 

And, as per the norm with Big Tech, we do not know the energy and water cost of running these agentic models, but you can be sure it’s considerably higher than for non-agentic software.

Which image is real?

Login or Subscribe to participate in polls.

🤔 Your thought process:

Selected Image 1 (Left):

  • “Whilst both looked fake, the shadows on image 2 just didn't sit right.”

Selected Image 2 (Right):

  • “I regularly mountain bike in the desert and image 2 looks like something I would regularly see while biking down to the chunks of dirt/sand on the ground. Nicely done.”

💭 A poll before you go

Thanks for reading today’s edition of The Deep View!

We’ll see you in the next one.

Here’s your view on Meta’s Movie Gen:

More than a third of you are thrilled to get your hands on Movie Gen; 22%, however, just hate Meta and everything it does. 20% aren’t excited about it at all, and 13% wish Meta would be more open with its training data.

I hate Meta:

  • “‘Meta’ and ‘ethical company’ don't belong in the same sentence unless you're explaining why it's not an ethical company. For that reason, more powerful generative video technology in their hands seems like a recipe for disaster, especially for women whose Instagram (was) probably used for training data.”

Thrilled:

  • “I teach AI to grad students in an entertainment industry program. We’ve tried unsuccessfully to access Sora for our assignments, so are definitely interested in a viable alternative.”

Do you think linguistic diversity is important in LLMs?

Login or Subscribe to participate in polls.