
⚙️ The big gap between AI and the real world

Good morning. Apple, Microsoft and Amazon step up to the earnings plate this week.

It’ll test the strength of last week’s rebound, and provide a marker for the health of the “AI trade.”

Can’t possibly be more stressful than watching the Knicks in Game 4 ...

— Ian Krietzberg, Editor-in-Chief, The Deep View

In today’s newsletter:

  • 🌊 AI for Good: Ocean topography

  • 🚘 US eases up on self-driving regulation  

  • 🚨 Big Tech companies already know you. AI startups want that privilege, too

  • 👁️‍🗨️ The big gap between AI and the real world

AI for Good: Ocean topography

Source: Unsplash

In its mission to map our oceans by the end of the decade, Seabed 2030 — a collaboration between the Nippon Foundation and GEBCO — last year partnered with SeaDeep, a company that’s developing an AI-powered platform for oceanic exploration and monitoring. 

The details: The company’s platform delivers highly accurate information regarding the goings-on beneath the waves; its introduction to the Seabed 2030 effort marked a “significant milestone” in the journey toward a fully mapped ocean floor.

  • SeaDeep, a startup that spun out of Tufts University, is developing underwater monitoring technology specifically designed to address the challenges associated with gathering data about the ocean deeps, namely, low light and turbulence. 

  • Its main approach integrates advanced sensors, robotics, analytics and AI models; the sensors capture data across the light spectrum, the robots get them into place and the AI models turn that data into actionable information (a loose sketch of this kind of pipeline follows below).
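For readers who like to see the shape of such a pipeline, here is a minimal, purely illustrative sketch in Python. SeaDeep has not published its code or an API, so every class, function and number below is hypothetical.

```python
# Hypothetical sketch of a sensors -> robotics -> AI pipeline like the one
# described above. SeaDeep has not published an API; all names are invented.
from dataclasses import dataclass


@dataclass
class SensorFrame:
    wavelength_nm: float   # where on the light spectrum the capture was made
    depth_m: float
    readings: list[float]  # raw intensity values from one capture


def position_sensors(target_depth_m: float) -> None:
    """Stand-in for the robotics layer that moves sensors into place."""
    print(f"descending to {target_depth_m} m")


def interpret(frames: list[SensorFrame]) -> dict:
    """Stand-in for the AI layer that turns raw captures into actionable info."""
    mean_intensity = sum(sum(f.readings) / len(f.readings) for f in frames) / len(frames)
    return {"frames_processed": len(frames), "mean_intensity": round(mean_intensity, 3)}


position_sensors(target_depth_m=3_500)
frames = [SensorFrame(wavelength_nm=450.0, depth_m=3_500, readings=[0.21, 0.18, 0.05])]
print(interpret(frames))  # e.g. {'frames_processed': 1, 'mean_intensity': 0.147}
```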

Why it matters: The world’s oceans cover roughly 70% of the planet’s surface. As of June 2024, only 26% of the seafloor had been mapped. This lack of detailed knowledge about the ocean floor, according to Seabed 2030, is hindering “our capacity to manage marine resources sustainably and safeguard coastal communities worldwide.” 

According to Seabed 2030, “knowing the seafloor’s shape is fundamental for understanding ocean circulation and climate models, resource management, tsunami forecasting and public safety, sediment transportation, environmental change, cable and pipeline routing and much more.” 

From design to live in seconds.

Framer is a new no-code website-building tool that allows you to launch any website and marketing campaign in just days without needing a developer.

It looks, feels, and works just like a design tool, but it has the power to output real websites that look, feel, and function like they were built by a developer.

Framer has an advanced CMS, out-of-the-box SEO optimization, built-in analytics, localization, and you can add advanced scroll animations to make your website come alive, all without code.

US eases up on self-driving regulation

Source: Unsplash

Self-driving cars are one of the original missions of this field. 

Though it has proven to be an immense engineering and technical challenge, there are a few major companies that currently operate, or are on the verge of operating, fleets of autonomous taxis. 

The major player here is Waymo, Google’s self-driving unit. Then there’s Tesla, whose CEO has been promising fully autonomous Teslas for about a decade, something that has yet to exist. 

Still, shares of Tesla spiked some 10% on Friday in the wake of dual announcements regarding lighter regulation for autonomous vehicles (AVs). 

What happened: First, the California Department of Motor Vehicles is now seeking public comment on an updated series of regulations for AV operation in the state. 

  • The proposal would, most importantly, remove explicit restrictions that ban the operation of autonomous vehicles above 10,001 pounds — in other words, autonomous semi trucks. 

  • The proposal would also increase data reporting requirements for manufacturers.

It additionally features a “phased permitting” process, in which a manufacturer must first acquire a permit to test with a safety driver, then a permit to test without one, and finally a permit to deploy a self-driving fleet. Each permit comes with mileage-based testing requirements. 

The public comment period will end June 9, at which point the DMV will host a public hearing before implementing the rules. 

At the same time, the National Highway Traffic Safety Administration announced a new autonomous vehicle framework, one that will allow automakers to report less crash information to the federal government. 

“The new framework will unleash American ingenuity, maintain key safety standards, and prevent a harmful patchwork of state laws and regulations,” according to the Department of Transportation. 

Safety Engineer Dr. Missy Cummings told me last year that autonomous vehicles remain limited by the same technical faults that limit generative AI chatbots; since they, unlike chatbots, operate at speed in the real world, the risk of a hallucination, for instance, is severe. These kinds of incidents, she said, are likely to scale as the scope of the self-driving experiment scales in kind, expanding to highways, more cities and more diverse weather conditions.

  • Chasing OpenAI: xAI is reportedly in talks to raise $20 billion, a round that would value the company at $120 billion. This shortly follows OpenAI’s $40 billion raise at a $300 billion valuation. This also shortly follows the merger of X (formerly Twitter) with xAI.

  • Deep Lite: OpenAI said it’s rolling out a light (so, cheaper) version of its Deep Research tool to free users, a response, according to the company, to consistently high demand.

  • An AI-generated radio host in Australia went unnoticed for months (The Verge).

  • Israel’s AI experiments in Gaza raise ethical concerns (NYT).

  • Playing ‘whack-a-mole’ with Meta over my fraudulent avatars (FT).

  • Google’s AI search numbers are growing, and that’s by design (TechCrunch).

  • Will the humanities survive artificial intelligence? (The New Yorker).

Big Tech companies already know you. AI startups want that privilege, too

Source: Perplexity

The internet and social media companies that rose to prominence in the late 2000s and early 2010s have spent the past couple of decades — somewhat unwittingly — gathering the most vital component of the AI age: data.

For them, the key was predictive analytics: algorithms not too dissimilar from those that have become so popular today, designed to parse a ton of data points to predict which post, video or link will keep someone on the platform longer, which ad an individual user might actually click on, or when a push notification might actually result in an app being opened (and a purchase being made). 
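To make that concrete, here is a toy sketch of the kind of predictive scoring involved: a tiny logistic regression that ranks candidate posts by estimated click probability. The features, weights and numbers are invented for illustration; real platform models are far larger and trained on billions of interactions.

```python
# Toy sketch of the kind of engagement prediction described above: score candidate
# posts by estimated click probability. Features and weights are invented for
# illustration and are not any platform's real model.
import math

WEIGHTS = {"past_clicks_on_topic": 1.4, "minutes_since_last_open": -0.02, "is_video": 0.6}
BIAS = -1.0


def click_probability(features: dict[str, float]) -> float:
    """Logistic regression: squash a weighted sum of features into (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1 / (1 + math.exp(-z))


candidates = {
    "post_a": {"past_clicks_on_topic": 3, "minutes_since_last_open": 45, "is_video": 1},
    "post_b": {"past_clicks_on_topic": 0, "minutes_since_last_open": 45, "is_video": 0},
}
ranked = sorted(candidates, key=lambda p: click_probability(candidates[p]), reverse=True)
print(ranked)  # the feed surfaces the highest-scoring item first: ['post_a', 'post_b']
```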

These analytics are so powerful that the line between prediction and influence has become blurred to the point of nonexistence. 

And when these same companies decided to jump on the generative AI train, they were already fueled up with ceaseless quantities of the thing that makes algorithms work. Data, data, data, data. 

It’s an advantage the AI startups want for themselves, too. 

What happened: The reason Perplexity, the AI search startup, is building a browser, according to CEO Aravind Srinivas, is to get to know its users even better through access to additional data sources, all so the company can sell more personalized ads. 

  • “We want to get data even outside the app to better understand you. Because some of the prompts that people do in these AIs is purely work-related. It’s not like that’s personal,” he said on the TBPN podcast.

  • He’s interested in the things people buy, the hotels and restaurants they go to, the specific subjects they browse for. “We plan to use all the context to build a better user profile and, maybe you know, through our discover feed we could show some ads there.” 

The browser is set to launch in May. 

The big gap between AI and the real world

Source: Unsplash

Having seemingly conquered the digital space, major AI developers have been focusing on the next big thing. Nvidia CEO Jensen Huang calls it “physical AI,” an umbrella encompassing robots, self-driving cars and ‘smart spaces’ that allow generative AI to interact with the real world. 

The calling card of such physical systems, however, is also their greatest weakness. Unlike cyberspace, the real world is full of unpredictable complications that come paired with severe consequences. 

Self-driving cars are a good example. 

Hundreds of billions of dollars have been invested over the past decade. The AI systems that power these cars have gotten better and better. And still, self-driving cars are in their infancy, forced to scale with deliberate caution in an attempt to avoid the kinds of accidents that could tank entire companies. 

The big problem for self-driving cars is edge cases, situations that don’t appear in the training data, and it’s impossible to gather training data on every possible edge case. And since the models that power these systems don’t actually reason the way humans do, and since they remain specifically limited and constrained by their sensors and algorithms, some engineers don’t believe autonomous cars will ever scale to every situation, in every location, all the time. 

Recent research out of Johns Hopkins found that generative AI systems — video, image or language models — are far worse than humans at describing and predicting human interactions in a moving scene. These models are pretty good at reading static images, but their inability to reliably predict social interactions is a glaring weakness when it comes to bringing AI to the real world. 

  • The researchers asked human participants to watch three-second clips and rate features important for understanding social interactions on a scale of one to five. The researchers then prompted more than 350 language, image and video models to predict the human responses and how human brains would react to watching the clips. 

  • The human participants were largely in agreement with one another. And though the language models tended to perform the best of the generative AI grouping, the researchers identified “a notable gap in all models’ ability to predict human responses.”
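As a rough illustration of what “predicting human responses” means here, the toy sketch below averages each clip’s human ratings and correlates them with a hypothetical model’s predictions. All numbers are invented; the actual study evaluated more than 350 models against both behavioral ratings and brain responses.

```python
# Toy version of the comparison described above: correlate a model's predicted
# ratings with averaged human ratings per clip. All numbers are invented;
# the real study covered 350+ models, many features and brain responses.
from statistics import correlation, mean

# Human ratings (1-5) from several viewers of each three-second clip.
human_ratings = {
    "clip_01": [4, 5, 4],   # clear social interaction
    "clip_02": [1, 2, 1],   # people present, but no interaction
    "clip_03": [3, 3, 4],
    "clip_04": [5, 4, 5],
}
# Hypothetical model outputs on the same 1-5 scale.
model_predictions = {"clip_01": 2.4, "clip_02": 2.6, "clip_03": 3.1, "clip_04": 2.9}

clips = sorted(human_ratings)
human_avg = [mean(human_ratings[c]) for c in clips]
model_pred = [model_predictions[c] for c in clips]

# A value far below the agreement between humans themselves is the kind of
# "notable gap" the researchers describe (about 0.11 for these toy numbers).
print(round(correlation(human_avg, model_pred), 2))
```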

"AI for a self-driving car, for example, would need to recognize the intentions, goals, and actions of human drivers and pedestrians. You would want it to know which way a pedestrian is about to start walking, or whether two people are in conversation versus about to cross the street," lead author Leyla Isik, an assistant professor of cognitive science at Johns Hopkins, said in a statement. "Any time you want an AI to interact with humans, you want it to be able to recognize what people are doing. I think this sheds light on the fact that these systems can't right now."

The researchers believe the source of the gap is that the artificial neural networks that make up generative AI are designed to mimic the part of the brain that processes static images. But the “human brain processes dynamic social scenes in regions that are distinct from those involved in classical object perception.” Current models, according to the researchers, struggle to match even human infants “in their ability to understand social scenes.”

Data processing and prediction are simply not the same thing as genuine understanding. 

"There's a lot of nuances, but the big takeaway is none of the AI models can match human brain and behavior responses to scenes across the board, like they do for static scenes," Isik said. "I think there's something fundamental about the way humans are processing scenes that these models are missing."

Hype has no time for nuance. 

This, rather simply, is because nuance is the hype-killer. 

Predictive systems go a long way, but when we’re talking about real-world integrations, how and why these systems work (or don’t) matters. 

The difference between a human making a judgment error, and an AI system hallucinating, is enormously significant. 

Which image is real?


🤔 Your thought process:

Selected Image 2 (Left):

  • “The guitar cutout is proportional on the bottom image, but is an odd shape on the top.”

Selected Image 1 (Right):

  • “The fingers don't look right.”

💭 A poll before you go

Thanks for reading today’s edition of The Deep View!

We’ll see you in the next one.

P.S. Enjoyed reading? Take The Deep View with you on the go! We’ve got exclusive, in-depth interviews for you on The Deep View: Conversations podcast every Tuesday morning. Subscribe here!

Do you want hyper-personalized ads from Perplexity?


If you want to get in front of an audience of 450,000+ developers, business leaders and tech enthusiasts, get in touch with us here.