The Deep View
⚙️ Google responds to OpenAI with high-tech unveils — Nvidia is the real winner

Ian Krietzberg
May 15, 2024

Good Morning. Our poll yesterday had a fascinating mix of opinions & perspectives on Meta’s use of your Instagram/Facebook content to train its AI models. The mix pretty accurately reflects the overall impression I’ve picked up on in conversations with family and friends, which is interesting.

37% of you are not cool with it, but as one reader wrote: “They will do it anyway.”
35% don’t really care
28% are perfectly alright with it; as one reader wrote: “If you don't want it out there for all to do with (as) they choose, don't post it.”

In today’s newsletter:

🚘 Cruise returns to the streets
🚔 Waymo is under investigation
🛜 Here’s how to understand the capabilities of LLMs
📺 Google unveils new AI tech at I/O conference

One quick thing:

After almost a decade, I have made the decision to leave OpenAI. The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama, @gdb, @miramurati and now, under the… x.com/i/web/status/1…
— Ilya Sutskever (@ilyasut)
11:00 PM • May 14, 2024

Cruise returns to the streets

Image Source: Cruise

This week, Cruise’s autonomous vehicles will begin supervised autonomous driving again in Phoenix, Arizona.

Each robotaxi will have a safety driver behind the wheel, just in case things go off the rails again. But Cruise said that the decision to resume supervised driving follows weeks of road mapping and validating safety requirements.

Cruise said that it will gradually expand this supervised autonomous driving effort into the other cities in which it operates.

Cruise’s rough road:

In October of last year, a San Francisco pedestrian, after being hit by another car, was dragged 20 feet by a Cruise robotaxi (the woman was injured, but survived).
After the incident, Cruise halted its activity and brought in a whole new leadership team as it re-examined its operations and safety standards. Last month, the company laid out a road to resuming its driverless operations; this step gets everyone that much closer to autonomous Cruise fleets once again roaming the streets.

Waymo is under federal investigation

Image source: Waymo

In more self-driving news, the National Highway Traffic Safety Administration's Office of Defects Investigation (ODI) said it opened a probe into Waymo on Monday.

The safety regulator flagged 22 incidents where Waymo’s autonomous vehicles either broke traffic laws or caused single-car accidents with everything from parked vehicles to “gates and chains.”
The report also identified situations where Waymo cars drove “in opposing lanes with nearby oncoming traffic.”

Waymo said in a statement that “we are proud of our performance and safety record over tens of millions of autonomous miles driven, as well as our demonstrated commitment to safety transparency.”

The broader picture:

NHTSA also recently opened an investigation into Zoox, Amazon’s self-driving unit.
I’ve spoken with multiple AI/autonomous driving experts over the past year, and the consensus on current autonomous vehicles is one of skepticism. Yes, improvements are coming fast, but since the AI powering this tech lacks the human ability to reason, it is very difficult to adequately train an AV for edge cases.
- These companies might rack up millions of miles, but a lack of reasoning means that those miles will not prepare the vehicles for each unique element of a random, unexpected edge case. And edge cases are where things go wrong.

Together with HubSpot

Turn AI Into Your Personal Assistant

Discover how to turn AI into your personal productivity powerhouse with HubSpot’s highly anticipated AI Task Delegation Playbook.

Master the art of AI delegation and optimize your workflow like never before. Get ready to save time and boost efficiency with their easy-to-use templates and calculators. Don’t miss out—download your copy today and start transforming your workday!

Leverage AI for streamlined task management, significantly enhancing time efficiency
Utilize AI tools to elevate decision-making and maximize workflow efficiency across teams
Explore comprehensive templates and detailed examples to master straightforward AI delegation
Evaluate and optimize productivity by assessing measurable impacts of AI on your daily output

Get your playbook today!

How scientists are determining LLM capabilities

Image Source: Unsplash

In the midst of product launches from both OpenAI and Google, it feels important to understand just how capable LLMs are. Both companies, especially in their demos, are really good at making their AI models appear intelligent, but scientists have to get into the weeds (and through a couple of black boxes) to actually understand what’s going on here. And often, all they can do is make educated guesses, since these companies tend to keep their training data under lock and key.

AI researcher Melanie Mitchell recently broke down the problem in a bit of depth.

The challenge:

Mitchell said that, even as LLMs might excel on certain benchmarks, we don’t “know the extent to which the test items in a benchmark appeared in the training data.”
- It’s hard to tell if models are reasoning, or if they are good at pattern retrieval based on their training data.
- As one computer scientist once put it to me: Imagine a law student who reads textbooks and passes the bar exam, versus a student (with a photographic memory) who was trained on thousands of bar exams, then passes the bar exam.
  - In this example, if there were no bar exams in the training data, an AI passing the bar exam becomes a much bigger deal.

The solution: The counterfactual task paradigm

In this paradigm, models are tested on pairs of tasks that require the same type of reasoning:
- The first features content likely in a model’s training data.
- The second features content unlikely in the training data.
  - By comparing performance across the two, scientists can estimate the extent to which reasoning might be present in an LLM.
“In short, LLMs seem to have some ability to reason, but without stress-testing them (e.g., with counterfactual tasks), one can’t conclude that they are reasoning in a general way rather than relying on their training data in ways that won’t generalize to out-of-distribution examples.”
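
To make the paired-task idea concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than something from Mitchell's write-up: the choice of base-10 addition as the "familiar" task and base-9 addition as the "counterfactual" one, the helper names, and `memorizing_model`, a toy stand-in for an LLM that has only ever learned base-10 patterns.

```python
from typing import Callable, List, Tuple

Task = Tuple[str, str]  # (prompt, expected answer)


def to_base(n: int, base: int) -> str:
    """Render a non-negative integer in the given base (2 to 10)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % base))
        n //= base
    return "".join(reversed(digits))


def addition_tasks(pairs: List[Tuple[int, int]], base: int) -> List[Task]:
    """Build addition prompts whose operands and answers are written in `base`."""
    return [
        (f"base {base}: {to_base(a, base)} + {to_base(b, base)}", to_base(a + b, base))
        for a, b in pairs
    ]


def accuracy(model: Callable[[str], str], tasks: List[Task]) -> float:
    """Fraction of tasks where the model's answer matches exactly."""
    correct = sum(model(prompt).strip() == answer for prompt, answer in tasks)
    return correct / len(tasks)


def counterfactual_gap(model: Callable[[str], str],
                       familiar: List[Task],
                       counterfactual: List[Task]) -> float:
    """Accuracy drop from the familiar task to its counterfactual twin."""
    return accuracy(model, familiar) - accuracy(model, counterfactual)


def memorizing_model(prompt: str) -> str:
    """Hypothetical toy model: always adds as if the digits were base 10."""
    _, expression = prompt.split(": ")
    left, right = expression.split(" + ")
    return str(int(left) + int(right))


# Same underlying sums, same reasoning required, different surface form.
pairs = [(7, 8), (14, 6), (8, 8)]
familiar = addition_tasks(pairs, base=10)
counterfactual = addition_tasks(pairs, base=9)

print("familiar accuracy:", accuracy(memorizing_model, familiar))
print("counterfactual accuracy:", accuracy(memorizing_model, counterfactual))
print("gap:", counterfactual_gap(memorizing_model, familiar, counterfactual))
```

A large gap suggests the model is leaning on patterns from its training data, while a small gap is (weak) evidence of reasoning that generalizes. In a real study, the toy model would be replaced with calls to an actual LLM and the task pairs would be far more numerous and varied.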

💰AI Jobs Board:

Python AI & ML Developer: Deloitte · United States · Philadelphia, PA, Hybrid · Full-time · (Apply here)
Applied Scientist II — Generative AI Innovation Center: Amazon · United States · Multiple Locations · Full-time · (Apply here)
AI Product Strategist: Verizon · United States · New Jersey or Texas, Hybrid · Full-time · (Apply here)

🗺️ Events: *

Ai4, the world’s largest gathering of artificial intelligence leaders in business, is coming to Las Vegas — August 12-14, 2024.
- Don’t wait — passes are going fast. Apply today for a complimentary pass or register now for 41% off final prices.

🌎 The Broad View:

April wholesale inflation came in hotter than expected (CNBC).
Kyle Vogt (Cruise co-founder) is launching a bot company — the bots will “do chores so you don’t have to” (Kyle Vogt).
Meta is developing AI-powered earphones with cameras (The Information).

*Indicates a sponsored link

Together with Enquire AI

Enquire PRO is designed for entrepreneurs and consultants who want to make better-informed decisions, faster, leveraging AI. Think of us as the best parts of LinkedIn Premium and ChatGPT.

We built a network of 20,000 vetted business leaders, then used AI to connect them for matchmaking and insight gathering.

Our AI co-pilot, Ayda, is trained on their insights, and can deliver a detailed, nuanced brief in seconds. When deeper context is needed, use a NetworkPulse to ask the network, or browse for the right clients, collaborators, and research partners.

Right now, Deep View readers can get Enquire PRO for just $49 for 12 months, our best offer yet. Click the link, sign up for the annual plan, and use code DISCOUNT49 at checkout for the AI co-pilot you deserve.

Google responds to OpenAI with tech unveils — Nvidia is the real winner

Image source: Google

Google spent Tuesday trying in earnest to establish itself as a dominant force in the AI sector.

Everything (important) unveiled at I/O:

Google launches a new model – and Google Drive gets an AI companion
- Google introduced Gemini 1.5 Flash into its lineup of AI models, a lighter-weight sibling of the Pro version that is optimized for “narrow, high-frequency, low-latency tasks.”
- Google also made improvements to its flagship Gemini 1.5 model, and Gemini 1.5 Pro is rolling out to Google Drive as an on-hand assistant (for paid subscribers) next month.

For a long time, we’ve been working towards a universal AI agent that can be truly helpful in everyday life. Today at #GoogleIO we showed off our latest progress towards this: Project Astra. Here’s a video of our prototype, captured in real time.
— Demis Hassabis (@demishassabis)
7:36 PM • May 14, 2024

AI is coming to Google Search (this week):
- Google’s Search Generative Experience is leaving its beta round — “AI Overviews” will be integrated into Google Search this week in the U.S.
- Now, when you search on Google, the first thing you’ll see is an AI-generated summary of the information you might have been looking for. Publishers, this is the moment you’ve been waiting for … say ‘goodbye’ to your Google web traffic.

A Google AI assistant is coming to Chrome:
- Starting with Chrome 126 (we’re on 124 at the moment), Gemini Nano will be a built-in feature.
  Starting in Chrome 126, Gemini Nano will be built into the Chrome Desktop client itself. So you'll be able to deliver powerful AI features to Chrome’s billions of users without worrying about prompt engineering, fine tuning, capacity and cost. #GoogleIO
  — Google (@Google)
  9:06 PM • May 14, 2024

Project Astra: Google’s response to OpenAI’s GPT-4o:
- Project Astra is a conversational multimodal AI assistant that can receive input in the form of images, videos and text — the goal is for this prototype to become the basis for natural, universal AI agents that can interact with people and the world around them (very similar to OpenAI’s GPT-4o).

My thoughts:

The timing of this really puts the “race” of AI development into sharper focus; OpenAI jumps out with a cool new product launch, Google leapfrogs OpenAI with a bunch of cool new product launches — some similar, some different — and so on. It’s a blow-for-blow competition between the largest companies in the world – and every bout reaffirms Nvidia’s dominance (Jensen Huang doesn’t care who wins, just as long as both sides keep buying his GPU chips).

The first thing I have to note here is the same thing I noted after OpenAI’s demo: this is just that, a demo, and Google has a bit of a history of demos that are less than legitimate. It’s hard to tell at this stage how good/reliable/usable these products will be in the real world.

Also worth noting that, with all these unveils, Google still did not spill the beans on its training data (which is particularly interesting for the multimodal model), or the energy cost of training/running all these models.

We’re at a stage where, given the prominence of Google’s search and other products, generative AI is about to become a fact of daily life for a lot of people, whether they want it or not.

I’m wondering if a lot of people who’d rather not deal with AI in search are about to have no choice in the matter, or, perhaps, vice versa.

Are you excited about AI Overviews in Google Search?

Which image is real?

Image 1

Image 2

Cargo: A tool to enhance your CRM, aggregate your workflow & drive revenue.
Yoodli: An AI tool to improve your speaking skills.
Venturefy: A tool that provides a wiki of B2B relationships — a “blue check” for businesses.

Have cool resources or tools to share? Submit a tool or reach us by replying to this email (or DM us on Twitter).

SPONSOR THIS NEWSLETTER

The Deep View is currently one of the world’s fastest-growing newsletters, adding thousands of AI enthusiasts a week to our incredible family of over 200,000! Our readers work at top companies like Apple, Meta, OpenAI, Google, Microsoft and many more.

If you want to share your company or product with fellow AI enthusiasts before we’re fully booked, reserve an ad slot here.

One last thing👇

The scene in “Her” when Theodore finds out his AI girlfriend Samantha has cheated on him has aged like fine wine:
Theodore: Are you talking to anyone else?
Samantha: Yeah.
Theodore: How many?
Samantha: 8,316.
Theodore (shook): Are you in love with anyone else?
Samantha:… x.com/i/web/status/1…
— Trung Phan (@TrungTPhan)
4:41 AM • May 14, 2024

That's a wrap for now! We hope you enjoyed today’s newsletter :)

We appreciate your continued support! We'll catch you in the next edition 👋

-Ian Krietzberg, Editor-in-Chief, The Deep View