⚙️ Nvidia releases new generative AI model
Good morning. Following a months-long delay sparked by rampant cybersecurity concerns, Microsoft last week started allowing users to test its highly controversial Recall feature.
During a test of the program, CNBC reported that “if you specify that Recall shouldn’t save content from a given website, it might get captured anyway.”
This is the same program that one cybersecurity researcher said had security gaps that you could “drive a plane through.”
— Ian Krietzberg, Editor-in-Chief, The Deep View
In today’s newsletter:
🌲 AI for Good: Advanced urban tree monitoring
🧠 Study: The robustness of LLM reasoning
💰 Intel to lose millions in government funding
💻 Nvidia releases new generative AI model
AI for Good: Advanced urban tree monitoring
Source: MIT
We are living through a changing climate. And as our efforts turn to climate change adaptation, rather than prevention, trees will play a critical role, not only in providing natural carbon sinks, but also in providing what will soon be much-needed shade and shelter from ever-increasing global temperatures.
But that requires a ton of planning, and planning requires models and simulations.
What happened: Researchers at MIT, in collaboration with Purdue University and Google, recently developed a “Tree-D Fusion” system that creates accurate 3D models of existing urban trees. According to MIT, it’s the first-ever large-scale database of more than “600,000 environmentally aware, simulation-ready tree models” in North America.
The system, powered in part by generative AI, allows the team to “not just identify trees in cities, but to predict how they’ll grow and impact their surroundings over time.”
MIT said that the system could be leveraged by city planners to identify areas that could benefit from strategic tree placement, shifting urban forest management to proactive planning rather than reactive maintenance.
“This marks just the beginning for Tree-D Fusion,” Jae Joong Lee, a Purdue PhD student who worked on the system, said. “Together with my collaborators, I envision expanding the platform’s capabilities to a planetary scale. Our goal is to use AI-driven insights in service of natural ecosystems — supporting biodiversity, promoting global sustainability, and ultimately, benefiting the health of our entire planet.”
Tired of Battling Spam Calls on Your Phone? Here's How to Make Them Disappear.
Every day, your personal data, including your phone number, is sold to the highest bidder by data brokers. This leads to annoying robocalls from random companies and, worse, makes you vulnerable to scammers.
Meet Incogni: your solution against robocalls. It actively removes your personal data from the web, fighting data brokers and protecting your privacy. Unlike other services, Incogni targets all data brokers, including those elusive People Search Sites.
Put an end to those never-ending robocalls and email spam on your iPhone now.
Incogni protects you from identity theft, spam calls, increased health insurance rates, and more. Only for The Deep View readers: Get a 58% discount on Incogni's annual plan using code: DEEPVIEW
Study: The robustness of LLM reasoning
Source: Created with AI by The Deep View
A major area of study (and debate) concerns the genuine (or otherwise) reasoning capabilities of the large language models (LLMs) that back modern generative AI systems. Much of the research in this area revolves around training data: an LLM’s ability to perform well on reasoning benchmarks built from problems that are likely not in its training set is an indicator that the model can actually conduct abstract reasoning.
A model’s ability to perform well on problems that are included in the training data does not demonstrate reasoning capability; it just verifies interpolation. The lack of transparency around training data makes this kind of research something of a challenge.
The details: A new study by Martha Lewis and Melanie Mitchell sought to test the robustness of analogical reasoning in LLMs compared to that of humans. It did so by testing both people and models on letter-string problems, digit-matrix problems and story analogy problems.
The study dealt specifically with zero-shot reasoning attempts; it did not explore the increasingly popular chain-of-thought (CoT) approach. The researchers evaluated GPT-3, GPT-3.5 and GPT-4 Turbo; the work is framed as a direct response to a 2023 paper that claimed GPT-3 demonstrated zero-shot reasoning capabilities.
In the letter-string analogy domain (a b c d → a b c e ; 1 2 3 4 → ?), the researchers found that, for simple problems, humans far outperformed the models. On harder problems, both humans and the models performed poorly.
In the story analogy domain, the researchers found that LLMs are highly susceptible to the order in which potential answers are provided, whereas humans are not.
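For a concrete sense of what these probes involve, here is a minimal, hypothetical Python sketch (not the authors’ code; the prompt wording and example items are illustrative assumptions) of a zero-shot letter-string item like the one above, plus the answer-order check used in the story-analogy domain:

```python
def fmt(seq):
    # Render a sequence like ['a', 'b', 'c', 'd'] or [1, 2, 3, 4] as "a b c d".
    return " ".join(str(x) for x in seq)


def letter_string_prompt(source, target, probe):
    # Zero-shot prompt: the source -> target pair demonstrates the rule;
    # the model must apply the same rule to the probe, with no worked
    # examples and no chain-of-thought scaffolding.
    return f"Complete the analogy.\n{fmt(source)} -> {fmt(target)}\n{fmt(probe)} -> "


# Example item from the domain above: the implied rule is "increment the last element."
prompt = letter_string_prompt(list("abcd"), list("abce"), [1, 2, 3, 4])
expected = "1 2 3 5"
print(prompt)
print("expected:", expected)

# Robustness check in the story-analogy domain: show the same candidate
# answers in both orders. A robust reasoner's choice should not depend on
# ordering; per the study, the models' choices often did, while humans' did not.
candidates = ["story A (relational match)", "story B (surface-feature match)"]
for options in (candidates, candidates[::-1]):
    print("options:", " / ".join(options))
```

The paper’s actual item sets, model calls and scoring are more involved; this only illustrates the shape of the tasks.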
“Overall, our results provide evidence that, despite previously reported successes of LLMs on zero-shot analogical reasoning, these models in many cases lack the robustness of zero-shot human analogy-making, exhibiting brittleness on most of the variations and biases we tested,” the researchers wrote. “These results support the conclusions of other work showing that LLMs’ performance is better, sometimes dramatically, on versions of reasoning tasks that are likely similar to those seen in training data.”
The study is a pre-print, and so has yet to be peer-reviewed.
Apple’s thin iPhone has no physical SIM (The Information).
Global effort to end plastic pollution faces final hurdle (Semafor).
New publisher Spines aims to 'disrupt' industry by using AI to publish 8,000 books in 2025 alone (The Bookseller).
Amazon’s Moonshot Plan to Rival Nvidia in AI Chips (Bloomberg).
San Francisco tech company Forward, once worth $1B, abruptly shuts down (SF Gate).
If you want to get in front of an audience of 200,000+ developers, business leaders and tech enthusiasts, get in touch with us here.
Software Development Manager: Amazon, New York, NY
Director, API Commercialization: Verizon, Basking Ridge, NJ
Meta’s Llama — through a partnership with Scale AI — is being marketed to the military. Experts told The Intercept that the advertisement offers “terrible” advice on how to blow up a building … “Scale AI defended the advertisement by telling The Intercept its marketing is not intended to accurately represent its product’s capabilities.”
A new study found that, even when not prompted to do so, generative AI chatbots integrate ‘dark patterns’ into website designs, according to Fast Company.
Intel to lose millions in government funding
Source: Intel
Intel and the U.S. Commerce Department are reportedly close to finalizing a deal that would grant the semiconductor giant $8 billion in funding under the Chips and Science Act, according to the Wall Street Journal.
The details: Intel in March announced that it had reached a preliminary agreement to receive $8.5 billion in funding under the Chips Act to advance its semiconductor research and manufacturing sites. In September, Intel was awarded an additional $3 billion under the Act to develop chips for the Department of Defense.
On Sunday, the New York Times, citing four unnamed sources, reported that the government had decided to reduce the size of Intel’s grant by around $500 million, due to concerns with Intel’s ability to execute the agreement.
Reuters reported Monday that Intel does expect a reduction in the size of the grant, but that this reduction is intended to compensate for the $3 billion deal with the Pentagon. Sources told Reuters that the reduction is smaller than what had been earlier reported.
Intel didn’t respond to a request for comment.
This comes shortly after the government finalized TSMC’s $6.6 billion grant.
Intel reported a net loss of nearly $17 billion last quarter, alongside a 6% year-over-year revenue decline. Shares of Intel, down 50% for the year, lifted around 2% on Monday.
Nvidia releases new generative AI model
Source: Nvidia
Nvidia on Monday unveiled a new generative AI model nicknamed “Fugatto,” which it describes as being a “Swiss Army knife” for sound. It is the result of a year-long effort by a team of AI researchers at Nvidia to create a more dexterous generative model.
The details: First, Nvidia did not disclose what data it used to train the model, or whether it licensed that data or scraped it from the internet without the consent or knowledge of its original creators. In an accompanying paper, Nvidia said it plans to release the dataset for research purposes.
Nvidia also chose not to disclose the energy cost, and the resulting carbon footprint, of training and operating the model.
Nvidia did not return a request for comment on either point. It was reported in July that Nvidia, along with Apple and Anthropic, used a scraped YouTube dataset to train its AI models.
The model: The model, according to Nvidia, can generate “or transform any mix of music, voices and sounds described with prompts using any combination of text and audio files.”
It can isolate tracks, change the accent or emotion of a speaking or singing voice and produce new sounds.
Nvidia suggested that an “ad agency could apply Fugatto to quickly target an existing campaign for multiple regions or situations, applying different accents and emotions to voiceovers.”
Nvidia added that, unlike most generative AI models, which produce output based exclusively on their training data, this one “allows users to create soundscapes it’s never seen before.”
Again, though, since Nvidia is not sharing its training set, this is a claim to be taken with a grain of salt. We don’t know what it has and has not seen, so this “emergent” capability is anything but verified. Its research paper has not been peer-reviewed.
The full version of Fugatto is a 2.5 billion parameter model that was trained on a bank of 32 H100 GPUs. With it, Nvidia enters the semi-competitive realm of music generators, which includes Meta’s AudioCraft, OpenAI’s MuseNet, Suno and Udio.
A group of record labels earlier this year filed a massive copyright infringement lawsuit against Suno and Udio.
A few things. First, when we talk about copyright violations, this encapsulates the issue — content is being scraped without permission, consent (or payment), and that data is then being used to build models that will absolutely be used to supplant the original creators. Nvidia’s ad agency use case is a perfect example of this; the idea is to employ fewer people, not to unleash creativity.
As Ed Newton-Rex, the CEO of Fairly Trained, wrote last year: “Companies worth billions of dollars are, without permission, training generative AI models on creators’ works, which are then being used to create new content that in many cases can compete with the original works. I don’t see how this can be acceptable in a society that has set up the economics of the creative arts such that creators rely on copyright.”
Second, this flies in the face of Dr. Sasha Luccioni’s idea of digital sobriety: intentionality surrounding the application of generative AI and, more broadly, of carbon-intensive technologies. These systems are incredibly energy intensive — though the companies don’t want you to know how bad it is — and the payoff for all those emissions is a solution to a problem that doesn’t exist. Humans have been composing music from the very beginning; this doesn’t help that, it just quickens and cheapens the process.
To this, I would add that music has always been a significant part of my life. I’ve been composing music since I was six. It is a process that matters far more than the end result. It’s a vital human thing, and the notion that it might begin to go away, even a little, is not something I feel good about.
Which image is real?
🤔 Your thought process:
Selected Image 2 (Left):
“Image 1 has discrete vegetables, but Image 2’s vegetables are realistically ‘messed up.’ Also, the silverware in 2 is realistically intricate.”
Selected Image 1 (Right):
“Fooled me. I liked the identifiable pieces of veg and spice.”
💭 A poll before you go
Thanks for reading today’s edition of The Deep View!
We’ll see you in the next one.
Here’s your view on Thanksgiving:
A third of you said the sides are better than the turkey (to this crowd, I would suggest spatchcocking, dry brines and smoking), and 12% said it’s your favorite holiday.
20% are here for the football; 12% are here for the parade.
Something else:
“Glad to spend time with family and friends. It's been a tough year, and it's nice to have a break to spend with loved ones, and remember what the hard work is all for.”
Well said.
How do YOU feel about AI-generated music?