The Deep View
⚙️ Filtering Out The Noise

Bot Buddy
October 23, 2023

Good morning. Ever find yourself wishing you could mute the crowd's cheers just to hear the thud of Khamzat Chimaev's fist meeting its target? Today we talk about AudioSep, the AI that zeroes in on specific sounds with the same precision that Chimaev uses to zero in on his opponent's weaknesses.

In today’s newsletter:

🗣 AudioSep: Revolutionizing How We Separate Sounds
🌐 From Simulated to Real: Nvidia and Meta's Quest for Practical AI
🎙️ Cutting the Noise: Tape It's New AI Algorithm
🇨🇳 Zhipu AI: China's Answer to the AI Race
🌎 Luzia: The Chatbot Connecting LatAm, Spain, and the U.S.
🗽 Breaking Ground: New York City Unveils Comprehensive AI Action Plan

NEWS

AudioSep: Changing How We Separate Sounds

Audio separation technology has been inching closer to human-like hearing capabilities, but most existing solutions have limitations. They either focus only on specific types of audio, like music or speech, or are not practical for real-world, real-time applications. Enter AudioSep, a groundbreaking AI model that promises to change this landscape dramatically. Developed as an enhancement of existing audio separation frameworks, AudioSep can isolate specific sounds from a noisy background based on natural language queries. Imagine being able to instruct your computer to “separate the sound of the violin from this orchestra piece,” and getting just that.

Next gen AI audio separation is here 🤯
AudioSep is a model that can separate audio events, musical instruments, and even enhance speech with natural language queries which makes this a versatile tool for different audio tasks.
audio-agi.github.io/Separate-Anyth…
— Dreaming Tulpa 🥓👑 (@dreamingtulpa)
8:08 AM • Oct 21, 2023

The underlying magic is in how AudioSep uses text-based queries to understand what sounds to focus on. Unlike traditional methods that rely on pre-set instructions, AudioSep can understand your request in plain English. It's almost like asking a skilled sound engineer to isolate a sound for you, except it's all automated and instantaneous. This feature opens the door to numerous applications ranging from audio editing to multimedia content creation, making it easier for both developers and everyday users.

What makes AudioSep stand out is its advanced architecture that combines two major components: a text encoder and a separation model. The text encoder understands your natural language query, while the separation model does the heavy lifting of isolating the requested sound. More importantly, it's trained on a wide range of audio-visual data, allowing it to work effectively across different types of sounds and environments. Think of it as a 'universal translator' but for separating sounds.
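To make the two-part design a little more concrete, here is a toy sketch in Python. It is not AudioSep's code: the real model uses learned neural networks, while this version fakes the text encoder with a keyword lookup and treats "audio" as three energy values. Every name in it (BAND_HINTS, encode_query, separate) is made up for illustration.

```python
# Toy illustration only. The real AudioSep uses learned neural networks;
# every name and number below is a hypothetical stand-in.

# Pretend "spectrogram": energy in three frequency bands for one frame.
BANDS = ["low", "mid", "high"]

# Stand-in for a learned text encoder: keyword -> weight per band.
BAND_HINTS = {
    "drum": [1.0, 0.2, 0.0],
    "violin": [0.0, 0.3, 1.0],
    "speech": [0.1, 1.0, 0.2],
}


def encode_query(query):
    """Map a natural language query to a conditioning vector."""
    vector = [0.0] * len(BANDS)
    for word in query.lower().split():
        hint = BAND_HINTS.get(word.strip(".,!?"))
        if hint:
            # Keep the strongest weight seen so far for each band.
            vector = [max(v, h) for v, h in zip(vector, hint)]
    return vector


def separate(mixture, condition):
    """Scale each band of the mixture by the query-derived weight."""
    return [energy * weight for energy, weight in zip(mixture, condition)]


mixture = [4.0, 2.0, 6.0]  # made-up energies for the mixed audio
condition = encode_query("Separate the violin from this piece")
isolated = separate(mixture, condition)
print(isolated)
```

The point is the division of labor: the query is turned into a vector first, and that vector then steers what the separation step keeps from the mix.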

AudioSep is a leap forward in the realm of audio separation. Its ability to understand natural language queries and its versatile training make it a model with promising applications in real-world scenarios. And you can even try it out yourself right here on Hugging Face.

SPONSOR THIS NEWSLETTER

The Deep View is currently one of the world’s fastest-growing newsletters, adding thousands of AI enthusiasts a week to our incredible family of over 130,000! Our readers work at top companies like Apple, Meta, OpenAI, Google, Microsoft, and many more.

If you want to share your company or product with fellow AI enthusiasts before we’re fully booked, reserve an ad slot here.

NEWS

Spinning Pens and Smart Homes: How Virtual Worlds are Shaping Practical AI

Meta and Nvidia, two tech giants, are using virtual worlds to accelerate the learning curve of artificial intelligence (AI). Imagine this: before a self-driving car hits the road, it practices in a digital simulation to understand all possible scenarios. In a similar fashion, Nvidia and Meta are creating virtual environments where AI agents can learn practical tasks more efficiently. Nvidia’s project, dubbed EUREKA, even employs advanced language models to streamline the learning process, making it more effective than traditional methods.

But why go virtual? The answer is simple: speed and complexity. Training AI agents in the real world is a slow, laborious process that can take days or even weeks. On the flip side, a virtual world allows these agents to perform thousands of tasks in a fraction of the time, honing their skills in a complex but controlled setting. This helps researchers and developers to iterate and improve AI capabilities far more rapidly. Take a look at Nvidia’s initial test of having AI spin a pen in its fingers.

Meta is also in the game with its Habitat 3.0—a sophisticated, simulated environment where robots can not only learn tasks but also interact with human avatars. This is crucial for training AI agents to operate in dynamic environments, like a home filled with people and pets. The end goal? Practical robots that can coexist and assist in our daily lives, whether that's cleaning up the living room or helping the elderly.

These advancements aren't confined to the labs of Meta and Nvidia; both companies are sharing their findings and tools with the broader research community. The upshot? We're on the brink of a new era where AI isn't just a buzzword but a tangible part of our everyday lives, thanks to accelerated learning environments that are making robots smarter, faster.

Too easy…

Here are your thoughts 👇

Image 1

Which of the two images is real?

Image 2

🎙️ Cutting the Noise: Tape It's New AI Algorithm

A startup called Tape It is revolutionizing audio recording for musicians by introducing an AI-powered noise reduction algorithm. Founded by Thomas Walther and Jan Nash in 2020, Tape It initially aimed to replace Apple’s discontinued Music Memos app with an iOS app that could automatically detect instruments and annotate recordings. The startup has now released a free web app featuring its new AI denoiser, which removes background noise to produce studio-quality sound. Tape It plans to license the technology to vendors and integrate it into its flagship app. The company, boasting around 10,000 monthly active users, is also exploring funding options.

🇨🇳 Zhipu AI: China's Answer to the AI Race

Zhipu AI, a foundation model developer in China, recently announced that it has raised $340 million in financing this year. The funding comes mainly from yuan-denominated funds, a break from the past trend of USD-based investments that reflects the growing tech divide caused by geopolitical tensions. The announcement comes as the U.S. imposes further restrictions on the export of Nvidia AI chips to China. Zhipu's latest round of investment saw contributions from major Chinese tech firms like Alibaba, Tencent, and Xiaomi, as well as venture capital firms like HongShan. The company has also open-sourced significant AI models.

🌎 Luzia: The Chatbot Connecting LatAm, Spain, and the U.S.

Spain-based startup Luzia is democratizing access to AI chatbots via WhatsApp, specifically targeting Spanish- and Portuguese-speaking markets. Founded in 2023, Luzia boasts over 17 million users and recently secured $10 million in Series A funding led by Khosla Ventures. The bot, which uses various AI models such as GPT-3.5 and GPT-4, is also set to expand into the U.S. market. With a focus on growth over immediate monetization, the company aims to introduce more people to the capabilities of AI-powered conversational agents.

🗽 Breaking Ground: New York City Unveils Comprehensive AI Action Plan

New York City has introduced a first-of-its-kind AI Action Plan aimed at safeguarding residents from AI-induced bias or discrimination. The plan outlines roughly 40 policy initiatives and includes the development of AI standards for city agencies. City council member Jennifer Gutiérrez introduced legislation for an Office of Algorithmic Data Integrity to oversee AI applications across the city. This office would serve as an ombudsman for AI, assessing systems for bias and fielding citizen complaints. While federal AI regulation remains stalled, NYC is positioning itself as a leader in governance and ethical implementation of AI technology.

KAI: ChatGPT straight in your iPhone's keyboard (link)
Noise GPT: A platform for censorship-free text-to-speech and voice cloning (link)
Quick ads: Create ads for all platforms and formats (link)
StyleAI: Create and optimize websites, search rankings, and launch Google ad campaigns (link)
Charpixel: Make data analysis simpler (link)
Auro: Quickly and easily record and organize your thoughts, ideas, and inspirations (link)
Botrush: An alternative to ChatGPT, offering an elegant, user-friendly interface and enhanced features (link)

Have cool resources or tools to share? Submit a tool or reach us by replying to this email (or DM us on Twitter).

Prompt: a tropical paradise island, in the Caribbean, realism, --ar 16:9

Prompt: a finger painting of a Bugatti, --ar 16:9

What’d you think of today’s email?

That's a wrap for now! We hope you enjoyed today’s newsletter :)

Should you have any captivating projects or concepts, don't hesitate to connect with us by replying to this email or dropping us a direct message on Twitter: @thedeepview

We appreciate your continued support, and we'll catch you in the next edition.

-Bot Buddy and The Deep View Team