
⚙️ Meet Datambit, a new AI startup working on deepfake detection

Good morning. We made it to Friday.

Scientists are using AI to hunt for aliens.

We break it down below.

— Ian Krietzberg, Editor-in-Chief, The Deep View

In today’s newsletter:

AI for Good: The hunt for alien life 

Source: NASA

Humans were looking out at the stars, wondering what might lie beyond, long before we had long-range telescopes, computer programs and remote satellites.

But now we do have those technological means of searching the outer reaches of space. And in addition to telescopes and satellites, we have artificial intelligence, which is helping scientists in their endless search for extraterrestrial life.

The details: These scientists aren’t necessarily looking for bona fide little green guys in saucer-shaped rocketships. Right now, they’re just looking for anomalous long-range radio signals.

  • A team of scientists last year built an AI system that was trained to sift through data from radio telescopes to identify signals that couldn’t have been generated by natural astrophysical processes. 

  • When the scientists fed the system a previously studied dataset, it highlighted eight “signals of interest” that classical algorithms missed. 

The working hypothesis is that these signals, if confirmed, might indicate the presence of technology, implying the existence of an alien society. But the scientists weren’t able to re-detect any of the signals, saying that they were likely instances of unusual local radio interference. 
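
The underlying technique is a form of anomaly detection: learn what ordinary telescope data looks like, then surface whatever doesn’t fit. The team’s actual system was a trained deep learning model; the sketch below is only a generic, hypothetical illustration of that idea, using an off-the-shelf detector and invented stand-in data.

```python
# Illustrative sketch only -- NOT the researchers' actual pipeline.
# Shows the general shape of the problem: flag telescope observation
# windows whose statistics deviate from the bulk of the background.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 1,000 observation windows summarized as
# simple feature vectors (mean power, peak power, bandwidth-like spread).
background = rng.normal(loc=[0.0, 1.0, 0.5], scale=0.1, size=(1000, 3))

# A few injected "signals of interest" with unusually narrowband power.
injected = rng.normal(loc=[0.0, 3.0, 0.05], scale=0.05, size=(8, 3))

windows = np.vstack([background, injected])

# Fit an unsupervised anomaly detector; windows that don't fit the
# background distribution get a label of -1.
detector = IsolationForest(contamination=0.01, random_state=0)
labels = detector.fit_predict(windows)

candidates = np.where(labels == -1)[0]
print(f"{len(candidates)} candidate windows flagged for human review")
```

In practice, anything flagged this way still needs follow-up observation, which is exactly where the team’s eight candidate signals fell apart.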

Why it matters: Such technology enables astronomers to sift through a far greater quantity of data with a far finer-toothed comb. The researchers called it a “leading solution” in the search for intelligent alien life.

[Webinar] How to Operationalize AI

Join Camunda and a featured guest speaker from Forrester for a live webinar where you’ll learn how to put your teams on the fast track to orchestrating business processes using AI and machine learning. The AI experts will cover how to:

  • Eliminate siloed use of these tools

  • Account for security, data privacy, and governance

  • Embed generative AI in your automation strategy

Register now for the upcoming webinar and take your automation initiatives to the next level with AI and process orchestration. 

Language database shuts down due to genAI

Source: Unsplash

For years, an online database called Wordfreq has collected data about how frequently words are used across languages.

Wordfreq helped researchers track the way language evolves over time, and was a specific aid to natural language processing researchers, who are often trying to understand how common certain words are. 
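
For a sense of what the project offers: Wordfreq ships as a Python package, and querying it is a one-liner. A quick, illustrative example (the values in the comments are approximate and vary by package version):

```python
# Quick example using the wordfreq Python package (`pip install wordfreq`).
from wordfreq import word_frequency, zipf_frequency, top_n_list

# Raw frequency: the estimated proportion of words that are "cafe".
print(word_frequency("cafe", "en"))       # roughly 1e-5

# Zipf scale: log10 of frequency per billion words; ~7 is extremely
# common ("the"), ~1 is vanishingly rare.
print(zipf_frequency("the", "en"))        # roughly 7.7
print(zipf_frequency("frequency", "en"))  # roughly 4.4

# The most common words in a language -- useful for NLP stoplists.
print(top_n_list("en", 10))
```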

What happened: The project — last updated in 2021 — will no longer be updated. 

  • Its creator, Robyn Speer, said in a note on GitHub that reliable sources of data about human language use have effectively ceased to exist in the post-2021 world.

  • “The Web at large is full of slop generated by large language models, written by no one to communicate nothing. Including this slop in the data skews the word frequencies,” according to Speer.

Compounding the problem: in the frenzy of social media platforms to cash in on AI venture capital checks, data from platforms like Reddit and Twitter is now either inaccessible or simply too expensive to access.

“I don't want to work on anything that could be confused with generative AI, or that could benefit generative AI,” Speer wrote. “OpenAI and Google can collect their own damn data. I hope they have to pay a very high price for it, and I hope they're constantly cursing the mess that they made themselves.”

AI Startups get up to $350,000 in credits with Google Cloud

For startups, especially those in the deep tech and AI space, having a dependable cloud provider is absolutely vital to success.

Fortunately, Google Cloud exists. And it offers an experience — the Google for Startups Cloud Program — specifically designed to make sure startups succeed.

  • The program importantly offers eligible startups up to $200,000 in Google Cloud Credits over two years. For AI startups, that number is $350,000.

Beyond the additional cloud credits, eligible AI startups also get access to Google Cloud’s in-house AI experts, training and resources.

This includes webinars and live Q&As with Google Cloud AI product managers, engineers and developer advocates, in addition to insight into Google Cloud’s latest advances in AI.

Program applications are reviewed and approved based on the eligibility requirements here.

  • China’s Alibaba launches over 100 new open-source AI models, releases text-to-video generation tool (CNBC).

  • Google buys carbon removal credits from Brazil startup (Reuters).

  • A bottle of water per email: the hidden environmental costs of using AI chatbots (Washington Post).

If you want to get in front of an audience of 200,000+ developers, business leaders and tech enthusiasts, get in touch with us here.

  • Join us for an inside look at how Ada and OpenAI build trust in enterprise AI adoption for customer service, starting with minimizing risks such as hallucinations. Viewers will also be able to participate in a Q&A.

Government report: Social media companies ‘harvest and monetize’ Americans’ info

Source: Unsplash

A new report from the U.S. Federal Trade Commission (FTC) breaks down the many ways in which social media platforms have been gathering and using all manner of data about their users (and, it turns out, non-users as well).

The details: The 129-page report is the result of orders issued by the FTC in 2020 to nine of the largest social media and video streaming platforms in the country, including Meta, YouTube, Twitter, Reddit and ByteDance. 

  • In terms of algorithms and AI, the report found that most of the companies heavily rely on algorithms and automated systems to run their platforms, and that they feed these systems with an extensive quantity of personal user data. 

  • Users are not given a choice about the harvesting that occurs here, and are rarely even told it’s happening.

In addition to user-sourced information, platforms also collect tons of data about users and non-users from third-party data brokers and access to on-device data. 

The FTC said that legislation is “badly needed” to mitigate the harms of automated on-platform decision-making and to enshrine legitimate data privacy rights for U.S. citizens. 

“Social media and video streaming companies harvest an enormous amount of Americans’ personal data and monetize it to the tune of billions of dollars a year,” FTC Chair Lina Khan said. “While lucrative for the companies, these surveillance practices can endanger people’s privacy, threaten their freedoms and expose them to a host of harms.”

Meet Datambit, a new startup working on deepfake detection

Source: Created with AI by The Deep View

Deepfakes are one of those offshoots of generative AI technology that were fun and rather innocuous in the beginning. It didn’t take long for them to turn malicious. 

Deepfake tech is nothing new; it’s been on display in different environments for years, helping, at first, to bring back or de-age actors in movies. Then, in 2017, the same tech was leveraged to create fake pornography of celebrities, an issue that has since gotten worse. 

  • But back then, it took people — good or bad actors — hours, and sometimes days, to produce deepfakes that, even then, looked very obviously synthetic.

  • The advent of generative AI dramatically shifted that environment, introducing cheaply accessible natural-language tools that can create convincing, hyper-realistic deepfakes in seconds.

The implications of this are as real as they are obvious: the past year has been marked by reports of AI-generated deepfake images, videos, phone calls and Zoom calls that have served a variety of nefarious purposes, from targeted sexual harassment to electoral misinformation, fraud and identity theft.

Families have been targeted by scams that leverage deepfake audio over the phone to demand ransom payments or ask a loved one for bail money. Earlier this year, scammers stole $25 million from a company after posing as a lineup of the company’s leadership on a Zoom call and convincing a real employee to move money to a different account.

This is the problem Datambit is aiming to solve.

The details: Datambit, a nine-month-old British startup, has developed an AI-powered model called Genui that’s designed specifically for deepfake detection. 

  • The system employs machine learning algorithms to detect anomalies in video, audio and image content, identifying patterns that could indicate the presence of a deepfake or otherwise manipulated media.

  • It includes facial recognition algorithms, audio analysis, voice biometrics and audio forensics. Genui can analyze audio spectrograms to identify elements suggestive of a deepfake; for known individuals, the model can use voice biometrics to verify a speaker’s identity against previous vocal samples (a rough sketch of the spectrogram step follows below).
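
Datambit hasn’t published Genui’s internals, so the following is purely a hypothetical sketch of what the spectrogram-analysis step of a detector like this might look like: featurize clips as log-mel spectrograms (via librosa) and score them with a simple classifier. The data, features and model here are all invented for illustration.

```python
# Hypothetical sketch of spectrogram-based audio classification, in the
# spirit of the pipeline described above. Not Datambit's actual system.
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression

SR = 16_000  # sample rate in Hz

def mel_features(waveform: np.ndarray) -> np.ndarray:
    """Summarize a clip as its mean log-mel spectrum over time."""
    mel = librosa.feature.melspectrogram(y=waveform, sr=SR, n_mels=64)
    log_mel = librosa.power_to_db(mel)
    return log_mel.mean(axis=1)  # one 64-dim vector per clip

rng = np.random.default_rng(0)

def fake_clip() -> np.ndarray:
    # Stand-in for synthetic speech: a clean tone (unnaturally regular).
    t = np.linspace(0, 1.0, SR, endpoint=False)
    return np.sin(2 * np.pi * 220 * t).astype(np.float32)

def real_clip() -> np.ndarray:
    # Stand-in for real speech: the same tone plus broadband noise.
    return fake_clip() + 0.3 * rng.normal(size=SR).astype(np.float32)

# Build a tiny labeled dataset: 0 = real, 1 = fake.
X = np.array([mel_features(real_clip()) for _ in range(50)]
             + [mel_features(fake_clip()) for _ in range(50)])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression(max_iter=1000).fit(X, y)
print("P(fake):", clf.predict_proba([mel_features(fake_clip())])[0, 1])
```

A production system would of course use far richer features and models, but the overall shape (spectrogram in, manipulation probability out) matches the approach described above.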

Hari Natarajan, a member of Datambit’s advisory board, told me that right now, the company is in an early stage of beta testing; at the moment, the system is a bit closed-off, requiring early test adopters to either bulk-upload material through its API or directly upload material into its detection engine. 

He said that, “at the end of the day, this solution can go to pretty much anybody that’s out there.” But Datambit plans to focus on the financial services industry first, “because that looks like an early pain point.” 

The journey so far: In May, the company competed in the U.K.’s Deepfake Detection Challenge, earning accuracy scores of .96 for video detection and .99 for audio detection, with error rates of .03 and .05, respectively.

Based on its success in that challenge, the firm was recently selected to compete alongside nine other tech startups in the City of London’s AI Innovation Challenge, where each startup will be paired with a participating financial services firm over the course of seven weeks to trial its solution.

  • Natarajan told me the company is completely bootstrapped right now and plans to raise funds within the next few months. 

  • “Right now, the way we have been doing it is just keep our heads down, build the product,” Natarajan said. “Just get it out there, rather than wave (around) a PowerPoint presentation and try to go raise money, the thought was: ‘let's go build it, get some traction, and then go raise money so that we know that there is something viable behind it.’”

It’s a marked difference from many players in the AI startup space, where the reverse approach has become rather popular: raise money at crazy valuations first, figure out the business second. 

Which image is real?


🤔 Your thought process:

Selected Image 1 (Left):

  • “Slight color fringing from the castle’s sunlit edge indicated a real lens was in use… until AI learns to replicate that!”

Selected Image 2 (Right):

  • “Intuition of perception.”

💭 A poll before you go

Thanks for reading today’s edition of The Deep View!

We’ll see you in the next one.

Here’s your view on FHE and private computing:

A third of you said it’s absolutely necessary. A quarter said it would make sense for corporations, but not individuals. 15% said it’s not even worth exploring.

Something else:

  • “Is speed more important than security — I don’t think so, at least not on a personal level.”

Do you want a consumer-facing, reliable deepfake detection tool?
