What About the Data Labelers Themselves?

Earlier this month I discussed a class action lawsuit, filed in the United States by people who believe their privacy was violated when Kenyan data labelers viewed their video output.

And the data labelers themselves are not happy, according to a 404 Media article “AI is African Intelligence.”

Before I get to the Kenyans, let’s talk about the reality of AI. No, AI output is not 100% generated by computers alone. There is often human review.

In some cases human review is understandable. There was a recent brouhaha when it was publicly highlighted that when a Waymo vehicle runs into a problematic situation, Waymo calls upon a human reviewer to intervene. People’s anger about this is pointless: would they prefer that Waymo NOT call upon a human reviewer, and just let the car do whatever?

Back to Kenya, and to what the Data Labelers Association (DLA) reports that data labelers actually do.

“Every day, Michael Geoffrey Asia spent eight consecutive hours at his laptop in Kenya staring at porn, annotating what was happening in every frame for an AI data labeling company. When he was done with his shift, he started his second job as the human labor behind AI sex bots, sexting with real lonely people he suspected were in the United States. His boss was an algorithm that told him to flit in and out of different personas.”

I’ve previously seen reports about people in the U.S. reviewing shocking material for social media companies, but it’s a heck of a lot cheaper to outsource the work abroad.

That is, unless the U.S. Government insists on bringing data labeling work to the United States, in the same way that it wants to bring call center jobs back here.

I do offer one caution: there is a lot of data labeling work that is NOT pornographic. In the identity verification industry, data labelers review real and fake faces, real and fake documents, and the like to train AI models. Such work does not carry the emotional stress that comes from watching certain videos.

But it’s still hard work.

Data Labelers Gonna Label, and Class Action Lawyers Gonna Lawyer

On Wednesday, I described how Meta’s Kenyan data labelers ended up watching explicit videos from people who presumably didn’t know that smart glasses were recording their activity.

To no one’s surprise, class action lawyers are now involved.

“In the newly filed complaint, plaintiffs Gina Bartone of New Jersey and Mateo Canu of California, represented by the public interest-focused Clarkson Law Firm, allege that Meta violated privacy laws and engaged in false advertising.

“The complaint alleges that the Meta AI smart glasses are advertised using promises like “designed for privacy, controlled by you,” and “built for your privacy,” which might not lead customers to assume their glasses’ footage, including intimate moments, was being watched by overseas workers. The plaintiffs believed Meta’s marketing and said they saw no disclaimer or information that contradicted the advertised privacy protections.”

So what does Meta say?

“Clear, easy device and app settings help you manage your information, giving you control over what content you choose to share with others, and when.”

Except that according to Clarkson, people can’t opt out of the data labeling process.

This could get very revealing.

Data Labelers Gonna Label

Before diving in, I should note that this is not just a Ray-Ban Meta AI glasses issue.

This is an issue with ANY video feed that requires AI processing.

Because AI can’t do its job on its own.

To ensure that the AI is trained properly, an army of humans looks at the same data and uses data labeling to classify it.
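To make the "army of humans" idea concrete, here is a minimal sketch of one common labeling practice: several annotators review the same item, and the training pipeline keeps the majority label. This is an illustrative example, not the actual pipeline of any company mentioned here; all names and labels are hypothetical.

```python
from collections import Counter

def majority_label(annotations):
    """Return the label most annotators agreed on, plus its vote share.

    `annotations` is a list of label strings, one per human annotator
    who reviewed the same item (e.g. the same video frame).
    """
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(annotations)

# Hypothetical example: three annotators review one frame; two agree.
frame_annotations = ["explicit", "explicit", "safe"]
label, agreement = majority_label(frame_annotations)
print(label, round(agreement, 2))
```

Aggregating multiple annotators this way is one reason the work is so labor-intensive: each item may be seen by several people before it ever trains a model.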

We allow this when we sign those Terms of Service. And I personally believe it’s a good thing, since it helps correct errors from uncontrolled AI.

But Futurism notes the types of video feeds that the human data labelers have to label.

“I saw a video where a man puts the glasses on the bedside table and leaves the room,” one data annotator told the newspapers. “Shortly afterwards his wife comes in and changes her clothes.”

Grok.

Basically, we record more than we should. One example: a bank card.

But regardless of whether data labelers are present, assume that any recording device can record anything, and potentially distribute it.