The Imperfect Way to Enforce New York’s Child Data Protection Act

It’s often good to use emotion in your marketing.

For example, when biometric companies want to justify the use of their technology, they have found that it is very effective to position biometrics as a way to combat sex trafficking.

Similarly, moves to rein in social media are positioned as a way to preserve mental health.

By Marc NL at English Wikipedia – Transferred from en.wikipedia to Commons., Public Domain, https://commons.wikimedia.org/w/index.php?curid=2747237

Now that’s a not-so-pretty picture, but it effectively speaks to emotions.

“If poor vulnerable children are exposed to addictive, uncontrolled social media, YOUR child may end up in a straitjacket!”

In New York state, four government officials have declared that the ONLY way to preserve the mental health of underage social media users is via two bills, one of which is the “New York Child Data Protection Act.”

But enforcing ALL of the bill’s provisions poses a challenge…and there is only one way to meet it. An imperfect way: age estimation.

This post only briefly addresses the alleged mental health issues of social media before plunging into one of the two proposed bills to solve the problem. It then examines a potentially unenforceable part of the bill and a possible solution.

Does social media make children sick?

Letitia “Tish” James is the 67th Attorney General for the state of New York. From https://ag.ny.gov/about/meet-letitia-james

On October 11, a host of New York State government officials, led by New York State Attorney General Letitia James, jointly issued a release with the title “Attorney General James, Governor Hochul, Senator Gounardes, and Assemblymember Rozic Take Action to Protect Children Online.”

Because they want to protect the poor vulnerable children.

By Paolo Monti – Available in the BEIC digital library and uploaded in partnership with BEIC Foundation.The image comes from the Fondo Paolo Monti, owned by BEIC and located in the Civico Archivio Fotografico of Milan., CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=48057924

And because the major U.S. social media companies are headquartered in California. But I digress.

So why do they say that children need protection?

Recent research has shown devastating mental health effects associated with children and young adults’ social media use, including increased rates of depression, anxiety, suicidal ideation, and self-harm. The advent of dangerous, viral ‘challenges’ being promoted through social media has further endangered children and young adults.

From https://ag.ny.gov/child-online-safety

Of course one can also argue that social media is harmful to adults, but the New Yorkers aren’t going to go that far.

So they are just going to protect the poor vulnerable children.

This post isn’t going to deeply analyze one of the two bills the quartet has championed, but I will briefly mention that bill now.

  • The “Stop Addictive Feeds Exploitation (SAFE) for Kids Act” (S7694/A8148) defines “addictive feeds” as those that are arranged by a social media platform’s algorithm to maximize the platform’s use.
  • Those of us who are flat-out elderly vaguely recall that this replaced the former “chronological feed” in which the most recent content appeared first, and you had to scroll down to see that really cool post from two days ago. New York wants the chronological feed to be the default for social media users under 18.
  • The bill also proposes to limit under-18 access to social media without parental consent, especially between midnight and 6:00 am.
  • And those who love Illinois BIPA will be pleased to know that the bill allows parents (and their lawyers) to sue for damages.

Previous efforts to control underage use of social media have faced legal scrutiny, but since Attorney General James has sworn to uphold the U.S. Constitution, presumably she has thought about all this.

Enough about SAFE for Kids. Let’s look at the other bill.

The New York Child Data Protection Act

The second bill, and the one that concerns me, is the “New York Child Data Protection Act” (S7695/A8149). Here is how the quartet describes how this bill will protect the poor vulnerable children.

With few privacy protections in place for minors online, children are vulnerable to having their location and other personal data tracked and shared with third parties. To protect children’s privacy, the New York Child Data Protection Act will prohibit all online sites from collecting, using, sharing, or selling personal data of anyone under the age of 18 for the purposes of advertising, unless they receive informed consent or unless doing so is strictly necessary for the purpose of the website. For users under 13, this informed consent must come from a parent.

From https://ag.ny.gov/child-online-safety

And again, this bill provides a BIPA-like mechanism for parents or guardians (and their lawyers) to sue for damages.

But let’s dig into the details. With apologies to the New York State Assembly, I’m going to dig into the Senate version of the bill (S7695). Bear in mind that this bill could be amended after I post this, and some of the portions that I cite could change.

The “definitions” section of the bill includes the following:

“MINOR” SHALL MEAN A NATURAL PERSON UNDER THE AGE OF EIGHTEEN.

From https://www.nysenate.gov/legislation/bills/2023/S7695, § 899-EE, 2.

This only applies to natural persons. So the bots are safe, regardless of age.

Speaking of age, the age of 18 isn’t the only age referenced in the bill. Here’s a part of the “privacy protection by default” section:

§ 899-FF. PRIVACY PROTECTION BY DEFAULT.

1. EXCEPT AS PROVIDED FOR IN SUBDIVISION SIX OF THIS SECTION AND SECTION EIGHT HUNDRED NINETY-NINE-JJ OF THIS ARTICLE, AN OPERATOR SHALL NOT PROCESS, OR ALLOW A THIRD PARTY TO PROCESS, THE PERSONAL DATA OF A COVERED USER COLLECTED THROUGH THE USE OF A WEBSITE, ONLINE SERVICE, ONLINE APPLICATION, MOBILE APPLICATION, OR CONNECTED DEVICE UNLESS AND TO THE EXTENT:

(A) THE COVERED USER IS TWELVE YEARS OF AGE OR YOUNGER AND PROCESSING IS PERMITTED UNDER 15 U.S.C. § 6502 AND ITS IMPLEMENTING REGULATIONS; OR

(B) THE COVERED USER IS THIRTEEN YEARS OF AGE OR OLDER AND PROCESSING IS STRICTLY NECESSARY FOR AN ACTIVITY SET FORTH IN SUBDIVISION TWO OF THIS SECTION, OR INFORMED CONSENT HAS BEEN OBTAINED AS SET FORTH IN SUBDIVISION THREE OF THIS SECTION.

From https://www.nysenate.gov/legislation/bills/2023/S7695

So a lot of this bill depends upon whether a person is over or under the age of eighteen, or over or under the age of thirteen.

And that’s a problem.

How old are you?

To comply with the bill, an operator needs to know whether or not a person is under 18. And I don’t think the quartet will be satisfied with the way that alcohol websites determine whether someone is 21 years old.

This age verification method is…not that robust.

Attorney General James and the others would presumably prefer that the social media companies verify ages with a government-issued ID such as a state driver’s license, a state identification card, or a national passport. This is how most entities verify ages when they have to satisfy legal requirements.

For some people, even some minors, this is not that much of a problem. Anyone who wants to drive in New York State must have a driver’s license, and you have to be at least 16 years old to get a driver’s license. Admittedly some people in the city never bother to get a driver’s license, but at some point these people will probably get a state ID card.

You don’t need a driver’s license to ride the New York City subway, but if the guitarist wants to open a bank account for his cash it would help him prove his financial identity. By David Shankbone – Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=2639495

  • However, there are going to be some 17 year olds who don’t have a driver’s license, government ID or passport.
  • And some 16 year olds.
  • And once you look at younger people—15 year olds, 14 year olds, 13 year olds, 12 year olds—the chances of them having a government-issued identification document are much less.

What are these people supposed to do? Provide a birth certificate? And how will the social media companies know if the birth certificate is legitimate?

But there’s another way to determine ages—age estimation.

How old are you, part 2

As long-time readers of the Bredemarket blog know, I have struggled with the issue of age verification, especially for people who do not have driver’s licenses or other government identification. Age estimation in the absence of a government ID is still an inexact science, as even Yoti has stated.

Our technology is accurate for 6 to 12 year olds, with a mean absolute error (MAE) of 1.3 years, and of 1.4 years for 13 to 17 year olds. These are the two age ranges regulators focus upon to ensure that under 13s and 18s do not have access to age restricted goods and services.

From https://www.yoti.com/wp-content/uploads/Yoti-Age-Estimation-White-Paper-March-2023.pdf

So if a minor does not have a government ID, and the social media firm has to use age estimation to determine a minor’s age for purposes of the New York Child Data Protection Act, the following two scenarios are possible:

  • An 11 year old may be incorrectly allowed to give informed consent for purposes of the Act.
  • A 14 year old may be incorrectly denied the ability to give informed consent for purposes of the Act.
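
To see why both mistakes can happen, treat the estimator's output as the person's true age plus or minus an error on the order of the figures Yoti reports. The following is only a rough sketch under my own assumptions: the function name, the decision labels, and the idea of an "uncertain" buffer zone around the age of 13 are hypothetical, and appear neither in the bill nor in any Yoti product. Also bear in mind that a mean absolute error is an average, so individual estimates can miss by more than the buffer.

```python
# Hypothetical sketch: none of these names come from the bill or from Yoti.
CONSENT_AGE = 13       # threshold from § 899-FF(1)(A) and (B)
MAE_13_TO_17 = 1.4     # Yoti's reported mean absolute error, in years


def consent_decision(estimated_age: float, buffer: float = MAE_13_TO_17) -> str:
    """Classify an estimated age against the 13-year consent threshold.

    Estimates within `buffer` years of the threshold are flagged as
    uncertain instead of being forced to one side or the other.
    """
    if estimated_age >= CONSENT_AGE + buffer:
        return "may give informed consent"
    if estimated_age < CONSENT_AGE - buffer:
        return "parental consent required"
    return "uncertain: needs another verification method"


# An 11 year old estimated 2 years too high, with and without a buffer.
print(consent_decision(13.0, buffer=0.0))
print(consent_decision(13.0))

# A 14 year old estimated 1.5 years too low, with no buffer.
print(consent_decision(12.5, buffer=0.0))
```

With no buffer, the 11 year old is waved through and the 14 year old is turned away, which are exactly the two scenarios above. With a buffer, borderline estimates get pushed to some other verification method, which brings back the original problem of minors without a government ID.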

Is age estimation “good enough for government work”?

Safety vs. Privacy in Montana School Video Surveillance

At the highest level, debates regarding government and enterprise use of biometric technology boil down to a debate about whether to keep people safe, or whether to preserve individual privacy.

In the state of Montana, school safety is winning over school privacy—for now.

The one exception in Montana Senate Bill 397

Biometric Update links to a Helena Independent Record article on how Montana’s far-reaching biometric ban has one significant exception.

The state Legislature earlier this year passed a law barring state and local governments from continuous use of facial recognition technology, typically in the form of cameras capable of reading and collecting a person’s biometric data, like the identifiable features of their face and body. A bipartisan group of legislators went toe-to-toe with software companies and law enforcement in getting Senate Bill 397 over the finish line, contending public safety concerns raised by the technology’s supporters don’t overcome individual privacy rights. 

School districts, however, were specifically carved out of the definition of state and local governments to which the facial recognition technology law applies.

From the Helena Independent Record.

At a minimum, Montana school districts seek to abide by two existing federal laws when installing facial recognition and video surveillance systems.

Without many state-level privacy protection laws in place, school policies typically lean on the Children’s Online Privacy Protection Act (COPPA), a federal law requiring parental consent in order for websites to collect data on their children, or the Family Educational Rights and Privacy Act (FERPA), which protects the privacy of student education records. 

From the Helena Independent Record.

If a vendor doesn’t agree to abide by these laws, then the Montana School Board Association recommends that the school district not do business with the vendor.

Other vendors agree. Here is the statement of one vendor, Verkada (you’ll see them again later) on FERPA:

The Family Educational Rights and Privacy Act was passed by the US federal government to protect the privacy of students’ educational records. This law requires public schools and school districts to give families control over any personally identifiable information about the student.

Verkada provides educational organizations the tools they need to maintain FERPA compliance, such as face blurring for archived footage.

From https://www.verkada.com/security/#compliance

Simms High School’s use of the technology

How are the schools using these systems? In ways you may expect.

(The Sun River Valley School District’s) use of the technology is more focused on keeping people who shouldn’t be on school property away, he said, such as a parent who lost custody of their child.

(Simms) High School Principal Luke McKinley said it’s been more frequent to use the facial recognition technology during extra-curricular activities, when football fans get too rowdy for a high school sports event. 

From the Helena Independent Record.

Technology (in this case from Verkada) helps the Sun River Valley School District, especially in its rural setting. Back in 2022, it took law enforcement an estimated 45 minutes to respond to school incidents. The hope is that the technology could identify those who engaged in illegal activity, or at least deter it.

What about other school districts?

When I created my educational identity page, I included the four key words “When permitted by law.” While Montana school districts are currently permitted to use facial recognition and video surveillance, other school districts need to check their local laws before implementing such a system, and also need to ensure that they comply with federal laws such as COPPA and FERPA.

I may be, um, biased in my view, but as long as the school district (or law enforcement agency, or apartment building owner, or whoever) complies with all applicable laws, and implements the technology with a primary purpose of protecting people rather than spying on them, facial recognition is a far superior tool to protect people than manual recognition methods that rely on all-too-fallible human beings.

What if Machine Learning Models Can’t Get Generative AI Training Data?

An image of a neural network. By DancingPhilosopher – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=135594693

Machine learning models need training data to improve their accuracy—something I know from my many years in biometrics.

And it’s difficult to get that training data, something else I know from my many years in biometrics. Consider the acronyms GDPR, CPRA, and especially BIPA. It’s very hard to get data to train biometric algorithms, so they are trained on relatively limited data sets.

At the same time that biometric algorithm training data is limited, Kevin Indig believes that generative AI large language models are ALSO going to encounter limited access to training data. Actually, they already are.

The lawsuits have already begun

A few months ago, generative AI models like ChatGPT were going to solve all of humanity’s problems and allow us to lead lives of leisure as the bots did all our work for us. Or potentially the bots would get us all fired. Or something.

But then people began to ask HOW these large language models work…and where they get their training data.

Like the biometric models that simply grab images and associated data from the web without asking permission (you know the example that I’m talking about), LLMs are allegedly being trained on copyrighted content in violation of the law.

I am not a lawyer and cannot meaningfully discuss what is “fair use” and what is not, but suffice it to say that alleged victims are filing court cases.

Sarah Silverman et al and copyright infringement

Here’s one example from July:

Comedian and author Sarah Silverman, as well as authors Christopher Golden and Richard Kadrey — are suing OpenAI and Meta each in a US District Court over dual claims of copyright infringement.

The suits alleges, among other things, that OpenAI’s ChatGPT and Meta’s LLaMA were trained on illegally-acquired datasets containing their works, which they say were acquired from “shadow library” websites like Bibliotik, Library Genesis, Z-Library, and others, noting the books are “available in bulk via torrent systems.”

From https://www.theverge.com/2023/7/9/23788741/sarah-silverman-openai-meta-chatgpt-llama-copyright-infringement-chatbots-artificial-intelligence-ai

This could be a big mess, especially since copyright laws vary from country to country. This description of the copyright law implications of LLMs, for example, is focused upon United Kingdom law. Laws in other countries differ.

And now the technical blocks are beginning

Just today, Kevin Indig highlighted another issue that could limit LLM access to online training data.

Some sites are already blocking the LLM crawlers

Systems that get data from the web, such as Google, Bing, and (relevant to us) ChatGPT, use “crawlers” to gather the information from the web for their use. ChatGPT, for example, has its own crawler.

By Yintan at English Wikipedia, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=63631702

Guess what Indig found out about ChatGPT’s crawler?

An analysis of the top 1,000 sites on the web from Originality AI shows 12% already block Chat GPT’s crawler. (source)

From https://www.kevin-indig.com/most-sites-will-block-chat-gpt/

But that only includes the sites that blocked the crawler when Originality AI performed its analysis.
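
How does a site block a crawler? The usual mechanism is a robots.txt rule aimed at the crawler's user agent. Here is a minimal sketch, assuming that the block is expressed in robots.txt and that the crawler identifies itself as GPTBot (the user agent that OpenAI has published for its crawler; the sources quoted here don't name it). The sample robots.txt content, the example.com URL, and the is_blocked helper are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt for illustration: blocks GPTBot, allows everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(is_blocked(SAMPLE_ROBOTS_TXT, "GPTBot", "https://example.com/article"))
print(is_blocked(SAMPLE_ROBOTS_TXT, "Googlebot", "https://example.com/article"))
```

Note that robots.txt is a request, not a lock. It only works if the crawler chooses to honor it.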

More sites will block the LLM crawlers

Indig believes that in the future, the share of the top 1,000 sites that block ChatGPT’s crawler will rise significantly…to 84%. His belief is based on analyzing the business models for the sites that already block ChatGPT and assuming that other sites that use the same business models will also find it in their interest to block ChatGPT.

The business models that won’t block ChatGPT are assumed to include governments, universities, and search engines. Such sites are friendly to the sharing of information, and thus would have no reason to block ChatGPT or any other LLM crawler.

The business models that would block ChatGPT are assumed to include publishers, marketplaces, and many others. Entities using these business models are not just going to turn their content over to an LLM for free.

As Indig explains regarding the top two blocking business models:

By Karl Thomas Moore – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=58968347

For publishers, content is the product. Giving it away for free to generative AI means foregoing most if not all, ad revenue. Publishers remember the revenue drops caused by social media and modern search engines in the late 2,000s.

Marketplaces build their own AI assistants and don’t want competition.

From https://www.kevin-indig.com/most-sites-will-block-chat-gpt/

What does this mean for LLMs?

One possibility is that LLMs will run into the same training issues as biometric algorithms.

  • In biometrics, the same people that loudly exclaim that biometric algorithms are racist would be horrified at the purely technical solution that would solve all inaccuracy problems—let the biometric algorithms train on ALL available biometric data. In the activists’ view (and in the view of many), unrestricted access to biometric data for algorithmic training would be a privacy nightmare.
  • Similarly, those who complain that LLMs are woefully inaccurate would be horrified if the LLM accuracy problem were solved by a purely technical solution: let the algorithms train themselves on ALL available data.

Could LLMs buy training data?

Of course, there’s another solution to the problem: have the companies SELL their data to the LLMs.

By Nic McPhee from Morris, Minnesota, USA – London – 14-15 Dec 2007 – 034, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=10606179

In theory, this could provide the data holders with a nice revenue stream while allowing the LLMs to be extremely accurate. (Of course the users who actually contribute the data to the data holders would probably be shut out of any revenue, but them’s the breaks.)

But that’s only in theory. Based upon past experience with data holders, the people who want to use the data are probably not going to pay the data holders sufficiently.

Google and Meta to Canada: Drop dead / Mourir

By The original uploader was Illegitimate Barrister at Wikimedia Commons. The current SVG encoding is a rewrite performed by MapGrid. – This vector image is generated programmatically from geometry defined in File:Flag of Canada (construction sheet – leaf geometry).svg., Public Domain, https://commons.wikimedia.org/w/index.php?curid=32276527

Even today, Google and Meta (Facebook et al) are greeting Canada’s government-mandated Bill C-18 with resistance. Here’s what Google is saying:

Bill C-18 requires two companies (including Google) to pay for simply showing links to Canadian news publications, something that everyone else does for free. The unprecedented decision to put a price on links (a so-called “link tax”) breaks the way the web and search engines work, and exposes us to uncapped financial liability simply for facilitating access to news from Canadian publications….

As a result, we have informed them that we have made the difficult decision that, when the law takes effect, we will be removing links to Canadian news publications from our Search, News, and Discover products.

From https://blog.google/canada-news-en/#overview

But wait, it gets better:

In addition, we will no longer be able to operate Google News Showcase – our product experience and licensing program for news – in Canada.

From https://blog.google/canada-news-en/#overview

Google News Showcase is the program that gives money to news organizations in Canada. Meta has a similar program. Peter Menzies notes that these programs give tens of millions of (Canadian) dollars to news organizations, but that could end, despite government threats.

The federal and Quebec governments pulled their advertising spends, but those moves amount to less money than Meta will save by ending its $18 million in existing journalism funding. 

From https://thehub.ca/2023-09-15/peter-menzies-the-media-is-boycotting-meta-and-nobody-cares/

What’s next?

Bearing in mind that Big Tech is reluctant to give journalistic data holders money even when a government ORDERS that they do so…

…what is the likelihood that generative AI algorithm authors (including Big Tech companies like Google and Microsoft) will VOLUNTARILY pay funds to data holders for algorithm training?

If Kevin Indig is right, LLM training data will become extremely limited, adversely affecting the algorithms’ use.

AdvoLogix on “9 Ways to Use AI in the Workplace”

Bredemarket occasionally gets pitches from people who want to write for the blog, or to link to something they’ve already written.

Most of these pitches are crap.

But I just received an excellent and relevant pitch from a PR coordinator. I won’t reproduce his pitch, though, because I don’t want to get sued.

Which in this case is a very distinct possibility.

Who is AdvoLogix?

The PR coordinator represents a company called AdvoLogix, and wanted me to reshare something his company had written.

My first question (of course) is why AdvoLogix exists.

We build and deliver technologies that help legal teams collaborate and grow.

From https://www.advologix.com/about-advologix/

The company provides legal software, resident on Salesforce, that addresses several areas:

Now I am not a lawyer, but I’m sure these terms mean something to lawyers. If you’re looking for these types of solutions, check the links above.

Why did AdvoLogix pitch me?

The PR coordinator had observed Bredemarket’s previous posts on artificial intelligence (excluding the one that I wrote after his pitch), and thought that AdvoLogix’s recent blog post on the same topic would be of interest to Bredemarket’s readers.

What does AdvoLogix say about using AI in the workplace?

AdvoLogix’s post is clear in its intent. It is entitled “9 Ways to Use AI in the Workplace.” The introduction to the post explains AdvoLogix’s position on the use of artificial intelligence.

Rather than replacing human professionals, AI applications take a complementary role in the workplace and improve overall efficiency. Here are nine actionable ways to use artificial intelligence, no matter your industry.

From https://www.advologix.com/ai-applications-business/

I won’t list ALL nine of the ways—I want you to go read the post, after all. But let me highlight one of them—not the first one, but the eighth one.

Individual entrepreneurs can also benefit from AI-driven technologies. Entrepreneurship requires great financial and personal risk, especially when starting a new business. Entrepreneurs must often invest in essential resources and engage with potential customers to build a brand from scratch. With AI tools, entrepreneurs can greatly limit risk by improving their organization and efficiency. 

From https://www.advologix.com/ai-applications-business/

The AdvoLogix post then goes on to recommend specific ways that entrepreneurs can use artificial intelligence, including:

  • AI shopping
  • Use AI Chatbots for Customer Engagement

Regardless of how you feel about the use of AI in these areas, you should at least consider them as possible options.

Why did AdvoLogix write the post?

Obviously the company had a reason for writing the post, and for sharing the post with people like me (and like you).

AdvoLogix provides law firms, legal offices, and public agencies with advanced, cloud-based legal software solutions that address their actual needs. 

Thanks to AI tools like Caster, AdvoLogix can provide your office with effective automation of data entry, invoicing, and other essential but time-consuming processes. Contact AdvoLogix to request a free demo of the industry’s best AI tools for law offices like yours. 

From https://www.advologix.com/ai-applications-business/

So I’m not even going to provide a Bredemarket call to action, since AdvoLogix already provided its own. Good for AdvoLogix.

But what about Steven Schwartz?

The AdvoLogix post did not specifically reference Steven Schwartz, although the company stated that you should control the process yourself and not cede control to your artificial intelligence tool.

Something that Schwartz did not do.

Roberto Mata sued Avianca airlines for injuries he says he sustained from a serving cart while on the airline in 2019, claiming negligence by an employee. Steven Schwartz, an attorney with Levidow, Levidow & Oberman and licensed in New York for over three decades, handled Mata’s representation.

But at least six of the submitted cases by Schwartz as research for a brief “appear to be bogus judicial decisions with bogus quotes and bogus internal citations,” said Judge Kevin Castel of the Southern District of New York in an order….

In late April, Avianca’s lawyers from Condon & Forsyth penned a letter to Castel questioning the authenticity of the cases….

Among the purported cases: Varghese v. China South Airlines, Martinez v. Delta Airlines, Shaboon v. EgyptAir, Petersen v. Iran Air, Miller v. United Airlines, and Estate of Durden v. KLM Royal Dutch Airlines, all of which did not appear to exist to either the judge or defense, the filing said.

Schwartz, in an affidavit, said that he had never used ChatGPT as a legal research source prior to this case and, therefore, “was unaware of the possibility that its content could be false.” He accepted responsibility for not confirming the chatbot’s sources.

Schwartz is now facing a sanctions hearing on June 8.

From https://www.cnn.com/2023/05/27/business/chat-gpt-avianca-mata-lawyers/index.html

On that sanctions hearing date, Schwartz was mercilessly grilled by the judge. Later that month, the judge sanctioned and fined Schwartz and another lawyer.

In the end, you are responsible, not the tool you use.

By the way, Roberto Mata lost the case. Not because of his lawyers’ misuse of AI, but because the case was filed too late.

We Survived Gummy Fingers. We’re Surviving Facial Recognition Inaccuracy. We’ll Survive Voice Spoofing.

(Part of the biometric product marketing expert series)

Some of you are probably going to get into an automobile today.

Are you insane?

The National Highway Traffic Safety Administration has released its latest projections for traffic fatalities in 2022, estimating that 42,795 people died in motor vehicle traffic crashes.

From https://www.nhtsa.gov/press-releases/traffic-crash-death-estimates-2022

When you have tens of thousands of people dying, then the only conscionable response is to ban automobiles altogether. Any other action or inaction is completely irresponsible.

After all, just ask the experts who argue that because biometrics can be spoofed and is racist, we shouldn’t use biometrics at all.

I disagree with the calls to ban biometrics, and I’ll go through three “biometrics are bad” examples and say why banning biometrics is NOT justified.

  • Even some identity professionals may not know about the old “gummy fingers” story from 20+ years ago.
  • And yes, I know that I’ve talked about Gender Shades ad nauseam, but it bears repeating.
  • And voice deepfakes are always a good topic to discuss in our AI-obsessed world.

Example 1: Gummy fingers

My recent post “Why Apple Vision Pro Is a Technological Biometric Advance, but Not a Revolutionary Biometric Event” included the following sentence:

But the iris security was breached by a “dummy eye” just a month later, in the same way that gummy fingers and face masks have defeated other biometric technologies.

From https://bredemarket.com/2023/06/12/vision-pro-not-revolutionary-biometrics-event/

A biometrics industry colleague noticed the rhyming words “dummy” and “gummy” and wondered if the latter was a typo. It turns out it wasn’t.

To my knowledge, these gummy fingers do NOT have ridges. From https://www.candynation.com/gummy-fingers

Back in 2002, researcher Tsutomu Matsumoto used “gummy bears” gelatin to create a fake finger that fooled a fingerprint reader.

Back in 2002, this news WAS really “scary,” since it suggested that you could access a fingerprint reader-protected site with something that wasn’t a finger. Gelatin. A piece of metal. A photograph.

Except that the fingerprint reader world didn’t stand still after 2002, and the industry developed ways to detect spoofed fingers. Here’s a recent example of presentation attack detection (liveness detection) from TECH5:

TECH5 participated in the 2023 LivDet Non-contact Fingerprint competition to evaluate its latest NN-based fingerprint liveness detection algorithm and has achieved first and second ranks in the “Systems” category for both single- and four-fingerprint liveness detection algorithms respectively. Both submissions achieved the lowest error rates on bonafide (live) fingerprints. TECH5 achieved 100% accuracy in detecting complex spoof types such as Ecoflex, Playdoh, wood glue, and latex with its groundbreaking Neural Network model that is only 1.5MB in size, setting a new industry benchmark for both accuracy and efficiency.

From https://tech5.ai/tech5s-mobile-fingerprint-liveness-detection-technology-ranked-the-most-accurate-in-the-market/

TECH5 excelled in detecting fake fingers for “non-contact” reading where the fingers don’t even touch a surface such as an optical surface. That’s appreciably harder than detecting fake fingers that touch contact devices.

I should note that LivDet is an independent assessment. As I’ve said before, independent technology assessments provide some guidance on the accuracy and performance of technologies.

So gummy fingers and future threats can be addressed as they arrive.

But at least gummy fingers aren’t racist.

Example 2: Gender shades

In 2017-2018, the Algorithmic Justice League set out to answer this question:

How well do IBM, Microsoft, and Face++ AI services guess the gender of a face?

From http://gendershades.org/. Yes, that’s “http,” not “https.” But I digress.

Let’s stop right there for a moment and address two items before we continue. Trust me; it’s important.

  1. This study evaluated only three algorithms: one from IBM, one from Microsoft, and one from Face++. It did not evaluate the hundreds of other facial recognition algorithms that existed in 2018 when the study was released.
  2. The study focused on gender classification and race classification. Back in those primitive innocent days of 2018, the world assumed that you could look at a person and tell whether the person was male or female, or tell the race of a person. (The phrase “self-identity” had not yet become popular, despite the Rachel Dolezal episode which happened before the Gender Shades study). Most importantly, the study did not address identification of individuals at all.

However, the study did find something:

While the companies appear to have relatively high accuracy overall, there are notable differences in the error rates between different groups. Let’s explore.

All companies perform better on males than females with an 8.1% – 20.6% difference in error rates.

All companies perform better on lighter subjects as a whole than on darker subjects as a whole with an 11.8% – 19.2% difference in error rates.

When we analyze the results by intersectional subgroups – darker males, darker females, lighter males, lighter females – we see that all companies perform worst on darker females.

From http://gendershades.org/overview.html

What does this mean? It means that if you are using one of these three algorithms solely for the purpose of determining a person’s gender and race, some results are more accurate than others.
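For readers who want to see what a "difference in error rates" between groups means in practice, here is a minimal sketch. The group names, labels, and counts are made-up toy values and are NOT Gender Shades data, and `error_rates_by_group` is a hypothetical helper, not anything the study published.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    # Error rate per group = misclassified samples / all samples in that group
    return {group: errors[group] / totals[group] for group in totals}


# Made-up toy records: group_a is always classified correctly,
# group_b is misclassified 3 times out of 10
toy_records = (
    [("group_a", "M", "M")] * 10
    + [("group_b", "F", "F")] * 7
    + [("group_b", "F", "M")] * 3
)
rates = error_rates_by_group(toy_records)
gap = max(rates.values()) - min(rates.values())
print(rates, f"gap={gap:.1%}")
```

The "gap" is the kind of figure quoted above: one classifier, measured separately on each group, with the error rates subtracted. It says nothing about whether the classifier can identify a specific individual.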

Three algorithms do not predict hundreds of algorithms, and classification is not identification. If you’re interested in more information on the differences between classification and identification, see Bredemarket’s November 2021 submission to the Department of Homeland Security. (Excerpt here.)

And all the stories about people such as Robert Williams being wrongfully arrested based upon faulty facial recognition results have nothing to do with Gender Shades. I’ll address this briefly (for once):

  • In the United States, facial recognition identification results should only be used by the police as an investigative lead, and no one should be arrested solely on the basis of facial recognition. (The city of Detroit stated that Williams’ arrest resulted from “sloppy” detective work.)
  • If you are using facial recognition for criminal investigations, your people had better have forensic face training. (Then they would know, as Detroit investigators apparently didn’t know, that the quality of surveillance footage is important.)
  • If you’re going to ban computerized facial recognition (even when only used as an investigative lead, and even when only used by properly trained individuals), consider the alternative of human witness identification. Or witness misidentification. Roeling Adams, Reggie Cole, Jason Kindle, Adam Riojas, Timothy Atkins, Uriah Courtney, Jason Rivera, Vondell Lewis, Guy Miles, Luis Vargas, and Rafael Madrigal can tell you how inaccurate (and racist) human facial recognition can be. See my LinkedIn article “Don’t ban facial recognition.”

Obviously, facial recognition has been the subject of independent assessments, including continuous bias testing by the National Institute of Standards and Technology as part of its Face Recognition Vendor Test (FRVT), specifically within the 1:1 verification testing. And NIST has measured the identification bias of hundreds of algorithms, not just three.

In fact, people who were calling for facial recognition to be banned just a few years ago are now questioning the wisdom of those decisions.

But those days were quaint. Men were men, women were women, and artificial intelligence was science fiction.

The latter has certainly changed.

Example 3: Voice spoofs

Perhaps it’s an exaggeration to say that recent artificial intelligence advances will change the world. Perhaps it isn’t. Personally I’ve been concentrating on whether AI writing can adopt the correct tone of voice, but what if we take the words “tone of voice” literally? Let’s listen to President Richard Nixon:

From https://www.youtube.com/watch?v=2rkQn-43ixs

Richard Nixon never spoke those words in public, although it’s possible that he may have rehearsed William Safire’s speech, composed in case Apollo 11 had not resulted in one giant leap for mankind. As noted in the video, Nixon’s voice and appearance were spoofed using artificial intelligence to create a “deepfake.”

It’s one thing to alter the historical record. It’s another thing altogether when a fraudster spoofs YOUR voice and takes money out of YOUR bank account. By definition, you will take that personally.

In early 2020, a branch manager of a Japanese company in Hong Kong received a call from a man whose voice he recognized—the director of his parent business. The director had good news: the company was about to make an acquisition, so he needed to authorize some transfers to the tune of $35 million. A lawyer named Martin Zelner had been hired to coordinate the procedures and the branch manager could see in his inbox emails from the director and Zelner, confirming what money needed to move where. The manager, believing everything appeared legitimate, began making the transfers.

What he didn’t know was that he’d been duped as part of an elaborate swindle, one in which fraudsters had used “deep voice” technology to clone the director’s speech…

From https://www.forbes.com/sites/thomasbrewster/2021/10/14/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions/?sh=8e8417775591

Now I’ll grant that this is an example of human voice verification, which can be as inaccurate as the previously referenced human witness misidentification. But are computerized systems any better, and can they detect spoofed voices?

Well, in the same way that fingerprint readers worked to overcome gummy bears, voice readers are working to overcome deepfake voices. Here’s what one company, ID R&D, is doing to combat voice spoofing:

IDVoice Verified combines ID R&D’s core voice verification biometric engine, IDVoice, with our passive voice liveness detection, IDLive Voice, to create a high-performance solution for strong authentication, fraud prevention, and anti-spoofing verification.

Anti-spoofing verification technology is a critical component in voice biometric authentication for fraud prevention services. Before determining a match, IDVoice Verified ensures that the voice presented is not a recording.

From https://www.idrnd.ai/idvoice-verified-voice-biometrics-and-anti-spoofing/

This is only the beginning of the war against voice spoofing. Other companies will pioneer new advances that will tell the real voices from the fake ones.

As for independent testing:

A final thought

Yes, fraudsters can use advanced tools to do bad things.

But the people who battle fraudsters can also use advanced tools to defeat the fraudsters.

Take care of yourself, and each other.

Jerry Springer. By Justin Hoch, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=16673259

(Old Draft) The Temperamental Writer’s Three Suggestions for Using Generative AI

(This is the early version of a post. Here’s the final version.)

Don’t let that smiling face fool you.

Behind that smiling face beats the heart of an opinionated, crotchety, temperamental writer.

When you’ve been writing, writing, and writing for…um…many years, you tend to like to write things yourself, especially when you’re being paid to write.

So you can imagine…

  • how this temperamental writer would feel if someone came up and said, “Hey, I wrote this for you.”
  • how this temperamental writer would feel if someone came up and said, “Hey, I had ChatGPT write this for you.”

By Mindaugas Danys from Vilnius, Lithuania, Lithuania – scream and shout, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=44907034

Yeah, I’m temperamental.

So how do you think that I feel about ChatGPT, Bard, and other generative AI text writing tools?

Actually, I love them.

But the secret is in knowing how to use these tools.

Bredemarket’s 3 suggestions for using generative AI

So unless someone such as an employer or a consulting client requires that I do things differently, here are three ways that I use generative AI tools to assist me in my writing. You may want to consider these yourself.

Bredemarket Suggestion 1: A human should always write the first draft

The first rule that I follow is that I always write the first draft. I don’t send a prompt off and let a bot write the first draft for me.

Obviously pride of authorship comes into play. But there’s something else at work also.

When the bot writes draft 1

If I send a prompt to a generative AI application and instruct the application to write something, I can usually write the prompt and get a response back in less than a minute. Even with additional iterations, I can compose the final prompt in five minutes…and the draft is done!

And people will expect five-minute responses. I predicted it:

Now I consider myself capable of cranking out a draft relatively quickly, but even my fastest work takes a lot longer than five minutes to write.

“Who cares, John? No one is demanding a five minute turnaround.”

Not yet.

Because it was never possible before (unless you had proposal automation software, but even that couldn’t create NEW text).

What happens to us writers when a five-minute turnaround becomes the norm?

From https://www.linkedin.com/posts/jbredehoft_generativeai-activity-7065836499702861824-X8PO/

When I write draft 1

Now what happens when, instead of sending a few iterative prompts to a tool, I create the first draft the old-fashioned way? Well obviously it takes a lot longer than five minutes…even if I don’t “sleep on it.”

But the entire draft-writing process is also a lot more iterative and (sort of) collaborative. For example, take the “Bredemarket Suggestion 1” portion of the post that you’re reading right now.

  • It originally wasn’t “Bredemarket Suggestion 1.” It was “Bredemarket Rule 1,” but then I decided not to be so dictatorial with you, the reader. “Here’s what I do, and you MAY want to do it also.”
  • And I haven’t written this section, or the rest of the post, in a linear fashion. I started writing Suggestion 3 before I started the other 2 suggestions.
  • I’ve been jumping back and forth throughout the entire post, tweaking things here and there.
  • Just a few minutes ago (as I type this) I remembered that I had never fully addressed my two-week-old LinkedIn post regarding future expectations of five-minute turnarounds. I still haven’t fully addressed it, but I was able to repurpose the content here.

Now imagine that, instead of my doing all of that manually, I tried to feed all of these instructions into a prompt:

Write a blog post about 3 rules for using generative AI, in which the first rule is for a human to write the first draft, the second rule is to only feed small clumps of text to the tool for improvement, and the third rule is to preserve confidentiality. Except don’t call them rules, but instead use a nicer term. And don’t forget to work in the story about the person who wrote something in ChatGPT for me. Oh, and mention how ornery I am, but use three negative adjectives in place of ornery. Oh, and link to the Writing, Writing, Writing subsection of the Who I Am page on the Bredemarket website. And also cite the LinkedIn post I wrote about five minute responses; not sure when I wrote it, but find it!

What would happen if I fed that prompt to a generative AI tool?

You’ll find out at the end of this post.

Bredemarket Suggestion 2: Only feed little bits and pieces to the generative AI tool

The second rule that I follow is that after I write the first draft, I don’t dump the whole thing into a generative AI tool and request a rewrite of the entire block of text.

Instead I dump little bits and pieces into the tool.

  • Such as a paragraph. There are times when I may feed an entire paragraph to a tool, just to look at some alternative ways to say what I want to say.
  • Or a sentence. I want my key sentences to pop. I’ll use generative AI to polish them until they shine.
The “code snippet” (?) rewrite that created the sentence above, after I made a manual edit to the result.
  • Or the title. You can send blog post titles or email titles to generative AI for polishing. (Not my word.) But check them; HubSpot flagged one generated email title as “spammy.”
  • Or a single word. Yes, I know that there are online thesauruses that can take care of this. But you can ask the tool to come up with 10 or 100 suggestions.

Bredemarket Rule 3: Don’t share confidential information with the tool

Actually, this one isn’t a suggestion. It’s a rule.

Remember the “Hey, I had ChatGPT write this for you” example that I cited above? That actually happened to me. And I don’t know what the person fed as a prompt to ChatGPT, since I only saw the end result, a block of text that included information that was, at the time, confidential.

OK, not THAT confidential. By July_12,_2007_Baghdad_airstrike_unedited_part1.ogv: US Apache helicopter; derivative work: Wnt (talk) – July_12,_2007_Baghdad_airstrike_unedited_part1.ogv, Public Domain, https://commons.wikimedia.org/w/index.php?curid=9970435

Did my “helper” feed that confidential information to ChatGPT, allowing it to capture that information and store it in its systems?

If someone at Samsung did that, they’d get into real trouble.

Let me share an example.

  • Let’s say that Bredemarket is developing a new writing service, the “Bredemarket 288 Tweet Writing Service.” (I’m not. It’s not economically feasible. But bear with me.)
  • Now this is obviously an extremely valuable trade secret.
  • If someone scouring generative AI data found out about this offering and beat me to the punch, I would lose $45 billion. Or maybe less.

So how should I have a generative AI tool edit text about my new service?

  1. First, don’t use a Bredemarket account to submit the prompt. Even if I follow all the obfuscation steps that I am about to list below, the mere fact that the prompt was associated with a Bredemarket account links Bredemarket to the data.
  2. Second, if the word “Bredemarket” appears in the prompt, change it to something else. Like my standby WidgetCo, or maybe Wildebeest Inc.
  3. Third, obfuscate other parts of the prompt. Perhaps change 288 (a number closely associated with modern-day Twitter) to something else, and maybe change other things also.

To show how this is done, here’s something I just submitted to a well-known generative AI tool.

Convert the following to eye-catching text with 250 or fewer characters: “Wildebeest Inc. is pleased to announce that we are providing a brand new service for you, the Wildebeest 176 YouTube Video Creation Service. This service, priced at $50, allows me to collaborate with you in creating a YouTube video. Before I start work, I ask you a number of questions about your desired video, including where the video will be shown, the features you want to highlight, and the target audience. I prepare seventeen drafts during the process, and you can review each of them. The result is a YouTube video that meets your needs and brings your company a lack of profits.”

So my collaborative service provided…a code snippet!

Code snippet

**Wildebeest 176 YouTube Video Creation Service**

**Get a custom YouTube video for $50**

* Collaborate with a video creator
* 17 drafts for review
* Meet your needs and bring profits

**Sign up today!**

Use code with caution.

Now I can edit the text and insert the real words, adjusting them as needed, without revealing my super-secret project.
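If you do this often, the swap-and-restore part can be scripted. Here is a minimal sketch, assuming a simple one-to-one mapping between real terms and cover terms. The `swap_terms` helper and the mapping are hypothetical, and the script does nothing about the first step (which account submits the prompt).

```python
import re


def swap_terms(text, mapping):
    """Replace every key of mapping found in text with its value, in one pass."""
    if not mapping:
        return text
    # Longest keys first, so a longer phrase wins over a shorter overlapping one
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


# Hypothetical mapping from real terms to cover terms
cover = {
    "Bredemarket": "Wildebeest",
    "288": "176",
    "Tweet Writing": "YouTube Video Creation",
}
# Reverse mapping, used on the tool's answer afterwards
reveal = {placeholder: real for real, placeholder in cover.items()}

draft = "Announcing the Bredemarket 288 Tweet Writing Service."
masked = swap_terms(draft, cover)      # this is what gets pasted into the tool
restored = swap_terms(masked, reveal)  # this puts the real words back
print(masked)
print(restored)
```

The restore step only works if the tool leaves the cover terms intact. If it rewords them, the real words still have to be put back by hand.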

Would a temperamental writer really do all of this?

Yes, a temperamental writer would really do all of this.

Despite my (overly?) high opinion of my own written work vs. something a bot would write, in certain circumstances the bot can improve my writing.

And as long as I disclose to a potential Bredemarket client (or an employer) my three suggestions (whoops, two suggestions and one rule) for using generative AI, there should be no ethical or legal problem in using a tool. In a sense it’s like using online grammar correction tools, or a book like a dictionary or thesaurus.

Just fact-check everything. It’s important.

Roberto Mata sued Avianca airlines for injuries he says he sustained from a serving cart while on the airline in 2019, claiming negligence by an employee. Steven Schwartz, an attorney with Levidow, Levidow & Oberman and licensed in New York for over three decades, handled Mata’s representation.

But at least six of the submitted cases by Schwartz as research for a brief “appear to be bogus judicial decisions with bogus quotes and bogus internal citations,” said Judge Kevin Castel of the Southern District of New York in an order….

In late April, Avianca’s lawyers from Condon & Forsyth penned a letter to Castel questioning the authenticity of the cases….

Among the purported cases: Varghese v. China South Airlines, Martinez v. Delta Airlines, Shaboon v. EgyptAir, Petersen v. Iran Air, Miller v. United Airlines, and Estate of Durden v. KLM Royal Dutch Airlines, all of which did not appear to exist to either the judge or defense, the filing said.

Schwartz, in an affidavit, said that he had never used ChatGPT as a legal research source prior to this case and, therefore, “was unaware of the possibility that its content could be false.” He accepted responsibility for not confirming the chatbot’s sources.

Schwartz is now facing a sanctions hearing on June 8.

From https://www.cnn.com/2023/05/27/business/chat-gpt-avianca-mata-lawyers/index.html

Oops…

Postscript: what if I didn’t follow suggestion 1?

Earlier in this post, I imagined the prompt that I would have to deliver to a generative AI tool to create the first draft that I created manually.

Write a blog post about 3 rules for using generative AI, in which the first rule is for a human to write the first draft, the second rule is to only feed small clumps of text to the tool for improvement, and the third rule is to preserve confidentiality. Except don’t call them rules, but instead use a nicer term. And don’t forget to work in the story about the person who wrote something in ChatGPT for me. Oh, and mention how ornery I am, but use three negative adjectives in place of ornery. Oh, and link to the Writing, Writing, Writing subsection of the Who I Am page on the Bredemarket website. And also cite the LinkedIn post I wrote about five minute responses; not sure when I wrote it, but find it!

Are you wondering what I’m wondering?

So I fed the prompt above into Google Bard. The result begins with the next paragraph. I think my version is more entertaining, but the bot version is admittedly more succinct, if less accurate (especially the “ornery” paragraph). And after a lot of thought, I canned this version of the post entirely and came up with a more succinct, mostly manually-created version instead. You can read it here.

3 Guidelines for Using Generative AI

Generative AI is a powerful tool that can help you write better content, faster. However, it’s important to use it responsibly. Here are 3 guidelines to follow when using generative AI:

  1. Start with a human-written draft. Generative AI is best used to improve existing content, not to create it from scratch. Start by writing a draft of your content, then use generative AI to polish it.
  2. Feed small clumps of text to the tool. Generative AI can be inaccurate if you feed it too much text at once. Instead, feed it small clumps of text, and then review and edit the results.
  3. Prioritize confidentiality. Generative AI can be used to create confidential content. If you are using generative AI to create confidential content, be sure to take steps to protect it.

A Story About ChatGPT

I once had a client who asked me to write a blog post about a new product they were launching. I was really busy at the time, so I decided to use ChatGPT to help me write the post. I gave ChatGPT a brief overview of the product, and it generated a draft of the post in about 10 minutes. The post was well-written and informative, and it saved me a lot of time.

Ornery

I’m not sure if you would call me ornery, but I can be a bit of a perfectionist. I like to make sure that my work is the best it can be, and I’m not afraid to ask for help when I need it.

Writing, Writing, Writing

If you want to learn more about how I use generative AI to write content, check out the Writing, Writing, Writing subsection of my website. I also wrote a LinkedIn post about five minute responses. You can find it here: [link to LinkedIn post]

I hope these guidelines help you use generative AI responsibly.

From defund the police to fund the police. But what about technology?

There’s been a tactical reversal by some cities.

Defund the police, then re-fund the police

In November, the Portland, Oregon City Council unanimously voted to increase police funding, a little over a year after the city reduced police funding in the wake of the Black Lives Matter movement.

Now this month, Oakland, California has also decided to increase police funding after similarly defunding the police in the past. This vote was not unanimous, but the City Council was very much in favor of the measure.

By Taymaz Valley – https://www.flickr.com/photos/taymazvalley/49974424258, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=91013003

Not that Oakland has returned to the former status quo.

[Mayor Libby] Schaaf applauded the vote in a statement, saying that residents “spoke up for a comprehensive approach to public safety — one that includes prevention, intervention, and addressing crime’s root causes, as well as an adequately staffed police department.”

From https://www.police1.com/patrol-issues/articles/oakland-backtracks-votes-to-add-police-as-crimes-surge-MDirxJZAHV41wyxg/

So while Oakland doesn’t believe that police are the solution to EVERY problem, it feels that police are necessary as part of a comprehensive approach. The city had 78 homicides in 2019, 109 in 2020, and 129 so far in 2021. Granted that it’s difficult to compare year-over-year statistics in the COVID age, but clearly defunding the police hasn’t been a major success.

But if crime is to be addressed by a comprehensive approach including “prevention, intervention, … addressing crime’s root causes, … (and) an adequately staffed police department”…

…what about police technology?

What about police technology?

Portland and Oakland have a lot in common. Not only have they defunded and re-funded the police, but both have participated in the “facial recognition is evil” movement.

Oakland was the third U.S. city to limit the use of facial recognition, back in July 2019.

A city ordinance … prohibits the city of Oakland from “acquiring, obtaining, retaining, requesting, or accessing” facial recognition technology….

From https://www.vice.com/en/article/zmpaex/oakland-becomes-third-us-city-to-ban-facial-recognition-xz

Portland joined the movement later, in September 2020. But when it did, it made Oakland and other cities look like havens of right-wing totalitarianism.

The Portland City Council has passed the toughest facial recognition ban in the US, blocking both public and private use of the technology. Other cities such as Boston, San Francisco, and Oakland have passed laws barring public institutions from using facial recognition, but Portland is the first to prohibit private use.

From https://www.theverge.com/2020/9/9/21429960/portland-passes-strongest-facial-recognition-ban-us-public-private-technology

The Mayor of Portland, Ore. Ted Wheeler. By Naval Surface Warriors – 180421-N-UK248-023, Public Domain, https://commons.wikimedia.org/w/index.php?curid=91766933

Mayor Ted Wheeler noted, “Portlanders should never be in fear of having their right of privacy be exploited by either their government or by a private institution.”

Coincidentally, I was talking to someone this afternoon about some of the marketing work that I performed in 2015 for then-MorphoTrak’s video analytics offering. The market analysis included both government customers (some with acronyms, some without) and potential private customers such as large retail chains.

In 2015, we hadn’t yet seen the movements that would result in dampening both market segments in cities like Portland. (Perpetual Lineup didn’t appear until 2016, while Gender Shades didn’t appear until 2018.)

Flash – ah ah, robber of the universe

But there’s something else that I didn’t imagine in 2015, and that’s the new rage that’s sweeping the nation.

Flash!

By Dynamite Entertainment, Fair use, https://en.wikipedia.org/w/index.php?curid=57669050

Normally I add the music to the end of the post, but I stuck it in the middle this time as a camp break before this post suddenly gets really serious. From https://www.youtube.com/watch?v=LfmrHTdXgK4

Specifically, flash mobs. And not the fun kind, but the “flash rob” kind.

District Attorney Chesa Boudin, who is facing a recall election in June, called this weekend’s brazen robberies “absolutely unacceptable” and was preparing tough charges against those arrested during the criminal bedlam in Union Square….

Boudin said his office was eagerly awaiting more arrests and plans to announce felony charges on Tuesday. He said 25 individuals are still at large in connection with the Union Square burglaries on Friday night….

“We know that when it comes to property crime in particular, sadly San Francisco police are spread thin,” said Boudin. “They’re not able to respond to every single 911 call, they’re only making arrests at about 3% of reported thefts.”

From https://sanfrancisco.cbslocal.com/2021/11/23/smash-and-grab-embattled-san-francisco-district-attorney-chesa-boudin-prosecution/

So there are no arrests in 97% of reported thefts in San Francisco.

To be honest, this is not a “new” rage that is sweeping the nation.

In fact, “flash robs” were occurring as early as 2012 in places like…Portland, Oregon.

If only there were a technology that could recognize flash rob participants and other thieves even when the police WEREN’T present.

A technology that is continuously tested by the U.S. government for accuracy, demographic effects (see this PDF and the individual “report cards” from the 1:1 tests), and other factors.

Does anyone know of any technology that would fill this need?

Perhaps Oakland and Portland could adopt it.

The dangers of removing facial recognition and artificial intelligence from DHS solutions (DHS ICR part four)

And here’s the fourth and final part of my repurposing exercise. See parts one, two, and three if you missed them.

This post is adapted from Bredemarket’s November 10, 2021 submitted comments on DHS-2021-0015-0005, Information Collection Request, Public Perceptions of Emerging Technology. As I concluded my request, I stated the following.

Of course, even the best efforts of the Department of Homeland Security (DHS) will not satisfy some members of the public. I anticipate that many of the respondents to this ICR will question the need to use biometrics to identify individuals, or even the need to identify individuals at all, believing that the societal costs outweigh the benefits.

By Banksy – One Nation Under CCTV, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=3890275

But before undertaking such drastic action, the consequences of following these alternative paths must be considered.

Taking an example outside of the non-criminal travel interests of DHS, some people prefer to use human eyewitness identification rather than computerized facial recognition.

By Zhe Wang, Paul C. Quinn, James W. Tanaka, Xiaoyang Yu, Yu-Hao P. Sun, Jiangang Liu, Olivier Pascalis, Liezhong Ge and Kang Lee – https://www.frontiersin.org/articles/10.3389/fpsyg.2015.00559/full, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=96233011

However, eyewitness identification itself has clear issues of bias. The Innocence Project has documented many cases in which eyewitness (mis)identification has resulted in wrongful criminal convictions which were later overturned by biometric evidence.

Archie Williams moments after his exoneration on March 21, 2019. Photo by Innocence Project New Orleans. From https://innocenceproject.org/fingerprint-database-match-establishes-archie-williams-innocence/

Mistaken eyewitness identifications contributed to approximately 69% of the more than 375 wrongful convictions in the United States overturned by post-conviction DNA evidence.

Inaccurate eyewitness identifications can confound investigations from the earliest stages. Critical time is lost while police are distracted from the real perpetrator, focusing instead on building the case against an innocent person.

Despite solid and growing proof of the inaccuracy of traditional eyewitness ID procedures – and the availability of simple measures to reform them – traditional eyewitness identifications remain among the most commonly used and compelling evidence brought against criminal defendants.

Innocence Project, Eyewitness Identification Reform, https://innocenceproject.org/eyewitness-identification-reform/

For more information on eyewitness misidentification, see my November 24, 2020 post on Archie Williams (pictured above) and Uriah Courtney.

Do we really want to dump computerized artificial intelligence and facial recognition, only to end up with manual identification processes that are proven to be even worse?

How the “CSI effect” can obscure the limited role of DNA-based investigative leads

(Part of the biometric product marketing expert series)

People have been talking about the “CSI effect” for decades.

In short, the “CSI effect” is characterized as the common impression that forensic technologies can solve crimes (and must be used to solve crimes) in less than an hour, or within the time of a one-hour television show.

When taken to its extreme, juries may ask why the law enforcement agency didn’t use advanced technological tools to solve that jaywalking case.

Advanced technological tools like DNA, which has been commonly perceived to be the tool that can solve every single crime.

Well, that and video, because video is powerful enough to secure a conviction. But that’s another story.

Can DNA result in an arrest in a Denver homicide case?

A case in point is this story from KDVR entitled “DNA in murder case sits in Denver crime lab for 11 months.”

This is a simple statement of fact, and is not that surprising a statement of fact. Many crime labs are inundated with backlogs of DNA evidence and other forensic evidence that has yet to be tested. And these backlogs ARE creating difficulties in solving crimes such as rapes.

But when you read the article itself, the simple statement of fact is painted as an abdication of responsibility on the part of law enforcement.

A father is making an emotional plea and putting up $25,000 of his own money to help find his son’s killer.

He is also asking the Problem Solvers to look into the time it has taken for DNA evidence to be tested in this case and others.

Tom O’Keefe said it’s taking too long to get answers and justice.

From this and other statements in the article, a picture emerges of an unsolved crime that can only be solved by the magical tool of DNA. If DNA is applied to this, just like they do on TV, arrests will be made and the killer will be convicted.

So why is it taking so long to do this?

Why is justice not being served?

KDVR is apparently not run by impassioned activists, but by journalists. And it is important from a journalistic perspective to get all sides of the story. Therefore, KDVR contacted the Denver Police Department for its side of the story.

The Denver Police Department has identified all parties involved, and the investigation shows multiple handguns were fired during this incident. While this complex case remains open, which limits details we can provide, we can verify that a significant amount of forensic work has been completed, but some remains. Investigators believe the pending forensic analysis can potentially support a weapon-related charge but will not further the ongoing homicide investigation.

OK, let’s grant that they’re not trying to identify an unknown assailant, since “all parties involved” are known.

But once that DNA is tested, isn’t that going to be the magic tool that provides the police with probable cause to arrest the killer?

Um, no.

Even IF the DNA evidence DOES happen to show a significant probability that an identifiable person committed the homicide, that in itself is not sufficient reason to arrest someone.

Why not?

Because you can’t arrest someone on DNA evidence alone.

DNA evidence can provide an investigative lead, but it has to be corroborated with other evidence in order to secure an arrest and a conviction. (Don’t forget that the evidence has to result in a conviction, and in most of the United States that requires that the evidence show beyond a reasonable doubt that the person committed the crime.)

Why was a serial killer in three European countries never brought to justice, despite overwhelming DNA evidence?

Reasonable schmeasonable.

If DNA ties someone to a crime, then the person committed the crime, right?

Let’s look at the story of a serial killer who terrorized Europe for over a decade, even though ample DNA evidence was found at each of the murder scenes, beginning with this one:

In 1993, a 62-year-old woman was found dead in her house in the town of Idar-Oberstein, strangled by wire taken from a bouquet of flowers discovered near her body.

Nobody had any information on what might have happened to Lieselotte Schlenger. No witnesses, no suspects, no signs of suspicious activity (except for the fact that she’d been strangled to death with a piece of wire, of course). But on a bright teacup near Schlenger, the police found DNA, the only clue to surface at all.

The case went cold, given that the only lead was the DNA of an unknown woman, and there was no match. Yet.

Eight years later, in 2001, there was a match when the same woman’s DNA was found at a murder scene of a strangulation victim in Freiburg, Germany. Police now knew that they were dealing with a serial killer.

But this time, the woman didn’t wait another eight years to strike again.

Five months after the second murder scene, her DNA showed up on a discarded heroin syringe, after a 7-year-old had stepped on it in a playground in Gerolstein. A few weeks later it showed up on an abandoned cookie in a burgled caravan near Bad Kreuznach, like she’d deliberately spat out a Jammy Dodger as a calling card. It was found in a break-in in an office in Dietzenbach, in an abandoned stolen car in Heilbronn, and on two beer bottles and a glass of wine in a burgled bar in Karlsruhe, like she’d robbed the place but stuck around for a few cheeky pints.

And her activities were not confined to Germany.

Over the apparent crime spree, her DNA was sprayed across an impressive 40 crime scenes in Austria, southern Germany, and France, including robberies, armed robberies, and murders.

In 2009, the case took an even more bizarre turn.

Police in France had discovered the burned body of a man, believed to be from an asylum seeker who went missing in 2002. During his application, the man had submitted fingerprints, which the police used to try and confirm his identity. Only, once again, they found the DNA of the phantom.

“Obviously that was impossible, as the asylum seeker was a man and the Phantom’s DNA belonged to a woman,” a spokesperson for the Saarbrücken public prosecutor’s office told Spiegel Online in 2009.

But how could this be?

DNA evidence had tied the woman, or man, or whatever, to six murders and numerous other crimes. There was plenty of evidence to identify the criminal.

What went wrong?

Well, in 2009 police finally figured out how DNA evidence had ended up at all of these crime scenes in three countries.

The man’s death led to an explanation of the case: there was no serial killer, and the DNA could be traced to a woman working in a packing center specializing in medical supplies. It was all down to DNA contamination.

Well, couldn’t that packing woman be convicted of the serial murders and other crimes, based upon the DNA evidence?

No, because there was no other evidence linking the woman to the crimes, and there was certainly “reasonable doubt” (or the European criminal justice equivalent) that the woman was also the dead male asylum seeker.

This is why DNA is only an investigative lead, and not conclusive evidence in and of itself.

But the Innocence Project always believes that DNA is authoritative evidence, right?

Even those who champion the use of DNA admit this.

If you look through the files of people exonerated by the Innocence Project, you find a common thread in many of them.

Much of the evidence gathered before the suspect’s original conviction indicated that the suspect was NOT the person who committed the crime. Maybe the family members testified that the suspect was at home the entire time and couldn’t have committed the crime in question. Or maybe the suspect was in another city.

However, some piece of evidence was so powerful that the person was convicted anyway. Perhaps it was eyewitness testimony, or perhaps something else, but in the end the suspect was convicted.

Eventually the Innocence Project got involved, and subsequent DNA testing indicated that the suspect was NOT the person who committed the crime.

This in and of itself didn’t PROVE that the person was innocent, but the DNA test aligned with much of the other evidence that had previously been collected. It was enough to cast a reasonable doubt on the conviction, allowing the improperly convicted suspect to go free.

But there are some cases in which the Innocence Project says that even DNA evidence is not to be trusted.

Negligence in the Baltimore Police Department’s crime lab tainted DNA analysis in an unknown number of criminal cases for seven years and raises serious questions about other forensic work in the lab, the Innocence Project said today in a formal allegation that the state is legally required to investigate.

DNA contamination, the same thing that caused the issues in Europe, also caused issues in Baltimore.

And there may be other explanations for how a person’s DNA ended up at a crime scene. Perhaps a police officer was careless and left his or her DNA at a crime scene. Perhaps someone was at a crime scene and left DNA evidence, even though that person had nothing to do with the crime.

In short, a high probability DNA match, in and of itself, proves nothing.

Investigative leads and reasonable doubt are very important considerations, even if they don’t fit into a one-hour TV show script.

Investigative leads and DNA booking stations

(Part of the biometric product marketing expert series)

A July Bredemarket post on Facebook has garnered some attention in September.

I wanted to answer some questions about rapid DNA use in a booking station, how (and when) DNA is used in booking (arrests), what an “investigative lead” is, and whether acquiring DNA at booking is Constitutional.

(TL;DR on the last question is “yes,” per Maryland v. King.)

Are rapid DNA booking stations a Big Brother plot?

The post in question was a Facebook post to the Bredemarket Identity Firm Services Facebook group. I posted this way back in July, when Thermo Fisher Scientific became the second of two rapid DNA vendors (ANDE is the other) whose system was approved by the U.S. Federal Bureau of Investigation (FBI) for use as a law enforcement booking station.

When I shared this on Facebook, I received some concerned comments:

“Big brother total control”

“Is this Constitutional??? Will the results of this test hold up in courtrooms???”

I’ll address the second question later: not just in regard to rapid DNA, but to DNA in general. At this point, however, I will go ahead and say that the use of rapid DNA in booking was authorized legislatively by the Rapid DNA Act of 2017. This was followed by over three years of procedural stuff until rapid DNA booking station use was authorized this year.

To accurately state what “rapid DNA booking station use” actually means, let me refer to the FBI’s language, starting with the purpose:

The FBI Laboratory Division has been working with the FBI Criminal Justice Information Services (CJIS) Division and the CJIS Advisory Policy Board (CJIS APB) Rapid DNA Task Force to plan the effective integration of Rapid DNA into the booking station process.

By way of definition, a “booking station” is a computer that processes individuals who are “booked,” or arrested. The FBI’s plan was that (when authorized by federal, state, or local law) when an arrested individual’s fingerprints were captured, the individual’s DNA would be captured at the same time. (Again, only when authorized.)

The use of the term “reference sample buccal (cheek) swab” is intentional. The FBI’s current development and validation efforts have been focused on the DNA samples obtained from known individuals (e.g., persons under arrest). Because known reference samples are taken directly from the individual, they contain sufficient amounts of DNA, and there are no mixed DNA profiles that would require a scientist to interpret them. For purposes of uploading or searching CODIS, Rapid DNA systems are not authorized for use on crime scene samples.

“CODIS,” by the way, is the Combined DNA Index System, a combination of federal, state, and local systems.

“Rapid DNA” is an accelerated, automated DNA method that can process DNA samples in less than two hours, as opposed to the more traditional DNA processes that can take a lot longer.

The FBI is NOT ready to use rapid DNA to solve crimes, although some local police agencies have chosen to do so. And until February of this year, the FBI was not ready to use rapid DNA in the booking process either.

So what has been authorized?

The Bureau recognizes that National DNA Index System (NDIS) approval of the Rapid DNA Booking Systems and training of law enforcement personnel using the approved systems are integral to ensuring that Rapid DNA is used in a manner that maintains the quality and integrity of CODIS and NDIS.

Rapid DNA Booking System(s) approved for use at NDIS by a law enforcement booking station are listed below.

ANDE 6C Series G (effective February 1, 2021)

RapidHIT™ ID DNA Booking System v1.0 (effective July 1, 2021) 

If you read the FBI rapid DNA page, you can find links to a number of forensic, security, and other standards that have to be followed when using rapid DNA in a booking environment.

But those aren’t the only restrictions on rapid DNA use.

Can ANY law enforcement agency use rapid DNA in booking?

Um, no.

According to the National Conference of State Legislatures (2013; see PDF), not all states authorize the taking of DNA after an arrest. As of 2013, 20 states did NOT allow the taking of DNA from individuals who had been arrested but not convicted. And of the 30 remaining states, some (such as Connecticut) only allowed taking of DNA for “serious felonies,” some (such as California) for all felonies, and various mixtures in between. Oklahoma, for example, only allowed taking of DNA for “aliens unlawfully present under federal immigration law.”

Now, of course, a rogue police officer could take your DNA when not legally authorized to do so. Then again, a rogue restaurant employee could put laxatives in your food; that doesn’t mean we outlaw laxatives.
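
For the technically inclined, the state-by-state patchwork described above can be sketched as a simple lookup. This is only an illustration: the table holds just two of the 2013 examples named above (Connecticut and California), the names Charge, ARREST_DNA_RULES_2013, and may_collect_dna_at_arrest are hypothetical, and real statutes are more detailed and may have changed since 2013. Oklahoma is left out because its 2013 rule depended on immigration status rather than on the charge.

```python
from enum import Enum


class Charge(Enum):
    MISDEMEANOR = "misdemeanor"
    FELONY = "felony"
    SERIOUS_FELONY = "serious felony"


# Hypothetical rule table, filled in only from the 2013 examples in the text.
# It is NOT a complete or current statement of any state's law.
ARREST_DNA_RULES_2013 = {
    "Connecticut": {Charge.SERIOUS_FELONY},                # "serious felonies" only
    "California": {Charge.FELONY, Charge.SERIOUS_FELONY},  # all felonies
}


def may_collect_dna_at_arrest(state: str, charge: Charge) -> bool:
    """Return True only when the table explicitly authorizes collection.

    States missing from the table default to False (deny unless known).
    """
    allowed_charges = ARREST_DNA_RULES_2013.get(state, set())
    return charge in allowed_charges
```

The deny-by-default choice is deliberate in this sketch: a state that is missing from the table is treated as not authorizing collection, rather than guessing.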

An “investigative lead”

So let’s say that you’re arrested for a crime, and your state allows the taking of DNA for your crime at arrest, and your local law enforcement agency has a rapid DNA instrument.

Now let’s assume that your DNA is searched against a DNA database of unsolved crimes, and your DNA matches a sample from another crime. What happens next?

If there is a match, police will likely want to take a closer look.

Wait a minute. There’s a DNA match! Doesn’t that mean that the police can swoop in and arrest the individual, and the individual is immediately convicted?

Um, no. Stop trusting your TV.

It takes more than DNA to convict a person of a crime.

While DNA can provide an investigative lead, DNA in and of itself is not sufficient to convict an individual. The DNA evidence usually has to be supported by additional evidence.

Especially since there may be other explanations of how the DNA got there.

In 2011, Adam Scott’s DNA matched with a sperm sample taken from a rape victim in Manchester—a city Scott, who lived more than 200 miles away, had never visited. Non-DNA evidence subsequently cleared Scott. The mixup was due to a careless mistake in the lab, in which a plate used to analyze Scott’s DNA from a minor incident was accidentally reused in the rape case.

Then there’s the uncomfortable and inconvenient truth that any of us could have DNA present at a crime scene—even if we were never there. Moreover, DNA recovered at a crime scene could have been deposited there at a time other than when the crime took place. Someone could have visited beforehand or stumbled upon the scene afterward. Alternatively, their DNA could have arrived via a process called secondary transfer, where their DNA was transferred to someone else, who carried it to the scene.

But there is a DNA case that was (originally) puzzling. Actually, a whole bunch of DNA cases.

There is an interesting case, known as the Phantom of Heilbronn, that dates from 1993 in Austria, France and Germany. From that year the DNA of an unknown female was detected at crime scenes in those countries, including at six murder scenes, one of the victims being a female police officer from Heilbronn, Germany. Between 1993 and March 2009 the woman’s DNA was detected at 40 crime scenes which ranged from murder to burglaries and robberies. The DNA was found on items ranging from a biscuit to a heroin syringe to a stolen car.

Then it got really weird.

In March 2009 investigators discovered the same DNA on the burned body of a male asylum-seeker in France. Now this presented something of an anomaly: the corpse was male but the DNA was of a female.

You guessed it; it was the swabs themselves that were contaminated.

So a DNA match is just the start of an investigative process, but it could provide the investigative lead that eventually leads to the conviction of an individual.
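
If it helps to see that reasoning laid out step by step, here is a minimal sketch of how a DNA hit might be triaged. It is purely illustrative and is not a legal procedure or any agency's actual workflow; DnaHit, its fields, and next_step are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DnaHit:
    """A hypothetical record of a database match (illustration only)."""
    sample_id: str
    contamination_ruled_out: bool = False
    corroborating_evidence: List[str] = field(default_factory=list)


def next_step(hit: DnaHit) -> str:
    """Describe what a match justifies at each stage of the investigation."""
    # Lab mixups, contaminated swabs, and secondary transfer come first.
    if not hit.contamination_ruled_out:
        return "check for contamination or transfer"
    # A clean match with nothing else behind it is still only a lead.
    if not hit.corroborating_evidence:
        return "investigative lead only: keep investigating"
    # Only DNA plus other evidence moves the case forward.
    return "present DNA together with corroborating evidence"
```

In this sketch the Phantom of Heilbronn hits would never have left the first branch, because the swabs themselves were the source of the DNA.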

Perhaps you’ve noticed that I use the phrase “investigative lead” a lot when talking about DNA and about facial recognition. Trust me, it’s important.

But is the taking of DNA at booking Constitutional?

Obviously this is a huge question, because technical ability to do something does not automatically mean that you are Constitutionally authorized to do so. There is, after all, Fourth Amendment language protecting us against “unreasonable searches and seizures.”

Is the taking of DNA from arrestees who have not been convicted (assuming state law allows it) reasonable, or unreasonable?

Alonzo Jay King, Jr. had a vested interest in this question.

Alonzo Jay King Jr…was arrested in 2009 on assault charges. Before he was convicted of that crime, police took a DNA sample pursuant to Maryland’s new law allowing for such collections at the time of arrest in certain offenses….

I want to pause right here to make sure that the key point is highlighted. King, an arrestee who at the time had not been convicted of the crime for which he was arrested, was compelled to provide evidence. At the time of arrest, collection of certain types of evidence (such as fingerprints) is “reasonable.” But collection of certain other types of evidence (such as a forced confession) is “unreasonable.”

So King’s DNA was taken and was searched against a Maryland database of DNA from unsolved crimes. You won’t believe what happened next! (Actually, you will.)

The DNA matched a sample from an unsolved 2003 rape case, and Mr. King was convicted of that crime.

Sentenced to life in prison, actually.

Wicomico County Assistant State’s Attorney Elizabeth L. Ireland said she requested the court impose a life sentence on King, not only because of his past criminal convictions, but also because it turned out that he was a friend of the victim’s family. She said this proved King was a continuing danger to the community.

Before you say, “well, if he was the rapist, he should be imprisoned, legal niceties notwithstanding,” think of the implications of that statement. The entire U.S. legal system is based upon the premise that it is better for a guilty person to mistakenly go free than for an innocent person to mistakenly be punished.

And if that doesn’t sink in…what if YOU were arrested and convicted unlawfully? What if a plate analyzing YOUR DNA wasn’t cleaned properly, and you were unjustly convicted of rape? Or what if a confession were coerced from YOU, and used to convict you?

So King’s question was certainly important, regardless of whether or not he actually committed the rape for which he was convicted.

King therefore appealed on Fourth Amendment grounds, the Maryland Court of Appeals overturned his conviction (PDF), and the State of Maryland brought the case to the U.S. Supreme Court in 2013 (Maryland v. King). In a close 5-4 decision (PDF) in which both conservatives and liberals were on both sides of the argument, the Court ruled that the taking of DNA from arrestees WAS Constitutional.

That wasn’t the end of the argument, because a new case arose in the state of California. However, the California Supreme Court ruled in 2018 that the practice was allowed in that state.

So the taking of DNA at booking is not only authorized (in some states, for some charges), it’s also Constitutional. (Although the Supreme Court’s opinion is still widely debated.)

So anyone who gets arrested for a felony in my home state of California should be ready for a buccal (cheek) swab.