
Communicating How Your Firm Fights Synthetic Identities

(Updated question count 10/23/2023)

Does your firm fight crooks who try to fraudulently use synthetic identities? If so, how do you communicate your solution?

This post explains what synthetic identities are (with examples), describes four ways to detect synthetic identities, and closes by providing an answer to the communication question.

While this post is primarily intended for identity firms who can use Bredemarket’s marketing and writing services, anyone else who is interested in synthetic identities can read along.

What are synthetic identities?

To explain what synthetic identities are, let me start by telling you about Jason Brown.

Jason Brown wasn’t Jason Brown

You may not have heard of him unless you lived in Atlanta, Georgia in 2019 and lived near the apartment he rented.

Jason Brown’s renting of an apartment isn’t all that unusual.

If you were to visit Brown’s apartment in February 2019, you would find credit cards and financial information for Adam M. Lopez and Carlos Rivera.

Now that’s a little unusual, especially since Lopez and Rivera never existed.

For that matter, Jason Brown never existed either.

Brown was synthetically created from a stolen social security number and a fake California driver’s license. The creator was a man named Corey Cato, who was engaged in massive synthetic identity fraud. If you want to talk about a case that emphasizes the importance of determining financial identity, this is it.

From https://www.ice.gov/news/releases/hsi-investigates-synthetic-identities-scheme-defrauded-banks-nearly-2m

A Georgia man was sentenced Sept. 1 (2022) to more than seven years in federal prison for participating in a nationwide fraud ring that used stolen social security numbers, including those belonging to children, to create synthetic identities used to open lines of credit, create shell companies, and steal nearly $2 million from financial institutions….

Cato joined conspiracies to defraud banks and illegally possess credit cards. Cato and his co-conspirators created “synthetic identities” by combining false personal information such as fake names and dates of birth with the information of real people, such as their social security numbers. Cato and others then used the synthetic identities and fake ID documents to open bank and credit card accounts at financial institutions. Cato and his co-conspirators used the unlawfully obtained credit cards to fund their lifestyles.
From https://www.ice.gov/news/releases/hsi-investigates-synthetic-identities-scheme-defrauded-banks-nearly-2m

Talking about synthetic identity at Victoria Gardens

Here’s a video that I created on Saturday that describes, at a very high level, how synthetic identities can be used fraudulently. People who live near Rancho Cucamonga, California will recognize the Victoria Gardens shopping center, proof that synthetic identity theft can occur far away from Georgia.

From https://www.youtube.com/watch?v=oDrSBlDJVCk

Note that synthetic identity theft is different from stealing someone else’s existing identity. In this case, a new identity is created.

So how do you catch these fraudsters?

Catching the identity synthesizers

If you’re renting out an apartment, and Jason Brown shows you his driver’s license and provides his Social Security Number, how can you detect if Brown is a crook? There are four methods to verify that Jason Brown exists, and that he’s the person renting your apartment.

Method One: Private Databases

One way to check Jason Brown’s story is to perform credit checks and other data investigations using financial databases.

Did Jason Brown just spring into existence within the past year, with no earlier credit record? That seems suspicious.
Does Jason Brown’s credit record appear TOO clean? That seems suspicious.
Does Jason Brown share information such as a common social security number with other people? Are any of those other identities also fraudulent? That is DEFINITELY suspicious.
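The three questions above can be sketched as simple rules. This is only a minimal illustration: the CreditProfile fields, the one-year window, and the "three accounts with zero blemishes" test are my own assumptions, not how any particular financial database or vendor scores a record, and every value in the sample data is invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CreditProfile:
    # Hypothetical fields; a real credit file holds far more than this.
    name: str
    ssn: str
    first_record: date    # date of the earliest entry in the credit file
    negative_items: int   # late payments, collections, and so on
    total_accounts: int


def red_flags(applicant, known_profiles, today, min_history_days=365):
    """Return the reasons (if any) that the applicant's record looks suspicious."""
    flags = []
    # Question 1: did the identity spring into existence within the past year?
    if (today - applicant.first_record).days < min_history_days:
        flags.append("no credit record older than a year")
    # Question 2: is the record TOO clean? (Assumed rule: 3+ accounts, no blemishes.)
    if applicant.total_accounts >= 3 and applicant.negative_items == 0:
        flags.append("record appears too clean")
    # Question 3: is the same social security number attached to other names?
    sharers = [p.name for p in known_profiles
               if p.ssn == applicant.ssn and p.name != applicant.name]
    if sharers:
        flags.append("SSN shared with: " + ", ".join(sorted(sharers)))
    return flags


# Invented sample data, for illustration only.
brown = CreditProfile("Jason Brown", "000-00-0001", date(2018, 11, 1), 0, 4)
other = CreditProfile("Second Identity", "000-00-0001", date(2018, 12, 1), 0, 2)
for reason in red_flags(brown, [brown, other], today=date(2019, 2, 1)):
    print(reason)
```

In practice a flag like this is a reason to look closer, not proof of fraud; plenty of real people have short or spotless credit histories.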

This is one way that many firms detect synthetic identities, and for some firms it is the ONLY way they detect synthetic identities. And these firms have to tell their story to their prospects.

If your firm offers a tool to verify identities via private databases, how do you let your prospects know the benefits of your tool, and why your solution is better than all other solutions?

Method Two: Check That Driver’s License (or other government document)

What about that driver’s license that Brown presented? There are a wide variety of software tools that can check the authenticity of driver’s licenses, passports, and other government-issued documents. Some of these tools existed back in 2019 when “Brown” was renting his apartment, and a number of them exist today.

Maybe your firm has created such a tool, or uses a tool from a third party.

If your firm offers this capability, how can your prospects learn about its benefits, and why your solution excels?

Method Three: Check Government Databases

Checking the authenticity of a government-issued document may not be enough, since the document itself may be legitimate, but the implied credentials may no longer be legitimate. For example, if my California driver’s license expires in 2025, but I move to Minnesota in 2023 and get a new license, my California driver’s license is no longer valid, even though I have it in my possession.

Why not check the database of the Department of Motor Vehicles (or the equivalent in your state) to see if there is still an active driver’s license for that person?

The American Association of Motor Vehicle Administrators (AAMVA) maintains a Driver’s License Data Verification (DLDV) Service in which participating jurisdictions allow other entities to verify the license data for individuals. Your firm may be able to access the DLDV data for selected jurisdictions, providing an extra identity verification tool.
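To make the "legitimate document, invalid credential" point concrete, here is a minimal sketch. It uses an in-memory dictionary as a stand-in for a motor vehicle database; it is NOT the DLDV interface, and the record fields, license numbers, and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LicenseRecord:
    jurisdiction: str
    license_number: str
    expires: date
    surrendered: bool = False  # e.g. replaced by a license from another state


def is_active(record, today):
    """A license is active only if it is unexpired AND still on file as valid."""
    return (not record.surrendered) and record.expires >= today


# Hypothetical registry keyed by (jurisdiction, license number).
registry = {
    ("CA", "X0000000"): LicenseRecord("CA", "X0000000", date(2025, 6, 30), surrendered=True),
    ("MN", "Y0000000"): LicenseRecord("MN", "Y0000000", date(2027, 6, 30)),
}


def verify(jurisdiction, license_number, today):
    """Return True only when a matching, currently active record exists."""
    record = registry.get((jurisdiction, license_number))
    return record is not None and is_active(record, today)


print(verify("CA", "X0000000", date(2023, 12, 1)))  # False: unexpired, but surrendered
print(verify("MN", "Y0000000", date(2023, 12, 1)))  # True
```

The California card would still pass a document authenticity check, which is why the database lookup adds something the document check alone cannot.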

If your firm offers this capability, how can your prospects learn where it is available, what its benefits are, and why it is an important part of your solution?

Method Four: Conduct the “Who You Are” Test

There is one more way to confirm that a person is real, and that is to check the person. Literally.

If someone on a smartphone or videoconference says that they are Jason Brown, how do you know that it’s the real Jason Brown and not Jim Smith, or a previous recording or simulation of Jason Brown?

This is where tools such as facial recognition and liveness detection come into play.

You can ensure that the live face matches any face on record.
You can also confirm that the face is truly a live face.

In addition to these two tests, you can compare the face against the face on the presented driver’s license or passport to offer additional confirmation of true identity.
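Here is one way the three checks might be combined into a single decision. Treat it as a sketch under stated assumptions: the FaceCheck scores would come from whatever facial recognition and liveness tools you use, and the 0 to 1 scale and the thresholds are illustrative values I made up, not recommendations from any vendor.

```python
from dataclasses import dataclass


@dataclass
class FaceCheck:
    # Hypothetical scores on a 0..1 scale, produced by your chosen tools.
    match_on_record: float  # live face vs. the face already on record
    liveness: float         # confidence that the face is truly live
    match_on_id: float      # live face vs. the portrait on the presented ID


def who_you_are(check, match_threshold=0.80, liveness_threshold=0.90):
    """Return (passed, reasons). The thresholds are illustrative only."""
    reasons = []
    if check.liveness < liveness_threshold:
        reasons.append("face may not be live")
    if check.match_on_record < match_threshold:
        reasons.append("live face does not match face on record")
    if check.match_on_id < match_threshold:
        reasons.append("live face does not match ID portrait")
    return (not reasons, reasons)


print(who_you_are(FaceCheck(0.95, 0.97, 0.91)))  # (True, [])
print(who_you_are(FaceCheck(0.95, 0.20, 0.91)))  # (False, ['face may not be live'])
```

Requiring all three to pass is the strictest design; where to set each threshold depends on how your firm balances fraud risk against turning away real customers.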

Now some companies offer facial recognition, others offer liveness detection, others match the live face to a face on a government ID, and many companies offer two or three of these capabilities.

One more time: if your firm offers these capabilities—either your own or someone else’s—what are the benefits of your algorithms? (For example, are they more accurate than competing algorithms? And under what conditions?) And why is your solution better than the others?

This is for the firms who fight synthetic identities

While most of this post is of general interest to anyone dealing with synthetic identities, this part of this post is specifically addressed to identity and biometric firms who provide synthetic identity-fighting solutions.

When you communicate about your solutions, your communicator needs to have certain types of experience.

Industry experience. Perhaps you sell your identity solution to financial institutions, educational institutions, or a host of other industries (gambling/gaming, healthcare, hospitality, retailers, sport/concert venues, or others). You need someone with this industry experience.
Solution experience. Perhaps your communications require someone with 29 years of experience in identity, biometrics, and technology marketing, including experience with all five factors of authentication (and verification).
Communication experience. Perhaps you need to effectively communicate with your prospects in a customer-focused, benefits-oriented way. (Content that is all about you and your features won’t win business.)

Perhaps you can use Bredemarket, the identity content marketing expert. I work with you (and I have worked with others) to ensure that your content meets your awareness, consideration, and/or conversion goals.

How can I work with you to communicate your firm’s anti-synthetic identity message? For example, I can apply my identity/biometric blog expert knowledge to create an identity blog post for your firm. Blog posts provide an immediate business impact to your firm, and are easy to reshare and repurpose. For B2B needs, LinkedIn articles provide similar benefits.

If Bredemarket can help your firm convey your message about synthetic identity, let’s talk.

Email me at john.bredehoft@bredemarket.com.
Book a meeting with me at calendly.com/bredemarket.
Contact me at bredemarket.com/contact/.
Subscribe to my mailing list at http://eepurl.com/hdHIaT.

And thirteen more things

If you haven’t read a Bredemarket blog post before, or even if you have, you may not realize that this post is jam-packed with additional information well beyond the post itself. This post alone links to the following Bredemarket posts and other content. You may want to follow one or more of the 13 links below if you need additional information on a particular topic:

Synthetic Identity video (YouTube), August 12, 2023. https://www.youtube.com/watch?v=oDrSBlDJVCk
Using “Multispectral” and “Liveness” in the Same Sentence (Bredemarket blog), June 6, 2023. https://bredemarket.com/2023/06/06/using-multispectral-and-liveness-in-the-same-sentence/
Who is THE #1 NIST facial recognition vendor? (Bredemarket blog), February 23, 2022. https://bredemarket.com/2022/02/23/number1frvt/
Financial Identity (Bredemarket website). https://bredemarket.com/financial-identity/
Educational Identity (Bredemarket website). https://bredemarket.com/educational-identity/
The five authentication factors (Bredemarket blog), March 2, 2021. https://bredemarket.com/2021/03/02/the-five-authentication-factors/
Customer Focus (Bredemarket website). https://bredemarket.com/customer-focus/
Benefits (Bredemarket website). https://bredemarket.com/benefits/
Seven Questions Your Content Creator Should Ask You: the e-book version (Bredemarket blog and e-book), October 22, 2023. https://bredemarket.com/2023/10/22/seven-questions-your-content-creator-should-ask-you-the-e-book-version/
Four Mini-Case Studies for One Inland Empire Business—My Own (Bredemarket blog and e-book), April 16, 2023. https://bredemarket.com/2023/04/16/four-mini-case-studies-for-one-inland-empire-business-my-own/
Identity blog post writing (Bredemarket website). https://bredemarket.com/identity-blog-post-writing/
Blog About Your Identity Firm’s Benefits Now. Why Wait? (Bredemarket blog), August 11, 2023. https://bredemarket.com/2023/08/11/blog-about-your-identity-firms-benefits-now-why-wait/
Why Your Company Should Write LinkedIn Articles (Bredemarket LinkedIn article), July 31, 2023. https://www.linkedin.com/pulse/why-your-company-should-write-linkedin-articles-bredemarket/

That’s twelve more things than the Cupertino guys do, although my office isn’t as cool as theirs.

By Arne Müseler / http://www.arne-mueseler.com, CC BY-SA 3.0 de, https://commons.wikimedia.org/w/index.php?curid=78985341

Well, why not one more?

Here’s my latest brochure for the Bredemarket 400 Short Writing Service, my standard package to create your 400 to 600 word blog posts and LinkedIn articles. Be sure to check the Bredemarket 400 Short Writing Service page for updates.

Bredemarket 400 Short Writing Service (June 2022)

If that doesn’t fit your needs, I have other offerings.

Plus, I’m real. I’m not a bot.

Today’s Synthetic Identity Video

I will have more to say about this later, but for now here is the synthetic identity video (the long version) that I created this morning.

From https://youtu.be/oDrSBlDJVCk

We Survived Gummy Fingers. We’re Surviving Facial Recognition Inaccuracy. We’ll Survive Voice Spoofing.

(Part of the biometric product marketing expert series)

Some of you are probably going to get into an automobile today.

Are you insane?

The National Highway Traffic Safety Administration has released its latest projections for traffic fatalities in 2022, estimating that 42,795 people died in motor vehicle traffic crashes.
From https://www.nhtsa.gov/press-releases/traffic-crash-death-estimates-2022

When you have tens of thousands of people dying, then the only conscionable response is to ban automobiles altogether. Any other action or inaction is completely irresponsible.

After all, you can ask the experts who want us to ban biometrics because it can be spoofed and is racist, so therefore we shouldn’t use biometrics at all.

I disagree with the calls to ban biometrics, and I’ll go through three “biometrics are bad” examples and say why banning biometrics is NOT justified.

Even some identity professionals may not know about the old “gummy fingers” story from 20+ years ago.
And yes, I know that I’ve talked about Gender Shades ad nauseam, but it bears repeating.
And voice deepfakes are always a good topic to discuss in our AI-obsessed world.

Example 1: Gummy fingers

My recent post “Why Apple Vision Pro Is a Technological Biometric Advance, but Not a Revolutionary Biometric Event” included the following sentence:

But the iris security was breached by a “dummy eye” just a month later, in the same way that gummy fingers and face masks have defeated other biometric technologies.
From https://bredemarket.com/2023/06/12/vision-pro-not-revolutionary-biometrics-event/

A biometrics industry colleague noticed the rhyming words “dummy” and “gummy” and wondered if the latter was a typo. It turns out it wasn’t.

To my knowledge, these gummy fingers do NOT have ridges. From https://www.candynation.com/gummy-fingers

Back in 2002, researcher Tsutomu Matsumoto used “gummy bears” gelatin to create a fake finger that fooled a fingerprint reader.

Back in 2002, this news WAS really “scary,” since it suggested that you could access a fingerprint reader-protected site with something that wasn’t a finger. Gelatin. A piece of metal. A photograph.

Except that the fingerprint reader world didn’t stand still after 2002, and the industry developed ways to detect spoofed fingers. Here’s a recent example of presentation attack detection (liveness detection) from TECH5:

TECH5 participated in the 2023 LivDet Non-contact Fingerprint competition to evaluate its latest NN-based fingerprint liveness detection algorithm and has achieved first and second ranks in the “Systems” category for both single- and four-fingerprint liveness detection algorithms respectively. Both submissions achieved the lowest error rates on bonafide (live) fingerprints. TECH5 achieved 100% accuracy in detecting complex spoof types such as Ecoflex, Playdoh, wood glue, and latex with its groundbreaking Neural Network model that is only 1.5MB in size, setting a new industry benchmark for both accuracy and efficiency.
From https://tech5.ai/tech5s-mobile-fingerprint-liveness-detection-technology-ranked-the-most-accurate-in-the-market/

TECH5 excelled in detecting fake fingers for “non-contact” reading where the fingers don’t even touch a surface such as an optical surface. That’s appreciably harder than detecting fake fingers that touch contact devices.

I should note that LivDet is an independent assessment. As I’ve said before, independent technology assessments provide some guidance on the accuracy and performance of technologies.

So gummy fingers and future threats can be addressed as they arrive.

But at least gummy fingers aren’t racist.

Example 2: Gender shades

In 2017-2018, the Algorithmic Justice League set out to answer this question:

How well do IBM, Microsoft, and Face++ AI services guess the gender of a face?
From http://gendershades.org/. Yes, that’s “http,” not “https.” But I digress.

Let’s stop right there for a moment and address two items before we continue. Trust me; it’s important.

This study evaluated only three algorithms: one from IBM, one from Microsoft, and one from Face++. It did not evaluate the hundreds of other facial recognition algorithms that existed in 2018 when the study was released.
The study focused on gender classification and race classification. Back in those primitive innocent days of 2018, the world assumed that you could look at a person and tell whether the person was male or female, or tell the race of a person. (The phrase “self-identity” had not yet become popular, despite the Rachel Dolezal episode which happened before the Gender Shades study). Most importantly, the study did not address identification of individuals at all.

However, the study did find something:

While the companies appear to have relatively high accuracy overall, there are notable differences in the error rates between different groups. Let’s explore.

All companies perform better on males than females with an 8.1% – 20.6% difference in error rates.

All companies perform better on lighter subjects as a whole than on darker subjects as a whole with an 11.8% – 19.2% difference in error rates.

When we analyze the results by intersectional subgroups – darker males, darker females, lighter males, lighter females – we see that all companies perform worst on darker females.
From http://gendershades.org/overview.html

What does this mean? It means that if you are using one of these three algorithms solely for the purpose of determining a person’s gender and race, some results are more accurate than others.
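If "difference in error rates" sounds abstract, the arithmetic is simple: count the wrong classifications per group, divide by the group size, and subtract. The sketch below does exactly that. The sample results are numbers I invented to show the calculation; they are NOT data from the Gender Shades study.

```python
from collections import defaultdict


def error_rates(results):
    """results: iterable of (group, predicted_label, true_label) tuples."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in results:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Invented sample: 10 classifications per group, NOT the study's data.
sample = (
    [("lighter", "F", "F")] * 9 + [("lighter", "M", "F")] * 1
    + [("darker", "F", "F")] * 7 + [("darker", "M", "F")] * 3
)

rates = error_rates(sample)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} error rate")
gap = rates["darker"] - rates["lighter"]
print(f"difference: {gap:.0%}")
```

Note what is being counted: a wrong label ("M" instead of "F"), not a wrong person. That is classification, which is a different task from identifying an individual.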

Three algorithms do not predict hundreds of algorithms, and classification is not identification. If you’re interested in more information on the differences between classification and identification, see Bredemarket’s November 2021 submission to the Department of Homeland Security. (Excerpt here.)

And all the stories about people such as Robert Williams being wrongfully arrested based upon faulty facial recognition results have nothing to do with Gender Shades. I’ll address this briefly (for once):

In the United States, facial recognition identification results should only be used by the police as an investigative lead, and no one should be arrested solely on the basis of facial recognition. (The city of Detroit stated that Williams’ arrest resulted from “sloppy” detective work.)
If you are using facial recognition for criminal investigations, your people had better have forensic face training. (Then they would know, as Detroit investigators apparently didn’t know, that the quality of surveillance footage is important.)
If you’re going to ban computerized facial recognition (even when only used as an investigative lead, and even when only used by properly trained individuals), consider the alternative of human witness identification. Or witness misidentification. Roeling Adams, Reggie Cole, Jason Kindle, Adam Riojas, Timothy Atkins, Uriah Courtney, Jason Rivera, Vondell Lewis, Guy Miles, Luis Vargas, and Rafael Madrigal can tell you how inaccurate (and racist) human facial recognition can be. See my LinkedIn article “Don’t ban facial recognition.”

Obviously, facial recognition has been the subject of independent assessments, including continuous bias testing by the National Institute of Standards and Technology as part of its Face Recognition Vendor Test (FRVT), specifically within the 1:1 verification testing. And NIST has measured the identification bias of hundreds of algorithms, not just three.

In fact, people who were calling for facial recognition to be banned just a few years ago are now questioning the wisdom of those decisions.

But those days were quaint. Men were men, women were women, and artificial intelligence was science fiction.

The latter has certainly changed.

Example 3: Voice spoofs

Perhaps it’s an exaggeration to say that recent artificial intelligence advances will change the world. Perhaps it isn’t. Personally I’ve been concentrating on whether AI writing can adopt the correct tone of voice, but what if we take the words “tone of voice” literally? Let’s listen to President Richard Nixon:

From https://www.youtube.com/watch?v=2rkQn-43ixs

Richard Nixon never spoke those words in public, although it’s possible that he may have rehearsed William Safire’s speech, composed in case Apollo 11 had not resulted in one giant leap for mankind. As noted in the video, Nixon’s voice and appearance were spoofed using artificial intelligence to create a “deepfake.”

It’s one thing to alter the historical record. It’s another thing altogether when a fraudster spoofs YOUR voice and takes money out of YOUR bank account. By definition, you will take that personally.

In early 2020, a branch manager of a Japanese company in Hong Kong received a call from a man whose voice he recognized—the director of his parent business. The director had good news: the company was about to make an acquisition, so he needed to authorize some transfers to the tune of $35 million. A lawyer named Martin Zelner had been hired to coordinate the procedures and the branch manager could see in his inbox emails from the director and Zelner, confirming what money needed to move where. The manager, believing everything appeared legitimate, began making the transfers.

What he didn’t know was that he’d been duped as part of an elaborate swindle, one in which fraudsters had used “deep voice” technology to clone the director’s speech…
From https://www.forbes.com/sites/thomasbrewster/2021/10/14/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions/?sh=8e8417775591

Now I’ll grant that this is an example of human voice verification, which can be as inaccurate as the previously referenced human witness misidentification. But are computerized systems any better, and can they detect spoofed voices?

Well, in the same way that fingerprint readers worked to overcome gummy bears, voice readers are working to overcome deepfake voices. Here’s what one company, ID R&D, is doing to combat voice spoofing:

IDVoice Verified combines ID R&D’s core voice verification biometric engine, IDVoice, with our passive voice liveness detection, IDLive Voice, to create a high-performance solution for strong authentication, fraud prevention, and anti-spoofing verification.

Anti-spoofing verification technology is a critical component in voice biometric authentication for fraud prevention services. Before determining a match, IDVoice Verified ensures that the voice presented is not a recording.
From https://www.idrnd.ai/idvoice-verified-voice-biometrics-and-anti-spoofing/

This is only the beginning of the war against voice spoofing. Other companies will pioneer new advances that will tell the real voices from the fake ones.

As for independent testing:

ID R&D has participated in multiple ASVspoof tests, and performed well in them.
NIST has long conducted speaker recognition evaluations. Perhaps future tests will be expanded to check for deepfakes, in the same way that the FRVT 1:1 test was expanded to check for bias.

A final thought

Yes, fraudsters can use advanced tools to do bad things.

But the people who battle fraudsters can also use advanced tools to defeat the fraudsters.

Take care of yourself, and each other.

Jerry Springer. By Justin Hoch, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=16673259

I may be a fraudster!

I’ve previously contacted a journalist via Help a Reporter Out (HARO), and I occasionally pitch to journalists on the service. In fact, I submitted a new pitch earlier this month.

So I noted with interest this story of how fraudsters fool Help a Reporter Out pitch recipients with synthetic or otherwise fraudulent identities.

When a reporter is writing a story that requires a source that he or she does not have, that reporter will likely turn to HARO, a service that “connects journalists seeking expertise to include in their content with sources who have that expertise.”…
Now, shady SEOs hide behind fake photos and personalities. The latest black hat search-engine optimization trend is to respond to Help-a-Reporter-Out (HARO) inquiries pretending to be a person of whichever gender/ethnicity the journalist is seeking comment from.
From https://www.johnwdefeo.com/articles/deepfakes-are-ruining-the-internet

As it turns out, I have never responded to a pitch that specifically requested comments from white males. (Probably because if a pitch DOESN’T request gender/ethnicity information, chances are that the respondent will be a white male.) But it’s clear how a HARO pitch scammer could create a synthesized identity of a biometric proposal writing expert.

So if you’re asking your source for a picture, John W. Defeo suggests that you ask for TWO pictures. I think that the technical term for this is MPA, or Multi Photo Authentication.

There’s one other suggestion.

Take those photographs and plug them into a reverse image lookup service like Tineye (or even Google Images). Have they appeared on the web before? Does the context make sense?
From https://www.johnwdefeo.com/articles/deepfakes-are-ruining-the-internet

I often use the picture that is found on my jebredcal Twitter profile.

So I plugged that into a Google reverse image search. As expected, it hit on Twitter, but also hit on some other social media platforms such as LinkedIn.

I hadn’t heard of TinEye before, so I figured I’d give it a shot. Here’s what TinEye found:

Very odd, since as I previously mentioned this particular image is available on Twitter, LinkedIn, and other sources. But it turns out that TinEye honors requests from social media services NOT to crawl their sites. (No comment.) And TinEye apparently hasn’t crawled the relevant page on bredemarket.com yet.

Which leads to the scary thought – what if someone searched TinEye for me, and didn’t bother to search anywhere else after getting 0 results? Would the searcher conclude that I was a synthetically-generated biobot?

Wow, talk about identity concerns…

“Who Are You” by The Who. Fair use, https://en.wikipedia.org/w/index.php?curid=11316153

Facial recognition and the U.S. Capitol attack

This post examines a number of issues regarding the use of facial recognition. Specifically, it looks at various ways to use facial recognition to identify people who participated in the U.S. Capitol attack.

By TapTheForwardAssist – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=98670006

Let’s start with the technological issues before we look at the legal ones. Specifically, we’ll look at three possible ways to construct databases (galleries) to use for facial recognition, and the benefits and drawbacks of each method.

What a facial recognition system does, and what it doesn’t do

The purpose of a one-to-many facial recognition system is to take a facial image (a “probe” image), process it, and compare it to a “gallery” of already-processed facial images. The system then calculates some sort of mathematical likelihood that the probe matches some of the images in the gallery.

That’s it. That’s all the system does, from a technological point of view.

Although outside of the scope of this particular post, I do want to say that a facial recognition system does NOT determine a match. Now the people USING the system could make the decision that one or more of the images in the gallery should be TREATED as a match, based upon mathematical considerations. However, when using a facial recognition system in the United States for criminal purposes, the general procedure is for a trained facial examiner to use his/her expertise to compare the probe image with selected gallery images. This trained examiner will then make a determination, regardless of what the technology says.
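For those who like to see the idea in code, here is a toy version of a one-to-many search. It assumes each processed face has been reduced to a list of numbers (a "template") and uses cosine similarity as a stand-in for whatever proprietary scoring a real algorithm uses. The record names and vectors are made up.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_gallery(probe, gallery, top_n=3):
    """Return the top_n (record_id, score) pairs, highest score first."""
    scored = [(record_id, cosine(probe, template))
              for record_id, template in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical two-number templates; real templates are far longer.
gallery = {
    "record-1": [1.0, 0.0],
    "record-2": [0.6, 0.8],
    "record-3": [0.0, 1.0],
}
probe = [0.8, 0.6]

for record_id, score in search_gallery(probe, gallery):
    print(f"{record_id}: {score:.2f}")
```

Notice that the function returns a ranked candidate list with scores. It never returns "match" or "no match"; that call is left to the trained examiner.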

But forget about that for now. I want to concentrate on another issue—adding data to the gallery.

Options for creating a facial recognition “gallery”

As I mentioned earlier, the “gallery” is the database against which the submitted facial image (the “probe”) is compared. In a one-to-many comparison, the probe image is compared against all or some of the images in the gallery. (I’m skipping over the “all or some” issue for now.)

So where do you get facial images to put in the gallery?

For purposes of this post, I’m going to describe three sources for gallery images.

Government facial images of people who have been convicted of crimes.
Government facial images of people who have not necessarily been convicted of crimes, such as people who have been granted driver’s licenses or passports.
Publicly available facial images.

Before delving into these three sources of gallery images, I’m going to present a use case. A few of you may recognize it.

Let’s say that there is an important government building located somewhere, and that access to the building is restricted for security reasons. Now let’s say that some people breach that access and illegally enter the building. Things happen, and the people leave. (Again, why they left and weren’t detained immediately is outside the scope of this post.)

Now that a crime has been committed, the question arises—how do you use facial recognition to solve the crime?

A gallery of government criminal facial images

Let’s look at a case in which the images of people who trespassed at the U.S. Capitol…

Whoops, I gave it away! Yes, for those of you who didn’t already figure it out, I’m specifically talking about the people who entered the U.S. Capitol on Wednesday, January 6. (This will NOT be the only appearance of Captain Obvious in this post.)

Anyway, let’s see how the images of people who trespassed at the U.S. Capitol can be compared against a gallery of images of criminals.

From here on in, we need to not only look at technological issues, but also legal issues. Technology does not exist in a vacuum; it can (or at least should) only be used in accordance with the law.

So we have a legal question: can criminal facial images be lawfully used to identify people who have committed crimes?

In most cases, the answer is yes. The primary reason that criminal databases are maintained in the first place is to identify repeat offenders. If someone habitually trespasses into government buildings, the government would obviously like to know when the person trespasses into another government building.

But why did I say “in most cases”? Because there are cases in which a previously-created criminal record can no longer be used.

The record is sealed or expunged. This could happen, for example, if a person committed a crime as a juvenile. After some time, the record could be sealed (prohibiting most access) or expunged (removed entirely). If a record is sealed or expunged, then data in the record (including facial images) shouldn’t be available in the gallery.
The criminal is pardoned. If someone is pardoned of a crime, then it’s legally the same as if the crime were never committed at all. In that case, the pardoned person’s criminal record may (or may not) be removed from the criminal database. If it is removed, then again the facial image shouldn’t be in the gallery.
The crime happened a long time ago. Decades ago, it cost a lot of money to store criminal records, and due to budgetary constraints it wasn’t worthwhile to keep on storing everything. In my corporate career, I’ve encountered a lot of biometric requests for proposal (RFPs) that required conversion of old data to the new biometric system…with the exception of the old stuff. It stands to reason that if the old arrest record from 1960 is never converted to the new system, then that facial image won’t be in the gallery.
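The three exceptions above amount to a filter on what goes into the gallery. Here is a minimal sketch; the CriminalRecord fields are hypothetical names that I chose to mirror the three exceptions, not the schema of any actual criminal database.

```python
from dataclasses import dataclass


@dataclass
class CriminalRecord:
    # Hypothetical fields mirroring the three exceptions described above.
    record_id: str
    sealed: bool = False
    expunged: bool = False
    pardoned_and_removed: bool = False        # a pardon may (or may not) remove the record
    converted_to_current_system: bool = True  # old records may never be converted


def eligible_for_gallery(record):
    """Return True if the record's facial image may appear in the gallery."""
    if record.sealed or record.expunged:
        return False
    if record.pardoned_and_removed:
        return False
    return record.converted_to_current_system


records = [
    CriminalRecord("rec-A"),
    CriminalRecord("rec-B", sealed=True),
    CriminalRecord("rec-C", pardoned_and_removed=True),
    CriminalRecord("rec-D", converted_to_current_system=False),
]
gallery_ids = [r.record_id for r in records if eligible_for_gallery(r)]
print(gallery_ids)  # ['rec-A']
```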

So, barring those exceptions, a search of our probe image from the U.S. Capitol could potentially hit against records in the gallery of criminal facial images.

Great, right?

Well, there are a couple of issues to consider.

First, there are a lot of criminal databases out there. For those who imagine that the FBI, and the CIA, and the BBC, BB King, and Doris Day (yes) have a single massive database with every single criminal record out there…well, they don’t.

There are multiple federal criminal databases out there, and it took many years to get two of the major ones (from the FBI and the Department of Homeland Security) to talk to each other.
And every state has its own criminal database; some records are submitted to the FBI, and some aren’t.
Oh, and there are also local databases. For many years, one of my former employers was the automated fingerprint identification system provider for Bullhead City, Arizona. And there are a lot of Bullhead City-sized databases; one software package, AFIX Tracker (now owned by Aware), has over 500 installations.

So if you want to search criminal databases, you’re going to have to search a bunch of them. Between the multiple federal databases, the state and territory databases, and the local databases, there are hundreds upon hundreds of databases to search. That could take a while.

Which brings us to the second issue, in which we put on our Captain Obvious hat. If a person has never committed a crime, the person’s facial image is NOT in a criminal database. While biometric databases are great at identifying repeat offenders, they’re not so good at identifying first offenders. (They’re great at identifying second offenders, when someone is arrested for a crime and matches against an unidentified biometric record from a previous crime.)

So even if you search all the criminal databases, you’re only going to find the people with previous records. Those who were trespassing at the U.S. Capitol for the first time are completely invisible to a criminal database.
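To make both issues concrete, here is a minimal Python sketch of a probe search across several separate galleries. Everything in it is an assumption for illustration: the gallery names, the record IDs, the three-number "templates" (real face templates are far larger and come from a vendor's algorithm), and the 0.99 threshold. It is not how any particular agency or vendor does it.

```python
import math


def cosine_similarity(a, b):
    """Similarity between two templates (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_galleries(probe, galleries, threshold=0.99):
    """Search every gallery one by one and return candidates, best first.

    Each candidate is (gallery_name, record_id, score). The threshold is a
    hypothetical value; a real system would tune it and send candidates
    to a human examiner rather than treat them as identifications.
    """
    candidates = []
    for gallery_name, records in galleries.items():
        for record_id, template in records.items():
            score = cosine_similarity(probe, template)
            if score >= threshold:
                candidates.append((gallery_name, record_id, score))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Hypothetical toy galleries standing in for federal, state, and local databases.
GALLERIES = {
    "federal": {"F-001": [0.9, 0.1, 0.4]},
    "state": {"S-017": [0.2, 0.8, 0.5], "S-042": [0.1, 0.9, 0.3]},
    "local": {"L-300": [0.5, 0.5, 0.7]},
}

# A probe from someone with a prior record (close to S-017).
repeat_offender_probe = [0.21, 0.79, 0.52]
# A probe from someone who is in none of the galleries.
first_offender_probe = [0.9, -0.4, -0.1]

repeat_hits = search_galleries(repeat_offender_probe, GALLERIES)
first_hits = search_galleries(first_offender_probe, GALLERIES)
```

In this toy example the repeat offender's probe comes back with a single candidate from the state gallery, while the first offender's probe comes back with an empty list no matter how many galleries are searched.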

So something else is needed.

A gallery of government non-criminal facial images

Faced with this problem, you may ask yourself (yes), “What if the government had a database of people who hadn’t committed crimes? Could that database be used to identify the people who stormed the U.S. Capitol?”

Well, various governments DO have non-criminal facial databases. The two most obvious examples are the state databases of people who have driver’s licenses or state ID cards, and the federal database of people who have passports.

(This is an opportune time to remind my non-U.S. readers that the United States does not have national ID cards, and any attempt to create a national ID card is fought fiercely.)

I’ll point out the Captain Obvious issue right now: if someone never gets a passport or driver’s license, they’re not going to be in a facial database. This is of course a small subset of the population, but it’s a potential issue.

There’s a much bigger issue regarding the legal ability to use driver’s license photos in criminal investigations. As of 2018, 31 states allowed the practice…which means that 19 didn’t.

So while searches of driver’s license databases offer a good way to identify Capitol trespassers, it’s not perfect either.

A gallery of publicly available facial images

Which brings us to our third way to populate a gallery of facial images to identify Capitol trespassers.

It turns out that governments are not the only people that store facial images. You can find facial images everywhere. My own facial image can be found in countless places, including a page on the Bredemarket website itself.

There are all sorts of sites that post facial images that are accessible to the public. A few of these sites include Facebook, Google (including YouTube), LinkedIn (part of Microsoft), Twitter, and Venmo. (We’ll return to those companies later.)

In many cases, these images are tied to (non-verified) identities. For example, if you go to my LinkedIn page, you will see an image that purports to be the image of John Bredehoft. But LinkedIn doesn’t know with 100% certainty that this is really an image of John Bredehoft. Perhaps “John Bredehoft” exists, but the posted picture is not that of John Bredehoft. Or perhaps “John Bredehoft” doesn’t exist and is a synthetic identity.

But regardless, there are billions of images out there, tied to billions of purported identities.

What if you could compare the probe images from the U.S. Capitol against a gallery of those billions of images—many more images than held by any government?

It turns out that you CAN perform that comparison, and that law enforcement did perform that comparison.

Clearview AI’s…facial-recognition app has seen a spike in use as police track down the pro-Trump insurgents who descended on the Capitol on Wednesday….
Clearview AI CEO Hoan Ton-That confirmed to Gizmodo that the app saw a 26% jump in search volume on Jan. 7 compared to its usual weekday averages….
Detectives at the Miami Police Department are using Clearview’s tech to identify rioters in images and videos of the attack and forwarding suspect leads to the FBI, per the Times. Earlier this week, the Wall Street Journal reported that an Alabama police department was also employing Clearview’s tech to ID faces in footage and sending potential matches along to federal investigators.

But now we need to return to the legal question: is “publicly available” equivalent to “publicly usable”?

Certain companies, including the aforementioned Facebook, Google (including YouTube), LinkedIn (part of Microsoft), Twitter, and Venmo, maintain that Clearview AI does NOT have permission to use their publicly available data. Not because of government laws, but because of the companies’ own policies. Here’s what two of the companies said about a year ago:

“Scraping people’s information violates our policies, which is why we’ve demanded that Clearview stop accessing or using information from Facebook or Instagram,” Facebook’s spokesperson told Business Insider….
“YouTube’s Terms of Service explicitly forbid collecting data that can be used to identify a person. Clearview has publicly admitted to doing exactly that, and in response, we sent them a cease-and-desist letter.”

For its part, Clearview AI maintains that its First Amendment rights supersede the terms of service of the companies.

But other things come into play in addition to terms of service. Lawsuits filed in 2020 allege that Clearview AI’s practices violate the California Consumer Privacy Act of 2018, and the even more stringent Illinois Biometric Information Privacy Act of 2008. BIPA is so stringent that even Google is affected by it; as I’ve previously noted, Google’s Nest Hello Video Doorbell’s “familiar face” alerts are not available in Illinois.

Between corporate complaints and aggrieved citizens, the jury is literally still out on Clearview AI’s business model. So while it may work technologically, it may not work legally.

And one more thing

Of course, people are asking themselves, why do we even need to use facial recognition at all? After all, some of the trespassers actually filmed themselves trespassing. And when people see the widely-distributed pictures of the trespassers, they can be identified without using facial recognition.

Yes, to a point.

While it seems intuitive that eyewitnesses can easily identify people in photos, it turns out that such identifications can be unreliable. As the California Innocence Project reminds us:

One of the main causes of wrongful convictions is eyewitness misidentifications. Despite a high rate of error (as many as 1 in 4 stranger eyewitness identifications are wrong), eyewitness identifications are considered some of the most powerful evidence against a suspect.

The California Innocence Project then provides an example of a case in which someone was wrongly identified by an eyewitness. Correction: it provided 11 examples, including ones in which the suspects were presented to the witness in a controlled environment (six-pack lineups, similar backgrounds).

The FBI project, in which people look at images captured from the U.S. Capitol itself, is NOT a controlled environment.

Fame, fortune, or both? Gradations of synthetic identity fraud, with a North Hollywood company as an example

In many cases, identity fraud is accomplished by a bad actor impersonating the identity of another person. Many people have found unauthorized credit or debit card transactions that they didn’t perform, and have had to shut down and re-open their cards as a result.

However, there are other cases in which the identity fraud is accomplished by inventing a “person” out of whole cloth. Or partial cloth; a real piece of identity, such as a legitimate U.S. social security number, is combined with fake information, such as non-existent addresses, stock photography headshots, and unverified social media accounts.

The process could be less rigorous, such as creating a Twitter bot to inflate followers (no government ID needed), or it could be more rigorous, in which the synthetic identity gains legitimate credentials such as passports (although this is becoming more difficult as facial recognition compares applicant faces to existing faces).

Synthetic identity fraud can be damaging. Henry Engler of Thomson Reuters (not Thomas Reuters) cites a figure of $6 billion in losses to U.S. lenders from synthetic identity fraud.

But sometimes the fraud, while still fraudulent, is relatively innocuous.

Take the case of a particular web design company in North Hollywood, California. If you visit its website (which, oddly enough, is on the “org” domain), the only listed contact for the company is a guy named Eric.

It’s a whole different story on LinkedIn, however.

According to LinkedIn, the company has dozens of employees, including a vast number of co-founders, chief technology officers, and chief information officers. While some are based in Los Angeles, others are based in Chicago, Dallas, Maidenhead, Kyrgyzstan, and other exotic locations. Most remarkably, based upon some of the employee pictures, the company goes over and above in its attempts to attract female technologists. It’s a statistical anomaly!

Under normal circumstances, this remarkable string of oddities would have gone completely unnoticed. I have never interacted with any of these employees, and they don’t seem to be all that active on the LinkedIn platform. Well, with some exceptions; the Chicago-based CEO of the company has made a valuable contribution to the LinkedIn discussion.

Now most of this went under the radar, until a number of the company’s LinkedIn “employees” made connection requests to a particular individual. Unfortunately for them, this particular individual was Kris Rides, a cybersecurity specialist with Tiro Security.

(Before you ask whether Rides himself is a bot, I should note that he has received 40 recommendations on LinkedIn from people that appear to be real, and has amassed over 500 connections. So if Rides is a bot, he is a very effective one.)

When Rides received these connection requests (including two CTOs and two CIOs at the same company), they struck him as odd. So he shared his experience with his connections, which included other cybersecurity professionals, and people (such as me) who were connected to those other cybersecurity professionals. And they’re talking.
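The oddity that Rides spotted by eye can be written down as a simple check. Here is a rough Python sketch, under the assumption that a company normally has only one person holding each of a short list of titles. The title list, the profile records, and the function name are all made up for illustration, and a real fraud check would use far more signals than this one.

```python
from collections import Counter

# Assumption: titles that a single company usually gives to only one person.
SINGLETON_TITLES = {"CEO", "CTO", "CIO"}


def duplicated_singleton_titles(profiles):
    """Return the usually-unique titles that more than one profile claims."""
    counts = Counter(
        p["title"] for p in profiles if p["title"] in SINGLETON_TITLES
    )
    return {title: n for title, n in counts.items() if n > 1}


# Hypothetical profiles, all claiming to work at the same company.
profiles = [
    {"name": "Profile A", "title": "CTO"},
    {"name": "Profile B", "title": "CTO"},
    {"name": "Profile C", "title": "CIO"},
    {"name": "Profile D", "title": "CIO"},
    {"name": "Profile E", "title": "CEO"},
]

flags = duplicated_singleton_titles(profiles)
print(flags)  # {'CTO': 2, 'CIO': 2}
```

A non-empty result doesn't prove fraud (some legitimate companies do have co-CTOs), but it is the kind of signal that justifies a closer look.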

Pro tip: if you’re engaging in synthetic identity fraud, don’t reach out to a cybersecurity professional.

Now this story probably won’t be a trending topic on Twitter, even if the bots try to make it so, but it’s certainly gaining traction in the audience that counts: namely, technology experts who have the power to tell LinkedIn and others about questionable marketing techniques.

So what happens next? A mea culpa from Eric (or whatever his or her real name is)? Time will tell.