Eric Weiss of FindBiometrics has opined on the Tel Aviv master faces study that I previously discussed.
While he does not explicitly talk about the myriad of facial recognition algorithms that were NOT addressed in the study, he does have some additional details about the test dataset.
The three algorithms that were tested
Here’s what FindBiometrics says about the three algorithms that were tested in the Israeli study.
The researchers described (the master faces) as master keys that could unlock the three facial recognition systems that were used to test the theory. In that regard, they challenged the Dlib, FaceNet, and SphereFace systems, and their nine master faces were able to impersonate more than 40 percent of the 5,749 people in the LFW set.
While it initially sounds impressive to say that three facial recognition algorithms were fooled by the master faces, bear in mind that there are hundreds of facial recognition algorithms tested by NIST alone, and (as I said earlier) the test has NOT been duplicated against any algorithms other than the three open source algorithms mentioned.
…let’s look at the algorithms themselves and evaluate the claim that results for the three algorithms Dlib, FaceNet, and SphereFace can naturally be extrapolated to ALL facial recognition algorithms….NIST’s subsequent study…evaluated 189 algorithms specially for 1:1 and 1:N use cases….“Tests showed a wide range in accuracy across developers, with the most accurate algorithms producing many fewer errors.”
In short, just because the three open source algorithms were fooled by master faces doesn’t mean that commercial grade algorithms would also be fooled by master faces. Maybe they would be fooled…or maybe they wouldn’t.
What about the dataset?
The three open source algorithms were tested against the dataset from Labeled Faces in the Wild. As I noted in my prior post, the LFW people emphasize some important caveats about their dataset, including the following:
Many groups are not well represented in LFW. For example, there are very few children, no babies, very few people over the age of 80, and a relatively small proportion of women. In addition, many ethnicities have very minor representation or none at all.
In the FindBiometrics article, Weiss provides some additional detail about dataset representation.
…there is good reason to question the researchers’ conclusion. Only two of the nine master faces belong to women, and most depicted white men over the age of 60. In plain terms, that means that the master faces are not representative of the global public, and they are not nearly as effective when applied to anyone that falls outside one particular demographic.
That discrepancy can largely be attributed to the limitations of the LFW dataset. Women make up only 22 percent of the dataset, and the numbers are even lower for children, the elderly (those over the age of 80), and for many ethnic groups.
Valid points to be sure, although the definition of a “representative” dataset varies depending upon the use case. For example, a representative dataset for a law enforcement database in the city of El Paso, Texas will differ from a representative dataset for an airport database catering to Air France customers.
So what conclusion can be drawn?
Perhaps it’s just me, but scientific entities that conduct studies are always motivated by the need for additional funding. After a study is concluded, it seems that the entities always conclude that “more research is needed”…which can be self-serving, because as long as more research is needed, the scientific entities can continue to receive necessary funding. Imagine the scientific entity that would dare to say “Well, all necessary research has been conducted. We’re closing down our research center.”
But in this case, there IS a need to perform additional research, to test the master faces against different algorithms and against different datasets. Then we’ll know whether this statement from the FindBiometrics article (emphasis mine) is actually true:
Any face-based identification system would be extremely vulnerable to spoofing…