NIST FRTE Evaluation
To ensure that InterLink ID meets the highest global standards for biometric performance, our core algorithm, interlinklabs_001, was submitted to the U.S. National Institute of Standards and Technology (NIST) for evaluation under the FRTE (Face Recognition Technology Evaluation) program.
NIST is the world’s most recognized authority in biometric benchmarking. Its FRTE 1:1 Verification Track is considered the gold standard for evaluating face recognition systems in identity-matching scenarios. It is used by governments, banks, and enterprise security platforms worldwide to assess the accuracy, speed, and reliability of identity verification algorithms.
False Match Rate (FMR):
Probability that two different individuals are incorrectly matched.
False Non-Match Rate (FNMR):
Probability that two images of the same person fail to match.
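The two definitions above can be illustrated with a minimal Python sketch. The function name `error_rates`, the sample scores, and the rule that a pair matches when its score is strictly above the threshold are assumptions made for illustration; they are not taken from the NIST evaluation code.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FMR, FNMR) at a given threshold.

    Assumed decision rule: a pair is declared a match when its
    similarity score is strictly above the threshold.
    """
    # Impostor pairs (different people) that were wrongly accepted.
    false_matches = sum(1 for s in impostor_scores if s > threshold)
    # Genuine pairs (same person) that were wrongly rejected.
    false_non_matches = sum(1 for s in genuine_scores if s <= threshold)
    fmr = false_matches / len(impostor_scores)
    fnmr = false_non_matches / len(genuine_scores)
    return fmr, fnmr


# Made-up scores for illustration only; these are not NIST results.
genuine = [0.91, 0.85, 0.40, 0.77, 0.95]
impostor = [0.10, 0.22, 0.65, 0.05, 0.30, 0.18, 0.12, 0.08]

fmr, fnmr = error_rates(genuine, impostor, threshold=0.60)
print(f"FMR={fmr:.3f}  FNMR={fnmr:.3f}")
```

Raising the threshold in this sketch lowers FMR and raises FNMR, which is why the two rates are reported together at a fixed operating point.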
This chart evaluates how biometric accuracy changes over time — specifically how False Non-Match Rate (FNMR) increases as the time gap between two face captures (photos) grows from 2 to 16 years.
The rightmost panel shows InterLink’s algorithm (interlinklabs_001), compared against other top-performing algorithms across different demographic groups:
InterLink shows a gradual and predictable increase in FNMR as faces age, with no major spikes, indicating strong resilience to long-term facial changes.
Its curve is consistent across demographics, demonstrating fairness and robustness.
Compared to other algorithms (left panels), InterLink performs competitively — even with age gaps of 10–16 years.
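As a rough illustration of how an ageing curve like this could be produced, the sketch below groups genuine pairs by the time elapsed between captures and computes FNMR per group. The function name `fnmr_by_time_gap`, the 2-year bin width, the threshold, and the sample data are all hypothetical; the binning NIST actually uses is not stated here.

```python
def fnmr_by_time_gap(pairs, threshold, bin_width=2):
    """Compute FNMR per elapsed-time bin.

    pairs: iterable of (years_between_captures, similarity_score)
           for genuine (same-person) pairs.
    Returns a dict mapping the start of each bin (in years) to its FNMR.
    """
    counts = {}  # bin_start -> (total pairs, false non-matches)
    for years, score in pairs:
        bin_start = int(years // bin_width) * bin_width
        total, misses = counts.get(bin_start, (0, 0))
        # Assumed rule: a genuine pair is missed when score <= threshold.
        counts[bin_start] = (total + 1, misses + (1 if score <= threshold else 0))
    return {b: m / t for b, (t, m) in sorted(counts.items())}


# Made-up genuine pairs for illustration only.
sample_pairs = [(2.5, 0.9), (3.0, 0.8), (10.2, 0.7),
                (11.9, 0.4), (15.0, 0.3), (15.5, 0.9)]

for bin_start, rate in fnmr_by_time_gap(sample_pairs, threshold=0.5).items():
    print(f"{bin_start}-{bin_start + 2} years: FNMR={rate:.2f}")
```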
The figure shows similarity scores for 12 genuine and 8 impostor image pairs used in the May 2018 paper "Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms" (Phillips et al., https://doi.org/10.1073/pnas.1721355115). The threshold (red horizontal line) is a value calibrated to give FMR = 0.0001 on mugshot images. Points above the threshold correspond to pairs determined to be genuine, and points below the threshold correspond to pairs determined to be impostors. If the determined class (genuine or impostor) matches the real class, points will be blue; if not, red. An X represents face detection failure in either of the images in the pair. Note that the sample size (n=20) is small, and the figure may change substantially if larger or different sets are used. The images can be viewed at https://www.pnas.org/doi/suppl/10.1073/pnas.1721355115/suppl_file/pnas.1721355115.sapp.pdf, where Gen 01 corresponds to Same-Identity Pair 1, Gen 02 corresponds to Same-Identity Pair 2, and so on.
The threshold T is tuned to ensure a False Match Rate (FMR) = 0.0001 on the NIST mugshot dataset.
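One common way to calibrate a threshold to a target FMR is to take a high quantile of the impostor score distribution. The sketch below shows that idea under stated assumptions: the function name `calibrate_threshold` is hypothetical, the impostor scores are synthetic, and a pair is treated as a match when its score is strictly above the threshold. It is not a description of how NIST computes its thresholds.

```python
def calibrate_threshold(impostor_scores, target_fmr):
    """Return a threshold T whose empirical FMR does not exceed target_fmr.

    Assumed decision rule: match when score > T.
    """
    ranked = sorted(impostor_scores, reverse=True)
    n = len(ranked)
    # Number of impostor pairs allowed to score above T.
    # The small epsilon guards against floating-point round-down.
    allowed = int(target_fmr * n + 1e-9)
    if allowed >= n:
        return float("-inf")  # every impostor pair may be accepted
    # Only the `allowed` highest scores can lie strictly above this value.
    return ranked[allowed]


# Synthetic impostor scores 0.000, 0.001, ..., 0.999 (illustration only).
scores = [i / 1000 for i in range(1000)]
T = calibrate_threshold(scores, target_fmr=0.01)
observed_fmr = sum(1 for s in scores if s > T) / len(scores)
print(f"T={T}  observed FMR={observed_fmr}")
```

With a target of FMR = 0.0001, at least 10,000 impostor comparisons are needed before this procedure allows even a single false match, so calibration at that operating point depends on a large comparison set.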
This figure presents an analysis of False Non-Match Rates (FNMR) for InterLink’s biometric algorithm (interlinklabs_001) under varying demographic and quality conditions, using the Visa-Border dataset from NIST.
The system is evaluated at an operating point of
FNMR is computed by comparing low-quality border-crossing photos with high-quality enrollment images (e.g., visa application portraits), across 20+ countries of birth. Each group is segmented by gender and age bin (≤45 years vs. >45 years).
Lower FNMR values indicate better matching accuracy.
Square dots represent empirical FNMR estimates.
Vertical bars represent 95% bootstrap confidence intervals.
Overlapping intervals suggest no statistically significant bias across age or gender.
For women (left) and men (right), the panels show false non-match rates when mediocre border-crossing photos are compared against high-quality reference application portraits collected from individuals born in the country identified on the horizontal axis and aged either above or below 45 years at the time of the application photo. The square dots give the empirical FNMR point estimate. The vertical lines give bootstrap 95-percent confidence intervals around the point estimate. The intervals are wider when the country and age group is less represented in this dataset. Overlapping intervals are an indication of no significant difference. Low FNMR values are synonymous with high accuracy.
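To make the confidence intervals concrete, here is a minimal percentile-bootstrap sketch for a single demographic group. The exact bootstrap variant used by NIST is not stated, so the percentile method, the function name `bootstrap_fnmr_ci`, the resample count, and the sample data are assumptions.

```python
import random


def bootstrap_fnmr_ci(outcomes, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for FNMR.

    outcomes: list of 0/1 flags, one per genuine comparison,
              where 1 means the pair failed to match.
    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(outcomes)
    estimates = []
    for _ in range(n_resamples):
        # Resample the comparisons with replacement and recompute FNMR.
        resample = rng.choices(outcomes, k=n)
        estimates.append(sum(resample) / n)
    estimates.sort()
    alpha = 1.0 - confidence
    lo_idx = max(0, round((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return sum(outcomes) / n, estimates[lo_idx], estimates[hi_idx]


# Two made-up groups with the same FNMR (2%) but different sample sizes.
large_group = [1] * 20 + [0] * 980
small_group = [1] * 2 + [0] * 98

for name, group in [("large", large_group), ("small", small_group)]:
    point, low, high = bootstrap_fnmr_ci(group)
    print(f"{name}: FNMR={point:.3f}  95% CI=[{low:.3f}, {high:.3f}]")
```

The smaller group produces a wider interval at the same point estimate, which matches the caption's remark about less-represented country and age groups.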
In the cross-demographic false match rate $\mathrm{FMR}(i, j)$:
i = demographic group of the probe image
j = demographic group of the enrollment image
For non-mate comparisons of mugshots of black and white (B-W) males and females (M-F), the panels show false match rates for five algorithms: two for which on-diagonal demographic differentials are low, two for which they are high, and the target algorithm in this report. In the top row of panels, the threshold is set for each algorithm to give FMR = 0.001 for white males, which is the demographic that usually gives the lowest FMR. In the second row, the threshold is set so that the white-male FMR = 0.0001. This means the top right box is the same color in all panels of a row.
For both FMR and FNMR, lower values are better. Ideal performance sits at the bottom-left corner of each graph.
FB: Black Female
MB: Black Male
FW: White Female
MW: White Male
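Let $s(x_1, x_2)$ be the similarity score between two facial images $x_1$ and $x_2$.
The system applies a calibrated threshold T such that a pair is declared a match when $s(x_1, x_2) > T$, with T chosen so that the false match rate at that threshold is $\mathrm{FMR}(T) = 0.0001$.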
To ensure equitable performance across diverse populations, we evaluate our face verification algorithm (interlinklabs_001) using demographic-specific False Match Rates (FMR). This analysis is based on non-mate comparisons across four demographic groups, Black Female (FB), White Female (FW), Black Male (MB), and White Male (MW), using the mugshot dataset provided by NIST FRTE.
$\mathrm{FMR}(i, j)$ represents the probability of a false match between two individuals from demographic groups i and j.
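A cross-demographic false match rate table of this kind could be computed along the following lines. This is a sketch only: the function name `fmr_matrix`, the tuple layout, the threshold, and the scores are hypothetical, and a false match is assumed to occur when a non-mate pair scores strictly above the threshold.

```python
from collections import defaultdict


def fmr_matrix(comparisons, threshold):
    """Compute FMR for each (probe group, enrollment group) pair.

    comparisons: iterable of (probe_group, enrollment_group, score)
                 for non-mate (different-person) pairs.
    Returns a dict mapping (i, j) to the empirical false match rate.
    """
    totals = defaultdict(int)
    false_matches = defaultdict(int)
    for probe_group, enrol_group, score in comparisons:
        key = (probe_group, enrol_group)
        totals[key] += 1
        if score > threshold:  # non-mate pair wrongly accepted
            false_matches[key] += 1
    return {key: false_matches[key] / totals[key] for key in totals}


# Made-up non-mate comparisons for illustration only.
sample = [
    ("FB", "FB", 0.7), ("FB", "FB", 0.2),
    ("FB", "MW", 0.1), ("FB", "MW", 0.3),
    ("MW", "MW", 0.4), ("MW", "MW", 0.2),
    ("MW", "MW", 0.9), ("MW", "MW", 0.1),
]

for (i, j), rate in fmr_matrix(sample, threshold=0.5).items():
    print(f"FMR({i}, {j}) = {rate:.2f}")
```

The on-diagonal entries (i equal to j) are the within-group rates; comparing them across groups is one way to read demographic differentials from such a table.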