NIST FRTE Evaluation

To ensure that InterLink ID meets the highest global standards for biometric performance, our core algorithm, interlinklabs_001, was submitted to the U.S. National Institute of Standards and Technology (NIST) for evaluation under the FRTE (Face Recognition Technology Evaluation) program.

NIST is the world’s most recognized authority in biometric benchmarking. Their FRTE 1:1 Verification Track is considered the gold standard for evaluating face recognition systems in identity-matching scenarios. It is used by governments, banks, and enterprise security platforms worldwide to assess the accuracy, speed, and reliability of identity verification algorithms.

1. Performance is measured using two standard metrics

False Match Rate (FMR):

Probability that two different individuals are incorrectly matched.

$\small \text{FMR} = \frac{\text{False Matches}}{\text{Impostor Comparisons}}$

False Non-Match Rate (FNMR):

Probability that two images of the same person fail to match.

$\small \text{FNMR} = \frac{\text{False Non-Matches}}{\text{Genuine Comparisons}}$

The lower the values of both, the better. Ideal performance sits at the bottom-left corner of each graph.
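As a minimal sketch, the two rates can be computed directly from lists of comparison scores. The function name `error_rates`, the sample scores, and the threshold of 0.5 are illustrative assumptions, not NIST data or InterLink code; the sketch also assumes a pair is declared a match when its score is at or above the threshold.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FMR, FNMR) at a given threshold.

    Assumption: a pair is declared a match when score >= threshold.
    """
    # Impostor pairs that were wrongly accepted.
    false_matches = sum(1 for s in impostor_scores if s >= threshold)
    # Genuine pairs that were wrongly rejected.
    false_non_matches = sum(1 for s in genuine_scores if s < threshold)
    fmr = false_matches / len(impostor_scores)
    fnmr = false_non_matches / len(genuine_scores)
    return fmr, fnmr


# Synthetic scores for illustration only.
genuine = [0.91, 0.85, 0.40, 0.77]
impostor = [0.10, 0.35, 0.62, 0.05, 0.20]

fmr, fnmr = error_rates(genuine, impostor, threshold=0.5)
print(f"FMR={fmr:.2f}  FNMR={fnmr:.2f}")
```

Raising the threshold lowers FMR and raises FNMR, which is why the two are always reported together at a stated operating point.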

2. FNMR vs. Elapsed Time for Mugshot Images Across Algorithms and Demographics

This chart evaluates how biometric accuracy changes over time — specifically how False Non-Match Rate (FNMR) increases as the time gap between two face captures (photos) grows from 2 to 16 years.

The rightmost panel shows InterLink’s algorithm (interlinklabs_001), compared against other top-performing algorithms across different demographic groups:

$\small B_F$ : Black Female
$\small B_M$ : Black Male
$\small W_F$ : White Female
$\small W_M$ : White Male

🔎 Key Observations:

InterLink shows a gradual and predictable increase in FNMR as faces age, with no major spikes, indicating strong resilience to long-term facial changes.
Its curve is consistent across demographics, demonstrating fairness and robustness.
Compared to other algorithms (left panels), InterLink performs competitively — even with age gaps of 10–16 years.

This confirms InterLink’s suitability for long-term, one-time onboarding use cases such as decentralized identity, KYC, and Proof-of-Personhood.

3. Similarity Scores for Genuine and Impostor Image Pairs

The figure shows similarity scores for 12 genuine and 8 impostor image pairs used in the May 2018 paper "Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms" (Phillips et al., https://doi.org/10.1073/pnas.1721355115). The threshold (red horizontal line) is a value calibrated to give FMR = 0.0001 on mugshot images. Points above the threshold correspond to pairs determined to be genuine, and points below the threshold correspond to pairs determined to be impostors. If the determined class (genuine or impostor) matches the real class, the point is blue; if not, it is red. An X represents a face detection failure in either image of the pair. Note that the sample size (n=20) is small, and the figure may change substantially if larger or different sets are used. The images can be viewed at https://www.pnas.org/doi/suppl/10.1073/pnas.1721355115/suppl_file/pnas.1721355115.sapp.pdf, where Gen 01 corresponds to Same-Identity Pair 1, Gen 02 corresponds to Same-Identity Pair 2, and so on.

🔎 Key Definitions:

Let $\small S(x_1,x_2)$ be the similarity score between two facial images $\small x_1$ and $\small x_2$.
The system applies a calibrated threshold $\small T$ such that: $\small S(x_1, x_2) \geq T \Rightarrow \text{Genuine Match} \quad ; \quad S(x_1, x_2) < T \Rightarrow \text{Impostor}$
The threshold $\small T$ is tuned to give a False Match Rate (FMR) of 0.0001 on the NIST mugshot dataset.
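The threshold calibration described in this section (choosing T so that FMR hits a target value) can be illustrated with a simple empirical-quantile approach. This is a sketch under stated assumptions, not NIST's documented procedure: the function name `calibrate_threshold` and the scores are hypothetical, and a target of 0.2 is used only because the sample has ten scores (a target of 0.0001 would need far more impostor comparisons).

```python
import math


def calibrate_threshold(impostor_scores, target_fmr):
    """Return the lowest threshold whose empirical FMR is <= target_fmr.

    Assumption: a pair is declared a match when score >= threshold.
    Requires Python 3.9+ for math.nextafter.
    """
    ranked = sorted(impostor_scores, reverse=True)
    # Maximum number of false matches the target permits
    # (small epsilon guards against floating-point round-down).
    allowed = math.floor(target_fmr * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        # Every impostor pair may match, so the lowest score suffices.
        return ranked[-1]
    # The threshold must sit just above the (allowed + 1)-th highest score,
    # so that at most `allowed` impostor scores reach it.
    return math.nextafter(ranked[allowed], math.inf)


# Synthetic impostor scores for illustration only.
impostor = [0.10, 0.35, 0.62, 0.05, 0.20, 0.48, 0.15, 0.30, 0.55, 0.25]

T = calibrate_threshold(impostor, target_fmr=0.2)
observed_fmr = sum(1 for s in impostor if s >= T) / len(impostor)
print(f"T={T:.4f}  observed FMR={observed_fmr:.2f}")
```

Because the threshold is set on one dataset, the FMR actually observed on a different image set can differ from the target.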

4. False Negative Demographic Effects (Visa-Border Dataset)

This figure presents an analysis of False Non-Match Rates (FNMR) for InterLink’s biometric algorithm (interlinklabs_001) under varying demographic and quality conditions, using the Visa-Border dataset from NIST.

📐 Methodology:

The system is evaluated at an operating point of $\small \text{FMR} = 0.00001$ with threshold $\small T = 177.01$.

FNMR is computed by comparing low-quality border-crossing photos with high-quality enrollment images (e.g., visa application portraits), across 20+ countries of birth. Each group is segmented by gender and age bin (≤45 years vs. >45 years).

🔎 Interpretation:

Lower FNMR values indicate better matching accuracy.
Square dots represent empirical FNMR point estimates.
Vertical bars represent 95% bootstrap confidence intervals.
Overlapping intervals indicate no statistically significant difference between the groups being compared.

For women (left) and men (right), the panels show false non-match rates when mediocre border-crossing photos are compared against high-quality reference application portraits collected from individuals born in the country identified on the horizontal axis and aged either above or below 45 years at the time of the application photo. The square dots give the empirical FNMR point estimate. The vertical lines give bootstrap 95-percent confidence intervals around the point estimate. The intervals are wider when the country and age group is less represented in this dataset. Overlapping intervals are an indication of no significant difference. Low FNMR values are synonymous with high accuracy.
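To make the confidence intervals concrete, here is a minimal sketch of a percentile bootstrap for FNMR within one group. The exact bootstrap variant NIST uses is not stated here, so the percentile method, the function name `bootstrap_fnmr_ci`, the resample count, and the synthetic scores are all assumptions for illustration.

```python
import random


def bootstrap_fnmr_ci(genuine_scores, threshold, n_resamples=2000,
                      confidence=0.95, seed=0):
    """Return (point_estimate, lower, upper) for FNMR.

    Uses a percentile bootstrap over genuine comparisons; a pair counts
    as a false non-match when score < threshold.
    """
    rng = random.Random(seed)
    n = len(genuine_scores)
    # 1 marks a false non-match, 0 a correct match.
    misses = [1 if s < threshold else 0 for s in genuine_scores]
    point = sum(misses) / n

    estimates = []
    for _ in range(n_resamples):
        # Resample the comparisons with replacement.
        resample = rng.choices(misses, k=n)
        estimates.append(sum(resample) / n)
    estimates.sort()

    alpha = 1.0 - confidence
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[min(n_resamples - 1,
                          int((1 - alpha / 2) * n_resamples))]
    return point, lower, upper


# Synthetic group: 200 genuine comparisons, 10 of which fall below T.
scores = [200.0] * 190 + [150.0] * 10
point, lower, upper = bootstrap_fnmr_ci(scores, threshold=177.01)
print(f"FNMR={point:.3f}  95% CI=[{lower:.3f}, {upper:.3f}]")
```

With fewer comparisons in a group, the resampled estimates spread out more, which is why sparsely represented country and age groups show wider intervals.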

5. Demographic Fairness Evaluation — False Match Rate (FMR)

To ensure equitable performance across diverse populations, we evaluate our face verification algorithm (interlinklabs_001) using demographic-specific False Match Rates (FMR). This analysis is based on non-mate comparisons across four demographic groups: Black Female (FB), White Female (FW), Black Male (MB), and White Male (MW), using the mugshot dataset provided by NIST FRTE.

🔎 Definition:

$\small \text{FMR}_{i,j} = \frac{\text{False Matches}_{(i,j)}}{\text{Total Impostor Comparisons}_{(i,j)}}$

Where:

i = demographic group of the probe image
j = demographic group of the enrollment image
$\small \text{FMR}_{i,j}$ represents the probability of a false match between two individuals from demographic groups i and j
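The per-pair definition above can be sketched as a small tally over impostor comparisons. The function name `fmr_matrix`, the tuple layout, the scores, and the threshold of 0.5 are hypothetical and chosen only to illustrate the formula; the group labels follow the ones used in this section.

```python
from collections import defaultdict


def fmr_matrix(impostor_comparisons, threshold):
    """Return {(probe_group, enrollment_group): FMR} for each observed pair.

    impostor_comparisons: iterable of (probe_group, enrollment_group, score).
    Assumption: a pair is declared a match when score >= threshold.
    """
    false_matches = defaultdict(int)
    totals = defaultdict(int)
    for probe_group, enrollment_group, score in impostor_comparisons:
        pair = (probe_group, enrollment_group)
        totals[pair] += 1
        if score >= threshold:
            false_matches[pair] += 1
    # Divide false matches by total impostor comparisons, per (i, j) cell.
    return {pair: false_matches[pair] / totals[pair] for pair in totals}


# Synthetic impostor comparisons for illustration only.
comparisons = [
    ("FB", "FB", 0.7), ("FB", "FB", 0.2),
    ("FB", "MW", 0.1),
    ("MW", "MW", 0.3), ("MW", "MW", 0.4),
    ("MW", "FB", 0.6),
]

matrix = fmr_matrix(comparisons, threshold=0.5)
for (i, j), rate in sorted(matrix.items()):
    print(f"FMR[{i},{j}] = {rate:.2f}")
```

The on-diagonal cells (i equal to j) are the within-group false match rates that the demographic-differential comparison focuses on.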

For non-mate comparisons of mugshots of black and white (B-W) males and females (M-F), the panels show false match rates for five algorithms: two for which on-diagonal demographic differentials are low, two for which they are high, and the target algorithm in this report. In the top row of panels, the threshold is set for each algorithm to give FMR = 0.001 for white males, which is the demographic that usually gives the lowest FMR. In the second row, the white-male FMR = 0.0001. This means the top right box is the same color in all panels of a row.

