Verification

Safety Assurance of ML-Based Systems

Overview

ML-based software makes predictions in settings where failures carry real safety consequences. Our motivating case study was the DHS passenger screening challenge, hosted on Kaggle with the largest prize pool in its history ($1.5 million): TSA screens more than two million passengers daily, high false alarm rates create checkpoint bottlenecks, and false negatives pose severe safety risks. We built abstractions of such ML-enabled systems and inferred preconditions that provide probable guarantees on the safety of their predictions.

Fairify: Fairness Verification of Neural Networks

Overview

We built Fairify, an SMT-based approach that verifies individual fairness of deep neural networks in production. Individual fairness requires that any two individuals who differ only in protected attributes such as race, sex, or age receive similar predictions; unlike group metrics, it captures worst-case discrimination. The property is hard to verify because it must be checked globally over the input domain and because of the non-linear computation nodes in the network.