Validating Your Model with AI Stress Testing
This tutorial will guide you through validating a Binary Classification model with AI Stress Testing.
This model was trained on a slightly modified version of the Adult Census Income dataset and is available in the rime_trial/ bundle provided during installation.
An AI Stress Test is a statistical evaluation of a machine learning model, designed to detect a specific vulnerability. At Robust Intelligence, we are constantly researching new vulnerabilities to test.
For a full list of available stress tests, see our Test Bank.
Running Stress Testing on the Income Example
Stress Testing with a Model and Datasets
In this example, we will be providing the model directly to RI, which enables the most thorough possible analysis. However, RI can be run with prediction logs alone (or even just datasets), which we will illustrate below as well.
To kick off a run of AI Stress Testing using a model and datasets:
rime-engine run-stress-tests --config-path examples/income/stress_tests_model.json
After this finishes running, you should be able to see the results in the web client, where they will be uploaded to the Default Project.
For additional command line options, please see the CLI Reference.
Stress Testing with Prediction Logs
To kick off a run of AI Stress Testing using prediction logs:
rime-engine run-stress-tests --config-path examples/income/stress_tests_prediction_logs.json
Note that the command is exactly the same except for the --config-path provided.
Stress Testing with Bias And Fairness tests
This will run a specific suite of tests geared towards bias and fairness.
To kick off a run in the Compliance setting:
rime-engine run-stress-tests --config-path examples/income/stress_tests_compliance.json
The command is exactly the same as the others; however, in the configuration file we have specified Bias And Fairness under categories as well as a list of protected_features in our data_info:
{
    "run_name": "Income - Bias And Fairness",
    "data_info": {
        "protected_features": ["sex", "race", "education", "age", "native.country"]
        ...
    },
    "test_config": {
        "categories": ["Bias And Fairness"],
        "run_default": false
    },
    "model_info": { ... }
}
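Before launching a Bias And Fairness run, it can help to confirm that every protected feature named in the configuration actually exists as a column in your data. The following is a minimal sketch in plain Python, not part of the RI tooling: the helper check_bias_config and the example column list are hypothetical, and only the configuration keys shown above are used. The elided parts of the configuration are left out on purpose.

```python
import json

# Hypothetical config mirroring only the keys shown in the tutorial;
# the elided parts ("..." and the model_info contents) are omitted.
config_text = """
{
    "run_name": "Income - Bias And Fairness",
    "data_info": {
        "protected_features": ["sex", "race", "education", "age", "native.country"]
    },
    "test_config": {
        "categories": ["Bias And Fairness"],
        "run_default": false
    }
}
"""


def check_bias_config(config, dataset_columns):
    """Return a list of human-readable problems found in the config."""
    problems = []
    categories = config.get("test_config", {}).get("categories", [])
    if "Bias And Fairness" not in categories:
        problems.append("'Bias And Fairness' is missing from test_config.categories")
    protected = config.get("data_info", {}).get("protected_features", [])
    if not protected:
        problems.append("data_info.protected_features is empty or missing")
    missing = [name for name in protected if name not in dataset_columns]
    if missing:
        problems.append(f"protected features not found in dataset: {missing}")
    return problems


config = json.loads(config_text)
# Column names below are assumed for illustration only.
columns = ["age", "sex", "race", "education", "native.country", "hours.per.week"]
problems = check_bias_config(config, columns)
print(problems or "config looks consistent")
```

An empty list means the configuration and the dataset columns agree; any entry in the list is worth fixing before starting the run.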
Running Stress Testing on Your Own Model and Datasets
This guide will cover how to run AI Stress Testing on your own model and datasets.
Define a Python Model File
A model is not required for AI Stress Testing, but providing one will produce better results.
For step-by-step instructions, please see How to Create a Python Model File.
Gather Datasets
For a detailed specification of data formatting, see Input Data Format.
1. Split the Data
For AI Stress Testing, RI requires two datasets: a reference dataset (typically the training data) and an evaluation dataset (typically the validation, testing, or other data).
Currently, RI expects each dataset to be passed in as a .csv or .parquet file where each column is a separate feature.
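If your data currently lives in a single table, one way to produce the two files is a reproducible random split. The sketch below uses only the Python standard library; the column names, the 80/20 ratio, and the file names reference.csv and evaluation.csv are assumptions made for illustration, not RI requirements.

```python
import csv
import os
import random
import tempfile


def split_rows(rows, eval_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and split them into (reference, evaluation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_eval = int(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


def write_csv(path, header, rows):
    """Write a header row followed by one row per example, one feature per column."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# Toy data with made-up columns; replace with your own table.
header = ["age", "education", "label"]
rows = [[20 + i, "Bachelors" if i % 2 else "HS-grad", i % 2] for i in range(10)]

reference, evaluation = split_rows(rows)
output_dir = tempfile.mkdtemp()
write_csv(os.path.join(output_dir, "reference.csv"), header, reference)
write_csv(os.path.join(output_dir, "evaluation.csv"), header, evaluation)
print(len(reference), "reference rows,", len(evaluation), "evaluation rows")
```

Fixing the seed keeps the split stable across runs, which makes results from repeated stress tests easier to compare.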
2. Specify Labels (Recommended)
Providing labels allows RI to surface model performance metrics across our tests. These should be passed in as a column in both datasets.
NOTE: For Multi-Class Classification models, each label should be a nonnegative integer i, where the corresponding prediction for label i should be found in the ith dimension of the prediction vector.
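The label convention in the note above can be sketched as follows. This is an illustrative example only: the class names and the helpers encode_labels and check_alignment are hypothetical, and the check simply verifies that every label is a nonnegative integer that can index into its prediction vector.

```python
def encode_labels(raw_labels, class_order):
    """Map each raw label to its 0-based position in class_order."""
    index_of = {name: i for i, name in enumerate(class_order)}
    return [index_of[label] for label in raw_labels]


def check_alignment(labels, prediction_vectors):
    """Raise ValueError if any label cannot index its prediction vector."""
    for row, (label, probs) in enumerate(zip(labels, prediction_vectors)):
        if not isinstance(label, int) or label < 0:
            raise ValueError(f"row {row}: label {label!r} is not a nonnegative integer")
        if label >= len(probs):
            raise ValueError(f"row {row}: label {label} has no matching dimension in {probs}")


# Hypothetical three-class problem; the class names are made up.
class_order = ["low", "medium", "high"]
labels = encode_labels(["high", "low", "medium"], class_order)

# Position i of each vector is the predicted probability of class i.
predictions = [[0.1, 0.2, 0.7], [0.8, 0.1, 0.1], [0.3, 0.5, 0.2]]
check_alignment(labels, predictions)
print(labels)
```

The important design point is that the same class_order drives both the label encoding and the column order of the model output, so label i and dimension i always refer to the same class.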
3. Specify Predictions (Recommended)
If a model has not already been specified, providing predictions will produce better results. These should be passed in as a column in both datasets.
NOTE: For Multi-Class Classification models, predictions must be specified in a separate csv or parquet file as a dataframe where the ith column represents the predicted probabilities for the ith class/label.
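As a rough illustration of the multi-class layout described in the note above, the sketch below writes a separate predictions file in which column i holds the predicted probability of class i. The column headers class_0, class_1, class_2, the file name predictions.csv, and the helper write_prediction_file are assumptions; consult Input Data Format for the exact header requirements.

```python
import csv
import os
import tempfile


def write_prediction_file(path, class_names, probability_rows):
    """Write one row per example; column i holds the probability of class i."""
    for row_number, probs in enumerate(probability_rows):
        if len(probs) != len(class_names):
            raise ValueError(f"row {row_number}: expected {len(class_names)} probabilities")
        if abs(sum(probs) - 1.0) > 1e-6:
            raise ValueError(f"row {row_number}: probabilities sum to {sum(probs)}")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(class_names)
        writer.writerows(probability_rows)


# Hypothetical headers and values, for illustration only.
class_names = ["class_0", "class_1", "class_2"]
probability_rows = [[0.1, 0.2, 0.7], [0.8, 0.1, 0.1]]

prediction_path = os.path.join(tempfile.mkdtemp(), "predictions.csv")
write_prediction_file(prediction_path, class_names, probability_rows)
print("wrote", prediction_path)
```

Validating the row length and the probability sum before writing catches the most common mistakes, such as a dropped class column or unnormalized scores.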
Create Configuration
With your data and model ready, you can now create a configuration file. Examples of these can be found in the rime_trial/ bundle (the ones used for this example are under examples/income/).
For a detailed reference on what the configuration should look like, see AI Stress Testing Configuration Reference.
Run the CLI
To kick off a run of AI Stress Testing using your configuration file, simply replace the --config-path argument below:
rime-engine run-stress-tests --config-path <PATH-TO-CONFIGURATION>
After this finishes running, you should be able to see the results in the web client, where they will be uploaded to the Default Project.
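If you want to launch runs from a script, for example to test several configurations in sequence, the same command can be wrapped in Python. This is a minimal sketch, assuming rime-engine is on your PATH and signals failure with a nonzero exit code (an assumption not confirmed by this guide); build_command and run_stress_tests are hypothetical helper names.

```python
import shlex
import subprocess


def build_command(config_path):
    """Return the CLI invocation from this guide as an argument list."""
    return ["rime-engine", "run-stress-tests", "--config-path", config_path]


def run_stress_tests(config_path):
    """Invoke the CLI; raises CalledProcessError if it exits with a nonzero code."""
    subprocess.run(build_command(config_path), check=True)


# Preview the exact command without executing it.
command = build_command("examples/income/stress_tests_model.json")
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems when a configuration path contains spaces. Call run_stress_tests with your own path to actually start a run.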
Troubleshooting
If you run into issues, please refer to our Troubleshooting page for help. Your RI representative will also be happy to assist, so feel free to reach out.