Continuous Testing feedback and observability
Once a model is in production, RIME provides detailed information about the model’s performance so that you can identify and correct issues.
Model monitors
A model under continuous testing displays summary information about model health on the Overview page of the project that contains the model. RIME monitors machine learning model performance across the following categories:
Operational Health tests a model’s performance and accuracy.
Security tests a model’s resilience against compromise from external attacks.
Fairness tests a model’s outcome for fair treatment among subcategories in the data.
Viewing the Project Overview page
Sign in to an RI Platform instance.
The Workspaces page appears.
Click a workspace.
The Workspaces summary page appears.
Select a project.
You can filter or sort the list of projects in a workspace with the Sort and Filter controls in the upper right. Click the glyph to the right of the Filter control to switch between list and card display for projects. Type a string in Search Projects… to display only projects that match the string.
The project overview page appears.
Monitors and risk categories
To pin a monitor of particular interest to the top of the list, click the pushpin icon at the top left corner of the monitor. Use the Zoom controls to change your view of the monitor chart.
Enabling or disabling notifications for a monitor
You can enable or suppress notifications for a specific monitor.
Sign in to an RI Platform instance.
The Workspaces page appears.
Click a workspace.
The Workspaces summary page appears.
Select a project.
You can filter or sort the list of projects in a workspace with the Sort and Filter controls in the upper right. Click the glyph to the right of the Filter control to switch between list and card display for projects. Type a string in Search Projects… to display only projects that match the string.
The project overview page appears.
In the top right corner of a monitor, click Edit Monitor.
The Edit Monitor wizard appears.
Toggle Add to Project Notifications and click Save Settings.
Notifications for this monitor are added to or removed from the project according to the position of the toggle.
Operational risk
Tests for operational risk assess a model’s performance and accuracy. These tests are divided into tests for performance, drift, and abnormal inputs.
Performance tests
Performance test | Description |
---|---|
Accuracy | The ratio of correct predictions to the total number of predictions. |
Average Thresholded Confidence (ATC) | Tests the variance of the ATC between reference and evaluation datasets. ATC estimates the accuracy of unlabeled examples. |
Average Confidence | Tests the variance of average prediction confidence between the reference and evaluation datasets. |
Calibration Comparison | Tests whether the calibration curve of the evaluation set has changed relative to the reference set. |
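As a rough illustration of the quantities in the table above, the following Python sketch computes accuracy, average confidence, and an ATC-style accuracy estimate. The function names and the thresholding rule are assumptions made for illustration. They are not part of the RIME API, and RIME's own implementation may differ.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Ratio of correct predictions to the total number of predictions."""
    if len(labels) != len(predictions) or len(labels) == 0:
        raise ValueError("labels and predictions must be non-empty and equal in length")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def average_confidence(confidences: Sequence[float]) -> float:
    """Mean confidence the model assigns to its predicted class."""
    if len(confidences) == 0:
        raise ValueError("confidences must be non-empty")
    return sum(confidences) / len(confidences)


def atc_threshold(ref_confidences: Sequence[float], ref_accuracy: float) -> float:
    """Assumed rule: pick the confidence cutoff so that the share of reference
    examples at or above it matches the accuracy measured on the reference set."""
    ranked = sorted(ref_confidences, reverse=True)
    k = round(ref_accuracy * len(ranked))
    if k == 0:
        return float("inf")  # no example is expected to be correct
    return ranked[k - 1]


def atc_estimated_accuracy(eval_confidences: Sequence[float], threshold: float) -> float:
    """Share of unlabeled evaluation examples whose confidence meets the cutoff."""
    if len(eval_confidences) == 0:
        raise ValueError("eval_confidences must be non-empty")
    above = sum(1 for c in eval_confidences if c >= threshold)
    return above / len(eval_confidences)
```

A test in this category would then compare the reference value with the evaluation value, for example the difference between the two average confidences, against a configured threshold.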
Drift tests
Drift test | Description |
---|---|
Prediction Drift | Tests the change in distribution between the prediction sets generated by the reference and evaluation datasets. |
Label Drift | Tests the change in distribution in the model's output. |
Label Drift in Population Stability Index (PSI) | Tests the change in distribution in the PSI of a model's output. |
Categorical Feature Drift | Tests the change in distribution within a given categorical feature. |
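The Population Stability Index mentioned above is commonly defined as the sum over categories of (e - r) * ln(e / r), where r and e are the category's share in the reference and evaluation data. The sketch below applies that common definition to categorical values. The function name `psi` and the `epsilon` floor for unseen categories are assumptions for illustration, not RIME's implementation.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def psi(reference: Sequence[Hashable], evaluation: Sequence[Hashable],
        epsilon: float = 1e-6) -> float:
    """Population Stability Index between two samples of categorical values."""
    if len(reference) == 0 or len(evaluation) == 0:
        raise ValueError("both samples must be non-empty")
    ref_counts = Counter(reference)
    eval_counts = Counter(evaluation)
    total = 0.0
    for category in set(ref_counts) | set(eval_counts):
        # Floor each share at epsilon so a category missing from one sample
        # does not cause a division by zero or log(0).
        r = max(ref_counts[category] / len(reference), epsilon)
        e = max(eval_counts[category] / len(evaluation), epsilon)
        total += (e - r) * math.log(e / r)
    return total
```

Identical distributions give a PSI of 0, and the value grows as the evaluation distribution moves away from the reference distribution.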
Abnormal inputs test
There is only one abnormal inputs test: Rare Categories. This test measures the model’s response to data that contains categorical values that are rarely observed in the reference set.
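To make "rarely observed in the reference set" concrete, the following sketch flags evaluation rows whose categorical value falls below a frequency cutoff in the reference data. The function name `rare_category_rows` and the `min_frequency` parameter are hypothetical, and the cutoff RIME actually uses is not stated here. The sketch only locates the rare values; measuring the model's response on those rows is a separate step.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def rare_category_rows(reference_values: Sequence[Hashable],
                       evaluation_values: Sequence[Hashable],
                       min_frequency: float = 0.01) -> List[Tuple[int, Hashable]]:
    """Return (row index, value) pairs for evaluation values that are rare
    or entirely unseen in the reference data."""
    if len(reference_values) == 0:
        raise ValueError("reference_values must be non-empty")
    counts = Counter(reference_values)
    total = len(reference_values)
    flagged = []
    for index, value in enumerate(evaluation_values):
        # Counter returns 0 for values that never appear in the reference set.
        if counts[value] / total < min_frequency:
            flagged.append((index, value))
    return flagged
```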
Security risk
Tests for security risk assess the security of the model and underlying dataset, providing alerts in cases of model evasion or subversion.
Security risk test | Description |
---|---|
Security Events | Lists specific attack events. |
Data Poisoning | Tests for corrupted input data. |
Model Evasion | Tests for adversarial evasion attacks. |
Fairness and Compliance risk
Tests for fairness and compliance risk assess a model’s outcome for fair treatment among subcategories in the data.
Events
The Overview page of a model under continuous testing displays a list of Active Events to the right of the active monitors.
The Events list provides several filter selectors to focus on a specific set of events.
Filter | Description | Potential states |
---|---|---|
Testing | Test type | Stress test or Continuous test |
Risk Categories | Major risk category | Operational, Security, Fairness |
Status | The status of a specific test | Fail, Warning, Pass, Skip |
Level | The importance level of the event | None, Low, High |
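The filters in the table combine to narrow the list: an event is shown only if it matches every filter that is set. The sketch below models that behavior on an in-memory list. The `Event` class, its field names, and `filter_events` are hypothetical stand-ins for illustration and do not correspond to any RIME SDK object.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; fields mirror the filter table above."""
    name: str
    testing: str        # "Stress test" or "Continuous test"
    risk_category: str  # "Operational", "Security", or "Fairness"
    status: str         # "Fail", "Warning", "Pass", or "Skip"
    level: str          # "None", "Low", or "High"


def filter_events(events: Iterable[Event],
                  testing: Optional[str] = None,
                  risk_category: Optional[str] = None,
                  status: Optional[str] = None,
                  level: Optional[str] = None) -> List[Event]:
    """Keep events that match every filter that is set; None means no filter."""
    wanted = {
        "testing": testing,
        "risk_category": risk_category,
        "status": status,
        "level": level,
    }
    selected = []
    for event in events:
        if all(value is None or getattr(event, field) == value
               for field, value in wanted.items()):
            selected.append(event)
    return selected
```

For example, `filter_events(events, risk_category="Security", status="Fail")` would keep only failed security events.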
Inspecting the Events list
You can display the full list of events in a project.
Sign in to an RI Platform instance.
The Workspaces page appears.
Click a workspace.
The Workspaces summary page appears.
Select a project.
You can filter or sort the list of projects in a workspace with the Sort and Filter controls in the upper right. Click the glyph to the right of the Filter control to switch between list and card display for projects. Type a string in Search Projects… to display only projects that match the string.
The project overview page appears.
At the lower right, click View all events –>.
The Events list appears.
(Optional) Click Show Details to display in-depth information about the type of event.
(Optional) Click Resolve to view specific details about the detected event.
These details include the location in the dataset affected by the event.
(Optional, while viewing event details) Click Export CSV to download the event details as a CSV file.
Event root-cause analysis and actionability
RIME provides context and analysis for a detected event. This analysis includes the metric and threshold value that define the event, as well as the underlying data that led to the event.
Sign in to an RI Platform instance.
The Workspaces page appears.
Click a workspace.
The Workspaces summary page appears.
Select a project.
You can filter or sort the list of projects in a workspace with the Sort and Filter controls in the upper right. Click the glyph to the right of the Filter control to switch between list and card display for projects. Type a string in Search Projects… to display only projects that match the string.
The project overview page appears.
At the lower right, click View all events –>.
The Events list appears.
Click an event.
The Analysis pane for that event appears.