RIME v19 Release Note

Important Notes

  • All deployments now require authentication. Users can login via authN service or SSO.

  • Robust Intelligence now has workspaces. For existing deployments, all objects (projects, stress test runs, firewalls, etc) are moved to a default “Workspace 1” and assigned the existing agent.

  • API tokens are now associated with user. All existing API tokens are assigned to “Workspace 1” and assigned to the default admin.

  • User cannot create and update firewalls from test runs prior to v19

Changelog

General

  • Embeddings - RI Platform now offers Stress Testing and Continuous Testing for models trained to run over embeddings.

  • NLI - The new Natural Language Inference is a subset of the Text Classification user mode and it enables transformation and adversarial attack tests to apply over multiple text fields. This modality also adds support for NLI Hugging Face models

  • Governance Dashboard - RI platform now offers a dashboard that serves as a single pane of glass for all models in production, providing model health status and the ability to track models to any custom metric. This is behind a feature flag, so reach out if you’d like to know more.

Firewall and Continuous Testing

  • New Model Promotion Flow - When promoting a model to production, the user flow to set up Continuous Testing and Firewall has been updated. The user now has the ability to choose the reference data set, set up a data source, activate scheduling, and set up the Firewall in this set up step.

  • Scheduling CT Runs - Whereas previously, users of CT had to manually upload new bins of data via the RI SDK, the user can now connect a data source to which they can send data and then schedule runs of CT over this data source such that it automatically kicks off runs at the right time.

  • New Data Source Linking - CT now accepts new data sources.

    • Data Collector - The user can send data points to the RI Data Collector in real time as a data source for CT. This automatically processes the data to be ready to processed by CT and removes the need to format data files for upload. Scheduled CT runs can be run over this data source

    • Delta Lake Data Connector - The user can establish a connection to their Databricks Delta Lake account and use any tables as a data source for CT. By Scheduling runs over this dataset, the user can simply append inference data points to this table and scheduled CT will run over all new data.

  • Updating the Reference Set - Whereas users were previously constrained to using the reference set of the associated Stress Test for a CT instance, now they can update the baseline reference set after the CT instance is created.

    • Manual - The user can upload a new reference set manually via the SDK

    • Fixed Bin - The user can choose to set a specific time bin as the new reference set.

    • Rolling Window - The user can choose between default rolling windows for the reference set (e.g. the last 7 days).

RI Platform Changes

  • Workspaces - A workspace is an environment for organizing and accessing all of your Robust Intelligence assets. All the assets in a workspace are unique to that workspace.More information here.

  • Multiple Agents - Previously there could be only a single agent (data plane) as part of a deployment. Now, multiple data planes can be configured per deployment and assigned to workspaces.

  • GKE Agent - The agent can now be deployed on GKE cluster (Google Cloud Platform). This allows the data plane to execute the machine learning testing suite on your models and data from the Google Cloud Platform infrastructure.

  • Data Sources - Administrators can configure data sources like Databricks Delta Lake using the UI. These data sources can then directly be used to run stress tests and continuous tests. More information here.

Compliance Management

  • Enhanced Auto-generated Model Cards - Additional test case details have been added to our comprehensive downloadable report that is auto-populated with validation information from test runs and allows for entering of custom information. This can be used to meet internal team or external regulator reporting requirements.