Python SDK Reference

Python package providing access to Robust Intelligence.

The RIME SDK provides a programmatic interface to a Robust Intelligence instance, allowing you to create projects, start stress tests, query the backend for test run results, and more from within your Python code. To begin, initialize a client, which acts as the main entry point to SDK functions.

class rime_sdk.Client(domain: str, api_key: str = '', channel_timeout: float = 180.0, disable_tls: bool = False, ssl_ca_cert: Optional[Union[Path, str]] = None, cert_file: Optional[Union[Path, str]] = None, key_file: Optional[Union[Path, str]] = None, assert_hostname: Optional[bool] = None)

A Client object provides an interface to Robust Intelligence’s services.

To initialize the Client, provide the address of your RI instance and your API key. The Client can be used for creating Projects, starting Stress Test Jobs, and querying the backend for current Stress Test Jobs.

Parameters:
  • domain – str The base domain/address of the RIME service.

  • api_key – str The API key used to authenticate to RIME services.

  • channel_timeout – Optional[float] The amount of time in seconds to wait for responses from the cluster.

  • disable_tls – Optional[bool] A Boolean that disables TLS when set to True. By default, this value is set to False and TLS is enabled.

  • ssl_ca_cert – Optional[Union[Path, str]] Specifies the path to the certificate file used to verify the peer.

  • cert_file – Optional[Union[Path, str]] Path to the Client certificate file.

  • key_file – Optional[Union[Path, str]] Path to the Client key file.

  • assert_hostname – Optional[bool] Enable/disable SSL hostname verification.

Raises:

ValueError – This error is generated when a connection to the RIME cluster cannot be established within the interval specified by channel_timeout.

Example

rime_client = Client("my_vpc.rime.com", "api-key")
create_project(name: str, description: str, model_task: str, use_case: Optional[str] = None, ethical_consideration: Optional[str] = None, profiling_config: Optional[dict] = None, general_access_role: Optional[str] = 'ACTOR_ROLE_NONE', run_time_info: Optional[dict] = None) Project

Create a new RIME Project.

Projects enable you to organize Stress Test runs. A natural way to organize Stress Test runs is to create a Project for each specific ML task, such as predicting whether a transaction is fraudulent.

Parameters:
  • name – str Name of the new Project.

  • description – str Description of the new Project.

  • model_task – str Machine Learning task associated with the Project. Must be one of “MODEL_TASK_REGRESSION”, “MODEL_TASK_BINARY_CLASSIFICATION”, “MODEL_TASK_MULTICLASS_CLASSIFICATION”, “MODEL_TASK_NAMED_ENTITY_RECOGNITION”, “MODEL_TASK_RANKING”, “MODEL_TASK_OBJECT_DETECTION”, “MODEL_TASK_NATURAL_LANGUAGE_INFERENCE”, “MODEL_TASK_FILL_MASK”.

  • use_case – Optional[str] Description of the use case of the Project.

  • ethical_consideration – Optional[str] Description of ethical considerations for this Project.

  • profiling_config – Optional[dict] Configuration for the data and model profiling across all test runs.

  • general_access_role – Optional[str], Project roles assigned to the workspace members. Allowed Values: ACTOR_ROLE_USER, ACTOR_ROLE_VIEWER, ACTOR_ROLE_NONE.

  • run_time_info – Optional[dict] Default runtime information for all test runs in the project.

Return type:

Project

Raises:

ValueError – This error is generated when the request to the Project service fails.

Example

project = rime_client.create_project(
    name="foo",
    description="bar",
    model_task="MODEL_TASK_BINARY_CLASSIFICATION",
)
get_project(project_id: str) Project

Get Project by ID.

Parameters:

project_id – str ID of the Project to return.

Return type:

Project

Example

project = rime_client.get_project("123-456-789")
delete_project(project_id: str, force: Optional[bool] = False) None

Delete a Project by ID.

Parameters:
  • project_id – str ID of the Project to delete.

  • force – Optional[bool] = False When set to True, the Project will be deleted immediately. By default, a confirmation is required.

Example

rime_client.delete_project("123-456-789", force=True)
create_managed_image(name: str, requirements: List[ManagedImagePipRequirement], package_requirements: Optional[List[ManagedImagePackageRequirement]] = None, python_version: Optional[str] = None) ImageBuilder

Create a new managed Docker image with the desired custom requirements to run RIME on.

These managed Docker images are managed by the RIME cluster and automatically upgrade when the installed version of RIME upgrades. Note: Images take a few minutes to be built.

This method returns an object that can be used to track the progress of the image building job. The new custom image is only available for use in a stress test once it has status READY.

Managed images are not currently supported in a Cloud deployment. Please reach out to Robust Intelligence for support if this functionality is required for your deployment.

Parameters:
  • name – str The name of the new Managed Image. This name serves as the unique identifier of the Managed Image. The call fails when an image with the specified name already exists.

  • requirements – List[ManagedImagePipRequirement] List of additional pip requirements to be installed on the managed image. A ManagedImagePipRequirement can be created with the helper method Client.pip_requirement. The first argument is the name of the library (e.g. tensorflow or xgboost) and the second argument is a valid pip version specifier (e.g. >=0.1.2 or ==1.0.2) or an exact version (e.g. 1.1.2) for the library, following PEP 440 (https://peps.python.org/pep-0440/#version-specifiers).

  • package_requirements – Optional[List[ManagedImagePackageRequirement]] [BETA] An optional List of additional operating system package requirements to install on the Managed Image. Currently only Rocky Linux package requirements are supported. Create a ManagedImagePackageRequirement parameter with the Client.os_requirement helper method. The first argument is the name of the package (e.g. texlive or vim) and the second optional argument is a valid yum version specifier (e.g. 0.1.2) for the package.

  • python_version – Optional[str] An optional version string specifying only the major and minor version for the python interpreter used. The string should be of the format X.Y and be present in the set of supported versions.

Returns:

An ImageBuilder object that provides an interface for monitoring the job in the backend.

Return type:

ImageBuilder

Raises:

ValueError – This error is generated when the request to the ImageRegistry service fails.

Example

reqs = [
     # Fix the version of `xgboost` to `1.0.2`.
     rime_client.pip_requirement("xgboost", "==1.0.2"),
     # We do not care about the installed version of `tensorflow`.
     rime_client.pip_requirement("tensorflow")
 ]

# Start a new image building job
builder_job = rime_client.create_managed_image("xgboost102_tf", reqs)

# Wait until the job has finished and print out status information.
# Once this prints out the `READY` status, your image is available for
# use in Stress Tests.
builder_job.get_status(verbose=True, wait_until_finish=True)
has_managed_image(name: str, check_status: bool = False) bool

Check whether a Managed Image with the specified name exists.

Parameters:
  • name – str The unique name of the Managed Image to check. The call returns False when no image with the specified name exists.

  • check_status – bool Flag that determines whether to check the image status. When this flag is set to True, the call returns True if and only if the image with the specified name exists AND the image is ready to be used.

Returns:

Specifies whether a Managed Image with this name exists.

Return type:

bool

Example

if rime_client.has_managed_image("xgboost102_tensorflow"):
     print("Image exists.")
get_managed_image(name: str) Dict

Get Managed Image by name.

Parameters:

name – str The unique name of the Managed Image to retrieve. The call raises an error when no image exists with this name.

Returns:

A dictionary with information about the Managed Image.

Return type:

Dict

Example

image = rime_client.get_managed_image("xgboost102_tensorflow")
delete_managed_image(name: str) None

Delete a managed Docker image.

Parameters:

name – str The unique name of the Managed Image.

Example

rime_client.delete_managed_image("xgboost102_tensorflow")
static pip_requirement(name: str, version_specifier: Optional[str] = None) ManagedImagePipRequirement

Construct a PipRequirement object for use in create_managed_image().

static os_requirement(name: str, version_specifier: Optional[str] = None) ManagedImagePackageRequirement

Construct a PackageRequirement object for create_managed_image().

static pip_library_filter(name: str, fixed_version: Optional[str] = None) ListImagesRequestPipLibraryFilter

Construct a PipLibraryFilter object for use in list_managed_images().

list_managed_images(pip_library_filters: Optional[List[ListImagesRequestPipLibraryFilter]] = None) Iterator[Dict]

List all managed Docker images.

Enables searching for images with specific pip libraries installed so that users can reuse Managed Images for Stress Tests.

Parameters:

pip_library_filters – Optional[List[ListImagesRequestPipLibraryFilter]] Optional list of pip libraries to filter by. Construct each ListImagesRequest.PipLibraryFilter object with the pip_library_filter convenience method.

Returns:

An iterator of dictionaries, where each dictionary represents a single Managed Image.

Return type:

Iterator[Dict]

Raises:

ValueError – This error is generated when the request to the ImageRegistry service fails or the list of pip library filters is improperly specified.

Example

# Filter for an image with catboost1.0.3 and tensorflow installed.
filters = [
    rime_client.pip_library_filter("catboost", "1.0.3"),
    rime_client.pip_library_filter("tensorflow"),
]

# Query for the images.
images = rime_client.list_managed_images(
    pip_library_filters=filters)

# To get the names of the returned images.
[image["name"] for image in images]
list_agents() Iterator[Dict]

List all Agents available to the user.

Returns:

An iterator of dictionaries, where each dictionary represents a single Agent.

Return type:

Iterator[Dict]

Raises:

ValueError – This error is generated when the request to the AgentManager service fails.

Example

# Query for the Agents.
agents = rime_client.list_agents()

# To get the names of the returned Agents.
[agent["name"] for agent in agents]
list_projects() Iterator[Project]

List all Projects.

Returns:

An iterator of Projects.

Return type:

Iterator[Project]

Raises:

ValueError – This error is generated when the request to the Project service fails.

Example

# Query for projects.
projects = rime_client.list_projects()
start_stress_test(test_run_config: dict, project_id: str, agent_id: Optional[str] = None, **exp_fields: Dict[str, object]) Job

Start a Stress Testing run.

Parameters:
  • test_run_config – dict Configuration for the test to be run, which specifies unique ids to locate the model and datasets to be used for the test.

  • project_id – str Identifier for the Project where the resulting test run will be stored.

  • agent_id – Optional[str] ID for the Agent where the Stress Test will be run. Uses the default Agent for the workspace when not specified.

  • exp_fields – Dict[str, object] [BETA] Fields for experimental features.

Returns:

A Job that provides information about the model Stress Test job.

Return type:

Job

Raises:

ValueError – This error is generated when the request to the ModelTesting service fails.

Example

This example assumes that reference and evaluation datasets are registered with identifiers “foo” and “bar” respectively, and that a model with the unique identifier model_uuid is registered.

config = {
    "data_info": {"ref_dataset_id": "foo", "eval_dataset_id": "bar"},
    "model_id": model_uuid,
    "run_name": "My Stress Test Run",
}

Run the job using the specified configuration and the default Docker image in the RIME backend. Store the results in the RIME Project associated with this object.

job = rime_client.start_stress_test(
   test_run_config=config, project_id="123-456-789"
)
get_test_run(test_run_id: str) TestRun

Get a TestRun object with the specified test_run_id.

Checks that a test run with the given test_run_id exists, then returns the corresponding TestRun object.

Parameters:

test_run_id – str ID of the test run to query for.

Returns:

A TestRun object corresponding to the test_run_id

Return type:

TestRun
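Example

A minimal usage sketch; the test run ID below is an illustrative placeholder:

```python
# Hypothetical test run ID, e.g. as printed by a previous Stress Test job.
test_run_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"

# Hypothetical call (requires a configured Client):
# test_run = rime_client.get_test_run(test_run_id)
```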

get_ct_for_project(project_id: str) ContinuousTest

Get the active ContinuousTest for a Project if one exists.

Query the backend for an active ContinuousTest in a specified Project which can be used to perform continuous testing operations. If there is no active ContinuousTest for the Project, this call will error.

Parameters:

project_id – ID of the Project which contains a ContinuousTest.

Returns:

A ContinuousTest object.

Return type:

ContinuousTest

Raises:

ValueError – This error is generated when the ContinuousTest does not exist or when the request to the Project service fails.

Example

# Get CT in foo-project if it exists.
ct = rime_client.get_ct_for_project("foo-project")
upload_file(file_path: Union[Path, str], upload_path: Optional[str] = None) str

Upload a file to make it accessible to the RIME cluster.

The uploaded file is stored in the RIME cluster in a blob store using its file name.

File uploading is not currently supported in a Cloud deployment. Please use an external data source instead.

Parameters:
  • file_path – Union[Path, str] Path to the file to be uploaded to RIME’s blob store.

  • upload_path – Optional[str] = None Name of the directory in the blob store file system. If omitted, a unique random string will be the directory.

Returns:

A reference to the uploaded file’s location in the blob store. This reference can be used to refer to that object when writing RIME configs. Please store this reference for future access to the file.

Return type:

str

Raises:
  • FileNotFoundError – When the path file_path does not exist.

  • IOError – When file_path is not a file.

  • ValueError – When the specified upload_path is an empty string or there was an error in obtaining a blobstore location from the RIME backend or in uploading file_path to RIME’s blob store. When the file upload fails, the incomplete file is NOT automatically deleted.

Example

# Upload the file at location data_path.
client.upload_file(data_path)
upload_local_image_dataset_file(file_path: Union[Path, str], image_features: List[str], upload_path: Optional[str] = None) Tuple[List[Dict], str]

Upload an image dataset file where image files are stored locally.

The image dataset file is expected to be a list of JSON dictionaries, where each of the image_features keys references an image file stored locally (as either an absolute path or a relative path). Every image referenced in the file is uploaded to the blob store, and the final dataset file is uploaded as well. If your image paths already reference an external blob store, use upload_file instead to upload the dataset file.

File uploading is not currently supported in a Cloud deployment. Please use an external data source instead.

Parameters:
  • file_path – Union[Path, str] Path to the file to be uploaded to RIME’s blob store.

  • image_features – List[str] Keys to image file paths.

  • upload_path – Optional[str] Name of the directory in the blob store file system. If omitted, a unique random string will be the directory.

Returns:

The list of dicts contains the updated dataset file with image paths replaced by S3 paths. The string contains a reference to the uploaded file’s location in the blob store. This reference can be used to refer to that object when writing RIME configs. Please store this reference for future access to the file.

Return type:

Tuple[List[Dict], str]

Raises:
  • FileNotFoundError – When the path file_path does not exist.

  • IOError – When file_path is not a file.

  • ValueError – When there was an error in obtaining a blobstore location from the RIME backend or in uploading file_path to RIME’s blob store. When the file upload fails, the incomplete file is NOT automatically deleted.
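Example

As a sketch of the expected input, the following builds a minimal dataset file whose rows reference local images under an image_path key; the file layout and key name here are illustrative assumptions, not a fixed schema:

```python
import json
import pathlib
import tempfile

# Build a tiny image dataset file in a temp directory. Each row points
# at a local image via the "image_path" key (an illustrative key name);
# that key would be passed in `image_features` so the referenced files
# are uploaded alongside the dataset file.
tmp_dir = pathlib.Path(tempfile.mkdtemp())
rows = [
    {"image_path": "images/cat_001.png", "label": "cat"},
    {"image_path": "images/dog_042.png", "label": "dog"},
]
dataset_file = tmp_dir / "dataset.json"
dataset_file.write_text(json.dumps(rows))

# Hypothetical upload call (requires a configured Client):
# updated_rows, upload_ref = rime_client.upload_local_image_dataset_file(
#     dataset_file, image_features=["image_path"]
# )
```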

upload_data_frame(data_frame: DataFrame, name: Optional[str] = None, upload_path: Optional[str] = None) str

Upload a pandas DataFrame to make it accessible to the RIME cluster.

The uploaded file is stored in the RIME cluster in a blob store using its file name.

File uploading is not currently supported in a Cloud deployment. Please use an external data source instead.

Parameters:
  • data_frame – pd.DataFrame The DataFrame to be uploaded to RIME’s blob store.

  • name – Optional[str] = None Name of the file in the blob store file system. If omitted, a unique random string will be assigned as the file name.

  • upload_path – Optional[str] = None Name of the directory in the blob store file system. If omitted, a unique random string will be the directory.

Returns:

A reference to the uploaded file’s location in the blob store. This reference can be used to refer to that object when writing RIME configs. Please store this reference for future access to the file.

Return type:

str

Raises:

ValueError – When the specified upload_path is an empty string or there was an error in obtaining a blobstore location from the RIME backend or in uploading the DataFrame to RIME’s blob store. When the upload fails, the incomplete file is NOT automatically deleted.

Example

# Upload pandas data frame.
client.upload_data_frame(df)
upload_directory(dir_path: Union[Path, str], upload_hidden: bool = False, upload_path: Optional[str] = None) str

Upload a model directory to make it accessible to the RIME cluster.

The uploaded directory is stored in the RIME cluster in a blob store. All files contained within dir_path and its subdirectories are uploaded according to their relative paths within dir_path. When upload_hidden is set to False, all hidden files and subdirectories that begin with a ‘.’ are not uploaded.

File uploading is not currently supported in a Cloud deployment. Please use an external data source instead.

Parameters:
  • dir_path – Union[Path, str] Path to the directory to be uploaded to RIME’s blob store.

  • upload_hidden – bool = False Whether to upload hidden files or subdirectories (i.e. those beginning with a ‘.’).

  • upload_path – Optional[str] = None Name of the directory in the blob store file system. If omitted, a unique random string will be the directory.

Returns:

A reference to the uploaded directory’s location in the blob store. This reference can be used to refer to that object when writing RIME configs. Please store this reference for future access to the directory.

Return type:

str

Raises:
  • FileNotFoundError – When the directory dir_path does not exist.

  • IOError – When dir_path is not a directory or contains no files.

  • ValueError – When the specified upload_path is an empty string or there was an error in obtaining a blobstore location from the RIME backend or in uploading dir_path to RIME’s blob store. When the directory upload fails, the uploaded files are NOT automatically deleted.
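Example

For illustration, here is a sketch that assembles a small model directory like the one upload_directory expects; the file names are assumptions for the example, and the upload call itself is shown commented out since it requires a live Client:

```python
import pathlib
import tempfile

# Assemble a small model directory with a nested subdirectory and a
# hidden entry. Hidden files and subdirectories (names starting with
# ".") are skipped by upload_directory unless upload_hidden=True.
model_dir = pathlib.Path(tempfile.mkdtemp()) / "my_model"
(model_dir / "weights").mkdir(parents=True)
(model_dir / "model.py").write_text("def predict(x):\n    return x\n")
(model_dir / "weights" / "model.bin").write_bytes(b"\x00" * 16)
(model_dir / ".cache").mkdir()  # not uploaded by default

# Hypothetical upload call (requires a configured Client):
# upload_ref = rime_client.upload_directory(model_dir)
```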

list_uploaded_file_urls() Iterator[str]

Return an iterator of file paths that have been uploaded using client.upload_file.

Returns:

An iterator of file path strings.

Return type:

Iterator[str]

Example

# List all file URLs
urls = rime_client.list_uploaded_file_urls()
delete_uploaded_file_url(upload_url: str) None

Delete the file at the specified upload url in the RIME blob store.

Parameters:

upload_url – str URL of the file to be deleted in the RIME blob store.

Returns:

None

Example

# Delete a file URL returned by list_uploaded_file_urls
urls = rime_client.list_uploaded_file_urls()
first_url = next(urls)
rime_client.delete_uploaded_file_url(first_url)
get_job(job_id: str) BaseJob

Get job by ID.

Parameters:

job_id – ID of the Job to return.

Returns:

A BaseJob object.

Return type:

BaseJob

Raises:

ValueError – This error is generated when no Job with the specified ID exists.

Example

# Get Job with ID if it exists.
job = rime_client.get_job("123-456-789")
start_file_scan(model_id: str, project_id: str, custom_image: Optional[RuntimeinfoCustomImage] = None, rime_managed_image: Optional[str] = None, ram_request_megabytes: Optional[int] = None, cpu_request_millicores: Optional[int] = None, agent_id: Optional[str] = None) FileScanJob

Start a File Scan job.

Parameters:
  • model_id – str The model ID of the model to be scanned. Only registered models can be scanned.

  • project_id – str The project to which the file scan result will be saved. Must be the project whose registry contains the model to be scanned.

  • custom_image – Optional[RuntimeinfoCustomImage] Specification of a customized container image to use when running the model test. The image must have all dependencies required by your model. The image must specify a name and, optionally, a pull secret (of type RuntimeinfoCustomImagePullSecret) with the name of the Kubernetes pull secret used to access the given image.

  • rime_managed_image – Optional[str] Name of a Managed Image to use when running the model test. The image must have all dependencies required by your model. To create new Managed Images with your desired dependencies, use the client’s create_managed_image() method.

  • ram_request_megabytes – Optional[int] Megabytes of RAM requested for the File Scan Job. The limit is equal to the number of megabytes requested.

  • cpu_request_millicores – Optional[int] Millicores of CPU requested for the File Scan Job. The limit is equal to the number of millicores requested.

  • agent_id – Optional[str] ID of the Agent that runs the File Scan job. When unspecified, the workspace’s default Agent is used.

Returns:

An ML File Scan Job object.

Return type:

FileScanJob

Raises:

ValueError – This error is generated when the request to the FileScanning service fails.

Example

This example shows how to scan a Hugging Face model file.

job = rime_client.start_file_scan(
    model_id="123-456-789", project_id="987-654-321"
)
get_file_scan_result(file_scan_id: str) dict

Get a file scan result with the specified file_scan_id.

Parameters:

file_scan_id – str ID of the file scan result to query for.

Returns:

A dictionary representation of the file scan result.

Return type:

Dict

list_file_scan_results(project_id: str, model_id: Optional[str] = '') Iterator[dict]

List all file scan results within a project.

Optionally filters for all scan results of a specific model.

Parameters:
  • project_id – str The project ID of the project whose file scan results are to be returned.

  • model_id – Optional[str] The model ID of file scan results to be returned.

File scan results contain the security reports for the scanned files or repositories.

Returns:

An iterator of dictionaries, where each dictionary represents a single ML File Scan result.

Return type:

Iterator[dict]

Raises:

ValueError – This error is generated when the request to the FileScanning service fails.

Example

# List all ML file scan results.
results = rime_client.list_file_scan_results(project_id="123-456-789")
delete_file_scan_result(file_scan_id: str) None

Delete a file scan result with the specified file_scan_id.

Parameters:

file_scan_id – str ID of the file scan result to delete.

Returns:

None

create_integration(workspace_id: str, name: str, integration_type: str, integration_schema: List[Dict]) str

Create an integration and return its UUID.

Parameters:
  • workspace_id – str ID of the workspace for which to create the integration.

  • name – str Name that will be given to the integration.

  • integration_type – str The type of integration. Must be one of “INTEGRATION_TYPE_CUSTOM”, “INTEGRATION_TYPE_DATABRICKS”, “INTEGRATION_TYPE_AWS_ACCESS_KEY”, “INTEGRATION_TYPE_AWS_ROLE_ARN”, “INTEGRATION_TYPE_HUGGINGFACE”, “INTEGRATION_TYPE_GCS”, “INTEGRATION_TYPE_AZURE_CLIENT_SECRET”, “INTEGRATION_TYPE_AZURE_WORKLOAD_IDENTITY”.

  • integration_schema

    List[Dict] List of Python dicts, where each dict represents a variable and has the following keys:

    “name”: str (required)
    “sensitivity”: str (required). Must be one of “VARIABLE_SENSITIVITY_PUBLIC”, “VARIABLE_SENSITIVITY_WORKSPACE_SECRET”, “VARIABLE_SENSITIVITY_USER_SECRET”.
    “value”: str (optional)

Returns:

The integration id of the newly created integration.

Return type:

str

Raises:

ValueError – This error is generated when the user provides an invalid integration_type or integration_schema is missing required information.
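Example

As a concrete sketch of the integration_schema shape, the following builds a schema for a hypothetical AWS access key integration; the variable names, placeholder values, and workspace ID are illustrative assumptions:

```python
# Each entry is one variable: "name" and "sensitivity" are required,
# "value" is optional. Sensitivity controls how the value is stored.
integration_schema = [
    {
        "name": "AWS_ACCESS_KEY_ID",
        "sensitivity": "VARIABLE_SENSITIVITY_WORKSPACE_SECRET",
        "value": "<access-key-id>",
    },
    {
        "name": "AWS_SECRET_ACCESS_KEY",
        "sensitivity": "VARIABLE_SENSITIVITY_WORKSPACE_SECRET",
        "value": "<secret-access-key>",
    },
]

# Hypothetical call (requires a configured Client):
# integration_id = rime_client.create_integration(
#     workspace_id="workspace-123",
#     name="my-aws-integration",
#     integration_type="INTEGRATION_TYPE_AWS_ACCESS_KEY",
#     integration_schema=integration_schema,
# )
```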

get_model_security_report(repo_id: str) dict

Get the supply chain risk report for a Hugging Face model.

Parameters:

repo_id – str ID of the Hugging Face model. Currently only Hugging Face repositories are supported.

Returns:

A dictionary representation of the model scan result.

Return type:

Dict

class rime_sdk.Project(api_client: ApiClient, project_id: str)

An interface to a Project object.

The Project object is used for editing, updating, and deleting Projects.

property info: ProjectInfo

Return the description, use case, and ethical considerations of the Project.

property link: str

Return the web app URL of the Project.

This link directs to your organization’s deployment of RIME. You can view more detailed information in the web app, including information on your Test Runs, comparisons of those results, and monitored models.

property name: str

Return the name of this Project.

property description: str

Return the description of this Project.

list_stress_testing_jobs(status_filters: Optional[List[str]] = None) Iterator[Job]

Get list of Stress Testing Jobs for the Project filtered by status.

Note that this only returns jobs from the last two weeks, because the time-to-live of job objects in the cluster is set at two weeks.

Parameters:

status_filters – Optional[List[str]] = None Filter for selecting jobs by a union of statuses. Acceptable values are ‘pending’, ‘running’, ‘failed’, and ‘succeeded’. If omitted, jobs are not filtered by status.

Returns:

An iterator of Job objects. These are not guaranteed to be in any sorted order.

Return type:

Iterator[Job]

Raises:

ValueError – This error is generated when the request to the JobReader service fails or when the provided status_filters array has invalid values.

Example

# Get all running and succeeded jobs for a Project.
jobs = project.list_stress_testing_jobs(
    status_filters=['pending', 'succeeded'],
)
# To get the names of all jobs.
[job["name"] for job in jobs]
list_scheduled_ct_jobs() Iterator[ContinuousTestJob]

Get list of Scheduled CT Jobs for the Project.

Note that this only returns jobs from the last two weeks, because the time-to-live of job objects in the cluster is set at two weeks.

Returns:

An iterator of ContinuousTestJob objects. These are not guaranteed to be in any sorted order.

Return type:

Iterator[ContinuousTestJob]

Raises:

ValueError – This error is generated when the request to the JobReader service fails.

Example

# Get all Scheduled CT jobs for a Project.
ct_jobs = project.list_scheduled_ct_jobs()

# To get the names of all jobs.
[ct_job["name"] for ct_job in ct_jobs]
list_test_runs(test_type: Optional[RimeTestType] = None) Iterator[TestRun]

List the Test Runs associated with this Project.

Parameters:

test_type – Optional[RimeTestType] = None. Filter for selecting test runs by RimeTestType. If omitted, stress test runs are returned, excluding continuous test runs.

Returns:

An iterator of TestRun objects.

Return type:

Iterator[TestRun]

Raises:

ValueError – This error is generated when the request to the ResultsReader Service fails.

Example

# List all stress test runs in the Project.
test_runs = project.list_test_runs()
# Get the IDs of the test runs.
[test_run.test_run_id for test_run in test_runs]


# List all continuous test runs in the Project.
test_runs = project.list_test_runs(RimeTestType.CONTINUOUS_TESTING)
create_ct(model_id: str, ref_data_id: str, bin_size: timedelta, scheduled_ct_eval_data_integration_id: Optional[str] = None, scheduled_ct_eval_data_info: Optional[dict] = None, scheduled_ct_eval_pred_integration_id: Optional[str] = None, scheduled_ct_eval_pred_info: Optional[dict] = None) ContinuousTest

Create a ContinuousTest in the current Project.

Parameters:
  • model_id – str The model ID that this ContinuousTest is testing. Model IDs are created by the Registry.

  • ref_data_id – str The ID of the reference dataset that this ContinuousTest compares against during testing. Dataset IDs are created by the Registry.

  • bin_size – timedelta The length of each time bin to test over as a timedelta object. Must have a minimum value of 1 hour.

  • scheduled_ct_eval_data_integration_id – Optional[str] The ID of the integration used to fetch evaluation data.

  • scheduled_ct_eval_data_info – Optional[dict] The data info needed to fetch evaluation data.

  • scheduled_ct_eval_pred_integration_id – Optional[str] The ID of the integration used to fetch evaluation predictions.

  • scheduled_ct_eval_pred_info – Optional[dict] The prediction info needed to fetch evaluation predictions.

Returns:

A ContinuousTest object that is used to monitor the model.

Return type:

ContinuousTest

Raises:

ValueError – This error is generated when the request to the ContinuousTest Service fails.

Example

from datetime import timedelta
# Create ContinuousTest using previously registered model and dataset IDs.
ct = project.create_ct(model_id, ref_data_id, timedelta(days=2))
get_ct() ContinuousTest

Return the active ContinuousTest for a Project if one exists.

Query the backend for an active ContinuousTest in this Project which can be used to perform ContinuousTest operations. If there is no active ContinuousTest for the Project, this call raises an error.

Returns:

A ContinuousTest object.

Return type:

ContinuousTest

Raises:

ValueError – This error is generated when the ContinuousTest does not exist or when the request to the ContinuousTest Service fails.

Example

# Get ContinuousTest for this Project.
ct = project.get_ct()
has_ct() bool

Check whether a Project has a ContinuousTest.

delete_ct(force: Optional[bool] = False) None

Delete the ContinuousTest for this Project if one exists.

Parameters:

force – Optional[bool] = False When set to True, the ContinuousTest will be deleted immediately. By default, a confirmation is required.

get_notification_settings() Dict

List all Notification settings for the Project.

Queries the backend for the list of Notification settings added to the Project. The Notifications are grouped by type: each type contains a list of the emails and webhooks added to that Notification setting.

Returns:

A Dictionary of Notification type and corresponding emails and webhooks added for that Notification type.

Return type:

Dict

Example

notification_settings = project.get_notification_settings()
add_email(email: str, notif_type_str: str) None

Add an email to the Notification settings for the specified Notification type.

The valid Notification types are: [“Job_Action”, “Monitoring”, “Daily_Digest”]

Example

project.add_email("<email>", "Monitoring")
remove_email(email: str, notif_type_str: str) None

Remove an email from the Notification settings for the specified Notification type.

The valid Notification types are: [“Job_Action”, “Monitoring”, “Daily_Digest”]

Example

project.remove_email("<email>", "Monitoring")
add_webhook(webhook: str, notif_type_str: str) None

Add a webhook to the Notification settings for the specified Notification type.

The valid Notification types are: [“Job_Action”, “Monitoring”, “Daily_Digest”]

Example

notification_settings = project.add_webhook("<webhook>", "Monitoring")
remove_webhook(webhook: str, notif_type_str: str) None

Remove a webhook from the Notification settings for the specified Notification type.

The valid Notification types are: [“Job_Action”, “Monitoring”, “Daily_Digest”]

Example

notification_settings = project.remove_webhook("<webhook>", "Monitoring")
register_dataset(name: str, data_config: dict, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, ct_info: Optional[dict] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) str

Register and validate a new dataset in this Project.

Parameters:
  • name – str The chosen name of the dataset.

  • data_config – dict A dictionary that contains the data configuration. The data configuration must match the API specification of the data_info field in the RegisterDataset request.

  • integration_id – Optional[str] = None, Provide the integration ID for datasets that require an integration.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the dataset.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the dataset.

  • ct_info – Optional[dict] = None, An optional dictionary that contains the CT info. The CT info must match the API specification of the ct_info field in the RegisterDataset request.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The ID of the newly registered dataset.

Return type:

str

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • DatasetValidationError – This error is generated when the dataset is invalid.

Example

dataset_id = project.register_dataset(
    name=DATASET_NAME,
    data_config={
        "connection_info": {"data_file": {"path": FILE_PATH}},
        "data_params": {"label_col": LABEL_COL},
    },
    integration_id=INTEGRATION_ID,
    agent_id=AGENT_ID,
)
register_and_validate_dataset(name: str, data_config: dict, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, ct_info: Optional[dict] = None, agent_id: Optional[str] = None) Tuple[str, Optional[Job]]

Register and validate a new dataset in this Project.

Parameters:
  • name – str The chosen name of the dataset.

  • data_config – dict A dictionary that contains the data configuration. The data configuration must match the API specification of the data_info field in the RegisterDataset request.

  • integration_id – Optional[str] = None, Provide the integration ID for datasets that require an integration.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the dataset.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the dataset.

  • ct_info – Optional[dict] = None, An optional dictionary that contains the CT info. The CT info must match the API specification of the ct_info field in the RegisterDataset request.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The returned Tuple contains the ID of the newly registered dataset and the Job object that represents the validation job.

Return type:

Tuple[str, Optional[Job]]

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • DatasetValidationError – This error is generated when the dataset is invalid.

Example

dataset_id, job = project.register_and_validate_dataset(
    name=DATASET_NAME,
    data_config={
        "connection_info": {"data_file": {"path": FILE_PATH}},
        "data_params": {"label_col": LABEL_COL},
    },
    integration_id=INTEGRATION_ID,
    agent_id=AGENT_ID,
)
register_dataset_from_file(name: str, remote_path: str, data_params: Optional[dict] = None, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, ct_info: Optional[dict] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) str

Register and validate a new dataset in this Project.

Parameters:
  • name – str The chosen name of the dataset.

  • remote_path – str The path to the dataset artifact.

  • data_params – dict A dictionary that contains the data parameters. The data parameters must match the API specification of the data_info.data_params field in the RegisterDataset request.

  • integration_id – Optional[str] = None, Provide the integration ID for datasets that require an integration.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the dataset.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the dataset.

  • ct_info – Optional[dict] = None, An optional dictionary that contains the CT info. The CT info must match the API specification of the ct_info field in the RegisterDataset request.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The ID of the newly registered dataset.

Return type:

str

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • DatasetValidationError – This error is generated when the dataset is invalid.
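Example

A minimal sketch using placeholder values; DATASET_NAME, DATASET_PATH, LABEL_COL, and AGENT_ID are assumed stand-ins, following the conventions of the other examples in this reference:

```python
dataset_id = project.register_dataset_from_file(
    name=DATASET_NAME,
    remote_path=DATASET_PATH,
    data_params={"label_col": LABEL_COL},
    agent_id=AGENT_ID,
)
```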

register_and_validate_dataset_from_file(name: str, remote_path: str, data_params: Optional[dict] = None, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, ct_info: Optional[dict] = None, agent_id: Optional[str] = None) Tuple[str, Optional[Job]]

Register and validate a new dataset in this Project.

Parameters:
  • name – str The chosen name of the dataset.

  • remote_path – str The path to the dataset artifact.

  • data_params – dict A dictionary that contains the data parameters. The data parameters must match the API specification of the data_info.data_params field in the RegisterDataset request.

  • integration_id – Optional[str] = None, Provide the integration ID for datasets that require an integration.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the dataset.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the dataset.

  • ct_info – Optional[dict] = None, An optional dictionary that contains the CT info. The CT info must match the API specification of the ct_info field in the RegisterDataset request.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The returned Tuple contains the ID of the newly registered dataset and the Job object that represents the validation job.

Return type:

Tuple[str, Optional[Job]]

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • DatasetValidationError – This error is generated when the dataset is invalid.
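Example

A minimal sketch using placeholder values (DATASET_NAME, DATASET_PATH, and LABEL_COL are assumed stand-ins); note the Tuple return, which includes the validation Job:

```python
dataset_id, job = project.register_and_validate_dataset_from_file(
    name=DATASET_NAME,
    remote_path=DATASET_PATH,
    data_params={"label_col": LABEL_COL},
)
```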

upload_and_register_dataset_from_file(name: str, file_path: Union[Path, str], upload_path: Optional[str] = None, data_params: Optional[dict] = None, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, ct_info: Optional[dict] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) str

Upload, register and validate a new dataset in this Project.

The uploaded file is stored in the Robust Intelligence cluster in a blob store using its file name.

Parameters:
  • name – str The chosen name of the dataset.

  • file_path – Union[Path, str] The local path to the dataset artifact, to be uploaded to Robust Intelligence’s blob store.

  • upload_path – Optional[str] = None, Name of the directory in the blob store file system. If omitted, a unique random string is used as the directory name.

  • data_params – Optional[dict] = None, A dictionary that contains the data parameters. The data parameters must match the API specification of the data_info.data_params field in the RegisterDataset request.

  • integration_id – Optional[str] = None, Provide the integration ID for datasets that require an integration.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the dataset.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the dataset.

  • ct_info – Optional[dict] = None, An optional dictionary that contains the CT info. The CT info must match the API specification of the ct_info field in the RegisterDataset request.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The ID of the newly registered dataset.

Return type:

str

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • DatasetValidationError – This error is generated when the dataset is invalid.
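Example

A minimal sketch using placeholder values; DATASET_NAME, LOCAL_FILE_PATH, and LABEL_COL are assumed stand-ins for your own names and paths:

```python
dataset_id = project.upload_and_register_dataset_from_file(
    name=DATASET_NAME,
    file_path=LOCAL_FILE_PATH,
    data_params={"label_col": LABEL_COL},
)
```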

upload_register_and_validate_dataset_from_file(name: str, file_path: Union[Path, str], upload_path: Optional[str] = None, data_params: Optional[dict] = None, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, ct_info: Optional[dict] = None, agent_id: Optional[str] = None) Tuple[str, Optional[Job]]

Upload, register and validate a new dataset in this Project.

The uploaded file is stored in the Robust Intelligence cluster in a blob store using its file name.

Parameters:
  • name – str The chosen name of the dataset.

  • file_path – Union[Path, str] The local path to the dataset artifact, to be uploaded to Robust Intelligence’s blob store.

  • upload_path – Optional[str] = None, Name of the directory in the blob store file system. If omitted, a unique random string is used as the directory name.

  • data_params – Optional[dict] = None, A dictionary that contains the data parameters. The data parameters must match the API specification of the data_info.data_params field in the RegisterDataset request.

  • integration_id – Optional[str] = None, Provide the integration ID for datasets that require an integration.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the dataset.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the dataset.

  • ct_info – Optional[dict] = None, An optional dictionary that contains the CT info. The CT info must match the API specification of the ct_info field in the RegisterDataset request.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The returned Tuple contains the ID of the newly registered dataset and the Job object that represents the validation job.

Return type:

Tuple[str, Optional[Job]]

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • DatasetValidationError – This error is generated when the dataset is invalid.
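Example

A minimal sketch using placeholder values (DATASET_NAME and LOCAL_FILE_PATH are assumed stand-ins); note the Tuple return, which includes the validation Job:

```python
dataset_id, job = project.upload_register_and_validate_dataset_from_file(
    name=DATASET_NAME,
    file_path=LOCAL_FILE_PATH,
)
```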

register_model(name: str, model_config: Optional[dict] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, external_id: Optional[str] = None, integration_id: Optional[str] = None, model_endpoint_integration_id: Optional[str] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) str

Register and validate a new model in this Project.

Parameters:
  • name – str The chosen name of the model.

  • model_config – Optional[dict] = None, A dictionary that contains the model configuration. Any model configuration that is provided must match the API specification for the model_info field of the RegisterModel request.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the model.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the model.

  • external_id – Optional[str] = None, An optional external ID that can be used to identify the model.

  • integration_id – Optional[str] = None, Provide the integration ID for models that require an integration for accessing the model.

  • model_endpoint_integration_id – Optional[str] = None, Provide the integration ID for models that require an integration when running the model.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The ID of the newly registered model.

Return type:

str

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • ModelValidationError – This error is generated when the model is invalid.

Example

model_id = project.register_model(
    name=MODEL_NAME,
    model_config={
        "hugging_face": {
            "model_uri": URI,
            "kwargs": {
                "tokenizer_uri": TOKENIZER_URI,
                "class_map": MAP,
                "ignore_class_names": True,
            },
        }
    },
    tags=[MODEL_TAG],
    metadata={KEY: VALUE},
    external_id=EXTERNAL_ID,
    agent_id=AGENT_ID,
)
register_and_validate_model(name: str, model_config: Optional[dict] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, external_id: Optional[str] = None, integration_id: Optional[str] = None, model_endpoint_integration_id: Optional[str] = None, agent_id: Optional[str] = None) Tuple[str, Optional[Job]]

Register and validate a new model in this Project.

Parameters:
  • name – str The chosen name of the model.

  • model_config – Optional[dict] = None, A dictionary that contains the model configuration. Any model configuration that is provided must match the API specification for the model_info field of the RegisterModel request.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the model.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the model.

  • external_id – Optional[str] = None, An optional external ID that can be used to identify the model.

  • integration_id – Optional[str] = None, Provide the integration ID for models that require an integration for accessing the model.

  • model_endpoint_integration_id – Optional[str] = None, Provide the integration ID for models that require an integration when running the model.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The returned Tuple contains the ID of the newly registered model and the Job object that represents the validation job.

Return type:

Tuple[str, Optional[Job]]

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • ModelValidationError – This error is generated when the model is invalid.

Example

model_id, job = project.register_and_validate_model(
    name=MODEL_NAME,
    model_config={
        "hugging_face": {
            "model_uri": URI,
            "kwargs": {
                "tokenizer_uri": TOKENIZER_URI,
                "class_map": MAP,
                "ignore_class_names": True,
            },
        }
    },
    tags=[MODEL_TAG],
    metadata={KEY: VALUE},
    external_id=EXTERNAL_ID,
    agent_id=AGENT_ID,
)
register_model_from_path(name: str, remote_path: str, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, external_id: Optional[str] = None, integration_id: Optional[str] = None, model_endpoint_integration_id: Optional[str] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) str

Register and validate a new model in this Project.

Parameters:
  • name – str The chosen name of the model.

  • remote_path – str The path to the model artifact.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the model.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the model.

  • external_id – Optional[str] = None, An optional external ID that can be used to identify the model.

  • integration_id – Optional[str] = None, Provide the integration ID for models that require an integration for access.

  • model_endpoint_integration_id – Optional[str] = None, Provide the integration ID for models that require an integration when running the model.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The ID of the newly registered model.

Return type:

str

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • ModelValidationError – This error is generated when the model is invalid.

Example

model_id = project.register_model_from_path(
    name=MODEL_NAME,
    remote_path=MODEL_PATH,
)
register_and_validate_model_from_path(name: str, remote_path: str, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, external_id: Optional[str] = None, integration_id: Optional[str] = None, model_endpoint_integration_id: Optional[str] = None, agent_id: Optional[str] = None) Tuple[str, Optional[Job]]

Register and validate a new model in this Project.

Parameters:
  • name – str The chosen name of the model.

  • remote_path – str The path to the model artifact.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the model.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the model.

  • external_id – Optional[str] = None, An optional external ID that can be used to identify the model.

  • integration_id – Optional[str] = None, Provide the integration ID for models that require an integration for access.

  • model_endpoint_integration_id – Optional[str] = None, Provide the integration ID for models that require an integration when running the model.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The returned Tuple contains the ID of the newly registered model and the Job object that represents the validation job.

Return type:

Tuple[str, Optional[Job]]

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • ModelValidationError – This error is generated when the model is invalid.

Example

model_id, job = project.register_and_validate_model_from_path(
    name=MODEL_NAME,
    remote_path=MODEL_PATH,
)
upload_and_register_model_from_path(name: str, file_path: Union[Path, str], upload_model_dir: bool = False, upload_hidden: bool = False, upload_path: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, external_id: Optional[str] = None, integration_id: Optional[str] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) str

Upload, register and validate a new model in this Project.

The uploaded file is stored in the Robust Intelligence cluster in a blob store using its file name.

Parameters:
  • name – str The chosen name of the model.

  • file_path – Union[Path, str] The local path to the model artifact, to be uploaded to Robust Intelligence’s blob store.

  • upload_model_dir – bool = False Whether to upload the directory containing the model artifact, in addition to the model artifact itself. Note that if set to True, this method will upload everything in the model’s containing directory, so only place files required to run the model in the directory.

  • upload_hidden – bool = False Whether to upload hidden files and subdirectories (i.e., those beginning with a ‘.’) in dir_path.

  • upload_path – Optional[str] = None, Name of the directory in the blob store file system. If omitted, a unique random string is used as the directory name.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the model.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the model.

  • external_id – Optional[str] = None, An optional external ID that can be used to identify the model.

  • integration_id – Optional[str] = None, Provide the integration ID for models that require an integration for access.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The ID of the newly registered model.

Return type:

str

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • ModelValidationError – This error is generated when the model is invalid.

Example

model_id = project.upload_and_register_model_from_path(
    name=MODEL_NAME,
    file_path=MODEL_PATH,
)
upload_register_and_validate_model_from_path(name: str, file_path: Union[Path, str], upload_model_dir: bool = False, upload_hidden: bool = False, upload_path: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, external_id: Optional[str] = None, integration_id: Optional[str] = None, agent_id: Optional[str] = None) Tuple[str, Optional[Job]]

Upload, register and validate a new model in this Project.

The uploaded file is stored in the Robust Intelligence cluster in a blob store using its file name.

Parameters:
  • name – str The chosen name of the model.

  • file_path – Union[Path, str] The local path to the model artifact, to be uploaded to Robust Intelligence’s blob store.

  • upload_model_dir – bool = False Whether to upload the directory containing the model artifact, in addition to the model artifact itself. Note that if set to True, this method will upload everything in the model’s containing directory, so only place files required to run the model in the directory.

  • upload_hidden – bool = False Whether to upload hidden files and subdirectories (i.e., those beginning with a ‘.’) in dir_path.

  • upload_path – Optional[str] = None, Name of the directory in the blob store file system. If omitted, a unique random string is used as the directory name.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the model.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the model.

  • external_id – Optional[str] = None, An optional external ID that can be used to identify the model.

  • integration_id – Optional[str] = None, Provide the integration ID for models that require an integration for access.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The returned Tuple contains the ID of the newly registered model and the Job object that represents the validation job.

Return type:

Tuple[str, Optional[Job]]

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • ModelValidationError – This error is generated when the model is invalid.

Example

model_id, job = project.upload_register_and_validate_model_from_path(
    name=MODEL_NAME,
    file_path=MODEL_PATH,
)
register_predictions(dataset_id: str, model_id: str, pred_config: dict, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) None

Register and validate a new set of predictions corresponding to a model and a dataset.

Parameters:
  • dataset_id – str, The ID of the dataset used to generate the predictions.

  • model_id – str, The ID of the model used to generate the predictions.

  • pred_config – dict, A dictionary that contains the prediction configuration. The prediction configuration must match the API specification for the pred_info field of the RegisterPredictions request.

  • integration_id – Optional[str] = None, Provide the integration ID for predictions that require an integration to use.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the predictions.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the predictions.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

None

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • PredictionsValidationError – This error is generated when the predictions are invalid.

Example

project.register_predictions(
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    pred_config={
        "connection_info": {
            "databricks": {
                # Unix timestamp equivalent to 02/08/2023
                "start_time": 1675922943,
                # Unix timestamp equivalent to 03/08/2023
                "end_time": 1678342145,
                "table_name": TABLE_NAME,
                "time_col": TIME_COL,
            },
        },
        "pred_params": {"pred_col": PREDS},
    },
    tags=[TAG],
    metadata={KEY: VALUE},
    agent_id=AGENT_ID,
)
register_and_validate_predictions(dataset_id: str, model_id: str, pred_config: dict, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, agent_id: Optional[str] = None) Optional[Job]

Register and validate a new set of predictions corresponding to a model and a dataset.

Parameters:
  • dataset_id – str, The ID of the dataset used to generate the predictions.

  • model_id – str, The ID of the model used to generate the predictions.

  • pred_config – dict, A dictionary that contains the prediction configuration. The prediction configuration must match the API specification for the pred_info field of the RegisterPredictions request.

  • integration_id – Optional[str] = None, Provide the integration ID for predictions that require an integration to use.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the predictions.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the predictions.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The Job object that represents the validation job.

Return type:

Optional[Job]

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • PredictionsValidationError – This error is generated when the predictions are invalid.

Example

job = project.register_and_validate_predictions(
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    pred_config={
        "connection_info": {
            "databricks": {
                # Unix timestamp equivalent to 02/08/2023
                "start_time": 1675922943,
                # Unix timestamp equivalent to 03/08/2023
                "end_time": 1678342145,
                "table_name": TABLE_NAME,
                "time_col": TIME_COL,
            },
        },
        "pred_params": {"pred_col": PREDS},
    },
    tags=[TAG],
    metadata={KEY: VALUE},
    agent_id=AGENT_ID,
)
register_predictions_from_file(dataset_id: str, model_id: str, remote_path: str, pred_params: Optional[dict] = None, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) None

Register and validate a new set of predictions for a model on a dataset.

Parameters:
  • dataset_id – str, The ID of the dataset used to generate the predictions.

  • model_id – str, The ID of the model used to generate the predictions.

  • remote_path – str, The path to the prediction artifact.

  • pred_params – Optional[dict] = None, A dictionary that contains the prediction parameters.

  • integration_id – Optional[str] = None, Provide the integration ID for predictions that require an integration to use.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the predictions.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the predictions.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

None

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • PredictionsValidationError – This error is generated when the predictions are invalid.

Example

project.register_predictions_from_file(
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    remote_path=PREDICTIONS_PATH,
    agent_id=AGENT_ID,
)
register_and_validate_predictions_from_file(dataset_id: str, model_id: str, remote_path: str, pred_params: Optional[dict] = None, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, agent_id: Optional[str] = None) Optional[Job]

Register and validate a new set of predictions for a model on a dataset.

Parameters:
  • dataset_id – str, The ID of the dataset used to generate the predictions.

  • model_id – str, The ID of the model used to generate the predictions.

  • remote_path – str, The path to the prediction artifact.

  • pred_params – Optional[dict] = None, A dictionary that contains the prediction parameters.

  • integration_id – Optional[str] = None, Provide the integration ID for predictions that require an integration to use.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the predictions.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the predictions.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The Job object that represents the validation job.

Return type:

Optional[Job]

Raises:
  • ValueError – This error is generated when the request to the Registry service fails.

  • PredictionsValidationError – This error is generated when the predictions are invalid.

Example

job = project.register_and_validate_predictions_from_file(
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    remote_path=PREDICTIONS_PATH,
    agent_id=AGENT_ID,
)
list_datasets() Iterator[Dict]

Return a list of datasets registered in this Project.

Returns:

Iterator of dictionaries: each dictionary represents a dataset.

Return type:

Iterator[Dict]

Raises:

ValueError – This error is generated when the request to the Registry service fails.
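Example

A sketch that iterates over the registered datasets; the exact keys in each returned dictionary depend on the backend response:

```python
# Print each registered dataset's metadata dictionary.
for dataset in project.list_datasets():
    print(dataset)
```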

list_models() Iterator[Dict]

Return a list of models registered in this Project.

Returns:

Iterator of dictionaries: each dictionary represents a model.

Return type:

Iterator[Dict]

Raises:

ValueError – This error is generated when the request to the Registry service fails.
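Example

A sketch that iterates over the registered models; the exact keys in each returned dictionary depend on the backend response:

```python
# Print each registered model's metadata dictionary.
for model in project.list_models():
    print(model)
```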

get_data_collector() DataCollector

Get Data Collector for Project.

When the Project has no existing Data Collector, this method creates and returns a new Data Collector.

Returns:

Data Collector object for Project

Return type:

DataCollector

Raises:

ValueError – This error is generated when the request to the Data Collector service fails.
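Example

A minimal sketch; the call creates a Data Collector for the Project if none exists:

```python
data_collector = project.get_data_collector()
```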

get_general_access_roles() Dict[RimeActorRole, str]

Get the roles of workspace members for the project.

Returns:

A map of workspace Actor Roles to their corresponding roles for the Project.

Return type:

Dict[RimeActorRole, str]

list_predictions(model_id: Optional[str] = None, dataset_id: Optional[str] = None) Iterator[Dict]

Return a list of predictions.

Parameters:
  • model_id – Optional[str] = None The ID of the model to which the prediction sets belong.

  • dataset_id – Optional[str] = None The ID of the dataset to which the prediction sets belong.

Returns:

Iterator of dictionaries: each dictionary represents a prediction.

Return type:

Iterator[Dict]

Raises:

ValueError – This error is generated when the request to the Registry service fails.

get_dataset(dataset_id: Optional[str] = None, dataset_name: Optional[str] = None) Dict

Return a registered dataset.

Parameters:
  • dataset_id – Optional[str] = None, The ID of the dataset to retrieve.

  • dataset_name – Optional[str] = None, The name of the dataset to retrieve.

Returns:

A dictionary representing the dataset.

Return type:

Dict

Raises:

ValueError – This error is generated when the request to the Registry service fails.

has_dataset(dataset_id: Optional[str] = None, dataset_name: Optional[str] = None) bool

Return a boolean on whether the dataset is present.

Parameters:
  • dataset_id – Optional[str] = None The ID of the dataset to check for.

  • dataset_name – Optional[str] = None The name of the dataset to check for.

Returns:

A boolean on whether the dataset is present.

Return type:

bool

Raises:

ValueError – This error is generated when any error other than HTTPStatus.NOT_FOUND is returned from the Registry service.
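A common pattern is to combine has_dataset with get_dataset so that a missing entry yields None instead of an error. The helper below is a sketch, not part of the SDK:

```python
# Fetch a dataset by name only when it is registered; None otherwise.
def get_dataset_if_present(project, dataset_name):
    if project.has_dataset(dataset_name=dataset_name):
        return project.get_dataset(dataset_name=dataset_name)
    return None
```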

get_model(model_id: Optional[str] = None, model_name: Optional[str] = None) Dict

Return a registered model.

Parameters:
  • model_id – Optional[str] = None, The ID of the model to retrieve.

  • model_name – Optional[str] = None, The name of the model to retrieve.

Returns:

A dictionary representing the model.

Return type:

Dict

Raises:

ValueError – This error is generated when the request to the Registry service fails.

get_predictions(model_id: str, dataset_id: str) Dict

Get predictions for a model and dataset.

Parameters:
  • model_id – str, The ID of the model used to generate the predictions.

  • dataset_id – str, The ID of the dataset used to generate the predictions.

Returns:

A dictionary representing the predictions.

Return type:

Dict

Raises:

ValueError – This error is generated when the request to the Registry service fails.

delete_dataset(dataset_id: str) None

Delete a dataset.

Parameters:

dataset_id – str, The ID of the dataset to delete.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Registry service fails.

delete_model(model_id: str) None

Delete a model.

Parameters:

model_id – str, The ID of the model to delete.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Registry service fails.

delete_predictions(model_id: str, dataset_id: str) None

Delete predictions for a model and dataset.

Parameters:
  • model_id – str, The ID of the model used to generate the predictions.

  • dataset_id – str, The ID of the dataset used to generate the predictions.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Registry service fails.
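The three delete methods can be combined to remove a model, a dataset, and their predictions in one pass. A sketch only; deleting the predictions first is a defensive ordering choice, not an SDK requirement:

```python
# Delete the predictions before the model and dataset they reference.
def purge_model_artifacts(project, model_id, dataset_id):
    project.delete_predictions(model_id, dataset_id)
    project.delete_model(model_id)
    project.delete_dataset(dataset_id)
```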

update_stress_test_categories(categories: List[str]) None

Update the project’s stress test categories.

Parameters:

categories – List[str] The list of stress test categories to update.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Project service fails.

update_ct_categories(categories: List[str]) None

Update the project’s continuous test categories.

Parameters:

categories – List[str] The list of continuous test categories to update.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Project service fails.

update_model_profiling_config(model_profiling_config: dict) None

Update the project’s model profiling configuration.

Parameters:

model_profiling_config – dict Model profiling configuration with which to update the project.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Project service fails.

update_data_profiling_config(data_profiling_config: dict) None

Update the project’s data profiling configuration.

Parameters:

data_profiling_config – dict Data profiling configuration with which to update the project.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Project service fails.

update_test_suite_config(test_suite_config: dict) None

Update the project’s test suite config.

Parameters:

test_suite_config – dict Test suite configuration with which to update the project.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Project service fails.

update_run_time_info(run_time_info: dict) None

Update the runtime information object that specifies how this Project runs.

If no runtime information is provided when the test is created or started, then this runtime information is used.

Parameters:

run_time_info – dict Runtime information object with which to update the Project.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Project service fails.

get_schedule(schedule_id: str) Schedule

Return a schedule.

Parameters:

schedule_id – str, The ID of the schedule to retrieve.

Returns:

Schedule object for the Project.

Return type:

Schedule

Raises:

ValueError – This error is generated when the request to the Schedule service fails.

get_active_schedule() Schedule

Return the active schedule.

Returns:

Schedule object for the Project.

Return type:

Schedule

Raises:

ValueError – This error is generated when the request to the Schedule service fails.

create_schedule(test_run_config: dict, frequency_cron_expr: str) Schedule

Create a new schedule for the project.

Parameters:
  • test_run_config – dict, The test run configuration to use for the schedule, which specifies unique ids to locate the model and datasets to be used for the test.

  • frequency_cron_expr – str, The cron expression for the frequency of the schedule. Accepts “@hourly”, “@daily”, “@weekly”, or “@monthly”.

Returns:

Schedule object associated with the Project.

Return type:

Schedule

Raises:

ValueError – This error is generated when the request to the Schedule service fails.
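Because only the four macro expressions are accepted, it can help to validate the frequency client-side before calling the service. A sketch; `VALID_FREQUENCIES` and the helper are not part of the SDK:

```python
# The schedule frequency accepts only these macro expressions.
VALID_FREQUENCIES = {"@hourly", "@daily", "@weekly", "@monthly"}

def create_validated_schedule(project, test_run_config, frequency):
    """Create a schedule after checking the frequency client-side."""
    if frequency not in VALID_FREQUENCIES:
        raise ValueError(f"Unsupported frequency: {frequency!r}")
    return project.create_schedule(
        test_run_config=test_run_config,
        frequency_cron_expr=frequency,
    )
```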

update_schedule(schedule_id: str, frequency_cron_expr: str) Schedule

Update a schedule associated with the project.

Currently only the frequency can be updated.

Parameters:
  • schedule_id – str, The ID of the schedule to update.

  • frequency_cron_expr – str, The cron expression for the frequency of the schedule. Accepts “@hourly”, “@daily”, “@weekly”, or “@monthly”.

Returns:

The updated schedule object.

Return type:

Schedule

Raises:

ValueError – This error is generated when the request to the Schedule service fails.

delete_schedule(schedule_id: str) None

Delete a schedule.

Parameters:

schedule_id – str, The ID of the schedule to delete.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Schedule service fails.

activate_schedule(schedule_id: str) dict

Activate a schedule for this project.

Parameters:

schedule_id – str, The ID of the schedule to activate.

Returns:

A dict containing the response from the Project service.

Raises:
  • ValueError – This error is generated when the request to the Project service fails.

  • AttributeError – This error is generated when the response from the Project service is not as expected. This should not happen.

deactivate_schedule(schedule_id: str) None

Deactivate a schedule for this project.

Parameters:

schedule_id – str, The ID of the schedule to deactivate.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Project service fails.
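Activation and deactivation can be composed to swap the active schedule. This sketch assumes the Schedule object exposes a `schedule_id` attribute, which may differ in practice:

```python
# Deactivate the current schedule (if any), then activate another.
def switch_schedule(project, new_schedule_id):
    try:
        active = project.get_active_schedule()
        project.deactivate_schedule(active.schedule_id)
    except ValueError:
        pass  # No active schedule to deactivate.
    return project.activate_schedule(new_schedule_id)
```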

class rime_sdk.Job(api_client: ApiClient, job_id: str)

This object provides an interface for monitoring a Stress Test Job in the RIME backend.

get_test_run_id() str

Get the Test Run ID corresponding to a successful Job.

Raises:

ValueError – This error is generated when the job does not have state ‘SUCCEEDED’ or when the request to the Job Reader service fails.

get_test_run() TestRun

Get the Test Run object corresponding to a successful Job.

Raises:

ValueError – This error is generated when the job does not have state ‘SUCCEEDED’ or if the request to the Job Reader service fails.

cancel() None

Request to cancel the Job.

The RIME cluster will mark the Job with “Cancellation Requested” and then clean up the Job.

property error_msg: str

The error message, if this job failed.

get_agent_id() str

Return the Agent ID running the Job.

Return a link to the archived logs for a failed Job.

get_status(verbose: bool = True, wait_until_finish: bool = False, poll_rate_sec: float = 5.0) Dict

Return the current status of the Job.

Includes flags for blocking until the job is complete and printing information to stdout. This method displays the progress and running time of jobs.

If the job has failed, the logs of the testing engine will be dumped to stdout for debugging.

Parameters:
  • verbose – bool Specifies whether to print diagnostic information such as progress.

  • wait_until_finish – bool Specifies whether to block until the job has succeeded or failed. If verbose is enabled too, information about the job including running time and progress will be printed to stdout every poll_rate_sec.

  • poll_rate_sec – float The frequency in seconds with which to poll the job’s status.

Returns:

A dictionary representing the job’s state.

{
    "id": str,
    "type": str,
    "status": str,
    "start_time_secs": int64,
    "running_time_secs": double
}

Return type:

Dict

Example

# Block until this job is finished and dump monitoring info to stdout.
job_status = job.get_status(verbose=True, wait_until_finish=True)
property job_id: str

Return the ID of the Job.

property job_type: str

Return the type of the Job.

The valid Job types are: MODEL_STRESS_TEST, FIREWALL_BATCH_TEST, IMAGE_BUILDER, FILE_SCAN

class rime_sdk.ContinuousTestJob(api_client: ApiClient, job_id: str)

This object provides an interface for monitoring a Continuous Test Job in the RIME backend.

get_test_run_ids() str

Get the set of Test Run IDs corresponding to a successful Continuous Test Job.

Raises:

ValueError – This error is generated when the job does not have state ‘SUCCEEDED’ or when the request to the Job Reader service fails.

get_test_runs() List[TestRun]

Get the list of Test Run objects corresponding to a successful Continuous Test Job.

Raises:

ValueError – This error is generated when the job does not have state ‘SUCCEEDED’ or if the request to the Job Reader service fails.

cancel() None

Request to cancel the Job.

The RIME cluster will mark the Job with “Cancellation Requested” and then clean up the Job.

property error_msg: str

The error message, if this job failed.

get_agent_id() str

Return the Agent ID running the Job.

Return a link to the archived logs for a failed Job.

get_status(verbose: bool = True, wait_until_finish: bool = False, poll_rate_sec: float = 5.0) Dict

Return the current status of the Job.

Includes flags for blocking until the job is complete and printing information to stdout. This method displays the progress and running time of jobs.

If the job has failed, the logs of the testing engine will be dumped to stdout for debugging.

Parameters:
  • verbose – bool Specifies whether to print diagnostic information such as progress.

  • wait_until_finish – bool Specifies whether to block until the job has succeeded or failed. If verbose is enabled too, information about the job including running time and progress will be printed to stdout every poll_rate_sec.

  • poll_rate_sec – float The frequency in seconds with which to poll the job’s status.

Returns:

A dictionary representing the job’s state.

{
    "id": str,
    "type": str,
    "status": str,
    "start_time_secs": int64,
    "running_time_secs": double
}

Return type:

Dict

Example

# Block until this job is finished and dump monitoring info to stdout.
job_status = job.get_status(verbose=True, wait_until_finish=True)
property job_id: str

Return the ID of the Job.

property job_type: str

Return the type of the Job.

The valid Job types are: MODEL_STRESS_TEST, FIREWALL_BATCH_TEST, IMAGE_BUILDER, FILE_SCAN

class rime_sdk.TestRun(api_client: ApiClient, test_run_id: str)

An interface for a RIME Test Run object.

Return the web app URL for the Test Run page.

This page contains results for all Test Runs. To jump to the view which shows results for this specific Test Run, click on the corresponding time bin in the UI.

get_result_df() DataFrame

Return high level summary information for a completed stress Test Run in a single-row dataframe.

This dataframe includes information such as model metrics on the reference and evaluation datasets, overall RIME results such as severity across tests, and high level metadata such as the project ID and model task.

Place these rows together to build a table of test run results for comparison. This only works for stress test jobs that have succeeded.

Returns:

A pandas.DataFrame object containing the Test Run result. Use the .columns method on the returned dataframe to see what columns represent. Generally, these columns have information about the model and datasets as well as summary statistics such as the number of failing Test Cases or number of high severity Test Cases.

Return type:

pd.DataFrame

Example

test_run = client.get_test_run(some_test_run_id)
test_run_result_df = test_run.get_result_df()
get_test_cases_df(show_test_case_metrics: bool = False) DataFrame

Return all the Test Cases for a completed stress Test Run in a dataframe.

This enables you to perform granular queries on Test Cases. For example, if you only care about subset performance tests and want to see the results on each feature, you can fetch all the Test Cases in a dataframe, then query on that dataframe by test type. This only works on stress test jobs that have succeeded.

Parameters:

show_test_case_metrics – bool = False Whether to show Test Case specific metrics. This may result in a sparse dataframe, since different Test Cases return different metrics. Defaults to False.

Returns:

A pandas.DataFrame object containing the Test Case results. Here is a selected list of columns in the output:

  1. test_run_id: ID of the parent Test Run.

  2. features: List of features that the Test Case ran on.

  3. test_batch_type: Type of test that was run (e.g. Subset AUC).

  4. status: Status of the Test Case (e.g. Pass, Fail, etc.).

  5. severity: Denotes the severity of the failure of the test.

Return type:

pd.DataFrame

Example

# Wait until the job has finished, since this method only works on
# SUCCEEDED jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Get the Test Run result.
test_run = job.get_test_run()
# Dump the Test Cases in dataframe ``df``.
df = test_run.get_test_cases_df()
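Continuing the example, the dataframe can then be narrowed to a single test type using the test_batch_type column described above (a sketch, not an SDK method; requires pandas):

```python
import pandas as pd

# Keep only the Test Cases belonging to one test type.
def filter_by_test_type(test_cases_df: pd.DataFrame, test_batch_type: str) -> pd.DataFrame:
    return test_cases_df[test_cases_df["test_batch_type"] == test_batch_type]

# subset_df = filter_by_test_type(df, "subset_performance:subset_accuracy")
```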
get_test_batch(test_type: str) TestBatch

Return the Test Batch object for the specified test type on this Test Run.

A TestBatch object allows a user to query the results for the corresponding test. For example, the TestBatch object representing subset_performance:subset_accuracy allows a user to understand the results of the subset_performance:subset_accuracy test to varying levels of granularity.

Parameters:

test_type – str Name of the test. Structured as test_type:test_name, e.g. subset_performance:subset_accuracy.

Returns:

A TestBatch representing test_type.

Return type:

TestBatch

Example

batch = test_run.get_test_batch("unseen_categorical")
get_test_batches() Iterator[TestBatch]

Return all TestBatch objects for the Test Run.

Returns:

An iterator of TestBatch objects.

Return type:

Iterator[TestBatch]
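The iterator pairs naturally with TestBatch.summary() (documented below) to build one comparison table across all batches. A sketch, not an SDK method; requires pandas:

```python
import pandas as pd

# Collect the per-batch summary Series into a single dataframe.
def summarize_batches(test_batches) -> pd.DataFrame:
    return pd.DataFrame([batch.summary() for batch in test_batches])

# summary_df = summarize_batches(test_run.get_test_batches())
```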

get_category_results_df(show_category_results_metrics: bool = False) DataFrame

Get all category results for a completed stress Test Run in a dataframe.

This gives you the ability to perform granular queries on category tests. This only works on stress test jobs that have succeeded.

Parameters:

show_category_results_metrics – bool Boolean flag to request metrics related to the category results. Defaults to False.

Returns:

A pandas.DataFrame object containing the category test results. Here is a selected list of columns in the output:

  1. id: ID of the parent Test Run.

  2. name: Name of the category test.

  3. severity: Denotes the severity of the failure of the test.

  4. test_batch_types: List of tests that this category uses.

  5. failing_test_types: List of failing tests in this category.

  6. num_none_severity: Count of tests with NONE severity.

  7. num_low_severity: Count of tests with LOW severity.

  8. num_high_severity: Count of tests with HIGH severity.

Return type:

pd.DataFrame

Example

# Wait until the job has finished, since this method only works on
# SUCCEEDED jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Get the Test Run result.
test_run = job.get_test_run()
# Dump the Test Cases in dataframe ``df``.
df = test_run.get_category_results_df()
get_summary_tests_df(show_summary_test_metrics: bool = False) DataFrame

Get summary tests for a completed stress Test Run in a dataframe.

This gives you the ability to perform granular queries on summary tests. This only works on stress test jobs that have succeeded.

Returns:

A pandas.DataFrame object containing the summary test results. Here is a selected list of columns in the output:

  1. id: ID of the parent Test Run.

  2. name: Name of the summary test.

  3. severity: Denotes the severity of the failure of the test.

  4. test_batch_types: List of tests that this summary test uses.

  5. failing_test_types: List of tests in this summary test that fail.

  6. num_none_severity: Count of tests with NONE severity.

  7. num_low_severity: Count of tests with LOW severity.

  8. num_high_severity: Count of tests with HIGH severity.

Return type:

pd.DataFrame

Example

# Wait until the job has finished, since this method only works on
# SUCCEEDED jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Get the Test Run result.
test_run = job.get_test_run()
# Dump the Test Cases in dataframe ``df``.
df = test_run.get_summary_tests_df()
class rime_sdk.ContinuousTestRun(api_client: ApiClient, test_run_id: str, time_bin: Tuple[datetime, datetime])

An interface for an individual RIME continuous Test Run.

property start_time: datetime

Return the start time.

property end_time: datetime

Return the end time.

Return the web app URL which points to the Continuous Tests page.

This page contains results for all Test Runs. To jump to the view which shows results for this specific Test Run, click on the corresponding time bin in the UI.

Note: this is a string that should be copy-pasted into a browser.

get_category_results_df(show_category_results_metrics: bool = False) DataFrame

Get all category results for a completed stress Test Run in a dataframe.

This gives you the ability to perform granular queries on category tests. This only works on stress test jobs that have succeeded.

Parameters:

show_category_results_metrics – bool Boolean flag to request metrics related to the category results. Defaults to False.

Returns:

A pandas.DataFrame object containing the category test results. Here is a selected list of columns in the output:

  1. id: ID of the parent Test Run.

  2. name: Name of the category test.

  3. severity: Denotes the severity of the failure of the test.

  4. test_batch_types: List of tests that this category uses.

  5. failing_test_types: List of failing tests in this category.

  6. num_none_severity: Count of tests with NONE severity.

  7. num_low_severity: Count of tests with LOW severity.

  8. num_high_severity: Count of tests with HIGH severity.

Return type:

pd.DataFrame

Example

# Wait until the job has finished, since this method only works on
# SUCCEEDED jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Get the Test Run result.
test_run = job.get_test_run()
# Dump the Test Cases in dataframe ``df``.
df = test_run.get_category_results_df()
get_result_df() DataFrame

Return high level summary information for a completed stress Test Run in a single-row dataframe.

This dataframe includes information such as model metrics on the reference and evaluation datasets, overall RIME results such as severity across tests, and high level metadata such as the project ID and model task.

Place these rows together to build a table of test run results for comparison. This only works for stress test jobs that have succeeded.

Returns:

A pandas.DataFrame object containing the Test Run result. Use the .columns method on the returned dataframe to see what columns represent. Generally, these columns have information about the model and datasets as well as summary statistics such as the number of failing Test Cases or number of high severity Test Cases.

Return type:

pd.DataFrame

Example

test_run = client.get_test_run(some_test_run_id)
test_run_result_df = test_run.get_result_df()
get_summary_tests_df(show_summary_test_metrics: bool = False) DataFrame

Get summary tests for a completed stress Test Run in a dataframe.

This gives you the ability to perform granular queries on summary tests. This only works on stress test jobs that have succeeded.

Returns:

A pandas.DataFrame object containing the summary test results. Here is a selected list of columns in the output:

  1. id: ID of the parent Test Run.

  2. name: Name of the summary test.

  3. severity: Denotes the severity of the failure of the test.

  4. test_batch_types: List of tests that this summary test uses.

  5. failing_test_types: List of tests in this summary test that fail.

  6. num_none_severity: Count of tests with NONE severity.

  7. num_low_severity: Count of tests with LOW severity.

  8. num_high_severity: Count of tests with HIGH severity.

Return type:

pd.DataFrame

Example

# Wait until the job has finished, since this method only works on
# SUCCEEDED jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Get the Test Run result.
test_run = job.get_test_run()
# Dump the Test Cases in dataframe ``df``.
df = test_run.get_summary_tests_df()
get_test_batch(test_type: str) TestBatch

Return the Test Batch object for the specified test type on this Test Run.

A TestBatch object allows a user to query the results for the corresponding test. For example, the TestBatch object representing subset_performance:subset_accuracy allows a user to understand the results of the subset_performance:subset_accuracy test to varying levels of granularity.

Parameters:

test_type – str Name of the test. Structured as test_type:test_name, e.g. subset_performance:subset_accuracy.

Returns:

A TestBatch representing test_type.

Return type:

TestBatch

Example

batch = test_run.get_test_batch("unseen_categorical")
get_test_batches() Iterator[TestBatch]

Return all TestBatch objects for the Test Run.

Returns:

An iterator of TestBatch objects.

Return type:

Iterator[TestBatch]

get_test_cases_df(show_test_case_metrics: bool = False) DataFrame

Return all the Test Cases for a completed stress Test Run in a dataframe.

This enables you to perform granular queries on Test Cases. For example, if you only care about subset performance tests and want to see the results on each feature, you can fetch all the Test Cases in a dataframe, then query on that dataframe by test type. This only works on stress test jobs that have succeeded.

Parameters:

show_test_case_metrics – bool = False Whether to show Test Case specific metrics. This may result in a sparse dataframe, since different Test Cases return different metrics. Defaults to False.

Returns:

A pandas.DataFrame object containing the Test Case results. Here is a selected list of columns in the output:

  1. test_run_id: ID of the parent Test Run.

  2. features: List of features that the Test Case ran on.

  3. test_batch_type: Type of test that was run (e.g. Subset AUC).

  4. status: Status of the Test Case (e.g. Pass, Fail, etc.).

  5. severity: Denotes the severity of the failure of the test.

Return type:

pd.DataFrame

Example

# Wait until the job has finished, since this method only works on
# SUCCEEDED jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Get the Test Run result.
test_run = job.get_test_run()
# Dump the Test Cases in dataframe ``df``.
df = test_run.get_test_cases_df()
class rime_sdk.TestBatch(api_client: ApiClient, test_run_id: str, test_type: str)

An interface for a Test Batch object in a RIME Test Run.

summary(show_batch_metrics: bool = False) Series

Return the Test Batch summary as a Pandas Series.

The summary contains high level information about a Test Batch. For example, the name of the Test Batch, the category, and the severity of the Test Batch as a whole.

Returns:

A Pandas Series with the following columns (and optional additional columns for batch-level metrics):

  1. test_run_id

  2. test_type

  3. test_name

  4. category

  5. duration_in_millis

  6. severity

  7. failing_features

  8. description

  9. summary_counts.total

  10. summary_counts.pass

  11. summary_counts.fail

  12. summary_counts.warning

  13. summary_counts.skip

Return type:

pd.Series
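For example, the summary_counts fields above can be turned into a pass rate. A sketch, not an SDK method:

```python
# Compute the fraction of passing Test Cases from a summary Series.
# Works with any mapping exposing the "summary_counts.*" fields.
def batch_pass_rate(summary) -> float:
    total = summary["summary_counts.total"]
    return summary["summary_counts.pass"] / total if total else 0.0
```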

get_test_cases_df() DataFrame

Return all Test Cases in the Test Batch as a Pandas DataFrame.

Different tests will have different columns/information. For example, some tests may have a column representing the number of failing rows.

Returns:

A Pandas Dataframe where each row represents a Test Case.

Return type:

pd.DataFrame

Example

# Wait until the job has finished, since this method only works on
# SUCCEEDED jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Get the Test Run result.
test_run = job.get_test_run()
# Get the "subset accuracy" Test Batch from this Test Run.
test_batch = test_run.get_test_batch("subset_performance:subset_recall")
# Return the Test Cases from this Test Batch in a dataframe ``df``.
df = test_batch.get_test_cases_df()
class rime_sdk.ContinuousTest(api_client: ApiClient, ct_id: str)

An interface to RIME continuous testing.

update_ct(model_id: Optional[str] = None, ref_data_id: Optional[str] = None, scheduled_ct_eval_data_integration_id: Optional[str] = None, scheduled_ct_eval_data_info: Optional[dict] = None, scheduled_ct_eval_pred_integration_id: Optional[str] = None, scheduled_ct_eval_pred_info: Optional[dict] = None, disable_scheduled_ct: Optional[bool] = False) Dict

Update the ContinuousTest with specified model and reference data.

Parameters:
  • model_id – Optional[str] The ID of the model to use for the ContinuousTest.

  • ref_data_id – Optional[str] The ID of the reference data to use for the ContinuousTest.

  • scheduled_ct_eval_data_integration_id – Optional[str] The integration id of the evaluation data for scheduled CT.

  • scheduled_ct_eval_data_info – Optional[Dict] The data info of the evaluation data for scheduled CT.

  • scheduled_ct_eval_pred_integration_id – Optional[str] The integration id of the evaluation prediction for scheduled CT.

  • scheduled_ct_eval_pred_info – Optional[Dict] The data info of the evaluation prediction for scheduled CT.

  • disable_scheduled_ct – Optional[bool] Specifies whether to disable scheduled continuous testing.

Returns:

Dictionary representation of updated ContinuousTest object.

Return type:

Dict

Raises:

ValueError – This error is generated when no fields are submitted to be updated or when the request to the continuous testing service fails.

Example

response = ct.update_ct(ref_data_id="New reference data ID")
activate_ct_scheduling(data_info: dict, data_integration_id: Optional[str] = None, pred_integration_id: Optional[str] = None, pred_info: Optional[dict] = None) None

Activate scheduled CT.

Parameters:
  • data_info – dict The data info of the evaluation data for scheduled CT.

  • data_integration_id – Optional[str] The integration id of the evaluation data for scheduled CT.

  • pred_integration_id – Optional[str] The integration id of the evaluation prediction for scheduled CT.

  • pred_info – Optional[dict] The prediction info of the evaluation data for scheduled CT.

update_scheduled_ct_info(data_integration_id: Optional[str] = None, data_info: Optional[dict] = None, pred_integration_id: Optional[str] = None, pred_info: Optional[dict] = None) None

Update scheduled CT.

Parameters:
  • data_integration_id – Optional[str] If data_integration_id is not None, it will be used for scheduled CT.

  • data_info – Optional[dict] If data_info is not None, it will be used for scheduled CT.

  • pred_integration_id – Optional[str] If pred_integration_id is not None, it will be used for scheduled CT.

  • pred_info – Optional[dict] If pred_info is not None, it will be used for scheduled CT.

deactivate_ct_scheduling() None

Deactivate scheduled CT.

Return a URL to the ContinuousTest.

get_bin_size() timedelta

Return the bin size of this ContinuousTest.

get_ref_data_id() str

Return the ID of the Continuous Test’s current reference set.

get_model_id() str

Return the ID of the ContinuousTest’s current model.

get_scheduled_ct_info() Optional[Dict]

Return the scheduled continuous testing info of this ContinuousTest as a dict.

is_scheduled_ct_enabled() bool

Return whether scheduled continuous testing is enabled for this ContinuousTest.

get_events_df() DataFrame

Get a dataframe of Detected Events for the given ContinuousTest.

Monitors detect Events when degradations occur. For example, a Monitor for the metric “Accuracy” will detect an Event when the value of the model performance metric drops below a threshold.

list_monitors(monitor_types: Optional[List[str]] = None, risk_category_types: Optional[List[str]] = None) Iterator[Monitor]

List Monitors for this ContinuousTest.

Monitors examine time-sequenced data in RIME. Built-in Monitors track model health metrics such as degradations in model performance metrics or attacks on your model. This method can optionally filter by Monitor types or Risk Category types.

Parameters:
  • monitor_types – Optional[List[str]] Modifies query to return the set of built-in monitors or user-created custom monitors. Accepted values: [“Default”, “Custom”]

  • risk_category_types – Optional[List[str]] Modifies query to return monitors pertaining to certain categories of AI Risk. For instance, monitors that track model performance help you track down Operational Risk. Accepted values: [“Operational”, “Bias_and_Fairness”, “Security”, “Custom”]

Returns:

A generator of Monitor objects.

Return type:

Iterator[Monitor]

Raises:

ValueError – This error is generated when unrecognized filtering parameters are provided or when the request to the service fails.

Example

# List all default Monitors
monitors = ct.list_monitors(monitor_types=["Default"])
# For each Monitor, list all detected Events.
all_events = [monitor.list_detected_events() for monitor in monitors]
start_continuous_test(eval_data_id: str, override_existing_bins: bool = False, agent_id: Optional[str] = None, ram_request_megabytes: Optional[int] = None, cpu_request_millicores: Optional[int] = None, random_seed: Optional[int] = None, rime_managed_image: Optional[str] = None, custom_image: Optional[RuntimeinfoCustomImage] = None, **exp_fields: Dict[str, object]) ContinuousTestJob

Start a Continuous Testing run.

Runs a Continuous Testing job on a batch of data.

Parameters:
  • eval_data_id – str ID of the evaluation data.

  • override_existing_bins – bool Specifies whether to override existing bins.

  • ram_request_megabytes – Optional[int] Megabytes of RAM set as the Kubernetes pod limit for the Stress Test Job. The default is 4000MB.

  • cpu_request_millicores – Optional[int] Millicores of CPU set as the Kubernetes pod limit for the Stress Test Job. The default is 1500 millicores.

  • random_seed – Optional[int] Random seed to use for the Job, so that Test Job result will be deterministic.

  • agent_id – Optional[str] ID for the Agent where the Continuous Test will be run. Uses the default Agent for the workspace when not specified.

  • rime_managed_image – Optional[str] Name of a Managed Image to use when running the model test. The image must have all dependencies required by your model. To create new Managed Images with your desired dependencies, use the client’s create_managed_image() method.

  • custom_image – Optional[RuntimeinfoCustomImage] Specification of a customized container image to use when running the model test. The image must have all dependencies required by your model. The image must specify a name for the image and optionally a pull secret (of type RuntimeinfoCustomImagePullSecret) with the name of the Kubernetes pull secret used to access the given image.

  • exp_fields – Dict[str, object] Fields for experimental features.

Returns:

A Job object corresponding to the model Continuous Test Job.

Return type:

ContinuousTestJob

Raises:

ValueError – This error is generated when the request to the ModelTesting service fails.

Example

ct = project.get_ct()
eval_data_id = client.register_dataset("example dataset", data_config)
job = ct.start_continuous_test(
    eval_data_id=eval_data_id,
    ram_request_megabytes=8000,
    cpu_request_millicores=2000,
)
class rime_sdk.ImageBuilder(api_client: ApiClient, name: str, requirements: Optional[List[ManagedImagePipRequirement]] = None, package_requirements: Optional[List[ManagedImagePackageRequirement]] = None, python_version: Optional[str] = None)

An interface to an Image Builder object.

get_status(verbose: bool = True, wait_until_finish: bool = False, poll_rate_sec: float = 5.0) Dict

Return the status of the Image Build Job.

This method includes a toggle to wait until the Image Build job finishes.

Parameters:
  • verbose – bool Specifies whether to print diagnostic information such as logs. By default, this value is set to True.

  • wait_until_finish – bool Specifies whether to block until the Image is READY or FAILED. By default, this value is set to False.

  • poll_rate_sec – float The frequency with which to poll the Image’s build status. By default, this value is set to 5 seconds.

Returns:

A dictionary representing the Image’s state.

Return type:

Dict
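The wait_until_finish toggle amounts to a polling loop over the Image's build status. A minimal sketch of that pattern in plain Python (the "status" key and the READY/FAILED values are taken from the description above; the exact dictionary shape returned by the SDK may differ):

```python
import time

def wait_for_terminal(get_status, poll_rate_sec=5.0, terminal=("READY", "FAILED")):
    # Poll until the build reaches a terminal state, as
    # get_status(wait_until_finish=True) does internally.
    while True:
        state = get_status()
        if state.get("status") in terminal:
            return state
        time.sleep(poll_rate_sec)

# Simulate a build that becomes READY on the third poll.
states = iter([{"status": "BUILDING"}, {"status": "BUILDING"}, {"status": "READY"}])
final = wait_for_terminal(lambda: next(states), poll_rate_sec=0.0)
```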

class rime_sdk.DataCollector(api_client: ApiClient, project_id: str)

An interface to a Data Collector.

A Data Collector lets users log datapoints for use in RIME, either individually or in batches, as a continuous stream.

register_data_stream() str

Register a data stream with the Data Collector.

A data stream is a location to which data can be uploaded.

Returns:

The ID of the registered data stream.

Return type:

str

log_datapoints(data_stream_id: str, inputs: List[Dict], timestamps: Optional[List[datetime]] = None, labels: Optional[List[Union[Dict, int, float]]] = None, query_ids: Optional[List[Union[str, float, int]]] = None) List[str]

Log datapoints in batches.

Parameters:
  • data_stream_id – str The ID of the data stream to log the datapoints.

  • inputs – List[Dict] List of inputs to log to the Data Collector. Provide each input as a dictionary, with feature names as keys mapped to their corresponding values.

  • timestamps – Optional[List[datetime]] List of optional timestamps associated with each input. The default value is the timestamp when the log_datapoints method is called.

  • labels – Optional[List[Union[Dict, int, float]]] List of optional labels associated with each input.

  • query_ids – Optional[List[Union[str, float, int]]] List of optional query IDs associated with each input. This parameter is only relevant for ranking use cases.

Returns:

List of the logged datapoint IDs.

Return type:

List[str]

Raises:

ValueError – This error is generated when the length of the inputs, timestamps, labels, or query_ids lists are not equal.

Example: This example registers a data stream and logs two datapoints to the registered data stream.

data_stream_id = data_collector.register_data_stream()
datapoint_ids = data_collector.log_datapoints(
    data_stream_id=data_stream_id,
    inputs=[
        {"feature_1": 1, "feature_2": 2},
        {"feature_1": 3, "feature_2": 4},
    ],
    timestamps=[
        datetime(2020, 1, 1, 0, 0, 0),
        datetime(2020, 1, 1, 0, 0, 1),
    ],
    labels=[{"label": "label_1"}, {"label": "label_2"}],
)
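The ValueError documented above fires when the optional per-datapoint lists do not match the number of inputs. A minimal sketch of that length check in plain Python (illustrative only; the SDK's internal validation may differ):

```python
def check_equal_lengths(inputs, **optional_lists):
    # Every provided optional list (timestamps, labels, query_ids, ...)
    # must have exactly one entry per input.
    n = len(inputs)
    for name, values in optional_lists.items():
        if values is not None and len(values) != n:
            raise ValueError(f"{name} has {len(values)} entries, expected {n}")

inputs = [{"feature_1": 1, "feature_2": 2}, {"feature_1": 3, "feature_2": 4}]
# Passes: labels matches inputs in length, timestamps is omitted (None).
check_equal_lengths(inputs, labels=[{"label": "label_1"}, {"label": "label_2"}], timestamps=None)
```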
log_predictions(model_id: str, datapoint_ids: List[str], predictions: List[Dict]) None

Log predictions to the Data Collector.

Parameters:
  • model_id – str The ID of the model to log the predictions.

  • datapoint_ids – List[str] List of datapoint IDs associated with each prediction.

  • predictions – List[Dict] List of predictions to log to the Data Collector. Provide each prediction as a dictionary, with feature names as keys mapped to their corresponding values.

Raises:

ValueError – This error is generated when the length of the datapoint_ids and predictions lists are not equal.

Example: This example logs two predictions to the Data Collector.

data_collector.log_predictions(
    model_id="model_id",
    datapoint_ids=["datapoint_id_1", "datapoint_id_2"],
    predictions=[
        {"prediction": "prediction_1"}, {"prediction": "prediction_2"}
    ],
)
class rime_sdk.Registry(api_client: ApiClient)

An interface to a RIME Registry.

register_dataset(project_id: str, name: str, data_config: dict, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, ct_info: Optional[dict] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) str

Register and validate a new dataset in a Project.

Parameters:
  • project_id – str The ID of the Project in which to register the dataset.

  • name – str The chosen name of the dataset.

  • data_config – dict A dictionary that contains the data configuration. The data configuration must match the API specification of the data_info field in the RegisterDataset request.

  • integration_id – Optional[str] = None, Provide the integration ID for datasets that require an integration.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the dataset.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the dataset.

  • ct_info – Optional[dict] = None, An optional dictionary that contains the CT info. The CT info must match the API specification of the ct_info field in the RegisterDataset request.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The ID of the newly registered dataset.

Return type:

str

Raises:

ValueError – This error is generated when the request to the Registry service fails.

Example

dataset_id = registry.register_dataset(
    name=DATASET_NAME,
    data_config={
        "connection_info": {"data_file": {"path": FILE_PATH}},
        "data_params": {"label_col": LABEL_COL},
    },
    integration_id=INTEGRATION_ID,
)
register_and_validate_dataset(project_id: str, name: str, data_config: dict, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, ct_info: Optional[dict] = None, agent_id: Optional[str] = None) Tuple[str, Optional[Job]]

Register and validate a new dataset in a Project.

Parameters:
  • project_id – str The ID of the Project in which to register the dataset.

  • name – str The chosen name of the dataset.

  • data_config – dict A dictionary that contains the data configuration. The data configuration must match the API specification of the data_info field in the RegisterDataset request.

  • integration_id – Optional[str] = None, Provide the integration ID for datasets that require an integration.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the dataset.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the dataset.

  • ct_info – Optional[dict] = None, An optional dictionary that contains the CT info. The CT info must match the API specification of the ct_info field in the RegisterDataset request.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The returned Tuple contains the ID of the newly registered dataset and the Job object that represents the validation job.

Return type:

Tuple[str, Optional[Job]]

Raises:

ValueError – This error is generated when the request to the Registry service fails.

Example

dataset_id, job = registry.register_and_validate_dataset(
    name=DATASET_NAME,
    data_config={
        "connection_info": {"data_file": {"path": FILE_PATH}},
        "data_params": {"label_col": LABEL_COL},
    },
    integration_id=INTEGRATION_ID,
)
register_model(project_id: str, name: str, model_config: Optional[dict] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, external_id: Optional[str] = None, integration_id: Optional[str] = None, model_endpoint_integration_id: Optional[str] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) str

Register and validate a new model in a Project.

Parameters:
  • project_id – str The ID of the Project in which to register the model.

  • name – str The chosen name of the model.

  • model_config – Optional[dict] = None, A dictionary that contains the model configuration. Any model configuration that is provided must match the API specification for the model_info field of the RegisterModel request.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the model.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the model.

  • external_id – Optional[str] = None, An optional external ID that can be used to identify the model.

  • integration_id – Optional[str] = None, Provide the integration ID for models that require an integration for access.

  • model_endpoint_integration_id – Optional[str] = None, Provide the integration ID for models that require an integration when running the model.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The ID of the newly registered model.

Return type:

str

Raises:

ValueError – This error is generated when the request to the Registry service fails.

Example

model_id = registry.register_model(
    name=MODEL_NAME,
    model_config={
        "hugging_face": {
            "model_uri": URI,
            "kwargs": {
                "tokenizer_uri": TOKENIZER_URI,
                "class_map": MAP,
                "ignore_class_names": True,
            },
        }
    },
    tags=[MODEL_TAG],
    metadata={KEY: VALUE},
    external_id=EXTERNAL_ID,
    agent_id=AGENT_ID,
)
register_and_validate_model(project_id: str, name: str, model_config: Optional[dict] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, external_id: Optional[str] = None, integration_id: Optional[str] = None, model_endpoint_integration_id: Optional[str] = None, agent_id: Optional[str] = None) Tuple[str, Optional[Job]]

Register and validate a new model in a Project.

Parameters:
  • project_id – str The ID of the Project in which to register the model.

  • name – str The chosen name of the model.

  • model_config – Optional[dict] = None, A dictionary that contains the model configuration. Any model configuration that is provided must match the API specification for the model_info field of the RegisterModel request.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the model.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the model.

  • external_id – Optional[str] = None, An optional external ID that can be used to identify the model.

  • integration_id – Optional[str] = None, Provide the integration ID for models that require an integration for access.

  • model_endpoint_integration_id – Optional[str] = None, Provide the integration ID for models that require an integration when running the model.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The returned Tuple contains the ID of the newly registered model and the Job object that represents the validation job.

Return type:

Tuple[str, Optional[Job]]

Raises:

ValueError – This error is generated when the request to the Registry service fails.

Example

model_id, job = registry.register_and_validate_model(
    name=MODEL_NAME,
    model_config={
        "hugging_face": {
            "model_uri": URI,
            "kwargs": {
                "tokenizer_uri": TOKENIZER_URI,
                "class_map": MAP,
                "ignore_class_names": True,
            },
        }
    },
    tags=[MODEL_TAG],
    metadata={KEY: VALUE},
    external_id=EXTERNAL_ID,
    agent_id=AGENT_ID,
)
register_predictions(project_id: str, dataset_id: str, model_id: str, pred_config: dict, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, skip_validation: Optional[bool] = False, agent_id: Optional[str] = None) None

Register and validate a new set of predictions for a model and a dataset.

Parameters:
  • project_id – str The ID of the Project to which the models belong.

  • dataset_id – str, The ID of the dataset used to generate the predictions.

  • model_id – str, The ID of the model used to generate the predictions.

  • pred_config – dict, A dictionary that contains the prediction configuration. The prediction configuration must match the API specification for the pred_info field of the RegisterPredictions request.

  • integration_id – Optional[str] = None, Provide the integration ID for predictions that require an integration to use.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the predictions.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the predictions.

  • skip_validation – Optional[bool] = False, This parameter is deprecated; validation is always performed.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

None

Raises:

ValueError – This error is generated when the request to the Registry service fails.

Example

registry.register_predictions(
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    pred_config={
        "connection_info": {
            "databricks": {
                # Unix timestamp equivalent to 02/08/2023
                "start_time": 1675922943,
                # Unix timestamp equivalent to 03/08/2023
                "end_time": 1678342145,
                "table_name": TABLE_NAME,
                "time_col": TIME_COL,
            },
        },
        "pred_params": {"pred_col": PREDS},
    },
    tags=[TAG],
    metadata={KEY: VALUE},
)
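Rather than hard-coding the Unix timestamps used for start_time and end_time, they can be derived from calendar dates with the standard library (the dates below are illustrative):

```python
from datetime import datetime, timezone

def to_unix(year: int, month: int, day: int) -> int:
    # Convert a UTC calendar date to the integer Unix timestamp
    # expected by the connection_info time bounds.
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

start_time = to_unix(2023, 2, 8)  # midnight UTC on 2023-02-08 -> 1675814400
end_time = to_unix(2023, 3, 8)    # midnight UTC on 2023-03-08 -> 1678233600
```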
register_and_validate_predictions(project_id: str, dataset_id: str, model_id: str, pred_config: dict, integration_id: Optional[str] = None, tags: Optional[List[str]] = None, metadata: Optional[dict] = None, agent_id: Optional[str] = None) Optional[Job]

Register and validate a new set of predictions for a model and a dataset.

Parameters:
  • project_id – str The ID of the Project to which the models belong.

  • dataset_id – str, The ID of the dataset used to generate the predictions.

  • model_id – str, The ID of the model used to generate the predictions.

  • pred_config – dict, A dictionary that contains the prediction configuration. The prediction configuration must match the API specification for the pred_info field of the RegisterPredictions request.

  • integration_id – Optional[str] = None, Provide the integration ID for predictions that require an integration to use.

  • tags – Optional[List[str]] = None, An optional list of tags to associate with the predictions.

  • metadata – Optional[dict] = None, An optional dictionary of metadata to associate with the predictions.

  • agent_id – Optional[str] = None, Agent for running validation. If omitted the workspace’s default agent will be used.

Returns:

The Job object that represents the validation job.

Return type:

Optional[Job]

Raises:

ValueError – This error is generated when the request to the Registry service fails.

Example

job = registry.register_and_validate_predictions(
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    pred_config={
        "connection_info": {
            "databricks": {
                # Unix timestamp equivalent to 02/08/2023
                "start_time": 1675922943,
                # Unix timestamp equivalent to 03/08/2023
                "end_time": 1678342145,
                "table_name": TABLE_NAME,
                "time_col": TIME_COL,
            },
        },
        "pred_params": {"pred_col": PREDS},
    },
    tags=[TAG],
    metadata={KEY: VALUE},
)
list_datasets(project_id: str) Iterator[Dict]

Return a list of datasets.

Parameters:

project_id – str The ID of the Project to which the datasets belong.

Returns:

Iterator of dictionaries: each dictionary represents a dataset.

Return type:

Iterator[Dict]

Raises:

ValueError – This error is generated when the request to the Registry service fails.

list_models(project_id: str) Iterator[Dict]

Return a list of models.

Parameters:

project_id – str The ID of the Project to which the models belong.

Returns:

Iterator of dictionaries: each dictionary represents a model.

Return type:

Iterator[Dict]

Raises:

ValueError – This error is generated when the request to the Registry service fails.

list_predictions(project_id: str, model_id: Optional[str] = None, dataset_id: Optional[str] = None) Iterator[Dict]

Return a list of prediction sets.

Parameters:
  • project_id – str The ID of the Project to which the models belong.

  • model_id – Optional[str] = None The ID of the model to which the prediction sets belong.

  • dataset_id – Optional[str] = None The ID of the dataset to which the prediction sets belong.

Returns:

Iterator of dictionaries: each dictionary represents a prediction set.

Return type:

Iterator[Dict]

Raises:

ValueError – This error is generated when the request to the Registry service fails.

get_dataset(dataset_id: Optional[str] = None, dataset_name: Optional[str] = None) Dict

Return a dataset.

Parameters:
  • dataset_id – str The ID of the dataset to retrieve.

  • dataset_name – str The name of the dataset to retrieve.

Returns:

A dictionary representing the dataset.

Return type:

Dict

Raises:

ValueError – This error is generated when the request to the Registry service fails.

has_dataset(dataset_id: Optional[str] = None, dataset_name: Optional[str] = None) bool

Return a boolean on whether the dataset is present.

Parameters:
  • dataset_id – Optional[str] = None The ID of the dataset to check for.

  • dataset_name – Optional[str] = None The name of the dataset to check for.

Returns:

A boolean on whether the dataset is present.

Return type:

bool

Raises:

ValueError – This error is generated when any error other than HTTPStatus.NOT_FOUND is returned from the Registry service.
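has_dataset pairs naturally with register_dataset in idempotent setup scripts. A sketch, assuming a live Registry; PROJECT_ID, DATASET_NAME, and data_config are placeholders, and the "id" key on the returned dictionary is an assumption:

```python
# Register the dataset only if it is not already present.
if registry.has_dataset(dataset_name=DATASET_NAME):
    dataset = registry.get_dataset(dataset_name=DATASET_NAME)
    dataset_id = dataset["id"]  # "id" key is an assumption about the dict shape
else:
    dataset_id = registry.register_dataset(PROJECT_ID, DATASET_NAME, data_config)
```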

get_model(model_id: Optional[str] = None, model_name: Optional[str] = None) Dict

Return a model.

Parameters:
  • model_id – str The ID of the model to retrieve.

  • model_name – str The name of the model to retrieve.

Returns:

A dictionary representing the model.

Return type:

Dict

Raises:

ValueError – This error is generated when the request to the Registry service fails.

get_predictions(model_id: str, dataset_id: str) Dict

Get a prediction set.

Parameters:
  • model_id – str The ID of the model used to generate the predictions.

  • dataset_id – str The ID of the dataset used to generate the predictions.

Returns:

A dictionary that contains the prediction set.

Return type:

Dict

Raises:

ValueError – This error is generated when the request to the Registry service fails.

delete_dataset(dataset_id: str) None

Delete a dataset.

Parameters:

dataset_id – str The ID of the dataset to delete.

Raises:

ValueError – This error is generated when the request to the Registry service fails.

delete_model(model_id: str) None

Delete a model.

Parameters:

model_id – str The ID of the model to delete.

Raises:

ValueError – This error is generated when the request to the Registry service fails.

delete_predictions(model_id: str, dataset_id: str) None

Delete a prediction set.

Parameters:
  • model_id – str The ID of the model used to generate the predictions.

  • dataset_id – str The ID of the dataset used to generate the predictions.

Raises:

ValueError – This error is generated when the request to the Registry service fails.

static log_registry_validation(registry_resp: dict, registry_type: str, registry_id: str) None

Log the validation status of a registry.

class rime_sdk.DetectionEvent(api_client: ApiClient, event_dict: dict)

An interface to a Detection Event.

RIME surfaces Detection Events to indicate problems with a model that is in production or during model validation.

to_dict() dict

Return a dictionary representation of the Event object.

class rime_sdk.Monitor(api_client: ApiClient, monitor_id: str, firewall_id: str, project_id: str)

An interface to a Monitor object.

Monitors track important model events over time including metric degradations or attacks on your model.

update(notify: Optional[bool] = None) None

Update the settings for the given Monitor in the backend.

Parameters:

notify – Optional[bool] A Boolean that specifies whether to enable Monitoring notifications for the given Monitor. When Monitoring notifications are also enabled for the Project and the Monitor finds a Detection Event, the system sends an alert.

list_detected_events() Iterator[DetectionEvent]

List detected Events for the given Monitor.

For each continuous testing bin upload, RIME compares the metric value to the Monitor’s thresholds and creates detection events when a degradation is detected. For a subset of Monitors, we perform Root Cause Analysis to explain the detailed cause of the Event.

Returns:

A generator of dictionary representations of Detection Events. They are sorted in reverse chronological order by the time at which the event occurred.

Return type:

Iterator[DetectionEvent]

Example

# List all default Monitors on the Firewall
monitors = firewall.list_monitors(monitor_types=["Default"])
# For each Monitor, list all detected Events.
all_events = [monitor.list_detected_events() for monitor in monitors]
class rime_sdk.FirewallClient(domain: str, auth_token: str = '', channel_timeout: float = 60.0)

An interface to connect to FirewallInstances on a firewall cluster.

Create a firewall instance by specifying the rule configuration. Spinning up takes anywhere from a few seconds to a few minutes; once the instance is ready, it responds to validation requests using the custom configuration. A single firewall cluster can host many firewall instances, each independent of the others.

Parameters:
  • domain – str The base domain/address of the firewall.

  • auth_token – str The auth token is generated in the Firewall UI and is used to authenticate to the firewall. If the auth_token is provided, you do not need to provide an api_key. Auth tokens are only available when the Firewall UI has been enabled.

  • channel_timeout – float The amount of time in seconds to wait for responses from the firewall.

login(email: str, system_account: bool = False) None

Login to obtain an auth token.

Parameters:
  • email – str The user’s email address that is used to authenticate.

  • system_account – bool Specifies whether to obtain a system account token rather than a user token.

Example

firewall.login("[email protected]", True)
list_firewall_instances() Iterable[FirewallInstance]

List the FirewallInstances for the given cluster.

create_firewall_instance(rule_config: dict, description: str = '', block_until_ready: bool = True, block_until_ready_verbose: Optional[bool] = None, block_until_ready_timeout_sec: Optional[float] = 900.0, block_until_ready_poll_rate_sec: Optional[float] = None) FirewallInstance

Create a FirewallInstance with the specified rule configuration.

This method blocks until the FirewallInstance is ready.

Parameters:
  • rule_config – dict Dictionary containing the rule config to customize the behavior of the FirewallInstance.

  • description – str = “” Human-readable description of the FirewallInstance.

  • block_until_ready – bool = True Whether to block until the FirewallInstance is ready.

  • block_until_ready_verbose – Optional[bool] = None Whether to print out information while waiting for the FirewallInstance to come up.

  • block_until_ready_timeout_sec – Optional[float] = 900.0 How many seconds to wait for the FirewallInstance to come up before timing out.

  • block_until_ready_poll_rate_sec – Optional[float] = None How often to poll the FirewallInstance status.

Returns:

FirewallInstance that is ready to accept validation requests.

Raises:

TimeoutError – This error is generated if the FirewallInstance is not ready by the deadline set through timeout_sec.
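Putting the FirewallClient methods above together, a minimal end-to-end sketch; the domain and email are placeholders, RULE_CONFIG stands in for your rule configuration dictionary, and a live firewall cluster is required:

```python
from rime_sdk import FirewallClient

fw_client = FirewallClient("firewall.my_vpc.rime.com")
fw_client.login("[email protected]")  # obtain an auth token

# Blocks until the instance is READY (block_until_ready defaults to True).
instance = fw_client.create_firewall_instance(
    rule_config=RULE_CONFIG,
    description="demo instance",
)
print(instance.firewall_instance_id)
```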

get_firewall_instance(firewall_instance_id: str) FirewallInstance

Get a FirewallInstance from the cluster.

Parameters:

firewall_instance_id – str The UUID string of the FirewallInstance to retrieve.

Returns:

A firewall instance on which to perform validation.

Return type:

FirewallInstance

delete_firewall_instance(firewall_instance_id: str) None

Delete a FirewallInstance from the cluster.

Careful when deleting a FirewallInstance: in-flight validation traffic will be interrupted.

Parameters:

firewall_instance_id – str The UUID string of the FirewallInstance to hard delete.

class rime_sdk.FirewallInstance(firewall_instance_id: str, api_client: ApiClient)

An interface to a single instance of the firewall running on a cluster.

Each FirewallInstance has its own rule configuration and can be accessed by its unique ID. This allows users to customize the behavior of the firewall for different use cases. Note: FirewallInstance should not be instantiated directly, but instead instantiated through methods of the FirewallClient.

Parameters:
  • firewall_instance_id – str The unique ID of the FirewallInstance.

  • api_client – ApiClient API client for interacting with the cluster.

validate(user_input_text: Optional[str] = None, output_text: Optional[str] = None, contexts: Optional[List[str]] = None) dict

Validate model input and/or output text.
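For example, checking a single prompt/response pair against a running instance (the text values are illustrative, and a live FirewallInstance is required):

```python
# firewall_instance was obtained via FirewallClient.get_firewall_instance()
# or create_firewall_instance(); see the FirewallClient methods above.
result = firewall_instance.validate(
    user_input_text="What is my account balance?",
    output_text="Your balance is $100.",
)
# The returned dict describes the validation outcome; inspect it directly.
print(result)
```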

get_effective_config() dict

Get the effective configuration for the FirewallInstance.

This effective configuration has default values filled in and shows what is actually being used at runtime.

block_until_ready(verbose: bool = True, timeout_sec: float = 300.0, poll_rate_sec: float = 5.0, consecutive_ready_count: int = 1) None

Block until the FirewallInstance is ready.

Raises:

TimeoutError – This error is raised if the FirewallInstance is not ready by the deadline set through timeout_sec.

update_firewall_instance(config: Optional[dict] = None, description: Optional[str] = None, block_until_ready: bool = True, block_until_ready_verbose: Optional[bool] = None, block_until_ready_timeout_sec: Optional[float] = None, block_until_ready_poll_rate_sec: Optional[float] = None, block_until_consecutive_ready_count: Optional[int] = 2) None

Update the config or description of the FirewallInstance.

Parameters:
  • config – Optional[dict] New config for the FirewallInstance.

  • description – str New description for the FirewallInstance.

  • block_until_ready – bool = True Whether to block until the FirewallInstance is ready.

  • block_until_ready_verbose – Optional[bool] = None Whether to print out information while waiting for the FirewallInstance to come up.

  • block_until_ready_timeout_sec – Optional[float] = None How many seconds to wait until the FirewallInstance comes up before timing out.

  • block_until_ready_poll_rate_sec – Optional[float] = None How often to poll the FirewallInstance status.

  • block_until_consecutive_ready_count – Optional[int] = 2 Number of consecutive READY status polls to wait for before returning.

property rule_config: dict

Access the rule config of the FirewallInstance.

This config is immutable after it is created.

property firewall_instance_id: str

Access the UUID of the FirewallInstance.

property status: str

Access the current status of the FirewallInstance.

property description: str

Access the description of the FirewallInstance.