Python SDK

The RIME SDK provides an interface to RIME backend services for starting and viewing the progress of RIME stress test jobs. There are five objects available in the rime_sdk package:

RIMEClient

The RIMEClient provides an interface to RIME’s backend services for creating projects, starting stress test jobs, and querying the backend for current stress test jobs.

rime_sdk.RIMEClient

RIMEStressTestJob

This object provides an interface for monitoring the status of a stress test job in the RIME backend.

rime_sdk.RIMEStressTestJob

RIMEProject

This object describes a project in the RIME backend.

rime_sdk.RIMEProject

RIMEFirewall

This object describes a RIMEFirewall in the RIME backend.

rime_sdk.RIMEFirewall

RIMEImageBuilder

An interface to a RIME image builder.

rime_sdk.RIMEImageBuilder

Python package providing access to RIME’s backend services.

class rime_sdk.RIMEFirewall(backend: RIMEBackend, firewall_id: str)

RIMEFirewall object wrapper with helpful methods for working with RIME Firewall.

backend: RIMEBackend The RIME backend used to query about the status of the job.

firewall_id: str How to refer to the FW in the backend. Use this attribute to specify the Firewall for tasks in the backend.

update_firewall_stress_test_run(stress_test_run_id: str) → UpdateFirewallResponse

Update firewall with stress test run id.

Parameters: stress_test_run_id – ID of the stress test run used to configure the firewall.
Returns: An UpdateFirewallResponse.
Raises: ValueError – If the request to the ModelTest service failed.

get_link() → str

Get the web app URL to the firewall.

This link directs to your organization’s deployment of RIME. You can view more detailed information about the firewall in the web app, including helpful visualizations, key insights on your model’s performance, and explanations of test results for each batch.

Note: this is a string that should be copy-pasted into a browser.

run_firewall_incremental_data(test_run_config: dict, disable_firewall_events: bool = True, custom_image: Optional[CustomImage] = None, rime_managed_image: Optional[str] = None, ram_request_megabytes: Optional[int] = None, cpu_request_millicores: Optional[int] = None) → RIMEStressTestJob

Start a RIME model firewall test on the backend’s ModelTesting service.

This allows you to run a Firewall test job on the RIME backend. It runs the firewall on a batch of tabular data.

Parameters:

test_run_config – dict Configuration for the test to be run, which specifies paths to the model and datasets to use for the test.
custom_image – Optional[CustomImage] Specification of a customized container image to use when running the model test. The image must have all dependencies required by your model. The specification must include a name for the image and, optionally, a pull secret (of type CustomImage.PullSecret) with the name of the Kubernetes pull secret used to access the given image.
rime_managed_image – Optional[str] Name of a managed image to use when running the model test. The image must have all dependencies required by your model. To create new managed images with your desired dependencies, use the client’s create_managed_image() method.
ram_request_megabytes – int Megabytes of RAM requested for the stress test job. If not specified, defaults to 4000 MB. The limit is 2x the megabytes requested.
cpu_request_millicores – int Millicores of CPU requested for the stress test job. If not specified, defaults to 1500 millicores. The limit is 2x the millicores requested.

Returns:

A RIMEStressTestJob providing information about the model stress test job.

Raises:

ValueError – If the request to the ModelTest service failed.

Example:

# This example will likely not work for you because it requires permissions
# to a specific S3 bucket. This demonstrates how you might specify such a
# configuration.
incremental_config = {
    "eval_path": "s3://rime-datasets/fraud_continuous_testing/eval_2021_04_30_to_2021_05_01.csv",
    "timestamp_col": "timestamp"
}
# Run the job using the specified config and the default Docker image in
# the RIME backend. Use the RIME Managed Image "tensorflow115".
# This assumes you have already created the Managed Image and waited for it
# to be ready.
firewall = rime_client.get_firewall("foo")
job = firewall.run_firewall_incremental_data(
    test_run_config=incremental_config,
    rime_managed_image="tensorflow115",
    ram_request_megabytes=8000,
    cpu_request_millicores=2000)
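
The resource parameters follow simple arithmetic: the limit is twice the request, and unspecified requests fall back to 4000 MB of RAM and 1500 millicores of CPU. A minimal sketch of that calculation, useful for capacity planning only. The function name effective_resources is hypothetical and not part of the SDK.

```python
from typing import Dict, Optional

# Defaults and the 2x multiplier come from the parameter descriptions above.
DEFAULT_RAM_MB = 4000
DEFAULT_CPU_MILLICORES = 1500
LIMIT_MULTIPLIER = 2


def effective_resources(
    ram_request_megabytes: Optional[int] = None,
    cpu_request_millicores: Optional[int] = None,
) -> Dict[str, int]:
    """Return the request and limit values implied by the documented rules."""
    ram = DEFAULT_RAM_MB if ram_request_megabytes is None else ram_request_megabytes
    cpu = (
        DEFAULT_CPU_MILLICORES
        if cpu_request_millicores is None
        else cpu_request_millicores
    )
    return {
        "ram_request_mb": ram,
        "ram_limit_mb": ram * LIMIT_MULTIPLIER,
        "cpu_request_millicores": cpu,
        "cpu_limit_millicores": cpu * LIMIT_MULTIPLIER,
    }


# The example call above requests 8000 MB and 2000 millicores.
requested = effective_resources(8000, 2000)
defaults = effective_resources()
print(requested)
print(defaults)
```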

class rime_sdk.RIMEClient(domain: str, api_key: str = '', channel_timeout: float = 5.0, disable_tls: bool = False)

The RIMEClient provides an interface to RIME’s backend services for creating projects, starting stress test jobs, and querying the backend for current stress test jobs.

To initialize the RIMEClient, provide the address of your RIME instance.

Parameters:

domain – str The base domain/address of the RIME service.
api_key – str The api key providing authentication to RIME services.
channel_timeout – float The amount of time in seconds to wait for channels to become ready when opening connections to gRPC servers.

Raises:

ValueError – If a connection cannot be made to a backend service within timeout.

Example:

rime_client = RIMEClient("my_vpc.rime.com", "api-key")

create_project(name: str, description: str) → RIMEProject

Create a new RIME project in RIME’s backend.

Projects allow you to organize stress test runs as you see fit. A natural way to organize stress test runs is to create a project for each specific ML task, such as predicting whether a transaction is fraudulent.

Parameters:

name – str Name of the new project.
description – str Description of the new project.

Returns:

A RIMEProject that describes the created project. Its project_id attribute can be used in start_stress_test() and list_stress_test_jobs().

Raises:

ValueError – If the request to the Upload service failed.

Example:

project = rime_client.create_project(name='foo', description='bar')

create_managed_image(name: str, requirements: List[PipRequirement]) → RIMEImageBuilder

Create a new managed Docker image with the desired PIP requirements to run RIME on.

These managed Docker images are managed by the RIME backend and will automatically be upgraded when you update your version of RIME. Note: Images take a few minutes to be built.

This method returns an object that can be used to track the progress of the image building job. The new custom image is only available for use in a stress test once it has status READY.

Parameters:

name – str The (unique) name of the new managed image. This acts as the unique identifier of the managed image. The call will fail if an image with the specified name already exists.
requirements – List[ManagedImage.PipRequirement] List of additional pip requirements to be installed on the managed image. A ManagedImage.PipRequirement can be created with the helper method RIMEClient.pip_requirement. The first argument is the name of the library (e.g. tensorflow or xgboost) and the second argument is a valid pip version specifier (e.g. >=0.1.2 or ==1.0.2).

Returns:

A RIMEImageBuilder object that provides an interface for monitoring the job in the backend.

Raises:

ValueError – If the request to the ImageRegistry service failed.

Example:

requirements = [
    # Fix the version of `xgboost` to `1.0.2`.
    rime_client.pip_requirement("xgboost", "==1.0.2"),
    # We do not care about the installed version of `tensorflow`.
    rime_client.pip_requirement("tensorflow"),
]

# Start a new image building job.
builder_job = rime_client.create_managed_image(
    "xgboost102_tensorflow", requirements)

# Wait until the job has finished and print out status information.
# Once this prints out the `READY` status, your image is available for
# use in stress tests.
builder_job.get_status(verbose=True, wait_until_finish=True)

static pip_requirement(name: str, version_specifier: Optional[str] = None) → PipRequirement: Construct a PipRequirement object for use in create_managed_image().

static pip_library_filter(name: str, fixed_version: Optional[str] = None) → PipLibraryFilter: Construct a PipLibraryFilter object for use in list_managed_images().

list_managed_images(pip_library_filters: Optional[List[PipLibraryFilter]] = None, page_token: str = '', page_size: int = 100) → Tuple[List[Dict], str]

List all the managed Docker images.

This is where the true power of the managed images feature lies. You can search for images with specific pip libraries installed so that you do not have to create a new managed image every time you need to run a stress test.

Parameters:

pip_library_filters – Optional[List[ListImagesRequest.PipLibraryFilter]] Optional list of pip libraries to filter by. Construct each ListImagesRequest.PipLibraryFilter object with the pip_library_filter convenience method.
page_token – str = “” Identifies the page of results to retrieve; used for paginating the API results. To get the next page of results, pass the second value in the tuple returned by the previous call. Leave empty to retrieve the first page of results.
page_size – int = 100 This is the limit on the size of the page of results. The default value is to return at most 100 managed images.

Returns:

A Tuple[List[Dict], str] of the list of managed images as dicts and the next page token.

Raises:

ValueError – If the request to the ImageRegistry service failed or the list of pip library filters is improperly specified.

Example:

# Filter for an image with catboost1.0.3 and tensorflow installed.
filters = [
    rime_client.pip_library_filter("catboost", "1.0.3"),
    rime_client.pip_library_filter("tensorflow"),
]

# Query for the images.
images, next_page_token = rime_client.list_managed_images(
    pip_library_filters=filters)

# List comprehension to get all the names of the images.
names = [x["name"] for x in images]

list_projects(page_token: str = '', page_size: int = 100) → Tuple[List[RIMEProject], str]

List projects in a paginated form.

Parameters:

page_token – str = “” Identifies the page of results to retrieve; used for paginating the API results. To get the next page of results, pass the second value in the tuple returned by the previous call. Leave empty to retrieve the first page of results.
page_size – int = 100 This is the limit on the size of the page of results. The default value is to return at most 100 projects.

Returns:

A Tuple[List[RIMEProject], str] of the list of projects and the next page token.

Raises:

ValueError – If the request to the ProjectManager service fails.

Example:

# Query for 100 projects.
projects, next_page_token = rime_client.list_projects()

start_stress_test(test_run_config: dict, project_id: Optional[str] = None, custom_image: Optional[CustomImage] = None, rime_managed_image: Optional[str] = None, ram_request_megabytes: Optional[int] = None, cpu_request_millicores: Optional[int] = None, data_type: str = 'tabular') → RIMEStressTestJob

Start a RIME model stress test on the backend’s ModelTesting service.

Parameters:

test_run_config – dict Configuration for the test to be run, which specifies paths to the model and datasets to use for the test.
project_id – Optional[str] Identifier for the project where the resulting test run will be stored. If not specified, the results will be stored in the default project.
custom_image – Optional[CustomImage] Specification of a customized container image to use when running the model test. The image must have all dependencies required by your model. The specification must include a name for the image and, optionally, a pull secret (of type CustomImage.PullSecret) with the name of the Kubernetes pull secret used to access the given image.
rime_managed_image – Optional[str] Name of a managed image to use when running the model test. The image must have all dependencies required by your model. To create new managed images with your desired dependencies, use the client’s create_managed_image() method.
ram_request_megabytes – int Megabytes of RAM requested for the stress test job. The limit is 2x the megabytes requested.
cpu_request_millicores – int Millicores of CPU requested for the stress test job. The limit is 2x the millicores requested.
data_type – str Type of data this stress test is to be run on. Should be one of tabular, nlp, images. Defaults to tabular.

Returns:

A RIMEStressTestJob providing information about the model stress test job.

Raises:

ValueError – If the request to the ModelTest service failed.

Example:

This example will likely not work for you because it requires permissions to a specific S3 bucket. This demonstrates how you might specify such a configuration.

config = {
    "run_name": "Titanic",
    "data_info": {
        "label_col": "Survived",
        "ref_path": "s3://rime-datasets/titanic/titanic_example.csv",
        "eval_path": "s3://rime-datasets/titanic/titanic_example.csv"
    },
    "model_info": {
        "path": "s3://rime-models/titanic_s3_test/titanic_example_model.py"
    }
}

Run the job using the specified config and the default Docker image in the RIME backend. Store the results under project ID foo. Use the RIME Managed Image tensorflow115. This assumes you have already created the Managed Image and waited for it to be ready.

job = rime_client.start_stress_test(
    test_run_config=config,
    project_id="foo",
    rime_managed_image="tensorflow115")

list_stress_test_jobs(status_filters: Optional[List[str]] = None, project_id: Optional[str] = None) → List[RIMEStressTestJob]

Query the backend for a list of jobs filtered by status and project ID.

This is a good way to recover RIMEStressTestJob objects. Note that this only returns jobs from the last two days, because the time-to-live of job objects in the backend is set at two days.

Parameters:

status_filters – Optional[List[str]] = None Filter for selecting jobs by a union of statuses. The following list enumerates all acceptable values. [‘pending’, ‘running’, ‘failing’, ‘succeeded’] If omitted, jobs will not be filtered by status.
project_id – Optional[str] = None Filter for selecting jobs by project ID. If omitted, jobs from all projects will be returned.

Returns:

A list of RIMEStressTestJob objects. These are not guaranteed to be in any sorted order.

Raises:

ValueError – If the provided status_filters array has invalid values. If the request to the ModelTest service failed.

Example:

# Get all pending and succeeded jobs for project 'foo'
jobs = rime_client.list_stress_test_jobs(
    status_filters=['JOB_STATUS_PENDING', 'JOB_STATUS_SUCCEEDED'],
    project_id='foo',
)

create_firewall(name: str, bin_size_seconds: int, test_run_id: str, project_id: str) → RIMEFirewall

Create a Firewall for a given project.

Parameters:

name – str FW name.
bin_size_seconds – int Bin size in seconds. Only supports daily or hourly.
test_run_id – str ID of the stress test run that firewall will be based on.
project_id – str ID of the project this FW belongs to.

Returns:

A RIMEFirewall object.

Raises:

ValueError – If the request to the ModelTest service failed.

Example:

# Create FW based on foo stress test in bar project.
firewall = rime_client.create_firewall(
    "firewall name", 86400, "foo", "bar")
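
Because bin_size_seconds only supports daily or hourly bins, it can help to name the two values rather than pass raw integers. A small sketch: bin_size_seconds_for is a hypothetical helper, and it assumes that hourly and daily map to exactly 3600 and 86400 seconds (the example above uses 86400).

```python
# Seconds per supported bin. 86400 matches the example above; 3600 is an
# assumption about what the backend accepts for hourly bins.
_BIN_SIZES = {"hour": 60 * 60, "day": 24 * 60 * 60}


def bin_size_seconds_for(granularity: str) -> int:
    """Translate 'hour' or 'day' into a bin size in seconds."""
    try:
        return _BIN_SIZES[granularity]
    except KeyError:
        raise ValueError(
            f"Unsupported granularity {granularity!r}; expected 'hour' or 'day'."
        ) from None


daily = bin_size_seconds_for("day")
hourly = bin_size_seconds_for("hour")
print(daily, hourly)
```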

get_firewall(firewall_id: str) → RIMEFirewall

Get a firewall if it exists.

Query the backend for a RIMEFirewall which can be used to perform Firewall operations. If the FW you are trying to fetch does not exist, this will error.

Parameters: firewall_id – ID of the FW instance to fetch.
Returns: A RIMEFirewall object.
Raises: ValueError – If the FW instance does not exist.

Example:

# Get FW foo if it exists.
firewall = rime_client.get_firewall("foo")

get_firewall_for_project(project_id: str) → RIMEFirewall

Get the active Firewall for a project if it exists.

Query the backend for an active RIMEFirewall in a specified project which can be used to perform Firewall operations. If there is no active Firewall for the project, this call will error.

Parameters: project_id – ID of the project which contains a Firewall.
Returns: A RIMEFirewall object.
Raises: ValueError – If the Firewall does not exist.

Example:

# Get FW in foo-project if it exists.
firewall = rime_client.get_firewall_for_project("foo-project")

start_firewall_from_reference(test_run_config: dict, disable_firewall_events: bool = True, project_id: Optional[str] = None, custom_image: Optional[CustomImage] = None, rime_managed_image: Optional[str] = None, ram_request_megabytes: Optional[int] = None, cpu_request_millicores: Optional[int] = None, data_type: str = 'tabular') → RIMEStressTestJob

Start a RIME Firewall from reference on the backend’s ModelTesting service.

This allows you to start an AI Firewall job on the RIME backend. This will run a stress test, create a firewall, and then run firewall tests on your dataset.

Parameters:

test_run_config – dict Configuration for the test to be run, which specifies paths to the model and datasets to use for the test.
project_id – Optional[str] Identifier for the project where the resulting test run will be stored. If not specified, the results will be stored in the default project.
custom_image – Optional[CustomImage] Specification of a customized container image to use when running the model test. The image must have all dependencies required by your model. The specification must include a name for the image and, optionally, a pull secret (of type CustomImage.PullSecret) with the name of the Kubernetes pull secret used to access the given image.
rime_managed_image – Optional[str] Name of a managed image to use when running the model test. The image must have all dependencies required by your model. To create new managed images with your desired dependencies, use the client’s create_managed_image() method.
ram_request_megabytes – int Megabytes of RAM requested for the stress test job. If not specified, defaults to 4000 MB. The limit is 2x the megabytes requested.
cpu_request_millicores – int Millicores of CPU requested for the stress test job. If not specified, defaults to 1500 millicores. The limit is 2x the millicores requested.
data_type – str Type of data this firewall test is to be run on. Should be one of tabular, nlp, images. Defaults to tabular.

Returns:

A RIMEStressTestJob providing information about the model stress test job.

Raises:

ValueError – If the request to the ModelTest service failed.

Example:

# This example will likely not work for you because it requires
# permissions to a specific S3 bucket.
# This demonstrates how you might specify such a configuration.
config_from_reference = {
    "run_name": "Five Day Fraud Detection",
    "data_info": {
        "label_col": "is_fraud",
        "pred_col": "is_fraud_preds",
        "ref_path": "s3://rime-datasets/fraud_continuous_testing/ref.csv",
        "eval_path": "s3://rime-datasets/fraud_continuous_testing/eval_2021_04_01_to_2021_04_06.csv"
    },
    "monitoring_info": {
        "timestamp_col": "timestamp",
        "bin_size": "day"
    },
}
# Run the job using the specified config and the default Docker image in
# the RIME backend. Store the results under project ID ``foo``.
# Use the RIME Managed Image ``tensorflow115``.
# This assumes you have already created the Managed Image and waited for
# it to be ready.
job = rime_client.start_firewall_from_reference(
    test_run_config=config_from_reference,
    project_id="foo",
    rime_managed_image="tensorflow115",
    ram_request_megabytes=8000,
    cpu_request_millicores=2000)

upload_dataset_file(file_path: Union[Path, str]) → str

Upload a dataset file to make it accessible to RIME’s backend.

The uploaded file is stored with RIME’s backend in a blob store using its file name.

Parameters:

file_path – Union[Path, str] Path to the file to be uploaded to RIME’s blob store.

Returns:

A reference to the uploaded file’s location in the blob store. This reference can be used to refer to that object when writing RIME configs. Please store this reference for future access to the file.

Raises:

FileNotFoundError – If the path file_path does not exist.
IOError – If file_path is not a file.
ValueError – If there was an error in obtaining a blobstore location from the RIME backend or in uploading file_path to RIME’s blob store. In the scenario the file fails to upload, the incomplete file will NOT automatically be deleted.

upload_model_directory(dir_path: Union[Path, str], upload_hidden: bool = False) → str

Upload a model directory to make it accessible to RIME’s backend.

The uploaded directory is stored within RIME’s backend in a blob store. All files contained within dir_path and its subdirectories are uploaded according to their relative paths within dir_path. However, if upload_hidden is False, all hidden files and subdirectories beginning with a ‘.’ are not uploaded.

Parameters:

dir_path – Union[Path, str] Path to the directory to be uploaded to RIME’s blob store.
upload_hidden – bool = False Whether or not to upload hidden files or subdirectories (i.e., those beginning with a ‘.’).

Returns:

A reference to the uploaded directory’s location in the blob store. This reference can be used to refer to that object when writing RIME configs. Please store this reference for future access to the directory.

Raises:

FileNotFoundError – If the directory dir_path does not exist.
IOError – If dir_path is not a directory or contains no files.
ValueError – If there was an error in obtaining a blobstore location from the RIME backend or in uploading dir_path to RIME’s blob store. In the scenario the directory fails to upload, files will NOT automatically be deleted.
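
The hidden-file rule can be previewed locally before uploading. The sketch below approximates the documented behavior using only the standard library. It is not the SDK's implementation, files_to_upload is a hypothetical name, and raising IOError when filtering leaves nothing to upload is one interpretation of "contains no files".

```python
import tempfile
from pathlib import Path
from typing import List, Union


def files_to_upload(
    dir_path: Union[Path, str], upload_hidden: bool = False
) -> List[Path]:
    """List files under dir_path as paths relative to dir_path."""
    root = Path(dir_path)
    if not root.exists():
        raise FileNotFoundError(f"{root} does not exist")
    if not root.is_dir():
        raise IOError(f"{root} is not a directory")
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Skip hidden files and anything inside a hidden subdirectory.
        if not upload_hidden and any(part.startswith(".") for part in rel.parts):
            continue
        selected.append(rel)
    if not selected:
        raise IOError(f"{root} contains no files to upload")
    return selected


# Build a throwaway model directory with one hidden subdirectory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "model.py").write_text("# model")
    (root / ".git").mkdir()
    (root / ".git" / "config").write_text("")
    (root / "weights").mkdir()
    (root / "weights" / "w.bin").write_text("")
    visible = files_to_upload(root)
    everything = files_to_upload(root, upload_hidden=True)

print(visible)
print(len(everything))
```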

class rime_sdk.RIMEImageBuilder(backend: RIMEBackend, name: str, requirements: Optional[List[PipRequirement]] = None)

An interface to a RIME image builder.

get_status(verbose: bool = False, wait_until_finish: bool = False, poll_rate_sec: float = 5.0) → Dict

Query the ImageRegistry service for the image’s build status.

This query includes an option to wait until the image build is finished. It will either have succeeded or failed.

Parameters:

verbose – bool whether or not to print diagnostic information such as logs.
wait_until_finish – bool whether or not to block until the image is READY or FAILED.
poll_rate_sec – float the frequency with which to poll the image’s build status.

Returns:

A dictionary representing the image’s state.

class rime_sdk.RIMEStressTestJob(backend: RIMEBackend, job_id: str)

An interface to a RIME stress testing job.

This object provides an interface for monitoring the status of a stress test job in the RIME backend.

get_link() → str

Get the web app URL for a successful stress test job.

This link directs to your organization’s deployment of RIME. You can view more detailed information about the results of your stress test in the web app, including helpful visualizations, key insights, and explanations of test results.

Note: this is a string that should be copy-pasted into a browser.

get_status(verbose: bool = False, wait_until_finish: bool = False, poll_rate_sec: float = 5.0) → Dict

Query the ModelTest service for the job’s status.

This includes flags for blocking until the job is complete and printing information to stdout. This method can help with monitoring the progress of stress test jobs, because it prints out helpful information such as running time and the progress of the test run.

Parameters:

verbose – bool whether or not to print diagnostic information such as logs. If this flag is enabled and the job has failed, the logs of the testing engine will be dumped to stdout to help with debuggability. Note that these logs have no strict form and will be subject to significant change in future versions.
wait_until_finish – bool whether or not to block until the job has succeeded or failed. If verbose is enabled too, information about the job including running time and progress will be printed to stdout every poll_rate_sec.
poll_rate_sec – float the frequency with which to poll the job’s status. Units are in seconds.

Returns:

A dictionary representing the job’s state.

{
"id": str
"type": str
"status": str
"start_time_secs": int64
"running_time_secs": double
}

Example:

# Block until this job is finished and dump monitoring info to stdout.
job_status = job.get_status(verbose=True, wait_until_finish=True)
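
The returned dictionary can be condensed into a single log line. A sketch using only the keys listed above: the sample values, including the status and type strings, are made up for illustration, summarize_status is a hypothetical helper, and start_time_secs is assumed to be a Unix timestamp.

```python
from datetime import datetime, timezone


def summarize_status(job_status: dict) -> str:
    """Format the documented status fields as a single line."""
    # Assumption: start_time_secs counts seconds since the Unix epoch.
    started = datetime.fromtimestamp(job_status["start_time_secs"], tz=timezone.utc)
    return (
        f"job {job_status['id']} ({job_status['type']}): {job_status['status']}, "
        f"started {started:%Y-%m-%d %H:%M:%S} UTC, "
        f"running for {job_status['running_time_secs']:.1f}s"
    )


# Made-up sample shaped like the dictionary above.
sample = {
    "id": "job-123",
    "type": "example-type",
    "status": "example-status",
    "start_time_secs": 0,
    "running_time_secs": 12.345,
}
summary = summarize_status(sample)
print(summary)
```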

get_test_cases_result(version: Optional[str] = None) → DataFrame

Retrieve all the test cases for a completed stress test run in a dataframe.

This gives you the ability to perform granular queries on test cases. For example, if you only care about subset performance tests and want to see the results on each feature, you can fetch all the test cases in a dataframe, then query on that dataframe by test type. This only works on stress test jobs that have succeeded.

Note: this will not work for test runs run on RIME versions <0.14.0.

Parameters: version – Optional[str] = None Semantic version of the results to be returned. This allows users to pin the version of the results, which is helpful if you write any code on top of RIME data. If you upgrade the SDK and do not pin the version in your code, it may break because the output is not guaranteed to be stable across versions. The latest output will be returned by default.
Returns: A pandas.DataFrame object containing the test case results. Here is a selected list of columns in the output:
1. test_run_id: ID of the parent test run.
2. features: List of features that the test case ran on.
3. test_batch_type: Type of test that was run (e.g. Subset AUC, Must be Int, etc.).
4. status: Status of the test case (e.g. Pass, Fail, Skip, etc.).
5. severity: Metric that denotes the severity of the failure of the test.

Example:

# Wait until the job has finished, since this method only works on
# SUCCEEDED jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Dump the test cases in dataframe ``df``.
# Pin the version to RIME version 0.14.0.
df = job.get_test_cases_result(version="0.14.0")
# Print out the column names and types.
print(df.columns)
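
Querying by test type can be sketched with plain pandas. The frame below is a hand-made stand-in that uses column names from the list above; the row values are illustrative, not real RIME output.

```python
import pandas as pd

# Stand-in for job.get_test_cases_result(); values are illustrative only.
df = pd.DataFrame(
    {
        "test_run_id": ["run-1", "run-1", "run-1"],
        "features": [["age"], ["fare"], ["age"]],
        "test_batch_type": ["Subset AUC", "Subset AUC", "Must be Int"],
        "status": ["Fail", "Pass", "Pass"],
    }
)

# Keep only failing test cases of one test type.
failing_subset = df[
    (df["test_batch_type"] == "Subset AUC") & (df["status"] == "Fail")
]
print(failing_subset[["features", "status"]])

# Count test cases per test type and status.
counts = df.groupby(["test_batch_type", "status"]).size()
print(counts)
```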

get_test_run_result(version: Optional[str] = None) → DataFrame

Retrieve high level summary information for a complete stress test run in a single-row dataframe.

This dataframe includes information such as model metrics on the reference and evaluation datasets, overall RIME results such as severity across tests, and high level metadata such as the project ID and model task.

By concatenating these rows together, you can build a table of test run results for the sake of comparison. This only works on stress test jobs that have succeeded.

Note: this will not work for test runs run on RIME versions <0.14.0.

Parameters: version – Optional[str] = None Semantic version of the results to be returned. This allows users to pin the version of the results, which is helpful if you write any code on top of RIME data. If you upgrade the SDK and do not pin the version in your code, it may break because the output is not guaranteed to be stable across versions. The latest output will be returned by default.
Returns: A pandas.DataFrame object containing the test run result. There are a lot of columns, so it is worth viewing them with the .columns attribute to see what they are. Generally, these columns have information about the model and datasets as well as summary statistics like the number of failing test cases or number of high severity test cases.

Example:

# Wait until the job has finished, since this method only works on
# succeeded jobs.
job.get_status(verbose=True, wait_until_finish=True)
# Dump the test run result in dataframe ``df``.
# Pin the version to RIME version 0.14.0.
df = job.get_test_run_result(version="0.14.0")
# Print out the column names and types.
print(df.columns)
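
Building the comparison table amounts to a pandas.concat over the single-row frames. In the sketch below the frames and their column names (run_name, num_failing) are invented placeholders; inspect .columns on a real result for the actual names.

```python
import pandas as pd

# Placeholders for the single-row frames returned by get_test_run_result()
# on two different jobs. Column names here are hypothetical.
run_a = pd.DataFrame([{"run_name": "baseline", "num_failing": 12}])
run_b = pd.DataFrame([{"run_name": "retrained", "num_failing": 7}])

# Stack the rows into one table, one row per test run.
comparison = pd.concat([run_a, run_b], ignore_index=True)
print(comparison)
```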

class rime_sdk.RIMEProject(project_id: str, name: str, description: str)

This object describes a project in the RIME backend.

project_id: str

How to refer to the project in the backend.

Use this attribute to specify the project for the backend in start_stress_test() and list_stress_test_jobs().

name: str: Name of the project.

description: str: Description of the project