Updating your Continuous Test
In this notebook walkthrough, we show how to update an AI Firewall after it has been deployed to production. The Firewall can be updated live to account for many service changes, such as swapping in a new reference dataset, upgrading the model, or configuring individual tests.
Latest Colab version of this notebook available here
Install dependencies
[ ]:
!pip install rime-sdk &> /dev/null
!pip install https://github.com/RobustIntelligence/ri-public-examples/archive/master.zip
[ ]:
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import List
import pandas as pd
from ri_public_examples.download_files import download_files
from rime_sdk import Client
Download and prep data
[ ]:
download_files('tabular-2.0/fraud', 'fraud')
ct_data = pd.read_csv("fraud/data/fraud_incremental.csv")
ct_data[:len(ct_data)//2].to_csv("fraud/data/fraud_incremental_0.csv", index=False)
ct_data[len(ct_data)//2:].to_csv("fraud/data/fraud_incremental_1.csv", index=False)
ct_preds = pd.read_csv("fraud/data/fraud_incremental_preds.csv")
ct_preds[:len(ct_preds)//2].to_csv("fraud/data/fraud_incremental_0_preds.csv", index=False)
ct_preds[len(ct_preds)//2:].to_csv("fraud/data/fraud_incremental_1_preds.csv", index=False)
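The slicing above splits both the incremental data and its predictions into two equal, order-preserving batches. A minimal sketch of the same idea as a reusable helper (the `split_halves` name is hypothetical, not part of the walkthrough):

```python
import pandas as pd

def split_halves(df: pd.DataFrame):
    """Split a DataFrame into two contiguous halves, preserving row order."""
    mid = len(df) // 2
    return df.iloc[:mid], df.iloc[mid:]

# Small frame standing in for the fraud data
frame = pd.DataFrame({"value": range(10)})
first, second = split_halves(frame)
print(len(first), len(second))  # 5 5
```

Keeping the rows contiguous matters here because the incremental data is ordered by timestamp, and each half is meant to simulate a later batch of production traffic.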
Instantiate RIME client and create project
[ ]:
API_TOKEN = '' # PASTE API_KEY
CLUSTER_URL = '' # PASTE DEDICATED DOMAIN OF RIME SERVICE (eg: rime.stable.rbst.io)
[ ]:
client = Client(CLUSTER_URL, API_TOKEN)
[ ]:
description = (
    "Create an AI Firewall and update the configuration after it is deployed to production."
    " Demonstration uses a tabular binary classification dataset"
    " and model that simulates credit card fraud detection."
)
project = client.create_project(
    "AI Firewall Configuration Demo",
    description,
    "MODEL_TASK_BINARY_CLASSIFICATION"
)
Upload data to S3 and register dataset and prediction set
[ ]:
from datetime import datetime
dt = str(datetime.now())
# Note: models and datasets need to have unique names.
model_id = project.register_model(f"fraud_model_{dt}", None)
[ ]:
upload_path = "ri_public_examples_fraud"

def upload_and_register_data(dataset_name, **kwargs):
    # Dataset names must be unique, so append a timestamp.
    dt = str(datetime.now())
    s3_path = client.upload_file(
        Path(f"fraud/data/fraud_{dataset_name}.csv"), upload_path=upload_path
    )
    preds_s3_path = client.upload_file(
        Path(f"fraud/data/fraud_{dataset_name}_preds.csv"), upload_path=upload_path
    )
    dataset_id = project.register_dataset_from_file(
        f"{dataset_name}_dataset_{dt}", s3_path, data_params={"label_col": "label", **kwargs}
    )
    project.register_predictions_from_file(dataset_id, model_id, preds_s3_path)
    return dataset_id

ref_data_id = upload_and_register_data("ref")
Create a Firewall
[ ]:
from datetime import timedelta
firewall = project.create_firewall(model_id, ref_data_id, timedelta(days=1))
firewall
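The `timedelta(days=1)` argument is the firewall's bin size: continuous test results are aggregated per one-day window of the data's timestamp column. The binning itself happens server-side, but a rough pandas sketch of how rows fall into daily bins (toy data, purely illustrative):

```python
import pandas as pd

# Toy timestamps spanning three days
ts = pd.to_datetime([
    "2023-01-01 03:00", "2023-01-01 18:00",
    "2023-01-02 09:00", "2023-01-03 12:00",
])
df = pd.DataFrame({"timestamp": ts, "amount": [10, 20, 30, 40]})

# Group into 1-day bins, analogous to the firewall's bin size
bins = df.groupby(pd.Grouper(key="timestamp", freq="1D")).size()
print(bins.tolist())  # [2, 1, 1]
```

A smaller bin size gives finer-grained monitoring at the cost of fewer samples per bin.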
Run Continuous Testing on a batch of production data
[ ]:
ct_data_0_id = upload_and_register_data("incremental_0", timestamp_col="timestamp")
ct_job = firewall.start_continuous_test(ct_data_0_id)
ct_job.get_status(verbose=True, wait_until_finish=True)
Update the Reference Dataset
Suppose a week has passed and we have retrained our model on new data. We want to update the deployed Firewall to use the new reference dataset.
[ ]:
new_ref_data_id = upload_and_register_data("eval")
# Point the firewall at the new reference dataset
firewall.update_firewall(ref_data_id=new_ref_data_id)
# Subsequent continuous test runs will be compared against the new baseline
project
Run Continuous Testing on the latest batch of production data
This time we use the updated reference dataset as the baseline against which the production data is compared.
[ ]:
ct_data_1_id = upload_and_register_data("incremental_1", timestamp_col="timestamp")
ct_job = firewall.start_continuous_test(ct_data_1_id, override_existing_bins=True)
ct_job.get_status(verbose=True, wait_until_finish=True)