Custom Metrics
This feature allows you to configure custom metrics that may be used in model performance, subset performance,
and subset performance degradation tests.
To use the custom metrics functionality, you first need to create a Python file that contains the function used to calculate the custom metric. This function must be named custom_metric_func, and it is expected to expose the following interface:
from typing import Optional

import numpy as np
import pandas as pd


def custom_metric_func(
    df: Optional[pd.DataFrame],
    labels: Optional[pd.Series],
    preds: Optional[np.ndarray],
    preds_index: Optional[pd.Index],
    query_ids: Optional[pd.Series],
) -> Optional[float]:
    """Return a custom metric.

    Args:
        df: Optional pd.DataFrame of data, will be the full data.
        labels: Optional pd.Series of labels, if exists will have the same
            length as df.
        preds: Optional np.ndarray of predictions, if exists may be a subset
            of the full df.
        preds_index: Optional pd.Index of predictions, if exists will have the
            same length as `preds`. Can be used to align the predictions with
            rows in the df.
        query_ids: Optional pd.Series of query ids, if exists will have the
            same length as `df`.

    Returns:
        An optional float corresponding to the metric to be tracked.
    """
The arguments passed to this function depend on your configured data and task. The function should return None whenever the provided arguments are insufficient, such as when labels are required but not available.
NOTE: for unstructured data, the dataframe will contain information about user-defined metadata for each datapoint and any profiled attributes. For tabular data, the dataframe will contain the raw features.
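For example, a sketch of a simple accuracy metric for a binary classification task might look like the following. The 0.5 threshold and the use of labels.loc[preds_index] for alignment are illustrative assumptions, not requirements of the interface:

from typing import Optional

import numpy as np
import pandas as pd


def custom_metric_func(
    df: Optional[pd.DataFrame],
    labels: Optional[pd.Series],
    preds: Optional[np.ndarray],
    preds_index: Optional[pd.Index],
    query_ids: Optional[pd.Series],
) -> Optional[float]:
    """Illustrative accuracy metric: fraction of correct binary predictions."""
    # This metric needs labels and aligned predictions; return None if any are missing.
    if labels is None or preds is None or preds_index is None:
        return None
    # Align labels to the rows that actually received predictions.
    aligned_labels = labels.loc[preds_index].to_numpy()
    # Threshold scores at 0.5 to obtain hard predictions (assumes 1-D scores in [0, 1]).
    hard_preds = (np.asarray(preds).ravel() > 0.5).astype(int)
    return float((hard_preds == aligned_labels).mean())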
After defining this function, you then need to specify it in the model_profiling section of the JSON configuration object. Model Profiling discusses this object in more detail.
{
  "model_profiling": {
    "custom_metrics": [
      {
        "name": "NAME_OF_METRIC",
        "file_path": "path/to/file.py",
        "range_lower_bound": 0,
        "range_upper_bound": 1,
        "run_subset_performance": true,
        "run_subset_performance_drift": false,
        "run_overall_performance": true,
        "metadata": {
          "short_description": "",
          "starter_description": "",
          "why_it_matters_description": "",
          "configuration_description": "",
          "example_description": ""
        }
      }
    ]
  }
}
name and file_path are required parameters. Replace NAME_OF_METRIC with the name of the metric (this is how it will show up in the UI). Replace path/to/file.py with the path to the Python file you created in the above step. All other parameters are optional.
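For instance, a minimal entry that relies on the defaults for everything else could look like this (the metric name and file path are placeholders):

{
  "model_profiling": {
    "custom_metrics": [
      {
        "name": "Binary Accuracy",
        "file_path": "metrics/binary_accuracy.py"
      }
    ]
  }
}

Each optional parameter is described below.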
range_lower_bound: float, default = 0. The lower bound of the valid range for the metric.
range_upper_bound: float, default = 1. The upper bound of the valid range for the metric.
run_subset_performance: bool, default = True. Whether the metric should be included in the subset performance category.
run_subset_performance_drift: bool, default = False. Whether the metric should be included in the subset performance drift category.
run_overall_performance: bool, default = True. Whether the metric should be included in the model performance category.
metadata: dict. Descriptions of the metric, composed of the components listed below.
    short_description: string, a short description of the metric.
    starter_description: string, contents of the “Description” tab within the “More” pop-up.
    why_it_matters_description: string, contents of the “Why It Matters” tab within the “More” pop-up.
    configuration_description: string, contents of the “Configuration” tab within the “More” pop-up.
    example_description: string, contents of the “Example” tab within the “More” pop-up.
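Putting these together, a fully specified entry for a hypothetical error metric whose valid range is not [0, 1] might look like the following; the metric name, file path, bounds, flags, and descriptions are all placeholders rather than required values:

{
  "model_profiling": {
    "custom_metrics": [
      {
        "name": "Mean Absolute Error",
        "file_path": "metrics/mean_absolute_error.py",
        "range_lower_bound": 0,
        "range_upper_bound": 100,
        "run_subset_performance": true,
        "run_subset_performance_drift": true,
        "run_overall_performance": true,
        "metadata": {
          "short_description": "Average absolute difference between predictions and labels.",
          "starter_description": "Mean absolute error over all scored rows.",
          "why_it_matters_description": "Large errors on a subset can reveal underperforming segments.",
          "configuration_description": "Requires labels and predictions to be available.",
          "example_description": "An MAE of 5 means predictions are off by 5 units on average."
        }
      }
    ]
  }
}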