Tabular Data/Model Setup

Python package exposing core functionality of RIME.

The RIME library offers data scientists and developers a convenient way to run the full suite of RIME data and model tests directly within their Python environment.

class rime.tabular.ModelTask(value): Enum representing the different model tasks.

class rime.tabular.DataContainer(data_profile: DataProfile, df: Optional[DataFrame] = None, labels: Optional[Series] = None, preds: Optional[PredictionContainer] = None, query_ids: Optional[Series] = None, pred_profile: Optional[PredictionProfile] = None, label_profile: Optional[LabelProfile] = None, subset_profile: Optional[SubsetPerformanceProfile] = None, embedding_profiles: Optional[List[EmbeddingProfile]] = None, timestamps: Optional[Series] = None, label_frac_above_threshold: bool = True)

Class to store all data artifacts for a given dataset.

classmethod from_df(df: DataFrame, model_task: ModelTask, metrics: List[Metric], labels: Optional[Series] = None, preds: Optional[ndarray] = None, query_ids: Optional[Series] = None, ref_data: Optional[DataContainer] = None, profiling_config: Optional[ProfilingConfig] = None) → DataContainer

Load data container from df.

Parameters

df (pd.DataFrame) – Input dataframe.
model_task (ModelTask) – Model task.
metrics (List[Metric]) – List of metrics.
labels (Optional[pd.Series]) – Label vector.
preds (Optional[np.ndarray]) – Prediction array.
query_ids (Optional[pd.Series]) – Query IDs (ranking tasks only).
ref_data (Optional[rime.tabular.DataContainer]) – Reference data container (used for creating evaluation DataContainer).
profiling_config (Optional[ProfilingConfig]) – Profiling config.
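
A minimal sketch of how a reference and an evaluation container might be built with from_df, based only on the signature above. The helper name build_data_containers, the label_col argument, and the choice to keep the label column out of the feature frame are assumptions made for illustration. The metrics list is taken from the caller because this section does not describe how Metric objects are constructed. RIME is imported lazily, so the helper can be defined without the package installed.

```python
from typing import Any, List, Optional, Tuple


def build_data_containers(
    ref_df: Any,
    eval_df: Any,
    label_col: str,
    metrics: List[Any],
    ref_preds: Optional[Any] = None,
    eval_preds: Optional[Any] = None,
) -> Tuple[Any, Any]:
    """Build reference and evaluation containers (hypothetical helper)."""
    # Imported lazily: calling this function requires RIME to be installed.
    from rime.tabular import DataContainer, ModelTask

    task = ModelTask.BINARY_CLASSIFICATION

    # Reference container: no ref_data is passed.
    ref_data = DataContainer.from_df(
        ref_df.drop(columns=[label_col]),  # assumption: labels kept out of df
        task,
        metrics,
        labels=ref_df[label_col],
        preds=ref_preds,
    )
    # Evaluation container: the reference container is passed as ref_data.
    eval_data = DataContainer.from_df(
        eval_df.drop(columns=[label_col]),
        task,
        metrics,
        labels=eval_df[label_col],
        preds=eval_preds,
        ref_data=ref_data,
    )
    return ref_data, eval_data
```

Both input frames are expected to be pandas DataFrames that share the same columns, including label_col.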

classmethod from_data(parsed_data: ParsedData, model_task: ModelTask, profiling_config: ProfilingConfig, metrics: List[Metric], ref_data: Optional[DataContainer] = None, top_k_feats: Optional[List[str]] = None) → DataContainer

Load data container from parsed data.

Parameters

parsed_data (ParsedData) – Tuple of required information.
model_task (ModelTask) – Model task.
profiling_config (ProfilingConfig) – Profiling config object.
ref_data (Optional[rime.tabular.DataContainer]) – Reference data container (used for creating evaluation DataContainer).
top_k_feats (Optional[List[str]]) – List of top-k features, used if smart sampling is enabled.
metrics (List[Metric]) – List of metrics to use.

property query_ids: Series: Safely access the query IDs.

property timestamps: Series: Safely access the timestamps.

property is_query_ids_none: bool: Notifies user if this has no query_ids.

property label_frac_above_threshold: bool: Check if fraction of non-null labels is above the configured threshold.

class rime.tabular.TabularRunContainer(ref_data: DataContainer, eval_data: DataContainer, model: Optional[TabularBlackBoxModel] = None, model_profile: Optional[ModelProfile] = None)

Class to store the testing state when a model is provided.

get_failing_row_details(index: int) → List[Detail]: Get failing row details.

classmethod from_model(ref_data: DataContainer, test_data: DataContainer, model: TabularBlackBoxModel, metrics: List[Metric]) → TabularRunContainer: Make predictions with model if they do not already exist.

classmethod from_predict_dict_function(ref_data: DataContainer, test_data: DataContainer, predict_dict_func: Callable, model_task: ModelTask, metrics: List[Metric]) → TabularRunContainer

Load model from predict_dict function.

Parameters

ref_data (rime.tabular.DataContainer) – Reference DataContainer.
test_data (rime.tabular.DataContainer) – Test DataContainer.
predict_dict_func (Callable) – predict_dict function to implement.
model_task (rime.tabular.ModelTask) – Specified model task (one of ModelTask.BINARY_CLASSIFICATION or ModelTask.REGRESSION).
metrics (List[Metric]) – List of metrics.
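
The parameter list above does not spell out the callable's signature. The following is a minimal sketch, assuming predict_dict_func receives one row as a dict mapping feature names to values and returns a single score (here, a probability for binary classification). The feature names age and income and the weights are made up for illustration.

```python
import math
from typing import Any, Dict


def predict_dict(row: Dict[str, Any]) -> float:
    """Return a probability in (0, 1) for one row (toy logistic model)."""
    # Hypothetical features and weights; missing or None values fall back to 0.
    age = float(row.get("age") or 0.0)
    income = float(row.get("income") or 0.0)
    logit = 0.03 * age + 0.00002 * income - 2.0
    return 1.0 / (1.0 + math.exp(-logit))


score = predict_dict({"age": 40, "income": 50_000})
```

A function of this shape would be passed as predict_dict_func, together with the two data containers, the model task, and the metrics.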

classmethod from_predict_df_function(ref_data: DataContainer, test_data: DataContainer, predict_df_func: Callable, model_task: ModelTask, metrics: List[Metric]) → TabularRunContainer: Load model from predict_df function.

property protected_features: List[str]: Return protected feature column names.

property features_not_in_model: List[str]: Return feature names not included in model.

property protected_feature_pairs: List[str]: Return protected feature pairs if they exist.

property custom_intersection_names: List[str]: Return custom intersection column names if they exist.

property custom_intersections: List[Column]: Return custom intersection columns if they exist.

property common_profiled_columns: List[Column]

Return Common Profiled Columns between Ref and Test Data.

Removes protected feature pair columns from the list of profiled columns, because these are only used in a subset of tests.

property common_profiled_columns_in_model: List[Column]

Return Common Profiled Columns that are used in the model.

Removes any columns not in the model from the list of profiled columns, as these should not be included in model-specific tests (like attacks).

property same_type_columns: List[str]: Return columns that were profiled as the same Column type in ref and eval.

property model_input_column_names: List[str]

Return columns used by the model.

Removes protected feature pairs, because these are only used in a subset of tests.

property profiled_all_cols: bool: Return whether smart sampling was run.

property summary_metric_list: List[Type[Metric]]: Return summary metrics based on model task.

get_run_summary_metrics(use_display_name: bool = False) → List[Dict]: Return model performance metric information.

get_comparison_cols(custom_metrics: Optional[Dict[str, ModelPerformance]] = None) → Dict[str, ModelPerformance]: Get test run comparison columns.

scale_metric_thresholds(metric_cls: Type[Metric], thresholds: Tuple[float, float, float]) → Tuple[float, float, float]: Scale the default severity thresholds for a particular metric.

property metrics: List[Metric]: Return list of metrics in run container.

NLP Data/Model Setup

Python package exposing core NLP functionality for RIME.

The RIME NLP library provides tooling for the RIME testing suite over a number of natural language tasks.

class rime.nlp.Task(value): Enumeration of supported NLP tasks.

class rime.nlp.DataContainer(data: List[dict], preds: List[dict], preds_index: List[int], data_profile: DataProfile, perf_summary: Dict[Type[Metric], float], subsets_profiles: Dict[str, SubsetsInfo], contains_labels: bool, embeddings: Optional[List[EmbeddingInfo]] = None)

Class to store all data artifacts for a given dataset.

Parameters

data – Input data. Each item in the list is a single datapoint containing the model input, optional label(s), and other task-related metadata.
preds – Model predictions.
preds_index – Indices mapping the (possibly sampled) model predictions to the input data: preds[i] = model.predict_dict(data[preds_index[i]]).
data_profile – NLP task-specific profile of the data generated by the RIME profiler.
perf_summary – Performance summary generated by the NLP task’s model evaluator.
subsets_profiles – Performance metrics for each feature subset.
contains_labels – Whether labels are present for all data points.
embeddings – Information regarding embeddings in the data.
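
The relationship between data, preds, and preds_index described above can be illustrated in plain Python. This is a sketch only: the text key and the toy predict_dict function are assumptions and are not part of RIME.

```python
from typing import Dict, List

data: List[Dict] = [
    {"text": "great product"},
    {"text": "terrible service"},
    {"text": "okay overall"},
    {"text": "great value"},
]


def predict_dict(datapoint: Dict) -> Dict:
    """Toy stand-in for a model: flags texts containing 'great'."""
    return {"positive": "great" in datapoint["text"]}


# Predictions were computed for a sample of the data only (rows 0 and 3).
preds_index: List[int] = [0, 3]
preds: List[Dict] = [predict_dict(data[i]) for i in preds_index]

# preds[i] lines up with data[preds_index[i]], not with data[i].
aligned = [(data[idx]["text"], pred) for idx, pred in zip(preds_index, preds)]
```

When predictions exist for every datapoint, preds_index is simply 0, 1, ..., len(data) - 1.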

property is_preds_none: bool: Notifies user if this has no preds.

property num_instances: int: Return the length of the dataset.

get_feature_profile(feature_name: str) → dict: Return the histogram profile for the given feature.

contains_feature_profile(feature_name: str) → bool: Return whether a profile exists for the given feature.

get_custom_feature(feature_name: str) → dict: Return custom feature.

contains_custom_feature(feature_name: str) → bool: Return whether custom feature is present.

preds_with_labels_at_indices(indices: Collection[int], return_difference: bool = True) → Tuple[List[dict], List[dict], Optional[List[dict]], Optional[List[dict]]]: Return preds and labels at the index intersection.
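
The exact semantics of this method are not documented here. The sketch below is one possible reading, not RIME's implementation: the "intersection" is taken to be the predicted rows whose data index appears in indices, and the "difference" is taken to be the remaining predicted rows. The standalone function name preds_with_labels_at and the label key are hypothetical.

```python
from typing import Collection, Dict, List, Optional, Tuple


def preds_with_labels_at(
    data: List[Dict],
    preds: List[Dict],
    preds_index: List[int],
    indices: Collection[int],
    return_difference: bool = True,
) -> Tuple[List[Dict], List[Dict], Optional[List[Dict]], Optional[List[Dict]]]:
    """Split predicted rows into those inside and outside `indices`."""
    wanted = set(indices)
    in_preds: List[Dict] = []
    in_labels: List[Dict] = []
    out_preds: List[Dict] = []
    out_labels: List[Dict] = []
    for pred, idx in zip(preds, preds_index):
        label = {"label": data[idx]["label"]}  # assumed label key
        if idx in wanted:
            in_preds.append(pred)
            in_labels.append(label)
        else:
            out_preds.append(pred)
            out_labels.append(label)
    if not return_difference:
        return in_preds, in_labels, None, None
    return in_preds, in_labels, out_preds, out_labels


data = [{"label": 1}, {"label": 0}, {"label": 1}]
preds = [{"score": 0.9}, {"score": 0.4}]
preds_index = [0, 2]  # row 1 has no prediction
result = preds_with_labels_at(data, preds, preds_index, indices=[1, 2])
```

Row 1 is requested but has no prediction, so only row 2 lands in the intersection, and row 0 lands in the difference.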

get_tabular_data_container(task: UnstructuredTask, profiling_config: ProfilingConfig, model_evaluator: ModelEvaluator, ref_data: Optional[DataContainer] = None) → DataContainer: Get tabular data container.

get(raw_data: dict) → dict: Get datapoint.

classmethod from_data(data: List[dict], task: UnstructuredTask, preds_index: List[int], preds: List[dict], data_profiling_info: DataProfilingInfo, subset_profiling_info: SubsetProfilingInfo, model_evaluator: ModelEvaluator, ref_data: Optional[DataContainer] = None, embeddings: Optional[List[EmbeddingInfo]] = None, data_kwargs: Optional[dict] = None) → UDC: Construct the data container from the loaded data.

class rime.nlp.ModelEvaluator(metrics: List[Metric])

Abstract base class for the model evaluator.

property metric_class_to_metric: Mapping[Type[Metric], Metric]: Return mapping of metric class to metric.

abstract classmethod avg_pred(data: List[dict], preds: List[dict]) → Optional[Pred_T]: Return the average prediction vector.

abstract static get_avg_pred_diff(base_avg_preds: Pred_T, avg_preds: Pred_T) → float: Return the floating point difference in observed values.

classmethod get_default_metrics() → List[Type[Metric]]: Return the default metrics.

compute_labeled_impact(data: List[dict], preds: List[dict]) → Optional[float]: Return the impact metric performance.

performance(labels: List[dict], preds: List[dict], metrics: Optional[List[Type[Metric]]] = None) → Dict[Type[Metric], float]: Return specified metrics for a set of entity model outputs and labels.

get_default_metric_for_feature(feature: Feature) → Metric: By default, use the impact metric.

class rime.nlp.RunContainer(ref_data: UDC, eval_data: UDC, task: UMT, model: Optional[SingleTaskModel], model_evaluator: ModelEvaluator, data_profiling_info: Optional[DataProfilingInfo] = None, model_profiling_info: Optional[BaseModelProfilingInfo] = None, **kwargs: Any)

Base NLP run container class.

get_failing_row_details(index: int) → List[Detail]: Get failing row details.

static get_tabular_model_profile(task: Task, tabular_eval_data: DataContainer, metrics: List[Metric], tabular_profiling_config: ProfilingConfig) → Optional[ModelProfile]: Get tabular model profile.

class rime.nlp.SingleTaskModel(task: BaseTask, predict_dict: Callable[[dict], dict], **kwargs: Any): Base class for a model that performs a single NLP task.
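
The signature above only states that predict_dict maps a dict to a dict. The following is a minimal sketch of such a callable for a sentiment-style task, assuming the input carries a text key and the output carries a probabilities list. Both key names and the word list are assumptions made for illustration.

```python
from typing import Dict, List

POSITIVE_WORDS = {"good", "great", "excellent"}


def predict_dict(datapoint: Dict) -> Dict:
    """Toy sentiment model: score is the share of positive words in the text."""
    tokens: List[str] = datapoint["text"].lower().split()
    hits = sum(token in POSITIVE_WORDS for token in tokens)
    p_positive = hits / len(tokens) if tokens else 0.0
    # Order is [negative, positive]; the two entries sum to 1.
    return {"probabilities": [1.0 - p_positive, p_positive]}


output = predict_dict({"text": "Great phone with a good camera"})
```

A callable of this shape would be passed as the predict_dict argument, alongside a task object whose construction is not covered in this section.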