Custom Generic Tests

This section describes how to specify a custom test. Contact a Robust Intelligence support engineer if you encounter any difficulty.

First, define a custom test in a Python file.

This Python file must expose a class named CustomBatchRunner that inherits from the DataTestBatchRunner interface specified by RIME.

Implement the following methods in the CustomBatchRunner class.

  • _from_config: Takes a TabularRunContainer object, a configuration, and a category string. Returns an initialized instance of CustomBatchRunner with a list of TestCase objects. TestCase objects are discussed later in this section.

  • _outputs_to_batch_res: Aggregates the results from the individual TestCases into a TestBatch result.

  • description: A short description of the custom test to display in the web UI.

  • long_description: A long description of the custom test to display in the web UI.

  • type: Specifies the type of test. Only used for the web UI.

TestCase objects are the individual test cases that the overall batch runner aggregates. For example, a test that applies to each column in the dataset would initialize CustomBatchRunner with one test case for each feature.
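
As a rough illustration of the one-test-per-feature pattern, the sketch below uses plain Python stand-ins rather than the RIME classes. ColumnNullTest, build_tests, and the dictionary-of-lists dataset are hypothetical names invented for this example; in a real custom test the equivalent list would presumably be assembled inside _from_config from BaseTest subclasses.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ColumnNullTest:
    """Hypothetical stand-in for a per-column test case (not a RIME class)."""

    column: str

    def run(self, data: Dict[str, List[Optional[float]]]) -> dict:
        # Count missing values in the single column this test case owns.
        n_null = sum(value is None for value in data[self.column])
        return {"column": self.column, "n_null": n_null, "passed": n_null == 0}


def build_tests(data: Dict[str, List[Optional[float]]]) -> List[ColumnNullTest]:
    # One test case per feature, mirroring the list a batch runner would hold.
    return [ColumnNullTest(column=name) for name in data]


data = {"age": [31.0, None, 45.0], "income": [52000.0, 61000.0, 58000.0]}
tests = build_tests(data)
results = [test.run(data) for test in tests]
```

Each test case carries only the column name it is responsible for, so the batch runner can hold an arbitrary number of them without knowing how any one of them works.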

The test cases must inherit from the BaseTest class specified by RIME. Test cases must implement the following method.

  • run: Takes a TabularRunContainer object and returns a TestOutput object that contains the result of the test, along with a dictionary of additional information used to aggregate test results in the associated CustomBatchRunner object.
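
The aggregation step can be pictured with a small sketch. It assumes the batch-level severity should be the worst severity reported by any test case; the Severity enum and worst_severity helper below are hypothetical stand-ins, and the ordering of RIME's own ImportanceLevel values is not confirmed here.

```python
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    """Hypothetical severity scale; the numeric ordering is an assumption."""

    NONE = 0
    LOW = 1
    HIGH = 2


def worst_severity(severities: List[Severity]) -> Severity:
    # An empty batch has nothing to report, so fall back to NONE.
    return max(severities, default=Severity.NONE)


batch_severity = worst_severity([Severity.NONE, Severity.HIGH, Severity.LOW])
```

A reduction like this matters once a batch holds more than one test case, because reading the severity of only the first output would ignore the others.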

Example test file

"""Custom test batch runner."""
from typing import List, Tuple

from rime.core.schema import (
    CustomConfig,
    ImportanceLevel,
    Status,
    TableColumn,
    TestBatchResult,
    TestOutput,
)
from rime.core.test import BaseTest, TestExtraInfo
from rime.tabular.data_tests.batch_runner import DataTestBatchRunner
from rime.tabular.data_tests.schema.result import DataTestBatchResult, DataTestOutput
from rime.tabular.profiler.run_containers import TabularRunContainer as RunContainer


# Signature should not be changed.
class CustomTest(BaseTest):
    def __init__(self, delta: int = 0):
        """Initialize with a delta between n_rows ref and eval."""
        super().__init__()
        self.delta = delta

    # Signature should not be changed.
    def run(
        self,
        run_container: RunContainer,
        silent_errors: bool = False
    ) -> Tuple[TestOutput, TestExtraInfo]:
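        # Compare the number of rows in the reference and evaluation datasets.
        ref_data_size = len(run_container.ref_data.df)
        eval_data_size = len(run_container.eval_data.df)
        # Warn when the reference set exceeds the evaluation set by more than delta rows.
        if ref_data_size > eval_data_size + self.delta:
            status = Status.WARNING
            severity = ImportanceLevel.HIGH
        else:
            status = Status.PASS
            severity = ImportanceLevel.NONE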
        test_output = DataTestOutput(
            self.id, status, {"Severity": severity}, severity, [],
        )
        return test_output, TestExtraInfo(severity)


# Signature should not be changed.
class CustomBatchRunner(DataTestBatchRunner):
    """TestBatchRunner for the CustomTest."""

    # Signature should not be changed.
    @classmethod
    def _from_config(
        cls, run_container: RunContainer, config: CustomConfig, category: str
    ) -> "DataTestBatchRunner":
        if config.params is None:
            delta = 0
        else:
            delta = config.params["delta"]
        tests = [CustomTest(delta=delta)]
        return cls(tests, category)

    # Signature should not be changed.
    def _outputs_to_batch_res(
        self,
        run_container: RunContainer,
        outputs: List[DataTestOutput],
        extra_infos: List[dict],
        duration: float,
        return_extra_infos: bool,
    ) -> TestBatchResult:
        long_description_tabs = [
            {"title": "Description", "contents": self.long_description},
            {"title": "Why it Matters", "contents": "Explain why this test matters."},
            {
                "title": "Configuration",
                "contents": "Explain how this test is configured."
            },
            {
                "title": "Example",
                "contents": "Include an example of how this test works."
            },
        ]
        return DataTestBatchResult(
            self.type,
            self.description,
            long_description_tabs,
            self.category,
            outputs,
            [],
            duration,
            extra_infos,
            [TableColumn("Severity")],
            outputs[0].severity,
            show_in_test_comparisons=False,
        )

    # Signature should not be changed.
    @property
    def description(self) -> str:
        return "This is a custom test"

    # Signature should not be changed.
    @property
    def long_description(self) -> str:
        return "This is a long description of a custom test."

    # Signature should not be changed.
    @property
    def type(self) -> str:
        return "Example Custom Test"
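
The row-count comparison in CustomTest.run can be checked without a RIME installation by isolating it in a plain function. This is only a sketch: row_count_status is a hypothetical helper, and the strings "WARNING" and "PASS" stand in for the Status values used in the example file.

```python
def row_count_status(ref_rows: int, eval_rows: int, delta: int = 0) -> str:
    # Warn only when the reference set exceeds the evaluation set
    # by more than delta rows, as in the example test above.
    if ref_rows > eval_rows + delta:
        return "WARNING"
    return "PASS"


print(row_count_status(1000, 900))
print(row_count_status(1000, 900, delta=100))
```

With the default delta of 0, any shortfall in the evaluation set triggers a warning; raising delta to 100 tolerates a difference of up to 100 rows.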