Input Data Format
Automated Validation
The Python SDK exposes a command-line utility that can automatically validate your input data:
rime-data-format-check <ARGS>
Inspecting <REFERENCE_SET>
Done!
Inspecting <EVALUATION_SET>
Done!
---
Your data should work with RIME!
Instructions are available here.
Supported File Formats
RIME Tabular currently supports both CSV (.csv) and Parquet (.parquet) file formats, with task-specific nuances defined below. Input files should have header columns in string format; these will be used as feature names.
RIME is most effective when both label and prediction columns are provided; however, neither is required for most tasks*.
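As an illustration of the header requirement, the sketch below writes a small CSV whose first line is a row of string column names and then reads those names back. It uses only the Python standard library and does not call RIME. The column names (age, income, label, pred) are hypothetical and chosen only for this example.

```python
import csv
import io

# Hypothetical feature, label, and prediction columns (not names required by RIME).
rows = [
    {"age": 34, "income": 52000.0, "label": 1, "pred": 0.83},
    {"age": 51, "income": 38000.0, "label": 0, "pred": 0.12},
]


def write_csv_with_header(records):
    """Serialize records to CSV text whose first line is the header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def read_feature_names(csv_text):
    """Return the header names found on the first line of the CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader)


csv_text = write_csv_with_header(rows)
feature_names = read_feature_names(csv_text)
print(feature_names)
```

To produce a file on disk, the same text can be written to a path ending in .csv. Writing Parquet would need a third-party library, which this sketch leaves out.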
Requirements By Task
Binary Classification
Labels should be integer values 0 or 1
Predictions should be float values between 0 and 1 that represent the positive class (label = 1) probability
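The two binary classification requirements above can be checked before upload with a few lines of plain Python. This is a minimal sketch, not part of the RIME SDK: the function name check_binary_rows and the choice to report problems as strings are assumptions made for illustration.

```python
def check_binary_rows(labels, predictions):
    """Return a list of problems; an empty list means the rows look valid."""
    problems = []
    if len(labels) != len(predictions):
        problems.append("labels and predictions differ in length")
    for i, (label, pred) in enumerate(zip(labels, predictions)):
        # bool is a subclass of int in Python, so reject it explicitly.
        label_ok = isinstance(label, int) and not isinstance(label, bool) and label in (0, 1)
        if not label_ok:
            problems.append(f"row {i}: label {label!r} is not the integer 0 or 1")
        # NaN fails the range comparison, so it is reported as well.
        pred_ok = isinstance(pred, float) and 0.0 <= pred <= 1.0
        if not pred_ok:
            problems.append(f"row {i}: prediction {pred!r} is not a float in [0, 1]")
    return problems


# Valid rows produce no problems; the second call flags one label and one prediction.
print(check_binary_rows([0, 1, 1], [0.10, 0.92, 0.55]))
print(check_binary_rows([0, 2], [0.10, 1.50]))
```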
Multi-Class Classification
Labels should be integers referring to class index
Predictions should be uploaded as a separate .csv or .parquet file. Columns should be ordered, with the ith column representing the probability of the ith class. The predictions in each row should sum to 1.
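A quick pre-upload check for the multi-class requirements might look like the sketch below. It assumes the separate predictions file has already been loaded into a list of rows, one probability per class in column order. It also assumes class indices start at 0 and that a tolerance of 1e-6 is acceptable for the row sum. Neither assumption is stated in this documentation, and check_multiclass_rows is a hypothetical helper, not an SDK function.

```python
import math


def check_multiclass_rows(labels, probabilities, tol=1e-6):
    """Return a list of problems; an empty list means the rows look valid.

    labels: one integer class index per row (assumed 0-based).
    probabilities: one list per row, with the ith entry for the ith class.
    """
    problems = []
    for i, (label, row) in enumerate(zip(labels, probabilities)):
        n_classes = len(row)
        is_int = isinstance(label, int) and not isinstance(label, bool)
        if not is_int or not 0 <= label < n_classes:
            problems.append(f"row {i}: label {label!r} is not a valid class index")
        if any(not 0.0 <= p <= 1.0 for p in row):
            problems.append(f"row {i}: a probability lies outside [0, 1]")
        elif not math.isclose(sum(row), 1.0, abs_tol=tol):
            problems.append(f"row {i}: probabilities sum to {sum(row)}, not 1")
    return problems


# The first call passes; the second has an out-of-range label and a bad row sum.
print(check_multiclass_rows([0, 2], [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]))
print(check_multiclass_rows([3], [[0.5, 0.5, 0.5]]))
```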
Ranking
* Labels are required
Labels should be any real number
Predictions should be any real number
ranking_info must be provided in the data configuration
Regression
Labels should be any real number
Predictions should be any real number
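For regression, a simple sanity check is that every label and prediction parses as a finite number. The sketch below reads "any real number" as excluding NaN and infinity, which is an interpretation rather than something this documentation states. The helper names are hypothetical and not part of the RIME SDK.

```python
import math


def is_real_number(value):
    """True for finite ints and floats; False for bool, NaN, infinity, and other types."""
    if isinstance(value, bool):
        return False
    return isinstance(value, (int, float)) and math.isfinite(value)


def check_regression_rows(labels, predictions):
    """Return the indices of rows whose label or prediction is not a real number."""
    bad_rows = []
    for i, (label, pred) in enumerate(zip(labels, predictions)):
        if not (is_real_number(label) and is_real_number(pred)):
            bad_rows.append(i)
    return bad_rows


# Row 1 has a NaN label and row 2 has a string prediction.
print(check_regression_rows([3.5, float("nan"), -2], [3.1, 0.0, "n/a"]))
```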