RIME Adversarial

Reference documentation for the RIME Adversarial code, covering Attacks (the algorithms used to find adversarial points), Epsilon Balls (the methods used to constrain the attacks), and the Attack Runner.

Attacks

Attack algorithms for tabular models.

class rime.tabular.attacks.TabularRandomizedAttack(black_box: BlackBoxModel, target_score: float, max_queries: int, columns: List[Column], custom_perturb_pct: Optional[Dict[str, float]] = None, default_perturb_pct: float = 1.0, max_unsuccessful_iters: Optional[int] = None, epsilon_ball: Optional[EpsilonBall] = None, **kwargs: Any)

Attack algorithm that randomly perturbs a subset of features each iteration.

__init__(black_box: BlackBoxModel, target_score: float, max_queries: int, columns: List[Column], custom_perturb_pct: Optional[Dict[str, float]] = None, default_perturb_pct: float = 1.0, max_unsuccessful_iters: Optional[int] = None, epsilon_ball: Optional[EpsilonBall] = None, **kwargs: Any)

Initialize the algorithm.

Parameters:
  • black_box – Model to attack.

  • target_score – Upper bound on the score (loss) of examples considered to be successfully adversarial.

  • max_queries – Maximum number of queries that may be made to the model.

  • columns – List of column objects from the profiler.

  • custom_perturb_pct – Mapping of feature name to probability of perturbing that feature. Can be used to focus more or less on certain features. Defaults to None.

  • default_perturb_pct – Default perturbation probability for each feature not specified in the custom_perturb_pct mapping. Defaults to 1.0.

  • max_unsuccessful_iters – If an integer is passed, stop attack after this number of unsuccessful iterations. Defaults to None.

  • epsilon_ball – Epsilon ball used to restrict changes. Perturbations are sampled from inside this ball. Defaults to None.
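The interaction between custom_perturb_pct and default_perturb_pct can be illustrated with a minimal sketch. It does not use the RIME API: choose_features_to_perturb and the feature names are hypothetical, and the sketch assumes each value is treated as an independent per-feature probability, as the parameter descriptions above suggest.

```python
import random
from typing import Dict, List, Optional


def choose_features_to_perturb(
    feature_names: List[str],
    custom_perturb_pct: Optional[Dict[str, float]] = None,
    default_perturb_pct: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick the features to perturb in one iteration (hypothetical helper)."""
    rng = rng or random.Random()
    custom = custom_perturb_pct or {}
    chosen = []
    for name in feature_names:
        # Features missing from the mapping fall back to the default probability.
        prob = custom.get(name, default_perturb_pct)
        # random() is in [0, 1), so prob=1.0 always selects and prob=0.0 never does.
        if rng.random() < prob:
            chosen.append(name)
    return chosen


features = ["age", "income", "zip_code"]
# "zip_code" is never perturbed; the others use the default of 1.0.
picked = choose_features_to_perturb(features, {"zip_code": 0.0}, rng=random.Random(0))
print(picked)  # ['age', 'income']
```

Setting a feature's probability to 0.0 is one way to exclude it from an attack under this reading; intermediate values shift the focus toward or away from a feature.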

class rime.tabular.attacks.TabularCombinationAttack(black_box: BlackBoxModel, target_score: float, max_queries: float, columns: List[Column], subset_sizes: Union[int, List[int]] = 1, search_count: int = 10, max_unsuccessful_iters: Optional[int] = None, column_names_to_ignore: Optional[Set[str]] = None, epsilon_ball: Optional[ColumnRangeEpsilonBall] = None, **kwargs: Any)

Attack which greedily searches over combinations of features.

__init__(black_box: BlackBoxModel, target_score: float, max_queries: float, columns: List[Column], subset_sizes: Union[int, List[int]] = 1, search_count: int = 10, max_unsuccessful_iters: Optional[int] = None, column_names_to_ignore: Optional[Set[str]] = None, epsilon_ball: Optional[ColumnRangeEpsilonBall] = None, **kwargs: Any)

Initialize the algorithm.

Parameters:
  • black_box – Model to attack.

  • target_score – Upper bound on the score (loss) of examples considered to be successfully adversarial.

  • max_queries – Maximum number of queries that may be made to the model.

  • columns – List of column objects from the profiler.

  • subset_sizes – List of feature subset sizes to consider. Can either be specified as an integer n, in which case subset sizes of 1 through n are considered, or a specific list of subset sizes. Defaults to 1.

  • search_count – The number of perturbed values to consider for each column on each iteration. Defaults to 10.

  • max_unsuccessful_iters – The maximum number of iterations to proceed without improvement. Defaults to None.

  • column_names_to_ignore – Names of columns to ignore, i.e., columns the attack will not attempt to perturb. Defaults to None.

  • epsilon_ball – Epsilon ball used to restrict changes. Perturbations are made to the extrema of its ranges. Defaults to None.
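The two accepted forms of subset_sizes can be sketched as follows. This is an illustration only, not the library's implementation: expand_subset_sizes and candidate_subsets are hypothetical names, and the way ignored columns are filtered is an assumption based on the description of column_names_to_ignore.

```python
from itertools import combinations
from typing import Iterator, List, Optional, Set, Tuple, Union


def expand_subset_sizes(subset_sizes: Union[int, List[int]]) -> List[int]:
    """An integer n means sizes 1 through n; a list is used as given."""
    if isinstance(subset_sizes, int):
        return list(range(1, subset_sizes + 1))
    return list(subset_sizes)


def candidate_subsets(
    column_names: List[str],
    subset_sizes: Union[int, List[int]] = 1,
    column_names_to_ignore: Optional[Set[str]] = None,
) -> Iterator[Tuple[str, ...]]:
    """Yield every feature combination of the requested sizes (hypothetical helper)."""
    ignore = column_names_to_ignore or set()
    usable = [name for name in column_names if name not in ignore]
    for size in expand_subset_sizes(subset_sizes):
        yield from combinations(usable, size)


subsets = list(candidate_subsets(["a", "b", "c"], 2, column_names_to_ignore={"c"}))
print(subsets)  # [('a',), ('b',), ('a', 'b')]
```

The number of combinations grows quickly with the subset size, which is why a query budget such as max_queries matters for this kind of search.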

class rime.tabular.attacks.TabularExhaustiveGreedyAttack(black_box: BlackBoxModel, target_score: float, max_queries: int, columns: List[Column], perturbation_threshold: float = 0.1, **kwargs: Any)

Greedy attack algorithm that exhaustively attacks each feature.

__init__(black_box: BlackBoxModel, target_score: float, max_queries: int, columns: List[Column], perturbation_threshold: float = 0.1, **kwargs: Any)

Initialize the algorithm.

Parameters:
  • black_box – Model to attack.

  • target_score – Upper bound on the score (loss) of examples considered to be successfully adversarial.

  • max_queries – Maximum number of queries that may be made to the model.

  • columns – List of column objects from the profiler.

  • perturbation_threshold – When exhaustively perturbing a feature, the attack continues for as long as the change in score is greater than this threshold. Defaults to 0.1.

  • **kwargs – Additional keyword arguments, the same as the arguments to TabularGreedyAttack.

class rime.tabular.attacks.TabularGreedyAttack(black_box: BlackBoxModel, target_score: float, max_queries: int, columns: List[Column], skip_categoricals: bool = False, num_features_per_round: int = 3, custom_perturb_pct: Optional[Dict[str, float]] = None, default_perturb_pct: float = 1.0, early_stop_threshold: float = inf, **kwargs: Any)

Greedy attack algorithm that samples perturbations for each column.

__init__(black_box: BlackBoxModel, target_score: float, max_queries: int, columns: List[Column], skip_categoricals: bool = False, num_features_per_round: int = 3, custom_perturb_pct: Optional[Dict[str, float]] = None, default_perturb_pct: float = 1.0, early_stop_threshold: float = inf, **kwargs: Any)

Initialize the algorithm.

Parameters:
  • black_box – Model to attack.

  • target_score – Upper bound on the score (loss) of examples considered to be successfully adversarial.

  • max_queries – Maximum number of queries that may be made to the model.

  • columns – List of column objects from the profiler.

  • skip_categoricals – Whether to skip categorical columns. Defaults to False.

  • num_features_per_round – Number of features to perturb per iteration. Defaults to 3.

  • custom_perturb_pct – Mapping of feature name to probability of perturbing that feature. Can be used to focus more or less on certain features. Defaults to None.

  • default_perturb_pct – Default perturbation probability for each feature not specified in the custom_perturb_pct mapping. Defaults to 1.0.

  • early_stop_threshold – Stop adding more perturbations in the current round if the initial perturbation is greater than this threshold. Defaults to np.inf.

class rime.tabular.attacks.TabularNoiseRemoval(base_attack: TabularIterativeAttack, repeat: int = 1, target_score: Optional[float] = None)

Attack algorithm that first runs a base attack, then removes unneeded noise.

Best paired with attacks that quickly (but inefficiently) cross the decision boundary, such as TabularRandomizedAttack.

__init__(base_attack: TabularIterativeAttack, repeat: int = 1, target_score: Optional[float] = None)

Initialize with base attack and information for removing noise.

Parameters:
  • base_attack – Base attack algorithm to run first.

  • repeat – How many times to attempt to remove noise for each column. Defaults to 1.

  • target_score – Upper bound on the score (loss) of examples considered to be successfully adversarial. Defaults to None.
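The idea of removing unneeded noise can be sketched as a revert-and-check pass over the perturbed features. This is a simplified illustration under stated assumptions, not the library's algorithm: remove_noise and score_fn are hypothetical, rows are plain lists, and a row counts as adversarial when its score is at or below target_score, following the parameter description above.

```python
from typing import Callable, List


def remove_noise(
    original: List[float],
    adversarial: List[float],
    score_fn: Callable[[List[float]], float],
    target_score: float,
    repeat: int = 1,
) -> List[float]:
    """Revert perturbed features one at a time while the row stays adversarial."""
    current = list(adversarial)
    for _ in range(repeat):
        for i in range(len(current)):
            if current[i] == original[i]:
                continue  # Nothing to revert for this feature.
            trial = list(current)
            trial[i] = original[i]
            # Keep the reverted value only if the attack still succeeds.
            if score_fn(trial) <= target_score:
                current = trial
    return current


# Toy score: only the first feature influences the model.
def score_fn(row: List[float]) -> float:
    return 1.0 - row[0]


original = [0.0, 5.0, 3.0]
adversarial = [0.9, 7.0, 1.0]  # A noisy result from some base attack.
cleaned = remove_noise(original, adversarial, score_fn, target_score=0.2)
print(cleaned)  # [0.9, 5.0, 3.0]
```

In this toy case the changes to the second and third features are reverted because they do not affect the score, leaving only the perturbation that matters.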

Epsilon Balls

Classes defining epsilon balls.

class rime.tabular.attacks.epsilon_ball.LInfQuantileEpsilonBall(epsilon: float, columns: List[Column], col_indices: Optional[List[int]] = None)

An Epsilon Ball with each feature bounded by a quantile range.

__init__(epsilon: float, columns: List[Column], col_indices: Optional[List[int]] = None)

Initialize the epsilon ball.

Parameters:
  • epsilon – A fraction specifying the (one-sided) quantile range within which each feature may be perturbed. Should not exceed 0.5 (50%), as it refers to deviations both above and below the 50th percentile.

  • columns – List of columns associated with the features of the data points handled by the epsilon ball.

  • col_indices – Optional list of indices to which the columns correspond.
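One possible reading of the epsilon parameter is sketched below: the allowed interval for a feature runs from the (0.5 - epsilon) quantile to the (0.5 + epsilon) quantile of its observed values. This interpretation is an assumption, and the helpers quantile, quantile_bounds, and clip are hypothetical; the library may define the ball differently, for example relative to the quantile of the original value.

```python
from typing import List, Tuple


def quantile(sorted_values: List[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def quantile_bounds(values: List[float], epsilon: float) -> Tuple[float, float]:
    """Bounds at epsilon below and above the median, in quantile space."""
    if not 0.0 <= epsilon <= 0.5:
        raise ValueError("epsilon must lie in [0, 0.5]")
    ordered = sorted(values)
    return quantile(ordered, 0.5 - epsilon), quantile(ordered, 0.5 + epsilon)


def clip(value: float, bounds: Tuple[float, float]) -> float:
    """Project a proposed value back into the allowed interval."""
    low, high = bounds
    return max(low, min(high, value))


observed = [float(v) for v in range(101)]  # Toy column with values 0..100.
bounds = quantile_bounds(observed, epsilon=0.25)
print(bounds)            # (25.0, 75.0)
print(clip(90.0, bounds))  # 75.0
```

With epsilon at its maximum of 0.5 the interval spans the full observed range of the column under this reading.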

class rime.tabular.attacks.epsilon_ball.LInfRangeEpsilonBall(epsilon: float, columns: List[Column], col_indices: Optional[List[int]] = None)

An Epsilon Ball that clips each feature based on its max-min range.

__init__(epsilon: float, columns: List[Column], col_indices: Optional[List[int]] = None)

Initialize the epsilon ball.

Parameters:
  • epsilon – A fraction specifying the (one-sided) size of the range within which each feature may be perturbed, calculated as a fraction of the corresponding column's value range. Should not exceed 1 (100%).

  • columns – List of columns associated with the features of the data points handled by the epsilon ball.

  • col_indices – Optional list of indices to which the columns correspond.
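The clipping described above can be sketched for a single numeric feature. The sketch assumes the ball is centered on the original value with a radius of epsilon times the column's max-min range; range_ball_bounds and clip_to_ball are hypothetical names and do not come from the library.

```python
from typing import Tuple


def range_ball_bounds(
    original: float, col_min: float, col_max: float, epsilon: float
) -> Tuple[float, float]:
    """Interval of allowed values around the original feature value."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    # One-sided radius, expressed as a fraction of the column's value range.
    radius = epsilon * (col_max - col_min)
    return original - radius, original + radius


def clip_to_ball(
    proposed: float, original: float, col_min: float, col_max: float, epsilon: float
) -> float:
    """Project a proposed perturbation back into the ball."""
    low, high = range_ball_bounds(original, col_min, col_max, epsilon)
    return max(low, min(high, proposed))


# Column observed between 0 and 200; epsilon of 0.25 gives a radius of 50.
print(range_ball_bounds(60.0, 0.0, 200.0, 0.25))    # (10.0, 110.0)
print(clip_to_ball(150.0, 60.0, 0.0, 200.0, 0.25))  # 110.0
print(clip_to_ball(70.0, 60.0, 0.0, 200.0, 0.25))   # 70.0
```

Because the radius is one-sided, the full width of the allowed interval is twice epsilon times the column range.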

Attack Runner

Run tabular attacks.

rime.tabular.attacks.runner.run_attack_loop(attack: TabularIterativeAttack, run_container: TabularRunContainer, sample_size: int, use_tqdm: bool = True, special_logger: Optional[Logger] = None) → Tuple[List[TabularAttackState], list]

Run attack over sample of data.

Parameters:
  • attack – Attack to run.

  • run_container – Container of data/model to be attacked.

  • sample_size – Number of data points to sample to run attacks over.

  • use_tqdm – Whether to use tqdm to log the progress of the loop. Defaults to True.

  • special_logger – If specified, the logger to use to log info messages. Defaults to None.

Returns:

A tuple containing the list of attack results and the list of indices that were attacked.
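The shape of the return value can be illustrated with a generic sampling loop. This is a sketch only and does not call run_attack_loop: run_loop_sketch, attack_fn, and the seed argument are hypothetical, and how the library actually samples data points is not stated in this reference.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple


def run_loop_sketch(
    attack_fn: Callable[[Sequence[float]], Any],
    rows: Sequence[Sequence[float]],
    sample_size: int,
    seed: int = 0,
) -> Tuple[List[Any], List[int]]:
    """Attack a random sample of rows; return the results and sampled indices."""
    rng = random.Random(seed)
    # Never request more rows than are available.
    count = min(sample_size, len(rows))
    indices = sorted(rng.sample(range(len(rows)), count))
    results = [attack_fn(rows[i]) for i in indices]
    return results, indices


rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# The "attack" here is a stand-in that just sums the row.
results, indices = run_loop_sketch(sum, rows, sample_size=2)
print(len(results), len(indices))  # 2 2
```

Returning the indices alongside the results lets a caller map each result back to the data point it was computed from.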