Selector Guide

Selectors choose which data points from the forget set go into the coreset, a small representative subset used for unlearning. Running unlearning on a well-chosen coreset instead of the entire forget set can achieve similar results in less time.

When to Use Selectors

  • Large forget sets — Reduce compute cost with a representative coreset

  • Streaming deletions — Prioritise the most impactful data points

  • Resource constraints — Run unlearning within a time/memory budget

Selector Categories

Influence-Based — Select by influence on model parameters:

  • influence_functions — Compute Hessian-based influence scores

  • tracin — Trace gradient inner products across checkpoints
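
To make the tracin idea concrete, here is a minimal sketch that scores a training point by summing, over saved checkpoints, the learning rate times the inner product of its loss gradient with the gradient of a reference point. It assumes a linear model with squared loss so that the gradient has a closed form. The names `grad_squared_loss` and `tracin_score` and the toy data are hypothetical and are not taken from the library's API.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Example = Tuple[Vector, float]  # (features, target)


def dot(a: Vector, b: Vector) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def grad_squared_loss(w: Vector, example: Example) -> List[float]:
    """Gradient of 0.5 * (w.x - y)^2 with respect to w (assumed toy loss)."""
    x, y = example
    residual = dot(w, x) - y
    return [residual * xi for xi in x]


def tracin_score(
    checkpoints: Sequence[Vector],
    lrs: Sequence[float],
    train_example: Example,
    ref_example: Example,
) -> float:
    """Sum over checkpoints of lr * <grad(train_example), grad(ref_example)>."""
    total = 0.0
    for w, lr in zip(checkpoints, lrs):
        g_train = grad_squared_loss(w, train_example)
        g_ref = grad_squared_loss(w, ref_example)
        total += lr * dot(g_train, g_ref)
    return total


# Toy data: two checkpoints of a 2-parameter linear model (hypothetical values).
checkpoints = [[0.0, 0.0], [0.5, 0.5]]
lrs = [0.1, 0.1]
reference = ([1.0, 0.0], 1.0)
aligned = ([1.0, 0.0], 1.0)     # same direction as the reference point
orthogonal = ([0.0, 1.0], 1.0)  # gradient orthogonal to the reference point

print(tracin_score(checkpoints, lrs, aligned, reference))
print(tracin_score(checkpoints, lrs, orthogonal, reference))
```

In this toy setup the aligned point receives a positive score and the orthogonal point a score of zero, which is the sense in which the score traces how much a point's updates moved the loss on the reference point.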

Geometry-Based — Select by feature space coverage:

  • herding — Greedy selection closest to class centroids

  • k_center — Minimise maximum distance to nearest selected point

  • facility_location — Maximise coverage of feature space

  • craig — Coresets for Accelerating Incremental Gradient descent (CRAIG)
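
As an illustration of the k_center objective listed above, the following sketch uses the standard greedy farthest-first traversal: it repeatedly adds the point that is currently farthest from every selected point. This is a generic textbook heuristic written with the standard library only. The function name `k_center_greedy` and its `start` parameter are assumptions for the example and may not match how the library implements its selector.

```python
import math
from typing import List, Sequence


def k_center_greedy(
    points: Sequence[Sequence[float]], k: int, start: int = 0
) -> List[int]:
    """Return indices of k points chosen by farthest-first traversal."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # Distance from every point to its nearest selected centre so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # Pick the point that is currently worst covered.
        farthest = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(farthest)
        # Refresh nearest-centre distances with the newly added centre.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[farthest]))
    return selected


# Toy 2-D features: two tight pairs and one isolated point.
features = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (5.0, 5.0)]
print(k_center_greedy(features, k=3))  # -> [0, 3, 4]
```

The selection spreads across the feature space (one point from each pair plus the isolated point) rather than clustering near a centroid, which is the main behavioural difference from herding.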

Learning-Based — Select by training dynamics:

  • forgetting_score — Points frequently forgotten during training

  • active_learning — Uncertainty sampling for informative points
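
One common way to define a forgetting score is to count how often an example flips from correctly classified to misclassified between consecutive epochs. The sketch below assumes that per-epoch correctness has already been recorded during training. The function names and the `correct_history` layout are hypothetical, and the library's forgetting_score selector may compute its scores differently.

```python
from typing import List, Sequence


def forgetting_scores(correct_history: Sequence[Sequence[bool]]) -> List[int]:
    """Count correct -> incorrect transitions per example.

    correct_history[e][i] is True if example i was classified
    correctly at the end of epoch e.
    """
    if not correct_history:
        return []
    n = len(correct_history[0])
    scores = [0] * n
    # Compare each epoch with the one that follows it.
    for prev, curr in zip(correct_history, correct_history[1:]):
        for i in range(n):
            if prev[i] and not curr[i]:
                scores[i] += 1
    return scores


def top_k(scores: Sequence[int], k: int) -> List[int]:
    """Indices of the k highest scores; ties keep the lower index first."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]


# Four epochs, three examples (toy correctness records).
history = [
    [True, True, False],
    [False, True, False],
    [True, True, True],
    [False, True, False],
]
scores = forgetting_scores(history)
print(scores)            # -> [2, 0, 1]
print(top_k(scores, 2))  # -> [0, 2]
```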

Gradient-Based — Select by gradient properties:

  • gradient_matching — Match full-set gradients with coreset

  • grad_norm — Select by gradient magnitude
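
The grad_norm idea can be sketched for binary logistic regression, where the per-example gradient of the cross-entropy loss is (p - y) * x, so its norm is |p - y| * ||x||. The code below keeps the examples with the largest norms. The function names are hypothetical, and `prune_ratio` is treated here as the fraction of points to keep, with rounding to the nearest integer as an assumption of this example.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label in {0, 1})


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def grad_norm(w: Sequence[float], x: Sequence[float], y: int) -> float:
    """Norm of the per-example cross-entropy gradient: |p - y| * ||x||."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return abs(p - y) * math.sqrt(sum(xi * xi for xi in x))


def select_by_grad_norm(
    w: Sequence[float], data: Sequence[Example], prune_ratio: float
) -> List[int]:
    """Keep the fraction `prune_ratio` of examples with the largest norms."""
    k = max(1, round(prune_ratio * len(data)))  # assumed rounding rule
    norms = [grad_norm(w, x, y) for x, y in data]
    order = sorted(range(len(data)), key=lambda i: -norms[i])
    return order[:k]


# With zero weights every prediction is 0.5, so the norm is 0.5 * ||x||.
weights = [0.0, 0.0]
data = [([1.0, 0.0], 1), ([3.0, 4.0], 0), ([0.0, 2.0], 1), ([0.5, 0.0], 0)]
print(select_by_grad_norm(weights, data, prune_ratio=0.5))  # -> [1, 2]
```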

Ensemble — Combine multiple selectors:

  • stacking — Sequential selector refinement

  • voting — Majority vote across selectors

  • weighted_fusion — Weighted combination of selector scores
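
The guide does not spell out how the ensemble selectors combine their inputs, so the following is only one plausible reading of voting and weighted_fusion. In this sketch, voting keeps the indices chosen by more than half of the selectors, and weighted fusion min-max normalises each selector's scores before taking a weighted sum. All function names and parameters are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(selections: Sequence[Sequence[int]], k: int) -> List[int]:
    """Indices picked by more than half of the selectors, most-voted first."""
    votes = Counter(i for sel in selections for i in set(sel))
    threshold = len(selections) / 2
    winners = [i for i, count in votes.items() if count > threshold]
    winners.sort(key=lambda i: (-votes[i], i))
    return winners[:k]


def weighted_fusion(
    score_lists: Sequence[Sequence[float]], weights: Sequence[float], k: int
) -> List[int]:
    """Top-k indices by a weighted sum of min-max normalised scores."""
    n = len(score_lists[0])
    fused = [0.0] * n
    for scores, w in zip(score_lists, weights):
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0  # avoid division by zero for constant scores
        for i, s in enumerate(scores):
            fused[i] += w * (s - lo) / span
    # Higher fused score first; ties broken by lower index.
    return sorted(range(n), key=lambda i: (-fused[i], i))[:k]


# Three selectors each propose a coreset (toy indices).
picks = [[0, 1, 2], [1, 2, 3], [2, 4, 5]]
print(majority_vote(picks, k=3))  # -> [2, 1]

# Two selectors score the same three points on different scales.
scores_a = [0.0, 10.0, 5.0]
scores_b = [1.0, 0.0, 1.0]
print(weighted_fusion([scores_a, scores_b], [0.5, 0.5], k=2))  # -> [2, 0]
```

Normalising before fusing matters because raw selector scores (for example influence values versus distances) usually live on different scales, and an unnormalised sum would be dominated by whichever selector produces the largest numbers.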

Quick Comparison

Selector             Speed  Quality  Memory  Best For
-------------------  -----  -------  ------  --------------
random               ★★★    ?        ★★★     Baselines
herding              ★★★    ★★★      ★★      General use
influence_functions  ?      ★★★      ?       Small models
forgetting_score     ★★     ★★★      ★★      Classification

Usage

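# Assumes ErasusUnlearner has been imported and that model, forget_loader,
# and retain_loader are already defined.
unlearner = ErasusUnlearner(
    model=model,
    strategy="gradient_ascent",  # unlearning strategy
    selector="herding",          # coreset selector (see Selector Categories)
    device="cuda",
)

result = unlearner.fit(
    forget_data=forget_loader,
    retain_data=retain_loader,
    prune_ratio=0.3,  # Keep 30% of the forget set as the coreset
)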