Stopper
- class liesel.goose.Stopper(max_iter, patience, atol=0.001, rtol=0.0)
Bases: object

Handles (early) stopping for optim_flat().

- Parameters:
  - max_iter (int) – The maximum number of optimization steps.
  - patience (int) – Length of the recent loss window considered for early stopping. Early stopping is checked only after more than patience optimization steps have been evaluated. In other words, because i is zero-based, stop_early() first returns True no earlier than i == patience + 1.
  - atol (float, default: 0.001) – The non-negative absolute tolerance for early stopping.
  - rtol (float, default: 0.0) – The non-negative relative tolerance for early stopping. The default of 0.0 means that no early stopping happens based on the relative tolerance.
Notes
Early stopping is based on the window of the most recent patience loss values ending at the current zero-based iteration i. Without tolerances, early stopping happens when the oldest loss value in this window is also the best loss value in this window.

This is a rolling-window rule, not a best-so-far rule that counts the number of iterations since the global best loss. It can therefore continue while the recent window still contains newer improvements, even if the global best loss was observed before the current window. A simplified pseudo-implementation is:

    import numpy as np

    def stop(patience, i, loss_history):
        # Losses observed up to and including iteration i.
        current_history = loss_history[: i + 1]
        # Rolling window of the most recent `patience` losses.
        recent_history = current_history[-patience:]
        oldest_within_patience = recent_history[0]
        best_within_patience = np.min(recent_history)
        return oldest_within_patience <= best_within_patience
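To make the difference between the two rules concrete, the following is a minimal sketch in plain Python. It mirrors the pseudo-implementation above with the built-in min in place of np.min. The function stop_best_so_far is a hypothetical alternative written only for comparison; it is not part of liesel. The sketch also leaves out the check that early stopping only starts after more than patience steps.

```python
def stop_rolling_window(patience, i, loss_history):
    """Rolling-window rule from the notes (no tolerances)."""
    recent_history = loss_history[: i + 1][-patience:]
    # Stop when the oldest loss in the window is also the window's best.
    return recent_history[0] <= min(recent_history)


def stop_best_so_far(patience, i, loss_history):
    """Hypothetical alternative: count iterations since the global best."""
    current_history = loss_history[: i + 1]
    best_index = current_history.index(min(current_history))
    return i - best_index >= patience


# The global best (1.0) occurs at index 1. Afterwards the loss jumps up
# and then keeps improving slowly without reaching 1.0 again.
loss_history = [5.0, 1.0, 4.0, 3.5, 3.0, 2.5]
patience = 3
i = 5

# Window is [3.5, 3.0, 2.5]: the oldest value is not the best one.
rolling = stop_rolling_window(patience, i, loss_history)
# Four iterations have passed since the global best at index 1.
best_so_far = stop_best_so_far(patience, i, loss_history)

print(rolling, best_so_far)  # False True
```

The rolling-window rule keeps going because the window still shows improvement, while the best-so-far rule would already have stopped.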
Absolute and relative tolerance make it possible to stop even in cases when the oldest loss within patience is not the best. Instead, the algorithm stops when the absolute or relative difference between the oldest loss within patience and the best loss within patience is small enough to be negligible. If either of the two conditions is met, early stopping happens. The relative magnitude of the difference is calculated with respect to the best loss within patience. A simplified pseudo-implementation is:

    import numpy as np

    def stop(patience, i, loss_history, atol, rtol):
        current_history = loss_history[: i + 1]
        recent_history = current_history[-patience:]
        oldest_within_patience = recent_history[0]
        best_within_patience = np.min(recent_history)
        # Non-negative by construction: the best loss is the window minimum.
        diff = oldest_within_patience - best_within_patience
        rel_diff = diff / np.abs(best_within_patience)
        abs_improvement_is_neglectable = diff <= atol
        rel_improvement_is_neglectable = rel_diff <= rtol
        return abs_improvement_is_neglectable | rel_improvement_is_neglectable
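As a worked example of the two tolerances, here is a small sketch in plain Python that follows the pseudo-implementation above. The function name stop_with_tolerances and the loss values are made up for illustration. Like the pseudo-code, the sketch assumes the best loss in the window is non-zero.

```python
def stop_with_tolerances(patience, i, loss_history, atol, rtol):
    recent_history = loss_history[: i + 1][-patience:]
    oldest = recent_history[0]
    best = min(recent_history)
    diff = oldest - best
    # Assumes best != 0, as in the pseudo-implementation.
    rel_diff = diff / abs(best)
    return diff <= atol or rel_diff <= rtol


loss_history = [120.0, 100.5, 100.2, 100.1, 100.0]
patience, i = 3, 4
# Window: [100.2, 100.1, 100.0], so diff is about 0.2 and rel_diff about 0.002.

# Defaults: 0.2 > 0.001, and rtol=0.0 cannot trigger while diff > 0.
default_stop = stop_with_tolerances(patience, i, loss_history, atol=0.001, rtol=0.0)
# A looser absolute tolerance alone is enough to stop.
abs_stop = stop_with_tolerances(patience, i, loss_history, atol=0.5, rtol=0.0)
# A relative tolerance of 1% alone is also enough to stop.
rel_stop = stop_with_tolerances(patience, i, loss_history, atol=0.0, rtol=0.01)

print(default_stop, abs_stop, rel_stop)  # False True True
```

Because the relative difference is scaled by the best loss, rtol is usually the easier knob when the magnitude of the loss is not known in advance, while atol is tied to the units of the loss itself.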
Methods
- continue_(i, loss_history) – Whether optimization should continue (inverse of stop_now()).
- stop_early(i, loss_history)
- stop_now(i, loss_history) – Whether optimization should stop now.
- which_best_in_recent_history(i, loss_history) – Identifies the index of the best observation in the recent loss window.
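The sketch below shows one way these four methods could fit together in a plain Python loop. SketchStopper is a hypothetical stand-in, not the liesel implementation. In particular, it is an assumption that stop_now() combines the max_iter limit with stop_early(), that the limit is checked as i >= max_iter, that early stopping is gated by i > patience, and that the returned index refers to the full loss history.

```python
class SketchStopper:
    """Hypothetical stand-in for illustration; NOT liesel.goose.Stopper."""

    def __init__(self, max_iter, patience, atol=0.001, rtol=0.0):
        self.max_iter = max_iter
        self.patience = patience
        self.atol = atol
        self.rtol = rtol

    def which_best_in_recent_history(self, i, loss_history):
        # Assumption: the index is reported relative to the full history.
        start = max(i + 1 - self.patience, 0)
        window = loss_history[start : i + 1]
        return start + window.index(min(window))

    def stop_early(self, i, loss_history):
        # Assumption: follows "no earlier than i == patience + 1".
        if i <= self.patience:
            return False
        window = loss_history[: i + 1][-self.patience :]
        diff = window[0] - min(window)
        best = min(window)
        # Assumption: skip the relative check when the best loss is zero.
        rel_ok = best != 0 and diff / abs(best) <= self.rtol
        return diff <= self.atol or rel_ok

    def stop_now(self, i, loss_history):
        # Assumption: stop at the iteration limit or on early stopping.
        return i >= self.max_iter or self.stop_early(i, loss_history)

    def continue_(self, i, loss_history):
        return not self.stop_now(i, loss_history)


# Toy optimization whose loss converges geometrically towards 1.0.
stopper = SketchStopper(max_iter=100, patience=5, atol=0.001)
loss_history = []
i = 0
while True:
    loss_history.append(1.0 + 0.5**i)
    if not stopper.continue_(i, loss_history):
        break
    i += 1

best_index = stopper.which_best_in_recent_history(i, loss_history)
print(i, best_index)  # 14 14
```

In this toy run the loop ends at i == 14, the first iteration at which the difference between the oldest and the best loss in the window of five values drops below atol, long before max_iter is reached.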
Attributes