GSgnnBaseEvaluator

class graphstorm.eval.GSgnnBaseEvaluator(eval_frequency, eval_metric_list, use_early_stop=False, early_stop_burnin_rounds=0, early_stop_rounds=3, early_stop_strategy='average_increase')

Bases: object

Base class for GraphStorm Evaluators.

This class serves as the base for GraphStorm built-in evaluator classes, like GSgnnClassificationEvaluator, GSgnnRegressionEvaluator, GSgnnMrrLPEvaluator, GSgnnPerEtypeMrrLPEvaluator, and GSgnnRconstructFeatRegScoreEvaluator.

To create a customized evaluator, inherit from this class and the corresponding EvalInterface class, then implement the two abstract methods, evaluate() and compute_score().

Parameters

eval_frequency: int: The frequency (number of iterations) of doing evaluation.
eval_metric_list: list of string: Evaluation metrics used for evaluation.
use_early_stop: bool: Set to True to enable early stopping. Default: False.
early_stop_burnin_rounds: int: Burn-in rounds (number of evaluations) to wait before starting to check the early stop condition. Default: 0.
early_stop_rounds: int: The number of rounds (number of evaluations) whose validation scores are used to decide on early stopping. Default: 3.
early_stop_strategy: str: The early stop strategy. GraphStorm supports two strategies: 1) consecutive_increase, and 2) average_increase. Default: average_increase.

setup_task_tracker(task_tracker)

Set up the evaluation task tracker.

Parameters

task_tracker: GSSageMakerAbc: A GraphStorm task tracker.

do_eval(total_iters, epoch_end=False)

Decide whether to do the evaluation in the current iteration or epoch.

Returns True if the current iteration is greater than 0 and is a multiple of the given eval_frequency, or if it is the end of an epoch. Otherwise, returns False.

Parameters

total_iters: int: The total number of iterations that have been completed.
epoch_end: bool: Whether it is the end of an epoch. Default: False.

Returns

bool: Whether to do evaluation.
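
The rule described above can be sketched as a standalone function. This is only an illustration of the documented behavior, not GraphStorm's own code: the name should_evaluate is hypothetical, and the guard against a non-positive eval_frequency is an assumption added to keep the sketch safe.

```python
def should_evaluate(total_iters: int, eval_frequency: int,
                    epoch_end: bool = False) -> bool:
    """Hypothetical stand-in for the rule documented for do_eval()."""
    # The end of an epoch always triggers evaluation.
    if epoch_end:
        return True
    # Assumption: a non-positive frequency disables iteration-based evaluation.
    if eval_frequency <= 0:
        return False
    # Otherwise evaluate on every positive multiple of eval_frequency.
    return total_iters > 0 and total_iters % eval_frequency == 0


# With eval_frequency=3, iterations 0..6 trigger evaluation at 3 and 6 only.
triggered = [i for i in range(7) if should_evaluate(i, eval_frequency=3)]
```

Note that iteration 0 is skipped unless epoch_end is True, which matches the "larger than 0" condition in the description.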

do_early_stop(val_score)

Decide whether to stop the training early.

Parameters

val_score: dict of list: Dict of evaluation scores for one metric.

Returns

bool: Whether to stop early.
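
A minimal sketch of how the burn-in rounds, the window of early_stop_rounds scores, and the two strategies might interact. Everything here is an assumption made for illustration: the class EarlyStopSketch and its method are hypothetical, a higher score is assumed to be better, average_increase is read as "stop when the new score is worse than the average of the window", and consecutive_increase is read as "stop when the new score is worse than every score in the window". The actual GraphStorm logic may differ.

```python
from collections import deque


class EarlyStopSketch:
    """Hypothetical early-stop judge; assumes a higher score is better."""

    def __init__(self, burnin_rounds=0, rounds=3, strategy="average_increase"):
        if strategy not in ("average_increase", "consecutive_increase"):
            raise ValueError(f"unknown strategy: {strategy}")
        if rounds < 1:
            raise ValueError("rounds must be at least 1")
        self.burnin_rounds = burnin_rounds
        self.rounds = rounds
        self.strategy = strategy
        self.calls = 0
        # Holds the most recent `rounds` validation scores.
        self.window = deque(maxlen=rounds)

    def should_stop(self, score: float) -> bool:
        self.calls += 1
        # Scores seen during burn-in are ignored entirely.
        if self.calls <= self.burnin_rounds:
            return False
        # Fill the window before making any decision.
        if len(self.window) < self.rounds:
            self.window.append(score)
            return False
        if self.strategy == "average_increase":
            stop = score < sum(self.window) / len(self.window)
        else:
            stop = all(score < old for old in self.window)
        # Appending to a full deque drops the oldest score.
        self.window.append(score)
        return stop


judge = EarlyStopSketch(burnin_rounds=1, rounds=3)
decisions = [judge.should_stop(s) for s in [0.10, 0.60, 0.70, 0.80, 0.75, 0.65]]
```

In this run the first score falls into the burn-in, the next three fill the window, and only the last score (0.65, below the window average of 0.75) triggers a stop.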

get_val_score_rank(val_score)

Get the rank of the given validation score by comparing its value to the historical values.

Parameters

val_score: dict of list: A dictionary whose key is the metric name and whose value is the score from the evaluator’s validation computation.

Returns

rank: int: The rank of the given validation score.
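
One way to picture ranking a score against historical values is sketched below. The helper rank_of_score is hypothetical, and the conventions used (rank 1 is the best, ties share the better rank, the new score is added to the history after ranking) are assumptions for illustration rather than confirmed GraphStorm behavior.

```python
from typing import Sequence


def rank_of_score(score: float, history: Sequence[float],
                  higher_is_better: bool = True) -> int:
    """Hypothetical helper: return 1 if `score` is the best seen so far."""
    if higher_is_better:
        num_better = sum(1 for old in history if old > score)
    else:
        num_better = sum(1 for old in history if old < score)
    # The rank is one more than the number of strictly better past scores.
    return num_better + 1


history = []
ranks = []
for s in [0.70, 0.75, 0.72, 0.80]:
    ranks.append(rank_of_score(s, history))
    history.append(s)  # assumption: the score joins the history after ranking
```

Under these assumptions, a rank of 1 signals a new best validation score, which is the typical trigger for saving a model checkpoint.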

property metric_list: Return the evaluation metric list, which is given in class initialization.

property best_val_score: Return the best validation score of metrics used in this evaluator in the format of {metric: best_val_score}.

property best_test_score: Return the best test score of metrics used in this evaluator in the format of {metric: best_test_score}.

property best_iter_num: Return the best iteration number when the best validation score was achieved for metrics used in this evaluator in the format of {metric: best_iter_num}.

property history

Return the evaluation history of training as a list.

The detailed contents of the list depend on the implementation of each specific Evaluator. For example, GSgnnRegressionEvaluator and GSgnnClassificationEvaluator both use a tuple of validation and test scores as one list element.

property eval_frequency: Return the evaluation frequency, which is given in class initialization.

property task_tracker: Return the task tracker set from the setup_task_tracker() method.

property val_perf_rank_list: Return the validation performance rank list.