GSgnnLinkPredictionTrainer
- class graphstorm.trainer.GSgnnLinkPredictionTrainer(model, topk_model_to_save)
Bases: GSgnnTrainer
A trainer for link prediction.
This is a high-level trainer wrapper that can be used directly to train a link prediction model.
It makes use of the functions provided by GSgnnTrainer to define two main functions: fit that performs the training for the model that is provided when the object is created, and eval that evaluates a provided model against test and validation data.
Parameters
- model: GSgnnLinkPredictionModel
The GNN model for link prediction.
- topk_model_to_save: int
The number of best-performing models (top K) to save.
Example
from graphstorm.dataloading import GSgnnLinkPredictionDataLoader
from graphstorm.dataset import GSgnnEdgeTrainData
from graphstorm.model import GSgnnLinkPredictionModel
from graphstorm.trainer import GSgnnLinkPredictionTrainer

my_dataset = GSgnnEdgeTrainData(
    "my_graph", "/path/to/part_config", train_etypes="edge_type")
target_idx = {"edge_type": target_edges_tensor}
my_data_loader = GSgnnLinkPredictionDataLoader(
    my_dataset, target_idx, fanout=[10], batch_size=1024)
my_model = GSgnnLinkPredictionModel(alpha_l2norm=0.0)

trainer = GSgnnLinkPredictionTrainer(my_model, topk_model_to_save=1)

trainer.fit(my_data_loader, num_epochs=2)
- property device
The device associated with the trainer.
- eval(model, data, val_loader, test_loader, total_steps, edge_mask_for_gnn_embeddings, use_mini_batch_infer=False)
Evaluate the model on the validation and test sets.
Parameters
- model: PyTorch model
The GNN model.
- data: GSgnnEdgeTrainData
The training dataset.
- val_loader: GSNodeDataLoader
The dataloader for validation data.
- test_loader: GSNodeDataLoader
The dataloader for test data.
- total_steps: int
Total number of iterations.
- edge_mask_for_gnn_embeddings: str
The mask that indicates the edges used for computing GNN embeddings.
- use_mini_batch_infer: bool
Whether to use mini-batch inference when computing node embeddings.
Returns
float: The validation score.
- property evaluator
The evaluator associated with the trainer.
- fit(train_loader, num_epochs, val_loader=None, test_loader=None, use_mini_batch_infer=True, save_model_path=None, save_model_frequency=None, save_perf_results_path=None, edge_mask_for_gnn_embeddings='train_mask', freeze_input_layer_epochs=0, max_grad_norm=None, grad_norm_type=2.0)
The fit function for link prediction.
Parameters
- train_loader: GSgnnLinkPredictionDataLoader
The mini-batch sampler for training.
- num_epochs: int
The max number of epochs to train the model.
- val_loader: GSgnnLinkPredictionDataLoader
The mini-batch sampler for computing validation scores. The validation scores are used for selecting models.
- test_loader: GSgnnLinkPredictionDataLoader
The mini-batch sampler for computing test scores.
- use_mini_batch_infer: bool
Whether or not to use mini-batch inference.
- save_model_path: str
The path where the model is saved.
- save_model_frequency: int
The number of training iterations between model saves.
- save_perf_results_path: str
The path of the file where the performance results are saved.
- edge_mask_for_gnn_embeddings: str
The mask that indicates the edges used for computing GNN embeddings for model evaluation. By default, we use the edges in the training graph to compute GNN embeddings for evaluation.
- freeze_input_layer_epochs: int
Freeze the input layer for N epochs. This is commonly used when the input layer contains language models. Default: 0, no freeze.
- max_grad_norm: float
Clip the gradient norm to at most max_grad_norm to ensure stability. Default: None, no clip.
- grad_norm_type: float
The norm type used for gradient clipping. Default: 2.0.
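To make the interplay of max_grad_norm and grad_norm_type concrete, here is a minimal pure-Python sketch of norm-based gradient clipping. It assumes the conventional rule: compute one total p-norm over all gradient values and, if it exceeds the maximum, scale every gradient by the same factor. The helper name clip_by_total_norm and the flat list of floats are hypothetical and chosen for illustration; this is not GraphStorm's implementation.

```python
import math


def clip_by_total_norm(grads, max_grad_norm, grad_norm_type=2.0):
    """Return (clipped gradients, total norm before clipping).

    Hypothetical helper for illustration only; `grads` is a flat,
    non-empty list of floats standing in for all model gradients.
    """
    if math.isinf(grad_norm_type):
        # Infinity norm: the largest absolute gradient value.
        total_norm = max(abs(g) for g in grads)
    else:
        total_norm = sum(abs(g) ** grad_norm_type for g in grads) ** (1.0 / grad_norm_type)

    # max_grad_norm=None mirrors the documented default: no clipping.
    if max_grad_norm is None or total_norm <= max_grad_norm:
        return list(grads), total_norm

    # Scale all gradients by the same factor so the direction is preserved.
    scale = max_grad_norm / total_norm
    return [g * scale for g in grads], total_norm


# Gradients (3, 4) have an L2 norm of 5, so they are scaled by 1/5.
clipped, norm_before = clip_by_total_norm([3.0, 4.0], max_grad_norm=1.0)
```

Under these assumptions, a smaller max_grad_norm shrinks large updates without changing their direction, while grad_norm_type only changes how the size of the update is measured.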
- get_best_model_path()
Return the path of the best model.
- property optimizer
The optimizer associated with the trainer.
- remove_saved_model(epoch, i, save_model_path)
Remove a previously saved model, for example because it is no longer among the best K performing models, or for other reasons.
This function will remove the entire folder.
Parameters
- epoch: int
The training epoch number.
- i: int
The iteration number within the training epoch.
- save_model_path: str
The path where the model is saved.
- restore_model(model_path, model_layer_to_load=None)
Restore a GNN model and the optimizer.
Parameters
- model_path: str
The path where the model and the optimizer states have been saved.
- model_layer_to_load: list of str
The list of model layers to load. Supported layers include 'gnn', 'embed', and 'decoder'.
- save_model(model, epoch, i, save_model_path)
Save the model for a certain iteration in an epoch.
- save_topk_models(model, epoch, i, val_score, save_model_path)
Based on the given val_score, decide whether to save the current model trained in the i-th iteration of the epoch-th epoch.
Parameters
- model: PyTorch model
The GNN model.
- epoch: int
The training epoch number.
- i: int
The iteration number within the training epoch.
- val_score: dict or None
A dictionary containing the scores from the evaluator's validation function. It can be None, which means that either there is no evaluator or no validation is performed. In that case, the score rank is set to 1st so that all models, or the last K models, are saved.
- save_model_path: str
The path where the model is saved.
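The save decision described above can be sketched as a small top-K tracker. This is only an illustration under stated assumptions: the class TopKTracker, its update method, and the (score, epoch, i) records are hypothetical, a higher score is assumed to be better, K is assumed positive, and a single float stands in for the val_score dictionary. It does not reproduce the trainer's internals.

```python
class TopKTracker:
    """Hypothetical tracker that keeps at most K (score, epoch, i) records."""

    def __init__(self, k):
        self.k = k
        self.records = []  # insertion order is preserved

    def update(self, score, epoch, i):
        """Return (should_save, record_to_remove_or_None)."""
        if score is None:
            # No validation score: rank every new model 1st, so it is always
            # saved and the oldest record is evicted (keeps the last K models).
            self.records.append((None, epoch, i))
            removed = self.records.pop(0) if len(self.records) > self.k else None
            return True, removed

        if len(self.records) < self.k:
            # Fewer than K models saved so far: always save.
            self.records.append((score, epoch, i))
            return True, None

        # Assumes higher is better and that all stored scores are numbers.
        worst = min(self.records, key=lambda r: r[0])
        if score <= worst[0]:
            return False, None

        # The new model displaces the current worst of the top K.
        self.records.remove(worst)
        self.records.append((score, epoch, i))
        return True, worst


tracker = TopKTracker(k=2)
decisions = [tracker.update(s, 0, i) for i, s in enumerate([0.5, 0.7, 0.6, 0.4])]
```

In this sketch, the record returned as the second element is the one whose saved folder would then be deleted, which is the role remove_saved_model plays in the trainer.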
- setup_device(device)
Set up the device of this trainer.
The CUDA device is set up based on the local rank.
Parameters
- device :
The device for model training.
- setup_evaluator(evaluator)
Set up the evaluator.
If the evaluator has its own task tracker, the evaluator is set up as is. If the evaluator has no task tracker, this trainer's task tracker is used to set up the evaluator. If the trainer has no task tracker either, a new one is created using the given evaluator's evaluation frequency.
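The three cases above can be sketched with stand-in classes. Everything here is hypothetical and for illustration only: SketchTracker, SketchEvaluator, SketchTrainer, and the attribute names task_tracker and eval_frequency are assumptions, not GraphStorm's actual classes or attributes.

```python
class SketchTracker:
    """Hypothetical task tracker that only remembers a reporting frequency."""

    def __init__(self, frequency):
        self.frequency = frequency


class SketchEvaluator:
    """Hypothetical evaluator with an optional task tracker."""

    def __init__(self, eval_frequency, task_tracker=None):
        self.eval_frequency = eval_frequency
        self.task_tracker = task_tracker


class SketchTrainer:
    """Hypothetical trainer showing only the tracker fallback logic."""

    def __init__(self, task_tracker=None):
        self.task_tracker = task_tracker
        self.evaluator = None

    def setup_evaluator(self, evaluator):
        if evaluator.task_tracker is None:
            if self.task_tracker is None:
                # Neither side has a tracker: create one from the
                # evaluator's evaluation frequency.
                self.task_tracker = SketchTracker(evaluator.eval_frequency)
            # The evaluator borrows the trainer's tracker.
            evaluator.task_tracker = self.task_tracker
        # If the evaluator already has a tracker, it is kept as is.
        self.evaluator = evaluator


trainer = SketchTrainer()
evaluator = SketchEvaluator(eval_frequency=100)
trainer.setup_evaluator(evaluator)
```

After this call, the trainer and the evaluator share one tracker whose frequency comes from the evaluator.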