GSConfig

class graphstorm.config.GSConfig(cmd_args)

Bases: object

GSgnn configuration class.

GSConfig contains all GraphStorm model training and inference configurations, which can either be loaded from a yaml file specified in the --cf argument, or from CLI arguments.

property adversarial_temperature: The temperature hyperparameter of the adversarial cross entropy loss for link prediction tasks. Default is None.

property alpha: Common hyperparameter symbol alpha. Alpha is used in focal loss for binary classification. Default is None.

property alpha_l2norm: Coefficient of the l2 norm of dense parameters. GraphStorm adds a regularization loss, i.e., the l2 norm of dense parameters, to the final loss. It uses alpha_l2norm to re-scale the regularization loss. Specifically, loss = loss + alpha_l2norm * regularization_loss. Default is 0.
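
The formula above can be sketched in plain Python. This is only an illustration: the helper name regularized_loss is hypothetical, and it assumes the regularization loss is the l2 norm taken over all dense parameter values flattened into one vector, which may differ from how GraphStorm computes it internally.

```python
import math


def regularized_loss(task_loss, dense_params, alpha_l2norm=0.0):
    """Return task_loss + alpha_l2norm * regularization_loss.

    Assumption: regularization_loss is the l2 norm of all dense
    parameter values, treated as a single flat vector.
    """
    regularization_loss = math.sqrt(sum(w * w for w in dense_params))
    return task_loss + alpha_l2norm * regularization_loss


# With the default alpha_l2norm of 0, the task loss is returned unchanged.
print(regularized_loss(0.5, [3.0, 4.0]))
print(regularized_loss(0.5, [3.0, 4.0], alpha_l2norm=0.1))
```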

property backend: Distributed training backend. GraphStorm supports gloo and nccl. Default is gloo.

property batch_size: Mini-batch size. It defines the batch size of each trainer. The global batch size equals the number of trainers multiplied by batch_size. For example, suppose we have 2 machines, each of which has 8 GPUs, and set batch_size to 128. The global batch size will be 2 * 8 * 128 = 2048. Must provide.
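
The arithmetic in the example can be written as a small helper. The function name global_batch_size is hypothetical, and the sketch assumes one trainer per GPU, as in the example above.

```python
def global_batch_size(num_machines, gpus_per_machine, batch_size):
    """Compute the global batch size from the per-trainer batch_size."""
    # Assumption: each GPU runs exactly one trainer.
    num_trainers = num_machines * gpus_per_machine
    return num_trainers * batch_size


# 2 machines with 8 GPUs each and batch_size set to 128.
print(global_batch_size(2, 8, 128))
```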

property class_loss_func: Classification loss function. Builtin loss functions include cross_entropy and focal. Default is cross_entropy.

property contrastive_loss_temperature: Temperature of link prediction contrastive loss. This is used to rescale the link prediction positive and negative scores for the loss. Default is 1.0.

property decoder_bias: Decoder bias. decoder_bias must be a boolean. Default is True.

property decoder_edge_feat: A list of edge features that can be used by a decoder to enhance its performance. Default is None.

property decoder_type: The type of edge classification or regression decoders. Built-in decoders include DenseBiDecoder and MLPDecoder. Default is DenseBiDecoder.

property dropout: Dropout probability. Dropout must be a float value in [0,1). Dropout is applied to every GNN layer. Default is 0.

property early_stop_burnin_rounds: Burn-in rounds before starting to check for the early stop condition. Default is 0.

property early_stop_rounds: The number of rounds for validation scores used to decide to stop training early. Default is 3.

property early_stop_strategy: The strategy used to decide whether to stop training early. GraphStorm supports two strategies: 1) consecutive_increase, and 2) average_increase. Default is average_increase.

property edge_feat_mp_op

The operation for using edge features during message passing computation. Default is “concat”.

New in version 0.4.0: The edge_feat_mp_op argument.

GraphStorm supports five message passing operations for edge features, including:

“concat”: concatenate the source node feature with the edge feature, and then pass the result to the destination node.
“add”: add the source node feature and the edge feature together, and then pass the result to the destination node.
“sub”: subtract the edge feature from the source node feature, and then pass the result to the destination node.
“mul”: multiply the source node feature by the edge feature, and then pass the result to the destination node.
“div”: divide the source node feature by the edge feature, and then pass the result to the destination node.
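
The five operations can be illustrated on plain Python lists. This is a sketch only: the helper combine_src_and_edge_feat is hypothetical, GraphStorm itself operates on tensors inside its GNN layers, and the requirement that the element-wise operations receive features of equal size is an assumption made here.

```python
import operator

# Element-wise operators for the four arithmetic options.
_ELEMENTWISE_OPS = {
    "add": operator.add,
    "sub": operator.sub,      # source node feature minus edge feature
    "mul": operator.mul,
    "div": operator.truediv,  # source node feature divided by edge feature
}


def combine_src_and_edge_feat(src_feat, edge_feat, edge_feat_mp_op="concat"):
    """Combine a source node feature with an edge feature into one message."""
    if edge_feat_mp_op == "concat":
        return list(src_feat) + list(edge_feat)
    if edge_feat_mp_op not in _ELEMENTWISE_OPS:
        raise ValueError(f"Unsupported edge_feat_mp_op: {edge_feat_mp_op}")
    # Assumption: element-wise operations need features of the same size.
    if len(src_feat) != len(edge_feat):
        raise ValueError("Element-wise operations need equal feature sizes")
    op = _ELEMENTWISE_OPS[edge_feat_mp_op]
    return [op(s, e) for s, e in zip(src_feat, edge_feat)]


for name in ("concat", "add", "sub", "mul", "div"):
    print(name, combine_src_and_edge_feat([1.0, 2.0], [4.0, 8.0], name))
```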

property edge_feat_name

User provided edge feature names. Default is None.

Changed in version 0.4.0: The edge_feat_name property is supported.

It can be in the following formats:

feat_name: global feature name for all edge types, i.e., for any edge, its corresponding feature name is <feat_name>.
"etype0:feat0","etype1:feat0,feat1",...: different edge types have different edge features under different names. The edge type should be in a canonical edge type, i.e., src_node_type,relation_type,dst_node_type.

This method parses the given edge feature name list, and returns either a string corresponding to a global feature name, or a dictionary that maps different edge types to their different feature names.
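
As a rough illustration of the two formats, the following sketch parses them into a string or a dictionary. The function parse_edge_feat_name is hypothetical and is not the GraphStorm implementation; keying the dictionary by a (src_node_type, relation_type, dst_node_type) tuple is an assumption, and the edge type and feature names in the usage lines are made up.

```python
def parse_edge_feat_name(edge_feat_name):
    """Parse edge feature names into a global name or a per-edge-type dict."""
    if edge_feat_name is None:
        return None
    if isinstance(edge_feat_name, str):
        edge_feat_name = [edge_feat_name]
    # A single entry without ":" is a global feature name.
    if len(edge_feat_name) == 1 and ":" not in edge_feat_name[0]:
        return edge_feat_name[0]

    feats_by_etype = {}
    for entry in edge_feat_name:
        etype_str, sep, feat_str = entry.partition(":")
        if not sep:
            raise ValueError(f"Expected 'etype:feat,...' but got {entry!r}")
        canonical_etype = tuple(part.strip() for part in etype_str.split(","))
        if len(canonical_etype) != 3:
            raise ValueError(f"Not a canonical edge type: {etype_str!r}")
        feats_by_etype[canonical_etype] = [f.strip() for f in feat_str.split(",")]
    return feats_by_etype


print(parse_edge_feat_name("feat"))
print(parse_edge_feat_name(["user,rates,movie:score",
                            "user,follows,user:since,weight"]))
```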

property edge_id_mapping_file: A path to the folder that stores edge ID mapping files generated by the graph partition algorithm. Graph partition will shuffle node IDs and edge IDs according to the node partition assignment. We expect partition algorithms will save edge ID mappings to map new edge IDs to their original edge IDs. GraphStorm assumes edge ID mappings are stored as a single object along with the partition config file.

property eval_batch_size: Mini-batch size for computing GNN embeddings in evaluation. Default is 10000.

property eval_etype: The list of canonical edge types that will be added as evaluation target. If not provided, all edge types will be used as evaluation target. A canonical edge type should be formatted as src_node_type,relation_type,dst_node_type.

property eval_etypes_negative_dstnode

The list of canonical edge types that have hard negative edges constructed by corrupting destination nodes during evaluation.

For each edge type to use different fields to store the hard negatives, the format of the argument is:

eval_etypes_negative_dstnode:
    - src_type,rel_type0,dst_type:negative_nid_field
    - src_type,rel_type1,dst_type:negative_nid_field

or, for all edge types to use the same field to store the hard negatives, the format of the argument is:

eval_etypes_negative_dstnode:
    - negative_nid_field

property eval_fanout: The fanout of each GNN layer used in evaluation and inference. Default is the same as fanout.

property eval_frequency: The frequency of evaluation. GraphStorm trainers evaluate at the end of each epoch. When eval_frequency is set, trainers evaluate once every eval_frequency iterations. Default is to evaluate only at the end of each epoch.

property eval_metric: Evaluation metric(s) used during evaluation. The input can be a string specifying the evaluation metric to report, or a list of strings specifying a list of evaluation metrics to report. The first evaluation metric is treated as the major metric and is used to choose the best trained model. Default values depend on task_type. For classification tasks, the default value is accuracy. For regression tasks, the default value is rmse. For link prediction tasks, the default value is mrr.

property eval_negative_sampler: The negative sampler used for link prediction evaluation. Built-in samplers include uniform, joint, localuniform, all_etype_uniform and all_etype_joint. Default is joint.

property exclude_training_targets: Whether to remove the training targets from the GNN computation graph. Default is True.

property fanout

The fanouts of GNN layers. The values of fanouts must be integers larger than 0. The number of fanouts must equal num_layers. Must provide.

It accepts two formats:

20,10, which defines the number of neighbors to sample per edge type for each GNN layer, with the i_th element being the fanout for the i_th GNN layer.

“etype2:20@etype3:20@etype1:10,etype2:10@etype3:4@etype1:2”, which defines the number of neighbors to sample for different edge types for each GNN layer, with the i_th element being the fanout for the i_th GNN layer.
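
Both fanout formats can be parsed with a few string splits. The sketch below is illustrative only: parse_fanout is a hypothetical helper, not the GraphStorm parser, and it assumes edge type names contain no commas, as in the example string above.

```python
def parse_fanout(fanout, num_layers):
    """Parse a fanout string into one entry per GNN layer.

    A layer is either an int (same fanout for every edge type) or a
    dict mapping an edge type name to its fanout.
    """
    layers = []
    for layer_spec in fanout.split(","):
        if ":" in layer_spec:
            per_etype = {}
            for item in layer_spec.split("@"):
                etype, _, count = item.partition(":")
                per_etype[etype] = int(count)
            layers.append(per_etype)
        else:
            layers.append(int(layer_spec))
    # The number of fanouts must equal num_layers.
    if len(layers) != num_layers:
        raise ValueError(
            f"Got {len(layers)} fanouts but num_layers is {num_layers}")
    return layers


print(parse_fanout("20,10", num_layers=2))
print(parse_fanout(
    "etype2:20@etype3:20@etype1:10,etype2:10@etype3:4@etype1:2",
    num_layers=2))
```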

property fixed_test_size: The number of validation and test data used during link prediction training and evaluation. This is useful for reducing the overhead of doing link prediction evaluation when the graph size is large. Default is None.

property freeze_lm_encoder_epochs: The number of epochs GraphStorm takes to warm up a GNN model before fine-tuning LM models. Default is 0.

property gamma: Common hyperparameter symbol gamma. Default is None.

property gnn_norm: Normalization method for GNN layers. Options include batch or layer. Default is None.

property grad_norm_type: The type of norm used to compute the gradient norm. Default is 2.

property graph_name: Name of the graph, loaded from the --part-config argument.

property hidden_size: The dimension of hidden GNN layers. Must be an integer larger than 0. Default is None.

property imbalance_class_weights

Used to specify a manual rescaling weight given to each class in a single-label multi-class classification task. It is used in imbalanced label use cases. It is fed into th.nn.CrossEntropyLoss. Default is None.

Users should provide the weights in the following format: 0.1,0.2,0.3,0.1, …
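
A weight string in this format can be turned into a list of floats as sketched below. The helper parse_class_weights is hypothetical, and checking the number of weights against num_classes is an assumption based on the property giving one weight to each class.

```python
def parse_class_weights(weights, num_classes):
    """Parse a comma-separated weight string into a list of floats."""
    values = [float(w) for w in weights.split(",")]
    # Assumption: exactly one weight is expected per class.
    if len(values) != num_classes:
        raise ValueError(
            f"Expected {num_classes} weights but got {len(values)}")
    return values


print(parse_class_weights("0.1,0.2,0.3,0.1", num_classes=4))
```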

property infer_all_target_nodes: Whether to force inference to run on all nodes for types specified by target-ntypes, ignoring any mask. Default is False.

property input_activate: Input layer activation function type. Either None or relu. Default is None.

property ip_config: IP config file that contains all IP addresses of instances in a cluster. In the file, each line stores one IP address. Default is None.

property label_field

The field name of labels in graph data. Must provide for classification and regression tasks.

For node classification tasks, GraphStorm uses graph.nodes[target_ntype].data[label_field] to access node labels. For edge classification tasks, GraphStorm uses graph.edges[target_etype].data[label_field] to access edge labels.

property lm_infer_batch_size: Mini-batch size used to do LM model inference. Default is 32.

property lm_train_nodes: Number of nodes used in LM model fine-tuning. Default is 0.

property lm_tune_lr: Learning rate for fine-tuning language models.

property lp_decoder_type: The decoder type for loss function in link prediction tasks. Currently GraphStorm supports dot_product, distmult, transe (transe_l1 and transe_l2), and rotate. Default is distmult.

property lp_edge_weight_for_loss

Edge feature field name for edge weight. The edge weight is used to rescale the positive edge loss for link prediction tasks. Default is None.

The edge weight can be in one of the following formats:

weight_name: global weight name, i.e., if an edge has a weight, the corresponding weight name is weight_name.
"src0,rel0,dst0:weight0","src0,rel0,dst0:weight1",...: different edge types have different edge weights.

property lp_embed_normalizer: Type of normalization method used to normalize node embeddings in link prediction tasks. Currently GraphStorm only supports l2 normalization (l2_norm). Default is None.

property lp_loss_func: Link prediction loss function. Builtin loss functions include cross_entropy and contrastive. Default is cross_entropy.

property lr: Learning rate for dense parameters of input encoders, model encoders, and decoders. Must provide.

property max_distill_step: The maximum training steps for each node type for distillation. Default is 10000.

property max_grad_norm: Maximum gradient clip which limits the magnitude of gradients during training in order to prevent issues like exploding gradients, and to improve the stability and convergence of the training process. Default is None.

property max_seq_len: The maximum sequence length of tokenized textual data for distillation. Default is 1024.

property model_encoder_type: The encoder module used to encode graph data. It can be a GNN encoder or a non-GNN encoder, e.g., language models and MLPs. Default is None.

property multilabel: Whether the task is a multi-label classification task. Used by node classification and edge classification. Default is False.

property multilabel_weights

Used to specify the label weight of each class in a multi-label classification task. It is fed into th.nn.BCEWithLogitsLoss as pos_weight.

The weights should be in the following format 0.1,0.2,0.3,0.1,0.0, … Default is None.

property no_validation: When set to true, will not perform evaluation (validation) during training. Default is False.

property node_feat_name

User provided node feature name. Default is None.

The input can be in the following formats:

feat_name: global feature name for all node types, i.e., for any node, its corresponding feature name is <feat_name>. For example, if node_feat_name is set to feat, GraphStorm will assume every node has a feat feature.
"ntype0:feat0","ntype1:feat0,feat1",...: different node types have different node features with different names. For example, suppose node_feat_name is set to ["user:age","movie:title,genre"]. The user nodes will take age as their features. The movie nodes will take both title and genre as their features. By default, for nodes of the same type, their features are first concatenated into a unified tensor, which is then transformed through an MLP layer.

Changed in version 0.5.0: Since 0.5.0, GraphStorm supports using different MLPs to encode different input node features of the same node. For example, suppose the movie nodes have two features, title and genre. GraphStorm can encode the title feature with the encoder f(x) and encode the genre feature with the encoder g(x).

To use different MLPs for different features of one node type, users can take the following format for node_feat_name: "ntype0:feat0","ntype1:feat0","ntype1:feat1",.... GraphStorm will create an MLP encoder for feat0 of ntype1 and another MLP encoder for feat1 of ntype1.

The return value can be:

None

A string

A dict of list of strings

A dict of list of FeatureGroup
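
The difference between one shared MLP and separate MLPs per feature comes down to how entries are grouped. The sketch below is an illustration only: group_node_feats is a hypothetical helper, and plain lists of feature names stand in for the FeatureGroup objects mentioned above. Each input entry becomes one group, so repeating a node type yields one group (and thus one encoder) per entry.

```python
def group_node_feats(node_feat_name):
    """Group per-node-type feature names; one group per input entry."""
    groups = {}
    for entry in node_feat_name:
        ntype, sep, feat_str = entry.partition(":")
        if not sep:
            raise ValueError(f"Expected 'ntype:feat,...' but got {entry!r}")
        # Features listed in the same entry share one group (one MLP).
        groups.setdefault(ntype, []).append(feat_str.split(","))
    return groups


# title and genre are concatenated and share one encoder.
print(group_node_feats(["user:age", "movie:title,genre"]))
# title and genre each get their own encoder.
print(group_node_feats(["user:age", "movie:title", "movie:genre"]))
```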

property node_id_mapping_file: A path to the folder that stores node ID mapping files generated by the graph partition algorithm. Graph partition will shuffle node IDs and edge IDs according to the node partition assignment. We expect partition algorithms will save node ID mappings to map new node IDs to their original node IDs. GraphStorm assumes node ID mappings are stored as a single object along with the partition config file.

property num_bases: Number of bases used in RGCN weights. Default is -1.

property num_classes: The cardinality of labels in a classification task. Used by node classification and edge classification. Must provide for classification tasks.

property num_decoder_basis: The number of bases for the DenseBiDecoder decoder in edge prediction tasks. Default is 2.

property num_epochs: Number of training epochs. Must be an integer larger than 0 if given. Default is 0.

property num_ffn_layers_in_decoder: Number of extra feedforward neural network layers to be added in the decoder layer. Default is 0.

property num_ffn_layers_in_gnn: Number of extra feedforward neural network layers to be added between GNN layers. Default is 0.

property num_ffn_layers_in_input: Number of extra feedforward neural network layers to be added in the input layer. Default is 0.

property num_heads: Number of attention heads used in RGAT and HGT weights. Default is 4.

property num_layers: Number of GNN layers. Must be an integer larger than 0 if given. Default is 0, which means no GNN layers.

property num_negative_edges: Number of negative edges sampled for each positive edge during training. Default is 16.

property num_negative_edges_eval: Number of negative edges sampled for each positive edge during validation and testing. Default is 1000.

property num_train_hard_negatives

Number of hard negatives to sample for each edge type during training. Default is None.

For each edge type to have a number of hard negatives, the format of the argument is:

num_train_hard_negatives:
    - src_type,rel_type0,dst_type:num_negatives
    - src_type,rel_type1,dst_type:num_negatives

or, for all edge types to have the same number of hard negatives, the format of the argument is:

num_train_hard_negatives:
    - num_negatives
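
Once the YAML list is loaded, the two layouts can be told apart by whether an entry contains a colon. The following sketch assumes the list has already been read into Python; parse_num_hard_negatives is a hypothetical helper, the edge types in the usage lines are made up, and returning a plain int for the shared case is an assumption.

```python
def parse_num_hard_negatives(entries):
    """Return an int shared by all edge types, or a per-edge-type dict."""
    # A single entry without ":" applies to all edge types.
    if len(entries) == 1 and ":" not in str(entries[0]):
        return int(entries[0])

    per_etype = {}
    for entry in entries:
        etype_str, sep, num = entry.rpartition(":")
        if not sep:
            raise ValueError(f"Expected 'src,rel,dst:num' but got {entry!r}")
        per_etype[tuple(etype_str.split(","))] = int(num)
    return per_etype


print(parse_num_hard_negatives([10]))
print(parse_num_hard_negatives(["user,rates,movie:5", "user,likes,movie:3"]))
```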

property out_emb_size: The dimension of embeddings output from the last GNN layer. It will be ignored when num_layers <= 1. Must be an integer larger than 0. Default is None.

property part_config: Path to the graph partition configuration file. Must provide.

property regression_loss_func: Regression loss function. Builtin loss functions include mse and shrinkage. Default is mse.

property remove_target_edge_type

Whether to remove the training target edge type for message passing. Default is True.

If set to True, GraphStorm will set the fanout of the training target edge type to zero. This is only used with edge classification. If the edge classification is to predict the existence of an edge between two nodes, GraphStorm should remove the target edge in the message passing to avoid information leakage. If it is to predict some attributes associated with an edge, GraphStorm may not need to remove the target edge. Because it is unclear which case applies, GraphStorm removes the target edge in message passing by default to be safe.
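
Setting the fanout of the target edge type to zero can be pictured as follows. This is a sketch under assumptions: zero_target_fanout is a hypothetical helper, and the per-layer fanout is represented as a dict from edge type name to fanout, which may not match the internal representation.

```python
def zero_target_fanout(fanout_per_layer, target_etypes):
    """Return a copy of the per-layer fanouts with target edge types set to 0."""
    return [
        {etype: (0 if etype in target_etypes else num)
         for etype, num in layer.items()}
        for layer in fanout_per_layer
    ]


fanout = [{"etype1": 10, "etype2": 20}, {"etype1": 2, "etype2": 10}]
# No neighbors are sampled along etype1, so it takes no part in message passing.
print(zero_target_fanout(fanout, target_etypes={"etype1"}))
```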

property restore_model_layers: GraphStorm model layers to load. Currently, four neural network layers are supported, i.e., node_embed, edge_embed, gnn, and decoder. Default is to restore all four of these layers.

property restore_model_path: A path where GraphStorm model parameters are saved. Default is None.

property restore_optimizer_path: A path storing optimizer status corresponding to GraphML model parameters. Default is None.

property return_proba: Whether to return all the predictions or the maximum prediction in classification tasks. Set True to return predictions and False to return maximum prediction. Default is True.

property reverse_edge_types_map

A list of reverse edge type info. Default is an empty dictionary.

Each entry is in the following format: <head,relation,reverse relation,tail>. For example: ["query,adds,rev-adds,asin", "query,clicks,rev-clicks,asin"].
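
A list in this format can be converted into a dictionary as sketched below. The helper parse_reverse_edge_types is hypothetical, and the layout of the result, mapping (head, relation, tail) to (tail, reverse relation, head), is an assumption about what a reverse edge type map would naturally contain.

```python
def parse_reverse_edge_types(entries):
    """Map each canonical edge type to its reverse canonical edge type."""
    mapping = {}
    for entry in entries:
        head, relation, reverse_relation, tail = entry.split(",")
        # Assumption: the reverse edge goes from tail back to head.
        mapping[(head, relation, tail)] = (tail, reverse_relation, head)
    return mapping


print(parse_reverse_edge_types(
    ["query,adds,rev-adds,asin", "query,clicks,rev-clicks,asin"]))
```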

property save_embed_path: Path to save the generated node embeddings. Default is None.

property save_model_frequency: The number of iterations between model saves. By default, GraphStorm will save models at the end of each epoch if save_model_path is provided. Default is -1, which means only save at the end of each epoch.

property save_model_path: A path to save GraphStorm model parameters and the corresponding optimizer status. Default is None.

property save_perf_results_path: Path for saving performance results. Default is None.

property save_prediction_path: Path to save prediction results. This is used in classification or regression inference. Default is same as the save_embed_path.

property sparse_optimizer_lr: Learning rate for the optimizer corresponding to learnable sparse embeddings. Default is same as lr.

property target_etype: The list of canonical etypes that will be added as training targets in edge classification and regression tasks. If not provided, GraphStorm will assume the input graph is a homogeneous graph and set target_etype to ('_N', '_E', '_N').

property target_ntype: The node type for prediction. By default, GraphStorm will assume the input graph is a homogeneous graph and set target_ntype to _N.

property task_tracker

A task tracker used to formalize and report model performance metrics.

The supported task trackers include SageMaker (sagemaker_task_tracker) and TensorBoard (tensorboard_task_tracker). The user can specify it in the yaml configuration as follows:

basic:
    task_tracker: "tensorboard_task_tracker"

The default is sagemaker_task_tracker, which will log the metrics using the Python logging facility.

For TensorBoard tracker, users can specify a file directory to store the logs by providing the file path information in a format of tensorboard_task_tracker:FILE_PATH. The tensorboard logs will be stored under FILE_PATH.

Changed in version 0.4.1: Add support for tensorboard tracker.
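
The tensorboard_task_tracker:FILE_PATH format can be split into a tracker name and a log path as sketched below. The helper split_task_tracker is hypothetical, the path in the usage line is made up, and returning None when no path is given is an assumption that mirrors the default of task_tracker_logpath.

```python
def split_task_tracker(task_tracker="sagemaker_task_tracker"):
    """Split 'tracker_name:FILE_PATH' into (tracker_name, log_path)."""
    # Only the first ":" separates the name from the path.
    name, _, log_path = task_tracker.partition(":")
    return name, (log_path or None)


print(split_task_tracker("tensorboard_task_tracker:/tmp/tb_logs"))
print(split_task_tracker())
```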

property task_tracker_logpath

A path for a task tracker to store the logs.

SageMaker trackers will ignore this property.

For TensorBoard tracker, users can specify a file directory to store the logs by providing the file path information in a format of tensorboard_task_tracker:FILE_PATH. The task_tracker_logpath will be set to FILE_PATH.

Default: None

New in version 0.4.1.

property task_type: Graph machine learning task type. GraphStorm supported task types include “node_classification”, “node_regression”, “edge_classification”, “edge_regression”, and “link_prediction”. Must provide.

property textual_data_path: The path to load the textual data for distillation. Users need to specify the path of a directory with two sub-directories for the train and val splits. Default is None.

property topk_model_to_save

The number of best-performing GraphStorm models, ranked by validation performance, to save.

If topk_model_to_save is set and save_model_frequency is not set, GraphStorm will try to save models after each epoch and keep at most K models. If save_model_frequency is set, GraphStorm will try to save models every save_model_frequency iterations and keep at most K models.
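
Keeping at most K models can be sketched with a min-heap of validation scores. This is not GraphStorm's bookkeeping: keep_top_k is a hypothetical helper, the checkpoint names are made up, and the sketch assumes a higher validation score is better.

```python
import heapq


def keep_top_k(saved, new_score, new_path, k):
    """Record a new checkpoint and return the path to delete, if any.

    saved is a min-heap of (score, path) pairs, so the worst kept
    model is always the first to be evicted.
    """
    heapq.heappush(saved, (new_score, new_path))
    if len(saved) > k:
        _, evicted_path = heapq.heappop(saved)
        return evicted_path
    return None


saved = []
for score, path in [(0.70, "epoch-0"), (0.80, "epoch-1"), (0.75, "epoch-2")]:
    evicted = keep_top_k(saved, score, path, k=2)
    print(path, "evicts", evicted)
```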

property train_etype: The list of canonical edge types that will be added as training target. If not provided, all edge types will be used as training target. A canonical edge type should be formatted as src_node_type,relation_type,dst_node_type.

property train_etypes_negative_dstnode

The list of canonical edge types that have hard negative edges constructed by corrupting destination nodes during training.

For each edge type to use different fields to store the hard negatives, the format of the argument is:

train_etypes_negative_dstnode:
    - src_type,rel_type0,dst_type:negative_nid_field
    - src_type,rel_type1,dst_type:negative_nid_field

or, for all edge types to use the same field to store the hard negatives, the format of the argument is:

train_etypes_negative_dstnode:
    - negative_nid_field

property train_negative_sampler: The negative sampler used for link prediction training. Built-in samplers include uniform, joint, localuniform, all_etype_uniform and all_etype_joint. Default is uniform.

property use_early_stop: Whether to use early stopping during training. Default is False.

property use_graphbolt: Whether to use GraphBolt in-memory graph representation. See https://docs.dgl.ai/stochastic_training/ for details. Default is False.

property use_mini_batch_infer: Whether to do mini-batch inference or full graph inference. Default is False for link prediction, and True for other tasks.

property use_node_embeddings: Whether to create extra learnable embeddings for nodes. These learnable embeddings will be concatenated with nodes’ own features to form the inputs for model training. Default is False.

property use_self_loop: Whether to include nodes’ own feature as a special relation type. Default is True.

property use_wholegraph_embed: Whether to use WholeGraph to store intermediate embeddings/tensors generated during training or inference, e.g., “cache_lm_emb”, “sparse_emb”, etc. Default is None.

property verbose: Whether to print more runtime information. Default is False.

verify_edge_feat_reconstruct_arguments(): Verify the correctness of arguments for edge feature reconstruction tasks.

property wd_l2norm: Weight decay used by torch.optim.Adam. Default is 0.