GSNodeEncoderInputLayer
- class graphstorm.model.GSNodeEncoderInputLayer(g, feat_size, embed_size, activation=None, dropout=0.0, use_node_embeddings=False, force_no_embeddings=None, num_ffn_layers_in_input=0, ffn_activation=<function relu>, cache_embed=False, use_wholegraph_sparse_emb=False)
Bases: GSNodeInputLayer
The node encoder input layer for all nodes in a heterogeneous graph.
The input layer adds a linear layer on nodes with node features; the linear layer projects the node features into a specified dimension. It also adds learnable embeddings on nodes that do not have features. Users can add learnable embeddings on the nodes with node features by setting use_node_embeddings to True. In this case, the input layer combines the node features with the learnable embeddings and projects them to the specified dimension.
Parameters
- g: DistGraph
The input DGL distributed graph.
- feat_size: dict of int or dict of FeatureGroupSize
The original feature size of each node type in the format of {str: int}. If a node type has multiple feature groups, it is in the format of {str: FeatureGroupSize}.
- embed_size: int
The output embedding size.
- activation: callable
The activation function applied to the output embeddings. Default: None.
- dropout: float
The dropout parameter. Default: 0.0.
- use_node_embeddings: bool
Whether to use learnable embeddings for nodes even when node features are available. Default: False.
- force_no_embeddings: list of str
The list of node types that are forced to not use learnable embeddings. Default: None.
- num_ffn_layers_in_input: int
(Optional) Number of layers of feedforward neural network for each node type in the input layer. Default: 0.
- ffn_activation: callable
The activation function for the feedforward neural networks. Default: relu.
- cache_embed: bool
Whether or not to cache the embeddings. Default: False.
- use_wholegraph_sparse_emb: bool
Whether or not to use WholeGraph to host embeddings for sparse updates. Default: False.
Examples:
    from graphstorm import get_node_feat_size
    from graphstorm.model import GSgnnNodeModel, GSNodeEncoderInputLayer
    from graphstorm.dataloading import GSgnnData

    np_data = GSgnnData(...)

    model = GSgnnNodeModel(alpha_l2norm=0)
    feat_size = get_node_feat_size(np_data.g, "feat")
    # Pass the graph held by the data object; the original snippet used an undefined name g.
    encoder = GSNodeEncoderInputLayer(np_data.g, feat_size,
                                      embed_size=4,
                                      use_node_embeddings=True)
    model.set_node_input_encoder(encoder)
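To make the behavior described above more concrete, the following is a minimal, framework-free sketch of the per-node-type logic, not GraphStorm's implementation. ToyInputLayer, matvec, and rand_matrix are hypothetical names introduced for illustration. The sketch assumes that a feature size of 0 marks a node type without features, and that "combining" features with learnable embeddings means concatenating the two vectors and projecting the result; the real layer may differ on both points.

```python
import random


def matvec(x, w):
    """Multiply a row vector x (length d_in) by a matrix w (d_in x d_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def rand_matrix(rows, cols, rng):
    """Create a small random matrix standing in for learnable weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyInputLayer:
    """Hypothetical stand-in that mimics the documented behavior."""

    def __init__(self, num_nodes, feat_size, embed_size,
                 use_node_embeddings=False, seed=0):
        rng = random.Random(seed)
        self.feat_proj = {}     # ntype -> (feat_dim x embed_size) projection
        self.node_embeds = {}   # ntype -> one learnable vector per node
        self.combine_proj = {}  # ntype -> (2 * embed_size x embed_size) projection
        for ntype, n in num_nodes.items():
            dim = feat_size.get(ntype, 0)  # assumption: 0 means "no features"
            if dim > 0:
                self.feat_proj[ntype] = rand_matrix(dim, embed_size, rng)
            if dim == 0 or use_node_embeddings:
                self.node_embeds[ntype] = rand_matrix(n, embed_size, rng)
            if dim > 0 and use_node_embeddings:
                self.combine_proj[ntype] = rand_matrix(2 * embed_size, embed_size, rng)

    def forward(self, input_feats, input_nodes):
        embs = {}
        for ntype, nids in input_nodes.items():
            out = []
            for row, nid in enumerate(nids):
                if ntype in self.feat_proj:
                    # Nodes with features: project them to embed_size.
                    h = matvec(input_feats[ntype][row], self.feat_proj[ntype])
                    if ntype in self.combine_proj:
                        # Assumed combination: concatenate, then project.
                        h = matvec(h + self.node_embeds[ntype][nid],
                                   self.combine_proj[ntype])
                else:
                    # Featureless nodes: look up the learnable embedding.
                    h = list(self.node_embeds[ntype][nid])
                out.append(h)
            embs[ntype] = out
        return embs


layer = ToyInputLayer(
    num_nodes={"paper": 3, "author": 2},
    feat_size={"paper": 5, "author": 0},  # "author" has no features
    embed_size=4,
    use_node_embeddings=True,
)
embs = layer.forward(
    input_feats={"paper": [[1.0] * 5, [0.5] * 5]},
    input_nodes={"paper": [0, 2], "author": [1]},
)
# Every node type ends up with vectors of length embed_size.
print({ntype: (len(v), len(v[0])) for ntype, v in embs.items()})
# {'paper': (2, 4), 'author': (1, 4)}
```

Both dictionaries passed to forward are keyed by node type, mirroring the {ntype: feats} and {ntype: indexes} formats documented for the forward method.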
- forward(input_feats, input_nodes)
Input layer forward computation.
Parameters
- input_feats: dict of Tensor
The input features in the format of {ntype: feats}.
- input_nodes: dict of Tensor
The input node indexes in the format of {ntype: indexes}.
Returns
- embs: dict of Tensor
The projected node embeddings in the format of {ntype: emb}.
- require_cache_embed()
Whether to cache the embeddings for inference.
If the input layer encoder includes heavy computations, such as BERT computations, it should return True and the inference engine will cache the embeddings from the input layer encoder.
Returns
bool: True if we need to cache the embeddings for inference.
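One way to picture how a caller could act on this flag is sketched below. This is an illustration only and does not reproduce GraphStorm's inference engine: CheapEncoder, HeavyEncoder, and run_inference are hypothetical names, and the per-node dictionary cache is an assumed strategy chosen to keep the example small.

```python
class CheapEncoder:
    """Hypothetical encoder whose forward pass is inexpensive."""

    def __init__(self):
        self.calls = 0  # counts how often forward() actually runs

    def require_cache_embed(self):
        return False

    def forward(self, node_ids):
        self.calls += 1
        return {nid: [float(nid)] for nid in node_ids}


class HeavyEncoder(CheapEncoder):
    """Hypothetical encoder that asks the caller to cache its outputs."""

    def require_cache_embed(self):
        return True


def run_inference(encoder, batches):
    """Recompute every batch, or reuse cached embeddings if the encoder asks for it."""
    cache = {} if encoder.require_cache_embed() else None
    results = []
    for batch in batches:
        if cache is None:
            results.append(encoder.forward(batch))
            continue
        # Only encode nodes that have not been seen before.
        missing = [nid for nid in batch if nid not in cache]
        if missing:
            cache.update(encoder.forward(missing))
        results.append({nid: cache[nid] for nid in batch})
    return results


batches = [[0, 1], [0, 1], [1, 2]]
cheap, heavy = CheapEncoder(), HeavyEncoder()
cheap_out = run_inference(cheap, batches)
heavy_out = run_inference(heavy, batches)
print(cheap.calls, heavy.calls)  # 3 2
```

Both runs return the same embeddings; the caching path simply avoids re-running the expensive forward pass for nodes it has already encoded.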
- get_sparse_params()
Get the sparse parameters of this input layer.
This function is normally called by optimizers to update sparse model parameters, i.e., learnable node embeddings.
Returns
list of Tensors: the sparse embeddings, or empty list if no sparse parameters.
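The sketch below illustrates why an optimizer might want the sparse parameters separately: only the embedding rows touched by a mini-batch need to be updated. It is a toy under stated assumptions, not GraphStorm's optimizer code. ToyLayer and sparse_sgd_step are hypothetical names, and plain SGD on Python lists stands in for whatever sparse optimizer is actually used.

```python
class ToyLayer:
    """Hypothetical layer with an optional table of learnable node embeddings."""

    def __init__(self, has_node_embeddings):
        # One row per node; only present when learnable embeddings are used.
        self.node_embeds = (
            [[1.0, 1.0] for _ in range(4)] if has_node_embeddings else None
        )

    def get_sparse_params(self):
        # Mirrors the documented contract: a list, empty when nothing is sparse.
        return [self.node_embeds] if self.node_embeds is not None else []


def sparse_sgd_step(table, row_grads, lr=0.5):
    """Update only the rows that received gradients in this mini-batch."""
    for row, grad in row_grads.items():
        table[row] = [w - lr * g for w, g in zip(table[row], grad)]


layer = ToyLayer(has_node_embeddings=True)
for table in layer.get_sparse_params():
    # Pretend only node 2 appeared in the mini-batch.
    sparse_sgd_step(table, {2: [1.0, -1.0]})

print(layer.node_embeds[2])  # [0.5, 1.5]
print(layer.node_embeds[0])  # [1.0, 1.0] (untouched)
print(ToyLayer(has_node_embeddings=False).get_sparse_params())  # []
```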
- property in_dims
Return the input feature size, which is given in class initialization.
- property out_dims
Return the number of output dimensions, which is given in class initialization.
- property use_wholegraph_sparse_emb
Return whether or not to use WholeGraph to host embeddings for sparse updates, which is given in class initialization.