HGTEncoder

class graphstorm.model.HGTEncoder(g, hid_dim, out_dim, num_hidden_layers, num_heads, edge_feat_name=None, edge_feat_mp_op='concat', dropout=0.2, norm='layer', num_ffn_layers_in_gnn=0)

Bases: GraphConvEncoder, GSgnnGNNEncoderInterface

Heterogeneous Graph Transformer (HGT) encoder.

The HGTEncoder stacks several HGTLayer instances as its encoding mechanism. It is meant to be set as the model's GNN encoder within GraphStorm.

Changed in version 0.4.1: Added two new arguments, edge_feat_name and edge_feat_mp_op, to support edge features in the HGT encoder.

Parameters

g: DistGraph: The input distributed graph.
hid_dim: int: Hidden dimension size.
out_dim: int: Output dimension size.
num_hidden_layers: int: Number of hidden layers. The total number of GNN layers is num_hidden_layers + 1.
num_heads: int: Number of attention heads.
edge_feat_name: dict of list of str: User-provided edge feature names in the format of {etype1:[feat1, feat2, …], etype2:[…], …}, or None if not provided. Default: None.
edge_feat_mp_op: str: The operation used to combine source node embeddings with edge embeddings during message passing. Options: concat, add, sub, mul, and div. concat concatenates the source node features with the edge features; add adds the edge features to the source node features; sub subtracts the edge features from the source node features; mul multiplies the source node features by the edge features; and div divides the source node features by the edge features. Default: concat.
dropout: float: Dropout rate. Default: 0.2.
norm: str: Normalization method. Options: batch, layer, and None. Default: layer.
num_ffn_layers_in_gnn: int: Number of FFN layers between GNN layers. Default: 0.
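The five edge_feat_mp_op options can be illustrated with a minimal plain-Python sketch. The helper name combine and the sample vectors are hypothetical and exist only for illustration; this is not GraphStorm's implementation, which operates on tensors inside HGTLayer and may apply further transformations around this step.

```python
# Illustrative stand-in for the edge_feat_mp_op options described above.
# `combine` is a hypothetical helper, not a GraphStorm function.

def combine(src, edge, op="concat"):
    """Combine one source node embedding with one edge embedding."""
    elementwise = {
        "add": lambda s, e: s + e,
        "sub": lambda s, e: s - e,   # source minus edge
        "mul": lambda s, e: s * e,
        "div": lambda s, e: s / e,   # source divided by edge
    }
    if op == "concat":
        # Concatenation: output length is len(src) + len(edge).
        return list(src) + list(edge)
    if op not in elementwise:
        raise ValueError(f"unknown op: {op}")
    if len(src) != len(edge):
        raise ValueError("element-wise ops need equal-length vectors")
    return [elementwise[op](s, e) for s, e in zip(src, edge)]


src = [1.0, 2.0]    # assumed source node embedding
edge = [4.0, 8.0]   # assumed edge embedding
results = {op: combine(src, edge, op)
           for op in ("concat", "add", "sub", "mul", "div")}
```

Note that only concat changes the vector length; the other four options are element-wise and, in this sketch, assume the node and edge embeddings have the same size.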

Examples:

# Build model and do full-graph inference on HGTEncoder
from graphstorm import get_node_feat_size
from graphstorm.model import HGTEncoder
from graphstorm.model import MLPEdgeDecoder
from graphstorm.model import GSgnnEdgeModel, GSNodeEncoderInputLayer
from graphstorm.dataloading import GSgnnData
from graphstorm.model import do_full_graph_inference

np_data = GSgnnData(...)
g = np_data.g  # graph handle used by the encoders below

model = GSgnnEdgeModel(alpha_l2norm=0)
feat_size = get_node_feat_size(np_data.g, "feat")
encoder = GSNodeEncoderInputLayer(g, feat_size, 4,
                                  dropout=0,
                                  use_node_embeddings=True)
model.set_node_input_encoder(encoder)

gnn_encoder = HGTEncoder(g,
                         hid_dim=4,
                         out_dim=4,
                         num_hidden_layers=1,
                         num_heads=2,
                         dropout=0.0,
                         norm="layer",
                         num_ffn_layers_in_gnn=0)
model.set_gnn_encoder(gnn_encoder)
model.set_decoder(MLPEdgeDecoder(model.gnn_encoder.out_dims,
                                 3, multilabel=False, target_etype=("n0", "r1", "n1"),
                                 num_ffn_layers=0))

h = do_full_graph_inference(model, np_data)

is_support_edge_feat(): Overrides the GraphConvEncoder method to indicate that HGTEncoder supports edge features.

forward(blocks, n_h, e_hs=None)

HGT encoder forward computation.

Changed in version 0.4.1: Changed the inputs to blocks, n_h and e_hs to support edge features in the HGT encoder.

Parameters

blocks: list of DGL MFGs: Sampled subgraphs as a list of DGL message flow graphs (MFGs). More detailed information about DGL MFGs can be found in the DGL Neighbor Sampling Overview.
n_h: dict of Tensor: Input node features for each node type in the format of {ntype: tensor}.
e_hs: list of dict of Tensor: Input edge features for each edge type in the format of [{etype1: tensor, etype2: tensor, …}, …], or [{}, {}, …] when the input blocks contain no edges. The length of e_hs should equal the number of GNN layers. Default: None.

Returns

h: dict of Tensor: New node embeddings for each node type in the format of {ntype: tensor}.
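The length requirement on e_hs follows from the layer count given above (num_hidden_layers + 1 GNN layers, one dict per layer). A minimal sketch of building and checking such a list follows; the helper names empty_e_hs and check_e_hs are hypothetical and not part of GraphStorm.

```python
# Hypothetical helpers illustrating the expected shape of `e_hs`.

def empty_e_hs(num_hidden_layers):
    """Build the no-edge-feature input: one empty dict per GNN layer."""
    num_gnn_layers = num_hidden_layers + 1
    # A comprehension creates distinct dicts; `[{}] * n` would alias one dict.
    return [{} for _ in range(num_gnn_layers)]


def check_e_hs(e_hs, num_hidden_layers):
    """Return True if `e_hs` is a list of dicts with one entry per GNN layer."""
    if not isinstance(e_hs, list):
        return False
    if len(e_hs) != num_hidden_layers + 1:
        return False
    return all(isinstance(layer_feats, dict) for layer_feats in e_hs)


e_hs = empty_e_hs(num_hidden_layers=1)
ok = check_e_hs(e_hs, num_hidden_layers=1)
too_short = check_e_hs([{}], num_hidden_layers=1)
```

With num_hidden_layers=1, as in the example earlier in this page, e_hs holds two dicts, one for each GNN layer.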