HGTEncoder

class graphstorm.model.HGTEncoder(g, hid_dim, out_dim, num_hidden_layers, num_heads, edge_feat_name=None, edge_feat_mp_op='concat', dropout=0.2, norm='layer', num_ffn_layers_in_gnn=0)

Bases: GraphConvEncoder, GSgnnGNNEncoderInterface

Heterogeneous Graph Transformer (HGT) encoder.

The HGTEncoder stacks several HGTLayer instances as its encoding mechanism. The HGTEncoder should be set as the model's encoder within GraphStorm.

Changed in version 0.4.1: Added two new arguments, edge_feat_name and edge_feat_mp_op, to support edge features in the HGT encoder.

Parameters

g: DistGraph

The input distributed graph.

hid_dim: int

Hidden dimension size.

out_dim: int

Output dimension size.

num_hidden_layers: int

Number of hidden layers. The total number of GNN layers equals num_hidden_layers + 1.

num_heads: int

Number of attention heads.

edge_feat_name: dict of list of str

User-provided edge feature names in the format of {etype1:[feat1, feat2, …], etype2:[…], …}, or None if not provided. Default: None.

edge_feat_mp_op: str

The operation used to combine source node embeddings with edge embeddings during message passing. Options are concat, add, sub, mul, and div: concat concatenates the source node features with the edge features; add adds the edge features to the source node features; sub subtracts the edge features from the source node features; mul multiplies the source node features by the edge features; and div divides the source node features by the edge features. Default: concat.

dropout: float

Dropout rate. Default: 0.2.

norm: str

Normalization method. Options: batch, layer, and None. Default: layer.

num_ffn_layers_in_gnn: int

Number of FFN layers between GNN layers. Default: 0.
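To make the edge_feat_name format and the edge_feat_mp_op options more concrete, here is a minimal sketch in plain Python. It is not GraphStorm's implementation: the edge types, the feature names, the combine helper, and the use of (source, relation, destination) tuples as dictionary keys are all assumptions made for illustration. The sketch only mirrors the behaviour described for each option on small lists.

```python
# Hypothetical edge feature names keyed by edge type.
# The tuple keys and the feature names are assumptions for illustration.
edge_feat_name = {
    ("n0", "r0", "n1"): ["weight", "timestamp"],
    ("n0", "r1", "n1"): ["weight"],
}


def combine(src, edge, op="concat"):
    """Illustrative only: combine a source node vector with an edge vector."""
    if op == "concat":
        # Append the edge features after the source node features.
        return src + edge
    if len(src) != len(edge):
        raise ValueError("element-wise ops need vectors of equal length")
    if op == "add":
        return [s + e for s, e in zip(src, edge)]
    if op == "sub":
        return [s - e for s, e in zip(src, edge)]
    if op == "mul":
        return [s * e for s, e in zip(src, edge)]
    if op == "div":
        return [s / e for s, e in zip(src, edge)]
    raise ValueError(f"unknown op: {op}")


src = [2.0, 4.0]
edge = [1.0, 2.0]
for op in ("concat", "add", "sub", "mul", "div"):
    print(op, combine(src, edge, op))
```

Note that only concat changes the length of the combined vector; the other four options are element-wise and keep it unchanged.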

Examples:

# Build model and do full-graph inference on HGTEncoder
from graphstorm import get_node_feat_size
from graphstorm.model import HGTEncoder
from graphstorm.model import MLPEdgeDecoder
from graphstorm.model import GSgnnEdgeModel, GSNodeEncoderInputLayer
from graphstorm.dataloading import GSgnnData
from graphstorm.model import do_full_graph_inference

np_data = GSgnnData(...)
g = np_data.g  # the underlying graph used by the encoders below

model = GSgnnEdgeModel(alpha_l2norm=0)
feat_size = get_node_feat_size(np_data.g, "feat")
encoder = GSNodeEncoderInputLayer(g, feat_size, 4,
                                  dropout=0,
                                  use_node_embeddings=True)
model.set_node_input_encoder(encoder)

gnn_encoder = HGTEncoder(g,
                         hid_dim=4,
                         out_dim=4,
                         num_hidden_layers=1,
                         num_heads=2,
                         dropout=0.0,
                         norm="layer",
                         num_ffn_layers_in_gnn=0)
model.set_gnn_encoder(gnn_encoder)
model.set_decoder(MLPEdgeDecoder(model.gnn_encoder.out_dims,
                                 3, multilabel=False, target_etype=("n0", "r1", "n1"),
                                 num_ffn_layers=0))  # assumed value: no extra FFN layers in the decoder

h = do_full_graph_inference(model, np_data)

is_support_edge_feat()

Overrides the GraphConvEncoder method to indicate that HGTEncoder supports edge features.

forward(blocks, n_h, e_hs=None)

HGT encoder forward computation.

Changed in version 0.4.1: Changed the inputs to blocks, n_h, and e_hs to support edge features in the HGT encoder.

Parameters

blocks: list of DGL MFGs

Sampled subgraphs as a list of DGL message flow graphs (MFGs). More detailed information about DGL MFGs can be found in DGL Neighbor Sampling Overview.

n_h: dict of Tensor

Input node features for each node type in the format of {ntype: tensor}.

e_hs: list of dict of Tensor

Input edge features for each edge type in the format of [{etype1: tensor, etype2: tensor, …}, …], or [{}, {}, …] when the input blocks carry no edge features. The length of e_hs should equal the number of GNN layers. Default: None.

Returns

h: dict of Tensor

New node embeddings for each node type in the format of {ntype: tensor}.
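
As a sketch of the e_hs layout described above, the helper below builds the all-empty form used when the blocks carry no edge features, with one entry per GNN layer. The helper name empty_edge_feats is hypothetical and not part of GraphStorm; the layer count follows the num_hidden_layers + 1 rule stated for the constructor.

```python
def empty_edge_feats(num_hidden_layers):
    """Return one empty dict per GNN layer (hypothetical helper)."""
    num_gnn_layers = num_hidden_layers + 1
    # Build a fresh dict per layer so the entries are independent objects.
    return [{} for _ in range(num_gnn_layers)]


e_hs = empty_edge_feats(num_hidden_layers=1)
print(len(e_hs), e_hs)
```

A list built this way has the same length as blocks when one block is sampled per GNN layer, which is the shape forward expects for e_hs.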