RelationalAttLayer

class graphstorm.model.RelationalAttLayer(in_feat, out_feat, rel_names, num_heads, *, edge_feat_name=None, edge_feat_mp_op='concat', bias=True, activation=None, self_loop=False, dropout=0.0, num_ffn_layers_in_gnn=0, fnn_activation=<function relu>, norm=None)

Bases: Module

Relational graph attention layer from Relational Graph Attention Networks.

For each relation type, a GATConv computes:

\[h_i^{(l+1)} = \sum_{j\in \mathcal{N}(i)} \alpha_{ij} W^{(l)} h_j^{(l)}\]

where \(\alpha_{ij}\) is the attention score between node \(i\) and node \(j\):

\[ \begin{align}\begin{aligned}\alpha_{ij}^{l} &= \mathrm{softmax_i} (e_{ij}^{l})\\e_{ij}^{l} &= \mathrm{LeakyReLU}\left(\vec{a}^T [W h_{i} \| W h_{j}]\right)\end{aligned}\end{align} \]
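The two formulas above can be traced by hand for a single destination node. The following is a minimal single-head sketch in plain Python, not the layer's actual implementation: the helper names (`leaky_relu`, `matvec`, `gat_aggregate`), the LeakyReLU slope of 0.2, and all numeric values are assumptions chosen for illustration.

```python
import math

def leaky_relu(x, negative_slope=0.2):
    """LeakyReLU on a scalar; the slope 0.2 is an assumed illustrative value."""
    return x if x > 0 else negative_slope * x

def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_aggregate(h_i, neighbors, W, a):
    """Single-head attention update for one destination node i.

    h_i:       feature vector of node i
    neighbors: feature vectors h_j for every j in N(i)
    W:         weight matrix of shape (out_dim, in_dim)
    a:         attention vector of length 2 * out_dim
    """
    Wh_i = matvec(W, h_i)
    Wh_js = [matvec(W, h_j) for h_j in neighbors]
    # e_ij = LeakyReLU(a^T [W h_i || W h_j])
    e = [leaky_relu(sum(x * y for x, y in zip(a, Wh_i + Wh_j))) for Wh_j in Wh_js]
    # alpha_ij: softmax of e_ij over the neighbors j of node i
    e_max = max(e)  # subtract the max for numerical stability
    exp_e = [math.exp(x - e_max) for x in e]
    total = sum(exp_e)
    alpha = [x / total for x in exp_e]
    # h_i^{(l+1)} = sum_j alpha_ij * W h_j
    out_dim = len(Wh_i)
    h_new = [sum(al * Wh_j[k] for al, Wh_j in zip(alpha, Wh_js)) for k in range(out_dim)]
    return alpha, h_new

W = [[1.0, 0.0], [0.0, 1.0]]   # identity weights, for readability
a = [0.5, -0.5, 1.0, 0.0]      # hypothetical attention vector
h_i = [1.0, 2.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
alpha, h_new = gat_aggregate(h_i, neighbors, W, a)
```

The attention weights sum to 1 over the neighborhood, so the new embedding is a convex combination of the transformed neighbor features.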

Note:

Within each relation, messages are aggregated with a multi-head attention network.
Across relations, the per-relation results are simply averaged.
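The cross-relation averaging described in the note can be sketched as follows. This is an illustration in plain Python, not GraphStorm code: the function name `average_across_relations`, the node and edge type names, and the assumption that every relation yields outputs of the same shape for a given destination node type are all hypothetical.

```python
def average_across_relations(per_relation_out):
    """Average the outputs a destination node type receives from several relations.

    per_relation_out: dict mapping a canonical edge type
        (src_ntype, etype, dst_ntype) to a list of per-node output vectors
        for the destination node type. Outputs that share a destination
        node type are assumed to have identical shapes.
    Returns a dict {dst_ntype: list of averaged per-node vectors}.
    """
    # Group the per-relation outputs by destination node type.
    grouped = {}
    for (_, _, dst_ntype), out in per_relation_out.items():
        grouped.setdefault(dst_ntype, []).append(out)

    result = {}
    for dst_ntype, outs in grouped.items():
        n_rel = len(outs)
        # zip(*outs) walks node by node; zip(*node_vecs) walks dimension by dimension.
        result[dst_ntype] = [
            [sum(vals) / n_rel for vals in zip(*node_vecs)]
            for node_vecs in zip(*outs)
        ]
    return result

# Hypothetical per-relation outputs (after the in-relation attention step).
per_relation_out = {
    ("user", "rates", "movie"): [[1.0, 2.0], [3.0, 4.0]],
    ("director", "directs", "movie"): [[3.0, 0.0], [1.0, 2.0]],
    ("movie", "rated-by", "user"): [[5.0, 5.0]],
}
averaged = average_across_relations(per_relation_out)
```

Here the two relations ending in `movie` are averaged element-wise, while `user` receives messages from only one relation and keeps them unchanged.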

Examples:

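# suppose graph g and input_feature (a dict of {ntype: tensor}) are ready,
# and h_dim, self_loop, dropout, num_ffn_layers_in_gnn, fnn_activation,
# and norm are already defined
from graphstorm.model import RelationalAttLayer

# arguments after num_heads are keyword-only
layer = RelationalAttLayer(
        in_feat=h_dim, out_feat=h_dim, rel_names=g.canonical_etypes,
        num_heads=4, self_loop=self_loop,
        dropout=dropout, num_ffn_layers_in_gnn=num_ffn_layers_in_gnn,
        fnn_activation=fnn_activation, norm=norm)
h = layer(g, input_feature)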

Parameters

in_feat: int: Input feature size.
out_feat: int: Output feature size.
rel_names: list of tuple: Relation type list in the format of [('src_ntype1', 'etype1', 'dst_ntype1'), …].
num_heads: int: Number of attention heads.
edge_feat_name: dict of list of str: User-provided edge feature names in the format of {etype1:[feat1, feat2, …], etype2:[…], …}, or None if not provided. Default: None.
edge_feat_mp_op: str: The operation used to combine source node embeddings with edge embeddings in message passing. Options include concat, add, sub, mul, and div. concat concatenates the source node features with the edge features; add adds the edge features to the source node features; sub subtracts the edge features from the source node features; mul multiplies the source node features by the edge features; and div divides the source node features by the edge features. Default: concat.
bias: bool: Whether to add bias. Default: True.
activation: callable: Activation function. Default: None.
self_loop: bool: Whether to include self loop message. Default: False.
dropout: float: Dropout rate. Default: 0.
num_ffn_layers_in_gnn: int: Number of feed-forward network (ffn) layers between GNN layers. Default: 0.
fnn_activation: torch.nn.functional: Activation for the ffn layers. Default: relu.
norm: str: Normalization method. Options: batch, layer, and None. Default: None, meaning no normalization.

Changed in version 0.4.1: Added two new arguments, edge_feat_name and edge_feat_mp_op, to support edge features in RGAT.
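The five edge_feat_mp_op options can be illustrated on a single source node feature and a single edge feature. The following is a plain-Python sketch of the semantics described in the parameter list only: the helper name `combine_src_and_edge` is hypothetical, and the real layer operates on tensors and may apply additional transformations not shown here.

```python
import operator

# Element-wise options; "concat" is handled separately below.
_ELEMENTWISE_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}

def combine_src_and_edge(src_feat, edge_feat, op="concat"):
    """Combine one source node feature vector with one edge feature vector."""
    if op == "concat":
        # Output size is len(src_feat) + len(edge_feat).
        return list(src_feat) + list(edge_feat)
    if op not in _ELEMENTWISE_OPS:
        raise ValueError(f"Unknown edge_feat_mp_op: {op!r}")
    if len(src_feat) != len(edge_feat):
        raise ValueError("Element-wise operations need features of equal size")
    fn = _ELEMENTWISE_OPS[op]
    # Source node feature is always the left operand.
    return [fn(s, e) for s, e in zip(src_feat, edge_feat)]

src = [2.0, 4.0]
edge = [1.0, 2.0]
combined = {op: combine_src_and_edge(src, edge, op)
            for op in ("concat", "add", "sub", "mul", "div")}
```

Only concat changes the feature size; the other four options keep the size of the source node feature, which is why they require the two inputs to match in size in this sketch.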

forward(g, n_h, e_h=None)

RGAT layer forward computation.

Parameters

g: DGLHeteroGraph: Input DGL heterogeneous graph.
n_h: dict of Tensor: Node features for each node type in the format of {ntype: tensor}.
e_h: dict of Tensor: Edge features for each edge type in the format of {etype: tensor}. Default: None.

Returns

dict of Tensor: New node embeddings for each node type in the format of {ntype: tensor}.

Changed in version 0.4.1: Changed the inputs to n_h and e_h to support edge features in the RGAT layer.