RelationalAttLayer

class graphstorm.model.RelationalAttLayer(in_feat, out_feat, rel_names, num_heads, *, edge_feat_name=None, edge_feat_mp_op='concat', bias=True, activation=None, self_loop=False, dropout=0.0, num_ffn_layers_in_gnn=0, fnn_activation=<function relu>, norm=None)

Bases: Module

Relational graph attention layer from Relational Graph Attention Networks.

Each relation type uses a GATConv, which computes:

\[h_i^{(l+1)} = \sum_{j\in \mathcal{N}(i)} \alpha_{ij} W^{(l)} h_j^{(l)}\]

where \(\alpha_{ij}\) is the attention score between node \(i\) and node \(j\):

\[ \begin{align}\begin{aligned}\alpha_{ij}^{(l)} &= \mathrm{softmax_i} (e_{ij}^{(l)})\\e_{ij}^{(l)} &= \mathrm{LeakyReLU}\left(\vec{a}^T [W^{(l)} h_{i}^{(l)} \| W^{(l)} h_{j}^{(l)}]\right)\end{aligned}\end{align} \]
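The two formulas above can be traced by hand for a single destination node. The following is a minimal single-head sketch in plain Python, not GraphStorm's or DGL's implementation. The helper names (`matvec`, `gat_update`), the toy weights, and the LeakyReLU negative slope of 0.2 are assumptions made for illustration.

```python
import math

def leaky_relu(x, negative_slope=0.2):
    # The slope value is an assumption for this sketch.
    return x if x > 0 else negative_slope * x

def matvec(W, h):
    # Multiply matrix W (list of rows) by vector h.
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_update(h_i, neighbors, W, a):
    """Single-head attention update for one destination node i."""
    Wh_i = matvec(W, h_i)
    Wh_js = [matvec(W, h_j) for h_j in neighbors]
    # e_ij = LeakyReLU(a^T [W h_i || W h_j]); list "+" is the concatenation.
    scores = [
        leaky_relu(sum(x * y for x, y in zip(a, Wh_i + Wh_j)))
        for Wh_j in Wh_js
    ]
    # Softmax over the neighbors j of node i (shifted by the max for stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # h_i^{(l+1)} = sum_j alpha_ij * W h_j
    out = [
        sum(al * Wh_j[k] for al, Wh_j in zip(alphas, Wh_js))
        for k in range(len(Wh_i))
    ]
    return alphas, out

# Toy inputs (hypothetical): identity projection, two neighbors.
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0, 0.0, 1.0]
h_i = [1.0, 1.0]
neighbors = [[0.0, 2.0], [0.0, 4.0]]
alphas, h_new = gat_update(h_i, neighbors, W, a)
```

With these toy values the second neighbor has the larger raw score, so it receives the larger attention weight, and the weights sum to 1.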

Note:

  • Within each relation type, messages are aggregated with a multi-head attention network.

  • Across relation types, the per-relation results are combined by averaging.
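The cross-relation step in the notes above can be illustrated with a small sketch. It assumes each relation type has already produced one embedding for the same destination node. The function name `mean_across_relations` and the relation tuples are hypothetical and only show the averaging, not the layer's actual code.

```python
def mean_across_relations(per_relation_outputs):
    """Element-wise mean of the per-relation embeddings of one destination node."""
    n = len(per_relation_outputs)
    dim = len(per_relation_outputs[0])
    return [sum(vec[k] for vec in per_relation_outputs) / n for k in range(dim)]

# Hypothetical per-relation outputs for a single "paper" node.
outputs = {
    ("author", "writes", "paper"): [1.0, 2.0],
    ("paper", "cites", "paper"): [3.0, 6.0],
}
combined = mean_across_relations(list(outputs.values()))
```

Here `combined` is the element-wise mean of the two per-relation vectors.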

Examples:

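# Assume the graph g and the node feature dict input_feature are ready,
# and that h_dim, self_loop, dropout, num_ffn_layers_in_gnn,
# fnn_activation, and norm are already defined.
from graphstorm.model import RelationalAttLayer

layer = RelationalAttLayer(
        in_feat=h_dim, out_feat=h_dim, rel_names=g.canonical_etypes,
        num_heads=4, self_loop=self_loop,
        dropout=dropout, num_ffn_layers_in_gnn=num_ffn_layers_in_gnn,
        fnn_activation=fnn_activation, norm=norm)
h = layer(g, input_feature)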

Parameters

in_feat: int

Input feature size.

out_feat: int

Output feature size.

rel_names: list of tuple

Relation type list in the format of [('src_ntype1', 'etype1', 'dst_ntype1'), …].

num_heads: int

Number of attention heads.

edge_feat_name: dict of list of str

User-provided edge feature names in the format of {etype1: [feat1, feat2, …], etype2: […], …}, or None if not provided. Default: None.

edge_feat_mp_op: str

The operation used to combine source node embeddings with edge embeddings in message passing. Options are concat, add, sub, mul, and div. concat concatenates the source node features with the edge features; add adds the edge features to the source node features; sub subtracts the edge features from the source node features; mul multiplies the source node features by the edge features; and div divides the source node features by the edge features. Default: concat.
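The five options can be compared on a pair of toy vectors. This is a minimal sketch in plain Python under the assumption that add, sub, mul, and div act element-wise on vectors of equal length. The `combine` helper is hypothetical and is not part of GraphStorm.

```python
import operator

# Element-wise operators for every option except "concat".
_ELEMENTWISE = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}

def combine(src, edge, op="concat"):
    """Combine a source node vector with an edge vector (illustrative only)."""
    if op == "concat":
        # Concatenation may join vectors of different lengths.
        return src + edge
    if op not in _ELEMENTWISE:
        raise ValueError(f"unknown op: {op}")
    if len(src) != len(edge):
        raise ValueError("element-wise ops need vectors of equal length")
    return [_ELEMENTWISE[op](s, e) for s, e in zip(src, edge)]

src_feat = [2.0, 4.0]   # hypothetical source node features
edge_feat = [1.0, 2.0]  # hypothetical edge features
results = {
    op: combine(src_feat, edge_feat, op)
    for op in ("concat", "add", "sub", "mul", "div")
}
```

Note that in this sketch concat doubles the message width, while the other four options keep it unchanged.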

bias: bool

Whether to add bias. Default: True.

activation: callable

Activation function. Default: None.

self_loop: bool

Whether to include self loop message. Default: False.

dropout: float

Dropout rate. Default: 0.

num_ffn_layers_in_gnn: int

Number of FFN (feed-forward network) layers between GNN layers. Default: 0.

fnn_activation: torch.nn.functional

Activation function for the FFN layers. Default: relu.

norm: str

Normalization method. Options: batch, layer, and None. Default: None, meaning no normalization.

Changed in version 0.4.1: Added two new arguments, edge_feat_name and edge_feat_mp_op, to support edge features in RGAT.

forward(g, n_h, e_h=None)

RGAT layer forward computation.

Parameters

g: DGLHeteroGraph

Input DGL heterogeneous graph.

n_h: dict of Tensor

Node features for each node type in the format of {ntype: tensor}.

e_h: dict of Tensor

Edge features for each edge type in the format of {etype: tensor}. Default: None.

Returns

dict of Tensor: New node embeddings for each node type in the format of {ntype: tensor}.

Changed in version 0.4.1: Changed the inputs to n_h and e_h to support edge features in the RGAT layer.