GSgnnEdgeTrainData
- class graphstorm.dataloading.GSgnnEdgeTrainData(graph_name, part_config, train_etypes, eval_etypes=None, label_field=None, node_feat_field=None, edge_feat_field=None, decoder_edge_feat=None, lm_feat_ntypes=None, lm_feat_etypes=None)
Bases: GSgnnEdgeData
Edge prediction training data.
The GSgnnEdgeTrainData prepares the data for training edge prediction.
Parameters
- graph_name: str
The graph name.
- part_config: str
The path of the partition configuration file.
- train_etypes: tuple of str or list of tuples
Target edge types for training.
- eval_etypes: tuple of str or list of tuples
Target edge types for evaluation.
- label_field: str
The field for storing labels.
- node_feat_field: str or dict of list of str
Fields to extract node features. It’s a dict if different node types have different feature names.
- edge_feat_field: str or dict of list of str
The field of the edge features. It’s a dict if different edge types have different feature names.
- decoder_edge_feat: str or dict of list of str
Edge features used by the decoder.
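Several of these parameters accept either a single value or a collection. The sketch below is not part of GraphStorm: `normalize_etypes` and `fields_for_type` are hypothetical helpers in plain Python that only illustrate the two accepted shapes of `train_etypes` and of `node_feat_field`. It assumes that a plain string names one feature shared by every type, which the parameter descriptions suggest but do not state outright.

```python
def normalize_etypes(etypes):
    """Return a list of (src_ntype, etype, dst_ntype) tuples.

    Hypothetical helper: accepts a single tuple of str or a list of tuples.
    """
    if isinstance(etypes, tuple):
        return [etypes]
    return [tuple(et) for et in etypes]


def fields_for_type(feat_field, type_name):
    """Return the list of feature names for one node or edge type.

    Hypothetical helper. Assumes a plain str names one feature shared by
    every type, and that a dict maps a type name to a list of feature names.
    """
    if feat_field is None:
        return []
    if isinstance(feat_field, str):
        return [feat_field]
    return list(feat_field.get(type_name, []))


# A single edge type and a list of edge types end up in the same shape.
single = normalize_etypes(('n1', 'e1', 'n2'))
several = normalize_etypes([('n1', 'e1', 'n2'), ('n2', 'e2', 'n1')])

# A shared feature name versus per-type feature names.
shared = fields_for_type('node_feat', 'n1')
per_type = fields_for_type({'n1': ['feat_a', 'feat_b']}, 'n2')
```

The dict form is the one to reach for when node or edge types do not all carry the same feature names.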
Examples
from graphstorm.dataloading import GSgnnEdgeTrainData
from graphstorm.dataloading import GSgnnEdgeDataLoader

# part_config is the path of the partition configuration file.
ep_data = GSgnnEdgeTrainData(graph_name='dummy', part_config=part_config,
                             train_etypes=[('n1', 'e1', 'n2')], label_field='label',
                             node_feat_field='node_feat', edge_feat_field='edge_feat')
ep_dataloader = GSgnnEdgeDataLoader(ep_data, target_idx={"e1":[0]},
                                    fanout=[15, 10], batch_size=128)
- get_edge_feats(input_edges, edge_feat_field, device='cpu')
Get the edge features.
Parameters
- input_edges: Tensor or dict of Tensors
The input edge IDs.
- edge_feat_field: str or dict of list of str
The edge data fields that store the edge features to retrieve.
- device: PyTorch device
The device where the returned edge features are stored.
Returns
dict of Tensors : The returned edge features.
- get_labels(eids, device='cpu')
Get the edge labels.
Parameters
- eids: Tensor or dict of Tensors
The edge IDs.
- device: PyTorch device
The device where the returned edge labels are stored.
Returns
dict of Tensors : The returned edge labels.
- get_node_feat_size()
Get the node feature sizes using the given node_feat_field.
All parameters come from this class’s own attributes.
- Note: If self._node_feat_field is None, i.e., not given, the function returns a dictionary containing all node types in self.g, with every feature size set to 0. If node_feat_field is given, it returns a dictionary that contains only the given node types.
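The two cases in the note can be mocked with plain dictionaries. This is a sketch that follows the wording of the note, not GraphStorm's code: `node_feat_size_sketch` and the `feat_dims` layout are hypothetical, and it assumes that a plain string applies to every node type and that several fields on one node type add up their dimensions.

```python
def node_feat_size_sketch(feat_dims, node_feat_field=None):
    """Mock of the documented behaviour, using plain dicts.

    feat_dims: {ntype: {feature_name: feature_dimension}}, a hypothetical
        stand-in for the node data stored in the graph.
    node_feat_field: None, a str, or {ntype: [feature_name, ...]}.
    """
    if node_feat_field is None:
        # No field given: every node type is listed with size 0.
        return {ntype: 0 for ntype in feat_dims}
    if isinstance(node_feat_field, str):
        # Assumption: a plain str names the same feature on every node type.
        node_feat_field = {ntype: [node_feat_field] for ntype in feat_dims}
    # Assumption: several fields on one node type add up their dimensions.
    return {
        ntype: sum(feat_dims[ntype][name] for name in names)
        for ntype, names in node_feat_field.items()
    }


feat_dims = {'n1': {'feat_a': 16, 'feat_b': 4}, 'n2': {'feat_a': 8}}

no_field = node_feat_size_sketch(feat_dims)
only_n1 = node_feat_size_sketch(feat_dims, {'n1': ['feat_a', 'feat_b']})
shared_name = node_feat_size_sketch(feat_dims, 'feat_a')
```

Here `no_field` lists both node types with size 0, while `only_n1` contains the single key `'n1'`.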
- get_node_feats(input_nodes, device='cpu')
Get the node features.
Parameters
- input_nodes: Tensor or dict of Tensors
The input node IDs.
- device: PyTorch device
The device where the returned node features are stored.
Returns
dict of Tensors : The returned node features.
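The dict-in, dict-out shape of this call can be pictured with a small stand-in. The sketch below uses plain Python lists instead of tensors and a hypothetical `gather_feats_sketch` function with a hypothetical in-memory `feat_store`; the actual method reads features from the graph and places the result on `device`.

```python
def gather_feats_sketch(feat_store, input_ids):
    """Pick feature rows by ID for each node type.

    feat_store: {ntype: [feature_row, ...]}, indexed by node ID (hypothetical).
    input_ids: {ntype: [node_id, ...]}.
    Returns {ntype: [feature_row, ...]} in the order of the requested IDs.
    """
    return {
        ntype: [feat_store[ntype][nid] for nid in ids]
        for ntype, ids in input_ids.items()
    }


feat_store = {
    'n1': [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    'n2': [[1.0], [2.0]],
}
feats = gather_feats_sketch(feat_store, {'n1': [2, 0], 'n2': [1]})
```

Only the node types present in the input appear in the result, one entry per type.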
- prepare_data(g)
Prepare the training, validation and testing edge sets.
It sets up the following class fields:
- self._train_idxs: the edge indices of the local training set.
- self._val_idxs: the edge indices of the local validation set; can be empty.
- self._test_idxs: the edge indices of the local test set; can be empty.
Parameters
- g: Dist DGLGraph
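As a rough picture of what the three fields hold, the sketch below derives index lists from per-edge masks. It is an illustration only: `split_local_edges` and the mask arguments are hypothetical, the text above does not say how the library identifies each set, and the real method works on the local partition of a distributed graph.

```python
def split_local_edges(train_mask, val_mask=None, test_mask=None):
    """Turn per-edge 0/1 masks into lists of edge indices.

    Hypothetical helper. A missing mask yields an empty list, mirroring
    the note that the validation and test sets can be empty.
    """
    def to_idxs(mask):
        if mask is None:
            return []
        return [eid for eid, flag in enumerate(mask) if flag]

    return to_idxs(train_mask), to_idxs(val_mask), to_idxs(test_mask)


# Five local edges: three for training, one for validation, no test mask.
train_idxs, val_idxs, test_idxs = split_local_edges(
    train_mask=[1, 0, 1, 1, 0],
    val_mask=[0, 1, 0, 0, 0],
)
```

In this example the test set is empty, which the field descriptions explicitly allow.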