GSgnnData
- class graphstorm.dataloading.GSgnnData(part_config, node_feat_field=None, edge_feat_field=None, lm_feat_ntypes=None, lm_feat_etypes=None)
Bases: object
The GraphStorm data class.
Parameters
- part_config: str
The path of the partition configuration JSON file.
- node_feat_field: str or dict of list of str
The fields of the node features that will be encoded by GSNodeInputLayer. It is a dict if different node types have different feature names. Default: None.
- edge_feat_field: str or dict of list of str
The fields of the edge features. It is a dict if different edge types have different feature names. This argument is reserved for future use, once the GSEdgeInputLayer is implemented. Default: None.
- lm_feat_ntypes: list of str
The node types that contain text features. Default: None.
- lm_feat_etypes: list of tuples
The edge types that contain text features. Default: None.
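As a minimal sketch of how these constructor arguments fit together, the snippet below assembles them as keyword arguments. The path, node type names, and feature names are hypothetical, and both helper functions are illustrative only. GraphStorm is imported lazily so the argument-building part can be checked without it installed.

```python
def make_data_kwargs():
    """Assemble keyword arguments for GSgnnData. All values are hypothetical."""
    return {
        # Path to the partition configuration JSON file (hypothetical).
        "part_config": "/data/my_graph/my_graph.json",
        # Dict form: node types use different feature names (hypothetical names).
        "node_feat_field": {"paper": ["feat"], "author": ["feat", "age"]},
        # Node types that carry text features (hypothetical).
        "lm_feat_ntypes": ["paper"],
    }


def load_data(kwargs):
    """Construct the data object; requires GraphStorm to be installed."""
    # Imported here so the rest of the sketch runs without GraphStorm.
    from graphstorm.dataloading import GSgnnData
    return GSgnnData(**kwargs)


kwargs = make_data_kwargs()
print(sorted(kwargs))
```

Arguments left out of the dict, such as edge_feat_field and lm_feat_etypes, keep their default of None.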
- property g
The distributed graph loaded using information in the given part_config JSON file.
- property graph_name
The distributed graph’s name extracted from the given part_config JSON file.
- property node_feat_field
The fields of node features given in initialization.
- property edge_feat_field
The fields of edge features given in initialization.
- has_node_feats(ntype)
Test if the specified node type has features.
Parameters
- ntype: str
The node type.
Returns
bool : Whether the node type has features.
- has_edge_feats(etype)
Test if the specified edge type has features.
Parameters
- etype: (str, str, str)
The canonical edge type.
Returns
bool : Whether the edge type has features.
- has_node_lm_feats(ntype)
Test if the specified node type has text features.
Parameters
- ntype: str
The node type.
Returns
bool : Whether the node type has text features.
- has_edge_lm_feats(etype)
Test if the specified edge type has text features.
Parameters
- etype: (str, str, str)
The canonical edge type.
Returns
bool : Whether the edge type has text features.
- get_node_feats(input_nodes, nfeat_fields, device='cpu')
Get the node features of the given input nodes. The feature fields are defined in nfeat_fields.
Changed in version 0.5.0: When nfeat_fields is a dict, its value(s) can be a list of str or a list of FeatureGroup. The return value can be a dict of int or FeatureGroupSize, respectively.
Parameters
- input_nodes: Tensor or dict of Tensors
The input node IDs.
- nfeat_fields: str or dict of [str …] or dict of [FeatureGroup …]
The node feature fields to be extracted. A string represents the feature name. A dictionary indicates that each node type has different node feature names. When the value of a key (node type) is a list of strings, it indicates that the node type has only one group of features. When the value is a list of FeatureGroup, it indicates that the node type has more than one group of features.
- device: PyTorch device
The device where the returned node features are stored.
Returns
dict of Tensors : The returned node features.
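The two plain forms of nfeat_fields (a single string, or a dict of lists of strings) can be illustrated with a small normalization helper. This is a sketch under stated assumptions, not GraphStorm's internal logic: normalize_feat_fields is a hypothetical helper, the node type and feature names are made up, and the FeatureGroup form is not covered.

```python
def normalize_feat_fields(feat_fields, ntypes):
    """Return a {node type: [feature names]} dict for the plain str/dict forms."""
    if isinstance(feat_fields, str):
        # One feature name: assume every listed node type uses it.
        return {ntype: [feat_fields] for ntype in ntypes}
    # Dict form: keep only the requested node types that have an entry.
    return {
        ntype: list(feat_fields[ntype])
        for ntype in ntypes
        if ntype in feat_fields
    }


same_name = normalize_feat_fields("feat", ["paper", "author"])
per_type = normalize_feat_fields({"paper": ["feat", "year"]}, ["paper", "author"])
print(same_name)
print(per_type)
```

In the dict form, a node type with no entry simply has no features requested, which is why "author" is absent from the second result.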
- get_edge_feats(input_edges, efeat_fields, device='cpu')
Get the edge features of the given input edges. The feature fields are defined in efeat_fields.
Parameters
- input_edges: Tensor or dict of Tensors
The input edge IDs.
- efeat_fields: str or dict of [str …]
The edge feature fields to be extracted.
- device: PyTorch device
The device where the returned edge features are stored.
Returns
dict of Tensors : The returned edge features.
- get_blocks_edge_feats(input_blocks, efeat_fields, device='cpu')
Get the edge features of the given input blocks. The feature fields are defined in efeat_fields.
New in version 0.4.0: Added get_blocks_edge_feats to support edge features in message passing.
Parameters
- input_blocks: list of DGLBlock
The input blocks with edge features to be extracted.
- efeat_fields: string or dict of list of strings
The edge feature fields to be extracted.
- device: PyTorch device
The device where the returned edge features are stored.
Returns
- block_edge_input_feats: list of dict of Tensors
The returned edge features for all blocks.
- get_unlabeled_node_set(train_idxs, mask='train_mask')
Get node indexes not having the given mask in the training set.
Parameters
- train_idxs: dict of Tensor
The training set.
- mask: str or list of str
The node feature fields storing the training mask. Default: “train_mask”.
Returns
dict of Tensors : The returned node indexes.
- get_node_train_set(ntypes, mask='train_mask')
Get the training set for the given node types under the given mask.
Parameters
- ntypes: str or list of str
Node types to get the training set.
- mask: str or list of str
The node feature fields storing the training mask. Default: “train_mask”.
Returns
dict of Tensors : The returned training node indexes.
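Conceptually, a mask-based split maps each node type to the IDs whose mask entry is set. The sketch below shows that idea with plain Python lists standing in for tensors. indices_from_mask is a hypothetical helper, the node types and mask values are made up, and anything specific to distributed graphs is ignored.

```python
def indices_from_mask(mask_by_ntype, ntypes):
    """Return {node type: [node IDs whose mask entry is nonzero]}."""
    if isinstance(ntypes, str):
        # Accept a single node type as well as a list of node types.
        ntypes = [ntypes]
    return {
        ntype: [nid for nid, flag in enumerate(mask_by_ntype[ntype]) if flag]
        for ntype in ntypes
    }


# Hypothetical 0/1 training masks, one entry per node.
train_mask = {"paper": [1, 0, 1, 1, 0], "author": [0, 1, 0]}
train_idxs = indices_from_mask(train_mask, "paper")
print(train_idxs)
```

The validation and test splits follow the same pattern with a different mask field.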
- get_node_val_set(ntypes, mask='val_mask')
Get the validation set for the given node types under the given mask.
Parameters
- ntypes: str or list of str
Node types to get the validation set.
- mask: str or list of str
The node feature fields storing the validation mask. Default: “val_mask”.
Returns
dict of Tensors : The returned validation node indexes.
- get_node_test_set(ntypes, mask='test_mask')
Get the test set for the given node types under the given mask.
Parameters
- ntypes: str or list of str
Node types to get the test set.
- mask: str or list of str
The node feature fields storing the test mask. Default: “test_mask”.
Returns
dict of Tensors : The returned test node indexes.
- get_node_infer_set(ntypes, mask='test_mask')
Get inference node set for the given node types under the given mask.
If the mask exists in g.nodes[ntype].data, include only nodes in the mask during inference. If such a mask does not exist, run inference on the entire node set.
Parameters
- ntypes: str or list of str
Node types to get the inference set.
- mask: str or list of str
The node feature fields storing the inference mask. Default: “test_mask”.
Returns
- dict[str, Tensor]:
Mapping from node type to indices of nodes to run inference on.
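The fallback rule described above (use the mask when present, otherwise take every node) can be sketched as follows. Plain dicts and lists stand in for the graph's node data and tensors. infer_set, node_data, and num_nodes are hypothetical names for illustration and do not reflect how GraphStorm stores or partitions data.

```python
def infer_set(node_data, num_nodes, ntypes, mask="test_mask"):
    """Return {node type: [node IDs to run inference on]}."""
    result = {}
    for ntype in ntypes:
        data = node_data.get(ntype, {})
        if mask in data:
            # Mask present: keep only the nodes it selects.
            result[ntype] = [nid for nid, flag in enumerate(data[mask]) if flag]
        else:
            # No mask: fall back to the entire node set.
            result[ntype] = list(range(num_nodes[ntype]))
    return result


# Hypothetical data: "paper" has a test mask, "author" has none.
node_data = {"paper": {"test_mask": [0, 1, 1, 0]}, "author": {}}
num_nodes = {"paper": 4, "author": 3}
infer_idxs = infer_set(node_data, num_nodes, ["paper", "author"])
print(infer_idxs)
```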
- get_edge_train_set(etypes=None, mask='train_mask', reverse_edge_types_map=None)
Get the training set for the given edge types under the given mask.
Parameters
- etypes: list of str
List of edge types to get the training set. If set to None, all the edge types are included. Default: None.
- mask: str or list of str
The edge feature fields storing the training mask. Default: “train_mask”.
- reverse_edge_types_map: dict of tuples
A map for reverse edge types in the format of {(edge type):(reversed edge type)}. Default: None.
Returns
dict of Tensors : The returned training edge indexes.
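To make the {(edge type):(reversed edge type)} format concrete, the sketch below builds such a map and checks that each reversed type swaps the source and destination node types. It assumes keys and values are canonical (source, relation, destination) tuples. The edge type names and the check_reverse_map helper are hypothetical.

```python
def check_reverse_map(reverse_map):
    """Return the (edge type, reversed edge type) pairs that are inconsistent."""
    bad = []
    for etype, rev_etype in reverse_map.items():
        src, _rel, dst = etype
        rev_src, _rev_rel, rev_dst = rev_etype
        # A reversed edge type should go from dst back to src.
        if (rev_src, rev_dst) != (dst, src):
            bad.append((etype, rev_etype))
    return bad


# Hypothetical map with one edge type and its reverse.
reverse_edge_types_map = {
    ("author", "writes", "paper"): ("paper", "written_by", "author"),
}
problems = check_reverse_map(reverse_edge_types_map)
print(problems)
```

A check like this is useful before passing the map to the edge split methods, since a mismatched pair is easy to introduce by hand.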
- get_edge_val_set(etypes=None, mask='val_mask', reverse_edge_types_map=None)
Get the validation set for the given edge types under the given mask.
Parameters
- etypes: list of str
List of edge types to get the validation set. If set to None, all the edge types are included. Default: None.
- mask: str or list of str
The edge feature field storing the val mask. Default: “val_mask”.
- reverse_edge_types_map: dict
A map for reverse edge types in the format of {(edge type):(reversed edge type)}. Default: None.
Returns
dict of Tensors : The returned validation edge indexes.
- get_edge_test_set(etypes=None, mask='test_mask', reverse_edge_types_map=None)
Get the test set for the given edge types under the given mask.
Parameters
- etypes: list of str
List of edge types to get the test set. If set to None, all the edge types are included. Default: None.
- mask: str or list of str
The edge feature field storing the test mask. Default: “test_mask”.
- reverse_edge_types_map: dict
A map for reverse edge types in the format of {(edge type):(reversed edge type)}. Default: None.
Returns
dict of Tensors : The returned test edge indexes.
- get_edge_infer_set(etypes=None, mask='test_mask', reverse_edge_types_map=None)
Get the inference set for the given edge types under the given mask.
If the mask exists in g.edges[etype].data, the inference set is collected based on the mask. If the mask does not exist, the entire edge set is treated as the inference set.
Parameters
- etypes: list of str
List of edge types to get the inference set. If set to None, all the edge types are included. Default: None.
- mask: str or list of str
The edge feature field storing the inference mask. Default: “test_mask”.
- reverse_edge_types_map: dict
A map for reverse edge types in the format of {(edge type):(reversed edge type)}. Default: None.
Returns
dict of Tensors : The returned inference edge indexes.