Environment Setup
This guide will walk you through the process of installing GraphStorm, based on your specific use scenario.
GraphStorm supports three environment setup methods:
- Install GraphStorm in your local Python environment. This method is ideal for model development and testing on a single machine.
- Set up a GraphStorm Docker image. This method allows you to work in a reproducible environment, and can naturally be expanded to use GraphStorm in a distributed multi-machine environment.
- Run GraphStorm jobs on SageMaker. This method makes it easy to run distributed jobs on massive graphs without worrying about infrastructure setup and management, allowing you to focus on your business problems.
Set up GraphStorm in your local Python environment
Prerequisites
Linux OS: The current version of GraphStorm supports Linux-based operating systems. GraphStorm has been tested on Ubuntu 20.04, 22.04 and Amazon Linux 2023.
Python3: The current version of GraphStorm requires Python version 3.8 to 3.11.
(Optional) GraphStorm supports Nvidia GPUs.
Install Dependencies
GraphStorm requires PyTorch>=1.13.0 and DGL>=1.1.3.
We recommend using PyTorch v2.3.0 and DGL v2.3.0 for best compatibility.
For users who have to use older DGL versions, please refer to installing GraphStorm with DGL 1.1.3.
For an Nvidia GPU environment, you can install PyTorch and DGL using:
# for CUDA 11
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu118
pip install dgl==2.3.0+cu118 -f https://data.dgl.ai/wheels/torch-2.3/cu118/repo.html
# for CUDA 12
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121
pip install dgl==2.3.0+cu121 -f https://data.dgl.ai/wheels/torch-2.3/cu121/repo.html
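The CUDA-specific wheel locations above follow a predictable pattern based on the CUDA tag (cu118, cu121) and the PyTorch minor version. As an illustration of that pattern only (a hypothetical helper, not part of GraphStorm or pip):

```python
# Hypothetical helpers illustrating the URL pattern of the pip commands
# above; they are not part of GraphStorm. The CUDA tag is e.g. "cu118",
# "cu121", or "cpu"; torch_minor is e.g. "2.3".

def torch_index_url(cuda_tag: str) -> str:
    """Index URL passed to `pip install torch --index-url ...`."""
    return f"https://download.pytorch.org/whl/{cuda_tag}"

def dgl_wheel_repo(torch_minor: str, cuda_tag: str) -> str:
    """Find-links URL passed to `pip install dgl -f ...`."""
    if cuda_tag == "cpu":
        return f"https://data.dgl.ai/wheels/torch-{torch_minor}/repo.html"
    return f"https://data.dgl.ai/wheels/torch-{torch_minor}/{cuda_tag}/repo.html"

print(torch_index_url("cu118"))
# https://download.pytorch.org/whl/cu118
print(dgl_wheel_repo("2.3", "cu118"))
# https://data.dgl.ai/wheels/torch-2.3/cu118/repo.html
```

This also makes it easy to see which parts to swap out for a different CUDA version.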
For a CPU-only environment, use:
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu
pip install dgl==2.3.0 -f https://data.dgl.ai/wheels/torch-2.3/repo.html
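Since GraphStorm requires PyTorch>=1.13.0 and DGL>=1.1.3, a quick post-install sanity check can confirm the installed versions meet those minimums. The sketch below compares release numbers numerically rather than lexically; the helper functions are illustrative and not part of GraphStorm:

```python
# Illustrative sketch (not part of GraphStorm): check that installed
# PyTorch/DGL versions satisfy the minimums torch>=1.13.0, dgl>=1.1.3.

def version_tuple(version: str) -> tuple:
    """Parse '2.3.0+cu118' -> (2, 3, 0), ignoring any local build suffix."""
    release = version.split("+")[0]
    return tuple(int(part) for part in release.split(".")[:3])

def meets_minimum(installed: str, minimum: str) -> bool:
    return version_tuple(installed) >= version_tuple(minimum)

# A CUDA build of torch still satisfies the minimum:
assert meets_minimum("2.3.0+cu118", "1.13.0")
# Numeric comparison matters: "1.12.1" < "1.13.0" even though
# a plain string comparison would get this wrong for e.g. "1.9" vs "1.13".
assert not meets_minimum("1.12.1", "1.13.0")
```

In practice you would pass `torch.__version__` and `dgl.__version__` as the `installed` arguments.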
Install GraphStorm
After you install PyTorch and DGL, use pip to install GraphStorm:
pip install graphstorm
Clone GraphStorm codebase (Optional)
The GraphStorm repository includes a set of scripts, tools, and examples, which can facilitate the use of the framework.
- graphstorm/training_scripts/ and graphstorm/inference_scripts/ include example configuration YAML files that are used in the GraphStorm documentation and tutorials, and can be used as a starting point for your own training configuration.
- graphstorm/examples includes use-case-specific examples, such as temporal graph learning, using SageMaker Pipelines, or performing graph-level predictions with GraphStorm.
- graphstorm/tools includes utilities for GraphStorm, such as data sanity checks for partitioned graph data.
- graphstorm/sagemaker has fully fledged launch scripts to help you run GraphStorm jobs on Amazon SageMaker and create and execute SageMaker Pipelines.
You can clone the GraphStorm repository to get access to these tools and examples:
git clone https://github.com/awslabs/graphstorm.git
Set up the GraphStorm Docker Environment
Running GraphStorm within a Docker container will allow you to have a reproducible environment to run examples without affecting your local environment.
Prerequisites
1. Docker: You need to install Docker in your environment by following the Docker documentation.
Using Docker’s convenience script you can install Docker on a Linux machine:
sudo apt update
sudo apt install -y ca-certificates curl
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh ./get-docker.sh --dry-run # Preview the commands
# Run the installation once ready
# sudo sh ./get-docker.sh
Note
After installing Docker, you may need to add your user to the docker group to run Docker commands without sudo:
sudo usermod -aG docker $USER
# Log out and back in for the changes to take effect
2. (Optional) GraphStorm supports Nvidia GPUs for GPU-based training and inference. To launch containers with GPU support you need the Nvidia Container Toolkit. If using the AWS Deep Learning AMI, the Nvidia Container Toolkit comes preinstalled.
Build a GraphStorm Docker image
Set up AWS access
To build and push the image to the Amazon Elastic Container Registry (ECR), you need the aws-cli and valid AWS credentials.
To install the AWS CLI you can use:
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
To set up credentials for use with aws-cli, see the AWS docs.
Your executing role needs full ECR access so that it can pull from ECR to build the image, create an ECR repository if one doesn't exist, and push the GraphStorm image to that repository. See the official ECR docs for details.
Building the GraphStorm images using Docker
With Docker installed, and your AWS credentials set up, you can use the provided scripts in the graphstorm/docker directory to build the image.
GraphStorm supports Amazon SageMaker and EC2/local execution environments, so you need to choose which image you want to build first.
The build_graphstorm_image.sh script can build the image locally and tag it. It only requires providing the intended execution environment, using the -e/--environment argument. The supported environments are sagemaker, to run jobs on Amazon SageMaker, and local, to run jobs on local instances, like a custom cluster of EC2 instances.
For example, you can use the following commands to build the local image with GPU support:
git clone https://github.com/awslabs/graphstorm.git
cd graphstorm
bash docker/build_graphstorm_image.sh --environment local
The above will use the local Dockerfile for GraphStorm, build an image, and tag it as graphstorm:local-gpu.
The script also supports other arguments to customize the image name, tag and other aspects of the build. We list the full argument list below:
- -x, --verbose: Print script debug info (set -x).
- -e, --environment: Image execution environment. Must be one of ‘local’ or ‘sagemaker’. Required.
- -d, --device: Device type, must be one of ‘cpu’ or ‘gpu’. Default is ‘gpu’.
- -p, --path: Path to the graphstorm root directory; default is one level above the script’s location.
- -i, --image: Docker image name; default is ‘graphstorm’.
- -s, --suffix: Suffix for the image tag, can be used to push custom image tags. Default is “<environment>-<device>”, e.g. sagemaker-gpu.
- -b, --build: Docker build directory prefix; default is /tmp/graphstorm-build/docker.
- --use-parmetis: When this flag is set, the ParMETIS dependencies are added to the image. ParMETIS is an advanced distributed graph partitioning algorithm designed to minimize communication time during GNN training.
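Putting the naming rules above together: the image tag defaults to "<environment>-<device>", with any -s/--suffix value appended to it. A small sketch of that convention (illustrative only, not the script's actual code):

```python
# Illustrative sketch of build_graphstorm_image.sh's image naming
# convention; this is not the script's actual code.

def image_tag(environment: str, device: str = "gpu", suffix: str = "") -> str:
    """Default tag is '<environment>-<device>', plus an optional suffix."""
    assert environment in ("local", "sagemaker")
    assert device in ("cpu", "gpu")
    return f"{environment}-{device}{suffix}"

def image_name(tag: str, image: str = "graphstorm") -> str:
    return f"{image}:{tag}"

print(image_name(image_tag("local")))
# graphstorm:local-gpu
print(image_name(image_tag("local", "cpu", suffix="-parmetis")))
# graphstorm:local-cpu-parmetis
```

These two outputs correspond to the default GPU build and the ParMETIS CPU build shown in the examples in this section.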
For example, you can build an image that supports CPU-only execution using:
bash docker/build_graphstorm_image.sh --environment local --device cpu
# Will build an image named 'graphstorm:local-cpu'
Or to build and tag an image to run ParMETIS with EC2 instances:
bash docker/build_graphstorm_image.sh --environment local --device cpu --use-parmetis --suffix "-parmetis"
# Will build an image named 'graphstorm:local-cpu-parmetis'
See bash docker/build_graphstorm_image.sh --help for more information.
Launch a GraphStorm Container
Once you have built the image, you can launch a local container to run test jobs.
If your host has access to a GPU run the following command:
docker run --gpus all --network=host --rm -v /dev/shm:/dev/shm/ -it --name gs-test graphstorm:local-gpu /bin/bash
Or if using a CPU-only host:
docker run --network=host -v /dev/shm:/dev/shm/ --rm -it --name gs-test graphstorm:local-cpu /bin/bash
This command will create a GraphStorm container named gs-test and attach a bash shell to it.
If successful, the command prompt will change to the container's, like root@<ip-address>:/#
Note
Notice that we also assign the host's shared memory volume to the container using -v /dev/shm:/dev/shm/. GraphStorm uses shared memory to host graph data, so it is important that you allocate enough shared memory to the container. You can also set the shared memory size explicitly, e.g. using --shm-size 4gb.
Note
If you are planning to run GraphStorm in a local cluster, specific instructions for running GraphStorm with an NFS shared filesystem are given in Use GraphStorm in a Distributed Cluster.
Push the image to Amazon Elastic Container Registry (ECR)
Once you build the image, you can use the push_graphstorm_image.sh script to push it to an Amazon ECR repository. ECR allows you to easily store, manage, and deploy container images. This will allow you to use the image in SageMaker jobs through SageMaker Bring-Your-Own-Container, or to launch EC2 clusters.
The script requires you to provide the intended execution environment again, using the -e/--environment argument. By default it will create a repository named graphstorm in the us-east-1 region, on the default AWS account aws-cli is configured for. It tags the image as <environment>-<device>, creates a new ECR repository if one doesn't exist, and pushes the image to it.
In addition to -e/--environment, the script supports several optional arguments; for a full list, use bash push_graphstorm_image.sh --help. We list the most important below:
- -e, --environment: Image execution environment. Must be one of ‘local’ or ‘sagemaker’. Required.
- -a, --account: AWS Account ID to use; by default we try to retrieve it from the AWS CLI configuration.
- -r, --region: AWS Region to push the image to; by default we retrieve it from the AWS CLI configuration.
- -d, --device: Device type, must be one of ‘cpu’ or ‘gpu’. Default is ‘gpu’.
- -p, --path: Path to the graphstorm root directory; default is one level above the script’s location.
- -i, --image: Docker image name; default is ‘graphstorm’.
- -s, --suffix: Suffix for the image tag, can be used to push custom image tags. Default is “<environment>-<device>”, e.g. sagemaker-gpu.
- -x, --verbose: Print script debug info (set -x).
Examples:
# Push an image to '123456789012.dkr.ecr.us-east-1.amazonaws.com/graphstorm:local-cpu'
bash docker/push_graphstorm_image.sh -e local -r "us-east-1" --account "123456789012" --device cpu
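The destination URI in the example above follows ECR's standard repository URI layout, which the script assembles from the account, region, image name, and tag. As a sketch of that layout (an illustrative helper, not part of push_graphstorm_image.sh):

```python
# Illustrative helper (not part of push_graphstorm_image.sh) showing how
# an ECR image URI is composed:
#   <account>.dkr.ecr.<region>.amazonaws.com/<image>:<tag>

def ecr_image_uri(account: str, region: str,
                  image: str = "graphstorm", tag: str = "local-cpu") -> str:
    return f"{account}.dkr.ecr.{region}.amazonaws.com/{image}:{tag}"

print(ecr_image_uri("123456789012", "us-east-1"))
# 123456789012.dkr.ecr.us-east-1.amazonaws.com/graphstorm:local-cpu
```

Changing the -r/--region or -s/--suffix arguments changes the corresponding parts of this URI.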