Environment Setup

This guide walks you through installing GraphStorm, depending on how you plan to use it.

GraphStorm supports three environment setup methods:
  • Install GraphStorm to your local Python environment. This method is ideal for model development and testing on a single machine.

  • Set up a GraphStorm Docker image. This method gives you a reproducible environment, and extends naturally to running GraphStorm in a distributed multi-machine environment.

  • Run GraphStorm jobs on SageMaker. This method makes it easy to run distributed jobs on massive graphs without worrying about infrastructure setup and management, allowing you to focus on your business problems.

Set up GraphStorm in your local Python environment

Prerequisites

  1. Linux OS: The current version of GraphStorm supports Linux-based operating systems. GraphStorm has been tested on Ubuntu 20.04, 22.04 and Amazon Linux 2023.

  2. Python3: The current version of GraphStorm requires Python version 3.8 to 3.11.

  3. (Optional) GraphStorm supports Nvidia GPUs.
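The prerequisites above can be checked before installing anything. The following is a minimal sketch; python_version_supported and is_linux are hypothetical helper names introduced here for illustration and are not part of GraphStorm. It only encodes the Python 3.8 to 3.11 range and the Linux requirement stated above.

```shell
# Hypothetical helper (not part of GraphStorm): returns 0 when a
# "major.minor[.patch]" version string is within the 3.8 to 3.11 range.
python_version_supported() {
    local version="$1"
    local major="${version%%.*}"
    local rest="${version#*.}"
    local minor="${rest%%.*}"
    [ "$major" -eq 3 ] && [ "$minor" -ge 8 ] && [ "$minor" -le 11 ]
}

# Hypothetical helper: returns 0 when the kernel name is Linux.
# Accepts an explicit name for testing, defaults to `uname -s`.
is_linux() {
    [ "${1:-$(uname -s)}" = "Linux" ]
}

# Check the interpreter found on PATH, if there is one.
if command -v python3 >/dev/null 2>&1; then
    current="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    if python_version_supported "$current"; then
        echo "Python $current is in the supported range"
    else
        echo "Python $current is outside the supported range (3.8 to 3.11)"
    fi
fi
```

The version bounds are passed as plain strings so the check can also be applied to an interpreter other than the one on PATH.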

Install Dependencies

GraphStorm requires PyTorch>=1.13.0 and DGL>=1.1.3. We recommend PyTorch v2.3.0 and DGL v2.3.0 for best compatibility. If you have to use an older DGL version, refer to the instructions for installing GraphStorm with DGL 1.1.3.

For an Nvidia GPU environment, you can install PyTorch and DGL using:

# for CUDA 11
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu118
pip install dgl==2.3.0+cu118 -f https://data.dgl.ai/wheels/torch-2.3/cu118/repo.html

# for CUDA 12
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121
pip install dgl==2.3.0+cu121 -f https://data.dgl.ai/wheels/torch-2.3/cu121/repo.html

For a CPU-only environment, use:

pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu
pip install dgl==2.3.0 -f https://data.dgl.ai/wheels/torch-2.3/repo.html
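The three variants above differ only in the wheel index they point to. As a rough sketch, the selection could be wrapped in a small function; gs_dependency_commands is a hypothetical name used here for illustration, and the function only prints the commands listed above so you can review them before running anything.

```shell
# Hypothetical helper (not part of GraphStorm): print the pip commands
# listed above for a given build flavor: cu118, cu121, or cpu.
gs_dependency_commands() {
    local flavor="$1"
    local torch_version="2.3.0"
    local dgl_version="2.3.0"
    case "$flavor" in
        cu118|cu121)
            echo "pip install torch==${torch_version} --index-url https://download.pytorch.org/whl/${flavor}"
            echo "pip install dgl==${dgl_version}+${flavor} -f https://data.dgl.ai/wheels/torch-2.3/${flavor}/repo.html"
            ;;
        cpu)
            echo "pip install torch==${torch_version} --index-url https://download.pytorch.org/whl/cpu"
            echo "pip install dgl==${dgl_version} -f https://data.dgl.ai/wheels/torch-2.3/repo.html"
            ;;
        *)
            echo "unknown flavor: ${flavor} (expected cu118, cu121, or cpu)" >&2
            return 1
            ;;
    esac
}

# Print the commands for CUDA 12; nothing is installed by this call.
gs_dependency_commands cu121
```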

Install GraphStorm

After you install PyTorch and DGL, use pip to install GraphStorm:

pip install graphstorm
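After installation, it can be useful to confirm that the three packages are visible to your interpreter. The sketch below assumes python3 on your PATH is the interpreter you installed into; python_module_available is a hypothetical helper name, not a GraphStorm command. It locates each module without importing it, so it does not exercise GPU or DGL initialization.

```shell
# Hypothetical helper: returns 0 when the named module can be located by
# the python3 interpreter on PATH, without importing it.
python_module_available() {
    python3 -c 'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1" 2>/dev/null
}

# Report which of the expected packages are visible.
for module in torch dgl graphstorm; do
    if python_module_available "$module"; then
        echo "found: $module"
    else
        echo "missing: $module"
    fi
done
```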

Clone GraphStorm codebase (Optional)

The GraphStorm repository includes a set of scripts, tools, and examples, which can facilitate the use of the framework.

  • graphstorm/training_scripts/ and graphstorm/inference_scripts/ include example configuration yaml files that are used in the GraphStorm documentation and tutorials, and can serve as a starting point for your own training configuration.

  • graphstorm/examples includes use-case specific examples, such as temporal graph learning, using SageMaker Pipelines, or performing graph-level predictions with GraphStorm.

  • graphstorm/tools includes utilities for GraphStorm, such as data sanity checks for partitioned graph data.

  • graphstorm/sagemaker has fully-fledged launch scripts to help you run GraphStorm jobs on Amazon SageMaker and create and execute SageMaker Pipelines.

You can clone the GraphStorm repository to get access to these tools and examples:

git clone https://github.com/awslabs/graphstorm.git

Set up GraphStorm Docker Environment

Running GraphStorm within a Docker container will allow you to have a reproducible environment to run examples without affecting your local environment.

Prerequisites

1. Docker: Install Docker in your environment by following the official Docker documentation.

On a Linux machine, you can install Docker using Docker’s convenience script:

sudo apt update
sudo apt install -y ca-certificates curl
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh ./get-docker.sh --dry-run # Preview the commands
# Run the installation once ready
# sudo sh ./get-docker.sh

Note

After installing Docker, you may need to add your user to the docker group to run Docker commands without sudo:

sudo usermod -aG docker $USER
# Log out and back in for the changes to take effect

2. (Optional) GraphStorm supports Nvidia GPUs for GPU-based training and inference. To launch containers with GPU support you need the Nvidia Container Toolkit. If using the AWS Deep Learning AMI, the Nvidia Container Toolkit comes preinstalled.
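To check whether the docker group change from the note above has taken effect in your current session, you can inspect the groups reported by id. This is a minimal sketch; in_group is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: returns 0 when the group name appears in a
# space-separated group list (defaults to the current user's groups).
in_group() {
    local wanted="$1"
    local group_list="${2:-$(id -nG)}"
    local g
    for g in $group_list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

if in_group docker; then
    echo "current user is in the docker group"
else
    echo "current user is not in the docker group for this session"
fi
```

Because group membership is read at login, the check keeps reporting the old state until you log out and back in.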

Build a GraphStorm Docker image

Set up AWS access

To build the image and push it to the Amazon Elastic Container Registry (ECR), you need the aws-cli and valid AWS credentials.

To install the AWS CLI you can use:

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

To set up credentials for use with aws-cli see the AWS docs.

Your executing role should have full ECR access to be able to pull from ECR to build the image, create an ECR repository if it doesn’t exist, and push the GraphStorm image to the repository. See the official ECR docs for details.
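Before building, you may want to confirm that the AWS CLI can resolve your credentials to an account. The sketch below uses the aws sts get-caller-identity command; the function names looks_like_account_id and gs_check_aws_identity are hypothetical and introduced here for illustration. It does not verify ECR permissions, only that credentials are present and valid.

```shell
# Returns 0 when the argument is exactly 12 digits, the AWS account ID format.
looks_like_account_id() {
    case "$1" in
        ""|*[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -eq 12 ]
}

# Hypothetical helper: ask AWS STS which account the active credentials
# belong to, and report whether the lookup succeeded.
gs_check_aws_identity() {
    if ! command -v aws >/dev/null 2>&1; then
        echo "aws CLI not found on PATH" >&2
        return 1
    fi
    local account
    account="$(aws sts get-caller-identity --query Account --output text 2>/dev/null)" || account=""
    if looks_like_account_id "$account"; then
        echo "credentials resolve to account ${account}"
    else
        echo "could not resolve an AWS account; check your credentials" >&2
        return 1
    fi
}
```

Calling gs_check_aws_identity after configuring credentials should print the account that the build and push scripts would use by default.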

Building the GraphStorm images using Docker

With Docker installed, and your AWS credentials set up, you can use the provided scripts in the graphstorm/docker directory to build the image.

GraphStorm supports Amazon SageMaker and EC2/local execution environments, so you first need to choose which image to build.

The build_graphstorm_image.sh script can build the image locally and tag it. It only requires providing the intended execution environment, using the -e/--environment argument. The supported environments are sagemaker to run jobs on Amazon SageMaker and local to run jobs on local instances, like a custom cluster of EC2 instances.

For example, you can use the following commands to build the local image with GPU support:

git clone https://github.com/awslabs/graphstorm.git
cd graphstorm
bash docker/build_graphstorm_image.sh --environment local

The above will use the local Dockerfile for GraphStorm, build an image and tag it as graphstorm:local-gpu.

The script also supports other arguments to customize the image name, tag, and other aspects of the build. The full argument list is:

  • -x, --verbose Print script debug info (set -x)

  • -e, --environment Image execution environment. Must be one of ‘local’ or ‘sagemaker’. Required.

  • -d, --device Device type, must be one of ‘cpu’ or ‘gpu’. Default is ‘gpu’.

  • -p, --path Path to graphstorm root directory, default is one level above the script’s location.

  • -i, --image Docker image name, default is ‘graphstorm’.

  • -s, --suffix Suffix for the image tag, can be used to push custom image tags. Default is “<environment>-<device>”, e.g. sagemaker-gpu.

  • -b, --build Docker build directory prefix, default is /tmp/graphstorm-build/docker.

  • --use-parmetis When this flag is set we add the ParMETIS dependencies to the image. ParMETIS is an advanced distributed graph partitioning algorithm designed to minimize communication time during GNN training.

For example, you can build an image to support CPU-only execution using:

bash docker/build_graphstorm_image.sh --environment local --device cpu
# Will build an image named 'graphstorm:local-cpu'

Or to build and tag an image to run ParMETIS with EC2 instances:

bash docker/build_graphstorm_image.sh --environment local --device cpu --use-parmetis --suffix "-parmetis"
# Will build an image named 'graphstorm:local-cpu-parmetis'

See bash docker/build_graphstorm_image.sh --help for more information.
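The image names in the examples above follow a simple pattern. The sketch below reproduces it under the assumption, inferred from those examples rather than from the script itself, that the tag is the environment and device joined by a hyphen, followed by any suffix. gs_image_name is a hypothetical helper, not part of the GraphStorm scripts.

```shell
# Hypothetical helper: reproduce the image names shown in the examples above.
# Assumes the tag is "<environment>-<device>" followed by the optional suffix.
gs_image_name() {
    local environment="$1"
    local device="${2:-gpu}"
    local suffix="${3:-}"
    local image="${4:-graphstorm}"
    case "$environment" in
        local|sagemaker) ;;
        *) echo "environment must be 'local' or 'sagemaker'" >&2; return 1 ;;
    esac
    case "$device" in
        cpu|gpu) ;;
        *) echo "device must be 'cpu' or 'gpu'" >&2; return 1 ;;
    esac
    echo "${image}:${environment}-${device}${suffix}"
}

gs_image_name local                  # prints graphstorm:local-gpu
gs_image_name local cpu              # prints graphstorm:local-cpu
gs_image_name local cpu "-parmetis"  # prints graphstorm:local-cpu-parmetis
```

Knowing the expected name in advance makes it easier to refer to the right image when launching a container.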

Launch a GraphStorm Container

Once you have built the image, you can launch a local container to run test jobs.

If your host has access to a GPU, run the following command:

docker run --gpus all --network=host --rm -v /dev/shm:/dev/shm/ -it --name gs-test graphstorm:local-gpu /bin/bash

Or if using a CPU-only host:

docker run --network=host -v /dev/shm:/dev/shm/ --rm -it --name gs-test graphstorm:local-cpu /bin/bash

This command creates a GraphStorm container named gs-test and attaches a bash shell to it.

If successful, the command prompt changes to the container’s prompt, for example:

root@<ip-address>:/#

Note

Notice that we assign the host’s shared memory volume to the container as well using -v /dev/shm:/dev/shm/. GraphStorm uses shared memory to host graph data, so it is important that you allocate enough shared memory to the container. You can also set the shared memory using e.g. --shm-size 4gb.
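To see how much shared memory the host can offer before launching a container, you can inspect /dev/shm with df. The following is a sketch under assumptions: available_kib and enough_shared_memory are hypothetical helper names, and the 4 GiB threshold simply mirrors the --shm-size example above. The amount you actually need depends on the size of your graph.

```shell
# Hypothetical helper: print the available space, in KiB, on the filesystem
# that holds the given path (defaults to /dev/shm).
available_kib() {
    df -Pk "${1:-/dev/shm}" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: returns 0 when the available KiB covers the required GiB.
enough_shared_memory() {
    local avail="$1"
    local required_gib="$2"
    [ "$avail" -ge $((required_gib * 1024 * 1024)) ]
}

# The 4 GiB threshold is illustrative only.
if [ -d /dev/shm ]; then
    if enough_shared_memory "$(available_kib /dev/shm)" 4; then
        echo "/dev/shm has at least 4 GiB available"
    else
        echo "/dev/shm has less than 4 GiB available"
    fi
fi
```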

Note

If you are planning to run GraphStorm in a local cluster, specific instructions for running GraphStorm with an NFS shared filesystem are given in Use GraphStorm in a Distributed Cluster.

Push the image to Amazon Elastic Container Registry (ECR)

Once you build the image, you can use the push_graphstorm_image.sh script to push the image to an Amazon ECR repository. ECR allows you to easily store, manage, and deploy container images.

This will allow you to use the image in SageMaker jobs using SageMaker Bring-Your-Own-Container, or to launch EC2 clusters.

The script requires you to provide the intended execution environment again using the -e/--environment argument. By default it will create a repository named graphstorm in the us-east-1 region, on the default AWS account aws-cli is configured for. It tags the image as <environment>-<device>, creates a new ECR repository if one doesn’t exist, and pushes the image to it.

In addition to -e/--environment, the script supports several optional arguments. For the full list, use bash docker/push_graphstorm_image.sh --help. The most important ones are listed below:

  • -e, --environment Image execution environment. Must be one of ‘local’ or ‘sagemaker’. Required.

  • -a, --account AWS Account ID to use, we try to retrieve the default from the AWS CLI configuration.

  • -r, --region AWS Region to push the image to, we retrieve the default from the AWS CLI configuration.

  • -d, --device Device type, must be one of ‘cpu’ or ‘gpu’. Default is ‘gpu’.

  • -p, --path Path to graphstorm root directory, default is one level above the script’s location.

  • -i, --image Docker image name, default is ‘graphstorm’.

  • -s, --suffix Suffix for the image tag, can be used to push custom image tags. Default is “<environment>-<device>”, e.g. sagemaker-gpu.

  • -x, --verbose Print script debug info (set -x)

Examples:

# Push an image to '123456789012.dkr.ecr.us-east-1.amazonaws.com/graphstorm:local-cpu'
bash docker/push_graphstorm_image.sh -e local -r "us-east-1" --account "123456789012" --device cpu
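The pushed image is addressed by a URI built from the account, region, image name, and tag, as in the comment of the example above. The sketch below composes that URI; gs_ecr_image_uri is a hypothetical helper introduced for illustration, and it assumes the default <environment>-<device> tag with no custom suffix.

```shell
# Hypothetical helper: compose the ECR image URI in the form shown in the
# example above, assuming the default "<environment>-<device>" tag.
gs_ecr_image_uri() {
    local account="$1"
    local region="$2"
    local environment="$3"
    local device="${4:-gpu}"
    local image="${5:-graphstorm}"
    echo "${account}.dkr.ecr.${region}.amazonaws.com/${image}:${environment}-${device}"
}

gs_ecr_image_uri 123456789012 us-east-1 local cpu
# prints 123456789012.dkr.ecr.us-east-1.amazonaws.com/graphstorm:local-cpu
```

The resulting string is what you would reference when pointing a SageMaker job or an EC2 host at the image.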