.. _single-machine-training-inference:

Model Training and Inference on a Single Machine
-------------------------------------------------
While the :ref:`Standalone Mode Quick Start <quick-start-standalone>` tutorial introduces the basic concepts, commands, and steps for using GraphStorm CLIs on a single machine, this user guide describes their usage on a single machine in more detail. In addition, most of the descriptions in this guide apply directly to :ref:`model training and inference on distributed clusters <distributed-cluster>`.

GraphStorm supports graph machine learning (GML) model training and inference for common GML tasks, including **node classification**, **node regression**, **edge classification**, **edge regression**, and **link prediction**. Because the :ref:`multi-task learning <multi_task_learning>` feature released in v0.3 is still experimental, formal documentation for it will be released once the feature matures.

For each task, GraphStorm provides a dedicated CLI for model training and inference. These CLIs share the same command template and some configurations, while each CLI also has its own task-specific configurations. GraphStorm also has a task-agnostic CLI that lets users run their customized models.

Task-specific CLI template for model training and inference
............................................................
GraphStorm model training and inference CLIs follow the command templates below.

.. code-block:: bash

    # Model training
    python -m graphstorm.run.TASK_COMMAND \
              --num-trainers 1 \
              --part-config data.json \
              --cf config.yaml \
              --save-model-path model_path/

    # Model inference
    python -m graphstorm.run.TASK_COMMAND \
              --inference \
              --num-trainers 1 \
              --part-config data.json \
              --cf config.yaml \
              --restore-model-path model_path/ \
              --save-prediction-path pred_path/

In the two templates above, ``TASK_COMMAND`` stands for one of the five task-specific commands:

    * ``gs_node_classification`` for node classification tasks;
    * ``gs_node_regression`` for node regression tasks;
    * ``gs_edge_classification`` for edge classification tasks;
    * ``gs_edge_regression`` for edge regression tasks;
    * ``gs_link_prediction`` for link prediction tasks.

These task-specific commands work for both model training and inference. The inference CLI additionally requires the **-\-inference** argument, which indicates that the command runs inference, and the **-\-restore-model-path** argument, which specifies the path of the saved model checkpoint.
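
As an illustrative sketch, the snippet below fills in ``TASK_COMMAND`` with ``gs_node_classification`` and prints the resulting training and inference commands. The file and directory names are the placeholders from the templates above, and the ``echo`` calls make this a dry run that does not launch GraphStorm.

.. code-block:: bash

    # Pick one of the five task-specific commands listed above.
    TASK_COMMAND=gs_node_classification

    # Training: the template with TASK_COMMAND substituted.
    TRAIN_CMD=(python -m "graphstorm.run.${TASK_COMMAND}"
               --num-trainers 1
               --part-config data.json
               --cf config.yaml
               --save-model-path model_path/)

    # Inference: add --inference and restore the checkpoint saved by training.
    INFER_CMD=(python -m "graphstorm.run.${TASK_COMMAND}"
               --inference
               --num-trainers 1
               --part-config data.json
               --cf config.yaml
               --restore-model-path model_path/
               --save-prediction-path pred_path/)

    # Dry run: print both commands instead of executing them.
    echo "${TRAIN_CMD[@]}"
    echo "${INFER_CMD[@]}"

To execute a command rather than print it, run ``"${TRAIN_CMD[@]}"`` or ``"${INFER_CMD[@]}"`` directly. Switching to another task only requires changing the value of ``TASK_COMMAND`` and supplying a matching configuration file.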

On a single machine, the **-\-num-trainers** argument sets how many GPUs or CPU processes to use. On a GPU machine, the value of **-\-num-trainers** should be **less than or equal to** the total number of available GPUs. On a CPU-only machine, keep the value below the total number of CPU processes to avoid errors.
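
One possible way to pick a safe value is sketched below. It is not part of GraphStorm: it assumes a Linux host where ``nproc`` is available, uses ``nvidia-smi --list-gpus`` to count GPUs when that tool is installed, and treats ``REQUESTED`` as a hypothetical variable holding the number of trainers you would like to use.

.. code-block:: bash

    # Number of trainers you would like to use (hypothetical default).
    REQUESTED=${REQUESTED:-4}

    # Count visible GPUs; stays 0 if nvidia-smi is missing or reports none.
    GPU_COUNT=0
    if command -v nvidia-smi >/dev/null 2>&1; then
        GPU_COUNT=$(nvidia-smi --list-gpus 2>/dev/null | grep -c '^GPU ' || true)
    fi

    if [ "${GPU_COUNT}" -gt 0 ]; then
        # GPU machine: at most one trainer per GPU.
        MAX_TRAINERS=${GPU_COUNT}
    else
        # CPU-only machine: stay below the CPU count, but use at least 1.
        CPU_COUNT=$(nproc)
        MAX_TRAINERS=$(( CPU_COUNT > 1 ? CPU_COUNT - 1 : 1 ))
    fi

    # Clamp the requested value to the limit computed above.
    NUM_TRAINERS=$(( REQUESTED < MAX_TRAINERS ? REQUESTED : MAX_TRAINERS ))
    echo "--num-trainers ${NUM_TRAINERS}"

The printed argument can then be pasted into a training or inference command.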

GraphStorm model training and inference CLIs use the **-\-part-config** argument to specify the partitioned graph data. Its value should be the path of the ``*.json`` file that is generated by the :ref:`GraphStorm Graph Construction <graph_construction>` step.

While the CLIs can be as simple as the templates above, users can use a YAML file to set a variety of GraphStorm configurations and take full advantage of the functions and features that GraphStorm provides. The path of the YAML file is passed to the **-\-cf** argument. GraphStorm provides a set of `example YAML files <https://github.com/awslabs/graphstorm/tree/main/training_scripts>`_ for reference.
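
For orientation, the snippet below writes a small node classification configuration to ``config.yaml``. Treat it as a rough sketch rather than a reference: the key names follow the general layout of the linked example files as best recalled here, and the node type, label field, and all numeric values are hypothetical. Verify every key against the example YAML files before using it.

.. code-block:: bash

    # Write a minimal, illustrative configuration file.
    # Key names and values are assumptions; check them against the examples.
    cat > config.yaml <<'EOF'
    ---
    version: 1.0
    gsf:
      basic:
        backend: gloo
      gnn:
        model_encoder_type: rgcn
        num_layers: 1
        hidden_size: 128
        fanout: "10"
      hyperparam:
        lr: 0.001
        num_epochs: 10
        batch_size: 1024
      node_classification:
        target_ntype: "paper"   # hypothetical node type
        label_field: "label"    # hypothetical label field
        num_classes: 40         # hypothetical number of classes
    EOF

    # Confirm that the file was written.
    ls -l config.yaml

The resulting file is the one passed to a task-specific CLI with ``--cf config.yaml``.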

.. note:: 

    * Users can set configurations either as CLI arguments or in the configuration YAML file. If the same configuration is set in both places, the value given as a CLI argument overrides the value in the YAML file.
    * This guide only explains a few commonly used configurations. For detailed explanations of all GraphStorm CLI configurations, please refer to the :ref:`Model Training and Inference Configurations <configurations-run>` section.
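
As a sketch of the override rule, suppose ``config.yaml`` sets ``num_epochs: 10`` (a hypothetical value). Adding ``--num-epochs 3`` to the command line would then take precedence, so training would run for 3 epochs. The snippet only prints the command; the task and file names are the placeholders used elsewhere in this guide.

.. code-block:: bash

    # The value passed on the command line takes precedence over the
    # num_epochs entry assumed to be present in config.yaml.
    OVERRIDE_CMD=(python -m graphstorm.run.gs_node_classification
                  --num-trainers 1
                  --part-config data.json
                  --cf config.yaml
                  --num-epochs 3
                  --save-model-path model_path/)

    # Dry run: print the command instead of executing it.
    echo "${OVERRIDE_CMD[@]}"

This pattern is convenient for quick experiments, because the YAML file can hold the stable defaults while individual runs change one or two values from the shell.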

Task-agnostic CLI for model training and inference
...................................................
While task-specific CLIs allow users to quickly perform the GML tasks supported by GraphStorm, users may also build their own GNN models as described in the :ref:`Use Your Own Models <use-own-models>` tutorial. To run these customized models in the GraphStorm model training and inference pipelines, users can use the task-agnostic CLI as shown in the examples below.

.. code-block:: bash

    # Model training
    python -m graphstorm.run.launch \
              --num-trainers 1 \
              --part-config data.json \
              customized_model.py --save-model-path model_path/ \
                                  customized_arguments 

    # Model inference
    python -m graphstorm.run.launch \
              --inference \
              --num-trainers 1 \
              --part-config data.json \
              customized_model.py --restore-model-path model_path/ \
                                  --save-prediction-path pred_path/ \
                                  customized_arguments

The task-agnostic CLI command (``launch``) follows a template similar to that of the task-specific CLIs, except that it takes the customized model, stored as a ``.py`` file, as an argument. If the customized model has its own arguments, place them after the customized model Python file.
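
To make the argument ordering concrete, the sketch below builds a ``launch`` training command in two parts. The script name ``customized_model.py`` comes from the template above, while ``--my-hidden-size`` and ``--my-dropout`` are purely hypothetical arguments that such a script might define. The snippet only prints the command.

.. code-block:: bash

    # Arguments consumed by the launch command itself.
    LAUNCH_ARGS=(--num-trainers 1 --part-config data.json)

    # Arguments consumed by the customized model script (hypothetical names).
    MODEL_ARGS=(--save-model-path model_path/
                --my-hidden-size 64
                --my-dropout 0.1)

    # Launch arguments come first, then the script, then the script arguments.
    LAUNCH_CMD=(python -m graphstorm.run.launch
                "${LAUNCH_ARGS[@]}"
                customized_model.py
                "${MODEL_ARGS[@]}")

    # Dry run: print the command instead of executing it.
    echo "${LAUNCH_CMD[@]}"

Keeping the two groups in separate arrays makes it harder to place a script argument before the ``.py`` file by mistake.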