NUM_NODES
: the number of nodes that are part of the task
NODE_RANK
: the rank of the node executing the task
NODE_IPS
: a newline-separated string of the IP addresses of the nodes that are part of the task, with one IP address per line
NUM_GPUS_PER_NODE
: the number of GPUs available on each node
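
As a minimal sketch of how a task script might consume these variables, the Python snippet below reads all four and derives a few values a distributed launcher typically needs. The helper name `read_cluster_env`, the fallback defaults for unset variables, and the treatment of the first address in `NODE_IPS` as the head node are assumptions made for illustration; they are not stated in the definitions above.

```python
import os


def read_cluster_env(env=None):
    """Parse the task environment variables into a plain dict.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env

    # NODE_IPS holds one IP address per line; skip blank lines and whitespace.
    node_ips = [ip.strip() for ip in env.get("NODE_IPS", "").splitlines() if ip.strip()]

    # Fallback defaults are assumptions for running outside a multi-node task.
    num_nodes = int(env.get("NUM_NODES", "1"))
    node_rank = int(env.get("NODE_RANK", "0"))
    gpus_per_node = int(env.get("NUM_GPUS_PER_NODE", "0"))

    return {
        "num_nodes": num_nodes,
        "node_rank": node_rank,
        "node_ips": node_ips,
        "gpus_per_node": gpus_per_node,
        # Total GPU count across the task, assuming every node has the same number.
        "total_gpus": num_nodes * gpus_per_node,
        # Assumption: the first listed address belongs to the head node.
        "head_ip": node_ips[0] if node_ips else None,
    }


if __name__ == "__main__":
    print(read_cluster_env())
```

Values such as `head_ip`, `node_rank`, and `total_gpus` are the kind of inputs a distributed training launcher usually expects for its rendezvous address, node index, and world size, though the exact mapping depends on the launcher in use.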