This assignment walks you through the steps to install and run Regent on Certainty. This handout assumes a basic understanding of cluster computing. If you are completely new to cluster computing, reading this document will be helpful. You can also find some useful materials in the "How do I....?" tab. (The links require access permission to Certainty.)

Connecting to Certainty

You need to set up the Stanford VPN by following the instructions here: Once you are logged in to Stanford's network, you can connect to the head node with this command:

ssh <your SUNetID>

Installing Regent

Certainty has all the prerequisite packages for installing Regent. Here is the list of modules you need to load (it is best to put these commands in your .bashrc; don't forget to source it after you make a change):

module load python/2.7.8
module load gnu/4.7.2
module load clang/3.5.2
module load openmpi/1.10.1-gnu-4.7.2
module load cuda/7.5.18

Also, the following shell variables should be set in your environment to enable multi-node support in Regent.

export CONDUIT=ibv
export GASNET=/share/apps/gasnet/1.26.0/gcc-4.9.2
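A missing variable tends to show up only later, as a confusing build or launch problem, so it can help to check the environment before installing. The following is a minimal sketch and not part of Regent: the function name check_regent_env is hypothetical, and it only tests that the two variables above are non-empty and that GASNET names an existing directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Regent): verify the multi-node
# variables described above before running the installer.
check_regent_env() {
    status=0
    if [ -z "${CONDUIT:-}" ]; then
        echo "CONDUIT is not set" >&2
        status=1
    fi
    if [ -z "${GASNET:-}" ]; then
        echo "GASNET is not set" >&2
        status=1
    elif [ ! -d "$GASNET" ]; then
        echo "GASNET does not point to a directory: $GASNET" >&2
        status=1
    fi
    return "$status"
}

# Example: report the result without aborting the shell.
if check_regent_env; then
    echo "environment looks ready"
else
    echo "fix the variables above, then source your .bashrc again"
fi
```

If the check fails on Certainty, the most likely cause is that .bashrc was edited but not sourced again in the current shell.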

Now run these commands to install Regent:

git clone -b master
cd legion/language
./ --gasnet

Once you run the last line, you will be asked to choose among three options: auto/manual/never. You should enter auto unless you know what you are doing. Complete installation instructions can be found here:

Installing Regent locally on your Mac

Most Macs do not have Clang installed with the necessary header files, so you have to install Clang manually. Go to the LLVM download page and download the pre-built binary for Mac OS X. Clang 3.5.2 works best, but other versions should also work. Then uncompress the file and move the resulting directory wherever you want. Finally, add its bin subdirectory to your PATH. Here is one possible scenario:

mkdir test_regent_install
cd test_regent_install

curl -O
tar xfJ clang+llvm-3.5.2-x86_64-apple-darwin.tar.xz
export CLANG=$PWD/clang+llvm-3.5.2-x86_64-apple-darwin/bin/clang
export LLVM_CONFIG=$PWD/clang+llvm-3.5.2-x86_64-apple-darwin/bin/llvm-config

git clone -b master
cd legion/language
./ --debug --rdir=auto
./ examples/circuit.rg

Once you have installed Clang, you can follow the same instructions to build Regent:

git clone -b master
cd legion/language

If you recently updated to macOS Sierra and see this error message when you run ./ examples/circuit.rg:

<buffer>:4:10: fatal error: 'stdio.h' file not found
#include <stdio.h>
         compilation of included c code failed

         stack traceback:
         src/terralib.lua:3386: in function 'includecstring'

then you have to re-install the command-line tools with xcode-select --install.

Installing Regent locally on your Linux machine

The easiest way is to follow this quickstart: If this goes wrong, you will probably need to install Clang manually. You can find a pre-built binary and the source code at the LLVM download page. Clang 3.5.2 works best, but other versions should also work.

Writing a PBS script

Here is an example PBS script to run the circuit example:

#!/bin/bash -l
#PBS -l nodes=1:ppn=24
#PBS -l walltime=00:05:00
#PBS -m abe
#PBS -q gpu
#PBS -d .

./ examples/circuit.rg

In this assignment, you can simply copy and paste these commands into a script file. However, you should become comfortable with writing PBS scripts for your future assignments. You can find the complete list of options here:

Also, you will notice that this script submits the job to the gpu queue (#PBS -q gpu). All CS315B students should use only the gpu queue for their jobs, unless they have been using Certainty outside the class.

Now, you can submit the job script with this command:

qsub ./

You should be in the legion/language directory when you submit this script.

Once the job has finished, you will get two output files, * and * (the asterisks are replaced with your job ID), which record the standard output and standard error streams, respectively. If your job was successful, the standard error file should be empty and the standard output file should contain the following:

circuit settings: loops=5 pieces=4 nodes/piece=4 wires/piece=8 pct_in_piece=80 seed=12345
Circuit memory usage:
  Nodes :        16 *   16 bytes =         256 bytes
  Wires :        32 *  120 bytes =        3840 bytes
  Total                                   4096 bytes
Starting main simulation loop
ELAPSED TIME =   0.041 s
simulation complete - destroying regions

Otherwise, you can probably find the reason for the failure in one of the two output files.
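The success criteria above can also be checked mechanically. The sketch below is only an illustration: check_job_output is a hypothetical helper, the two file names are passed in by you, and it treats an empty error file plus an ELAPSED TIME line in the output file as success.

```shell
#!/bin/sh
# Hypothetical helper: decide whether a finished job looks successful.
# $1 = file holding the job's standard output
# $2 = file holding the job's standard error
check_job_output() {
    out_file="$1"
    err_file="$2"
    # A non-empty error file means something was written to stderr.
    if [ -s "$err_file" ]; then
        echo "error stream is not empty; inspect $err_file" >&2
        return 1
    fi
    # The circuit example reports its run time on success.
    if ! grep -q "ELAPSED TIME" "$out_file"; then
        echo "no ELAPSED TIME line found in $out_file" >&2
        return 1
    fi
    grep "ELAPSED TIME" "$out_file"
}
```

Call it with the two files your job produced, standard output first. On success it prints the ELAPSED TIME line; otherwise it names the file to inspect.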

Running Regent on multiple nodes

The script reads the value of the environment variable LAUNCHER to set up multi-node execution correctly. Here is an example PBS script to run the same circuit example on two nodes:

#!/bin/bash -l
#PBS -l nodes=2:ppn=24
#PBS -l walltime=00:05:00
#PBS -m abe
#PBS -q gpu
#PBS -d .

LAUNCHER='mpirun --bind-to none -np 2 -npernode 1' ./ examples/circuit.rg

Note that the script passes --bind-to none to turn off processor binding. Legion, Regent's runtime, does its own resource management, and processor binding at the MPI level would hide some processors from the runtime. For the complete list of mpirun options, please refer to its man page.
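When you change the number of nodes, the -np value in LAUNCHER has to change with the nodes= value in the #PBS -l line. As a sketch of that relationship, the hypothetical helper make_launcher below builds the same launcher string as above for a given node count, keeping one process per node and binding turned off.

```shell
#!/bin/sh
# Hypothetical helper: build the LAUNCHER value used above for a given
# number of nodes, with one process per node and MPI binding disabled.
make_launcher() {
    nodes="$1"
    case "$nodes" in
        ''|*[!0-9]*|0)
            echo "node count must be a positive integer" >&2
            return 1
            ;;
    esac
    echo "mpirun --bind-to none -np $nodes -npernode 1"
}

# Prints: mpirun --bind-to none -np 2 -npernode 1
make_launcher 2
```

The result can be assigned to LAUNCHER in a job script, for example LAUNCHER="$(make_launcher 2)", as long as the script also requests the same number of nodes from PBS.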

Configuring the machine for Regent

When you run a Regent program, the runtime is configured with one CPU and 512MB of system memory by default. However, this is often not sufficient, for reasons such as these:

  • The program works on data that does not fit in 512MB of memory.
  • The program is parallelized so it can run faster on multiple processors.
  • The program has a task that can run on GPUs.

If the default configuration is not enough for your program, you can change the machine configuration by passing command-line flags, as in the following example command:

./ examples/circuit.rg -ll:cpu 2 -ll:csize 1024

In this command, the runtime is configured with two CPUs (-ll:cpu 2) and 1024MB of system memory (-ll:csize 1024).

If you run Regent on multiple nodes, the command-line flags configure each node individually, not the set of nodes as a whole. Suppose you wrote this PBS script:

#!/bin/bash -l
#PBS -l nodes=2:ppn=24
#PBS -l walltime=00:05:00
#PBS -m abe
#PBS -q gpu
#PBS -d .

LAUNCHER='mpirun --bind-to none -np 2 -npernode 1' ./ examples/circuit.rg -ll:cpu 2

This script will launch your program on two nodes, each of which is configured with two CPUs. Therefore, there will be four CPUs in total on which you can launch your tasks.
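The arithmetic behind that total is simply the number of nodes times the per-node -ll:cpu value. A small sketch, where total_cpus is a hypothetical helper name used only for illustration:

```shell
#!/bin/sh
# Worked arithmetic for the example above: -ll:cpu applies per node,
# so the total is (number of nodes) * (CPUs per node).
total_cpus() {
    nodes="$1"
    cpus_per_node="$2"
    echo $(( nodes * cpus_per_node ))
}

total_cpus 2 2    # prints 4: two nodes, each started with -ll:cpu 2
total_cpus 2 4    # prints 8: two nodes, each started with -ll:cpu 4
```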

For the complete list of flags for the machine configuration, please refer to this link:

Running Legion Prof and Spy

Legion Prof and Legion Spy are two important tools for visualizing a Regent program's execution. Legion Prof gives you a profiling result, and Legion Spy renders the data dependence structure of the program. To use these tools, you first collect the logging output and then pass it to the post-processing scripts. Here are the commands to enable the logging:

  • Legion Prof: ./ (your regent program) -lg:prof (number of nodes) -lg:prof_logfile (your filename)
  • Legion Spy: ./ (your regent program) -lg:spy -logfile (your filename)

If the filename contains % (e.g. my_log_file_%), it will be replaced with the node id (e.g. my_log_file_0, my_log_file_1, ...). Once you have the log files, you can pass them to the corresponding post-processing script (you can find the scripts under legion/tools).
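To see which log files to expect from a multi-node run, the naming rule above can be imitated in the shell. This is an illustration of the convention only, not how the runtime implements it; expand_log_names is a hypothetical helper, and the pattern is assumed to contain a single %.

```shell
#!/bin/sh
# Illustration only: list the per-node log file names that a pattern
# such as my_log_file_% yields for a run on the given number of nodes.
# Assumes the pattern contains a single % placeholder.
expand_log_names() {
    pattern="$1"
    nodes="$2"
    node=0
    while [ "$node" -lt "$nodes" ]; do
        # Substitute the node id for the % placeholder.
        printf '%s\n' "$pattern" | sed "s/%/$node/"
        node=$(( node + 1 ))
    done
}

# Prints my_log_file_0 and my_log_file_1, one per line.
expand_log_names 'my_log_file_%' 2
```

The printed names are the files you would then hand to the post-processing script.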

In order to generate Legion Spy graphs, you will need Graphviz. Here is how you can install it on Ubuntu and Mac OS X.

# Ubuntu
sudo apt install graphviz

# Mac OS X
brew install graphviz             # if Homebrew is your package manager
sudo port install graphviz        # if MacPorts is your package manager

These links will give you more information about Legion Prof and Legion Spy:

What's next?

Regent and Legion have other features and options that this document does not cover. Here are some useful links to find out more about Regent and Legion:

What to submit

Send the following two files to

  • The output file (*) from a job running the circuit example on two nodes. You can use the script given above.
  • A PBS script that runs the circuit example with this runtime configuration: a single node, 4 processors, 2048MB of system memory, and both Legion Prof and Spy enabled.

The submission will only be checked to confirm that you followed the steps in this document; it will not otherwise be graded.
