This assignment walks you through the steps to install and run Regent on Certainty. This handout assumes a basic understanding of cluster computing. If you are completely new to cluster computing, reading this document will be helpful. You can also find some useful materials in the "How do I....?" tab. (The links require access permission to Certainty.)
You need to set up the Stanford VPN by following the instructions here: https://uit.stanford.edu/service/vpn. Once you are logged in to Stanford's network, you can connect to the head node with this command:
ssh <your SUNetID>@certainty-login.stanford.edu
Certainty has all the prerequisite packages for installing Regent. Here is the list of modules you need to load (you should put these commands in your .bashrc, and don't forget to source it whenever you change it):
module load python/2.7.8
module load gnu/4.7.2
module load clang/3.5.2
module load openmpi/1.10.1-gnu-4.7.2
module load cuda/7.5.18
Also, the following shell variables should be set in your environment to enable multi-node support in Regent.
export CONDUIT=ibv
export GASNET=/share/apps/gasnet/1.26.0/gcc-4.9.2
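Before you start the installation, it can help to confirm that both variables are actually visible in your shell. The following is a minimal sketch, not part of Regent: `check_regent_env` is a hypothetical helper name, and it only checks that `CONDUIT` is non-empty and that `GASNET` names an existing directory.

```shell
#!/bin/bash
# check_regent_env: hypothetical helper (not part of Regent) that verifies
# the two variables this handout asks you to export.
check_regent_env() {
    local status=0
    # CONDUIT must be set to a non-empty value (this handout uses ibv).
    if [ -z "${CONDUIT:-}" ]; then
        echo "CONDUIT is not set" >&2
        status=1
    fi
    # GASNET must be set and must point at an existing directory.
    if [ -z "${GASNET:-}" ]; then
        echo "GASNET is not set" >&2
        status=1
    elif [ ! -d "$GASNET" ]; then
        echo "GASNET directory does not exist: $GASNET" >&2
        status=1
    fi
    return "$status"
}
```

You could call `check_regent_env && ./install.py --gasnet` so that the installer only starts when both settings look sane.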
Now run these commands to install Regent:
git clone -b master https://github.com/StanfordLegion/legion.git
cd legion/language
./install.py --gasnet
Once you run the last line, you will be asked to choose among three options: auto/manual/never. You should enter auto unless you know what you are doing. Complete installation instructions can be found here: https://github.com/StanfordLegion/legion/blob/master/language/README.md.
Most Macs do not have Clang installed with the necessary header files, so you have to install Clang manually. Go to the LLVM download page and download the pre-built binary for Mac OS X. Clang 3.5.2 works best, but other versions should also work. Then, uncompress the file and move the resulting directory wherever you want. Finally, add its bin sub-directory to your PATH setting. Here is one possible scenario:
curl http://llvm.org/releases/3.5.2/clang+llvm-3.5.2-x86_64-apple-darwin.tar.xz > clang.tar.xz
tar -Jxvf clang.tar.xz
mv clang+llvm-3.5.2-x86_64-apple-darwin ~/clang-3.5.2
# you might add this line to your .bashrc if you want to build Regent multiple times
export PATH=$PATH:~/clang-3.5.2/bin
Once you have installed Clang, you can follow the same instructions to build Regent:
git clone -b master https://github.com/StanfordLegion/legion.git
cd legion/language
./install.py
If you recently updated to macOS Sierra and see this error message when you run
<buffer>:4:10: fatal error: 'stdio.h' file not found
#include <stdio.h>
         ^
compilation of included c code failed
stack traceback:
src/terralib.lua:3386: in function 'includecstring'
...
then you have to re-install the command-line tools with
The easiest way is to follow this quickstart: https://github.com/StanfordLegion/legion/blob/master/language/README.md. If this goes wrong, you will probably need to install Clang manually. You can find a pre-built binary and the source code at the LLVM download page. Clang 3.5.2 works best, but other versions should also work.
Here is an example PBS script to run the circuit example:
#!/bin/bash -l
#PBS -l nodes=1:ppn=24
#PBS -l walltime=00:05:00
#PBS -m abe
#PBS -q gpu
#PBS -d .
./regent.py examples/circuit.rg
For this assignment, you can simply copy and paste these commands into a script file, say run_circuit.sh. However, you should become comfortable with writing PBS scripts for your future assignments. You can find the complete list of options here: https://linux.die.net/man/1/qsub-torque.
Also, notice that this script submits the job to the gpu queue (#PBS -q gpu). All CS315B students should use only the gpu queue for their jobs, unless they have been using Certainty outside the class.
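If you expect to write several job scripts that differ only in node count or time limit, you could generate them instead of editing by hand. The sketch below is one way to do that; `write_pbs_script` is a hypothetical helper name, and it simply reproduces the directives of the example script with the node count and walltime as parameters.

```shell
#!/bin/bash
# write_pbs_script: hypothetical helper that writes a PBS job script for the
# circuit example. Arguments: output file, number of nodes, walltime.
write_pbs_script() {
    local out="$1" nodes="$2" walltime="$3"
    # The directives mirror the example script in this handout; only the
    # node count and the walltime are substituted.
    cat > "$out" <<EOF
#!/bin/bash -l
#PBS -l nodes=${nodes}:ppn=24
#PBS -l walltime=${walltime}
#PBS -m abe
#PBS -q gpu
#PBS -d .
./regent.py examples/circuit.rg
EOF
}
```

For example, `write_pbs_script run_circuit.sh 1 00:05:00` would recreate the single-node script for the circuit example.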
Now, you can submit the job script with this command:
You should be in the legion/language directory when you submit this script.
Once the job has finished, you will get two output files, run_circuit.sh.o* and run_circuit.sh.e* (the asterisks are replaced with your job id), which record the standard output and error stream, respectively. If your job was successful, you should see nothing in run_circuit.sh.e* and see the following output in run_circuit.sh.o*:
circuit settings: loops=5 pieces=4 nodes/piece=4 wires/piece=8 pct_in_piece=80 seed=12345
Circuit memory usage:
Nodes : 16 * 16 bytes = 256 bytes
Wires : 32 * 120 bytes = 3840 bytes
Total 4096 bytes
Starting main simulation loop
...
SUCCESS!
ELAPSED TIME = 0.041 s
GFLOPS = 3.763 GFLOPS
simulation complete - destroying regions
Otherwise, you can probably find the reason for the failure in one of the output files.
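The check described above (empty error stream, SUCCESS! in the standard output) is easy to automate. Here is a minimal sketch; `check_job_output` is a hypothetical helper name, and you pass it the paths of the standard-output file and the error-stream file produced by your job.

```shell
#!/bin/bash
# check_job_output: hypothetical helper that applies the success criteria
# from this handout. Arguments: standard-output file, error-stream file.
check_job_output() {
    local out_file="$1" err_file="$2"
    # A successful run leaves the error-stream file empty.
    if [ -s "$err_file" ]; then
        echo "error stream is not empty: $err_file" >&2
        return 1
    fi
    # The circuit example prints SUCCESS! when the simulation completes.
    if ! grep -q 'SUCCESS!' "$out_file"; then
        echo "SUCCESS! not found in: $out_file" >&2
        return 1
    fi
    echo "job looks successful"
}
```

If the function reports a problem, open the file it names to see what went wrong.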
The regent.py script reads the value of the environment variable LAUNCHER to set up multi-node execution correctly. Here is an example PBS script to run the same circuit example on two nodes:
#!/bin/bash -l
#PBS -l nodes=2:ppn=24
#PBS -l walltime=00:05:00
#PBS -m abe
#PBS -q gpu
#PBS -d .
LAUNCHER='mpirun --bind-to none -np 2 -npernode 1' ./regent.py examples/circuit.rg
Note that the script passes --bind-to none to turn off processor binding. Regent's runtime, Legion, does its own resource management, and processor binding at the MPI level would hide some processors from the runtime. For the complete list of options for the mpirun command, please refer to its man page.
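When you change the number of nodes, the value of -np in LAUNCHER has to change with it. The following sketch builds the launcher string from a node count so the two stay in sync; `make_launcher` is a hypothetical helper name, and it assumes one process per node, as in the example script.

```shell
#!/bin/bash
# make_launcher: hypothetical helper that prints an mpirun command line for
# the given number of nodes, with one process per node and binding disabled.
make_launcher() {
    local nodes="$1"
    # Reject an empty value, anything containing a non-digit, and zero.
    case "$nodes" in
        ''|*[!0-9]*|0)
            echo "node count must be a positive integer" >&2
            return 1
            ;;
    esac
    echo "mpirun --bind-to none -np ${nodes} -npernode 1"
}
```

Inside a job script you could then write `LAUNCHER="$(make_launcher 2)" ./regent.py examples/circuit.rg`, keeping the argument equal to the nodes= value in the #PBS -l line.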
When you run a Regent program, the runtime is configured with one CPU and 512MB of system memory by default. However, this is often not sufficient because of these reasons:
If the default configuration is not enough for your program, you can change the machine configuration by passing command-line flags, as in the following example command:
./regent.py examples/circuit.rg -ll:cpu 2 -ll:csize 1024
In this command, the runtime is configured with two CPUs (-ll:cpu 2) and 1024MB of system memory (-ll:csize 1024).
If you run Regent on multiple nodes, the command-line flags configure each node individually, not the set of nodes as a whole. Let's say you wrote this PBS script:
#!/bin/bash -l
#PBS -l nodes=2:ppn=24
#PBS -l walltime=00:05:00
#PBS -m abe
#PBS -q gpu
#PBS -d .
LAUNCHER='mpirun --bind-to none -np 2 -npernode 1' ./regent.py examples/circuit.rg -ll:cpu 2
This script will launch your program on two nodes, each of which is configured with two CPUs. Therefore, there will be four CPUs in total on which you can launch your tasks.
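Because the flags apply to each node, the total number of CPUs available to your tasks is the product of the node count and the per-node value of -ll:cpu. As a small worked example (`total_cpus` is a hypothetical helper name used only for illustration):

```shell
#!/bin/bash
# total_cpus: hypothetical helper that multiplies the number of nodes by the
# per-node CPU count given with -ll:cpu.
total_cpus() {
    local nodes="$1" cpus_per_node="$2"
    echo $(( nodes * cpus_per_node ))
}

# Two nodes, each configured with -ll:cpu 2.
total_cpus 2 2   # prints 4
```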
For the complete list of flags for the machine configuration, please refer to this link: http://legion.stanford.edu/profiling/#machine-configuration.
Legion Prof and Legion Spy are two important tools for visualizing a Regent program's execution. Legion Prof gives you a profiling result and Legion Spy renders the data dependence structure of the program. To use these tools, you first collect the logging output and then pass it to the post-processing scripts. Here are the commands to enable the logging:
./regent.py (your regent program) -hl:prof (number of nodes)
./regent.py (your regent program) -hl:spy
The logs are printed to the standard output by default, and you can save them to files by adding the option -logfile (your filename). If the filename contains % (e.g. my_log_file_%), it will be replaced with the node id (e.g. my_log_file_1, ...). Once you have the log files, you can pass them to the scripts legion_prof.py and legion_spy.py (you can find them under
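To make the log file naming rule concrete, here is a small sketch that mimics the substitution in the shell. It is only an illustration: the runtime performs the replacement itself, `expand_logfile_name` is a hypothetical helper name, and replacing every % in the pattern (rather than only the first) is an assumption.

```shell
#!/bin/bash
# expand_logfile_name: hypothetical helper that imitates how a % in the
# -logfile argument turns into a node id. Arguments: pattern, node id.
expand_logfile_name() {
    local pattern="$1" node_id="$2"
    # Assumption: every % in the pattern is replaced by the node id.
    printf '%s\n' "$pattern" | sed "s/%/${node_id}/g"
}

expand_logfile_name 'my_log_file_%' 1   # prints my_log_file_1
```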
These links will give you more information about Legion Prof and Legion Spy:
Regent and Legion have other features and options that this document does not cover. Here are some useful links to find more about Regent and Legion:
Send the following two files to firstname.lastname@example.org:
run_script.sh.o*) from a job running the circuit example on two nodes. You can use the script given above.
The submission will only be checked to confirm that you followed the steps in this document; it will not otherwise be graded.