Jupyter

From FarmShare

(Difference between revisions)
(Installing custom Jupyter notebook)
m (Technical Details: typos)
(113 intermediate revisions not shown)
Line 7: Line 7:
At the end of this guide, the resulting Jupyter notebook will support:
At the end of this guide, the resulting Jupyter notebook will support:
-
* '''[https://www.python.org Python]''', '''[https://www.mathworks.com MATLAB]''', '''[https://www.r-project.org R]''', '''[http://www.sas.com SAS]''', and '''[http://julialang.org Julia]''' programming languages
+
* '''[https://www.python.org Python]''', '''[https://www.mathworks.com MATLAB]''', '''[https://www.r-project.org R]''', '''[http://julialang.org Julia]''', and '''[http://www.sas.com SAS]''' programming languages
* an [https://en.wikipedia.org/wiki/Transport_Layer_Security encrypted], token protected, and web-browser enabled programming environment
* an [https://en.wikipedia.org/wiki/Transport_Layer_Security encrypted], token protected, and web-browser enabled programming environment
* indefinite persistence of the Jupyter notebook server environment with simple weekly renewals (the maximum duration of [https://uit.stanford.edu/service/kerberos Stanford Kerberos] tickets)
* indefinite persistence of the Jupyter notebook server environment with simple weekly renewals (the maximum duration of [https://uit.stanford.edu/service/kerberos Stanford Kerberos] tickets)
Line 13: Line 13:
* shared file/data storage to [https://uit.stanford.edu/service/its-course-support/diskspace Class Disk AFS Space]
* shared file/data storage to [https://uit.stanford.edu/service/its-course-support/diskspace Class Disk AFS Space]
* shared file/data storage to [https://tools.stanford.edu/cgi-bin/group-request Group AFS Space]
* shared file/data storage to [https://tools.stanford.edu/cgi-bin/group-request Group AFS Space]
-
* easy deployment on any of the Stanford [https://web.stanford.edu/group/farmshare/cgi-bin/wiki/index.php/Main_Page FarmShare] systems or any Linux system integrated with Stanford [https://uit.stanford.edu/service/kerberos/install_generic Kerberos] and [https://uit.stanford.edu/service/afs/sysadmin AFS].
+
* easy deployment on any of the Stanford [https://web.stanford.edu/group/farmshare/cgi-bin/wiki/index.php/Main_Page FarmShare2] systems or any Ubuntu Xenial x86_64 system with Stanford [https://uit.stanford.edu/service/kerberos/install_generic Kerberos] and [https://uit.stanford.edu/service/afs/sysadmin AFS].
 +
 
 +
=== Alternative ===
 +
 
 +
Note, if only Python is needed, it is recommended to use [https://colab.research.google.com Google Colaboratory] through one's [https://uit.stanford.edu/service/gsuite/login Stanford Google Account]. It provides a much easier mechanism to run Jupyter instances, supports user-installed Python packages, runs on virtualized hardware, and has GPU support.
== Overview ==
== Overview ==
Line 22: Line 26:
* Client connection must happen every time a client computer reconnects to the Internet (e.g., a laptop wakes from sleep).
* Client connection must happen every time a client computer reconnects to the Internet (e.g., a laptop wakes from sleep).
-
Commands to be typed in are in '''bold''', text to substitute is in <font color=red>'''red'''</font>, and the corn server and [https://en.wikipedia.org/wiki/Port_(computer_networking) TCP port] to note are in <font color=blue>'''blue'''</font>.
+
=== Style Guide ===
 +
 
 +
* '''bold''' - commands to be typed in
 +
** specific keyboard keys to be pushed appear in square brackets (e.g., '''[Enter]''', '''[Control]''', '''[Alt]''')
 +
** key combinations will appear with a dash (e.g., '''[Control]-C''' for copy)
 +
* <font color=red>'''red'''</font> - text that must be substituted
 +
* <font color=blue>'''blue'''</font> - numbers to keep track of
 +
** the specific rice server (e.g., rice05 vs rice12) jupyter is running on
 +
** [https://en.wikipedia.org/wiki/Port_(computer_networking) TCP port] that jupyter server is running on
 +
 
 +
== Tutorial Videos ==
 +
 
 +
=== Overview ===
 +
 
 +
This is a high-level overview video of the concepts involved in running Jupyter on FarmShare:
 +
* [http://stanford.edu/group/bil/vid/jupyter/overview.mp4 Overview]
 +
 
 +
=== Four main steps ===
 +
There are video tutorials for the main steps:
 +
* [http://stanford.edu/group/bil/vid/jupyter/install.mp4 Install]
 +
* [http://stanford.edu/group/bil/vid/jupyter/launch.mp4 Launch]
 +
* [http://stanford.edu/group/bil/vid/jupyter/connect.mp4 Connect]
 +
* [http://stanford.edu/group/bil/vid/jupyter/renew.mp4 Renew]
 +
 
 +
=== Windows specific components ===
 +
 
 +
There are two Windows-specific steps that are different from the above tutorial videos:
 +
* [http://stanford.edu/group/bil/vid/jupyter/win_login.mp4 PuTTY login]
 +
* [http://stanford.edu/group/bil/vid/jupyter/win_tunnel.mp4 SSH tunnel]
== Installation ==
== Installation ==
Line 30: Line 62:
=== SSH into FarmShare ===
=== SSH into FarmShare ===
-
In a terminal, SSH into a FarmShare computer.
+
In a terminal, SSH into a FarmShare2 computer.
-
  $ '''ssh <font color=red>jane</font>@corn.stanford.edu'''
+
  $ '''ssh <font color=red>jane</font>@rice.stanford.edu'''
substituting SUNet ID for <font color=red>'''jane'''</font>.
substituting SUNet ID for <font color=red>'''jane'''</font>.
Line 40: Line 72:
=== Bind to Jupyter virtual environment and install default configuration ===
=== Bind to Jupyter virtual environment and install default configuration ===
-
  corn<font color=blue>99</font>:~> '''bash'''
+
  rice<font color=blue>99</font>:~> '''bash'''
-
  (ignore this command if prompt already looks like: jane@corn99:~$ )
+
  (ignore this command if prompt already looks like: jane@rice<font color=blue>99</font>:~$ )
   
   
-
  jane@corn<font color=blue>99</font>:~$ '''source /afs/ir.stanford.edu/group/bil/env/jupyter/bin/activate'''
+
  jane@rice<font color=blue>99</font>:~$ '''source /afs/ir.stanford.edu/group/bil/env/j2/bin/activate'''
   
   
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''generate_jupyter_wrapper'''
+
  (j2)jane@rice<font color=blue>99</font>:~$ '''generate_jupyter_wrapper'''
== Jupyter server ==
== Jupyter server ==
Line 51: Line 83:
This only needs to be performed ''once'' per FarmShare server used (and after every server reboot).
This only needs to be performed ''once'' per FarmShare server used (and after every server reboot).
-
[[#SSH_into_FarmShare|SSH info FarmShare]] if already not already in a system (ok to continue on from existing SSH session used for [[#installation|Installation]]).
+
[[#SSH_into_FarmShare|SSH into FarmShare2]] if not already logged in to a system (it is fine to continue from the existing SSH session used for [[#Installation|Installation]]).
=== Launch Jupyter server ===
=== Launch Jupyter server ===
-
Make a note of exactly which FarmShare server is being used (e.g., corn14, corn22, etc).
+
Make a note of exactly which FarmShare server is being used (e.g., rice14, rice22, etc).
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''pagsh'''
+
  jane@rice<font color=blue>99</font>:~$ '''pagsh'''
   
   
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''kinit; aklog'''
+
  $ '''kinit -r 7d; aklog'''
-
Password for jane@stanford.edu:
+
   
   
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''tmux'''
+
  $ '''bash'''
-
(new blank terminal appears)
+
   
   
-
  jane@corn<font color=blue>99</font>:~$ '''source /afs/ir.stanford.edu/group/bil/env/jupyter/bin/activate'''
+
  jane@rice<font color=blue>99</font>:~$ '''source /afs/ir.stanford.edu/group/bil/env/j2/bin/activate'''
   
   
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''keep_kerberos_afs'''
+
  (j2)jane@rice<font color=blue>99</font>:~$ '''tmux'''
-
  Run: export KRB5CCNAME=FILE:/tmp/.krb5_jane.tgt
+
  (new blank terminal appears)
-
+
-
jane@corn<font color=blue>99</font>:~$ '''export KRB5CCNAME=FILE:/tmp/.krb5_<font color=red>jane</font>.tgt'''
+
   
   
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''jupyter_start'''
+
  (j2)jane@rice<font color=blue>99</font>:~$ '''jupyter_start'''
-
Substitute SUNet ID for <font color=red>'''jane'''</font> in the <code>export</code> line (as instructed by the <code>keep_kerberos_afs</code> script). If successful, the output of this command will be
+
If successful, the output of this command will look like:
  IMPORTANT: This Jupyter notebook is listening on TCP port 9876
  IMPORTANT: This Jupyter notebook is listening on TCP port 9876
Line 98: Line 126:
  (opens a new tmux window and switches to it)
  (opens a new tmux window and switches to it)
   
   
-
  jane@corn<font color=blue>99</font>:/afs/ir.stanford.edu/users/j/a/jane$ '''cd'''
+
  jane@rice<font color=blue>99</font>:~$ '''source /afs/ir.stanford.edu/group/bil/env/j2/bin/activate'''
   
   
-
jane@corn<font color=blue>99</font>:~$ '''source /afs/ir.stanford.edu/group/bil/env/jupyter/bin/activate'''
+
  (j2)jane@rice<font color=blue>99</font>:~$ '''jupyter notebook list'''
-
+
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''jupyter notebook list'''
+
  Currently running servers:
  Currently running servers:
  <nowiki>https://localhost:</nowiki><font color=blue>9876</font>/?token=ba682763f27d8e2d59862badef28b0eaecb552529933176e :: /afs/ir.stanford.edu/users/j/a/jane
  <nowiki>https://localhost:</nowiki><font color=blue>9876</font>/?token=ba682763f27d8e2d59862badef28b0eaecb552529933176e :: /afs/ir.stanford.edu/users/j/a/jane
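The port and token shown by <code>jupyter notebook list</code> can also be extracted programmatically. The following is a minimal sketch, assuming the output keeps the <code>URL :: directory</code> layout shown above; the function name <code>parse_notebook_list_line</code> is hypothetical and not part of Jupyter or the j2 scripts.

```python
from urllib.parse import urlsplit, parse_qs


def parse_notebook_list_line(line):
    """Split one server line of `jupyter notebook list` into (port, token, directory).

    Assumes the "URL :: directory" layout shown in this guide.
    """
    url, _, directory = line.partition(" :: ")
    parts = urlsplit(url.strip())
    # The token is carried as a query parameter of the notebook URL.
    token = parse_qs(parts.query).get("token", [""])[0]
    return parts.port, token, directory.strip()


# Sample line copied from the example output in this guide.
sample = (
    "https://localhost:9876/?token=ba682763f27d8e2d59862badef28b0eaecb552529933176e"
    " :: /afs/ir.stanford.edu/users/j/a/jane"
)
port, token, directory = parse_notebook_list_line(sample)
print(port, directory)
```

The port returned here is the number to use for the SSH tunnel in the client connection step.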
Line 112: Line 138:
Detach from tmux and log out of FarmShare.
Detach from tmux and log out of FarmShare.
-
  '''Control-B''' then '''D''' (releasing all keys after the Control-B before hitting D)
+
  '''[Control]-B''' then '''D''' (releasing all keys after the [Control]-B before hitting D)
  (detaches tmux)
  (detaches tmux)
   
   
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''exit'''
+
  (j2)jane@rice<font color=blue>99</font>:~$ '''exit'''
   
   
-
  jane@corn<font color=blue>99</font>:~$ '''exit'''
+
  jane@rice<font color=blue>99</font>:~$ '''exit'''
   
   
-
  jane@corn<font color=blue>99</font>:~$ '''exit''' (if exit does not work here, use '''logout''')
+
  jane@rice<font color=blue>99</font>:~$ '''exit''' (if exit does not work here, use '''logout''')
== Client connection ==
== Client connection ==
Line 127: Line 153:
SSH into the ''same'' FarmShare system that the Jupyter server was started in above, tunneling the TCP port that the Jupyter server is listening on.
SSH into the ''same'' FarmShare system that the Jupyter server was started in above, tunneling the TCP port that the Jupyter server is listening on.
-
  $ '''ssh <font color=red>jane</font>@corn<font color=red>99</font>.stanford.edu -L <font color=red>9876</font>:localhost:<font color=red>9876</font>
+
  $ '''ssh <font color=red>jane</font>@rice<font color=red>99</font>.stanford.edu -L <font color=red>9876</font>:localhost:<font color=red>9876</font>'''
-
substituting in the appropriate SUNet ID for <code><font color=red>'''jane'''</font></code>, the appropriate FarmShare server hostname for <code>'''corn<font color=blue>99'''</font></code>, and the appropriate TCP port for <code><font color=blue>'''9876'''</font></code>. Windows users can setup an SSH local tunnel using [https://www.electrictoolbox.com/putty-create-ssh-port-tunnel menu options] in [http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html putty].
+
substituting in the appropriate SUNet ID for <code><font color=red>'''jane'''</font></code>, the appropriate FarmShare server hostname for <code>'''rice<font color=blue>99</font>'''</code>, and the appropriate TCP port for <code><font color=blue>'''9876'''</font></code>.
-
Once logged in, paste the <code><nowiki>https://localhost:</nowiki><font color=blue>9876</font>/?token=...</code> URL provided from the <code>jupyter_start</code> script into a web browser to connect to the Jupyter notebook. Ignore any browser security warnings.
+
Windows users can set up an SSH local tunnel using [https://www.electrictoolbox.com/putty-create-ssh-port-tunnel menu options] in [http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html PuTTY]. Enter the appropriate TCP port in the ''Source port'' box, enter <code>'''localhost:<font color=blue>9876</font>'''</code> in the ''Destination'' box, and then click the ''Add'' button.
 +
 
 +
Once logged in, paste the <code><nowiki>https://localhost:</nowiki><font color=blue>9876</font>/?token=...</code> URL provided from the <code>jupyter_start</code> or <code>jupyter notebook list</code> commands into a web browser to connect to the Jupyter notebook. Ignore any browser security warnings.
Note: Jupyter works best with [https://www.mozilla.org/en-US/firefox Firefox] or [https://www.google.com/chrome Chrome] browsers. Safari and Internet Explorer are not well supported.
Note: Jupyter works best with [https://www.mozilla.org/en-US/firefox Firefox] or [https://www.google.com/chrome Chrome] browsers. Safari and Internet Explorer are not well supported.
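The tunnel command above has three values to substitute, and the same port appears on both sides of <code>-L</code>. The following is a minimal sketch that assembles the command from those values; the helper name <code>tunnel_command</code> is hypothetical, and the two-digit host numbering (e.g., rice05) is an assumption based on the examples in this guide.

```python
import shlex


def tunnel_command(sunet_id, rice_number, port):
    """Build the ssh command that forwards a local port to the same port on a rice host."""
    # Assumption: rice hosts use two-digit numbers (rice05, rice12, ...).
    host = f"rice{int(rice_number):02d}.stanford.edu"
    # -L local_port:localhost:remote_port forwards the local port through the SSH session.
    return ["ssh", f"{sunet_id}@{host}", "-L", f"{port}:localhost:{port}"]


# Values from the running example in this guide.
print(shlex.join(tunnel_command("jane", 99, 9876)))
```

Keeping the local and remote port identical means the <code>https://localhost:9876/?token=...</code> URL reported on the server can be pasted into the local browser unchanged.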
Line 141: Line 169:
SSH into the ''same'' FarmShare system that the Jupyter server was started in.
SSH into the ''same'' FarmShare system that the Jupyter server was started in.
-
  $ '''ssh <font color=red>jane</font>@corn<font color=red>99</font>.stanford.edu
+
  $ '''ssh <font color=red>jane</font>@rice<font color=red>99</font>.stanford.edu'''
-
substituting in the appropriate SUNet ID for <code><font color=red>'''jane'''</font></code> and the appropriate FarmShare server hostname for <code>'''corn<font color=blue>99'''</font></code>.
+
substituting in the appropriate SUNet ID for <code><font color=red>'''jane'''</font></code> and the appropriate FarmShare server hostname for <code>'''rice<font color=blue>99</font>'''</code>.
Return to tmux window 0 where Jupyter was launched, shut down the server, and exit tmux.
Return to tmux window 0 where Jupyter was launched, shut down the server, and exit tmux.
-
  corn<font color=blue>99</font>:~> '''tmux attach -d'''
+
  rice<font color=blue>99</font>:~> '''tmux a -d'''
-
  '''Control-B''' then '''0''' (releasing all keys after the Control-B before hitting 0)
+
  '''[Control]-B''' then '''0''' (releasing all keys after the [Control]-B before hitting 0)
  (returns to tmux window 0 with lots of Jupyter notebook output)
  (returns to tmux window 0 with lots of Jupyter notebook output)
   
   
  [I 17:23:59.671 NotebookApp]
  [I 17:23:59.671 NotebookApp]
-
  '''Control-C''' then '''Control-C''' (releasing all keys after the Control-C before hitting Control-C a second time)
+
  '''[Control]-C''' then '''y [Enter]''' (releasing all keys after the [Control]-C, then answering yes to confirm shutting down Jupyter)
This will shut down the Jupyter notebook. Log out of FarmShare.
This will shut down the Jupyter notebook. Log out of FarmShare.
Line 160: Line 188:
This only needs to be done when the Kerberos credentials expire after a week and the notebook no longer functions. To restore the notebook after this time, SSH into the ''same'' FarmShare system used to create the virtual terminal. Renewal can take place at any time after a week when use is desired.
This only needs to be done when the Kerberos credentials expire after a week and the notebook no longer functions. To restore the notebook after this time, SSH into the ''same'' FarmShare system used to create the virtual terminal. Renewal can take place at any time after a week when use is desired.
-
  $ '''ssh <font color=red>jane</font>@corn<font color=red>99</font>.stanford.edu'''
+
  $ '''ssh <font color=red>jane</font>@rice<font color=red>99</font>.stanford.edu'''
-
substituting SUNet ID for <font color=red>'''jane'''</font> and the appropriate FarmShare server for '''corn<font color=blue>99</font>'''.
+
substituting SUNet ID for <font color=red>'''jane'''</font> and the appropriate FarmShare server for '''rice<font color=blue>99</font>'''.
-
  corn<font color=blue>99</font>:~> '''tmux attach -d'''
+
Go to an open tmux window. There is no need to stop the existing Jupyter notebook server.
 +
 
 +
  rice<font color=blue>99</font>:~> '''tmux a -d'''
 +
  (should reattach to the window made in "confirm Jupyter running status", but any open j2 prompt will work)
   
   
-
  (jupyter)jane@corn<font color=blue>99</font>:~$ '''kinit; aklog'''
+
  (j2)jane@rice<font color=blue>99</font>:~$ '''kinit -r 7d; aklog'''
  Password for jane@stanford.edu:
  Password for jane@stanford.edu:
-
+
 
-
(jupyter)jane@corn<font color=blue>99</font>:~$ '''keep_kerberos_afs'''
+
Confirm that a new Kerberos ticket and AFS token have been gathered:
-
Run: export KRB5CCNAME=FILE:/tmp/.krb5_jane.tgt
+
 
-
   
+
  (j2)jane@rice<font color=blue>99</font>:~$ '''klist'''
-
  '''Control-B''' then '''D''' (releasing all keys after the Control-B before hitting D)
+
  Ticket cache: FILE:/tmp/.krb5_jane.tgt
 +
  Default principal: jane@stanford.edu
 +
 
 +
  Valid starting      Expires              Service principal
 +
  11/11/1885 12:00:00 11/12/1885 11:00:00  krbtgt/stanford.edu@stanford.edu
 +
          renew until 11/18/1885 12:00:00
 +
  11/11/1885 12:00:00  11/12/1885 11:00:00  afs/ir.stanford.edu@stanford.edu
 +
          renew until 11/18/1885 12:00:00
 +
 
 +
  (j2)jane@rice<font color=blue>99</font>:~$ '''tokens'''
 +
 
 +
  Tokens held by the Cache Manager:
 +
 
 +
  User's (AFS ID 99999) tokens for afs@ir.stanford.edu [Expires Nov 12 12:00]
 +
    --End of list--
 +
 
 +
Once successful, detach and log out of FarmShare:
 +
 
 +
  '''[Control]-B''' then '''D''' (releasing all keys after the [Control]-B before hitting D)
  (detaches tmux)
  (detaches tmux)
-
This will extend the terminal's credentials for another week. The same notebook URL can be used without interruption. Logout of FarmShare.
+
This process extends the terminal's credentials for another week. The same notebook URL can be used without interruption.
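Whether a renewal is due can be read off the <code>klist</code> output. The following is a minimal sketch that parses the expiry time of the <code>krbtgt</code> entry, assuming the <code>MM/DD/YYYY HH:MM:SS</code> date layout shown in the sample output above (the layout can differ between systems and locales). The function names are hypothetical.

```python
import re
from datetime import datetime

# Date layout as it appears in the sample klist output in this guide (an assumption).
KLIST_TIME = "%m/%d/%Y %H:%M:%S"
TGT_LINE = re.compile(
    r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+"   # "Valid starting" column
    r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+"   # "Expires" column
    r"krbtgt/"                                     # ticket-granting ticket entry
)


def tgt_expiry(klist_output):
    """Return the expiry datetime of the krbtgt entry, or None if no entry is found."""
    match = TGT_LINE.search(klist_output)
    if match is None:
        return None
    return datetime.strptime(match.group(2), KLIST_TIME)


def ticket_is_valid(klist_output, now=None):
    """True if a krbtgt entry exists and has not yet expired."""
    expiry = tgt_expiry(klist_output)
    if now is None:
        now = datetime.now()
    return expiry is not None and now < expiry


# Sample text copied from the klist output shown in this guide.
sample_klist = """Ticket cache: FILE:/tmp/.krb5_jane.tgt
Default principal: jane@stanford.edu

Valid starting      Expires              Service principal
11/11/1885 12:00:00 11/12/1885 11:00:00  krbtgt/stanford.edu@stanford.edu
        renew until 11/18/1885 12:00:00
"""
print(tgt_expiry(sample_klist))
```

In practice the text would come from running <code>klist</code> inside the tmux session, since that is the session whose credentials the notebook server uses.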
== Examples ==
== Examples ==
=== Python 3 ===
=== Python 3 ===
 +
 +
==== Plotly ====
 +
 +
Plotly is a popular, open-source, interactive plotting framework available for common data analysis programming languages.
<pre>
<pre>
Line 192: Line 245:
[[File:jupyter_python_demo.png]]
[[File:jupyter_python_demo.png]]
 +
 +
==== bqplot + ipywidgets ====
 +
 +
Useful for interactive interfaces with realtime plot updating.
 +
 +
<pre>
 +
import numpy as np
 +
from bqplot import pyplot as bqplt
 +
import ipywidgets
 +
 +
def update_phase(n):
 +
    line1.y = np.sin(x - n)
 +
 +
bqplt.clear()
 +
x = np.r_[ -np.pi : np.pi : 0.1]
 +
line1 = bqplt.plot(x, np.sin(x))
 +
line2 = bqplt.plot(x, np.cos(x), 'r')
 +
bqplt.show()
 +
 +
ipywidgets.interact(update_phase, n=(-2*np.pi,2*np.pi))
 +
</pre>
 +
 +
[[File:jupyter_python_bqplot_demo.gif|800px]]
=== MATLAB ===
=== MATLAB ===
Line 211: Line 287:
<pre>
<pre>
-
x = seq(-pi, pi, 0.1)
+
library(plotly)
-
plot(x,  sin(x), type='l', lwd=4, col='blue', ann=FALSE)
+
 
-
lines(x, cos(x), type='l', lwd=4, col='red')
+
x <- seq(-pi, pi, 0.1)
 +
y1 <- sin(x)
 +
y2 <- cos(x)
 +
 
 +
ds <- data.frame(x, y1, y2)
 +
 
 +
p  <- plot_ly(ds, x = ~x, y = ~y1, type = 'scatter', mode='lines') %>%
 +
      add_trace(y = ~y2)
 +
 
 +
embed_notebook(p)
</pre>
</pre>
[[File:jupyter_r_demo.png]]
[[File:jupyter_r_demo.png]]
 +
 +
=== Julia ===
 +
 +
<pre>
 +
using Plots
 +
plotly()
 +
 +
plot([sin, cos], -pi, pi)
 +
</pre>
 +
 +
[[File:jupyter_julia_demo.png]]
=== SAS ===
=== SAS ===
Line 240: Line 336:
[[File:jupyter_sas_demo.png]]
[[File:jupyter_sas_demo.png]]
-
=== Julia ===
+
== Useful Linux commands ==
-
<pre>
+
=== Kerberos and AFS ===
-
using Plots
+
-
plotly()
+
-
plot([sin, cos], -pi, pi)
+
* <code>klist</code> - Displays [https://uit.stanford.edu/service/kerberos Kerberos] ticket status, cache file location, and expiration date
-
</pre>
+
* <code>tokens</code> - Lists [https://uit.stanford.edu/service/afs AFS] token status and expiration date
 +
* <code>kinit -r 7d</code> - obtains a new Kerberos ticket using SUNet ID that is renewable for up to a week
 +
* <code>aklog</code> - obtains AFS token from a valid Kerberos ticket
-
[[File:jupyter_julia_demo.png]]
+
=== Jupyter management ===
 +
 
 +
* <code>jupyter notebook list</code> - will list running Jupyter notebooks on that specific FarmShare machine (e.g., rice99)
 +
* <code>jupyter notebook stop <port></code> - will kill the Jupyter notebook running on port <port>
 +
 
 +
=== tmux management ===
 +
 
 +
* <code>tmux list-sessions</code> - lists all the tmux sessions on a given FarmShare machine
 +
* <code>tmux kill-session -t <target></code> - will kill tmux session <target>. <target> defaults to a number if not renamed (e.g., 0, 1, 2, etc).
 +
 
 +
== Troubleshooting ==
 +
 
 +
===Single most common error ===
 +
If the Jupyter notebook appears to be running, but no files appear in the notebook, this means that the session's Kerberos tickets and AFS tokens have expired. Follow the instructions in [[Jupyter#Renewing_virtual_terminal]] carefully. Similarly, if typing commands in the console yields permission denied errors, the cause is nearly always expired Kerberos tickets.
 +
 
 +
=== General troubleshooting ===
 +
 
 +
The Jupyter notebooks are great when they work, but can be confusing to fix when not working.
 +
Do ''not'' rely on an existing browser window running Jupyter to help with debugging; its behavior can be misleading.
 +
Below are some general troubleshooting steps to help get things working again.
 +
 
 +
* Best practice - only one tmux session and jupyter notebook
 +
** It can be very confusing if multiple tmux sessions and jupyter notebooks are running.
 +
** Use the commands '''jupyter notebook list''' and '''tmux list-sessions''' to confirm that only one jupyter notebook is running in only one tmux session
 +
** Use '''jupyter notebook stop <port>''' and '''tmux kill-session -t <target>''' to trim down to only one tmux session and one jupyter notebook
 +
 
 +
* Confirm the rice machine being used
 +
** Jupyter will only work if the same rice machine used to launch Jupyter is also used to establish the client connection
 +
** Jupyter notebooks launched on a given riceXX machine are only accessible via that riceXX machine
 +
 
 +
* Confirm tmux session has valid Kerberos tickets and AFS tokens
 +
** This is the most common reason Jupyter will stop working--Kerberos tickets expire once a week
 +
** Follow the instructions on [[Jupyter#Renewing_virtual_terminal]]
 +
 
 +
* Confirm that Jupyter is running
 +
** In an open j2 tmux window, check that Jupyter is running via '''jupyter notebook list'''
 +
** If no notebooks appear, then follow instructions at [[Jupyter#Jupyter_server]] to re-launch a Jupyter server
 +
 
 +
* Confirm that the SSH tunnel is in place
 +
** The SSH tunnel needs to be from the personal computer (e.g., laptop) to rice
 +
** Refresh the browser window with the Jupyter URL (https://localhost:XXXX)
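Whether the SSH tunnel is in place can be checked from the personal computer without a browser. The following is a minimal sketch using only the Python standard library; the function name <code>port_is_open</code> is hypothetical, and the port number is whatever was noted when the server was launched (9876 in this guide's examples).

```python
import socket


def port_is_open(port, host="localhost", timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    With the SSH tunnel running, the local end of the tunnel accepts connections
    on the forwarded port; without it, the connection attempt is refused.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling <code>port_is_open(9876)</code> on the laptop returns <code>False</code> when no tunnel is listening on that port. A <code>True</code> result only shows that the local end of the tunnel exists; it does not confirm that the Jupyter server on the rice machine is still running.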
== Custom Jupyter notebook ==
== Custom Jupyter notebook ==
-
A custom virtual environment can be deployed and used instead of the default one used by these instructions if necessary.
+
=== Kernels ===
-
This could be useful for using other Jupyter kernels, additional python modules, or a different version of python.
+
 
 +
Additional or custom kernels can be installed on a per-user basis, typically without root privileges, and used transparently in conjunction with the j2 environment.
 +
 
 +
Generally, user-installed kernels should be placed in <code>~/.local/share/jupyter/kernels</code> and these will be added to the user's Jupyter notebook in addition to the ones made available by the j2 <code>sys-prefix</code> environment.
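A user kernel is a directory containing a <code>kernel.json</code> file. The following is a minimal sketch that writes one for a Python interpreter, assuming the <code>ipykernel</code> package is installed for that interpreter; the kernel name <code>myenv</code>, the display name, and the helper <code>write_kernel_spec</code> are hypothetical.

```python
import json
import sys
from pathlib import Path


def write_kernel_spec(name, display_name, kernels_dir=None, python=sys.executable):
    """Write <kernels_dir>/<name>/kernel.json and return its path."""
    if kernels_dir is None:
        # Per-user kernel location named in this guide.
        kernels_dir = Path.home() / ".local/share/jupyter/kernels"
    spec_dir = Path(kernels_dir) / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    spec = {
        # Jupyter replaces {connection_file} with the real path when it starts the kernel.
        "argv": [python, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }
    spec_path = spec_dir / "kernel.json"
    spec_path.write_text(json.dumps(spec, indent=2))
    return spec_path
```

For example, <code>write_kernel_spec("myenv", "Python (myenv)", python="/path/to/venv/bin/python")</code> would register the interpreter of a hypothetical virtual environment. The new kernel should then appear in the notebook's kernel menu after the page is reloaded.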
 +
 
 +
=== Custom environments ===
 +
 
 +
A custom virtual environment can be deployed and used instead of the j2 one provided here if desired.
 +
 
 +
In most cases, custom kernels as mentioned above should suffice, but some core changes like different versions of python, ipython, or jupyter require a custom virtual environment.
Create virtual environment:
Create virtual environment:
-
  virtualenv -p python3 /farmshare/user_data/jane/venv
+
  virtualenv -p python3 ~/venv
-
  source /farmshare/user_data/jane/venv/bin/activate
+
  source ~/venv/bin/activate
-
This is installed in the <code>/farmshare/user_data/</code> ZFS pool to preserve AFS user quota space, as virtual environments can get large (the default environment deployed here above exceeds 1.5GB).
+
This is installed on the Isilon home partition to preserve AFS user quota space, as virtual environments can get large (the j2 environment deployed in this application exceeds 1.5GB).
-
These ZFS pools are accessible across all farmshare machines.
+
The Isilon home partition is accessible across all FarmShare2 machines.
Note, this will create a virtual environment with python3. To use python2, specify <code>-p python2</code> instead of <code>-p python3</code>.
Note, this will create a virtual environment with python3. To use python2, specify <code>-p python2</code> instead of <code>-p python3</code>.
Line 270: Line 415:
  pip-review -a
  pip-review -a
-
Most people performing numerical analyses with Jupyter will want to install at least the following packages:
+
Most people performing numerical analyses, interacting with the web, or manipulating data with Jupyter will want to install the following common packages:
-
   pip install numpy scipy jupyter sympy pandas matplotlib plotly seaborn bokeh
+
   pip install numpy scipy jupyter sympy pandas blaze matplotlib plotly seaborn statsmodels SQLAlchemy Pillow Requests lxml beautifulsoup4
The full list of packages installed in the virtual environment (generated by a <code>pip freeze</code>) used in these instructions can be found at <code>/afs/ir.stanford.edu/group/bil/env/scripts/requirements.txt</code>
The full list of packages installed in the virtual environment (generated by a <code>pip freeze</code>) used in these instructions can be found at <code>/afs/ir.stanford.edu/group/bil/env/scripts/requirements.txt</code>
Once installed, jupyter can be launched from the command line as usual. The <code>jupyter_start</code> script in <code>/afs/ir.stanford.edu/group/bil/env/scripts/</code> will also work in most cases since it relies on the environment path to launch jupyter.
Once installed, jupyter can be launched from the command line as usual. The <code>jupyter_start</code> script in <code>/afs/ir.stanford.edu/group/bil/env/scripts/</code> will also work in most cases since it relies on the environment path to launch jupyter.
 +
 +
There are some additional notes about environment creation available at <code>/afs/ir.stanford.edu/group/bil/env/scripts/README_JUPYTER</code>.
== Technical Details ==
== Technical Details ==
Line 286: Line 433:
* [https://en.wikipedia.org/wiki/Secure_Shell SSH] provides an encrypted remote [https://en.wikipedia.org/wiki/Shell_(computing) shell] into another system and is the primary way that Jupyter will be installed and accessed.
* [https://en.wikipedia.org/wiki/Secure_Shell SSH] provides an encrypted remote [https://en.wikipedia.org/wiki/Shell_(computing) shell] into another system and is the primary way that Jupyter will be installed and accessed.
-
* The first two commands of the installation (<code>bash</code> and <code>source</code>) switch the shell to [https://en.wikipedia.org/wiki/Bash_(Unix_shell) bash] and then update the [http://tldp.org/LDP/Bash-Beginners-Guide/html/chap_03.html environment] to use a pre-built Jupyter installation. The [https://en.wikibooks.org/wiki/Guide_to_Unix/Explanations/Shell_Prompt prompt] [https://wiki.archlinux.org/index.php/Bash/Prompt_customization changes] to have the <code>(jupyter)</code> prefix when the environment change is successful. SUNet IDs ceated after 2014 have their default shell set to bash and can omit the call to <code>bash</code>.
+
* The first two commands of the installation (<code>bash</code> and <code>source</code>) switch the shell to [https://en.wikipedia.org/wiki/Bash_(Unix_shell) bash] and then update the [http://tldp.org/LDP/Bash-Beginners-Guide/html/chap_03.html environment] to use a pre-built Jupyter installation. The [https://en.wikibooks.org/wiki/Guide_to_Unix/Explanations/Shell_Prompt prompt] [https://wiki.archlinux.org/index.php/Bash/Prompt_customization changes] to have the <code>(j2)</code> prefix when the environment change is successful. SUNet IDs created after 2014 have their default shell set to bash and can omit the call to <code>bash</code>.
-
* If of interest, the pip packages installed for the Jupyter environment are stored in a <code>requirements.txt</code> file at <code>/afs/ir.stanford.edu/group/bil/env/jupyter/requirements.txt</code>. Additional packages may be installed upon request.
+
* If of interest, the pip packages installed for the Jupyter environment are stored in a <code>requirements.txt</code> file at <code>/afs/ir.stanford.edu/group/bil/env/scripts/requirements.txt</code>. Additional packages may be installed upon request.
* The <code>jupyter_config_wrapper</code> command calls a script that will create encryption keys for the Jupyter notebook, configure the jupyter config file, and set the default tmux shell to bash. Note, this script modifies existing Jupyter notebook config files (if they exist), but will not overwrite any parameters that have changed from their default values.
* The <code>jupyter_config_wrapper</code> command calls a script that will create encryption keys for the Jupyter notebook, configure the jupyter config file, and set the default tmux shell to bash. Note, this script modifies existing Jupyter notebook config files (if they exist), but will not overwrite any parameters that have changed from their default values.

Revision as of 00:11, 24 November 2020


Introduction

Project Jupyter evolved out of the IPython project (specifically the IPython notebook) with the goal to provide an interactive, web-browser driven, language-independent programming environment. Jupyter notebooks can be deployed on the FarmShare servers to enable an accessible, powerful, and persistent computational platform.

Features

At the end of this guide, the resulting Jupyter notebook will support:

  • Python, MATLAB, R, Julia, and SAS programming languages
  • an encrypted, token protected, and web-browser enabled programming environment
  • indefinite persistence of the Jupyter notebook server environment with simple weekly renewals (the maximum duration of Stanford Kerberos tickets)
  • file/data storage on the Stanford AFS servers (5GB user quota, Stanford-wide, automatic backups)
  • shared file/data storage to Class Disk AFS Space
  • shared file/data storage to Group AFS Space
  • easy deployment on any of the Stanford FarmShare2 systems or any Ubuntu Xenial x86_64 system with Stanford Kerberos and AFS.

Alternative

Note, if only Python is needed, it is recommended to use Google Colaboratory through one's Stanford Google Account. It provides a much easier mechanism to run Jupyter instances, supports user-installed Python packages, runs on virtualized hardware, and has GPU support.

Overview

The guide consists of three sections: installation, Jupyter server, and client connection.

  • Installation is only performed once per user.
  • The Jupyter server needs to be started once per FarmShare server used. It will continue running until shut down (or the server is restarted).
  • Client connection must happen every time a client computer reconnects to the Internet (e.g., a laptop wakes from sleep).

Style Guide

  • bold - commands to be typed in
    • specific keyboard keys to be pushed appear in square brackets (e.g., [Enter], [Control], [Alt])
    • key combinations will appear with a dash (e.g., [Control]-C for copy)
  • red - text that must be substituted
  • blue - numbers to keep track of
    • the specific rice server (e.g., rice05 vs rice12) jupyter is running on
    • TCP port that jupyter server is running on

Tutorial Videos

Overview

This is a high-level overview video of the concepts involved in running Jupyter on FarmShare:

  • Overview (http://stanford.edu/group/bil/vid/jupyter/overview.mp4)

Four main steps

There are video tutorials for the main steps:

Windows specific components

There are two Windows-specific steps that are different from the above tutorial videos:

Installation

This only needs to be performed once per user.

SSH into FarmShare

In a terminal, SSH into a FarmShare2 computer.

$ ssh jane@rice.stanford.edu

substituting SUNet ID for jane.

A terminal application and SSH client are shipped with Mac and Linux systems. Windows does not come with an SSH client, but putty is a free and lightweight SSH client for Windows.

Bind to Jupyter virtual environment and install default configuration

rice99:~> bash
(ignore this command if prompt already looks like: jane@rice99:~$ )

jane@rice99:~$ source /afs/ir.stanford.edu/group/bil/env/j2/bin/activate

(j2)jane@rice99:~$ generate_jupyter_wrapper

Jupyter server

This only needs to be performed once per FarmShare server used (and after every server reboot).

SSH into FarmShare2 if not already in a system (ok to continue on from existing SSH session used for Installation).

Launch Jupyter server

Make a note of exactly which FarmShare server is being used (e.g., rice14, rice22, etc).

jane@rice99:~$ pagsh

$ kinit -r 7d; aklog

$ bash

jane@rice99:~$ source /afs/ir.stanford.edu/group/bil/env/j2/bin/activate

(j2)jane@rice99:~$ tmux
(new blank terminal appears)

(j2)jane@rice99:~$ jupyter_start

If successful, the output of this command will look like:

IMPORTANT: This Jupyter notebook is listening on TCP port 9876

[I 17:23:59.661 NotebookApp] Loading IPython parallel extension
[I 17:23:59.668 NotebookApp] Serving notebooks from local directory: /afs/ir.stanford.edu/users/j/a/jane
[I 17:23:59.668 NotebookApp] 0 active kernels
[I 17:23:59.668 NotebookApp] The Jupyter Notebook is running at: https://localhost:9876/?token=ba682763f27d8e2d59862badef28b0eaecb552529933176e
[I 17:23:59.668 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 17:23:59.671 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        https://localhost:9876/?token=ba682763f27d8e2d59862badef28b0eaecb552529933176e

The Jupyter notebook will not work until a client connection is established. Copy the https://localhost:9876/?token=... URL and save it for subsequent use. Also note the TCP port in blue that the Jupyter server is listening on.
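The port and token can also be pulled out of the saved URL programmatically. The following is a minimal sketch using only the Python standard library; the function name parse_notebook_url is hypothetical and the URL is the example one shown above.

```python
from urllib.parse import urlparse, parse_qs


def parse_notebook_url(url):
    """Return (port, token) from a Jupyter notebook login URL."""
    parts = urlparse(url)
    # parse_qs returns a list of values per key; take the first token if present
    token = parse_qs(parts.query).get("token", [None])[0]
    return parts.port, token


example_url = (
    "https://localhost:9876/"
    "?token=ba682763f27d8e2d59862badef28b0eaecb552529933176e"
)
port, token = parse_notebook_url(example_url)
print(port)
print(token)
```

The port returned here is the same number that is needed later for the SSH tunnel.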

Confirm Jupyter server status

Confirm that the Jupyter server was successfully launched.

[Control]-B then C (releasing all keys after the [Control]-B before hitting C)
(opens a new tmux window and switches to it)

jane@rice99:~$ source /afs/ir.stanford.edu/group/bil/env/j2/bin/activate

(j2)jane@rice99:~$ jupyter notebook list
Currently running servers:
https://localhost:9876/?token=ba682763f27d8e2d59862badef28b0eaecb552529933176e :: /afs/ir.stanford.edu/users/j/a/jane

If successful, the output will list the new running Jupyter notebook, its TCP port, token, and home directory where the notebook was launched.
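If more than one notebook is listed, it can help to split the listing into URL and directory pairs. The sketch below assumes the "URL :: directory" line format shown in the output above; the function name parse_notebook_list is hypothetical.

```python
def parse_notebook_list(text):
    """Split `jupyter notebook list` output into (url, directory) pairs."""
    servers = []
    for line in text.splitlines():
        # Skip the "Currently running servers:" header and blank lines
        if " :: " not in line:
            continue
        url, directory = line.split(" :: ", 1)
        servers.append((url.strip(), directory.strip()))
    return servers


sample_output = """Currently running servers:
https://localhost:9876/?token=ba682763f27d8e2d59862badef28b0eaecb552529933176e :: /afs/ir.stanford.edu/users/j/a/jane
"""
servers = parse_notebook_list(sample_output)
print(len(servers))
print(servers[0][1])
```

A result with more than one entry means several notebooks are running on that FarmShare machine.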

Detach and logout

Detach from tmux and logout of FarmShare.

[Control]-B then D (releasing all keys after the [Control]-B before hitting D)
(detaches tmux)

(j2)jane@rice99:~$ exit

jane@rice99:~$ exit

jane@rice99:~$ exit (if exit does not work here, use logout)

Client connection

Once the installation is complete and the Jupyter server is running, this is the only step that must be repeated for each new connection (e.g., a client laptop wakes from sleep, or a different laptop is used).

SSH into the same FarmShare system that the Jupyter server was started in above, tunneling the TCP port that the Jupyter server is listening on.

$ ssh jane@rice99.stanford.edu -L 9876:localhost:9876

substituting in the appropriate SUNet ID for jane, the appropriate FarmShare server hostname for rice99, and the appropriate TCP port for 9876.
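The three substitutions can be made mechanically. The following sketch assembles the same ssh command from its parts; the function name tunnel_command is hypothetical, and jane, rice99, and 9876 are the placeholder values used throughout this guide.

```python
import shlex


def tunnel_command(sunet_id, server, port):
    """Build the ssh command that forwards a local port to the same port on a FarmShare server."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return [
        "ssh",
        f"{sunet_id}@{server}.stanford.edu",
        "-L",
        f"{port}:localhost:{port}",
    ]


command = tunnel_command("jane", "rice99", 9876)
print(shlex.join(command))  # ssh jane@rice99.stanford.edu -L 9876:localhost:9876
```

The same port number appears on both sides of the -L option, which is why the https://localhost:9876 URL printed on the server also works on the client.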

Windows users can set up an SSH local tunnel using menu options in putty. Enter the appropriate TCP port in the Source port box and enter localhost:9876 for the Destination box and then click the Add button.

Once logged in, paste the https://localhost:9876/?token=... URL provided from the jupyter_start or jupyter notebook list commands into a web browser to connect to the Jupyter notebook. Ignore any browser security warnings.

Note: Jupyter works best with Firefox or Chrome browsers. Safari and Internet Explorer are not well supported.

Shut down Jupyter

When finished, it is best to shut down Jupyter to free resources for others if it will not be used for an extended period (e.g., more than two weeks).

SSH into the same FarmShare system that the Jupyter server was started in.

$ ssh jane@rice99.stanford.edu

substituting in the appropriate SUNet ID for jane and the appropriate FarmShare server hostname for rice99.

Return to tmux window 0 where Jupyter was launched, shut down the server, and exit tmux.

rice99:~> tmux a -d
[Control]-B then 0 (releasing all keys after the [Control]-B before hitting 0)
(returns to tmux window 0 with lots of Jupyter notebook output)

[I 17:23:59.671 NotebookApp]
[Control]-C then y [Enter] (releasing all keys after the [Control]-C, then answering yes to close Jupyter)

This will shut down the Jupyter notebook. Log out of FarmShare.

Renewing virtual terminal

This only needs to be done when the Kerberos credentials expire after a week and the notebook no longer functions. To restore the notebook after this time, SSH into the same FarmShare system used to create the virtual terminal. Renewal can take place at any time after a week when use is desired.

$ ssh jane@rice99.stanford.edu

substituting SUNet ID for jane and the appropriate FarmShare server for rice99.

Go to an open tmux window. There is no need to stop the existing Jupyter notebook server.

rice99:~> tmux a -d
(should reattach to window made in "Confirm Jupyter server status", but any open j2 prompt will work)

(j2)jane@rice99:~$ kinit -r 7d; aklog
Password for jane@stanford.edu:

Confirm that a new Kerberos ticket and AFS token have been gathered:

 (j2)jane@rice99:~$ klist
 Ticket cache: FILE:/tmp/.krb5_jane.tgt
 Default principal: jane@stanford.edu
 
 Valid starting       Expires              Service principal
 11/11/1885 12:00:00  11/12/1885 11:00:00  krbtgt/stanford.edu@stanford.edu
         renew until 11/18/1885 12:00:00
 11/11/1885 12:00:00  11/12/1885 11:00:00  afs/ir.stanford.edu@stanford.edu
         renew until 11/18/1885 12:00:00
 
 (j2)jane@rice99:~$ tokens
 
 Tokens held by the Cache Manager:
 
 User's (AFS ID 99999) tokens for afs@ir.stanford.edu [Expires Nov 12 12:00]
    --End of list--
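
The "Valid starting" and "renew until" timestamps in the klist output above show how long the ticket can be kept alive. The sketch below computes that window; it assumes the month/day/year timestamp format shown in the example, which may differ on other systems, and the function name renewable_window is hypothetical.

```python
from datetime import datetime

# Timestamp layout as it appears in the example klist output (an assumption)
KLIST_FORMAT = "%m/%d/%Y %H:%M:%S"


def renewable_window(valid_starting, renew_until):
    """Return the time between ticket issue and the renewal limit."""
    start = datetime.strptime(valid_starting, KLIST_FORMAT)
    limit = datetime.strptime(renew_until, KLIST_FORMAT)
    return limit - start


window = renewable_window("11/11/1885 12:00:00", "11/18/1885 12:00:00")
print(window.days)  # 7
```

A window of seven days matches the 7d value passed to kinit -r.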

Once successful, detach and logout of FarmShare:

[Control]-B then D (releasing all keys after the [Control]-B before hitting D)
(detaches tmux)

This process extends the terminal's credentials for another week. The same notebook URL can be used without interruption.

Examples

Python 3

Plotly

Plotly is a popular, open-source, interactive plotting framework available for common data analysis programming languages.

import numpy as np
import plotly.offline as py
import plotly.graph_objs as go
py.init_notebook_mode(connected=True)

x = np.r_[ -np.pi : np.pi : 0.1 ]
py.iplot([go.Scatter(x=x, y=np.sin(x)), go.Scatter(x=x, y=np.cos(x))])

Jupyter python demo.png

bqplot + ipywidgets

Useful for interactive interfaces with realtime plot updating.

import numpy as np
from bqplot import pyplot as bqplt
import ipywidgets

def update_phase(n):
    line1.y = np.sin(x - n)

bqplt.clear()
x = np.r_[ -np.pi : np.pi : 0.1]
line1 = bqplt.plot(x, np.sin(x))
line2 = bqplt.plot(x, np.cos(x), 'r')
bqplt.show()

ipywidgets.interact(update_phase, n=(-2*np.pi,2*np.pi))

Jupyter python bqplot demo.gif

MATLAB

farmshare_plotly_init;

x = -pi : 0.1 : pi;
hold on;
plot(x, sin(x));
plot(x, cos(x));

Jupyter matlab demo.png

Note: If Jupyter installation was performed prior to 2017/04, run the following script in an ssh terminal to configure plotly for matlab: plotly_create_config

R

library(plotly)

x  <- seq(-pi, pi, 0.1)
y1 <- sin(x)
y2 <- cos(x)

ds <- data.frame(x, y1, y2)

p  <- plot_ly(ds, x = ~x, y = ~y1, type = 'scatter', mode='lines') %>%
      add_trace(y = ~y2)

embed_notebook(p)

Jupyter r demo.png

Julia

using Plots
plotly()

plot([sin, cos], -pi, pi)

Jupyter julia demo.png

SAS

data curves;
  do x = -constant("pi") to constant("pi") by 0.1;
    y = sin(x);
    z = cos(x);
    output;
  end;
run;

symbol1 interpol=join color=blue width=5;
symbol2 interpol=join color=red  width=5;
axis1 label=none minor=none;

proc gplot data=curves;
  plot y*x=1 z*x=2 / overlay vaxis=axis1 haxis=axis1;
run;

Jupyter sas demo.png

Useful Linux commands

Kerberos and AFS

  • klist - Displays Kerberos ticket status, cache file location, and expiration date
  • tokens - Lists AFS token status and expiration date
  • kinit -r 7d - obtains a new Kerberos ticket using SUNetID that is renewable for up to a week
  • aklog - obtains AFS token from a valid Kerberos ticket

Jupyter management

  • jupyter notebook list - will list running Jupyter notebooks on that specific FarmShare machine (e.g., rice99)
  • jupyter notebook stop <port> - will kill the Jupyter notebook running on port <port>

tmux management

  • tmux list-sessions - lists all the tmux sessions on a given FarmShare machine
  • tmux kill-session -t <target> - will kill tmux session <target>. <target> defaults to a number if not renamed (e.g., 0, 1, 2, etc).

Troubleshooting

Single most common error

If the Jupyter notebook appears to be running, but no files appear in the notebook, this means that the session's Kerberos tickets and AFS tokens have expired. Follow the instructions in Jupyter#Renewing_virtual_terminal carefully. Similarly, if typing commands in the console yields permission denied errors, the cause is nearly always expired Kerberos tickets as well.

General troubleshooting

The Jupyter notebooks are great when they work, but can be confusing to fix when not working. Do not rely on an existing browser window running Jupyter to help with debugging, because its behavior can be misleading. Below are some general troubleshooting steps to help get things working again.

  • Best practice - only one tmux session and jupyter notebook
    • It can be very confusing if multiple tmux sessions and jupyter notebooks are running.
    • Use the commands jupyter notebook list and tmux list-sessions to confirm that only one jupyter notebook is running in only one tmux session
    • Use jupyter notebook stop <port> and tmux kill-session -t <target> to trim down to only one tmux session and one jupyter notebook
  • Confirm the rice machine being used
    • Jupyter will only work if the same rice machine used to launch Jupyter is also used to establish the client connection
    • Jupyter notebooks launched on a given riceXX machine are only accessible via that riceXX machine
  • Confirm tmux session has valid Kerberos tickets and AFS tokens
  • Confirm that Jupyter is running
    • In an open j2 tmux window, check that Jupyter is running via jupyter notebook list
    • If no notebooks appear, then follow instructions at Jupyter#Jupyter_server to re-launch a Jupyter server
  • Confirm that the SSH tunnel is in place
    • The SSH tunnel needs to be from the personal computer (e.g., laptop) to rice
    • Refresh the browser window with the Jupyter URL (https://localhost:XXXX)
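
Whether the SSH tunnel is in place can be checked from the personal computer without a browser. The following is a minimal sketch that attempts a TCP connection to the local end of the tunnel; the function name tunnel_is_listening is hypothetical and 9876 is the placeholder port used in this guide.

```python
import socket


def tunnel_is_listening(port, host="localhost", timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not reachable
        return False


if tunnel_is_listening(9876):
    print("Port 9876 accepts connections; the tunnel appears to be up.")
else:
    print("Nothing is listening on port 9876; re-establish the SSH tunnel.")
```

A successful connection only shows that the local port is open. It does not confirm that Jupyter is running on the rice machine at the other end, so jupyter notebook list is still needed for that.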

Custom Jupyter notebook

Kernels

Additional or custom kernels can be installed on a per-user basis, typically without root privileges, and used transparently in conjunction with the j2 environment.

Generally, user installed kernels should be placed in ~/.local/share/jupyter/kernels and these will be added to the user's Jupyter notebook in addition to the ones made available by the j2 sys-prefix environment.
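
A user kernel is a directory containing a kernel.json file. The sketch below writes one for a Python interpreter; the function name write_kernelspec, the kernel name myenv, and the interpreter path are hypothetical, and it assumes the ipykernel package is installed for that interpreter. The demonstration writes to a temporary directory rather than to the real kernels directory.

```python
import json
import tempfile
from pathlib import Path


def write_kernelspec(kernels_dir, name, python_executable, display_name):
    """Create <kernels_dir>/<name>/kernel.json for a Python kernel."""
    kernel_dir = Path(kernels_dir).expanduser() / name
    kernel_dir.mkdir(parents=True, exist_ok=True)
    spec = {
        # Jupyter replaces {connection_file} when it launches the kernel
        "argv": [python_executable, "-m", "ipykernel_launcher",
                 "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }
    spec_path = kernel_dir / "kernel.json"
    spec_path.write_text(json.dumps(spec, indent=2))
    return spec_path


# Demonstration in a throwaway directory (hypothetical interpreter path)
with tempfile.TemporaryDirectory() as tmp:
    path = write_kernelspec(tmp, "myenv", "/path/to/venv/bin/python",
                            "Python (myenv)")
    written = json.loads(path.read_text())
    print(written["display_name"])
```

For real use, ~/.local/share/jupyter/kernels would be passed as the kernels directory so the kernel appears alongside the ones provided by j2.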

Custom environments

A custom virtual environment can be deployed and used instead of the j2 one provided here if desired.

In most cases, custom kernels as mentioned above should suffice, but some core changes like different versions of python, ipython, or jupyter require a custom virtual environment.

Create virtual environment:

virtualenv -p python3 ~/venv
source ~/venv/bin/activate

The virtual environment is installed on the Isilon home partition to preserve AFS user quota space, as virtual environments can get large (the j2 environment deployed in this application exceeds 1.5GB). The Isilon home partition is accessible across all FarmShare2 machines.

Note, this will create a virtual environment with python3. To use python2, specify -p python2 instead of -p python3.

From there, install the desired packages using pip (and update pip and all virtualenv packages to the latest versions):

pip install pip-review
pip-review -a

Most people performing numerical analyses, interacting with the web, or manipulating data with Jupyter will want to install the following common packages:

 pip install numpy scipy jupyter sympy pandas blaze matplotlib plotly seaborn statsmodels SQLAlchemy Pillow Requests lxml beautifulsoup4

The full list of packages installed in the virtual environment (generated by a pip freeze) used in these instructions can be found at /afs/ir.stanford.edu/group/bil/env/scripts/requirements.txt

Once installed, jupyter can be launched from the command line as usual. The jupyter_start script in /afs/ir.stanford.edu/group/bil/env/scripts/ will also work in most cases since it relies on the environment path to launch jupyter.

There are some additional notes about environment creation available at /afs/ir.stanford.edu/group/bil/env/scripts/README_JUPYTER.

Technical Details

  • Jupyter setup is best performed via the Linux console. This guide is mostly step-by-step, but general familiarity with Linux is helpful.
  • SSH provides an encrypted remote shell into another system and is the primary way that Jupyter will be installed and accessed.
  • The first two commands of the installation (bash and source) switch the shell to bash and then update the environment to use a pre-built Jupyter installation. The prompt changes to have the (j2) prefix when the environment change is successful. SUNet IDs created after 2014 have their default shell set to bash and can omit the call to bash.
  • If of interest, the pip packages installed for the Jupyter environment are stored in a requirements.txt file at /afs/ir.stanford.edu/group/bil/env/scripts/requirements.txt. Additional packages may be installed upon request.
  • The jupyter_config_wrapper command calls a script that will create encryption keys for the Jupyter notebook, configure the jupyter config file, and set the default tmux shell to bash. Note, this script modifies existing Jupyter notebook config files (if they exist), but will not overwrite any parameters that have changed from their default values.
  • These instructions run the notebook within a persistent virtual terminal so it will continue to run even after the user has logged out of FarmShare.
  • The jupyter_start script accepts an optional port argument, specifying which port the server should listen on (e.g., jupyter_start 9876).
  • Once the Jupyter notebook server is running, it is ready to accept client connections. For security, it only accepts connections from localhost (i.e., connections originating from that specific FarmShare system itself). Connections from other systems (e.g., a laptop) are created through SSH tunnels.
  • Since the encryption keys were self-signed, the browser will warn about an insecure connection, but this warning can be disregarded.
  • If this notebook URL is misplaced, tmux attach -d brings up the virtual terminal and the notebook status can be queried using the verification step on tmux window 1 using jupyter notebook list.