Linguist 278: Programming for Linguists
Stanford Linguistics, Fall 2021
Christopher Potts

subprocess

The subprocess module, part of Python's standard library, lets you call other command-line utilities from inside your Python programs.

In [1]:
import subprocess
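
As a minimal sketch of what such a call looks like, the function below runs a short snippet in a child Python interpreter and returns its exit status. The helper name `run_python_snippet` is hypothetical and not part of the course code; `sys.executable` is used only so that the example does not depend on any external tool being installed.

```python
import subprocess
import sys

def run_python_snippet(code):
    """Run `code` in a child Python interpreter and return its exit status.

    Hypothetical helper, for illustration only.
    """
    # The command is a list: the program first, then one item per argument.
    cmd = [sys.executable, "-c", code]
    # `subprocess.run` waits for the child to finish and returns a
    # `CompletedProcess` object describing what happened.
    proc = subprocess.run(cmd)
    # By convention, an exit status of 0 means success.
    return proc.returncode
```

Passing the command as a list, as the examples in this notebook do, means that no shell is involved, so file names containing spaces need no special quoting.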

Convert a notebook to HTML

In [2]:
def notebook2html(filename):
    """Converts a notebook file to HTML using `nbconvert`

    Parameters
    ----------
    filename : str
        Path to the notebook (`.ipynb`) file to convert.

    Writes
    ------
    A new file with the same basename as `filename`, but
    with the `.ipynb` extension replaced by `.html`.

    """
    cmd = ["jupyter", "nbconvert", "--to", "html", filename]
    subprocess.run(cmd)

In [3]:
notebook2html("ling278_class17_subprocess.ipynb")
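
By default, `subprocess.run` does not raise an error when the external command fails, so a function like `notebook2html` can appear to succeed even if the conversion did not happen. One possible way to detect failure, sketched below, is to pass `check=True` and catch the resulting exceptions. The helper name `run_checked` is hypothetical, and returning a boolean is just one design choice among several.

```python
import subprocess
import sys

def run_checked(cmd):
    """Run `cmd` and report whether it succeeded.

    Hypothetical helper, for illustration only. Returns True if the
    command exited with status 0, and False if it exited with a nonzero
    status or if the program could not be found.
    """
    try:
        # `check=True` raises `CalledProcessError` on a nonzero exit status.
        subprocess.run(cmd, check=True)
    except FileNotFoundError:
        # The program named in `cmd[0]` could not be found.
        return False
    except subprocess.CalledProcessError:
        # The program ran but signaled failure.
        return False
    return True
```

Under these assumptions, `run_checked(["jupyter", "nbconvert", "--to", "html", filename])` would return False rather than failing silently when the conversion does not go through.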

Capture the standard output

In [4]:
import subprocess

def capture_ls(dirname="."):
    """Use the `ls` utility to list the contents of `dirname`, then parse that
    output into a list.

    Parameters
    ----------
    dirname : str
        Directory to list. Default is the current directory.

    Returns
    -------
    list of str
    """
    cmd = ["ls", dirname]
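    # Route the command's standard output into `proc.stdout` rather than the terminal.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE)
    # The captured output is a bytes object, so decode it before splitting into lines.
    b = proc.stdout
    return b.decode('utf8').splitlines()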

In [5]:
capture_ls()
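
The same capture pattern can be written more compactly on Python 3.7 or later, where `subprocess.run` accepts `capture_output=True` and `text=True`. The sketch below is a generalization to arbitrary commands, not the version used in class; the name `capture_lines` is hypothetical.

```python
import subprocess
import sys

def capture_lines(cmd):
    """Run `cmd` and return its standard output as a list of lines.

    Hypothetical helper, for illustration only.
    """
    proc = subprocess.run(
        cmd,
        capture_output=True,  # capture both stdout and stderr
        text=True,            # decode the output to str instead of bytes
        check=True)           # raise an error if the command fails
    return proc.stdout.splitlines()
```

With this helper, `capture_lines(["ls", "."])` should give the same kind of list as `capture_ls()`, assuming the `ls` utility is available on your system.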

Using pandoc with subprocess

Pandoc is a command-line tool with extensive support for converting documents between formats. Check out the demos. For the most part, the command syntax is uniform, and pandoc uses the file extensions to infer the input and output formats. So we can use subprocess to write a basic converter:

In [6]:
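def convert_with_pandoc(src_filename, output_filename):
    """Convert `src_filename` to `output_filename` using `pandoc`, which
    infers both formats from the file extensions."""
    # `-s` requests a standalone document; `-o` names the output file.
    cmd = ["pandoc", "-s", src_filename, "-o", output_filename]
    subprocess.run(cmd)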

In [7]:
convert_with_pandoc("ling278_class17_subprocess.html", "ling278_class17_subprocess.docx")
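
When a file extension is missing or misleading, pandoc's `-f` and `-t` options can name the input and output formats explicitly. The sketch below separates building the command from running it, which makes the command easy to inspect before anything is executed. The names `build_pandoc_cmd` and `convert_explicit` are hypothetical, and the sketch assumes that `pandoc` is installed and on your PATH.

```python
import shutil
import subprocess

def build_pandoc_cmd(src_filename, output_filename,
                     from_format=None, to_format=None):
    """Build (but do not run) a pandoc command as a list of strings.

    Hypothetical helper, for illustration only.
    """
    cmd = ["pandoc", "-s"]
    if from_format is not None:
        cmd += ["-f", from_format]  # explicit input format
    if to_format is not None:
        cmd += ["-t", to_format]    # explicit output format
    cmd += [src_filename, "-o", output_filename]
    return cmd

def convert_explicit(src_filename, output_filename, **formats):
    """Run the command from `build_pandoc_cmd`, failing loudly on errors."""
    # Assumption: pandoc is installed; check before trying to call it.
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    cmd = build_pandoc_cmd(src_filename, output_filename, **formats)
    subprocess.run(cmd, check=True)
```

For example, `build_pandoc_cmd("notes.txt", "notes.html", from_format="markdown")` returns a command that treats a hypothetical `notes.txt` file as Markdown, whatever its extension suggests.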