BLAS is an acronym for Basic Linear Algebra Subprograms. As the name indicates, it contains subprograms for basic operations on vectors and matrices. BLAS was designed to be used as a building block in other codes, for example LAPACK. The source code for BLAS is available through Netlib. However, many computer vendors supply a special version of BLAS tuned for maximal speed and efficiency on their hardware. This is one of the main advantages of BLAS: the calling sequences are standardized, so programs that call BLAS will work on any computer that has BLAS installed. If you have a fast version of BLAS, every program that calls BLAS benefits from it. Hence BLAS provides a simple and portable way to achieve high performance for calculations involving linear algebra. LAPACK is a higher-level package built on the same ideas.

The BLAS subroutines can be divided into three *levels*:

- **Level 1:** Vector-vector operations. *O(n)* data and *O(n)* work.
- **Level 2:** Matrix-vector operations. *O(n^2)* data and *O(n^2)* work.
- **Level 3:** Matrix-matrix operations. *O(n^2)* data and *O(n^3)* work.

Each BLAS and LAPACK routine comes in several versions, one for each precision (data type). The first letter of the subprogram name indicates the precision used:

- S - Real single precision
- D - Real double precision
- C - Complex single precision
- Z - Complex double precision

Complex double precision is not strictly defined in Fortran 77, but most compilers will accept one of the following declarations:

    double complex list-of-variables
    complex*16 list-of-variables

Some of the BLAS 1 subprograms are:

- xCOPY - copy one vector to another
- xSWAP - swap two vectors
- xSCAL - scale a vector by a constant
- xAXPY - add a multiple of one vector to another
- xDOT - inner product
- xASUM - 1-norm of a vector
- xNRM2 - 2-norm of a vector
- IxAMAX - find the index of the entry with the largest absolute value in a vector

The letter x stands for one of S, D, C, Z, depending on the precision. A quick reference to BLAS 1 can be found at http://www.netlib.org/blas/blasqr.ps
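To make the naming scheme concrete, here is a minimal sketch that calls a few of the single-precision (S) routines listed above on two short vectors. The program name, the vector length and the data values are made up for illustration, and the sketch assumes a BLAS library is available at link time (for example via `-lblas`). Note that SDOT, SASUM, SNRM2 and ISAMAX are functions, so they are declared rather than called.

```fortran
      program lvl1
c     Sketch: a few single-precision BLAS 1 calls on short vectors.
c     Assumes a BLAS library is linked in (for example with -lblas).
      integer n
      parameter (n = 4)
      real x(n), y(n), alpha
      real sdot, snrm2, sasum
      integer isamax
      external sdot, snrm2, sasum, isamax
      data x / 1.0, -2.0, 3.0, -4.0 /
      data y / 1.0,  1.0, 1.0,  1.0 /

      alpha = 2.0
c     y := alpha*x + y
      call saxpy(n, alpha, x, 1, y, 1)
      write(*,*) 'y after SAXPY       = ', y
      write(*,*) 'x dot y             = ', sdot(n, x, 1, y, 1)
      write(*,*) '1-norm of x         = ', sasum(n, x, 1)
      write(*,*) '2-norm of x         = ', snrm2(n, x, 1)
      write(*,*) 'index of max |x(i)| = ', isamax(n, x, 1)
      end
```

With this data, SAXPY leaves y equal to (3, -3, 7, -7), the inner product is 58, the 1-norm of x is 10, and ISAMAX returns 4 because x(4) has the largest absolute value.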

Some of the BLAS 2 subprograms are:

- xGEMV - general matrix-vector multiplication
- xGER - general rank-1 update
- xSYR2 - symmetric rank-2 update
- xTRSV - solve a triangular system of equations

A detailed description of BLAS 2 can be found at http://www.netlib.org/blas/blas2-paper.ps.
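As an illustration of a BLAS 2 call, the following sketch uses SGEMV to compute y := alpha\*A\*x + beta\*y. The matrix size, the leading dimension `lda = 5` and the data are assumptions chosen for the example, and a BLAS library is assumed to be linked in. The array is deliberately declared larger than the part that is used, so that the role of the leading dimension is visible.

```fortran
      program lvl2
c     Sketch: y := alpha*A*x + beta*y with SGEMV.
c     Assumes a BLAS library is linked in (for example with -lblas).
      integer lda, m, n
      parameter (lda = 5, m = 2, n = 3)
      real a(lda,n), x(n), y(m)
      integer i, j

c     Fill the m-by-n part of A: A(i,j) = i + j
      do 20 j = 1, n
         do 10 i = 1, m
            a(i,j) = real(i + j)
   10    continue
   20 continue
      do 30 j = 1, n
         x(j) = 1.0
   30 continue
      do 40 i = 1, m
         y(i) = 0.0
   40 continue

c     'N' means use A itself, not its transpose.
      call sgemv('N', m, n, 1.0, a, lda, x, 1, 0.0, y, 1)
      write(*,*) 'y = ', y
      end
```

Here A is the 2-by-3 matrix with rows (2, 3, 4) and (3, 4, 5), so multiplying by a vector of ones gives y = (9, 12).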

Some of the BLAS 3 subprograms are:

- xGEMM - general matrix-matrix multiplication
- xSYMM - symmetric matrix-matrix multiplication
- xSYRK - symmetric rank-k update
- xSYR2K - symmetric rank-2k update

The more advanced matrix operations, like solving a linear system of equations, are contained in LAPACK. A detailed description of BLAS 3 can be found at http://www.netlib.org/blas/blas3-paper.ps.
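A BLAS 3 call follows the same pattern. The sketch below uses SGEMM to form C := alpha\*A\*B + beta\*C for small matrices. The sizes and values are invented for the example, and a BLAS library is assumed to be linked in.

```fortran
      program lvl3
c     Sketch: C := alpha*A*B + beta*C with SGEMM.
c     Assumes a BLAS library is linked in (for example with -lblas).
      integer m, n, k
      parameter (m = 2, n = 2, k = 3)
      real a(m,k), b(k,n), c(m,n)
      integer i, j

c     A is all ones; column j of B is filled with the value j.
      do 20 j = 1, k
         do 10 i = 1, m
            a(i,j) = 1.0
   10    continue
   20 continue
      do 40 j = 1, n
         do 30 i = 1, k
            b(i,j) = real(j)
   30    continue
   40 continue

c     Leading dimensions equal the declared row counts: m, k, m.
      call sgemm('N', 'N', m, n, k, 1.0, a, m, b, k, 0.0, c, m)
      do 50 i = 1, m
         write(*,*) (c(i,j), j = 1, n)
   50 continue
      end
```

Each entry C(i,j) is the sum of k = 3 copies of j, so both rows of C come out as (3, 6). Because beta is zero, C does not need to be initialized before the call.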

Let us first look at a very simple BLAS routine, SSCAL. The call sequence is

    call SSCAL ( n, a, x, incx )

Here *x* is the vector, *n* is the length
(number of elements in *x* we wish to use), and *a*
is the scalar by which we want to multiply *x*. The last
argument *incx* is the *increment*. Usually, *incx=1*
and the vector *x* corresponds directly to the
one-dimensional Fortran array *x*. For *incx>1*
it specifies how many elements in the array we should
"jump" between each element of the vector *x*.
For example, if *incx=2* it means we should only scale
every other element (note: the physical dimension of the array *x*
should then be at least *2n-1*). Consider these examples
where *x* has been declared as `real x(100)`.

    call SSCAL(100, a, x, 1)
    call SSCAL( 50, a, x(51), 1)
    call SSCAL( 50, a, x(2), 2)

The first line will scale all 100 elements of *x* by *a*.
The next line will only scale the last 50 elements of *x*
(starting at *x(51)*) by *a*. The last line will scale all the
elements of *x* with even indices by *a*.

Observe that the array *x* will be overwritten by the
new values. If you need to preserve a copy of the old *x*,
you have to make a copy first, e.g., by using SCOPY.
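The following sketch puts these pieces together: it saves the original vector with SCOPY and then applies SSCAL with two different starting elements and increments. The program name, the array `xold` and the data values are illustrative assumptions, and a BLAS library is assumed to be linked in.

```fortran
      program scalex
c     Sketch: SSCAL with different start elements and increments,
c     keeping the original vector with SCOPY first.
c     Assumes a BLAS library is linked in (for example with -lblas).
      integer n
      parameter (n = 100)
      real x(n), xold(n), a
      integer i

      do 10 i = 1, n
         x(i) = 1.0
   10 continue
      a = 2.0

c     Save the original values before they are overwritten.
      call scopy(n, x, 1, xold, 1)

c     Scale x(2), x(4), ..., x(100): 50 elements, stride 2.
      call sscal(50, a, x(2), 2)
c     Scale x(51), ..., x(100): 50 consecutive elements.
      call sscal(50, a, x(51), 1)

      write(*,*) 'x(1), x(2)   = ', x(1), x(2)
      write(*,*) 'x(51), x(52) = ', x(51), x(52)
      write(*,*) 'xold(52)     = ', xold(52)
      end
```

After the two calls, x(1) is untouched, x(2) and x(51) have each been scaled once, and x(52) has been scaled twice, while `xold` still holds the original values.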

Now consider a more complicated example. Suppose you have two
2-dimensional arrays A and B, and you are asked to find the *(i,j)*
entry of the product AB. This is easily done by computing the
inner product of row *i* of A and column *j* of
B. We can use the BLAS 1 function SDOT, which returns the inner
product as its value. The only difficulty is
to figure out the correct indices and increments. The calling
sequence for SDOT is

    SDOT ( n, x, incx, y, incy )

Suppose the array declarations were

    real A(lda,lda)
    real B(ldb,ldb)

but in the program you know that the actual size of A is *m×p*
and for B it is *p×n*. The *i*'th row of A starts
at the element *A(i,1)*. But since Fortran stores
2-dimensional arrays down columns, the next row element *A(i,2)*
will be stored *lda* elements later in memory (since *lda*
is the length of a column). Hence we set *incx = lda*. For
the column in B there is no such problem: the elements are stored
consecutively, so *incy = 1*. The length of the inner
product calculation is *p*. Hence the answer is

    SDOT ( p, A(i,1), lda, B(1,j), 1 )
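A complete program built around this call might look like the sketch below. The values of `lda`, `ldb`, *m*, *p*, *n* and the matrix entries are assumptions chosen so the result is easy to check by hand, and a BLAS library is assumed to be linked in. Since SDOT is a function, it is declared as `real` and its result is assigned to a variable.

```fortran
      program dotij
c     Sketch: entry (i,j) of A*B as a row-times-column inner product.
c     Sizes and values are made up for illustration.
c     Assumes a BLAS library is linked in (for example with -lblas).
      integer lda, ldb, m, p, n
      parameter (lda = 10, ldb = 10, m = 2, p = 3, n = 2)
      real a(lda,lda), b(ldb,ldb), cij
      real sdot
      external sdot
      integer i, j, k

c     A(i,k) = i, B(k,j) = k*j in the used m-by-p and p-by-n parts.
      do 20 k = 1, p
         do 10 i = 1, m
            a(i,k) = real(i)
   10    continue
   20 continue
      do 40 j = 1, n
         do 30 k = 1, p
            b(k,j) = real(k*j)
   30    continue
   40 continue

      i = 2
      j = 2
c     Row i of A has stride lda; column j of B has stride 1.
      cij = sdot(p, a(i,1), lda, b(1,j), 1)
      write(*,*) 'C(2,2) = ', cij
      end
```

Row 2 of A is (2, 2, 2) and column 2 of B is (2, 4, 6), so the inner product is 24. Only the *m×p* and *p×n* corners of the arrays are filled; the stride `lda` ensures that SDOT never reads the unused part of A.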

First of all you should check if you already have BLAS on your system. If not, you can find it on Netlib at http://www.netlib.org/blas.

The BLAS routines are almost self-explanatory. Once you know which routine you need, fetch it and read the header section that explains the input and output parameters in detail. We will look at an example in the next section when we address the LAPACK routines.

*Copyright © 1995-7 by Stanford University. All rights
reserved.*