Life tables and R programming: Period Life Table Construction

  • The directory /tmp/drutex-e695c3fb9d9d3404353802d068e2073d-1 has been created.
  • The directory /tmp/drutex-e695c3fb9d9d3404353802d068e2073d-2 has been created.
  • The directory /tmp/drutex-e695c3fb9d9d3404353802d068e2073d-3 has been created.
  • The directory /tmp/drutex-e695c3fb9d9d3404353802d068e2073d-4 has been created.
  • The directory /tmp/drutex-e695c3fb9d9d3404353802d068e2073d-5 has been created.

There are many ways to construct a life table and they vary with complexity. Various assumptions go into the conversion of TeX Embedding failed! into TeX Embedding failed!, especially at the youngest ages, and how to terminate the life table at the highest ages. Graduation or other techniques may be used to go from abridged life tables (e.g. 5 year age groups) to single years of age. See sections 3.1 and following in Preston et. al. for details.

The data is deaths from all causes over the period 1999-2002, obtained from the CDC:

## fetch data; will read in and define AllCOD.19992002
load(url("http://coale.stanford.edu/data/AllCOD.19992002.RData"));

Lifetable creation

Create a simple life table for 1999-2002 females using the raw data given in the above data. See one solution below.

  > AllCOD.19992002
  $Females
    	 Gender	   Age.Description	 Death.Count	 Population	 Crude.Death.Rate
  1 	      F	      Under 1 Year	       49016	    7846166	            624.7
  2 	      F	        1- 4 years	        8689	   30110500	             28.9
  3 	      F	        5- 9 years	        5595	   39639456	             14.1
  4 	      F	       10-14 years	        6399	   40360290	             15.9
  5 	      F	       15-19 years	       15621	   39339459	             39.7
  6 	      F	       20-24 years	       17776	   37857523	             47.0
  7 	      F	       25-34 years	       50896	   79035547	             64.4
  8 	      F	       35-44 years	      131696	   90536189	            145.5
  9 	      F	       45-54 years	      245138	   78172750	            313.6
  10	      F	       55-64 years	      394940	   51980118	            759.8
  11	      F	       65-74 years	      766829	   40202118	           1907.4
  12	      F	       75-84 years	     1443183	   30133845	           4789.2
  13	      F	 85 years and over	     1782749	   12281209	          14516.1

Compute the observed death rates and write a function in R to compute and return a period lifetable with the same age group boundaries as the raw data (0,1,5,10,15,...80, 85+) Refer to Jamie’s notes and chapter 3 in Preston et al. Your function should take as arguments: x,nDx, nKx (age, deaths between (x,x+n), population aged (x,x+n) and return columns of the lifetable: x,nMx,nax,nqx, lx, ndx, nLx, Tx, ex

A template of the function will look like:

 lifetable<- function(x, nDx, nKx) {
 
     # compute death rates
     # decide on 1a0, 4a1, ... values
     # compute nqx with final qx=1
     # compute lx,dx,nLx,Tx, ex
 
     # return data frame of results with named columns
 
     lt <- data.frame(x=x, nax =nax, 
                nMx = nMx, 
                nqx = nqx, 
                lx = lx, 
                ndx = ndx, 
                nLx = nLx,
                Tx = Tx,
                ex = ex )
     return(lt)
   }

Compare your solution with the examples of Jamie's example life table function, the terse code from UC Berkeley and the much more elaborate life.table() found in John Wilmoth's archive of R functions.

 

Solution

Here is one possible route to take in solving this exercise. Since most people used Jamie’s life table function as a starting point, I will present a modification of that as my solution. I use data.frame() to return the value of the calculation because then the individual components of the result may be selected by name, not just by column number.

### --- Data --- ###
  ## load() is quite different from reading in data from an ASCII file.  It restores 
  ## variables once saved with the save() command.  It is difficult to know in advance 
  ## what will be restored by load-ing a stored dataset.  I try to minimize confusion by 
  ## using the same name for the R object and for the name of the file.
 
load(url("http://coale.stanford.edu/data/AllCOD.19992002.RData"));
    # str(AllCOD.19992002)  ## the str() function will tell you how something is structured. 
    # here is  shows that AllCOD.19992002 consists of a list with two components, 
    # (Female, Male), and each of those components is in turn a data.frame.
 
females <- AllCOD.19992002$Females ; 
 
# variables I will pass the life table function
# age is iregularly spaced
x <- c(0,1,5,10,15,20,25,35,45,55,65,75,85) ;
  # note that R collapses a single column to a vector when it pulls out the result out 
  # of a data.frame
nDx <- females$Death.Count;   # other syntax which produces the same result: females[[3]], females[,3], 
                              # females["Death.Count"], or AllCOD.19992002$Females$Death.Count
nKx <- females$Population ;
nMx <- nDx / nKx ;

And now the lifetable function. Note that I do not have separate arguments for TeX Embedding failed! and TeX Embedding failed!, but only a single TeX Embedding failed! argument.

life.table <- function( x, nMx){
## simple lifetable using Keyfitz and Flieger separation factors and 
## exponential tail of death distribution (to close out life table)
 
  b0 <- 0.07;   b1<- 1.7;      
  nmax <- length(x)
  # nMx = nDx/nKx   
  n <- c(diff(x),999)          #width of the intervals
 nax <- n / 2;		            # default to .5 of interval
 nax[1] <- b0 + b1 *nMx[1]    # from Keyfitz & Flieger(1968)
 nax[2] <- 1.5  ;              
 nax[nmax] <- 1/nMx[nmax] 	  # e_x at open age interval
 nqx <- (n*nMx) / (1 + (n-nax)*nMx)
 nqx<-ifelse( nqx > 1, 1, nqx); # necessary for high nMx
 nqx[nmax] <- 1.0
 lx <- c(1,cumprod(1-nqx)) ;  # survivorship lx
 lx <- lx[1:length(nMx)]
 ndx <- lx * nqx ;
 nLx <- n*lx - nax*ndx;       # equivalent to n*l(x+n) + (n-nax)*ndx
 nLx[nmax] <- lx[nmax]*nax[nmax]
 Tx <- rev(cumsum(rev(nLx)))
 ex <- ifelse( lx[1:nmax] > 0, Tx/lx[1:nmax] , NA);
 lt <- data.frame(x=x,nax=nax,nmx=nMx,nqx=nqx,lx=lx,ndx=ndx,nLx=nLx,Tx=Tx,ex=ex)
 return(lt)
}