-----------------------------------------------------------------------------------
log: C:\AAA Miker Files\newer web pages\soc_meth_proj3\section4_2009.log
log type: text
opened on: 17 Feb 2009, 15:31:10
. set mem 200m
Current memory allocation

                    current                                 memory usage
    settable          value     description                 (1M = 1024k)
    --------------------------------------------------------------------
    set maxvar         5000     max. variables allowed           1.909M
    set memory          200M    max. data space                200.000M
    set matsize         400     max. RHS vars in models          1.254M
                                                            -----------
                                                               203.163M
. use "C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", clear
. gen random=uniform()
*I generated a uniform random variable so that I could draw graphs from a random subset of the data, which takes much less time.
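*A minimal sketch of how the random subset could be made reproducible from one session to the next. This was not run in this log. The seed value and the variable name random_seeded are assumptions chosen for illustration. runiform() is the current name of the uniform generator; in older releases substitute uniform().

```stata
* Assumption: 20090217 is an arbitrary seed; any fixed integer would do
set seed 20090217
* random_seeded is a hypothetical name, used so it does not collide with random
gen random_seeded = runiform()
* Graph roughly 10% of the cases aged 25 to 64, as in the commands in this log
twoway (scatter incwage age) if age>24 & age<65 & random_seeded<.1
```

*With a fixed seed, rerunning these lines on the same dataset in the same sort order should select the same cases each time.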
. twoway (scatter incwage age) if age>24 & age<65 & random<.1
value label incwagelbl not found
r(111);
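*The graph failed because incwage was attached to a value label (incwagelbl) that is not defined, so I had to detach the value label from incwage.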
. label val incwage .
. twoway (scatter incwage age) if age>24 & age<65 & random<.1
. *Even with the sample size reduced by 90%, this kind of scatterplot is still uninformative because there are too many dots.
. save "C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", replace
file C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta saved
*I saved the dataset above because I am about to run a dataset-erasing operation: table with the replace option, which erases the dataset in memory and replaces it with the small table of results.
. table age if age>24 & age<65 [aweight= perwt_rounded], contents (mean incwage) replace
-------------------------
      Age | mean(incwage)
----------+--------------
       25 |   18994.63884
       26 |   21609.25855
       27 |   22602.97719
       28 |   23740.77618
       29 |   24434.94152
       30 |   25561.54957
       31 |   25912.09322
       32 |    27763.1664
       33 |   28465.10021
       34 |   26307.66749
       35 |   28294.34806
       36 |   28077.12539
       37 |    27960.1125
       38 |   30439.64504
       39 |   29310.39272
       40 |   28797.78578
       41 |   30119.11736
       42 |   30995.87134
       43 |   31190.21905
       44 |   29284.21552
       45 |   30992.28454
       46 |   31361.58221
       47 |   31017.66766
       48 |   33609.92178
       49 |   31860.37725
       50 |   33065.30876
       51 |   31988.28593
       52 |   31181.54386
       53 |   30966.94183
       54 |   28962.11128
       55 |    28042.6756
       56 |   27754.03629
       57 |   27212.21566
       58 |   22663.09439
       59 |   23186.67787
       60 |   20070.18688
       61 |   20931.43403
       62 |   16455.27565
       63 |   13791.26046
       64 |   11673.78667
-------------------------
. edit
- preserve
. *Running table with the replace option puts the new table into memory, where it can easily be copied to Excel, but it also erases your old dataset from memory, so be sure to save before you run it. I opened the data editor and copied the cells.
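*A hedged alternative sketch, not run in this log: preserve and restore can protect the dataset in memory, and collapse can build the same weighted means by age. The output file name mean_incwage_by_age.csv is hypothetical. outsheet is the older export command; newer releases also offer export delimited.

```stata
* Take a snapshot of the dataset in memory so it can be brought back afterwards
preserve
* Replace the data in memory with one row per age holding the weighted mean of incwage
collapse (mean) incwage if age>24 & age<65 [aweight=perwt_rounded], by(age)
* Assumption: hypothetical output file, written to the current working directory
outsheet using "mean_incwage_by_age.csv", comma replace
* Bring the full dataset back into memory
restore
```

*The resulting file can be opened directly in Excel, and the original data are back in memory after restore without having to reload the .dta file.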
. clear
. use "C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", clear
. graph box incwage, over(age)
--Break--
r(1);
*This graph was taking too long on my old laptop, so I interrupted it.
. graph box incwage if age>24 & age<65 & random<.1, over(age)
. graph box incwage if age>24 & age<65 & random<.1 & incwage<100000, over(age)
. *These box plots were not as informative, I think, as the simple mean income by age, for two reasons. First, regression is a regression of means, not of medians, whereas a box plot is drawn around the median. Second, the shape of the means by age conforms very well to the shape of the predicted values from the regression models that include age and age-squared.
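*A minimal sketch of the kind of regression model referred to above, with age and age-squared as predictors. It was not run in this log. The variable names age_sq and incwage_hat are hypothetical, and the use of perwt_rounded as an analytic weight is an assumption carried over from the table command.

```stata
* age_sq is a hypothetical name for the squared term
gen age_sq = age^2
* Weighted regression of wage income on age and age-squared, ages 25 to 64
regress incwage age age_sq if age>24 & age<65 [aweight=perwt_rounded]
* incwage_hat (hypothetical name) holds the predicted mean income for each case in the estimation sample
predict incwage_hat if e(sample)
```

*The predicted values trace a smooth curve in age, which can be set beside the table of mean income by age to compare their shapes.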
. save "C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", replace
file C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta saved
. exit, clear