-------------------------------------------------------------------------------------------------------

name:  <unnamed>

log type:  text

opened on:  28 Sep 2016, 09:40:02

. ttest yrsed if age>=25 & age<=34, by(sex)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335
  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394
---------+--------------------------------------------------------------------
combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946
---------+--------------------------------------------------------------------
    diff |           -.2444469    .0427623               -.3282649   -.1606289
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -5.7164
Ho: diff = 0                                     degrees of freedom =    18536

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

*How do we make sense of the T-statistic of -5.7164?

*We use the Stata function ttail(df, t), which returns the upper-tail probability, Pr(T > t), for a t distribution with df degrees of freedom.

. display 1- ttail(18536, -5.7164)

5.524e-09

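* Since the statistic is negative, ttail at that point is almost 1, so we take one minus it to get the lower-tail probability, Pr(T < -5.7164). Because the t distribution is symmetric, the upper tail at +5.7164 gives the same answer: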

. display  ttail(18536, 5.7164)

5.524e-09

. display  2*ttail(18536, 5.7164)

1.105e-08

* Ordinarily we double the probability so that it covers both tails: from 5.7164 to infinity and from -5.7164 to negative infinity. The total of the two tail probabilities is about 1.1 in 100,000,000, which is extremely small. We therefore reject the null hypothesis that men and women ages 25 to 34 in the US have the same average educational attainment. If that null hypothesis (no difference between men’s and women’s education) were true, we would expect to find a difference at least as large as 0.24 years, in either direction, only about once in a hundred million samples.
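
* As a check, the t-statistic and the two-sided p-value can be recomputed by hand from the numbers in the t test table. This is a minimal sketch that types the values in directly; the last line is an assumption about workflow, since it only works if the ttest command has just been run and its stored results are still in memory.

```stata
* t-statistic = difference in means / standard error of the difference
display -.2444469/.0427623

* Two-sided p-value: double the upper-tail probability at |t|
display 2*ttail(18536, abs(-5.7164))

* The same p-value from ttest's stored results (run ttest first)
display 2*ttail(r(df_t), abs(r(t)))
```

* The first line should come out at about -5.7164, and the second reproduces the 1.105e-08 shown in the log.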

*Now on to a look at welfare income

. summarize incwelfr

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    incwelfr |    103226    40.62242    478.8231          0      25000

. summarize incwelfr if age>=15 & incwelfr>0 & incwelfr~=.

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    incwelfr |      1289    3253.134    2813.505          1      25000

. summarize incwelfr if age>=15 & incwelfr>0 & incwelfr~=. [fweight= perwt_rounded]

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    incwelfr |   2551246    3072.095    2803.442          1      25000

*Now we are going to generate a new dichotomous variable that will be 1 (yes) for people who had welfare income in 1999, and 0 (no) for people who did not.

. replace receives_welfare=1 if incwelfr>0 & incwelfr~=.

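*generate creates a new variable, and replace changes the values of a variable that already exists. Here receives_welfare was first created with generate (set to 0 for everyone), and the replace command above then sets it to 1 for people with welfare income.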

receives_we |
      lfare |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |    132,421       99.04       99.04
          1 |      1,289        0.96      100.00
------------+-----------------------------------
      Total |    133,710      100.00
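
* The generate step itself does not appear in this log. A minimal sketch of the two-step pattern, assuming the variable starts at 0 for every observation (which is what the tabulation suggests), might look like this. The one-line version in the final comment is a common shortcut, not necessarily what was run in class.

```stata
* Step 1: create the variable, set to 0 for every observation
generate byte receives_welfare = 0

* Step 2: switch to 1 for people with positive, non-missing welfare income
replace receives_welfare = 1 if incwelfr > 0 & incwelfr ~= .

* Check the result
tabulate receives_welfare

* One-line alternative (use instead of steps 1 and 2):
* generate byte receives_welfare = (incwelfr > 0 & incwelfr ~= .)
```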

*Now we define the value label that we plan to attach to the new variable, associating the value 0 with “no” and the value 1 with “yes.”

. label define receives_welfare_lbl 0 "no" 1 "yes"

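* The command above only defines the value label. A second command, label values receives_welfare receives_welfare_lbl, associates the variable “receives_welfare” with the value label “receives_welfare_lbl”, after which the tabulation shows “no” and “yes” instead of 0 and 1.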

receives_we |
      lfare |      Freq.     Percent        Cum.
------------+-----------------------------------
         no |    132,421       99.04       99.04
        yes |      1,289        0.96      100.00
------------+-----------------------------------
      Total |    133,710      100.00
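
* The labeling steps can be sketched in full as follows. The label values line is written out from the description of the step, since that command is not echoed in this log.

```stata
* Define a value label that maps 0 to "no" and 1 to "yes"
label define receives_welfare_lbl 0 "no" 1 "yes"

* Attach that value label to the variable
label values receives_welfare receives_welfare_lbl

* The tabulation now shows "no" and "yes"; add nolabel to see the 0/1 codes
tabulate receives_welfare
tabulate receives_welfare, nolabel
```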

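* We also attach a label to the variable itself, using the label variable command. The variable label then replaces the variable name as the heading of the tabulation further below.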

*Now that we have created a new variable, we need to save the dataset to make sure that the new variable receives_welfare is available next time we open the dataset. I am not saving the dataset here, because I need to start every class with the same original dataset…
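
* A minimal sketch of saving, for your own work. The file name my_census_data.dta is a made-up placeholder, not the name of the class dataset; saving under a new name leaves the original file untouched.

```stata
* Save a copy under a new (hypothetical) file name;
* this stops with an error if a file of that name already exists
save "my_census_data.dta"

* Later, after more changes, overwrite that copy
save "my_census_data.dta", replace
```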

       does |
 respondent |
    welfare |      Freq.     Percent        Cum.
------------+-----------------------------------
         no |    132,421       99.04       99.04
        yes |      1,289        0.96      100.00
------------+-----------------------------------
      Total |    133,710      100.00

* The two summaries below appear to give years of education (yrsed) separately for the two groups: first respondents without welfare income (101,937 observations), then respondents with welfare income (1,289 observations).
---------------------------------------------------------------------------------------
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       yrsed |    101937    12.79583    3.153618          0         17

---------------------------------------------------------------------------------------
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       yrsed |      1289     10.9903    2.817995          0         17

. table receives_welfare sex [fweight= perwt_rounded] , contents(freq mean age mean yrsed mean incwage) row col

----------------------------------------------------------------------
does      |
responden |
welfare   |               Male              Female               Total
----------+-----------------------------------------------------------
       no |           1.34e+08            1.38e+08            2.72e+08
          | 34.216377258300781  36.402645111083984  35.327167510986328
          |           12.92792            12.90996             12.9187
          |        26619.92881         14124.35177         20203.23216
          |
      yes |            357,702             2193544             2551246
          | 34.846588134765625  32.796371459960938  33.083827972412109
          |           10.75763            11.14463            11.09037
          |        4196.737659         3577.073717         3663.954806
          |
    Total |           1.34e+08            1.40e+08            2.74e+08
          | 34.218059539794922  36.346202850341797  35.306285858154297
          |           12.92039            12.87497            12.89688
          |        26542.14272         13915.27974         20005.84709
----------------------------------------------------------------------

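* The table command lets us put several statistics into one table, in this case receives_welfare (rows) by sex (columns). Within each cell the four lines are, in the order listed in contents(): the weighted frequency, mean age, mean yrsed, and mean incwage.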
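
* The long decimals in the mean age rows can be tidied with the format() option. This sketch uses the same older table syntax as the command in this log and requests a single statistic. Stata 17 and later replaced contents() with a different table syntax, so this is only expected to run as written in earlier versions.

```stata
* Mean years of education by welfare receipt and sex, two decimal places
table receives_welfare sex [fweight=perwt_rounded], contents(mean yrsed) format(%9.2f) row col
```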

* Now we are going to read in the new CPS data. First we clear the old data from memory (assuming any new variables have already been saved).

. cd "C:\Users\Michael\Documents\current class files\intro soc methods\1995 HW1 data"

C:\Users\Michael\Documents\current class files\intro soc methods\1995 HW1 data

* Change the default directory to the directory that your dataset and do file are in.

. clear all

. do "C:\Users\Michael\Documents\current class files\intro soc methods\1995 HW1 data\cps_00006.do"

* I used the menus (File > Do...) to find the .do file and clicked on it.
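
* The same steps can be typed as commands instead of using the menus. This sketch reuses the folder and file names from this log; substitute the folder where you saved your own .dat and .do files.

```stata
* Point Stata at the folder that holds both the .dat and the .do file
cd "C:\Users\Michael\Documents\current class files\intro soc methods\1995 HW1 data"

* Confirm the working directory and check that the files are there
pwd
dir

* Clear the data in memory, then run the do-file by name
* (no full path is needed once cd has been set)
clear all
do cps_00006.do
```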

. /* Important: you need to put the .dat and .do files in one folder/

>    directory and then set the working folder to that folder. */

.

. set more off

.

. clear

. infix ///

>  int     year                                 1-4 ///

>  float  perwt                                5-12 ///

>  byte    age                                 13-14 ///

>  byte    sex                                 15 ///

>  long    inctot                              16-21 ///

>  using cps_00006.dat

.

. replace perwt=perwt/100

.

. label var year `"Survey year"'

. label var perwt `"Person weight"'

. label var age `"Age"'

. label var sex `"Sex"'

. label var inctot `"Total personal income"'

.

. label define agelbl 00 `"Under 1 year"'

. label define agelbl 01 `"1"', add

. label define agelbl 02 `"2"', add

. label define agelbl 03 `"3"', add

. label define agelbl 04 `"4"', add

. label define agelbl 05 `"5"', add

. label define agelbl 06 `"6"', add

. label define agelbl 07 `"7"', add

. label define agelbl 08 `"8"', add

. label define agelbl 09 `"9"', add

. label define agelbl 10 `"10"', add

. label define agelbl 11 `"11"', add

. label define agelbl 12 `"12"', add

. label define agelbl 13 `"13"', add

. label define agelbl 14 `"14"', add

. label define agelbl 15 `"15"', add

. label define agelbl 16 `"16"', add

. label define agelbl 17 `"17"', add

. label define agelbl 18 `"18"', add

. label define agelbl 19 `"19"', add

. label define agelbl 20 `"20"', add

. label define agelbl 21 `"21"', add

. label define agelbl 22 `"22"', add

. label define agelbl 23 `"23"', add

. label define agelbl 24 `"24"', add

. label define agelbl 25 `"25"', add

. label define agelbl 26 `"26"', add

. label define agelbl 27 `"27"', add

. label define agelbl 28 `"28"', add

. label define agelbl 29 `"29"', add

. label define agelbl 30 `"30"', add

. label define agelbl 31 `"31"', add

. label define agelbl 32 `"32"', add

. label define agelbl 33 `"33"', add

. label define agelbl 34 `"34"', add

. label define agelbl 35 `"35"', add

. label define agelbl 36 `"36"', add

. label define agelbl 37 `"37"', add

. label define agelbl 38 `"38"', add

. label define agelbl 39 `"39"', add

. label define agelbl 40 `"40"', add

. label define agelbl 41 `"41"', add

. label define agelbl 42 `"42"', add

. label define agelbl 43 `"43"', add

. label define agelbl 44 `"44"', add

. label define agelbl 45 `"45"', add

. label define agelbl 46 `"46"', add

. label define agelbl 47 `"47"', add

. label define agelbl 48 `"48"', add

. label define agelbl 49 `"49"', add

. label define agelbl 50 `"50"', add

. label define agelbl 51 `"51"', add

. label define agelbl 52 `"52"', add

. label define agelbl 53 `"53"', add

. label define agelbl 54 `"54"', add

. label define agelbl 55 `"55"', add

. label define agelbl 56 `"56"', add

. label define agelbl 57 `"57"', add

. label define agelbl 58 `"58"', add

. label define agelbl 59 `"59"', add

. label define agelbl 60 `"60"', add

. label define agelbl 61 `"61"', add

. label define agelbl 62 `"62"', add

. label define agelbl 63 `"63"', add

. label define agelbl 64 `"64"', add

. label define agelbl 65 `"65"', add

. label define agelbl 66 `"66"', add

. label define agelbl 67 `"67"', add

. label define agelbl 68 `"68"', add

. label define agelbl 69 `"69"', add

. label define agelbl 70 `"70"', add

. label define agelbl 71 `"71"', add

. label define agelbl 72 `"72"', add

. label define agelbl 73 `"73"', add

. label define agelbl 74 `"74"', add

. label define agelbl 75 `"75"', add

. label define agelbl 76 `"76"', add

. label define agelbl 77 `"77"', add

. label define agelbl 78 `"78"', add

. label define agelbl 79 `"79"', add

. label define agelbl 80 `"80"', add

. label define agelbl 81 `"81"', add

. label define agelbl 82 `"82"', add

. label define agelbl 83 `"83"', add

. label define agelbl 84 `"84"', add

. label define agelbl 85 `"85"', add

. label define agelbl 86 `"86"', add

. label define agelbl 87 `"87"', add

. label define agelbl 88 `"88"', add

. label define agelbl 89 `"89"', add

. label define agelbl 90 `"90 (90+, 1988-2002)"', add

. label define agelbl 91 `"91"', add

. label define agelbl 92 `"92"', add

. label define agelbl 93 `"93"', add

. label define agelbl 94 `"94"', add

. label define agelbl 95 `"95"', add

. label define agelbl 96 `"96"', add

. label define agelbl 97 `"97"', add

. label define agelbl 98 `"98"', add

. label define agelbl 99 `"99+"', add

. label values age agelbl

.

. label define sexlbl 1 `"Male"'

. label define sexlbl 2 `"Female"', add

. label values sex sexlbl

* Don’t forget to save the data when you are done…

.

.

end of do-file

. log close

name:  <unnamed>