General Social Surveys (GSS)
GSS 101: An Introduction to the GSS
by Stefanie Bailey, Statistical Software Support (June 16, 2001; document updated March 10, 2004)
The General Social Survey (GSS) was the first U.S. social science data set designed for user analysis rather than for researchers on a specific project. It is an ongoing general face-to-face survey of U.S. households, conducted by the National Opinion Research Center (NORC) in Chicago. Data have been collected since 1972; the most recent available data are from 2000.
The General Social Survey currently consists of 23 independent cross-sectional surveys. It was administered in the following years: 1972-1978, 1980, 1982-1991, 1993, 1994, 1996, 1998, and 2000.
Each GSS survey contains a core of standard attitudinal and demographic variables, plus questions on certain special-interest topics that are different each year (called “topical modules”). See the GSS codebook or the ICPSR site for a good overview of which variables were included each year. Some years also include experiments and non-GSS supplements, which are not discussed here (see the User’s Guide for more information).
Three sections of the GSS survey are repeated regularly:
1. Core questions
The core consists of about one-third demographic questions (20 minutes) and two-thirds attitudes and behaviors (40 minutes). Many core items appeared in earlier national studies, so they can be compared to pre-1970s data. Generally, core items are replicated each time the GSS is administered, although not all core variables are included in all years.
The demographic section is extensive, covering age, level of education, veteran status, home ownership, and divorce history, along with detailed information on family of origin and spouses. Attitude items cover most major political and social debates in America today, such as abortion, government spending priorities, and women’s rights.
Because core questions are replicated, subgroups that are generally too small to study in a single national sample accumulate across years and become large enough to analyze. Examples of groups that have at least 2,000 respondents across GSS survey years are retired people, widows, African Americans, Irish Americans, and ex-smokers.
2. Topical modules
Every recent survey contains the core questions, plus one or more topical modules. Each of these modules focuses on a particular subject and lasts about 15 minutes. A list of topical modules with the years they were administered follows:
If you want to download data for an entire topical module, refer to the “Collections” link at http://www.icpsr.umich.edu/GSS/. There you will find the relevant year and variable names, which you can use to search for the module’s variables in DEWI.
The same website (http://www.icpsr.umich.edu/GSS/) also has a subject index of variables, which may be useful as you decide which data to extract.
3. Cross-national modules (international data not available through GSS web extraction)
The General Social Survey has been part of the International Social Survey Program (ISSP) since 1985. ISSP is an ongoing, collaborative annual cross-national program. ISSP coordinates research topics and goals among existing national surveys. A number of countries participate in the program, which delivers the same sets of questions to respondents worldwide. Some earlier ISSP topical modules have now been replicated. This means that besides comparing responses across nations, researchers can track international trends over time.
Questions in ISSP modules are similar in format to “regular” GSS questions. You can use them for analyses in the same way as other GSS variables. If you want to compare results across countries, though, you need to get the international data here (http://webapp.icpsr.umich.edu/cocoon/ICPSR-SERIES/00124.xml). Only data for the US are available through DEWI GSS web extraction.
A list of ISSP modules with the years they were administered follows:
The target population of the General Social Survey is English-speaking, non-institutionalized adults living in U.S. households.
Through 1993, the sample size was 1500 respondents per survey. Beginning in 1994, the sample was increased to 3000.
Response rates range from 70 percent in 2000 to 82 percent in 1993. Non-response has been highest among those who live in central cities and in the Northeast, and it is higher among men than women. These response rates equal or exceed those typical of high-quality surveys and are better than typical rates for telephone and mail surveys. See the User’s Guide (p. 56) for more detailed information.
Specifics of the sample design
The 1972-1974 GSS surveys were modified probability samples, introducing quotas at the block level. The 1975 and 1976 surveys mixed modified with full probability samples. Since then full probability samples have been used. Beginning in 1994, a new design was introduced, with a condensed set of core questions and more room for topical “mini-modules.” See the User’s Guide for more details about sampling.
Oversamples of blacks were carried out in 1982 and 1987. If you are using 1982 or 1987 data and desire a nationally representative sample, you should either weight by the OVERSAMPLE variable (see below) or use the SAMPLE variable to exclude the oversamples from analysis.
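The second option above, dropping the oversample cases, can be sketched as follows. This is a minimal illustration in Python, assuming the extract has been loaded as a list of dictionaries keyed by variable name. The SAMPLE codes that flag oversample cases are not given in this document, so the codes used below are placeholders; look up the real ones in the GSS codebook before using this on actual data.

```python
def drop_oversamples(rows, oversample_codes):
    """Return only the rows whose SAMPLE code is not an oversample code."""
    codes = set(oversample_codes)
    return [row for row in rows if row["SAMPLE"] not in codes]


# Placeholder codes for illustration only; these are NOT actual GSS codes.
HYPOTHETICAL_OVERSAMPLE_CODES = {98, 99}

# Made-up rows standing in for a 1982/1987 extract.
rows = [
    {"YEAR": 1982, "SAMPLE": 1},
    {"YEAR": 1982, "SAMPLE": 98},
    {"YEAR": 1987, "SAMPLE": 99},
    {"YEAR": 1987, "SAMPLE": 2},
]

cross_section = drop_oversamples(rows, HYPOTHETICAL_OVERSAMPLE_CODES)
print(len(cross_section))  # prints 2
```

Passing the codes in as an argument keeps the filter independent of any particular coding scheme, so the same function works once the real codes are substituted.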
Because of the sample constraints and other factors, the GSS sample does not exactly match U.S. Census data for the general population. For example, males are under-represented in all full probability samples, men with full-time employment were under-represented in all block quota samples, and blacks were over-represented in 1972. The User’s Guide (pp. 44-45) provides detailed information comparing GSS samples to Census data.
Because of its multistage, clustered sample design, the GSS sample is less statistically efficient than a simple random sample. This difference is called a design effect, which for the GSS is estimated to be between 1.5 and 1.8. Users can adjust for design effects when performing analyses by weighting the sample. See the User’s Guide (pp. 41-42) for more information.
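To make the size of this effect concrete, the sketch below applies the standard textbook reading of a design effect: the sample behaves like a simple random sample of size n divided by the design effect, and standard errors grow by the square root of the design effect. This is offered as an illustration, not as the procedure described in the User’s Guide; the function names are invented for the example.

```python
import math


def effective_sample_size(n, deff):
    """Size of the simple random sample with the same precision."""
    if deff <= 0:
        raise ValueError("design effect must be positive")
    return n / deff


def design_adjusted_se(p, n, deff):
    """Standard error of a proportion p, inflated by sqrt(deff)."""
    srs_se = math.sqrt(p * (1 - p) / n)  # simple-random-sample formula
    return srs_se * math.sqrt(deff)


# The design-effect range quoted above, applied to 1500 respondents.
for deff in (1.5, 1.8):
    n_eff = effective_sample_size(1500, deff)
    se = design_adjusted_se(0.5, 1500, deff)
    print(deff, round(n_eff), round(se, 4))
```

With 1500 respondents, a design effect of 1.5 corresponds to an effective sample of 1000, and a design effect of 1.8 to roughly 833.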
Weight variables are included in the GSS. Weights compensate for unequal probabilities of sample selection and adjust for the effects of non-response. By using weights, researchers can make generalizations to the national populations represented by the GSS.
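As a rough illustration of what weighting does to an estimate, the following Python sketch computes a weighted proportion. The weight values and responses are made up, and the function name is hypothetical; in practice the weights would come from whichever GSS weight variable suits the analysis.

```python
def weighted_proportion(values, weights, target):
    """Share of total weight carried by respondents whose value equals target."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    matched = sum(w for v, w in zip(values, weights) if v == target)
    return matched / total


# Made-up responses and weights for four respondents.
answers = ["yes", "no", "yes", "no"]
weights = [1.0, 1.0, 0.5, 1.5]

print(weighted_proportion(answers, weights, "yes"))            # prints 0.375
print(weighted_proportion(answers, [1, 1, 1, 1], "yes"))       # prints 0.5
```

The unweighted share of “yes” answers is 0.5, but because one “yes” respondent carries a low weight and one “no” respondent a high weight, the weighted share drops to 0.375.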
Below are descriptions of some common weight variables (see the User’s Guide [pp. 42-43] for more information):
Users of GSS data can carry out three principal types of analyses. Cross-sectional multivariate analyses, such as linear regression, look at many respondents at one point in time. These are probably the most common studies using GSS data.
Even for cross-sectional analysis, you can pool multiple years in your sample. The larger sample gives you greater precision, especially among small subgroups. Be sure to include YEAR as a variable to control for year effects. The User’s Guide suggests three years as a generally appropriate pool for cross-sectional analysis.
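A minimal sketch of pooling, assuming each survey year has been loaded as a separate list of dictionaries: the rows are stacked, each row is tagged with its YEAR, and a set of 0/1 indicator columns is built so that YEAR can enter a regression as a control. The function names and the EDUC values are illustrative only.

```python
def pool_years(surveys):
    """Stack per-year row lists into one list, tagging each row with YEAR."""
    pooled = []
    for year in sorted(surveys):
        for row in surveys[year]:
            pooled.append({**row, "YEAR": year})
    return pooled


def year_indicators(pooled, reference_year):
    """Build 0/1 indicator columns for every year except the reference year."""
    years = sorted({row["YEAR"] for row in pooled} - {reference_year})
    matrix = [[1 if row["YEAR"] == y else 0 for y in years] for row in pooled]
    return years, matrix


# Made-up rows for a three-year pool.
surveys = {
    1996: [{"EDUC": 12}, {"EDUC": 16}],
    1998: [{"EDUC": 14}],
    2000: [{"EDUC": 10}],
}

pooled = pool_years(surveys)
years, matrix = year_indicators(pooled, reference_year=1996)
print(years)   # prints [1998, 2000]
print(matrix)  # prints [[0, 0], [0, 0], [1, 0], [0, 1]]
```

Omitting one reference year avoids perfect collinearity with the intercept when the indicator columns are added to a regression model.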
Trend analyses measure data over several points in time. Since each year’s respondents are different people, panel analyses that follow the same respondents over time are not possible.
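A simple trend analysis amounts to computing the same statistic separately for each survey year and comparing the results. The sketch below does this for the share of respondents giving a particular answer; the item name ATTITUDE and the data are made up for the example, and rows with a missing answer are skipped.

```python
from collections import defaultdict


def trend_by_year(rows, item, target):
    """Proportion of valid answers equal to target, computed per survey year."""
    counts = defaultdict(lambda: [0, 0])  # year -> [matches, valid answers]
    for row in rows:
        answer = row.get(item)
        if answer is None:  # treat missing answers as not asked or not answered
            continue
        counts[row["YEAR"]][1] += 1
        if answer == target:
            counts[row["YEAR"]][0] += 1
    return {year: hits / total for year, (hits, total) in sorted(counts.items())}


# Made-up pooled rows; each year's respondents are different people.
rows = [
    {"YEAR": 1996, "ATTITUDE": "agree"},
    {"YEAR": 1996, "ATTITUDE": "disagree"},
    {"YEAR": 1998, "ATTITUDE": "agree"},
    {"YEAR": 1998, "ATTITUDE": None},
    {"YEAR": 2000, "ATTITUDE": "agree"},
    {"YEAR": 2000, "ATTITUDE": "agree"},
]

print(trend_by_year(rows, "ATTITUDE", "agree"))
# prints {1996: 0.5, 1998: 1.0, 2000: 1.0}
```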
The third possible type of study using GSS data is cohort analysis, which looks for effects within the GSS by birth cohort. See the User’s Guide (pp. 80-82) for more information on these three types of analyses, and the end of this document for references on trend analyses.
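For cohort analysis, a respondent’s birth cohort can be approximated from the survey year and the respondent’s age. The sketch below assumes YEAR and AGE columns are present and groups respondents into ten-year cohorts; the ten-year width and the function names are choices made for this example, not a GSS convention.

```python
from collections import Counter


def birth_cohort(year, age, width=10):
    """Approximate birth year (YEAR minus AGE), rounded down to a cohort start."""
    birth_year = year - age
    return (birth_year // width) * width


def cohort_sizes(rows, width=10):
    """Count respondents in each birth cohort across all pooled survey years."""
    return Counter(birth_cohort(r["YEAR"], r["AGE"], width) for r in rows)


# Made-up respondents from different survey years.
rows = [
    {"YEAR": 1980, "AGE": 25},  # born about 1955
    {"YEAR": 2000, "AGE": 45},  # born about 1955
    {"YEAR": 2000, "AGE": 35},  # born about 1965
]

print(sorted(cohort_sizes(rows).items()))  # prints [(1950, 2), (1960, 1)]
```

Because the cohort is derived from two variables rather than from one survey year, members of the same cohort appear in many different surveys, which is what allows cohorts to be compared as they age.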
For general use:
Davis, James A., and Tom W. Smith. 1992. The NORC General Social Survey: A User’s Guide. Newbury Park, CA: Sage Publications.
Davis, James A., Tom W. Smith, and Peter V. Marsden. 1999. General Social Surveys, 1972-1998: Cumulative Codebook. Chicago: NORC.
Smith, Tom W. and B. J. Arnold. 1990. Annotated Bibliography of Papers Using the General Social Survey. Ann Arbor, MI: ICPSR.
Useful publications for analyses:
Firebaugh, Glenn. 1997. Analyzing Repeated Surveys. Quantitative Applications in the Social Sciences, no. 115. Newbury Park, CA: Sage Publications.
Glenn, Norval D. 1977. Cohort Analysis. Quantitative Applications in the Social Sciences, no. 5. Newbury Park, CA: Sage Publications.
Kalton, Graham. 1983. Introduction to Survey Sampling. Quantitative Applications in the Social Sciences, no. 35. Newbury Park, CA: Sage Publications.
Lee, Eun Sul, Ronald N. Forthofer, and Ronald J. Lorimor. 1989. Analyzing Complex Survey Data. Quantitative Applications in the Social Sciences, no. 71. Newbury Park, CA: Sage Publications.
Niemi, Richard G., John Mueller, and Tom W. Smith. 1989. Trends in Public Opinion: A Compendium of Survey Data. Westport, CT: Greenwood.
Smith, Tom W. 1988. Timely Artifacts: A Review of Measurement Variation in the 1972-1988 GSS. Chicago: NORC.