US Census

From edegan.com
Revision as of 17:55, 14 July 2009 by imported>Ed

The U.S. Census is conducted by the U.S. Census Bureau. Microdata are released in 1% and 5% samples, with different variables available. Of particular interest to researchers is the Public Use Microdata Sample (PUMS), which is available for free download and can be used for academic purposes.

Ethnicity variables in the U.S. Census

The U.S. Census PUMS documentation provides the Census questionnaire in appendix D. The questionnaire poses five questions that have a direct bearing on the ethnicity of respondents (original question numbers preserved):

  • 5. Is this person Spanish /Hispanic /Latino? Mark the "No" box if not Spanish /Hispanic /Latino.
    • Yes, Mexican, Mexican Am., Chicano
    • Yes, Puerto Rican
    • Yes, Cuban
    • Yes, other Spanish /Hispanic /Latino — Print group.
    • No, not Spanish /Hispanic /Latino
  • 6. What is this person’s race? Mark one or more races to indicate what this person considers himself/herself to be.
    • White
    • Black, African Am., or Negro
    • American Indian or Alaska Native — Print name of enrolled or principal tribe.
    • Native Hawaiian
    • Guamanian or Chamorro
    • Samoan
    • Other Pacific Islander - Print race.
    • Asian Indian
    • Chinese
    • Filipino
    • Japanese
    • Korean
    • Vietnamese
    • Other Asian — Print race.
    • Some other race — Print race.
  • 7. What is this person’s ancestry or ethnic origin? (For example: Italian, Jamaican, African Am., Cambodian, Cape Verdean, Norwegian, Dominican, French Canadian, Haitian, Korean, Lebanese, Polish, Nigerian, Mexican, Taiwanese, Ukrainian, and so on.)
  • 11. Does this person speak a language other than English at home?
    • Yes - What is this language? (For example: Korean, Italian, Spanish, Vietnamese)
    • No
  • 12. Where was this person born?
    • In the United States — Print name of state.
    • Outside the United States — Print name of foreign country

Only questions 5 and 6 provide structured (check-box) data, and even these include write-in options. Questions 7, 11 and 12 allow free-hand entries and so exhibit substantial variation in the encoding and representation of answers.
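
One way to reduce the variation in free-hand answers is to normalize each entry before grouping. The following is a minimal sketch in Python, not a description of how the Census Bureau codes responses. The function names (normalize_entry, canonical_label) and the alias table are hypothetical, and the alias entries are illustrative spellings rather than values taken from the PUMS codebook.

```python
import re
import unicodedata

# Hypothetical alias table: maps a normalized write-in spelling to one label.
# Entries are illustrative only; they are NOT taken from the PUMS codebook.
ALIASES = {
    "african am": "African American",
    "african american": "African American",
    "french canadian": "French Canadian",
    "italian": "Italian",
}


def normalize_entry(raw: str) -> str:
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    # Decompose accented characters so the accent marks can be dropped.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace anything that is not a letter, digit, or space with a space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())


def canonical_label(raw: str) -> str:
    """Return the alias-table label, or the normalized text if none matches."""
    key = normalize_entry(raw)
    return ALIASES.get(key, key)
```

For example, canonical_label("  African Am. ") and canonical_label("AFRICAN AMERICAN") both return "African American". An entry with no alias, such as "Québécois", is returned in its normalized form ("quebecois") so that unmatched answers can be reviewed by hand.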