Source and Accuracy Statement

Attachment C -- Source and Accuracy Statement.doc

Annual Social and Economic Supplement to the Current Population Survey


OMB: 0607-0354



Source of the Data and Accuracy of the Estimates for the 2008 Annual Social and Economic Supplement Microdata File

Table of Contents


SOURCE OF DATA

Basic CPS

The 2008 Annual Social and Economic Supplement

Estimation Procedure


ACCURACY OF THE ESTIMATES

Sampling Error

Nonsampling Error

Nonresponse

Coverage

Comparability of Data

A Nonsampling Error Warning

Estimation of Median Incomes

Standard Errors and Their Use

Estimating Standard Errors

Generalized Variance Parameters

Standard Errors of Estimated Numbers

Standard Errors of Estimated Percentages

Standard Errors of Estimated Differences

Standard Errors of Estimated Ratios

Standard Errors of Estimated Medians

Standard Errors of Averages for Grouped Data

Standard Errors of Estimated Per Capita Deficits

Accuracy of State Estimates

Standard Errors of State Estimates

Standard Errors of Regional Estimates

Standard Errors of Groups of States

Standard Errors of Data for Combined Years

Standard Errors of Differences of 2-Year Averages

Standard Errors of Quarterly or Yearly Averages

Technical Assistance


REFERENCES



Tables


Table 1. Description of the March Basic CPS and ASEC Sample Cases

Table 2. CPS Coverage Ratios: March 2008

Table 3. Estimation Groups of Interest and Generalized Variance Parameters

Table 4. Parameters for Computation of Standard Errors for Labor Force Characteristics: March 2008

Table 5. Parameters for Computation of Standard Errors for People and Families: 2008 ASEC

Table 6. CPS Year Factors: ASEC 1947 to 2007

Table 7. CPS Year-to-Year Correlation Coefficients for Income Characteristics: ASEC 1961 to 2008

Table 8. CPS Year-to-Year Correlation Coefficients for Poverty Characteristics: ASEC 1971 to 2008

Table 9. Factors and Populations for State Standard Errors and Parameters: 2008 ASEC

Table 10. Factors and Populations for Regional Standard Errors and Parameters: 2008 ASEC


SOURCE OF DATA

The data in this microdata file are from the 2008 Annual Social and Economic Supplement (ASEC) of the Current Population Survey (CPS). The Census Bureau conducts the ASEC over a 3-month period, in February, March, and April, with most data collection occurring in the month of March. The ASEC uses two sets of questions, the basic CPS and a set of supplemental questions. The CPS, sponsored jointly by the Census Bureau and the U.S. Bureau of Labor Statistics, is the country’s primary source of labor force statistics for the entire population. The Census Bureau and the U.S. Bureau of Labor Statistics also jointly sponsor the ASEC.


Basic CPS. The monthly CPS collects primarily labor force data about the civilian noninstitutional population living in the United States. The institutionalized population, which is excluded from the population universe, is composed primarily of the population in correctional institutions and nursing homes (91 percent of the 4.1 million institutionalized people in Census 2000). Interviewers ask questions concerning labor force participation about each member 15 years old and over in sample households. Typically, the week containing the nineteenth of the month is the interview week. The week containing the twelfth is the reference week (i.e., the week about which the labor force questions are asked).


The CPS uses a multistage probability sample based on the results of the decennial census, with coverage in all 50 states and the District of Columbia. The sample is continually updated to account for new residential construction. When files from the most recent decennial census become available, the Census Bureau gradually introduces a new sample design for the CPS.


In April 2004, the Census Bureau began phasing out the 1990 sample1 and replacing it with the 2000 sample, creating a mixed sampling frame. Two simultaneous changes occurred during this phase-in period. First, primary sampling units (PSUs)2 selected for only the 2000 design gradually replaced those selected for the 1990 design. This involved 10 percent of the sample. Second, within PSUs selected for both the 1990 and 2000 designs, sample households from the 2000 design gradually replaced sample households from the 1990 design. This involved about 90 percent of the sample. The new sample design was completely implemented by July 2005.


In the first stage of the sampling process, PSUs are selected for sample. The United States is divided into 2,025 PSUs. The PSUs were redefined for this design to correspond to the Office of Management and Budget definitions of Core-Based Statistical Areas and to improve efficiency in field operations. These PSUs are grouped into 824 strata. Within each stratum, a single PSU is chosen for the sample, with its probability of selection proportional to its population as of the most recent decennial census. This PSU represents the entire stratum from which it was selected. In the case of strata consisting of only one PSU, the PSU is chosen with certainty.
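The first-stage selection rule described above can be sketched in a few lines of Python. This is only an illustration of selection with probability proportional to population, not the Census Bureau's production procedure. The function name select_psu, the stratum labels, the PSU names, and the populations are all hypothetical.

```python
import random

def select_psu(stratum, rng):
    """Pick one PSU from a stratum with probability proportional to population.

    stratum: list of (psu_name, population) pairs (hypothetical inputs).
    A stratum consisting of a single PSU is selected with certainty.
    """
    if len(stratum) == 1:
        return stratum[0][0]
    names = [name for name, _ in stratum]
    populations = [population for _, population in stratum]
    # random.choices draws with probability proportional to the weights.
    return rng.choices(names, weights=populations, k=1)[0]

# Hypothetical strata; names and populations are made up for illustration.
strata = {
    "stratum_1": [("PSU_A", 120_000), ("PSU_B", 60_000), ("PSU_C", 20_000)],
    "stratum_2": [("PSU_D", 900_000)],  # single-PSU stratum: chosen with certainty
}

rng = random.Random(2008)  # fixed seed so the sketch is reproducible
sample = {label: select_psu(psus, rng) for label, psus in strata.items()}
print(sample)
```

In this sketch PSU_A is six times as likely to be drawn as PSU_C, and PSU_D is always selected because it is the only PSU in its stratum.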

Approximately 72,000 housing units were selected from the sampling frame for the basic CPS. Based on eligibility criteria, 11 percent of these housing units were sent directly to computer-assisted telephone interviewing (CATI). The remaining units were assigned to interviewers for computer-assisted personal interviewing (CAPI).3 Of all housing units in sample, about 58,900 were determined to be eligible for interview. Interviewers obtained interviews at about 53,800 of these units. Noninterviews occur when the occupants are not found at home after repeated calls or are unavailable for some other reason. Table 1 summarizes historical changes in the CPS design.


Table 1. Description of the March Basic CPS and ASEC Sample Cases

                                 Basic CPS housing units        Total (ASEC/ADS (1) + basic CPS)
                 Number of       eligible                       housing units eligible
Time period      sample PSUs     Interviewed  Not interviewed   Interviewed  Not interviewed

2008             824                  53,800            5,100        76,600            6,400
2007             824                  53,700            5,600        76,100            7,100
2006             824                  54,000            5,400        76,700            7,100
2005             754/824 (2)          54,400            5,700        77,200            7,500
2004             754                  55,000            5,200        77,700            7,000
2003             754                  55,500            4,500        78,300            6,800
2002             754                  55,500            4,500        78,300            6,600
2001             754                  46,800            3,200        49,600            4,300
2000             754                  46,800            3,200        51,000            3,700
1999             754                  46,800            3,200        50,800            4,300
1998             754                  46,800            3,200        50,400            5,200
1997             754                  46,800            3,200        50,300            3,900
1996             754                  46,800            3,200        49,700            4,100
1995             792                  56,700            3,300        59,200            3,800
1990 to 1994     729                  57,400            2,600        59,900            3,100
1989             729                  53,600            2,500        56,100            3,000
1986 to 1988     729                  57,000            2,500        59,500            3,000
1985             629/729 (3)          57,000            2,500        59,500            3,000
1982 to 1984     629                  59,000            2,500        61,500            3,000
1980 to 1981     629                  65,500            3,000        68,000            3,500
1977 to 1979     614                  55,000            3,000        58,000            3,500
1976             624                  46,500            2,500        49,000            3,000
1973 to 1975     461                  46,500            2,500        49,000            3,000
1972             449/461 (4)          45,000            2,000        45,000            2,000
1967 to 1971     449                  48,000            2,000        48,000            2,000
1963 to 1966     357                  33,400            1,200        33,400            1,200
1960 to 1962     333                  33,400            1,200        33,400            1,200
1959             330                  33,400            1,200        33,400            1,200

(1) The ASEC was referred to as the Annual Demographic Survey (ADS) until 2002.

(2) The Census Bureau redesigned the CPS following Census 2000. During phase-in of the new design, housing units from the new and old designs were in the sample.

(3) The Census Bureau redesigned the CPS following the 1980 Decennial Census of Population and Housing.

(4) The Census Bureau redesigned the CPS following the 1970 Decennial Census of Population and Housing.

The 2008 Annual Social and Economic Supplement. In addition to the basic CPS questions, interviewers asked supplementary questions for the ASEC. They asked these questions of the civilian noninstitutional population and also of military personnel who live in households with at least one other civilian adult. The additional questions covered the following topics:


  • Household and family characteristics

  • Marital status

  • Geographic mobility

  • Foreign-born population

  • Income from the previous calendar year

  • Poverty

  • Work status/occupation

  • Health insurance coverage

  • Program participation

  • Educational attainment


Including the basic CPS sample, approximately 97,500 housing units were in sample for the ASEC. About 83,000 housing units were determined to be eligible for interview, and about 76,600 interviews were obtained (see Table 1).


The additional sample for the ASEC provides more reliable data for Hispanic households, non-Hispanic minority households, and non-Hispanic White households with children 18 years or younger. These households were identified for sample from previous months and the following April. For more information about the households eligible for the ASEC, please refer to reference [2].


Estimation Procedure. This survey’s estimation procedure adjusts weighted sample results to agree with independently derived population estimates of the civilian noninstitutional population of the United States and each state (including the District of Columbia). These population estimates, used as controls for the CPS, are prepared monthly to agree with the most current set of population estimates that are released as part of the Census Bureau’s population estimates and projections program.


The population controls for the nation are distributed by demographic characteristics in two ways:


  • Age, sex, and race (White alone, Black alone, and all other groups combined).

  • Age, sex, and Hispanic origin.


The population controls for the states are distributed by race (Black alone and all other race groups combined), age (0-15, 16-44, and 45 and over), and sex.


The independent estimates by age, sex, race, and Hispanic origin, and for states by selected age groups and broad race categories, are developed using the basic demographic accounting formula whereby the population from the latest decennial data is updated using data on the components of population change (births, deaths, and net international migration) with net internal migration as an additional component in the state population estimates.
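The demographic accounting formula described above amounts to simple bookkeeping, sketched below in Python. The function name update_population and every figure are hypothetical and chosen only to show the arithmetic; they are not Census Bureau estimates.

```python
def update_population(base, births, deaths, net_international, net_internal=0):
    """Roll a census-based population forward using the components of change.

    net_internal applies only to state estimates; it is zero for the nation
    in this sketch.
    """
    return base + births - deaths + net_international + net_internal

# Hypothetical figures, for illustration only.
national = update_population(
    base=1_000_000, births=14_000, deaths=8_000, net_international=3_000
)
state = update_population(
    base=200_000, births=2_800, deaths=1_700, net_international=500,
    net_internal=-400,
)
print(national, state)
```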


The net international migration component in the population estimates includes a combination of the following:


  • Legal migration to the United States.

  • Emigration of foreign-born and native people from the United States.

  • Net movement between the United States and Puerto Rico.

  • Estimates of temporary migration.

  • Estimates of net residual foreign-born population, which include unauthorized migration.


Because the latest available information on these components lags the survey date, it is necessary to make short-term projections of these components to develop the estimate for the survey date.


The estimation procedure of the ASEC includes a further adjustment so the husband and wife of a household receive the same weight.



ACCURACY OF THE ESTIMATES

A sample survey estimate has two types of error: sampling and nonsampling. The accuracy of an estimate depends on both types of error. The nature of the sampling error is known given the survey design; the full extent of the nonsampling error is unknown.


Sampling Error. Since the CPS estimates come from a sample, they may differ from figures from an enumeration of the entire population using the same questionnaires, instructions, and enumerators. For a given estimator, the difference between an estimate based on a sample and the estimate that would result if the sample were to include the entire population is known as sampling error. Standard errors, as calculated by methods described in “Standard Errors and Their Use,” are primarily measures of the magnitude of sampling error. However, they may include some nonsampling error.


Nonsampling Error. For a given estimator, the difference between the estimate that would result if the sample were to include the entire population and the true population value being estimated is known as nonsampling error. There are several sources of nonsampling error, which may occur during the development or execution of the survey. It can occur because of circumstances created by the interviewer, the respondent, the survey instrument, or the way the data are collected and processed. For example, errors could occur because:


• The interviewer records the wrong answer, the respondent provides incorrect information, the respondent estimates the requested information, or an unclear survey question is misunderstood by the respondent (measurement error).


• Some individuals who should have been included in the survey frame were missed (coverage error).

• Responses are not collected from all those in the sample or the respondent is unwilling to provide information (nonresponse error).


• Values are estimated imprecisely for missing data (imputation error).


• Forms may be lost, data may be incorrectly keyed, coded, or recoded, etc. (processing error).


Answers to questions about money income often depend on the memory or knowledge of one person in a household. Recall problems can cause underestimates of income in survey data because it is easy to forget minor or irregular sources of income. Respondents may also misunderstand what the Census Bureau considers money income or may simply be unwilling to answer these questions correctly because the questions are considered too personal. See reference [3] for more details.


To minimize these errors, the Census Bureau applies quality control procedures during all stages of the production process including the design of the survey, the wording of questions, the review of the work of interviewers and coders, and the statistical review of reports.


Two types of nonsampling error that can be examined to a limited extent are nonresponse and undercoverage.


Nonresponse. The effect of nonresponse cannot be measured directly, but one indication of its potential effect is the nonresponse rate. For the cases eligible for the 2008 ASEC, the basic CPS household-level nonresponse rate was 8.6 percent. The household-level nonresponse rate for the ASEC was an additional 7.7 percent. These two nonresponse rates lead to a combined supplement nonresponse rate of 15.6 percent.
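The combined rate quoted above can be reproduced with a short calculation. The sketch assumes the additional 7.7 percent applies to households that responded to the basic CPS, so that the two response rates multiply. That assumption is not stated in the text, but it yields the published 15.6 percent.

```python
basic_cps_rate = 0.086  # basic CPS household-level nonresponse rate
asec_rate = 0.077       # additional household-level nonresponse rate for the ASEC

# Assumption: a household contributes supplement data only if it responds
# at both stages, so the response rates multiply.
combined_rate = 1 - (1 - basic_cps_rate) * (1 - asec_rate)
print(f"{combined_rate:.1%}")
```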


Coverage. The concept of coverage in the survey sampling process is the extent to which the total population that could be selected for sample “covers” the survey’s target population. Missed housing units and missed people within sample households create undercoverage in the CPS. Overall CPS undercoverage for March 2008 is estimated to be about 12.0 percent. CPS coverage varies with age, sex, and race. Generally, coverage is larger for females than for males and larger for non-Blacks than for Blacks. This differential coverage is a general problem for most household-based surveys.


The CPS weighting procedure partially corrects for bias from undercoverage, but biases may still be present when people who are missed by the survey differ from those interviewed in ways other than age, race, sex, Hispanic origin, and state of residence. How this weighting procedure affects other variables in the survey is not precisely known. All of these considerations affect comparisons across different surveys or data sources.


A common measure of survey coverage is the coverage ratio, calculated as the estimated population before poststratification divided by the independent population control. Table 2 shows March 2008 CPS coverage ratios by age and sex for certain race and Hispanic groups. The CPS coverage ratios can exhibit some variability from month to month.

Table 2. CPS Coverage Ratios: March 2008

             Total                     White only      Black only      Residual race   Hispanic
Age group  All people    Male  Female    Male  Female    Male  Female    Male  Female    Male  Female

0-15             0.88    0.88    0.89    0.90    0.90    0.78    0.78    0.89    0.91    0.89    0.91
16-19            0.85    0.85    0.86    0.87    0.87    0.74    0.79    0.90    0.85    0.93    0.89
20-24            0.78    0.76    0.79    0.78    0.80    0.70    0.71    0.71    0.80    0.84    0.85
25-34            0.83    0.80    0.85    0.82    0.87    0.63    0.77    0.79    0.88    0.78    0.92
35-44            0.87    0.84    0.89    0.86    0.91    0.75    0.81    0.81    0.79    0.78    0.90
45-54            0.89    0.87    0.91    0.88    0.93    0.82    0.87    0.81    0.84    0.74    0.88
55-64            0.92    0.92    0.92    0.93    0.92    0.90    0.92    0.79    0.83    0.86    0.94
65+              0.93    0.92    0.95    0.92    0.94    0.96    1.01    0.89    0.87    0.81    0.86
15+              0.87    0.86    0.89    0.87    0.90    0.77    0.84    0.81    0.84    0.81    0.90
0+               0.88    0.86    0.89    0.87    0.90    0.77    0.83    0.83    0.85    0.83    0.90

NOTES: (1) The Residual race group includes cases indicating a single race other than White or Black, and cases indicating two or more races.

(2) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and ethnicity, please see the “Generalized Variance Parameters” section.
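The coverage ratio defined above, and its relationship to undercoverage, can be sketched as follows. The two population counts are hypothetical and were chosen only so that the ratio matches the 0.88 shown in Table 2 for all people of all ages. The function name coverage_ratio is likewise an assumption for illustration.

```python
def coverage_ratio(estimate_before_poststratification, population_control):
    """Estimated population before poststratification over the independent control."""
    return estimate_before_poststratification / population_control

# Hypothetical counts chosen to reproduce the all-people ratio of 0.88.
ratio = coverage_ratio(264_000_000, 300_000_000)
undercoverage = 1 - ratio

print(f"coverage ratio: {ratio:.2f}")
print(f"undercoverage:  {undercoverage:.1%}")
```

A ratio of 0.88 corresponds to undercoverage of about 12 percent, which agrees with the overall March 2008 figure given in the Coverage discussion.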


Comparability of Data. Data obtained from the CPS and other sources are not entirely comparable. This results from differences in interviewer training and experience and in differing survey processes. This is an example of nonsampling variability not reflected in the standard errors. Therefore, caution should be used when comparing results from different sources.

Data users should be careful when comparing the data from this microdata file, which reflects Census 2000-based controls, with microdata files from March 1994 through December 2002, which reflect 1990 census-based controls. Ideally, the same population controls should be used when comparing any estimates. In reality, the use of the same population controls is not practical when comparing trend data over a period of 10 to 20 years. Thus, when it is necessary to combine or compare data based on different controls or different designs, data users should be aware that changes in weighting controls or weighting procedures can create small differences between estimates. See the discussion following for information on comparing estimates derived from different controls or different sample designs.


Microdata files from previous years reflect the latest available census-based controls. Although the most recent change in population controls had relatively little impact on summary measures, such as averages, medians, and percentage distributions, it did have a significant impact on levels. For example, use of Census 2000-based controls results in about a 1 percent increase from the 1990 census-based controls in the civilian noninstitutional population and in the number of families and households. Thus, estimates of levels for data collected in 2003 and later years will differ from those for earlier years by more than what could be attributed to actual changes in the population. These differences could be disproportionately greater for certain population subgroups than for the total population.

Note that certain microdata files from 2002, namely June, October, November, and the 2002 ASEC, contain both Census 2000-based estimates and 1990 census-based estimates and are subject to the comparability issues discussed previously. All other microdata files from 2002 reflect the 1990 census-based controls.


Users should also exercise caution because of changes caused by the phase-in of the Census 2000 files (see “Basic CPS”). During this time period, CPS data are collected from sample designs based on different censuses. Three features of the new CPS design have the potential of affecting published estimates: (1) the temporary disruption of the rotation pattern from August 2004 through June 2005 for a comparatively small portion of the sample, (2) the change in sample areas, and (3) the introduction of the new Core-Based Statistical Areas (formerly called metropolitan areas). Most of the known effect on estimates during and after the sample redesign will be the result of changing from 1990 to 2000 geographic definitions. Research has shown that the national-level estimates of the metropolitan and nonmetropolitan populations should not change appreciably because of the new sample design. However, users should still exercise caution when comparing metropolitan and nonmetropolitan estimates across years with a design change, especially at the state level.


Caution should also be used when comparing Hispanic estimates over time. No independent population control totals for people of Hispanic origin were used before 1985.


A Nonsampling Error Warning. Since the full extent of the nonsampling error is unknown, one should be particularly careful when interpreting results based on small differences between estimates. The Census Bureau recommends that data users incorporate information about nonsampling error into their analyses, as nonsampling error could impact the conclusions drawn from the results. Caution should also be used when interpreting results based on a relatively small number of cases. Summary measures (such as medians and percentage distributions) probably do not reveal useful information when computed on a subpopulation smaller than 75,000.


For additional information on nonsampling error, including the possible impact on CPS data when known, refer to references [2] and [4].


Estimation of Median Incomes. The Census Bureau has changed the methodology for computing median income over time. The Census Bureau has computed medians using either Pareto interpolation or linear interpolation. Currently, we are using linear interpolation to estimate all medians. Pareto interpolation assumes a decreasing density of population within an income interval, whereas linear interpolation assumes a constant density of population within an income interval. The Census Bureau calculated estimates of median income and associated standard errors for 1979 through 1987 using Pareto interpolation if the estimate was larger than $20,000 for people or $40,000 for families and households. This is because the width of the income interval containing the estimate is greater than $2,500.


We calculated estimates of median income and associated standard errors for 1976, 1977, and 1978 using Pareto interpolation if the estimate was larger than $12,000 for people or $18,000 for families and households. This is because the width of the income interval containing the estimate is greater than $1,000. All other estimates of median income and associated standard errors for 1976 through 2007 (2008 ASEC) and almost all of the estimates of median income and associated standard errors for 1975 and earlier were calculated using linear interpolation.


Thus, use caution when comparing median incomes above $12,000 for people or $18,000 for families and households for different years. Median incomes below those levels are more comparable from year to year since they have always been calculated using linear interpolation. For an indication of the comparability of medians calculated using Pareto interpolation with medians calculated using linear interpolation, see reference [5].
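The difference between the two interpolation methods can be seen in a small Python sketch. The linear version assumes a constant density within the income interval containing the median. The Pareto version uses a common textbook form obtained by fitting a Pareto tail through the two interval endpoints. The source does not print its exact formula, so that form is an assumption, as are the function names and the counts used.

```python
import math

def linear_median(lower, upper, n_below, n_in_interval, n_total):
    """Median by linear interpolation: constant density within the interval."""
    fraction = (n_total / 2 - n_below) / n_in_interval
    return lower + fraction * (upper - lower)

def pareto_median(lower, upper, n_below, n_in_interval, n_total):
    """Median by Pareto interpolation: density decreases within the interval.

    Requires lower > 0 and at least one unit at or above the upper limit.
    """
    n_at_or_above_lower = n_total - n_below
    n_at_or_above_upper = n_at_or_above_lower - n_in_interval
    # Pareto shape parameter implied by the counts at the two endpoints.
    theta = math.log(n_at_or_above_lower / n_at_or_above_upper) / math.log(upper / lower)
    return lower * math.exp(math.log(n_at_or_above_lower / (n_total / 2)) / theta)

# Hypothetical distribution: 100 units, 40 below $20,000 and
# 30 in the interval from $20,000 to $25,000.
args = dict(lower=20_000, upper=25_000, n_below=40, n_in_interval=30, n_total=100)
linear = linear_median(**args)
pareto = pareto_median(**args)
print(round(linear), round(pareto))
```

With these counts the Pareto estimate falls below the linear estimate, because the Pareto assumption places more of the interval's population near its lower limit.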


Standard Errors and Their Use. The sample estimate and its standard error enable one to construct a confidence interval. A confidence interval is a range about a given estimate that has a specified probability of containing the average result of all possible samples. For example, if all possible samples were surveyed under essentially the same general conditions and using the same sample design, and if an estimate and its standard error were calculated from each sample, then approximately 90 percent of the intervals from 1.645 standard errors below the estimate to 1.645 standard errors above the estimate would include the average result of all possible samples.


A particular confidence interval may or may not contain the average estimate derived from all possible samples, but one can say with specified confidence that the interval includes the average estimate calculated from all possible samples.


Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing between population parameters using sample estimates. The most common type of hypothesis is that the population parameters are different. An example of this would be comparing the percentage of men who were part-time workers to the percentage of women who were part-time workers.


Tests may be performed at various levels of significance. A significance level is the probability of concluding that the characteristics are different when, in fact, they are the same. For example, to conclude that two characteristics are different at the 0.10 level of significance, the absolute value of the estimated difference between characteristics must be greater than or equal to 1.645 times the standard error of the difference.
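The decision rule at the 0.10 level of significance can be written as a one-line check. The sketch below is illustrative only. The function name and the example percentages and standard error are hypothetical.

```python
Z_90 = 1.645  # critical value for the 0.10 level of significance

def differs_at_10_percent(estimate_1, estimate_2, se_difference):
    """True if the estimated difference is significant at the 0.10 level."""
    return abs(estimate_1 - estimate_2) >= Z_90 * se_difference

# Hypothetical percentages, with a hypothetical standard error of the
# difference of 0.6 percentage points (threshold: 1.645 x 0.6 = 0.987).
print(differs_at_10_percent(12.4, 11.1, 0.6))  # difference of 1.3
print(differs_at_10_percent(12.4, 11.9, 0.6))  # difference of 0.5
```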


The Census Bureau uses 90-percent confidence intervals and 0.10 levels of significance to determine statistical validity. Consult standard statistical textbooks for alternative criteria.


Estimating Standard Errors. The Census Bureau uses replication methods to estimate the standard errors of CPS estimates. These methods primarily measure the magnitude of sampling error. However, they do measure some effects of nonsampling error as well. They do not measure systematic biases in the data associated with nonsampling error. Bias is the average over all possible samples of the differences between the sample estimates and the true value.


Generalized Variance Parameters. While it is possible to compute and present an estimate of the standard error based on the survey data for each estimate in a report, there are a number of reasons why this is not done. A presentation of the individual standard errors would be of limited use, since one could not possibly predict all of the combinations of results that may be of interest to data users. Additionally, data users have access to CPS microdata files, and it is impossible to compute in advance the standard error for every estimate one might obtain from those data sets. Moreover, variance estimates are based on sample data and have variances of their own. Therefore, some method of stabilizing these estimates of variance, for example, by generalizing or averaging over time, may be used to improve their reliability.


Experience has shown that certain groups of estimates have a similar relationship between their variances and expected values. Modeling or generalization may provide more stable variance estimates by taking advantage of these similarities. The generalized variance function is a simple model that expresses the variance as a function of the expected value of the survey estimate. The parameters of the generalized variance function are estimated using direct replicate variances. These generalized variance parameters provide a relatively easy method to obtain approximate standard errors for numerous characteristics. In this source and accuracy statement, Table 4 provides the generalized variance parameters for labor force estimates, and Table 5 provides generalized variance parameters for characteristics from the 2008 ASEC. Also, tables are provided that allow the calculation of parameters and standard errors for comparisons to adjacent years and for U.S. states and regions. Table 6 provides factors to derive prior year parameters. Tables 7 and 8 contain correlation coefficients for comparing estimates from consecutive years. Tables 9 and 10 provide factors and population controls to derive U.S. state and regional parameters.


The basic CPS questionnaire records the race and ethnicity of each respondent. With respect to race, a respondent can be White, Black, Asian, American Indian and Alaska Native (AIAN), Native Hawaiian and Other Pacific Islander (NHOPI), or combinations of two or more of the preceding. A respondent’s ethnicity can be Hispanic or non-Hispanic, regardless of race.


The generalized variance parameters to use in computing standard errors are dependent upon the race/ethnicity group of interest. The following table summarizes the relationship between the race/ethnicity group of interest and the generalized variance parameters to use in standard error calculations.


Table 3. Estimation Groups of Interest and Generalized Variance Parameters

Race/ethnicity group of interest                      Generalized variance parameters to use
                                                      in standard error calculations

Total population                                      Total or White
Total White, White AOIC, or White non-Hispanic        Total or White
  population
Total Black, Black AOIC, or Black non-Hispanic        Black
  population
Total Asian, AIAN, NHOPI; Asian, AIAN, NHOPI AOIC;    Asian, AIAN, NHOPI
  or Asian, AIAN, NHOPI non-Hispanic population
Populations from other race groups                    Asian, AIAN, NHOPI
Hispanic population                                   Hispanic
Two or more races – employment/unemployment,          Black
  educational attainment characteristics
Two or more races – all other characteristics         Asian, AIAN, NHOPI

NOTES: (1) AIAN, NHOPI are American Indian and Alaska Native, Native Hawaiian and Other Pacific Islander, respectively.

(2) AOIC is an abbreviation for alone or in combination. The AOIC population for a race group of interest includes people reporting only the race group of interest (alone) and people reporting multiple race categories including the race group of interest (in combination).

(3) Hispanics may be any race.

(4) Two or more races refers to the group of cases self-classified as having two or more races.


Standard Errors of Estimated Numbers. The approximate standard error, s_x, of an estimated number from this microdata file can be obtained using the formula:


$$s_x = \sqrt{a x^2 + b x} \qquad (1)$$


where x is the estimate and a and b are the parameters in Tables 4 and 5 associated with the particular type of characteristic. When calculating standard errors from cross-tabulations involving different characteristics, use the set of parameters for the characteristic that will give the largest standard error.


Illustration 1

Suppose there were 3,442,000 unemployed women in the civilian labor force. Use Formula (1) and the appropriate parameters from Table 4 to get


Number of unemployed females in the civilian labor force (x)    3,442,000
a parameter (a)                                                 -0.000031
b parameter (b)                                                 2,782
Standard error                                                  96,000
90-percent confidence interval                                  3,284,000 to 3,600,000


The standard error is calculated as

$$s_x = \sqrt{-0.000031 \times 3{,}442{,}000^2 + 2{,}782 \times 3{,}442{,}000} \approx 96{,}000$$

and the 90-percent confidence interval is calculated as 3,442,000 ± 1.645 × 96,000.


A conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 90 percent of all possible samples.
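The calculation in Illustration 1 can be checked with a short Python sketch. It assumes Formula (1) has the generalized variance form s_x = sqrt(a·x² + b·x), which reproduces the standard error of 96,000 shown in the illustration. The function names are hypothetical.

```python
import math

def se_of_number(x, a, b):
    """Approximate standard error of an estimated number, assuming
    Formula (1) is s_x = sqrt(a * x**2 + b * x)."""
    return math.sqrt(a * x**2 + b * x)

def confidence_interval_90(estimate, se):
    """90-percent confidence interval: estimate +/- 1.645 standard errors."""
    margin = 1.645 * se
    return estimate - margin, estimate + margin

# Illustration 1: unemployed women in the civilian labor force.
x = 3_442_000
se = se_of_number(x, a=-0.000031, b=2_782)
se_rounded = round(se, -3)  # the illustration rounds to the nearest thousand
low, high = confidence_interval_90(x, se_rounded)

print(f"standard error: {se_rounded:,.0f}")
print(f"90-percent CI:  {round(low, -3):,.0f} to {round(high, -3):,.0f}")
```

The same function applies to any estimated number once the matching a and b parameters have been taken from Table 4 or Table 5.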


Illustration 2

Suppose there were 58,370,000 married-couple family households. Use Formula (1) and the appropriate parameters from Table 5 to get


Number of married-couple family households (x)    58,370,000
a parameter (a)                                   -0.000004
b parameter (b)                                   1,052
Standard error                                    219,000
90-percent confidence interval                    58,010,000 to 58,730,000


The standard error is calculated as

$$s_x = \sqrt{-0.000004 \times 58{,}370{,}000^2 + 1{,}052 \times 58{,}370{,}000} \approx 219{,}000$$

and the 90-percent confidence interval is calculated as 58,370,000 ± 1.645 × 219,000.


A conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 90 percent of all possible samples.


Standard Errors of Estimated Percentages. The reliability of an estimated percentage, computed using sample data for both numerator and denominator, depends on both the size of the percentage and its base. Estimated percentages are relatively more reliable than the corresponding estimates of the numerators of the percentages, particularly if the percentages are 50 percent or more. When the numerator and denominator of the percentage are in different categories, use the parameter from Table 4 or 5 as indicated by the numerator. However, for calculating standard errors for different characteristics of families in poverty, use the standard error of a ratio equation (see “Standard Errors of Estimated Ratios”).


The approximate standard error, \(s_{y,p}\), of an estimated percentage can be obtained by using the formula:

\[ s_{y,p} = \sqrt{\frac{b}{y}\, p\,(100 - p)} \tag{2} \]

Here y is the total number of people, families, households, or unrelated individuals in the base of the percentage, p is the percentage (0 ≤ p ≤ 100), and b is the parameter in Table 4 or 5 associated with the characteristic in the numerator of the percentage.


Illustration 3

Suppose there were 188,983,000 out of 222,723,000 adults (aged 18 and older), or 84.9 percent, who graduated from high school. Use Formula (2) and the appropriate parameter from Table 5 to get


| Illustration 3 | |
|---|---|
| Percentage of adults who are high school graduates (p) | 84.9 |
| Base (y) | 222,723,000 |
| b parameter (b) | 1,206 |
| Standard error | 0.08 |
| 90-percent confidence interval | 84.8 to 85.0 |


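The standard error is calculated as

\[ s_{y,p} = \sqrt{\frac{1{,}206}{222{,}723{,}000} \times 84.9 \times (100 - 84.9)} \approx 0.08 \]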

The 90-percent confidence interval of the percentage of adults who graduated from high school is calculated as 84.9 ± 1.645 × 0.08.
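As a minimal sketch, Formula (2) and Illustration 3 can be reproduced in Python as follows. The function name `se_percentage` is illustrative only.

```python
import math


def se_percentage(p, y, b):
    """Formula (2): standard error of a percentage p (0 to 100) on a base y."""
    return math.sqrt((b / y) * p * (100 - p))


# Illustration 3: adults who are high school graduates (Table 5, b = 1,206)
se_hs = se_percentage(84.9, 222_723_000, 1_206)

# The illustration rounds the standard error to 0.08 before forming the interval.
lower = 84.9 - 1.645 * round(se_hs, 2)
upper = 84.9 + 1.645 * round(se_hs, 2)

print(round(se_hs, 2), round(lower, 1), round(upper, 1))
```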


Standard Errors of Estimated Differences. The standard error of the difference between two sample estimates is approximately equal to


\[ s_{x-y} = \sqrt{s_x^2 + s_y^2 - 2 r s_x s_y} \tag{3} \]


where sx and sy are the standard errors of the estimates, x and y. The estimates can be numbers, percentages, ratios, etc. Tables 7 and 8 contain the correlation coefficient, r, for CPS year-to-year comparisons. The correlations were derived for income and poverty estimates, but they can be used for other types of estimates where the year-to-year correlation between identical households is high. For making other comparisons, assume that r equals zero. Making this assumption will result in accurate estimates of standard errors for the difference between two estimates of the same characteristic in two different areas, or for the difference between separate and uncorrelated characteristics in the same area. However, if there is a high positive (negative) correlation between the two characteristics, the formula will overestimate (underestimate) the true standard error.


Illustration 4

Suppose there were 17,940,000 men over age 24 who were never married and 9,526,000 men over age 24 who were divorced. The apparent difference is 8,414,000. Use Formulas (1) and (3) with r = 0 and the appropriate parameters from Table 5 to get


| Illustration 4 | Never married (x) | Divorced (y) | Difference |
|---|---|---|---|
| Number of males over age 24 | 17,940,000 | 9,526,000 | 8,414,000 |
| a parameter (a) | -0.000009 | -0.000009 | - |
| b parameter (b) | 2,652 | 2,652 | - |
| Standard error | 211,000 | 156,000 | 262,000 |
| 90-percent confidence interval | 17,593,000 to 18,287,000 | 9,269,000 to 9,783,000 | 7,983,000 to 8,845,000 |


The standard error of the difference is calculated as

\[ s_{x-y} = \sqrt{211{,}000^2 + 156{,}000^2} \approx 262{,}000 \]

The 90-percent confidence interval around the difference is calculated as 8,414,000 ± 1.645 × 262,000. Since this interval does not include zero, we can conclude with 90 percent confidence that the number of never married men over age 24 was higher than the number of divorced men over age 24.


Illustration 5

Suppose that the percentage of people without health insurance coverage for 2007 was 15.3 percent out of 299,106,000 people, and the percentage of people without health insurance coverage for 2006 was 15.8 percent out of 296,824,000 people. The apparent difference is 0.5 percentage points. Use Formulas (2) and (3) and the appropriate parameter, factor, and correlation coefficient from Tables 5, 6, and 7 to get


| Illustration 5 | 2006 (x) | 2007 (y) | Difference |
|---|---|---|---|
| Percentage of people without health insurance (p) | 15.8 | 15.3 | 0.5 |
| Base | 296,824,000 | 299,106,000 | - |
| b parameter (b) | 2,652* | 2,652 | - |
| Correlation coefficient (r) | - | - | 0.30 |
| Standard error | 0.11 | 0.11 | 0.13 |
| 90-percent confidence interval | 15.6 to 16.0 | 15.1 to 15.5 | 0.3 to 0.7 |

*This parameter is calculated by multiplying the year factor for 2006 from Table 6, 1.0, by the current b parameter.


The standard error of the difference is calculated as

\[ s_{x-y} = \sqrt{0.11^2 + 0.11^2 - 2 \times 0.30 \times 0.11 \times 0.11} \approx 0.13 \]

and the 90-percent confidence interval around the difference is calculated as 0.5 ± 1.645 × 0.13. Since this interval does not include zero, we can conclude with 90 percent confidence that the percentage of people without health insurance in 2007 was lower than the percentage of people without health insurance in 2006.
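The difference calculations in Illustrations 4 and 5 can be sketched in Python as shown here. The helper names `se_difference` and `excludes_zero` are illustrative, and the inputs are the rounded standard errors from the illustration tables.

```python
import math


def se_difference(s_x, s_y, r=0.0):
    """Formula (3): standard error of the difference between two estimates."""
    return math.sqrt(s_x**2 + s_y**2 - 2 * r * s_x * s_y)


def excludes_zero(difference, se, z=1.645):
    """True when the 90-percent interval around the difference omits zero."""
    return abs(difference) > z * se


# Illustration 4: never-married versus divorced men over age 24, r = 0
se_ill4 = se_difference(211_000, 156_000)

# Illustration 5: percentage without health insurance, 2006 versus 2007, r = 0.30
se_ill5 = se_difference(0.11, 0.11, r=0.30)

print(round(se_ill4, -3), excludes_zero(8_414_000, se_ill4))
print(round(se_ill5, 2), excludes_zero(0.5, se_ill5))
```

A positive correlation reduces the standard error of the difference, which is why the year-to-year comparison uses r from Table 7 rather than zero.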


Standard Errors of Estimated Ratios. Certain estimates may be calculated as the ratio of two numbers. Compute the standard error of a ratio, x/y, using


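\[ s_{x/y} = \frac{x}{y} \sqrt{\left(\frac{s_x}{x}\right)^2 + \left(\frac{s_y}{y}\right)^2 - 2r\,\frac{s_x s_y}{x y}} \tag{4} \]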


The standard error of the numerator, sx, and that of the denominator, sy, may be calculated using formulas described earlier. In Formula (4), r represents the correlation between the numerator and the denominator of the estimate.


For one type of ratio, the denominator is a count of families or households and the numerator is a count of people in those families or households with a certain characteristic. If there is at least one person with the characteristic in every family or household, use 0.7 as an estimate of r. An example of this type is the average number of children per family with children.


For all other types of ratios, r is assumed to be zero. Examples are the average number of children per family and the family poverty rate. If r is actually positive (negative), then this procedure will provide an overestimate (underestimate) of the standard error of the ratio.


Note: For estimates expressed as the ratio of x per 100 y or x per 1,000 y, multiply Formula (4) by 100 or 1,000, respectively, to obtain the standard error.


Illustration 6

Suppose there were 9,049,000 men working part-time and 17,933,000 women working part-time. The ratio of men working part-time to women working part-time would be 0.505 or 50.5 percent. Use Formulas (1) and (4) with r = 0 and the appropriate parameters from Table 4 to get


| Illustration 6 | Males (x) | Females (y) | Ratio |
|---|---|---|---|
| Number who work part-time | 9,049,000 | 17,933,000 | 0.505 |
| a parameter (a) | -0.000032 | -0.000031 | - |
| b parameter (b) | 2,971 | 2,782 | - |
| Standard error | 156,000 | 200,000 | 0.0104 |
| 90-percent confidence interval | 8,792,000 to 9,306,000 | 17,604,000 to 18,262,000 | 0.488 to 0.522 |


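The standard error is calculated as

\[ s_{x/y} = \frac{9{,}049{,}000}{17{,}933{,}000} \sqrt{\left(\frac{156{,}000}{9{,}049{,}000}\right)^2 + \left(\frac{200{,}000}{17{,}933{,}000}\right)^2} \approx 0.0104 \]
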
and the 90-percent confidence interval is calculated as 0.505 ± 1.645 × 0.0104.


Illustration 7

Suppose that the number of families below the poverty level was 7,623,000 and the total number of families was 77,908,000. The ratio of families below the poverty level to the total number of families would be 0.098 or 9.8 percent. Use Formulas (1) and (4) with r = 0 and the appropriate parameters from Table 5 to get


| Illustration 7 | In poverty (x) | Total (y) | Ratio (in percent) |
|---|---|---|---|
| Number of families | 7,623,000 | 77,908,000 | 9.8 |
| a parameter (a) | +0.000052 | -0.000004 | - |
| b parameter (b) | 1,243 | 1,052 | - |
| Standard error | 112,000 | 240,000 | 0.15 |
| 90-percent confidence interval | 7,439,000 to 7,807,000 | 77,513,000 to 78,303,000 | 9.6 to 10.0 |


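The standard error is calculated as

\[ s_{x/y} = \frac{7{,}623{,}000}{77{,}908{,}000} \sqrt{\left(\frac{112{,}000}{7{,}623{,}000}\right)^2 + \left(\frac{240{,}000}{77{,}908{,}000}\right)^2} \approx 0.0015 \]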

and the 90-percent confidence interval is calculated as 0.098 ± 1.645 × 0.0015.
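A short Python sketch of Formula (4), applied to Illustrations 6 and 7, is given here. The function name `se_ratio` is illustrative, and the inputs are the rounded standard errors from the illustration tables.

```python
import math


def se_ratio(x, y, s_x, s_y, r=0.0):
    """Formula (4): standard error of the ratio x / y."""
    relative_variance = (s_x / x) ** 2 + (s_y / y) ** 2 - 2 * r * s_x * s_y / (x * y)
    return (x / y) * math.sqrt(relative_variance)


# Illustration 6: men working part-time per woman working part-time, r = 0
se_ill6 = se_ratio(9_049_000, 17_933_000, 156_000, 200_000)

# Illustration 7: families below the poverty level as a share of all families, r = 0
se_ill7 = se_ratio(7_623_000, 77_908_000, 112_000, 240_000)

print(round(se_ill6, 4))        # standard error of the ratio
print(round(100 * se_ill7, 2))  # standard error expressed in percent
```

Multiplying the result by 100 gives the standard error of a ratio expressed per 100 units, as the note to Formula (4) describes.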


Standard Errors of Estimated Medians. The sampling variability of an estimated median depends on the form of the distribution and the size of the base. One can approximate the reliability of an estimated median by determining a confidence interval about it. (See “Standard Errors and Their Use” for a general discussion of confidence intervals.)


Estimate the 68-percent confidence limits of a median based on sample data using the following procedure:


1. Determine, using Formula (2), the standard error of the estimate of 50 percent from the distribution.


2. Add to and subtract from 50 percent the standard error determined in step 1. These two numbers are the percentage limits corresponding to the 68-percent confidence interval about the estimated median.


3. Using the distribution of the characteristic, determine upper and lower limits of the 68-percent confidence interval by calculating values corresponding to the two points established in step 2.


Note: The percentage limits found in step 2 may or may not fall in the same characteristic distribution interval.

Use the following formula to calculate the upper and lower limits:


\[ X_p = \frac{pN - N_L}{N_U - N_L}\,(U - L) + L \tag{5} \]

where

\(X_p\) = estimated upper and lower limits for the confidence interval (0 ≤ p ≤ 1). For purposes of calculating the confidence interval, p takes on the values determined in step 2, expressed as proportions. Note that \(X_p\) estimates the median when p = 0.50.

N = for distribution of numbers: the total number of units (people, households, etc.) for the characteristic in the distribution.

= for distribution of percentages: the value 100.

p = the values obtained in step 2.

L, U = the lower and upper boundaries, respectively, of the interval containing \(X_p\).

Note: For continuous data, i.e., income, time, etc., the upper bound of the interval containing \(X_p\) and lower bound of the next interval are essentially the same and will be treated as such in the illustration.

\(N_L\), \(N_U\) = for distribution of numbers: the estimated number of units (people, households, etc.) with values of the characteristic less than L and U, respectively.

= for distribution of percentages: the estimated percentage of units (people, households, etc.) having values of the characteristic less than L and U, respectively.


4. Divide the difference between the two points determined in step 3 by 2 to obtain the standard error of the median.


Note: Medians and their standard errors calculated as below may differ from those in published tables and reports showing medians, since narrower income intervals were used in those calculations.


Illustration 8

Suppose there were 116,783,000 households in 2007, and their income was distributed in the following way:


Illustration 8

| Income level | Number of households | Cumulative number of households | Cumulative percentage of households |
|---|---|---|---|
| Under $5,000 | 3,413,000 | 3,413,000 | 2.92 |
| $5,000 to $9,999 | 5,042,000 | 8,455,000 | 7.24 |
| $10,000 to $14,999 | 7,051,000 | 15,506,000 | 13.28 |
| $15,000 to $24,999 | 13,528,000 | 29,034,000 | 24.86 |
| $25,000 to $34,999 | 12,532,000 | 41,566,000 | 35.59 |
| $35,000 to $49,999 | 16,521,000 | 58,087,000 | 49.74 |
| $50,000 to $74,999 | 21,268,000 | 79,355,000 | 67.95 |
| $75,000 to $99,999 | 13,841,000 | 93,196,000 | 79.80 |
| $100,000 and over | 23,586,000 | 116,783,000* | 100.00 |

*This value does not equal the sum of the number of households due to rounding.


1. Using Formula (2) with b = 1,140, the standard error of 50 percent on a base of 116,783,000 is about 0.16 percent.


2. To obtain a 68-percent confidence interval on an estimated median, add to and subtract from 50 percent the standard error found in step 1. This yields percentage limits of 49.84 and 50.16.


3. The lower and upper boundaries for the interval in which the percentage limits fall, are L = $50,000 and U = $75,000, respectively.


Then the estimated numbers of households with an income less than $50,000 and $75,000 are NL = 58,087,000 and NU = 79,355,000, respectively.


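Using Formula (5), the lower limit for the confidence interval of the median is found to be about

\[ \frac{0.4984 \times 116{,}783{,}000 - 58{,}087{,}000}{79{,}355{,}000 - 58{,}087{,}000} \times (75{,}000 - 50{,}000) + 50{,}000 = 50{,}138 \]

Similarly, the upper limit is found to be about

\[ \frac{0.5016 \times 116{,}783{,}000 - 58{,}087{,}000}{79{,}355{,}000 - 58{,}087{,}000} \times (75{,}000 - 50{,}000) + 50{,}000 = 50{,}578 \]

Thus, a 68-percent confidence interval for the median income for households is from $50,138 to $50,578.

4. The standard error of the median is, therefore,

\[ \frac{50{,}578 - 50{,}138}{2} = 220 \]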
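The four steps of Illustration 8 can be sketched in Python as follows. The function names are illustrative, and the sketch takes the income interval containing both percentage limits ($50,000 to $75,000) as given rather than searching the distribution for it.

```python
import math


def se_percentage(p, y, b):
    """Formula (2): standard error of a percentage p (0 to 100) on a base y."""
    return math.sqrt((b / y) * p * (100 - p))


def interpolate(p, n, lower, upper, n_lower, n_upper):
    """Formula (5): value X_p below which a proportion p (0 to 1) of n units fall."""
    return (p * n - n_lower) / (n_upper - n_lower) * (upper - lower) + lower


N = 116_783_000                    # total households
L, U = 50_000, 75_000              # interval containing both percentage limits
N_L, N_U = 58_087_000, 79_355_000  # households with income below L and below U

se_50 = round(se_percentage(50, N, 1_140), 2)           # step 1
p_low, p_high = (50 - se_50) / 100, (50 + se_50) / 100  # step 2, as proportions
x_low = interpolate(p_low, N, L, U, N_L, N_U)           # step 3
x_high = interpolate(p_high, N, L, U, N_L, N_U)
se_median = (x_high - x_low) / 2                        # step 4

print(round(x_low), round(x_high), round(se_median))
```

If the two percentage limits fell in different income intervals, each limit would need its own L, U, N_L, and N_U.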



Standard Errors of Averages for Grouped Data. The formula used to estimate the standard error of an average for grouped data is


\[ s_{\bar{x}} = \sqrt{\frac{b}{y}\, S^2} \tag{6} \]

In this formula, y is the size of the base of the distribution and b is the parameter from Table 4 or 5. The variance, \(S^2\), is given by the following formula:

\[ S^2 = \sum_{i=1}^{c} p_i \bar{x}_i^2 - \bar{x}^2 \tag{7} \]

where \(\bar{x}\), the average of the distribution, is estimated by

\[ \bar{x} = \sum_{i=1}^{c} p_i \bar{x}_i \tag{8} \]

where

c = the number of groups; i indicates a specific group, thus taking on values 1 through c.

\(p_i\) = estimated proportion of people, households, families, or unrelated individuals whose values for the characteristic being considered fall in group i.

\(\bar{x}_i\) = \((L_i + U_i)/2\) where \(L_i\) and \(U_i\) are the lower and upper interval boundaries, respectively, for group i. \(\bar{x}_i\) is assumed to be the most representative value for the characteristic of people, households, families, or unrelated individuals in group i. If group c is open-ended, i.e., no upper interval boundary exists, use a group approximate average value of

\[ \bar{x}_c = \frac{3}{2} L_c \tag{9} \]


Note: For continuous data, i.e., income, time, etc., the upper bound of the ith interval and lower bound of the next interval are essentially the same and will be treated as such in the illustration.


Illustration 9

Suppose that there were 7,623,000 families in poverty and that the distribution of the income deficit (the difference between their family income and poverty threshold) for all families in poverty was



| Income deficit | Number of families in poverty | Percentage of families in poverty (\(p_i\)) | Average income deficit |
|---|---|---|---|
| Under $500 | 248,000 | 3.3 | 250 |
| $500 to $999 | 296,000 | 3.9 | 750 |
| $1,000 to $1,999 | 656,000 | 8.6 | 1,500 |
| $2,000 to $2,999 | 500,000 | 6.6 | 2,500 |
| $3,000 to $3,999 | 581,000 | 7.6 | 3,500 |
| $4,000 to $4,999 | 542,000 | 7.1 | 4,500 |
| $5,000 to $5,999 | 440,000 | 5.8 | 5,500 |
| $6,000 to $6,999 | 482,000 | 6.3 | 6,500 |
| $7,000 to $7,999 | 347,000 | 4.6 | 7,500 |
| $8,000 and over | 3,530,000 | 46.3 | 12,000 |
| Total | 7,623,000* | 100 | |

*This value does not equal the sum of the number of families due to rounding.


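Using Formula (8),

\[ \bar{x} = 0.033(250) + 0.039(750) + \cdots + 0.463(12{,}000) \approx 7{,}547 \]

and Formula (7),

\[ S^2 = 0.033(250)^2 + 0.039(750)^2 + \cdots + 0.463(12{,}000)^2 - 7{,}547^2 \approx 19{,}717{,}000 \]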

Use the appropriate parameter from Table 5 and Formula (6) to get


| Illustration 9 | |
|---|---|
| Average income deficit for families in poverty | $7,547 |
| Variance (\(S^2\)) | 19,717,000 |
| Base (y) | 7,623,000 |
| b parameter (b) | 1,140 |
| Standard error | $54 |
| 90-percent confidence interval | $7,458 to $7,636 |


The standard error is calculated as

\[ s_{\bar{x}} = \sqrt{\frac{1{,}140}{7{,}623{,}000} \times 19{,}717{,}000} \approx 54 \]

and the 90-percent confidence interval is calculated as $7,547 ± 1.645 × $54.
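The grouped-data calculation in Illustration 9 can be sketched in Python as shown here. The sketch uses the rounded percentages from the illustration table, so the mean and variance differ slightly from the published figures; the variable names are illustrative.

```python
import math

# (group average income deficit, percentage of families) from Illustration 9.
# The open-ended last group uses 1.5 times its lower boundary, per Formula (9).
groups = [
    (250, 3.3), (750, 3.9), (1_500, 8.6), (2_500, 6.6), (3_500, 7.6),
    (4_500, 7.1), (5_500, 5.8), (6_500, 6.3), (7_500, 4.6), (1.5 * 8_000, 46.3),
]

# Formula (8): average of the distribution
mean = sum(pct / 100 * value for value, pct in groups)

# Formula (7): variance of the distribution
variance = sum(pct / 100 * value**2 for value, pct in groups) - mean**2

# Formula (6): standard error of the average, with b = 1,140 and y = 7,623,000
se_mean = math.sqrt(1_140 / 7_623_000 * variance)

print(round(mean, 1), round(variance), round(se_mean))
```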


Standard Errors of Estimated Per Capita Deficits. Certain average values in reports associated with the ASEC data represent the per capita deficit for households of a certain class. The average per capita deficit is approximately equal to


\[ x = \frac{hm}{p} \tag{10} \]


where


h = number of households in the class.


m = average deficit for households in the class.


p = number of people in households in the class.


x = average per capita deficit of people in households in the class.


To approximate standard errors for these averages, use the formula


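\[ s_x = \frac{hm}{p} \sqrt{\left(\frac{s_p}{p}\right)^2 + \left(\frac{s_h}{h}\right)^2 - 2r\left(\frac{s_p}{p}\right)\left(\frac{s_h}{h}\right) + \left(\frac{s_m}{m}\right)^2} \tag{11} \]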


In Formula (11), r represents the correlation between p and h.


For one type of average, the class represents households containing a fixed number of people. For example, h could be the number of 3-person households. In this case, there is an exact correlation between the number of people in households and the number of households. Therefore, r = 1 for such households.


For other types of averages, the class represents households of other demographic types, for example, households in distinct regions, households in which the householder is of a certain age group, and owner-occupied and tenant-occupied households. In this and other cases in which the correlation between p and h is not perfect, use 0.7 as an estimate of r.


Illustration 10

Suppose there were 26,509,000 people living in families in poverty, and 7,623,000 families in poverty, and the average deficit income for families in poverty was $7,547 with a standard error of $54 (from Illustration 9). Use Formulas (1), (10), and (11) and the appropriate parameters from Table 5 and r = 0.7 to get


| Illustration 10 | Number (h) | Number of people (p) | Average income deficit (m) | Average per capita deficit (x) |
|---|---|---|---|---|
| Value for families in poverty | 7,623,000 | 26,509,000 | $7,547 | $2,170 |
| a parameter (a) | +0.000052 | -0.000018 | - | - |
| b parameter (b) | 1,243 | 5,282 | - | - |
| Correlation (r) | - | - | - | 0.7 |
| Standard error | 112,000 | 357,000 | $54 | $28 |
| 90-percent confidence interval | 7,439,000 to 7,807,000 | 25,922,000 to 27,096,000 | $7,458 to $7,636 | $2,124 to $2,216 |


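The estimate of the average per capita deficit is calculated as

\[ x = \frac{7{,}623{,}000 \times 7{,}547}{26{,}509{,}000} \approx 2{,}170 \]

and the standard error is calculated as

\[ s_x = 2{,}170 \sqrt{\left(\frac{357{,}000}{26{,}509{,}000}\right)^2 + \left(\frac{112{,}000}{7{,}623{,}000}\right)^2 - 2(0.7)\left(\frac{357{,}000}{26{,}509{,}000}\right)\left(\frac{112{,}000}{7{,}623{,}000}\right) + \left(\frac{54}{7{,}547}\right)^2} \approx 28 \]

The 90-percent confidence interval is calculated as $2,170 ± 1.645 × $28.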
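Formulas (10) and (11) and the figures in Illustration 10 can be sketched in Python as follows. The function names are illustrative, and the inputs are the rounded values from the illustration table.

```python
import math


def per_capita_deficit(h, m, p):
    """Formula (10): average per capita deficit."""
    return h * m / p


def se_per_capita_deficit(h, m, p, s_h, s_m, s_p, r):
    """Formula (11): standard error of the average per capita deficit."""
    relative_variance = (
        (s_p / p) ** 2
        + (s_h / h) ** 2
        - 2 * r * (s_p / p) * (s_h / h)
        + (s_m / m) ** 2
    )
    return (h * m / p) * math.sqrt(relative_variance)


# Illustration 10: families in poverty, with r = 0.7
h, m, p = 7_623_000, 7_547, 26_509_000
x = per_capita_deficit(h, m, p)
s_x = se_per_capita_deficit(h, m, p, s_h=112_000, s_m=54, s_p=357_000, r=0.7)

print(round(x), round(s_x))
```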


Accuracy of State Estimates. The redesign of the CPS following the 1980 census provided an opportunity to increase efficiency and accuracy of state data. All strata are now defined within state boundaries. The sample is allocated among the states to produce state and national estimates with the required accuracy while keeping total sample size to a minimum. Improved accuracy of state data was achieved with about the same sample size as in the 1970 design.


Since the CPS is designed to produce both state and national estimates, the proportion of the total population sampled and the sampling rates differ among the states. In general, the smaller the population of the state the larger the sampling proportion. For example, in Vermont approximately 1 in every 250 households is sampled each month. In New York the sample is about 1 in every 2,000 households. Nevertheless, the size of the sample in New York is four times larger than in Vermont because New York has a larger population.


Note: The Census Bureau recommends the use of 3-year averages to compare estimates across states and 2-year averages to evaluate changes in state estimates over time. See “Standard Errors of Data for Combined Years” and “Standard Errors of Differences of 2-Year Averages.” The Census Bureau also recommends the American Community Survey microdata file as the preferred source for income and poverty state data in years 2006 (2005 estimates) to the present.


Standard Errors for State Estimates. The standard error for a state may be obtained by determining new state-level a and b parameters and then using these adjusted parameters in the standard error formulas mentioned previously. To determine a new state-level b parameter (bstate), multiply the b parameter from Table 4 or 5 by the state factor from Table 9. To determine a new state-level a parameter (astate), use the following:


(1) If the a parameter from Table 4 or 5 is positive, multiply it by the state factor from Table 9.


(2) If the a parameter in Table 4 or 5 is negative, calculate the new state-level a parameter as follows:


\[ a_{state} = \frac{-b_{state}}{POP_{state}} \tag{12} \]


where POPstate is the state population found in Table 9.

Illustration 11

Suppose there were 14,435,000 people living in New York state who were born in the United States. Use Formulas (1) and (12) and the appropriate parameter, factor, and population from Tables 5 and 9 to get


| Illustration 11 | |
|---|---|
| Number of people in NY who were born in the U.S. (x) | 14,435,000 |
| b parameter (b) | 2,652 |
| New York state factor | 1.17 |
| State population | 19,039,135 |
| State a parameter (\(a_{state}\)) | -0.000163 |
| State b parameter (\(b_{state}\)) | 3,103 |
| Standard error | 104,000 |

Obtain the state-level b parameter by multiplying the b parameter, 2,652, by the state factor, 1.17. This gives \(b_{state}\) = 2,652 × 1.17 = 3,103. Obtain the needed state-level a parameter by

\[ a_{state} = \frac{-3{,}103}{19{,}039{,}135} = -0.000163 \]

The standard error of the estimate of the number of people in New York state who were born in the United States can then be found by using Formula (1) and the new state-level a and b parameters, -0.000163 and 3,103, respectively. The standard error is given by

\[ s_x = \sqrt{-0.000163 \times 14{,}435{,}000^2 + 3{,}103 \times 14{,}435{,}000} \approx 104{,}000 \]
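The state-level adjustment in Illustration 11 can be sketched in Python as follows. The function names are illustrative. Two points are assumptions rather than statements from the text: the national a parameter passed in (-0.000009, the Table 5 value that accompanies b = 2,652) and the handling of an a parameter of exactly zero.

```python
import math


def state_parameters(a, b, state_factor, state_population):
    """Return (a_state, b_state) following the two rules in the text."""
    b_state = b * state_factor
    if a >= 0:
        # Rule (1) covers positive a; leaving a == 0 at zero is an assumption.
        a_state = a * state_factor
    else:
        # Rule (2), Formula (12)
        a_state = -b_state / state_population
    return a_state, b_state


def se_number(x, a, b):
    """Assumed form of Formula (1): standard error of an estimated number x."""
    return math.sqrt(a * x**2 + b * x)


# Illustration 11: people in New York state who were born in the United States
a_state, b_state = state_parameters(-0.000009, 2_652, 1.17, 19_039_135)
se_ny = se_number(14_435_000, a_state, b_state)

print(round(a_state, 6), round(b_state), round(se_ny, -3))
```

The sketch carries unrounded parameters through the calculation; the result still rounds to the published standard error.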



Standard Errors of Regional Estimates. To compute standard errors for regional estimates, follow the steps for computing standard errors for state estimates found in “Standard Errors for State Estimates” using the regional factors and populations found in Table 10.


Illustration 12

Suppose there were 15,501,000 of 109,545,000 people, or 14.2 percent, living in poverty in the South. Use Formulas (2) and (12) and the appropriate parameter and factor from Tables 5 and 10 to get


| Illustration 12 | |
|---|---|
| Poverty rate in the South (p) | 14.2 |
| Base (y) | 109,545,000 |
| b parameter (b) | 5,282 |
| South regional factor | 1.08 |
| Regional b parameter (\(b_{region}\)) | 5,705 |
| Standard error | 0.25 |
| 90-percent confidence interval | 13.8 to 14.6 |

Obtain the region-level b parameter by multiplying the b parameter, 5,282, by the South regional factor, 1.08. This gives bregion = 5,282 × 1.08 = 5,705.


The standard error of the estimate of the poverty rate for people living in the South can then be found by using Formula (2) and the new region-level b parameter, 5,705. The standard error is given by

\[ s_{y,p} = \sqrt{\frac{5{,}705}{109{,}545{,}000} \times 14.2 \times (100 - 14.2)} \approx 0.25 \]

and the 90-percent confidence interval of the poverty rate for people living in the South is calculated as 14.2 ± 1.645 × 0.25.


Standard Errors of Groups of States. The standard error calculation for a group of states is similar to the standard error calculation for a single state. First, calculate a new state group factor for the group of states. Then, determine new state group a and b parameters. Finally, use these adjusted parameters in the standard error formulas mentioned previously.


Use the following formula to determine a new state group factor:


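\[ \text{state group factor} = \frac{\sum_{i=1}^{n} POP_i \times \text{state factor}_i}{\sum_{i=1}^{n} POP_i} \tag{13} \]

where \(POP_i\) and state factor\(_i\) are the population and factor for state i from Table 9, and n is the number of states in the group. To obtain a new state group b parameter (\(b_{state\ group}\)), multiply the b parameter from Table 4 or 5 by the state group factor obtained by Formula (13). To determine a new state group a parameter (\(a_{state\ group}\)), use the following: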


(1) If the a parameter from Table 4 or 5 is positive, multiply it by the state group factor determined by Formula (13).


(2) If the a parameter in Table 4 or 5 is negative, calculate the new state group a parameter as follows:

\[ a_{state\ group} = \frac{-b_{state\ group}}{\sum_{i=1}^{n} POP_i} \tag{14} \]
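The state group adjustment can be sketched in Python as follows. The populations and state factors below are hypothetical placeholders chosen for easy arithmetic; they are not taken from Table 9. The function names are illustrative, and treating an a parameter of zero like a positive one is an assumption.

```python
def state_group_factor(populations, state_factors):
    """Formula (13): population-weighted average of the state factors."""
    weighted = sum(pop * f for pop, f in zip(populations, state_factors))
    return weighted / sum(populations)


def state_group_parameters(a, b, populations, state_factors):
    """Return (a_group, b_group) for a group of states."""
    factor = state_group_factor(populations, state_factors)
    b_group = b * factor
    if a >= 0:
        a_group = a * factor                   # rule (1); a == 0 is an assumption
    else:
        a_group = -b_group / sum(populations)  # rule (2), Formula (14)
    return a_group, b_group


# Hypothetical inputs, NOT from Table 9
populations = [1_000_000, 3_000_000]
state_factors = [1.0, 1.2]

factor = state_group_factor(populations, state_factors)
a_group, b_group = state_group_parameters(-0.000009, 2_652, populations, state_factors)

print(factor, a_group, b_group)
```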

Illustration 13

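Suppose the state group factor for the state group Illinois-Indiana-Michigan was required. The appropriate factor would be

\[ \frac{POP_{IL} \times \text{state factor}_{IL} + POP_{IN} \times \text{state factor}_{IN} + POP_{MI} \times \text{state factor}_{MI}}{POP_{IL} + POP_{IN} + POP_{MI}} \]

with the populations and state factors for the three states taken from Table 9.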



Standard Errors of Data for Combined Years. Sometimes estimates for multiple years are combined to improve precision. For example, suppose \(\bar{x}\) is an average derived from n consecutive years’ data, i.e., \(\bar{x} = \sum_{i=1}^{n} x_i / n\), where the \(x_i\) are the estimates for the individual years. Use the formulas described previously to estimate the standard error, \(s_{x_i}\), of each year’s estimate. Then the standard error of \(\bar{x}\) is

\[ s_{\bar{x}} = \frac{s_x}{n} \tag{15} \]

where

\[ s_x = \sqrt{\sum_{i=1}^{n} s_{x_i}^2 + 2r \sum_{i=1}^{n-1} s_{x_i} s_{x_{i+1}}} \tag{16} \]

and \(s_{x_i}\) are the standard errors of the estimates \(x_i\) for years i = 1 to n. Tables 7 and 8 contain the correlation coefficients, r, for the correlation between consecutive years i and i+1. Correlation between nonconsecutive years is zero. The correlations were derived for income and poverty estimates, but they can be used for other types of estimates where the year-to-year correlation between identical households is high.


The Census Bureau recommends the use of multi-year average estimates for certain small population subgroups4 (see also “Accuracy of State Estimates.”) Three-year averages are recommended for comparisons across population subgroups, and 2-year averages are recommended for comparisons across adjacent years (see “Standard Errors of Differences of 2-Year Averages.”)


Illustration 14

Suppose the 2005-2007 3-year average percentage of the AIAN population without health insurance was 32.1. Suppose the percentages and bases for 2005, 2006, and 2007 were 30.6, 33.7, and 32.1 percent and 2,251,000, 2,543,000, and 2,745,000, respectively. Use Formulas (2), (15), and (16) and the appropriate parameters, factors, and correlation coefficients from Tables 5, 6, and 7 to get


| Illustration 14 | 2005 | 2006 | 2007 | 2005-2007 avg |
|---|---|---|---|---|
| Percentage of AIAN without health insurance (p) | 30.6 | 33.7 | 32.1 | 32.1 |
| Base (y) | 2,251,000 | 2,543,000 | 2,745,000 | - |
| b parameter (b) | 3,809* | 3,809* | 3,809 | - |
| Correlation (r) | - | - | - | 0.30, 0.30 |
| Standard error | 1.90 | 1.83 | 1.74 | 1.25 |
| 90-percent confidence interval | 27.5 to 33.7 | 30.7 to 36.7 | 29.2 to 35.0 | 30.0 to 34.2 |

*These parameters are calculated by multiplying the year factors from Table 6 by the current parameter.


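The standard error of the 3-year average is calculated as

\[ s_{\bar{x}} = \frac{3.74}{3} \approx 1.25 \]

where

\[ s_x = \sqrt{1.90^2 + 1.83^2 + 1.74^2 + 2(0.30)(1.90 \times 1.83) + 2(0.30)(1.83 \times 1.74)} \approx 3.74 \]

The 90-percent confidence interval for the 3-year average percentage of the AIAN population without health insurance is 32.1 ± 1.645 × 1.25.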
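Formulas (15) and (16) and the 3-year average in Illustration 14 can be sketched in Python as follows. The function name is illustrative. As a small generalization that is an assumption of this sketch, it accepts one correlation per pair of consecutive years; with a common r it reduces to Formula (16).

```python
import math


def se_multi_year_average(yearly_ses, correlations):
    """Formulas (15) and (16): standard error of an n-year average.

    yearly_ses   -- standard errors of the yearly estimates, in year order
    correlations -- correlation between each pair of consecutive years
    """
    n = len(yearly_ses)
    variance_sum = sum(s**2 for s in yearly_ses)
    covariance_sum = sum(
        2 * r * s_this * s_next
        for r, s_this, s_next in zip(correlations, yearly_ses, yearly_ses[1:])
    )
    return math.sqrt(variance_sum + covariance_sum) / n


# Illustration 14: AIAN population without health insurance, 2005 to 2007
se_avg = se_multi_year_average([1.90, 1.83, 1.74], [0.30, 0.30])

print(round(se_avg, 2))
```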


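Standard Errors of Differences of 2-Year Averages. Comparing two non-overlapping 2-year averages also improves precision for comparisons across years. Use the formulas described previously to estimate the standard error, \(s_{x_i}\), of each year’s estimate, \(x_i\), and the standard error, \(s_{\bar{x}_{i,i+1}}\), of each 2-year average, \(\bar{x}_{i,i+1}\). Then the standard error of the difference of the two non-overlapping 2-year averages, \(\bar{x}_{1,2} - \bar{x}_{3,4}\), is

\[ s_{\bar{x}_{1,2} - \bar{x}_{3,4}} = \sqrt{s_{\bar{x}_{1,2}}^2 + s_{\bar{x}_{3,4}}^2 - \frac{1}{2}\, r\, s_{x_2} s_{x_3}} \tag{17} \]

where r is the correlation coefficient between the estimates for the adjacent years 2 and 3.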

Illustration 15

Suppose you want to calculate the standard error of the difference between the 2004, 2005 and 2006, 2007 averages of the percentage of people in California without health insurance. Use the following information along with Formula (2) and Tables 5, 6, and 9 to get



| | 2004 | 2005 | 2006 | 2007 |
|---|---|---|---|---|
| Percentage of people in CA without health insurance (p) | 18.0 | 18.8 | 18.8 | 18.2 |
| Base (y) | 35,854,000 | 35,940,000 | 36,208,000 | 36,295,000 |
| b parameter (b) | 2,652¹ | 2,652¹ | 2,652¹ | 2,652 |
| California state factor | 1.25 | 1.25 | 1.25 | 1.25 |
| State b parameter (\(b_{state}\)) | 3,315 | 3,315 | 3,315 | 3,315 |
| Standard error² | 0.37 | 0.38 | 0.37 | 0.37 |

¹ These parameters are calculated by multiplying the year factors from Table 6 by the current parameter.

² See “Standard Errors of State Estimates” for instructions and illustrations on calculating state standard errors.


Use this information, Formulas (15), (16), and (17), and the appropriate correlation coefficient from Table 7 to get


| Illustration 15 | 2004, 2005 | 2005, 2006 | 2006, 2007 | avg(2006, 2007) - avg(2004, 2005) |
|---|---|---|---|---|
| Average percentage of people in CA without health insurance (\(\bar{x}\)) | 18.4 | - | 18.5 | 0.1 |
| Correlation coefficient (r) | 0.30 | 0.30 | 0.30 | - |
| Standard error | 0.30* | - | 0.30* | 0.40 |
| 90-percent confidence interval | 17.9 to 18.9 | - | 18.0 to 19.0 | -0.6 to 0.8 |

*See “Standard Errors of Data for Combined Years” for instructions and illustrations on calculating these standard errors.


The standard error of the difference of the two 2-year averages is calculated as

\[ \sqrt{0.30^2 + 0.30^2 - \frac{1}{2}(0.30)(0.38)(0.37)} \approx 0.40 \]

and the 90-percent confidence interval around the difference of the 2-year averages is calculated as 0.1 ± 1.645 × 0.40. Since this interval does include zero, we cannot conclude with 90 percent confidence that the 2006-2007 average percentage of people in California without health insurance was higher than the 2004-2005 average percentage of people in California without health insurance.
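The calculation in Illustration 15 can be sketched in Python as follows. The function names are illustrative. The sketch carries unrounded 2-year standard errors into Formula (17), whereas the illustration uses the rounded value 0.30; both give the same result to two decimal places.

```python
import math


def se_two_year_average(s_first, s_second, r):
    """Formulas (15) and (16) with n = 2."""
    return math.sqrt(s_first**2 + s_second**2 + 2 * r * s_first * s_second) / 2


def se_difference_of_averages(s_avg_12, s_avg_34, s_2, s_3, r_23):
    """Formula (17): standard error of the difference of two 2-year averages."""
    return math.sqrt(s_avg_12**2 + s_avg_34**2 - 0.5 * r_23 * s_2 * s_3)


# Illustration 15: yearly state standard errors for 2004, 2005, 2006, 2007
s_2004, s_2005, s_2006, s_2007 = 0.37, 0.38, 0.37, 0.37
r = 0.30

s_avg_0405 = se_two_year_average(s_2004, s_2005, r)
s_avg_0607 = se_two_year_average(s_2006, s_2007, r)
se_diff = se_difference_of_averages(s_avg_0405, s_avg_0607, s_2005, s_2006, r)

difference = 0.1
interval = (difference - 1.645 * se_diff, difference + 1.645 * se_diff)

print(round(s_avg_0405, 2), round(s_avg_0607, 2), round(se_diff, 2), interval)
```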


Standard Errors of Quarterly or Yearly Averages. For information on calculating standard errors for labor force data from the CPS which involve quarterly or yearly averages, please see the “Explanatory Notes and Estimates of Error: Household Data” section in Employment and Earnings, a monthly report published by the U.S. Bureau of Labor Statistics.


Technical Assistance. If you require assistance or additional information, please contact the Demographic Statistical Methods Division via e-mail at [email protected].


Table 4. Parameters for Computation of Standard Errors for Labor Force Characteristics: March 2008

| Characteristic | a | b |
|---|---|---|
| Total or White | | |
| Civilian labor force, employed | -0.000016 | 3,068 |
| Not in labor force | -0.000009 | 1,833 |
| Unemployed | -0.000016 | 3,096 |
| Civilian labor force, employed, not in labor force, and unemployed | | |
| Men | -0.000032 | 2,971 |
| Women | -0.000031 | 2,782 |
| Both sexes, 16 to 19 years | -0.000022 | 3,096 |
| Black | | |
| Civilian labor force, employed, not in labor force, and unemployed | -0.000151 | 3,455 |
| Men | -0.000311 | 3,357 |
| Women | -0.000252 | 3,062 |
| Both sexes, 16 to 19 years | -0.001632 | 3,455 |
| Hispanic | | |
| Civilian labor force, employed, not in labor force, and unemployed | -0.000141 | 3,455 |
| Men | -0.000253 | 3,357 |
| Women | -0.000266 | 3,062 |
| Both sexes, 16 to 19 years | -0.001528 | 3,455 |
| Asian, AIAN, NHOPI | | |
| Civilian labor force, employed, not in labor force, and unemployed | -0.000346 | 3,198 |
| Men | -0.000729 | 3,198 |
| Women | -0.000659 | 3,198 |
| Both sexes, 16 to 19 years | -0.004146 | 3,198 |





NOTES: (1) These parameters are to be applied to basic CPS monthly labor force estimates.

(2) AIAN, NHOPI are American Indian and Alaska Native, Native Hawaiian and Other Pacific Islander, respectively.

(3) The Total or White, Black, and Asian, AIAN, NHOPI parameters are to be used for both alone and in-combination race group estimates.

(4) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and ethnicity, please see the “Generalized Variance Parameters” section.

(5) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Black, Hispanic, and Asian, AIAN, NHOPI.

(6) For nonmetropolitan characteristics, multiply the a and b parameters by 1.5. If the characteristic of interest is total state population, not subtotaled by race or ancestry, the a and b parameters are zero.

(7) For the group self-classified as having two or more races, use the Black parameters for all employment and unemployment characteristics.

(8) To obtain parameters prior to 2008, multiply the parameter from this table by the appropriate year factor in Table 6.


Table 5. Parameters for Computation of Standard Errors for People and Families: 2008 ASEC

| Characteristics | Total or White a | Total or White b | Black a | Black b | Asian, AIAN, NHOPI a | Asian, AIAN, NHOPI b | Hispanic a | Hispanic b |
|---|---|---|---|---|---|---|---|---|
| PEOPLE | | | | | | | | |
| Educational attainment | -0.000005 | 1,206 | -0.000030 | 1,364 | -0.000065 | 1,101 | -0.000025 | 922 |
| Employment | -0.000016 | 3,068 | -0.000151 | 3,455 | -0.000346 | 3,198 | -0.000141 | 3,455 |
| People by family income | -0.000010 | 2,494 | -0.000063 | 2,855 | -0.000169 | 2,855 | -0.000078 | 2,855 |
| Income | | | | | | | | |
| Total | -0.000005 | 1,249 | -0.000032 | 1,430 | -0.000084 | 1,430 | -0.000039 | 1,430 |
| Male | -0.000011 | 1,249 | -0.000068 | 1,430 | -0.000176 | 1,430 | -0.000076 | 1,430 |
| Female | -0.000010 | 1,249 | -0.000059 | 1,430 | -0.000163 | 1,430 | -0.000080 | 1,430 |
| Age | | | | | | | | |
| 15 to 24 | -0.000030 | 1,249 | -0.000146 | 1,430 | -0.000404 | 1,430 | -0.000126 | 1,430 |
| 25 to 44 | -0.000015 | 1,249 | -0.000083 | 1,430 | -0.000208 | 1,430 | -0.000095 | 1,430 |
| 45 to 64 | -0.000016 | 1,249 | -0.000107 | 1,430 | -0.000301 | 1,430 | -0.000186 | 1,430 |
| 65 and over | -0.000034 | 1,249 | -0.000291 | 1,430 | -0.000806 | 1,430 | -0.000560 | 1,430 |
| Health insurance | -0.000009 | 2,652 | -0.000064 | 3,809 | -0.000174 | 3,809 | -0.000083 | 3,809 |
| Marital status, household, and family | | | | | | | | |
| Some household members | -0.000009 | 2,652 | -0.000064 | 3,809 | -0.000174 | 3,809 | -0.000083 | 3,809 |
| All household members | -0.000011 | 3,222 | -0.000094 | 5,617 | -0.000256 | 5,617 | -0.000122 | 5,617 |
| Mobility (movers) | | | | | | | | |
| Educational attainment, labor force, marital status, HH, family, and income | -0.000005 | 1,460 | -0.000025 | 1,460 | -0.000067 | 1,460 | -0.000032 | 1,460 |
| US, county, state, region, or MSA | -0.000013 | 3,965 | -0.000067 | 3,965 | -0.000181 | 3,965 | -0.000086 | 3,965 |
| Below poverty | | | | | | | | |
| Total | -0.000018 | 5,282 | -0.000089 | 5,282 | -0.000241 | 5,282 | -0.000115 | 5,282 |
| Male | -0.000036 | 5,282 | -0.000188 | 5,282 | -0.000495 | 5,282 | -0.000224 | 5,282 |
| Female | -0.000035 | 5,282 | -0.000168 | 5,282 | -0.000470 | 5,282 | -0.000236 | 5,282 |
| Age | | | | | | | | |
| Under 15 | -0.000066 | 4,072 | -0.000273 | 4,072 | -0.000723 | 4,072 | -0.000288 | 4,072 |
| Under 18 | -0.000050 | 4,072 | -0.000208 | 4,072 | -0.000586 | 4,072 | -0.000238 | 4,072 |
| 15 and over | -0.000022 | 5,282 | -0.000117 | 5,282 | -0.000312 | 5,282 | -0.000144 | 5,282 |
| 15 to 24 | -0.000048 | 1,998 | -0.000204 | 1,998 | -0.000564 | 1,998 | -0.000175 | 1,998 |
| 25 to 44 | -0.000024 | 1,998 | -0.000115 | 1,998 | -0.000291 | 1,998 | -0.000133 | 1,998 |
| 45 to 64 | -0.000026 | 1,998 | -0.000150 | 1,998 | -0.000421 | 1,998 | -0.000260 | 1,998 |
| 65 and over | -0.000054 | 1,998 | -0.000407 | 1,998 | -0.001127 | 1,998 | -0.000782 | 1,998 |
| Unemployment | -0.000016 | 3,096 | -0.000151 | 3,455 | -0.000346 | 3,198 | -0.000141 | 3,455 |
| FAMILIES, HOUSEHOLDS, OR UNRELATED INDIVIDUALS | | | | | | | | |
| Income | -0.000005 | 1,140 | -0.000027 | 1,245 | -0.000074 | 1,245 | -0.000034 | 1,245 |
| Marital status, household, and family, educational attainment, population by age/sex | -0.000004 | 1,052 | -0.000021 | 952 | -0.000056 | 952 | -0.000026 | 952 |
| Poverty | +0.000052 | 1,243 | +0.000052 | 1,243 | +0.000052 | 1,243 | +0.000052 | 1,243 |


NOTES: (1) These parameters are to be applied to the 2008 Annual Social and Economic Supplement data.

(2) AIAN, NHOPI are American Indian and Alaska Native, Native Hawaiian and Other Pacific Islander, respectively.

(3) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and ethnicity, please see the “Generalized Variance Parameters” section.

(4) The Total or White, Black, and Asian, AIAN, NHOPI parameters are to be used for both alone and in-combination race group estimates.

(5) For nonmetropolitan characteristics, multiply the a and b parameters by 1.5. If the characteristic of interest is total state population, not subtotaled by race or ancestry, the a and b parameters are zero.

(6) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Black, Asian, AIAN, NHOPI, and Hispanic.

(7) For the group self-classified as having two or more races, use the Asian, AIAN, NHOPI parameters for all characteristics except employment, unemployment, and educational attainment, in which case use Black parameters.

(8) To obtain parameters prior to 2008, multiply the parameter from this table by the appropriate year factor in Table 6.
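The adjustments in notes (5) and (6) can be sketched in Python as follows. This is a minimal illustration, not an official procedure: it assumes the generalized variance formula s_x = sqrt(ax² + bx) described in the “Standard Errors of Estimated Numbers” section, the function names are hypothetical, the estimate of 10,000,000 people is made up, and the parameter pair is the first pair in the "Below poverty, Total" row, assumed here to belong to the Total or White column.

```python
import math

# Adjustment factors quoted in notes (5) and (6) of Table 5.
NONMETROPOLITAN_FACTOR = 1.5
FOREIGN_BORN_TOTAL_OR_WHITE_FACTOR = 1.3


def scale_parameters(a, b, factor):
    """Multiply both generalized variance parameters by one adjustment factor."""
    return a * factor, b * factor


def standard_error_of_number(x, a, b):
    """Standard error of an estimated number x, assuming s_x = sqrt(a*x^2 + b*x)."""
    variance = a * x ** 2 + b * x
    if variance < 0:
        raise ValueError("a*x^2 + b*x is negative; check the parameters and the estimate")
    return math.sqrt(variance)


# Hypothetical example: 10,000,000 people below poverty in nonmetropolitan areas.
# a = -0.000018 and b = 5,282 are assumed to be the Total or White parameters.
a, b = scale_parameters(-0.000018, 5282, NONMETROPOLITAN_FACTOR)
se = standard_error_of_number(10_000_000, a, b)
print(f"a = {a:.6f}, b = {b:.0f}, standard error = {se:,.0f}")
```

The sketch applies one factor at a time because the notes do not say how the nonmetropolitan and foreign-born adjustments combine.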



Table 6. CPS Year Factors: ASEC 1947 to 2007

Data collection period    Total or White    Black       Black     Hispanic
                          a and b           a and b     a*        a and b

2003 – 2007               1.00              1.00        1.00      1.00
2001 (expanded) – 2002    1.00              1.00        1.53      1.00
1996 – 2001 (basic)       1.97              1.97        3.00      1.97
1990 – 1995               1.82              1.82        2.78      1.82
1989                      2.02              2.02        3.09      2.12
1985 – 1988               1.70              1.70        2.60      1.70
1982 – 1984               1.70              1.70        2.60      2.38
1973 – 1981               1.52              1.52        2.32      2.13
1967 – 1972               1.52              1.52        2.32      3.58
1957 – 1966               2.28              2.28        3.48      5.38
1947 – 1956               3.42              3.42        5.22      8.07

NOTES: (1) Separate factors are given for the a and b parameters for Blacks because of the new race definitions and their effect on the population control totals.

(2) Use the asterisked factor to obtain a parameters for all estimates of the Black population except those for Black families, households, and unrelated individuals in poverty.

(3) For races not listed, use the factor for Total or White.

(4) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and ethnicity, please see the “Generalized Variance Parameters” section.
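As an illustration of how the year factors might be applied, the Python sketch below multiplies a pair of 2008 parameters by the Table 6 factor for an earlier collection period. The function name, the dictionary keys, and the `black_poverty_family_unit` flag are hypothetical. The handling of the Black columns is one reading of notes (1) and (2): the asterisked factor is used for the a parameter unless the estimate is for Black families, households, or unrelated individuals in poverty, and the unasterisked factor is used otherwise.

```python
# Year factors transcribed from Table 6, keyed by data collection period.
# Each tuple is (Total or White, Black "a and b", Black "a*", Hispanic).
YEAR_FACTORS = {
    "2003-2007": (1.00, 1.00, 1.00, 1.00),
    "2001 (expanded)-2002": (1.00, 1.00, 1.53, 1.00),
    "1996-2001 (basic)": (1.97, 1.97, 3.00, 1.97),
    "1990-1995": (1.82, 1.82, 2.78, 1.82),
    "1989": (2.02, 2.02, 3.09, 2.12),
    "1985-1988": (1.70, 1.70, 2.60, 1.70),
    "1982-1984": (1.70, 1.70, 2.60, 2.38),
    "1973-1981": (1.52, 1.52, 2.32, 2.13),
    "1967-1972": (1.52, 1.52, 2.32, 3.58),
    "1957-1966": (2.28, 2.28, 3.48, 5.38),
    "1947-1956": (3.42, 3.42, 5.22, 8.07),
}


def prior_year_parameters(a, b, period, group="total_or_white",
                          black_poverty_family_unit=False):
    """Scale 2008 a and b parameters to an earlier data collection period."""
    total_or_white, black, black_a_star, hispanic = YEAR_FACTORS[period]
    if group == "black":
        b_factor = black
        # Note (2): a* applies to the a parameter, except for Black families,
        # households, and unrelated individuals in poverty.
        a_factor = black if black_poverty_family_unit else black_a_star
    elif group == "hispanic":
        a_factor = b_factor = hispanic
    else:
        # Note (3): races not listed use the Total or White factor.
        a_factor = b_factor = total_or_white
    return a * a_factor, b * b_factor


# Example: a = -0.000018, b = 5,282 scaled to the 1990-1995 period.
a_1995, b_1995 = prior_year_parameters(-0.000018, 5282, "1990-1995")
print(a_1995, b_1995)
```

Keying the table by the period label rather than by year avoids having to decide in code whether a 2001 estimate comes from the basic or the expanded sample.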



Table 7. CPS Year-to-Year Correlation Coefficients for Income Characteristics: ASEC 1961 to 2008

                  1961-2001 (basic)           2000 (basic)-
                  or 2001 (expanded)-2008     2001 (expanded)
Characteristics   People    Families          People    Families

Total             0.30      0.35              0.19      0.22
White             0.30      0.35              0.20      0.23
Black             0.30      0.35              0.15      0.18
Other             0.30      0.35              0.15      0.17
Hispanic          0.45      0.55              0.36      0.28

NOTES: (1) Correlation coefficients are not available for income data before 1961.

(2) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and ethnicity, please see the “Generalized Variance Parameters” section.

(3) These correlation coefficients are for comparisons of consecutive years. For comparisons of nonconsecutive years, assume the correlation is zero.

(4) For households and unrelated individuals, use the correlation coefficient for families.
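The way these coefficients enter a year-to-year comparison can be sketched in Python. The sketch uses the usual formula for the standard error of a difference of two correlated estimates, sqrt(s1² + s2² - 2·r·s1·s2), and covers only the "1961-2001 (basic) or 2001 (expanded)-2008" columns of Table 7. The function names and the two input standard errors (100 and 120) are hypothetical.

```python
import math

# Table 7 coefficients for 1961-2001 (basic) or 2001 (expanded)-2008,
# stored as (people, families).
INCOME_CORRELATION = {
    "Total": (0.30, 0.35),
    "White": (0.30, 0.35),
    "Black": (0.30, 0.35),
    "Other": (0.30, 0.35),
    "Hispanic": (0.45, 0.55),
}


def income_correlation(group, unit, consecutive_years=True):
    """Look up r for a group; unit is 'people' or anything else (families value)."""
    if not consecutive_years:
        return 0.0  # note (3): assume zero correlation for nonconsecutive years
    people, families = INCOME_CORRELATION[group]
    # Note (4): households and unrelated individuals use the families coefficient.
    return people if unit == "people" else families


def standard_error_of_difference(se1, se2, r):
    """Standard error of x1 - x2 when the two estimates have correlation r."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)


# Hypothetical standard errors of two family income estimates.
r = income_correlation("Total", "families")
se_consecutive = standard_error_of_difference(100.0, 120.0, r)
se_nonconsecutive = standard_error_of_difference(
    100.0, 120.0, income_correlation("Total", "families", consecutive_years=False)
)
print(round(se_consecutive, 2), round(se_nonconsecutive, 2))
```

A positive correlation lowers the standard error of the difference, so treating consecutive years as uncorrelated would overstate it.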



Table 8. CPS Year-to-Year Correlation Coefficients for Poverty Characteristics: ASEC 1971 to 2008

                  1973-84, 1985-2001      2000 (basic)-           1984-1985               1972-1973               1971-1972
                  (basic) or 2001         2001 (expanded)
                  (expanded)-2008
Characteristics   People    Families      People    Families      People    Families      People    Families      People    Families

Total             0.45      0.35          0.29      0.22          0.39      0.30          0.15      0.14          0.31      0.28
White             0.35      0.30          0.23      0.20          0.30      0.26          0.14      0.13          0.28      0.25
Black             0.45      0.35          0.23      0.18          0.39      0.30          0.17      0.16          0.35      0.32
Other             0.45      0.35          0.22      0.17          0.30      0.30          0.17      0.16          0.35      0.32
Hispanic          0.65      0.55          0.52      0.40          0.56      0.47          0.17      0.16          0.35      0.32

NOTES: (1) Correlation coefficients are not available for poverty data before 1971.

(2) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and ethnicity, please see the “Generalized Variance Parameters” section.

(3) These correlation coefficients are for comparisons of consecutive years. For comparisons of nonconsecutive years, assume the correlation is zero.

(4) For households and unrelated individuals, use the correlation coefficient for families.



Table 9. State Populations and Factors for State Parameters and Standard Errors: 2008 ASEC

State                 Factor  Population    State           Factor  Population

Alabama               1.05     4,573,648    Montana         0.24       948,609
Alaska                0.18       662,694    Nebraska        0.46     1,749,305
Arizona               1.23     6,343,671    Nevada          0.67     2,579,307
Arkansas              0.68     2,797,557    New Hampshire   0.34     1,302,926
California            1.25    36,174,702    New Jersey      1.12     8,587,595
Colorado              1.20     4,837,095    New Mexico      0.58     1,958,069
Connecticut           0.88     3,446,589    New York        1.17    19,039,135
Delaware              0.22       856,960    North Carolina  1.11     8,980,550
District of Columbia  0.18       578,556    North Dakota    0.16       624,208
Florida               1.12    18,034,137    Ohio            1.09    11,298,197
Georgia               1.08     9,463,484    Oklahoma        0.91     3,553,494
Hawaii                0.29     1,250,217    Oregon          1.01     3,732,455
Idaho                 0.36     1,496,447    Pennsylvania    1.09    12,224,184
Illinois              1.13    12,707,700    Rhode Island    0.30     1,037,893
Indiana               1.08     6,275,241    South Carolina  1.06     4,349,549
Iowa                  0.77     2,948,881    South Dakota    0.17       783,743
Kansas                0.73     2,730,702    Tennessee       1.08     6,102,934
Kentucky              1.05     4,176,352    Texas           1.28    23,744,707
Louisiana             1.05     4,219,629    Utah            0.54     2,664,218
Maine                 0.39     1,302,578    Vermont         0.18       615,618
Maryland              1.13     5,537,556    Virginia        1.08     7,537,276
Massachusetts         1.06     6,369,673    Washington      1.15     6,431,605
Michigan              1.09     9,923,431    West Virginia   0.39     1,787,529
Minnesota             1.07     5,157,769    Wisconsin       1.10     5,538,845
Mississippi           0.71     2,864,017    Wyoming         0.15       520,403
Missouri              1.11     5,793,704

NOTES: (1) The state population counts in this table are for the 0+ population.

(2) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Black, Asian, AIAN, NHOPI, and Hispanic.
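A hedged Python sketch of how the state factors and populations might be used follows. It assumes the procedure described in the “Standard Errors of State Estimates” section, which is not part of this table: the state-level b parameter is the national b multiplied by the state factor, and the state-level a parameter is the national a multiplied by the state factor when a is positive, or -b_state / POP when a is negative. The function name is hypothetical, only three states are transcribed, and treating a = 0 like the positive case is an assumption of this sketch.

```python
# (factor, population) pairs transcribed from Table 9 for a few states.
STATE_TABLE = {
    "California": (1.25, 36_174_702),
    "Texas": (1.28, 23_744_707),
    "Wyoming": (0.15, 520_403),
}


def state_parameters(a, b, state):
    """Derive state-level a and b from national parameters (assumed rule)."""
    factor, population = STATE_TABLE[state]
    b_state = b * factor
    if a >= 0:
        # Positive (or zero) national a: scale by the state factor.
        a_state = a * factor
    else:
        # Negative national a: recompute from the state b and state population.
        a_state = -b_state / population
    return a_state, b_state


# Example: national a = -0.000018, b = 5,282 applied to California.
a_ca, b_ca = state_parameters(-0.000018, 5282, "California")
print(f"a_state = {a_ca:.9f}, b_state = {b_ca:.1f}")
```

The resulting a_state and b_state would then be used in the same standard error formulas as the national parameters.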



Table 10. Regional Populations and Factors for Regional Parameters and Standard Errors: 2008 ASEC

Region       Factor    Population

Midwest      1.03      65,531,726
Northeast    1.05      53,926,191
South        1.08     109,157,935
West         1.10      69,599,492

NOTES: (1) The regional population counts in this table are for the 0+ population.

(2) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Black, Asian, AIAN, NHOPI, and Hispanic.

References


[1] Bureau of Labor Statistics. 1994. Employment and Earnings. Volume 41 Number 5, May 1994. Washington, DC: Government Printing Office.


[2] U.S. Census Bureau. 2006. Current Population Survey: Design and Methodology. Technical Paper 66. Washington, DC: Government Printing Office. (http://www.census.gov/prod/2006pubs/tp66.pdf)


[3] U.S. Census Bureau. 1993. Money Income of Households, Families, and Persons in the United States: 1992. Current Population Reports, P60-184. Washington, DC: Government Printing Office. (http://www2.census.gov/prod2/popscan/p60-184.pdf)


[4] Brooks, C.A. and Bailar, B.A. 1978. Statistical Policy Working Paper 3 – An Error Profile: Employment as Measured by the Current Population Survey. Subcommittee on Nonsampling Errors, Federal Committee on Statistical Methodology, U.S. Department of Commerce, Washington, DC. (http://www.fcsm.gov/working-papers/spp.html)


[5] U.S. Census Bureau. 1978. Money Income in 1976 of Families and Persons in the United States. Current Population Reports, P60-114. Washington, DC: Government Printing Office. (http://www2.census.gov/prod2/popscan/p60-114.pdf)



1 For detailed information on the 1990 sample redesign, please see reference [1].

2 The PSUs correspond to substate areas (i.e., counties or groups of counties) that are geographically contiguous.

3 For further information on CATI and CAPI and the eligibility criteria, please see reference [2].

4 Estimates of characteristics of the American Indian and Alaska Native (AIAN) and Native Hawaiian and Other Pacific Islander (NHOPI) populations based on a single-year sample would be unreliable due to the small size of the sample that can be drawn from either population. Accordingly, such estimates are based on multi-year averages.

File Type: application/msword
File Title: Source of the Data and Accuracy of the Estimates for the 2008 Annual Social and Economic Supplement Microdata File
Author: Bureau of the Census
Last Modified By: Bureau of the Census
File Modified: 2009-08-03
File Created: 2009-08-03
