ATTACHMENT 17
Source of the Data and Accuracy of the Estimates for the
October 2010 CPS Microdata File on Internet Use
SOURCE OF DATA
The data in this microdata file are from the October 2010 Current Population Survey (CPS). The U.S.
Census Bureau conducts the CPS every month, although this file has only October data. The October
survey uses two sets of questions, the basic CPS and a set of supplemental questions. The CPS,
sponsored jointly by the Census Bureau and the U.S. Bureau of Labor Statistics, is the country’s primary
source of labor force statistics for the entire population. The National Telecommunications and
Information Administration sponsors the supplemental questions for October.
Basic CPS. The monthly CPS collects primarily labor force data about the civilian noninstitutionalized
population living in the United States. The institutionalized population, which is excluded from the
population universe, is composed primarily of the population in correctional institutions and nursing
homes (91 percent of the 4.1 million institutionalized people in Census 2000). Interviewers ask questions
concerning labor force participation about each member 15 years old and over in sample households.
Typically, the week containing the nineteenth of the month is the interview week. The week containing
the twelfth is the reference week (i.e., the week about which the labor force questions are asked).
The CPS uses a multistage probability sample based on the results of the decennial census, with coverage
in all 50 states and the District of Columbia. The sample is continually updated to account for new
residential construction. When files from the most recent decennial census become available, the Census
Bureau gradually introduces a new sample design for the CPS.
In April 2004, the Census Bureau began phasing out the 1990 sample¹ and replacing it with the 2000
sample, creating a mixed sampling frame. Two simultaneous changes occurred during this phase-in
period. First, primary sampling units (PSUs)² selected for only the 2000 design gradually replaced those
selected for the 1990 design. This involved 10 percent of the sample. Second, within PSUs selected for
both the 1990 and 2000 designs, sample households from the 2000 design gradually replaced sample
households from the 1990 design. This involved about 90 percent of the sample. The new sample design
was completely implemented by July 2005.
In the first stage of the sampling process, PSUs are selected for sample. The United States is divided into
2,025 PSUs. The PSUs were redefined for this design to correspond to the Office of Management and
Budget definitions of Core-Based Statistical Area definitions and to improve efficiency in field
operations. These PSUs are grouped into 824 strata. Within each stratum, a single PSU is chosen for the
sample, with its probability of selection proportional to its population as of the most recent decennial
census. This PSU represents the entire stratum from which it was selected. In the case of strata
consisting of only one PSU, the PSU is chosen with certainty.
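The first-stage selection described above can be sketched in a few lines. This is only an illustration of choosing one PSU per stratum with probability proportional to population; the function name, the PSU identifiers, and the population figures are hypothetical and are not taken from the CPS sampling frame.

```python
import random


def select_psu(stratum, rng):
    """Pick one PSU from a stratum given as a list of (psu_id, population).

    A stratum with a single PSU is selected with certainty; otherwise the
    draw is proportional to each PSU's population.
    """
    if len(stratum) == 1:
        return stratum[0][0]
    ids = [psu_id for psu_id, _ in stratum]
    populations = [population for _, population in stratum]
    return rng.choices(ids, weights=populations, k=1)[0]


# Hypothetical strata for illustration only.
rng = random.Random(2010)
certainty_stratum = [("PSU-A", 1_200_000)]
mixed_stratum = [("PSU-B", 50_000), ("PSU-C", 150_000), ("PSU-D", 100_000)]
print(select_psu(certainty_stratum, rng))  # always PSU-A
print(select_psu(mixed_stratum, rng))      # PSU-C is the most likely draw
```

In this sketch the selected PSU stands in for its whole stratum, which mirrors the role the text assigns to it.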
¹ For detailed information on the 1990 sample redesign, please see reference [1].
² The PSUs correspond to substate areas (i.e., counties or groups of counties) that are geographically contiguous.
Approximately 72,000 housing units were selected for sample from the sampling frame in October.
Based on eligibility criteria, 11 percent of these housing units were sent directly to computer-assisted
telephone interviewing (CATI). The remaining units were assigned to interviewers for computer-assisted
personal interviewing (CAPI).³ Of all housing units in sample, about 59,000 were determined to be
eligible for interview. Interviewers obtained interviews at about 54,000 of these units. Noninterviews
occur when the occupants are not found at home after repeated calls or are unavailable for some other
reason.
October 2010 Supplement. In October 2010, in addition to the basic CPS questions, interviewers asked
supplementary questions of the civilian noninstitutionalized population three years and older on internet
use.
Estimation Procedure. This survey’s estimation procedure adjusts weighted sample results to agree
with independently derived population estimates of the civilian noninstitutionalized population of the
United States and each state (including the District of Columbia). These population estimates, used as
controls for the CPS, are prepared monthly to agree with the most current set of population estimates that
are released as part of the Census Bureau’s population estimates and projections program.
The population controls for the nation are distributed by demographic characteristics in two ways:
• Age, sex, and race (White alone, Black alone, and all other groups combined).
• Age, sex, and Hispanic origin.
The population controls for the states are distributed by race (Black alone and all other race groups
combined), age (0-15, 16-44, and 45 and over), and sex.
The independent estimates by age, sex, race, and Hispanic origin, and for states by selected age groups
and broad race categories, are developed using the basic demographic accounting formula whereby the
population from the latest decennial data is updated using data on the components of population change
(births, deaths, and net international migration) with net internal migration as an additional component in
the state population estimates.
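The demographic accounting formula mentioned above can be written out as follows. The symbols are assumptions introduced here for illustration (P for population, B for births, D for deaths, M for net migration); the statement itself does not give notation.

```latex
% National estimate at the survey date t, updated from the latest decennial base P_0
P_t = P_0 + B - D + M_{\text{international}}

% State estimate for state s, with net internal migration as the additional component
P_t^{(s)} = P_0^{(s)} + B^{(s)} - D^{(s)} + M_{\text{international}}^{(s)} + M_{\text{internal}}^{(s)}
```

Each component on the right-hand side is accumulated over the period from the census base date to the date of the estimate.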
The net international migration component in the population estimates includes a combination of the
following:
• Legal migration to the United States.
• Emigration of foreign-born and native people from the United States.
• Net movement between the United States and Puerto Rico.
• Estimates of temporary migration.
• Estimates of net residual foreign-born population, which include unauthorized migration.
Because the latest available information on these components lags the survey date, it is necessary to
make short-term projections of these components to develop the estimate for the survey date.
³ For further information on CATI and CAPI and the eligibility criteria, please see reference [2].
ACCURACY OF THE ESTIMATES
A sample survey estimate has two types of error: sampling and nonsampling. The accuracy of an
estimate depends on both types of error. The nature of the sampling error is known given the survey
design; the full extent of the nonsampling error is unknown.
Sampling Error. Since the CPS estimates come from a sample, they may differ from figures from an
enumeration of the entire population using the same questionnaires, instructions, and enumerators. For a
given estimator, the difference between an estimate based on a sample and the estimate that would result
if the sample were to include the entire population is known as sampling error. Standard errors, as
calculated by methods described in “Standard Errors and Their Use,” are primarily measures of the
magnitude of sampling error. However, they may include some nonsampling error.
Nonsampling Error. For a given estimator, the difference between the estimate that would result if the
sample were to include the entire population and the true population value being estimated is known as
nonsampling error. There are several sources of nonsampling error that may occur during the
development or execution of the survey. It can occur because of circumstances created by the
interviewer, the respondent, the survey instrument, or the way the data are collected and processed. For
example, errors could occur because:
• The interviewer records the wrong answer, the respondent provides incorrect information,
  the respondent estimates the requested information, or an unclear survey question is
  misunderstood by the respondent (measurement error).
• Some individuals who should have been included in the survey frame were missed
  (coverage error).
• Responses are not collected from all those in the sample or the respondent is unwilling to
  provide information (nonresponse error).
• Values are estimated imprecisely for missing data (imputation error).
• Forms may be lost, data may be incorrectly keyed, coded, or recoded, etc. (processing
  error).
To minimize these errors, the Census Bureau applies quality control procedures during all stages of the
production process including the design of the survey, the wording of questions, the review of the work
of interviewers and coders, and the statistical review of reports.
Two types of nonsampling error that can be examined to a limited extent are nonresponse and
undercoverage.
Nonresponse. The effect of nonresponse cannot be measured directly, but one indication of its potential
effect is the nonresponse rate. For the October 2010 basic CPS, the household-level nonresponse rate
was 8.5 percent. The person-level nonresponse rate for the Internet Use supplement was an additional
7.4 percent.
Since the basic CPS nonresponse rate is a household-level rate and the Internet Use supplement
nonresponse rate is a person-level rate, we cannot combine these rates to derive an overall nonresponse
rate. Nonresponding households may have fewer persons than interviewed ones, so combining these
rates may lead to an overestimate of the true overall nonresponse rate for persons for the Internet Use
supplement.
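To make the caution above concrete, the short sketch below shows what a naive combination of the two rates would give. It assumes, contrary to the caveat in the text, that nonresponding and interviewed households contain the same number of persons, so the result should be read only as a rough and probably overstated figure, not as a published rate.

```python
# Rates quoted in this statement for October 2010.
household_nonresponse = 0.085   # basic CPS, household level
supplement_nonresponse = 0.074  # Internet Use supplement, person level

# Naive combination: a person is "missed" if the household did not respond,
# or the household responded but the person's supplement was not completed.
# Assumption (not from the statement): equal household sizes in both groups.
naive_overall = 1 - (1 - household_nonresponse) * (1 - supplement_nonresponse)
print(f"{naive_overall:.1%}")  # 15.3%
```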
Coverage. The concept of coverage in the survey sampling process is the extent to which the total
population that could be selected for sample “covers” the survey’s target population. Missed housing
units and missed people within sample households create undercoverage in the CPS. Overall CPS
undercoverage for October 2010 is estimated to be about 12 percent. CPS coverage varies with age, sex,
and race. Generally, coverage is larger for females than for males and larger for non-Blacks than for
Blacks. This differential coverage is a general problem for most household-based surveys.
The CPS weighting procedure partially corrects for bias from undercoverage, but biases may still be
present when people who are missed by the survey differ from those interviewed in ways other than age,
race, sex, Hispanic origin, and state of residence. How this weighting procedure affects other variables
in the survey is not precisely known. All of these considerations affect comparisons across different
surveys or data sources.
A common measure of survey coverage is the coverage ratio, calculated as the estimated population
before poststratification divided by the independent population control. Table 1 shows October 2010
CPS coverage ratios by age and sex for certain race and Hispanic groups. The CPS coverage ratios can
exhibit some variability from month to month.
Table 1. CPS Coverage Ratios: October 2010

Age      All        Total         White only      Black only     Residual race     Hispanic
group    people   Male  Female   Male  Female   Male  Female   Male  Female   Male  Female
0-15     0.87     0.88  0.86     0.90  0.88     0.82  0.76     0.85  0.86     0.87  0.84
16-19    0.85     0.88  0.83     0.89  0.83     0.84  0.80     0.80  0.89     0.92  0.90
20-24    0.78     0.77  0.79     0.80  0.80     0.63  0.73     0.77  0.81     0.83  0.86
25-34    0.82     0.79  0.86     0.81  0.87     0.68  0.80     0.77  0.81     0.77  0.91
35-44    0.89     0.87  0.92     0.88  0.93     0.79  0.85     0.83  0.94     0.77  0.95
45-54    0.89     0.88  0.91     0.88  0.91     0.83  0.86     0.97  0.92     0.85  0.98
55-64    0.90     0.90  0.91     0.90  0.91     0.89  0.87     0.95  0.98     0.85  0.81
65+      0.94     0.94  0.94     0.94  0.95     0.89  0.94     0.93  0.83     0.74  0.88
15+      0.88     0.87  0.89     0.88  0.90     0.79  0.84     0.86  0.89     0.81  0.91
0+       0.88     0.87  0.88     0.88  0.90     0.80  0.82     0.86  0.88     0.83  0.89

Notes: (1) The Residual race group includes cases indicating a single race other than White or Black,
and cases indicating two or more races.
(2) Hispanics may be any race. For a more detailed discussion on the use of parameters for
race and ethnicity, please see the “Generalized Variance Parameters” section.
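The coverage ratio defined above is a simple quotient, sketched below. The function names and the input totals are hypothetical; they are chosen so that the ratio equals 0.88, which corresponds to the roughly 12 percent overall undercoverage cited in this section.

```python
def coverage_ratio(estimate_before_poststratification, population_control):
    """Estimated population before poststratification over the independent control."""
    return estimate_before_poststratification / population_control


def undercoverage_percent(ratio):
    """Undercoverage implied by a coverage ratio, in percent."""
    return (1.0 - ratio) * 100.0


# Hypothetical totals for illustration only.
ratio = coverage_ratio(880_000, 1_000_000)
print(ratio, round(undercoverage_percent(ratio)))  # 0.88 12
```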
Comparability of Data. Data obtained from the CPS and other sources are not entirely comparable.
This results from differences in interviewer training and experience and in differing survey processes.
This is an example of nonsampling variability not reflected in the standard errors. Therefore, caution
should be used when comparing results from different sources.
Data users should be careful when comparing the data from this microdata file, which reflects Census
2000-based controls, with microdata files from March 1994 through December 2002, which reflect 1990
census-based controls. Ideally, the same population controls should be used when comparing any
estimates. In reality, the use of the same population controls is not practical when comparing trend data
over a period of 10 to 20 years. Thus, when it is necessary to combine or compare data based on
different controls or different designs, data users should be aware that changes in weighting controls or
weighting procedures can create small differences between estimates. See the discussion following for
information on comparing estimates derived from different controls or different sample designs.
Microdata files from previous years reflect the latest available census-based controls. Although the most
recent change in population controls had relatively little impact on summary measures such as averages,
medians, and percentage distributions, it did have a significant impact on levels. For example, use of
Census 2000-based controls results in about a 1 percent increase from the 1990 census-based controls in
the civilian noninstitutionalized population and in the number of families and households. Thus,
estimates of levels for data collected in 2003 and later years will differ from those for earlier years by
more than what could be attributed to actual changes in the population. These differences could be
disproportionately greater for certain population subgroups than for the total population.
Note that certain microdata files from 2002, namely June, October, November, and the 2002 ASEC,
contain both Census 2000-based estimates and 1990 census-based estimates and are subject to the
comparability issues discussed previously. All other microdata files from 2002 reflect the 1990 census-based controls.
Users should also exercise caution because of changes caused by the phase-in of the Census 2000 files
(see “Basic CPS”). During this time period, CPS data were collected from sample designs based on
different censuses. Three features of the new CPS design have the potential of affecting published
estimates: (1) the temporary disruption of the rotation pattern from August 2004 through June 2005 for a
comparatively small portion of the sample, (2) the change in sample areas, and (3) the introduction of the
new Core-Based Statistical Areas (formerly called metropolitan areas). Most of the known effect on
estimates during and after the sample redesign will be the result of changing from 1990 to 2000
geographic definitions. Research has shown that the national-level estimates of the metropolitan and
nonmetropolitan populations should not change appreciably because of the new sample design.
However, users should still exercise caution when comparing metropolitan and nonmetropolitan
estimates across years with a design change, especially at the state level.
Caution should also be used when comparing Hispanic estimates over time. No independent population
control totals for people of Hispanic origin were used before 1985.
A Nonsampling Error Warning. Since the full extent of the nonsampling error is unknown, one should
be particularly careful when interpreting results based on small differences between estimates. The
Census Bureau recommends that data users incorporate information about nonsampling errors into their
analyses, as nonsampling error could impact the conclusions drawn from the results. Caution should also
be used when interpreting results based on a relatively small number of cases. Summary measures (such
as medians and percentage distributions) probably do not reveal useful information when computed on a
subpopulation smaller than 75,000.
For additional information on nonsampling error including the possible impact on CPS data when known,
refer to references [2] and [3].
Standard Errors and Their Use. The sample estimate and its standard error enable one to construct a
confidence interval. A confidence interval is a range about a given estimate that has a specified
probability of containing the average result of all possible samples. For example, if all possible samples
were surveyed under essentially the same general conditions and using the same sample design, and if an
estimate and its standard error were calculated from each sample, then approximately 90 percent of the
intervals from 1.645 standard errors below the estimate to 1.645 standard errors above the estimate would
include the average result of all possible samples.
A particular confidence interval may or may not contain the average estimate derived from all possible
samples, but one can say with specified confidence that the interval includes the average estimate
calculated from all possible samples.
Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing between
population parameters using sample estimates. The most common type of hypothesis is that the
population parameters are different. An example of this would be comparing the percentage of men who
were part-time workers to the percentage of women who were part-time workers.
Tests may be performed at various levels of significance. A significance level is the probability of
concluding that the characteristics are different when, in fact, they are the same. For example, to
conclude that two characteristics are different at the 0.10 level of significance, the absolute value of the
estimated difference between characteristics must be greater than or equal to 1.645 times the standard
error of the difference.
The Census Bureau uses 90-percent confidence intervals and 0.10 levels of significance to determine
statistical validity. Consult standard statistical textbooks for alternative criteria.
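The decision rule for the 0.10 level of significance can be sketched as a one-line comparison. The function name and the sample figures below are hypothetical and serve only to show the rule in use.

```python
Z_10_PERCENT = 1.645  # critical value used in this statement for the 0.10 level


def is_significant(estimate_1, estimate_2, se_of_difference, z=Z_10_PERCENT):
    """True when the absolute difference is at least z standard errors."""
    return abs(estimate_1 - estimate_2) >= z * se_of_difference


# Hypothetical percentages and standard error of their difference.
print(is_significant(30.0, 27.5, 1.0))  # True: 2.5 >= 1.645
print(is_significant(30.0, 29.0, 1.0))  # False: 1.0 < 1.645
```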
Estimating Standard Errors. The Census Bureau uses replication methods to estimate the standard
errors of CPS estimates. These methods primarily measure the magnitude of sampling error. However,
they do measure some effects of nonsampling error as well. They do not measure systematic biases in
the data associated with nonsampling error. Bias is the average over all possible samples of the
differences between the sample estimates and the true value.
Generalized Variance Parameters. While it is possible to compute and present an estimate of the
standard error based on the survey data for each estimate in a report, there are a number of reasons why
this is not done. A presentation of the individual standard errors would be of limited use, since one could
not possibly predict all of the combinations of results that may be of interest to data users. Additionally,
data users have access to CPS microdata files, and it is impossible to compute in advance the standard
error for every estimate one might obtain from those data sets. Moreover, variance estimates are based
on sample data and have variances of their own. Therefore, some methods of stabilizing these estimates
of variance, for example, by generalizing or averaging over time, may be used to improve their
reliability.
Experience has shown that certain groups of estimates have similar relationships between their variances
and expected values. Modeling or generalizing may provide more stable variance estimates by taking
advantage of these similarities. The generalized variance function is a simple model that expresses the
variance as a function of the expected value of the survey estimate. The parameters of the generalized
variance function are estimated using direct replicate variances. These generalized variance parameters
provide a relatively easy method to obtain approximate standard errors for numerous characteristics. In
this source and accuracy statement, Table 3 provides the generalized variance parameters for labor force
estimates, and Table 4 provides generalized variance parameters for characteristics from the October
2010 supplement. Also, tables are provided that allow the calculation of parameters for U.S. states and
regions. Tables 5 and 6 provide factors and population controls to derive U.S. state and regional
parameters.
The basic CPS questionnaire records the race and ethnicity of each respondent. With respect to race, a
respondent can be White, Black, Asian, American Indian and Alaska Native (AIAN), Native Hawaiian
and Other Pacific Islander (NHOPI), or combinations of two or more of the preceding. A respondent’s
ethnicity can be Hispanic or non-Hispanic, regardless of race.
The generalized variance parameters to use in computing standard errors are dependent upon the
race/ethnicity group of interest. The following table summarizes the relationship between the
race/ethnicity group of interest and the generalized variance parameters to use in standard error
calculations.
Table 2. Estimation Groups of Interest and Generalized Variance Parameters

Race/ethnicity group of interest                                Generalized variance parameters to
                                                                use in standard error calculations
Total population                                                Total or White
White alone, White AOIC, or White non-Hispanic population       Total or White
Black alone, Black AOIC, or Black non-Hispanic population       Black
Asian alone, Asian AOIC, or Asian non-Hispanic population       Asian, AIAN, NHOPI
AIAN alone, AIAN AOIC, or AIAN non-Hispanic population          Asian, AIAN, NHOPI
NHOPI alone, NHOPI AOIC, or NHOPI non-Hispanic population       Asian, AIAN, NHOPI
Populations from other race groups                              Asian, AIAN, NHOPI
Hispanic population                                             Hispanic
Two or more races – employment/unemployment and
educational attainment characteristics                          Black
Two or more races – all other characteristics                   Asian, AIAN, NHOPI

Notes: (1) AIAN is American Indian and Alaska Native and NHOPI is Native Hawaiian and Other Pacific
Islander.
(2) AOIC is an abbreviation for alone or in combination. The AOIC population for a race group of
interest includes people reporting only the race group of interest (alone) and people reporting multiple
race categories including the race group of interest (in combination).
(3) Hispanics may be any race.
(4) Two or more races refers to the group of cases self-classified as having two or more races.
Standard Errors of Estimated Numbers. The approximate standard error, s_x, of an estimated number
from this microdata file can be obtained by using the formula:

    s_x = \sqrt{a x^2 + b x}    (1)

Here x is the size of the estimate and a and b are the parameters in Table 3 or 4 associated with the
particular type of characteristic. When calculating standard errors from cross-tabulations involving
different characteristics, use the set of parameters for the characteristic that will give the largest standard
error.
Illustration 1
Suppose there were 7,705,000 unemployed men (ages 16 and up) in the civilian labor force. Use the
appropriate parameters from Table 3 and Formula (1) to get

Illustration 1
Number of unemployed males in the civilian labor force (x)    7,705,000
a parameter (a)                                               -0.000032
b parameter (b)                                               2,971
Standard error                                                145,000
90-percent confidence interval                                7,466,000 to 7,944,000

The standard error is calculated as

    s_x = \sqrt{-0.000032 \times 7,705,000^2 + 2,971 \times 7,705,000} = 145,000

The 90-percent confidence interval is calculated as 7,705,000 ± 1.645 × 145,000.
A conclusion that the average estimate derived from all possible samples lies within a range computed in
this way would be correct for roughly 90 percent of all possible samples.
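The calculation in Illustration 1 can be reproduced with a short script. This is a minimal sketch of Formula (1) and the 90-percent interval; the function names are assumptions introduced here, and rounding the standard error to the nearest thousand follows the figures shown in the illustration.

```python
import math

Z_90 = 1.645  # multiplier for a 90-percent confidence interval


def se_estimated_number(x, a, b):
    """Approximate standard error of an estimated number, Formula (1)."""
    return math.sqrt(a * x**2 + b * x)


def confidence_interval_90(estimate, se):
    """Return (lower, upper) as estimate minus/plus 1.645 standard errors."""
    margin = Z_90 * se
    return estimate - margin, estimate + margin


# Illustration 1 inputs: unemployed men, a and b parameters for men from Table 3.
x = 7_705_000
se = se_estimated_number(x, a=-0.000032, b=2_971)
se_rounded = round(se, -3)  # the illustration reports 145,000
low, high = confidence_interval_90(x, se_rounded)
print(se_rounded, round(low, -3), round(high, -3))
```

Because a is negative, the term a x^2 reduces the variance as the estimate grows, which is why the formula requires both parameters rather than b alone.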
Standard Errors of Estimated Percentages. The reliability of an estimated percentage, computed
using sample data for both numerator and denominator, depends on both the size of the percentage and
its base. Estimated percentages are relatively more reliable than the corresponding estimates of the
numerators of the percentages, particularly if the percentages are 50 percent or more. When the
numerator and denominator of the percentage are in different categories, use the parameter from Table 3
or 4 as indicated by the numerator.
The approximate standard error, s_{y,p}, of an estimated percentage can be obtained by using the formula:

    s_{y,p} = \sqrt{\frac{b}{y} \, p \, (100 - p)}    (2)

Here y is the total number of people, families, households, or unrelated individuals in the base or
denominator of the percentage, p is the percentage 100*x/y (0 ≤ p ≤ 100), and b is the parameter in Table
3 or 4 associated with the characteristic in the numerator of the percentage.
Illustration 2
Suppose there were 119,545,000 households in the U.S., and 71.1 percent had an internet connection in
their home. Use the appropriate parameter from Table 4 and Formula (2) to get

Illustration 2
Percentage of households with an internet connection (p)    71.1
Base (y)                                                    119,545,000
b parameter (b)                                             1,860
Standard error                                              0.18
90-percent confidence interval                              70.8 to 71.4

The standard error is calculated as

    s_{y,p} = \sqrt{\frac{1,860}{119,545,000} \times 71.1 \times (100.0 - 71.1)} = 0.18

The 90-percent confidence interval for the estimated percentage of households with an internet
connection in the home is from 70.8 to 71.4 percent (i.e., 71.1 ± 1.645 × 0.18).
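Illustration 2 can be checked the same way. The sketch below implements Formula (2); the function name is an assumption, and the inputs are the figures given in the illustration.

```python
import math


def se_estimated_percentage(p, y, b):
    """Approximate standard error of an estimated percentage, Formula (2).

    p is the percentage (0 to 100), y is the base of the percentage, and
    b is the parameter associated with the numerator characteristic.
    """
    return math.sqrt((b / y) * p * (100.0 - p))


# Illustration 2 inputs: households with an internet connection.
p = 71.1
se_pct = se_estimated_percentage(p, y=119_545_000, b=1_860)
se_pct_rounded = round(se_pct, 2)  # the illustration reports 0.18
lower = round(p - 1.645 * se_pct_rounded, 1)
upper = round(p + 1.645 * se_pct_rounded, 1)
print(se_pct_rounded, lower, upper)
```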
Standard Errors of Estimated Differences. The standard error of the difference between two sample
estimates is approximately equal to

    s_{x_1 - x_2} = \sqrt{s_{x_1}^2 + s_{x_2}^2}    (3)

where s_{x_1} and s_{x_2} are the standard errors of the estimates, x_1 and x_2. The estimates can be numbers,
percentages, ratios, etc. This will result in accurate estimates of the standard error of the same
characteristic in two different areas, or for the difference between separate and uncorrelated
characteristics in the same area. However, if there is a high positive (negative) correlation between the
two characteristics, the formula will overestimate (underestimate) the true standard error.
Illustration 3
Suppose that of the 24,294,000 rural households in the U.S., 66.1 percent have an internet connection at
home, and of the 95,250,000 urban households in the U.S., 72.3 percent have an internet connection at
home. Use the appropriate parameters and factors from Table 4 and Formulas (2) and (3) to get

Illustration 3
                                   Rural (x1)      Urban (x2)      Difference
Percentage of households
with internet connection (p)       66.1            72.3            6.2
Base (y)                           24,294,000      95,250,000      -
b parameter (b)                    2,790           1,860           -
Standard error                     0.51            0.20            0.55
90-percent confidence interval     65.3 to 66.9    72.0 to 72.6    5.3 to 7.1

The standard error of the difference is calculated as

    s_{x_1 - x_2} = \sqrt{0.51^2 + 0.20^2} = 0.55

The 90-percent confidence interval around the difference is calculated as 6.2 ± 1.645 × 0.55. Since this
interval does not include zero, we can conclude with 90 percent confidence that the percentage of rural
households with an internet connection is less than the percentage of urban households with an internet
connection.
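The rural and urban comparison can be reproduced with Formula (3). This sketch takes the two standard errors as already computed and rounded in the illustration (0.51 and 0.20); the function name is an assumption.

```python
import math


def se_difference(se_1, se_2):
    """Approximate standard error of a difference, Formula (3).

    Appropriate when the two estimates are uncorrelated, as the text notes.
    """
    return math.sqrt(se_1**2 + se_2**2)


rural_pct, urban_pct = 66.1, 72.3
se_diff = round(se_difference(0.51, 0.20), 2)  # the illustration reports 0.55
difference = urban_pct - rural_pct
ci_low = round(difference - 1.645 * se_diff, 1)
ci_high = round(difference + 1.645 * se_diff, 1)
print(se_diff, ci_low, ci_high)
# The interval excludes zero, so the difference is significant at the 0.10 level.
print(ci_low > 0)
```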
Accuracy of State Estimates. The redesign of the CPS following the 1980 census provided an
opportunity to increase efficiency and accuracy of state data. All strata are now defined within state
boundaries. The sample is allocated among the states to produce state and national estimates with the
required accuracy while keeping total sample size to a minimum. Improved accuracy of state data was
achieved with about the same sample size as in the 1970 design.
Since the CPS is designed to produce both state and national estimates, the proportion of the total
population sampled and the sampling rates differ among the states. In general, the smaller the population
of the state the larger the sampling proportion. For example, in Vermont approximately 1 in every 250
households is sampled each month. In New York the sample is about 1 in every 2,000 households.
Nevertheless, the size of the sample in New York is four times larger than in Vermont because New York
has a larger population.
Standard Errors of State Estimates. The standard error for a state may be obtained by determining
new state-level a and b parameters and then using these adjusted parameters in the standard error
formulas mentioned previously. To determine a new state-level b parameter (b_{state}), multiply the b
parameter from Table 3 or 4 by the state factor from Table 5. To determine a new state-level a parameter
(a_{state}), use the following:
(1) If the a parameter from Table 3 or 4 is positive, multiply it by the state factor from Table 5.
(2) If the a parameter in Table 3 or 4 is negative, calculate the new state-level a parameter as
    follows:

    a_{state} = -\frac{b_{state}}{POP_{state}}    (4)

where POP_{state} is the state population found in Table 5.
Illustration 4
Suppose there were 9,813,000 of 12,935,000 households in California, or 75.9 percent, with an internet
connection in the home. Use Formula (2) and the appropriate parameter, factor, and population from
Tables 4 and 5 to get

Illustration 4
Percentage of households in California with an internet connection (p)    75.9
Base (y)                                                                   12,935,000
b parameter (b)                                                            1,860
California state factor                                                    1.14
State b parameter (b_{state})                                              2,120
Standard error                                                             0.55
90-percent confidence interval                                             75.0 to 76.8

Obtain the state-level b parameter by multiplying the b parameter, 1,860, by the state factor, 1.14. This
gives b_{state} = 1,860 × 1.14 = 2,120.
The standard error of the estimated percentage of California households with an internet connection
can then be found by using Formula (2) and the new state-level b parameter, 2,120. The standard error is
given by

    s_{y,p} = \sqrt{\frac{2,120}{12,935,000} \times 75.9 \times (100 - 75.9)} = 0.55

and the 90-percent confidence interval of the percentage of California households with an internet
connection is calculated as 75.9 ± 1.645 × 0.55.
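The state adjustment in Illustration 4 amounts to scaling b by the state factor before applying Formula (2). The sketch below uses assumed function names and the California figures from the illustration; rounding the state b parameter to a whole number follows the value shown there.

```python
import math


def state_b_parameter(b, state_factor):
    """State-level b parameter: national b multiplied by the state factor."""
    return b * state_factor


def se_estimated_percentage(p, y, b):
    """Approximate standard error of an estimated percentage, Formula (2)."""
    return math.sqrt((b / y) * p * (100.0 - p))


b_state = round(state_b_parameter(1_860, 1.14))  # the illustration reports 2,120
se_state = round(se_estimated_percentage(75.9, 12_935_000, b_state), 2)
ci = (round(75.9 - 1.645 * se_state, 1), round(75.9 + 1.645 * se_state, 1))
print(b_state, se_state, ci)
```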
Standard Errors of Regional Estimates. To compute standard errors for regional estimates, follow the
steps for computing standard errors for state estimates found in “Standard Errors of State Estimates”
using the regional factors and populations found in Table 6.
Illustration 5
Suppose there were 9,857,000 households in the South where no one used the internet. Use Formulas (1)
and (4) and the appropriate parameter, factor, and population from Tables 4 and 6 to get

Illustration 5
Number of households in the South with no internet use (x)    9,857,000
b parameter (b)                                               1,860
South regional factor                                         1.07
Regional population                                           112,507,725
Regional a parameter (a_{region})                             -0.000018
Regional b parameter (b_{region})                             1,990
Standard error                                                134,000
90-percent confidence interval                                9,637,000 to 10,077,000

Obtain the region-level b parameter by multiplying the b parameter, 1,860, by the regional factor, 1.07.
This gives b_{region} = 1,860 × 1.07 = 1,990. Obtain the needed region-level a parameter by

    a_{region} = -\frac{1,990}{112,507,725} = -0.000018

The standard error of the estimate of the number of households in the South with no one using the
internet can be found by using Formula (1) and the new region-level a and b parameters,
-0.000018 and 1,990, respectively. The standard error is given by

    s_x = \sqrt{-0.000018 \times 9,857,000^2 + 1,990 \times 9,857,000} = 134,000

and the 90-percent confidence interval of the number of households in the South without internet use is
calculated as 9,857,000 ± 1.645 × 134,000.
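Illustration 5 combines the factor adjustment with the negative-a rule of Formula (4), applied at the regional level. The following sketch uses assumed function names and the figures given in the illustration; intermediate rounding mirrors the values reported there.

```python
import math


def adjusted_b(b, factor):
    """Regional (or state) b parameter: b multiplied by the factor."""
    return b * factor


def adjusted_a_negative_case(b_adjusted, population):
    """Regional (or state) a parameter when the tabled a is negative, Formula (4)."""
    return -b_adjusted / population


def se_estimated_number(x, a, b):
    """Approximate standard error of an estimated number, Formula (1)."""
    return math.sqrt(a * x**2 + b * x)


b_region = round(adjusted_b(1_860, 1.07))                             # 1,990
a_region = round(adjusted_a_negative_case(b_region, 112_507_725), 6)  # -0.000018
se_region = round(se_estimated_number(9_857_000, a_region, b_region), -3)
print(b_region, a_region, se_region)
```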
Standard Errors of Groups of States. The standard error calculation for a group of states is similar to
the standard error calculation for a single state. First, calculate a new state group factor for the group of
states. Then, determine new state group a and b parameters. Finally, use these adjusted parameters in
the standard error formulas mentioned previously.
Use the following formula to determine a new state group factor:

    \text{state group factor} = \frac{\sum_{i=1}^{n} POP_i \times \text{state factor}_i}{\sum_{i=1}^{n} POP_i}    (5)

where POP_i and state factor_i are the population and factor for state i from Table 5. To obtain a new state
group b parameter (b_{state group}), multiply the b parameter from Table 3 or 4 by the state group factor
obtained by Formula (5). To determine a new state group a parameter (a_{state group}), use the following:
(1) If the a parameter in Table 3 or 4 is positive, multiply it by the state group factor
    determined by Formula (5).
(2) If the a parameter in Table 3 or 4 is negative, calculate the new state group a parameter as
    follows:

    a_{state group} = -\frac{b_{state group}}{\sum_{i=1}^{n} POP_i}    (6)

Illustration 6
Suppose the state group factor for the state group Illinois-Indiana-Michigan was required. The
appropriate factor would be
$$\text{state group factor} = \frac{12{,}804{,}807 \times 1.13 + 6{,}367{,}019 \times 1.11 + 9{,}807{,}189 \times 1.13}{12{,}804{,}807 + 6{,}367{,}019 + 9{,}807{,}189} = 1.13$$
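Formulas (5) and (6) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not an official implementation: the function names are hypothetical, and the a and b values passed to `state_group_parameters` (the Table 4 household parameters for Total or White) are chosen only as an example input. A zero a parameter is treated like a positive one, so it stays zero.

```python
def state_group_factor(states):
    """Formula (5): population-weighted average of the state factors.

    `states` is a list of (population, factor) pairs taken from Table 5.
    """
    total_population = sum(pop for pop, _ in states)
    weighted = sum(pop * factor for pop, factor in states)
    return weighted / total_population


def state_group_parameters(a, b, states):
    """Return (a_group, b_group) from the Table 3 or 4 a and b parameters."""
    factor = state_group_factor(states)
    b_group = b * factor
    if a < 0:
        # Formula (6): negative a parameter.
        a_group = -b_group / sum(pop for pop, _ in states)
    else:
        # Positive (or zero) a parameter: scale by the state group factor.
        a_group = a * factor
    return a_group, b_group


# Illinois, Indiana, and Michigan as (population, factor) pairs from Table 5.
group = [(12_804_807, 1.13), (6_367_019, 1.11), (9_807_189, 1.13)]
factor = state_group_factor(group)
print(f"state group factor = {factor:.2f}")

# Example input only: Table 4 household parameters, Total or White.
a_group, b_group = state_group_parameters(-0.000008, 1_860, group)
print(f"a_group = {a_group:.7f}, b_group = {b_group:,.1f}")
```

The resulting pair can then be used in the standard error formulas in the same way as the regional parameters.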
Standard Errors of Quarterly or Yearly Averages. For information on calculating standard errors for
labor force data from the CPS that involve quarterly or yearly averages, please see the “Explanatory
Notes and Estimates of Error: Household Data” section in Employment and Earnings, a monthly report
published by the U.S. Bureau of Labor Statistics.
Technical Assistance. If you require assistance or additional information, please contact the
Demographic Statistical Methods Division via e-mail at [email protected].
Table 3. Parameters for Computation of Standard Errors for Labor Force Characteristics:
October 2010
Characteristic                                    a       b

Total or White
  Civilian labor force, employed          -0.000016   3,068
  Not in labor force                      -0.000009   1,833
  Unemployed                              -0.000016   3,096
  Civilian labor force, employed, not in labor force, and unemployed
    Men                                   -0.000032   2,971
    Women                                 -0.000031   2,782
    Both sexes, 16 to 19 years            -0.000022   3,096

Black
  Civilian labor force, employed, not in labor force, and unemployed
    Total                                 -0.000151   3,455
    Men                                   -0.000311   3,357
    Women                                 -0.000252   3,062
    Both sexes, 16 to 19 years            -0.001632   3,455

Hispanic, may be of any race
  Civilian labor force, employed, not in labor force, and unemployed
    Total                                 -0.000141   3,455
    Men                                   -0.000253   3,357
    Women                                 -0.000266   3,062
    Both sexes, 16 to 19 years            -0.001528   3,455

Asian, American Indian and Alaska Native, Native Hawaiian and Other Pacific Islander
  Civilian labor force, employed, not in labor force, and unemployed
    Total                                 -0.000346   3,198
    Men                                   -0.000729   3,198
    Women                                 -0.000659   3,198
    Both sexes, 16 to 19 years            -0.004146   3,198

Notes: (1) These parameters are to be applied to basic CPS monthly labor force estimates.
(2) The Total or White, Black, and Asian, AIAN, NHOPI parameters are to be used for both alone and in
combination race group estimates.
(3) For nonmetropolitan characteristics, multiply the a and b parameters by 1.5. If the characteristic of
interest is total state population, not subtotaled by race or ethnicity, the a and b parameters are zero.
(4) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be
multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for
Black, Hispanic, and Asian, AIAN, NHOPI parameters.
(5) For the groups self-classified as having two or more races, use the Asian, AIAN, NHOPI parameters
for all employment characteristics.
Table 4. Parameters for Computation of Standard Errors for Internet Use Characteristics:
October 2010
                                                       Total or White              Black           Hispanic   API, AIAN, NHOPI
Characteristics                                             a       b          a       b          a       b          a       b

PEOPLE
  Educational attainment                            -0.000009   2,131  -0.000051   2,410  -0.000070   2,745  -0.000109   1,946
  People by family income                           -0.000018   4,408  -0.000107   5,047  -0.000218   8,505  -0.000283   5,047
  Income                                            -0.000009   2,207  -0.000053   2,527  -0.000109   4,259  -0.000141   2,527
  Marital status, household, and family             -0.000015   4,687  -0.000108   6,733  -0.000228  11,347  -0.000285   6,733
  Poverty                                           -0.000031   9,336  -0.000150   9,336  -0.000317  15,733  -0.000395   9,336

FAMILIES, HOUSEHOLDS, OR UNRELATED INDIVIDUALS
  Income                                            -0.000008   2,016  -0.000047   2,201  -0.000095   3,709  -0.000123   2,201
  Marital status, household, and family,
    educational attainment, population by age/sex   -0.000008   1,860  -0.000036   1,683  -0.000073   2,836  -0.000094   1,683
  Poverty                                            0.000092   2,196   0.000092   2,196   0.000155   3,701   0.000092   2,196

Notes: (1) These parameters are to be applied to the October 2010 Internet Use Supplement data.
(2) AIAN is American Indian and Alaska Native and NHOPI is Native Hawaiian and Other Pacific Islander.
(3) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and ethnicity,
please see the “Generalized Variance Parameters” section.
(4) The Total or White, Black, and Asian, AIAN, NHOPI parameters are to be used for both alone and in combination
race group estimates.
(5) For nonmetropolitan characteristics, multiply the a and b parameters by 1.5. If the characteristic of interest is total
state population, not subtotaled by race or ethnicity, the a and b parameters are zero.
(6) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be multiplied
by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Black, Asian, AIAN,
NHOPI, and Hispanic parameters.
(7) For the group self-classified as having two or more races, use the Asian, AIAN, NHOPI parameters for all
characteristics except employment, unemployment, and educational attainment, in which case use Black
parameters.
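Notes (5) and (6) adjust the a and b parameters by a constant factor. A minimal Python sketch of that adjustment follows, under stated assumptions: `scale_parameters` and `standard_error` are hypothetical helper names, Formula (1) is assumed to be the square root of a·x² + b·x, and the estimate x = 5,000,000 is a made-up value used only to exercise the functions, not a figure from the data file.

```python
import math

NONMETROPOLITAN_FACTOR = 1.5  # note (5)
FOREIGN_BORN_FACTOR = 1.3     # note (6), Total and White parameters only


def scale_parameters(a, b, factor):
    """Multiply both generalized variance parameters by a note factor."""
    return a * factor, b * factor


def standard_error(a, b, x):
    """Formula (1), assumed form: sqrt(a*x^2 + b*x) for an estimated number x."""
    return math.sqrt(a * x**2 + b * x)


# Table 4, PEOPLE, Educational attainment, Total or White.
a, b = -0.000009, 2_131

# Hypothetical estimate, used only for illustration.
x = 5_000_000

a_nonmetro, b_nonmetro = scale_parameters(a, b, NONMETROPOLITAN_FACTOR)
se_base = standard_error(a, b, x)
se_nonmetro = standard_error(a_nonmetro, b_nonmetro, x)

print(f"a = {a_nonmetro:.7f}, b = {b_nonmetro:,.1f}")
print(f"standard error: {se_base:,.0f} unadjusted, {se_nonmetro:,.0f} nonmetropolitan")
```

Because both parameters are multiplied by the same factor, the standard error under this assumed formula grows by the square root of that factor (about 1.22 for the nonmetropolitan factor of 1.5).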
Table 5. Factors and Populations for State Standard Errors and Parameters:
October 2010
State                 Factor   Population    State           Factor   Population

Alabama                 1.09    4,665,365    Montana           0.25      966,596
Alaska                  0.18      689,104    Nebraska          0.47    1,784,755
Arizona                 1.13    6,610,584    Nevada            0.65    2,638,342
Arkansas                0.70    2,864,928    New Hampshire     0.37    1,310,604
California              1.14   36,860,804    New Jersey        1.14    8,652,695
Colorado                1.14    5,040,248    New Mexico        0.51    2,003,427
Connecticut             0.91    3,478,953    New York          1.16   19,395,304
Delaware                0.23      879,945    North Carolina    1.13    9,316,202
District of Columbia    0.18      602,314    North Dakota      0.17      638,441
Florida                 1.10   18,325,796    Ohio              1.13   11,392,633
Georgia                 1.11    9,749,111    Oklahoma          0.94    3,652,556
Hawaii                  0.31    1,259,628    Oregon            1.00    3,830,388
Idaho                   0.35    1,541,579    Pennsylvania      1.13   12,452,123
Illinois                1.13   12,804,807    Rhode Island      0.30    1,037,218
Indiana                 1.11    6,367,019    South Carolina    1.11    4,515,827
Iowa                    0.79    2,985,178    South Dakota      0.18      803,610
Kansas                  0.74    2,788,354    Tennessee         1.12    6,259,050
Kentucky                1.11    4,259,890    Texas             1.14   24,868,941
Louisiana               1.09    4,446,562    Utah              0.54    2,825,861
Maine                   0.42    1,299,907    Vermont           0.19      615,834
Maryland                1.16    5,647,848    Virginia          1.12    7,752,633
Massachusetts           1.11    6,566,729    Washington        1.15    6,670,098
Michigan                1.13    9,807,189    West Virginia     0.41    1,804,170
Minnesota               1.11    5,242,399    Wisconsin         1.13    5,606,963
Mississippi             0.73    2,896,587    Wyoming           0.15      548,770
Missouri                1.15    5,919,764

Notes: (1) The state population counts in this table are for the 0+ population.
(2) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be
multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Black,
Asian, AIAN, NHOPI, and Hispanic parameters.
Table 6. Factors and Populations for Regional
Standard Errors and Parameters: October 2010
Region       Factor    Population

Northeast      1.06    54,809,367
Midwest        1.06    66,141,112
South          1.07   112,507,725
West           1.02    71,485,429

Notes: (1) The census regional population counts in this table are for the 0+ population.
(2) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be
multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Black,
Asian, AIAN, NHOPI, and Hispanic parameters.
REFERENCES
[1] Bureau of Labor Statistics. 1994. Employment and Earnings. Volume 41 Number 5, May 1994.
    Washington, DC: Government Printing Office.
[2] U.S. Census Bureau. 2006. Current Population Survey: Design and Methodology. Technical
    Paper 66. Washington, DC: Government Printing Office.
    (http://www.census.gov/prod/2006pubs/tp-66.pdf)
[3] Brooks, C.A. and Bailar, B.A. 1978. Statistical Policy Working Paper 3 - An Error
    Profile: Employment as Measured by the Current Population Survey. Subcommittee on
    Nonsampling Errors, Federal Committee on Statistical Methodology, U.S. Department of
    Commerce, Washington, DC. (http://www.fcsm.gov/working-papers/spp.html)