SUPPORTING STATEMENT B
U.S. Department of Commerce
U.S. Census Bureau
The American Community Survey and the
Puerto Rico Community Survey
OMB Control No. 0607-0810
B. Collections of Information Employing Statistical Methods
ACS Households
The U.S. Census Bureau samples about 298,000 housing unit (HU) addresses each month; about 293,500 are mailed survey materials. The mailing operations are conducted through the U.S. Postal Service and use first-class postage rates for all pieces. For addresses that were mailed survey materials but did not respond by mail, internet, or by calling our telephone questionnaire assistance line, the Census Bureau selects a subsample of all households and assigns them to the Computer Assisted Personal Interview (CAPI), the nonresponse followup data collection mode. Unmailable household addresses are sampled and included in the CAPI data collection mode.
In 2019, the HU sample yielded approximately 138,000 self-response interviews per month. The HU CAPI follow-up yielded an estimated response rate of approximately 80 percent in 2019. The 2019 final weighted response rate for ACS was 91 percent.
ACS Group Quarters
In addition to the ACS data collection from households, the data are also collected from a sample of group quarters (GQ) facilities and residents. The field representatives use the CAPI Group Quarters Facility questionnaire (GQFQ) in English or Spanish when making initial telephone contact to schedule an appointment and to conduct a telephone or personal visit at the sample GQ and also to generate the subsample of persons for ACS interviews. An introductory letter is mailed to the sample GQ approximately two weeks prior to the period when a field representative may begin making contact with the GQ. The Spanish GQFQ instrument is used for ACS data collection at Puerto Rico GQs. A subset of the ACS HU questions is used for the interviews with sample residents in GQs. Resident-level personal interviews with sampled GQ residents are conducted using CAPI, but bilingual paper questionnaires can also be used for self-response. The GQ CAPI and paper questionnaires contain questions for one person. The GQ CAPI also excludes certain questions for residents and institutional group quarters that are out of scope to reduce burden. Field representatives may call or conduct additional personal visits to the GQ and/or sample residents to obtain missing or incomplete ACS GQ forms until the closeout of each sample panel.
Describe the procedures for the collection of information including:
Statistical methodology for stratification and sample selection,
Estimation procedure,
Degree of accuracy needed for the purpose described in the justification,
Unusual problems requiring specialized sampling procedures, and
Any use of periodic (less frequent than annual) data collection cycles to reduce burden.
The ACS employs a two-phase, two-stage sample design. The first-phase sample consists of two separate address samples: Period 1 and Period 2. These samples are chosen at different points in time. Both samples are selected in two stages of sampling, a first-stage and a second-stage. Subsequent to second-stage sampling, the majority of sample addresses are randomly assigned to one of the 12 months of the sample year (the exception is for addresses in remote Alaska, which are assigned to either January or September). The second-phase of sampling occurs when the CAPI sample is selected.
The Period 1 sample is selected during September and October of the year prior to the sample year (e.g., the 2019 Period 1 sample was selected in September and October of 2018). Approximately half of a year’s sample is selected at this time. Sample addresses that are not in remote Alaska are randomly assigned to one of the first six months of the sample year; sample addresses in remote Alaska are assigned to the first six months as a whole to address access issues.
Period 2 sampling occurs in January and February of the sample year (e.g., the 2019 Period 2 sample was selected during January and February of 2019). This sample accounts for the remaining half of the overall first-phase sample. Period 2 sample addresses that are not in remote Alaska are randomly assigned to one of the last six months of the sample year; Period 2 sample addresses in remote Alaska are assigned to the last six months as a whole.1
A subsample of nonresponding addresses and of any addresses deemed unmailable is selected for the CAPI data collection mode.2
The following steps are used to select the first-phase and second-phase samples in both periods.
First stage sampling defines the universe for the second stage of sampling through three steps. First, all addresses that were in a first-phase sample within the past four years are excluded from eligibility. This ensures that no address is in sample more than once in any five-year period. The second step is to select a 20 percent systematic sample of “new” units, i.e., those units that have never appeared on a previous Master Address File extract. Each new address is systematically assigned either to the current year or to one of four back-samples. This procedure maintains five equal partitions (samples) of the universe. The third step is to randomly assign all eligible addresses to a period.3
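The partitioning of new addresses described above can be sketched as follows. This is a minimal illustration, not the Census Bureau's production procedure: the function name `assign_new_addresses`, the random starting group, and the convention that group 0 stands for the current-year sample are all assumptions made for the example.

```python
import random
from typing import Dict, Hashable, List, Optional, Sequence

N_PARTITIONS = 5  # current-year sample plus four back-samples


def assign_new_addresses(
    new_addresses: Sequence[Hashable],
    rng: Optional[random.Random] = None,
) -> Dict[int, List[Hashable]]:
    """Systematically spread new addresses over five equal partitions.

    Group 0 is treated here as the current-year sample (a hypothetical
    convention); groups 1-4 stand for the four back-samples. Each address
    lands in any given group with probability of about 20 percent.
    """
    rng = rng or random.Random()
    start = rng.randrange(N_PARTITIONS)  # random start, then a fixed cycle
    partitions: Dict[int, List[Hashable]] = {g: [] for g in range(N_PARTITIONS)}
    for i, address in enumerate(new_addresses):
        partitions[(start + i) % N_PARTITIONS].append(address)
    return partitions
```

Cycling through the groups in address order keeps the five partitions within one address of each other in size, which is the property the text relies on.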
Second-stage sampling uses 16 sampling strata in the United States.4 The stratum-level rates used in second-stage sampling account for the first-stage selection probabilities. These rates are applied at a block level to addresses in the United States by calculating a measure of size for each of the following geographic entities:
Counties.
Places.
School Districts (elementary, secondary, and unified).
American Indian Areas.
Tribal Subdivisions.
Alaska Native Village Statistical Areas.
Hawaiian Homelands.
Minor Civil Divisions – in Connecticut, Maine, Massachusetts, Michigan, Minnesota, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont, and Wisconsin.5
Census Designated Places – in Hawaii only.
The measure of size for all areas except American Indian areas, Tribal Subdivisions, Alaska Native Village statistical areas, and Hawaiian Homelands is an estimate of the number of occupied HUs in the area. This is calculated by multiplying the number of ACS addresses by an estimated occupancy rate at the block level. A measure of size for each census tract is also calculated in the same manner.
For American Indian areas, tribal subdivisions areas, and Alaska Native Village statistical areas, the measure of size is the estimated number of occupied HUs multiplied by the proportion of people reporting American Indian or Alaska Native (alone or in combination) in the 2010 Census.
For Hawaiian Homelands, the measure of size is the estimated number of occupied HUs multiplied by the proportion of people reporting Native Hawaiian (alone or in combination) in the 2010 Census.
Each block is then assigned the smallest positive, nonzero measure of size from the set of all entities of which it is a part. The 2019 second-stage sampling strata and the overall first-phase sampling rates are shown in Table 1. Table 2 includes the rates used in Puerto Rico.
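A minimal sketch of the measure-of-size logic described above, under simplifying assumptions: the helper names `entity_mos` and `block_mos` are hypothetical, and the occupancy rate is applied once per entity here rather than block by block as in the actual procedure.

```python
from typing import Iterable, Optional


def entity_mos(addresses: int, occupancy_rate: float, proportion: float = 1.0) -> float:
    """Estimated occupied HUs, optionally scaled by a population proportion.

    `proportion` stays at 1.0 for most entities. For American Indian areas,
    Alaska Native Village statistical areas, and Hawaiian Homelands it would
    be the 2010 Census proportion described in the text.
    """
    return addresses * occupancy_rate * proportion


def block_mos(entity_sizes: Iterable[float]) -> Optional[float]:
    """Smallest positive measure of size among the entities containing a block.

    Returns None when no containing entity has a positive measure of size.
    """
    positive = [m for m in entity_sizes if m > 0]
    return min(positive) if positive else None
```

For example, a block that lies in a county with a measure of size of 1,500 and a place with a measure of size of 350 would carry 350 into the stratum assignment.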
The overall first-phase sampling rates are calculated using the distribution of ACS valid addresses by second-stage sampling stratum in such a way as to yield an overall target sample size for the year of 3,576,000 (1,788,000 for each period) in the United States. The first-phase rates are adjusted for the first-stage sample to yield the second-stage selection probabilities.
Table 1. First-phase Sampling Rate Categories for the United States
Sampling Stratum | Type of Area | Rate Definitions | 2019 Sampling Rate, Period 1 | 2019 Sampling Rate, Period 2 |
1 | 0 < MOS¹ < 200 | 15.00% | 15.00% | 15.00% |
2 | 200 ≤ MOS < 400 | 10.00% | 10.00% | 10.00% |
3 | 400 ≤ MOS < 800 | 7.00% | 7.00% | 7.00% |
4 | 800 ≤ MOS < 1200 | 2.80 × BR | 4.29% | 4.28% |
5 | 1200 ≤ MOS and 0 < TRACTMOS² < 400 | 3.50 × BR | 5.36% | 5.35% |
6 | 1200 ≤ MOS and 0 < TRACTMOS < 400 HR³ | 0.92 × 3.50 × BR | 4.93% | 4.92% |
7 | 1200 ≤ MOS and 400 ≤ TRACTMOS < 1000 | 2.80 × BR | 4.29% | 4.28% |
8 | 1200 ≤ MOS and 400 ≤ TRACTMOS < 1000 HR | 0.92 × 2.80 × BR | 3.95% | 3.95% |
9 | 1200 ≤ MOS and 1000 ≤ TRACTMOS < 2000 | 1.70 × BR | 2.60% | 2.60% |
10 | 1200 ≤ MOS and 1000 ≤ TRACTMOS < 2000 HR | 0.92 × 1.70 × BR | 2.40% | 2.39% |
11 | 1200 ≤ MOS and 2000 ≤ TRACTMOS < 4000 | BR⁴ | 1.53% | 1.53% |
12 | 1200 ≤ MOS and 2000 ≤ TRACTMOS < 4000 HR | 0.92 × BR | 1.41% | 1.41% |
13 | 1200 ≤ MOS and 4000 ≤ TRACTMOS < 6000 | 0.60 × BR | 0.92% | 0.92% |
14 | 1200 ≤ MOS and 4000 ≤ TRACTMOS < 6000 HR | 0.92 × 0.60 × BR | 0.85% | 0.84% |
15 | 1200 ≤ MOS and 6000 ≤ TRACTMOS | 0.35 × BR | 0.54% | 0.53% |
16 | 1200 ≤ MOS and 6000 ≤ TRACTMOS HR | 0.92 × 0.35 × BR | 0.49% | 0.49% |
¹ MOS = measure of size (estimated number of occupied housing units) of the smallest governmental entity
² TRACTMOS = the measure of size (MOS) at the census tract level
³ HR = areas where predicted levels of completed mail and CATI interviews are > 60%
⁴ BR = base sampling rate
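The stratum boundaries and rate definitions in Table 1 can be expressed as a small lookup. The sketch below is illustrative only: the function names are hypothetical, and `base_rate` is left as a parameter because this section does not state the base sampling rate (BR) directly.

```python
import bisect

# Upper bounds of the tract-level size classes used once MOS >= 1200.
TRACT_BOUNDS = [400, 1000, 2000, 4000, 6000]

# Multipliers of the base rate (BR) for the non-HR strata in Table 1.
BR_MULTIPLIER = {4: 2.80, 5: 3.50, 7: 2.80, 9: 1.70, 11: 1.00, 13: 0.60, 15: 0.35}
FIXED_RATE = {1: 0.15, 2: 0.10, 3: 0.07}
HR_FACTOR = 0.92  # reduction applied in high-response (HR) areas


def second_stage_stratum(mos: float, tract_mos: float, high_response: bool = False) -> int:
    """Return the Table 1 stratum (1-16) for a block."""
    if mos <= 0:
        raise ValueError("measure of size must be positive")
    for stratum, upper in enumerate([200, 400, 800, 1200], start=1):
        if mos < upper:
            return stratum
    if tract_mos <= 0:
        raise ValueError("tract measure of size must be positive")
    tier = bisect.bisect_right(TRACT_BOUNDS, tract_mos)  # 0..5
    return 5 + 2 * tier + (1 if high_response else 0)


def sampling_rate(stratum: int, base_rate: float) -> float:
    """First-phase rate implied by the Rate Definitions column of Table 1."""
    if stratum in FIXED_RATE:
        return FIXED_RATE[stratum]
    if stratum in BR_MULTIPLIER:
        return BR_MULTIPLIER[stratum] * base_rate
    # Even strata from 6 to 16 are the HR versions of the stratum before them.
    return HR_FACTOR * BR_MULTIPLIER[stratum - 1] * base_rate
```

A block with a measure of size of 1,500 in a tract with a measure of size of 2,500 falls in stratum 11, or stratum 12 if the tract is classified as HR.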
Table 2. First-phase Sampling Rate Categories for Puerto Rico
Sampling Stratum | Type of Area | Rate Definitions | 2019 Sampling Rate, Period 1 | 2019 Sampling Rate, Period 2 |
3 | 400 ≤ MOS¹ < 800 | 7.00% | 7.00% | 7.00% |
5 | 1200 ≤ MOS and 0 < TRACTMOS² < 400 | 3.50 × BR³ | 5.11% | 5.11% |
7 | 1200 ≤ MOS and 400 ≤ TRACTMOS < 1000 | 2.80 × BR | 4.09% | 4.09% |
9 | 1200 ≤ MOS and 1000 ≤ TRACTMOS < 2000 | 1.70 × BR | 2.48% | 2.48% |
11 | 1200 ≤ MOS and 2000 ≤ TRACTMOS < 4000 | BR | 1.46% | 1.46% |
13 | 1200 ≤ MOS and 4000 ≤ TRACTMOS < 6000 | 0.60 × BR | 0.88% | 0.88% |
¹ MOS = measure of size (estimated number of occupied housing units) of the smallest governmental entity
² TRACTMOS = the measure of size (MOS) at the census tract level
³ BR = base sampling rate
After each block is assigned to a second-stage sampling stratum in each period, a systematic sample of addresses is selected from the second-stage universe (first-stage sample) within each county and county equivalent.
After the second stage of sampling, sample addresses selected during Period 1 sampling that are not in remote Alaska are allocated to one of the first six months of the sample year. Sample addresses selected during Period 2 sampling that are not in remote Alaska are assigned to one of the months from July through December. Sample addresses in remote Alaska are assigned to January in Period 1 sampling and to September in Period 2 sampling.
The addresses from which CAPI subsamples are selected can be divided into two groups. One group includes addresses that are not eligible for any other data collection operation; these consist of unmailable addresses and those in remote Alaska areas. The second group includes addresses that are eligible for the other data collection operations but for which no response was obtained prior to CAPI subsampling—these consist of mailable addresses not in a remote Alaska area.
All sample addresses in remote Alaska are sent to the CAPI data collection operation. Most unmailable addresses are selected for CAPI at a rate of 2-in-3; the exception is when they are in a Hawaiian Homeland (HH) area, Alaska Native Village statistical area (ANVSA), or certain American Indian (AI) areas, where all are selected for CAPI.
With one exception, mailable addresses from which a response was not obtained by the time of the CAPI operation are sampled at rates of 1-in-2, 2-in-5, and 1-in-3; these rates are set at the tract level. The exception is for addresses in HH, ANVSA, and AI areas, where all are selected for CAPI. Table 3 shows the CAPI subsampling rates that are associated with each group of addresses.
All non-responding addresses in Puerto Rico are subsampled for CAPI at a 1-in-2 rate.
Table 3. Second-Phase (CAPI) Subsampling Rates for the United States
Address and Tract Characteristics | CAPI Subsampling Rate |
Addresses in Remote Alaska* | Take all (100.0%) |
Addresses in Hawaiian Homelands, Alaska Native Village statistical areas and a subset of American Indian areas* | Take all (100.0%) |
Unmailable addresses that are not in the previous two categories | 66.7% |
Mailable addresses in tracts with predicted levels of completed mail and CATI interviews prior to CAPI subsampling between 0% and 35%, inclusive | 50.0% |
Mailable addresses in tracts with predicted levels of completed mail and CATI interviews prior to CAPI subsampling greater than 35% and less than or equal to 50% | 40.0% |
Mailable addresses in all other tracts | 33.3% |
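The decision rules in Table 3 can be read as a short cascade of checks. The following sketch assumes a hypothetical function name and argument names. It returns exact fractions (2-in-3, 1-in-2, 2-in-5, 1-in-3) rather than the rounded percentages shown in the table.

```python
from fractions import Fraction
from typing import Optional


def capi_subsampling_rate(
    remote_alaska: bool,
    take_all_area: bool,
    mailable: bool,
    predicted_response_pct: Optional[float] = None,
) -> Fraction:
    """Return the CAPI subsampling rate for one address, following Table 3.

    `take_all_area` marks Hawaiian Homelands, Alaska Native Village
    statistical areas, and the qualifying American Indian areas.
    `predicted_response_pct` is the tract's predicted level of completed
    mail and CATI interviews, needed only for mailable addresses.
    """
    if remote_alaska or take_all_area:
        return Fraction(1)
    if not mailable:
        return Fraction(2, 3)
    if predicted_response_pct is None:
        raise ValueError("mailable addresses need a tract-level prediction")
    if 0 <= predicted_response_pct <= 35:
        return Fraction(1, 2)
    if predicted_response_pct <= 50:
        return Fraction(2, 5)
    return Fraction(1, 3)
```

The order of the checks matters: the take-all categories are tested first, so an unmailable address in a Hawaiian Homeland is selected with certainty rather than at 2-in-3.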
The 2019 group quarters (GQ) sampling frame was divided into two strata: a small GQ stratum and a large GQ stratum. Small GQs have expected populations of 15 or fewer people residing at the GQ, while large GQs have expected populations of more than 15 people residing at the GQ.
Samples were selected in two phases within each stratum. In general, GQs were selected in the first phase and then persons of sampled GQs were selected in the second phase. Both phases differ between the two strata. Each sampled GQ was randomly assigned to one or more months in 2019; it was in these months that their person samples were selected.
There were two stages of selecting small GQs for sample.
First stage
The small GQ universe is divided into five groups that are approximately equal in size. All new small GQs are systematically assigned to one of these five groups on a yearly basis, with about the same probability (20 percent) of being assigned to any given group. Each group represents a second-stage sampling frame, from which GQs are selected once every five years. The 2019 second-stage sampling frame was used in 2014 as well, and is currently scheduled to be used in 2024, 2029, etc.
Second stage
GQs were systematically selected from the 2019 second-stage sampling frame. Each GQ had the same second-stage probability of being selected within a given state, where the probabilities varied between states. Table 4 below shows these probabilities.
Note that the GQ sampling rate for Puerto Rico was 2.5 percent.
Table 4. 2019 Group Quarter State Targeted Sampling Rates for the U.S.
State | Targeted Rate | State | Targeted Rate | State | Targeted Rate |
Alabama | 2.34% | Kentucky | 2.36% | North Dakota | 4.56% |
Alaska | 3.18% | Louisiana | 2.89% | Ohio | 2.54% |
Arizona | 2.41% | Maine | 3.40% | Oklahoma | 2.24% |
Arkansas | 2.19% | Maryland | 2.42% | Oregon | 2.59% |
California | 2.71% | Massachusetts | 2.30% | Pennsylvania | 2.70% |
Colorado | 2.45% | Michigan | 2.86% | Rhode Island | 2.27% |
Connecticut | 2.55% | Minnesota | 2.56% | South Carolina | 2.11% |
Delaware | 5.00% | Mississippi | 2.67% | South Dakota | 4.26% |
Dist of Columbia | 2.94% | Missouri | 2.33% | Tennessee | 2.45% |
Florida | 2.43% | Montana | 4.34% | Texas | 2.27% |
Georgia | 2.66% | Nebraska | 2.52% | Utah | 1.92% |
Hawaii | 3.34% | Nevada | 3.65% | Vermont | 4.84% |
Idaho | 4.09% | New Hampshire | 2.97% | Virginia | 2.69% |
Illinois | 2.54% | New Jersey | 2.73% | Washington | 2.42% |
Indiana | 2.39% | New Mexico | 2.92% | West Virginia | 2.23% |
Iowa | 2.38% | New York | 2.26% | Wisconsin | 2.57% |
Kansas | 2.57% | North Carolina | 2.39% | Wyoming | 6.56% |
Individuals were selected for sample from each GQ that was selected for sample in the first phase of sample selection. If 15 or fewer people were residing at a GQ at the time a field representative visited the GQ, then all of them were selected for sample. Otherwise, if more than 15 people were residing at the GQ, then the field representative selected a systematic sample of ten people from the GQ’s roster.
The targeted state-level sampling rates are the probabilities of selecting any given person in a GQ; it is around these probabilities that the sample design is based. These probabilities reflect both phases of sample selection, and they varied by state. The probabilities for 2019 are shown in Table 4.
The sample was designed so that the second-phase sampling rate would be 100 percent for small GQs (i.e., select the entire expected population of 15 or fewer people for sample in every small sampled GQ). This means the probability of selecting any person in a small GQ was designed to equal the probability of selecting the small GQ itself.
All large GQs were eligible for being sampled in 2019. This means there was only a single stage of sampling in this phase. This stage consists of systematically assigning “hits” to GQs independently in each state, where each hit represents ten people to be sampled.
In general, a GQ has either Z or Z+1 hits assigned to it. The value for Z is dependent on both the GQ’s expected population size and its within-state target sampling rate, shown in Table 4. When this rate is multiplied by a GQ’s expected population, the result is a GQ’s expected person sample size. If a GQ’s expected person sample size is less than ten, then Z = 0; if it is at least ten but less than 20, then Z = 1; if it is at least 20 but less than 30, then Z = 2; and so on. See 2.C. below for a detailed example.
If a GQ has an expected person sample size that is less than ten, then this method effectively gives the GQ a probability of selection that is proportional to its size; this probability is the expected person sample size divided by ten. If a GQ has an expected person sample size of ten or more, then it is in sample with certainty and is assigned one or more hits.
Individuals are selected within each GQ to which one or more hits are assigned in the first phase of selection. There are ten people selected at a GQ for every hit assigned to the GQ. The individuals are systematically sampled from a roster of people residing at the GQ at the time of a field representative’s visit. The exception is if there are far fewer persons residing in a GQ than expected—in these situations, the number of people to sample at the GQ is reduced to reflect the GQ’s actual population. In cases where fewer than ten people reside in a GQ at the time of a visit, the field representative will select all of them for sample.
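The text does not spell out the exact systematic procedure applied to a GQ roster, so the following is only a sketch of one common approach: a random start followed by a fixed, possibly fractional, skip interval. The function name `systematic_sample` and its parameters are hypothetical.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def systematic_sample(
    roster: Sequence[T],
    sample_size: int = 10,
    rng: Optional[random.Random] = None,
) -> List[T]:
    """Select `sample_size` people from a roster at evenly spaced positions.

    If the roster has no more people than the requested sample size,
    everyone is selected, mirroring the take-all rule in the text.
    """
    rng = rng or random.Random()
    n = len(roster)
    if n <= sample_size:
        return list(roster)
    interval = n / sample_size          # greater than 1, so picks are distinct
    start = rng.random() * interval     # random start in [0, interval)
    # min() guards against a floating-point result landing exactly on n.
    picks = [min(int(start + i * interval), n - 1) for i in range(sample_size)]
    return [roster[j] for j in picks]
```

For a GQ assigned two hits in the same interview month, the same routine could be called with `sample_size=20`.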
As for small GQs, the targeted state-level sampling rates are the probabilities of selecting any given person in a GQ. The probabilities are shown in Table 4. Note that these rates are the same as those for everyone in small GQs.
As an example, suppose a GQ in Indiana has an expected population of 250. The target sampling rate in Indiana is 2.39 percent, meaning any given person in a GQ in Indiana has about a 1-in-41.8 chance of being selected. This rate, combined with the GQ’s expected population of 250, means that the expected number of individuals selected for sample in this GQ is 5.975 (2.39 percent × 250). Since this is less than ten, this GQ has either 0 or 1 hits assigned to it (Z = 0). The probability of it being assigned a hit is the GQ’s expected person sample size of 5.975 divided by 10, or 59.75 percent.
As a second example, suppose a GQ in Idaho has an expected population of 1,000. The target sampling rate in Idaho is 4.09 percent, meaning any given person in a GQ in Idaho has about a 1-in-24.4 chance of being selected. This rate, combined with the GQ’s expected population of 1,000, means that the expected number of individuals selected for sample in the GQ is 40.9 (4.09 percent × 1,000); this GQ is assigned either four or five hits (Z = 4).
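The arithmetic in the two examples above can be reproduced with exact fractions. In this sketch the function name `expected_hits` is hypothetical. Treating the fractional remainder as the probability of receiving Z+1 hits is stated in the text only for Z = 0, and applying it to larger GQs is an assumption made here.

```python
from fractions import Fraction
from typing import Tuple

PEOPLE_PER_HIT = 10  # each hit represents ten people to be sampled


def expected_hits(target_rate_percent: str, expected_population: int) -> Tuple[Fraction, int, Fraction]:
    """Return (expected person sample size, Z, probability of Z+1 hits).

    `target_rate_percent` is the state rate from Table 4 written as a
    string such as "2.39", so the calculation stays exact.
    """
    rate = Fraction(target_rate_percent) / 100
    expected_sample = rate * expected_population
    z = expected_sample // PEOPLE_PER_HIT  # floor division yields an int
    p_extra = (expected_sample - z * PEOPLE_PER_HIT) / PEOPLE_PER_HIT
    return expected_sample, z, p_extra


indiana = expected_hits("2.39", 250)   # expected sample 5.975, Z = 0
idaho = expected_hits("4.09", 1000)    # expected sample 40.9, Z = 4
print(float(indiana[0]), indiana[1], float(indiana[2]))  # 5.975 0 0.5975
print(float(idaho[0]), idaho[1], float(idaho[2]))        # 40.9 4 0.09
```

Under the stated assumption, the Idaho GQ would receive four hits with probability 0.91 and five hits with probability 0.09, which gives an expected 40.9 sampled people.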
All sample GQs are assigned to one or more months (interview months); these are the months in which field representatives will visit a GQ to select a person sample and conduct interviews. All small GQs, all large GQs that are assigned only one hit, all remote Alaska GQs, all sampled military facilities, and all sampled correctional facilities (regardless of how many hits a military or correctional facility is assigned) are assigned to a single interview month. Remote Alaska GQs are assigned to either the first half or second half of the year; federal prisons are assigned to September; all the others are randomly assigned one interview month. Military ships are restricted to March–December to allow time for the Census Bureau to let the point of contact for these ships know what month the sampled ships are in.
All large GQs that are assigned multiple hits, but are not in any of the categories above, have each hit randomly assigned to a different interview month. If a GQ has more than 12 hits assigned to it, then multiple hits are assigned to one or more interview months for the GQ. For example, if a GQ has 15 hits assigned to it, then there are three interview months in which two hits are assigned and nine interview months in which one hit is assigned. There are two restrictions to this process. One restriction is applied to college dormitories, whose hits are randomly assigned to nonsummer months only (i.e., January through April and September through December). The other restriction is applied to military ships, whose hits are randomly assigned only to the last ten months of the year (i.e., March through December).
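The allocation of hits to interview months described above can be sketched as follows. The function and constant names are hypothetical. Spreading hits as evenly as possible over a restricted set of months when a dormitory or ship has more hits than allowed months is an assumption, since the text gives the multi-hit rule only for the unrestricted 12-month case.

```python
import random
from collections import Counter
from typing import Optional, Sequence

ALL_MONTHS = tuple(range(1, 13))
NONSUMMER_MONTHS = (1, 2, 3, 4, 9, 10, 11, 12)  # college dormitories
SHIP_MONTHS = tuple(range(3, 13))               # military ships, March-December


def assign_hits_to_months(
    hits: int,
    allowed_months: Sequence[int] = ALL_MONTHS,
    rng: Optional[random.Random] = None,
) -> Counter:
    """Randomly spread a GQ's hits over its allowed interview months.

    Every allowed month receives hits // len(allowed_months) hits, and the
    remainder goes to distinct, randomly chosen months. With 12 or fewer
    hits and no restriction, each hit therefore lands in a different month.
    """
    rng = rng or random.Random()
    base, extra = divmod(hits, len(allowed_months))
    counts = Counter({month: base for month in allowed_months})
    for month in rng.sample(list(allowed_months), extra):
        counts[month] += 1
    return +counts  # unary plus drops months that received no hits
```

With 15 hits and no restriction, `divmod(15, 12)` gives one hit per month plus three extra, so three months carry two hits and nine months carry one, matching the example in the text.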
Federal prisons are sampled separately from other GQs using the same procedure shown above, and all are assigned to the September interview month, as noted above. The Census Bureau uses the most up-to-date information from two files delivered by the Bureau of Prisons: a facilities file listing each federal prison, including address and contact information, and a list of all federal inmates. The Census Bureau samples inmates directly from the inmate file, and the questionnaires for inmates in federal prisons are preprinted with the names of the inmates in sample.
Remote Alaska
Remote Alaska is a set of rural areas in Alaska that are difficult to access and for which all HU addresses are treated as unmailable. There are approximately 30,000 HU addresses and 500 GQs in Remote Alaska. Due to the difficulties in field operations during specific months of the year and the extremely seasonal population in these areas, data collection operations in Remote Alaska differ from those in the rest of the country. In both the main and supplemental HU address samples, the month assigned for each Remote Alaska HU address is based on the county, place, AIANSA, or block group (in that order) in which it is contained. The Census Bureau assigns all designated addresses located in each of these geographical entities to either the first half or the second half of the year in such a way as to balance workloads between the halves of the year and to keep groups of cases together geographically. Addresses are sorted for each month by county and geographical order in the address frame, and all sample addresses are sent directly to CAPI (bypassing the mail and internet modes for the HU sample) in the appropriate month. The GQ sample in Remote Alaska is assigned to the first half or the second half of the year using the same procedure. Up to six months are allowed to complete the HU and GQ data collection for each of the two data collection periods.
Data collection instruments are available to respondents and interviewers in English and Spanish. Respondents may also complete the survey via a phone interview with bilingual staff in Haitian French Creole, German, Hindi, Italian, Korean, Portuguese, Russian, Spanish, Tagalog, Ukrainian, and Vietnamese.
Additional methods for maximizing response are explored as part of several of the methods panel tests. Once details of specific strategies are determined they will be provided as part of a non-substantive change request.
The Census Bureau continuously engages with and responds to stakeholders to adapt the way we gather data, administer the survey, and do business. The ACS Methods Panel Tests program (OMB Control No. 0607-0936) allows the Census Bureau to respond to emerging trends and changes in our nation that spawn new data needs by building on our comprehensive research agenda. This work not only improves the ACS, but also allows the Census Bureau to innovate responsively across key aspects of our work. The ACS Methods Panel Tests program also provides an opportunity to research and test elements of survey data collection that relate to the decennial census.
Testing allows the Census Bureau the opportunity to improve data quality, reduce data collection costs, improve questionnaire content and data collection materials, as well as react to emerging needs.
The Census Bureau will collect and process these data. Within the Census Bureau, please consult the following individuals for further information on their area of expertise.
Statistical Aspects
Steven Hefter Chief, ACS Sample Design Branch
Decennial Statistical Studies Division
Phone: (301) 763-4082
Overall Data Collection
Dameka Reese Assistant Division Chief for Data Collection
American Community Survey Office
Phone: (301) 763-3804
1 Remote Alaska assignments are made so that the sample addresses are approximately evenly distributed between the two data collection periods.
2 All nonmailable and nonresponding addresses in the following areas are now sent to CAPI: all Hawaiian Homelands, all Alaska Native Village statistical areas, American Indian areas with an estimated proportion of American Indian population ≥ 10%.
3 Most of the period assignments are made during Period 1 sampling. The only assignments in Period 2 sampling are made for addresses that were not part of the process in Period 1, e.g., new addresses.
4 Beginning with the 2011 sample the ACS implemented a change to the stratification, increasing the number of sampling strata and changing how the sampling rates are defined. Prior to 2011 there were seven strata; there are now 16 sampling strata. Table 1 gives a summary of these strata and the rates.
5 These are the states where MCDs are active, functioning governmental units.
File Type | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
Author | Mary Reuling Lenaiyasa (CENSUS/PCO FED) |
File Modified | 0000-00-00 |
File Created | 2023-09-06 |