Supporting Statement (1220-0050) CE Part B Final 11-21-2023

Consumer Expenditure Surveys: Quarterly Interview and Diary

OMB: 1220-0050


Consumer Expenditure Surveys

OMB Control Number 1220-0050

OMB Expiration Date: June 30, 2025


Supporting Statement For

the Consumer Expenditure Surveys


OMB Control No. 1220-0050


B. Collections of Information Employing Statistical Methods


1. Describe (including a numerical estimate) the potential respondent universe and any sampling or other respondent selection methods to be used. Data on the number of entities (e.g., establishments, State and local government units, households, or persons) in the universe covered by the collection and in the corresponding sample are to be provided in tabular form for the universe as a whole and for each of the strata in the proposed sample. Indicate expected response rates for the collection as a whole. If the collection had been conducted previously, include the actual response rate achieved during the last collection.


The Consumer Expenditure (CE) Survey is a nationwide household survey conducted by the U.S. Bureau of Labor Statistics to find out how households in the United States spend their money. The CE Survey consists of two sub-surveys, a Quarterly Interview survey (CEQ), and a two-week Diary survey (CED). The Interview survey collects detailed information on large expenditures such as property, automobiles, and major appliances, as well as on recurring expenditures such as rent, utilities, and insurance premiums. By contrast, the Diary survey collects detailed information on small, frequently purchased items such as food and apparel. The data from the two surveys are then combined to provide a complete picture of consumer expenditures in the United States.


The data for both surveys are collected from a representative sample of households around the country. Both surveys have the same sample design, which is a two-stage sampling process. In the first stage a representative sample of counties from around the United States is selected for the survey. And then in the second stage a representative sample of households from those counties is selected for the survey. This two-stage sampling process is designed to generate a sample of households in which every demographic group and every wealth level is well-represented. The rest of this section describes the two sampling processes in more detail.


Primary Sampling Units (PSUs)

In the first stage of sampling all 3,143 counties or county equivalents in the United States are partitioned into 1,470 small geographic clusters called “primary sampling units” (PSUs) from which a representative sample of 91 PSUs is randomly selected for the survey. The clusters are the “core-based statistical areas” (CBSAs) defined by the Office of Management and Budget (OMB). They range in size from 1 county to 29 counties with the average size being 5 counties. The same sample of 91 PSUs is used in both the CEQ and CED surveys. The 91 PSUs fall into three size classes:


| PSU size class | Number of PSUs | Description |
|---|---|---|
| S | 23 | Large Metropolitan Core Based Statistical Areas. These are CBSAs with over 2.5 million people, and they are self-representing PSUs. |
| N | 52 | Small Metropolitan Core Based Statistical Areas, and Micropolitan Core Based Statistical Areas. These are CBSAs with under 2.5 million people, and they are non-self-representing PSUs. |
| R | 16 | Non-Core Based Statistical Areas. These are small clusters of counties in “rural” areas created by CE staff, and they are non-self-representing PSUs. |


BLS selected its sample of 91 PSUs from a stratified sample design in which all 23 self-representing PSUs (the S PSUs) were selected for the survey with certainty, while all the non-self-representing PSUs (the N and R PSUs) were stratified into 68 (=52+16) strata using a 4-variable model whose independent variables were latitude, longitude, median household income, and median household property value. Then one PSU was randomly selected from each stratum with its probability of selection being proportional to its population.
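The selection of one PSU per stratum with probability proportional to population can be illustrated with a minimal sketch. The PSU names and populations below are hypothetical, and the function is only an illustration of the general technique, not the procedure BLS actually ran.

```python
import random


def select_psu_pps(stratum_psus, rng):
    """Pick one PSU from a stratum with probability proportional to population.

    stratum_psus: list of (psu_name, population) pairs (hypothetical data).
    rng: a random.Random instance, passed in so the draw is reproducible.
    """
    names = [name for name, _ in stratum_psus]
    populations = [population for _, population in stratum_psus]
    # choices() draws using relative weights; k=1 gives a single PPS draw.
    return rng.choices(names, weights=populations, k=1)[0]


def selection_probabilities(stratum_psus):
    """Return each PSU's probability of selection within its stratum."""
    total = sum(population for _, population in stratum_psus)
    return {name: population / total for name, population in stratum_psus}


# Hypothetical non-self-representing stratum with three candidate PSUs.
stratum = [("PSU-A", 600_000), ("PSU-B", 300_000), ("PSU-C", 100_000)]
chosen = select_psu_pps(stratum, random.Random(2023))
probabilities = selection_probabilities(stratum)
```

In this hypothetical stratum, PSU-A would be selected with probability 0.6, PSU-B with probability 0.3, and PSU-C with probability 0.1. A self-representing (S) PSU needs no such draw because it is selected with certainty.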


All 91 PSUs are used by the CE survey. However, one of CE’s major customers is the Consumer Price Index (CPI), which uses CE’s data for its expenditure weights. Because the CPI is an urban survey rather than a national survey, it uses only the 75 (=23+52) urban PSUs.


Sampling Households Within PSUs

After selecting a sample of PSUs, a sample of households is then selected from the civilian non-institutional portion of those PSUs. That includes people living in houses, condominiums, and apartments, as well as people living in group quarters such as college dormitories and boarding houses. However, it excludes the non-civilian and institutional portions of the population, such as military personnel living on base, nursing home residents, and prison inmates.


Addresses for the CEQ and CED surveys are selected from two sampling frames maintained by the Census Bureau: the Unit frame and the Group Quarters (GQ) frame. Both frames are derived from the Master Address File (MAF), which is basically a list of all residential addresses identified in the 2020 census and which is updated twice per year with information from the U.S. Postal Service. The Unit frame is the larger of the two frames and it contains both existing housing units and newly constructed housing units. It has approximately 99% of the MAF’s civilian non-institutional addresses and it is updated twice per year. The GQ frame is also derived from the MAF, but it is much smaller. It has the remaining 1% of the MAF’s civilian non-institutional addresses and it is updated every three years.


In each PSU, a “systematic sample” of households is selected from the two frames. The first step in the selection process is sorting the households by variables that are correlated with their expenditures. The purpose of this is to ensure that households of every wealth level are well-represented in the sample. In the systematic sampling process the first household in the sample is selected from the sorted list using a random number generator. Then after the initial household is selected every k-th household down the list is selected where “k” is the PSU’s sampling interval. The sampling interval “k” is computed as the number of addresses in the PSU divided by the number of addresses in the PSU that are to be selected for the sample. The Unit and GQ frames have different sorting variables, but they have the same sampling interval.
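The systematic selection described above can be sketched as follows. This is a minimal illustration that assumes the address list has already been sorted; the function name and the example frame are hypothetical.

```python
import random


def systematic_sample(sorted_addresses, n_sample, rng):
    """Select n_sample addresses from an already-sorted list.

    The sampling interval k is the number of addresses in the PSU divided
    by the number to be selected. A random start is drawn within the first
    interval, and then every k-th address down the list is taken.
    """
    n_total = len(sorted_addresses)
    if not 0 < n_sample <= n_total:
        raise ValueError("n_sample must be between 1 and the frame size")
    k = n_total / n_sample                # sampling interval
    start = rng.random() * k              # random start within the first interval
    indexes = [min(int(start + i * k), n_total - 1) for i in range(n_sample)]
    return [sorted_addresses[i] for i in indexes]


# Hypothetical frame of 20 sorted addresses, labeled 100 through 119.
frame = list(range(100, 120))
sample = systematic_sample(frame, 5, random.Random(7))
```

With 20 addresses and a sample of 5, the interval is k = 4, so the selected addresses are spaced four positions apart starting from a random position among the first four.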


Table 1 below shows how the households are sorted in the Unit frame. It has codes ranging from 10 to 99 with the lower codes being for low-wealth households, and the higher codes being for high-wealth households. For the Unit frame, the sorting or “stratification” variable is created from the number of occupants in each household, their housing tenure (owner/renter), and the market value of their home (for owners) or the rental value of their apartment or home (for renters). These variables are used because they are correlated with expenditures: households with more people tend to be wealthier than those with fewer people; homeowners tend to be wealthier than renters; and people living in high-price housing units tend to be wealthier than people living in low-price housing units.


All the renters are at one end of the stratification and all the owners are at the other end of the stratification. The renters and owners are further subdivided into quartiles based on monthly rental and property values to ensure that households of every wealth level are well represented in the survey. Vacant housing units are put in the middle column for the number of household occupants because although they were vacant at the time of the decennial census, when CE’s field representatives visit them most will be occupied, and they could be in any of the four non-zero categories. Therefore the middle column is their “expected” location.


Table 1. CE Unit Frame Stratification Code Values

| Renter/Owner Quartile | Number of Occupants: 1 person | 2 persons | Vacant | 3 persons | 4+ persons |
|---|---|---|---|---|---|
| Renters 1st Quartile | 10 | 11 | 12 | 13 | 14 |
| Renters 2nd Quartile | 25 | 24 | 23 | 22 | 21 |
| Renters 3rd Quartile | 30 | 31 | 32 | 33 | 34 |
| Renters 4th Quartile | 45 | 44 | 43 | 42 | 41 |
| Owners 1st Quartile | 50 | 51 | 52 | 53 | 54 |
| Owners 2nd Quartile | 65 | 64 | 63 | 62 | 61 |
| Owners 3rd Quartile | 70 | 71 | 72 | 73 | 74 |
| Owners 4th Quartile | 85 | 84 | 83 | 82 | 81 |
| Other | | | 99 | | |
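The code values in Table 1 follow a regular pattern: the tens digit identifies the tenure/quartile row, and the units digit runs left to right in odd-numbered quartiles and right to left in even-numbered quartiles, so that sorting by code traces a serpentine path through the table. The sketch below reproduces the table under that reading. The function and label names are hypothetical, and assigning code 99 to any unit that does not fit a row or column is an assumption, since the table does not define “Other.”

```python
# Column order as it appears in Table 1 (vacant units sit in the middle).
COLUMNS = ["1 person", "2 persons", "vacant", "3 persons", "4+ persons"]


def stratification_code(tenure, quartile, occupancy):
    """Return the Table 1 code for a (tenure, quartile, occupancy) cell."""
    valid = (
        tenure in ("renter", "owner")
        and quartile in (1, 2, 3, 4)
        and occupancy in COLUMNS
    )
    if not valid:
        return 99  # assumption: anything unclassifiable falls under "Other"
    row = (quartile - 1) + (4 if tenure == "owner" else 0)  # 0..7
    col = COLUMNS.index(occupancy)                          # 0..4
    tens = 10 * (row + 1)
    # Odd quartiles count up (x0..x4); even quartiles count down (x5..x1).
    return tens + col if quartile % 2 == 1 else tens + 5 - col


example_codes = [
    stratification_code("renter", 1, "1 person"),
    stratification_code("renter", 2, "1 person"),
    stratification_code("owner", 4, "4+ persons"),
]
```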




To draw a systematic sample from the Unit frame, the addresses are sorted first by PSU, then by the State Federal Information Processing Standards (FIPS) code, the County FIPS code, the CE stratification variable described above, Census Tract code, Census Block code, Street name, Street number, and MAFID code.


To draw a systematic sample from the GQ frame, the addresses are sorted first by PSU, then by the State FIPS code, County FIPS code, Census Tract code, CHPCT (the percent of people in the tract living in college housing), and Census Block code. CHPCT is used because people living in college housing units are very different from the rest of the people in the GQ frame, so using it as a stratification variable helps produce a more representative sample.
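The two sort orders can be expressed as ordered lists of sort keys. The sketch below is illustrative only: the field names and the three example records are hypothetical and do not reflect the actual layout of the Census Bureau’s frames.

```python
from operator import itemgetter

# Sort keys in the order given in the text (field names are hypothetical).
UNIT_SORT_FIELDS = (
    "psu", "state_fips", "county_fips", "strat_code",
    "tract", "block", "street_name", "street_number", "mafid",
)
GQ_SORT_FIELDS = ("psu", "state_fips", "county_fips", "tract", "chpct", "block")


def sort_frame(records, fields):
    """Sort address records by the listed fields, most significant first."""
    return sorted(records, key=itemgetter(*fields))


# Three hypothetical Unit frame records that differ in county and stratification code.
common = {"psu": "S12A", "state_fips": "36", "tract": "000100", "block": "1000",
          "street_name": "MAIN ST", "street_number": 12}
unit_records = [
    {**common, "county_fips": "061", "strat_code": 52, "mafid": 3},
    {**common, "county_fips": "061", "strat_code": 14, "mafid": 1},
    {**common, "county_fips": "047", "strat_code": 85, "mafid": 2},
]
sorted_units = sort_frame(unit_records, UNIT_SORT_FIELDS)
sorted_mafids = [record["mafid"] for record in sorted_units]
```

Because the county code precedes the stratification code in the sort order, the record in county 047 comes first even though it has the highest stratification code.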


For more information on the sample design in general, please see the paper by Susan King on “Selecting a Sample of Households for the Consumer Expenditure Survey” (Attachment Q); or the paper by Danielle Neiman et al., “Review of the 2010 Sample Redesign of the Consumer Expenditure Survey” (Attachment R). For more information on the geographic portion of CE’s sample design, please see the memorandum from Adam Safir to Jennifer Epps on “CE sample redesign PSU Memo for Census,” July 21, 2023 (Attachment S).


Consumer Units

A consumer unit (CU) is the unit from which the CE seeks to collect its detailed expenditure information. A CU is basically the same thing as a “household,” so the terms are often used interchangeably. However, there are some technical differences between them. A CU is (1) a group of people living together in a housing unit who are related by blood, marriage, adoption, or some other legal arrangement such as foster care; (2) a group of unrelated people living together in a housing unit who pool their incomes to make joint expenditure decisions; or (3) a person living alone, or sharing a housing unit with other people, who is financially independent of those other people.1 Approximately 99 percent of all occupied housing units have one CU, and the other 1 percent have two or more CUs.


There are approximately 135 million CUs in the United States. The following table shows the estimated number of CUs in each of the 91 strata from which CE’s sample of 91 PSUs was selected.2 The stratum code is a 4-character variable. The first character is the stratum’s size class (S/N/R). The second character is the stratum’s region of the country (1=Northeast, 2=Midwest, 3=South, 4=West). The third character is the stratum’s division of the country (1=New England, 2=Middle Atlantic, 3=East North Central, etc.). The fourth character is a unique identifier of the stratum within its size/region/division class.



Estimated Number of CUs in CE’s 91 Strata

| Stratum Code | Estimated Number of CUs in the Stratum |
|---|---|
| S11A | 2,012,737 |
| S12A | 8,487,236 |
| S12B | 2,543,623 |
| S23A | 3,917,636 |
| S23B | 1,788,888 |
| S24A | 1,509,094 |
| S24B | 1,148,695 |
| S35A | 2,595,054 |
| S35B | 2,500,156 |
| S35C | 2,480,395 |
| S35D | 1,293,296 |
| S35E | 1,158,575 |
| S37A | 3,139,562 |
| S37B | 2,900,904 |
| S48A | 1,973,718 |
| S48B | 1,207,171 |
| S49A | 5,376,795 |
| S49B | 1,934,281 |
| S49C | 1,873,524 |
| S49D | 1,636,850 |
| S49E | 1,343,541 |
| S49F | 592,735 |
| S49G | 220,019 |
| N11B | 2,093,772 |
| N11C | 1,783,635 |
| N12C | 1,630,267 |
| N12D | 1,410,227 |
| N12E | 1,637,424 |
| N12F | 1,506,250 |
| N23C | 1,418,278 |
| N23D | 1,323,519 |
| N23E | 1,631,773 |
| N23F | 1,390,320 |
| N23G | 1,624,024 |
| N23H | 1,563,962 |
| N23I | 1,591,942 |
| N23J | 1,461,301 |
| N24C | 1,261,388 |
| N24D | 1,179,480 |
| N24E | 1,473,526 |
| N24F | 1,325,588 |
| N35F | 1,408,330 |
| N35G | 1,329,856 |
| N35H | 1,325,019 |
| N35I | 1,263,738 |
| N35J | 1,365,207 |
| N35K | 1,086,272 |
| N35L | 1,485,932 |
| N35M | 1,184,213 |
| N35N | 1,321,602 |
| N35O | 1,227,104 |
| N35P | 1,364,252 |
| N35Q | 1,008,645 |
| N36A | 1,078,082 |
| N36B | 1,058,845 |
| N36C | 1,154,509 |
| N36D | 1,231,699 |
| N36E | 1,070,648 |
| N36F | 1,097,232 |
| N37C | 1,188,159 |
| N37D | 1,230,168 |
| N37E | 1,286,922 |
| N37F | 1,079,624 |
| N37G | 1,129,798 |
| N37H | 1,235,772 |
| N37I | 1,085,754 |
| N37J | 1,178,516 |
| N48C | 1,506,783 |
| N48D | 1,766,930 |
| N48E | 1,626,841 |
| N48F | 1,523,209 |
| N49H | 2,339,251 |
| N49I | 2,226,914 |
| N49J | 2,025,901 |
| N49K | 1,989,689 |
| R11D | 266,718 |
| R12G | 322,073 |
| R23K | 622,969 |
| R23L | 555,556 |
| R24G | 721,236 |
| R24H | 613,087 |
| R35R | 595,889 |
| R35S | 727,339 |
| R36G | 631,462 |
| R36H | 541,477 |
| R37K | 530,282 |
| R37L | 604,196 |
| R48G | 203,781 |
| R48H | 154,698 |
| R48I | 185,230 |
| R49L | 301,433 |
| Total | 135,000,000 |
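The 4-character stratum code can be decoded position by position, as in the following sketch. The text names only the first three division codes; the remaining entries in the lookup table are an assumption that the codes follow the Census Bureau’s standard numbering of its nine divisions. The class and function names are hypothetical.

```python
from dataclasses import dataclass

SIZE_CLASSES = {
    "S": "self-representing (large metropolitan CBSA)",
    "N": "non-self-representing (small metropolitan or micropolitan CBSA)",
    "R": "non-self-representing (non-CBSA, rural)",
}
REGIONS = {"1": "Northeast", "2": "Midwest", "3": "South", "4": "West"}
# Codes 1-3 are given in the text; codes 4-9 are assumed to follow the
# standard Census Bureau division numbering.
DIVISIONS = {
    "1": "New England", "2": "Middle Atlantic", "3": "East North Central",
    "4": "West North Central", "5": "South Atlantic", "6": "East South Central",
    "7": "West South Central", "8": "Mountain", "9": "Pacific",
}


@dataclass(frozen=True)
class Stratum:
    size_class: str
    region: str
    division: str
    identifier: str


def parse_stratum_code(code):
    """Split a stratum code such as 'S12A' into its four components."""
    if len(code) != 4:
        raise ValueError(f"stratum code must have 4 characters: {code!r}")
    size, region, division, identifier = code
    return Stratum(SIZE_CLASSES[size], REGIONS[region], DIVISIONS[division], identifier)


largest = parse_stratum_code("S12A")
```

For example, S12A decodes to a self-representing PSU in the Northeast region, Middle Atlantic division, with identifier A.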



Sample Size and Response Rates

The table below shows the expected annual sample sizes and response rates for the CEQ and CED surveys for 2024-2026. The sample sizes were increased from their previous levels due to the CPI program changing the source of its outlet frame information from the Telephone Point of Purchase Survey (TPOPS) to the CEQ and CED surveys. The CEQ’s sample size used to be 48,000 addresses per year but it was increased to 52,700 addresses per year in April 2020, and the CED’s sample size used to be 12,000 addresses per year but it was increased to 17,800 addresses per year in January 2020. The CPI program relied on TPOPS as its source of outlet sampling frame information since 1998, but due to its low response rate the duty of providing outlet information to the CPI program was transferred to the CE program.



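| Category | Quarterly Interview Survey 2024 | Quarterly Interview Survey 2025 | Quarterly Interview Survey 2026 | Diary Survey 2024 | Diary Survey 2025 | Diary Survey 2026 |
|---|---|---|---|---|---|---|
| Total Sample Size (addresses) | 52,700 | 52,700 | 52,700 | 17,800 | 17,800 | 17,800 |
| Type B and C Noninterviews (vacant, demolished, etc.): Number | 9,000 | 9,000 | 9,000 | 3,000 | 3,000 | 3,000 |
| Type B and C Noninterviews: Percent of Total Sample | 17.0 | 17.0 | 17.0 | 17.0 | 17.0 | 17.0 |
| Eligible Units (occupied housing units): Number | 43,700 | 43,700 | 43,700 | 14,800 | 14,800 | 14,800 |
| Eligible Units: Percent of Total Sample | 83.0 | 83.0 | 83.0 | 83.0 | 83.0 | 83.0 |
| Type A Noninterviews: Number | 25,300 | 25,300 | 25,300 | 8,600 | 8,600 | 8,600 |
| Type A Noninterviews: Percent of Eligible Units | 58.0 | 58.0 | 58.0 | 58.0 | 58.0 | 58.0 |
| Completed Interviews: Number | 18,400 | 18,400 | 18,400 | 6,200 | 6,200 | 6,200 |
| Completed Interviews: Percent of Eligible Units (Response Rate) | 42.0 | 42.0 | 42.0 | 42.0 | 42.0 | 42.0 |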


As the table above shows, 83% of the sample addresses are expected to have occupied housing units, and the other 17% are expected to have unoccupied housing units, or to be addresses that are nonexistent, nonresidential, vacant, demolished, etc. Such addresses are called “Type B/C” noninterviews. Then for CEQ, 42% of the occupied housing units are expected to complete an interview, and the other 58% are expected to be “Type A” noninterviews, which are occupied housing units that do not complete an interview. That is expected to yield 18,400 quarterly interviews per year in 2024-2026. For CED, 42% of the occupied housing units are expected to complete an interview, and the other 58% are expected to be “Type A” noninterviews. That is expected to yield 12,400 (= 6,200 × 2) weekly diaries per year in 2024-2026.
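The arithmetic behind these figures can be checked with a short sketch. The function name is hypothetical; the inputs are the expected counts stated above, and the computed percentages agree with the published 17, 83, 58, and 42 percent figures after rounding.

```python
def expected_yield(sample_addresses, type_bc, completes):
    """Derive eligible units, Type A noninterviews, and rates from the counts."""
    eligible = sample_addresses - type_bc   # occupied housing units
    type_a = eligible - completes           # occupied units with no interview
    return {
        "eligible": eligible,
        "type_a": type_a,
        "pct_type_bc": 100 * type_bc / sample_addresses,
        "pct_eligible": 100 * eligible / sample_addresses,
        "pct_type_a": 100 * type_a / eligible,
        "response_rate": 100 * completes / eligible,
    }


ceq = expected_yield(sample_addresses=52_700, type_bc=9_000, completes=18_400)
ced = expected_yield(sample_addresses=17_800, type_bc=3_000, completes=6_200)
ced_weekly_diaries = 2 * 6_200  # each responding CED household keeps two weekly diaries
```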


Nonresponse Bias

In 2022 CE staff completed a nonresponse bias study to determine whether the CEQ and CED surveys’ nonrespondents were “missing completely at random” (MCAR), and whether their missing-ness generated any bias in the published expenditure estimates over the ten-year period 2010-2019. The study was undertaken in response to an OMB directive, and it concluded that the nonrespondents were not MCAR, but that the amount of bias they generated was small.


The MCAR part of the study had four sub-studies. They found different demographic groups had different response rates; respondents had different demographic characteristics than the American population as a whole; respondents’ demographic characteristics changed over time; and a mathematical model predicting response rates had parameters on many of its demographic variables that were statistically significant. Overall, all four sub-studies indicated that CE’s nonrespondents were not MCAR. The most significant finding within these four sub-studies was that high-income households had lower response rates than low-income households, which is a concern because CE is an economic survey that focuses on expenditures, and income is correlated with expenditures.


The bias part of the study also had four sub-studies. They examined four different nonresponse weighting adjustment procedures to get an idea of the range of possible values that the “correct” nonresponse-adjusted expenditure estimates might have. All four procedures increased the CEQ’s expenditure estimates by about one percent from its base-weighted (i.e., unadjusted) values, and all four procedures decreased the CED’s expenditure estimates by about one percent from its base-weighted values. Thus in both surveys CE’s expenditure estimates would have been biased by about one percent if the nonresponse weighting adjustment procedure had been ignored. The consistency of all four nonresponse bias estimates within each survey suggests that the results are robust.


Overall, the study showed that CE’s nonresponse weighting adjustment procedure is working well. The nonrespondents are not MCAR, but the amount of bias they generate is small, and the nonresponse weighting adjustment procedure is doing a good job compensating for the bias. The study provided a counterexample to the commonly held belief that if a survey’s data are not missing completely at random then its estimates are subject to nonresponse bias.


For more information on the calculation of response rates, see the memorandum from Sharon Krieger to David Swanson on “2021 Response Rates for the Interview and Diary Surveys” (Attachment T). For more information on the nonresponse bias studies, see “A Nonresponse Bias Study of the Consumer Expenditure Survey for the Ten-Year Period 2010-2019” (Attachment U).




2. Describe the procedures for the collection of information including:

  • Statistical methodology for stratification and sample selection;

  • Estimation procedure;

  • Degree of accuracy needed for the purpose described in the justification;

  • Unusual problems requiring specialized sampling procedures; and

  • Any use of periodic (less frequent than annual) data collection cycles to reduce burden.


Field representatives (FRs) from the U.S. Census Bureau, under contract with BLS, collect data from CE’s sample households both in person and by telephone. Historically, the preference has been to collect data in person, but during the COVID pandemic interviewing by telephone became the primary way of collecting data. The reason was to prevent the spread of the COVID virus and safeguard the health of CE’s field representatives and the people in the sample households. In 2021 approximately 30% of CEQ’s interviews were conducted in person and 70% were conducted by telephone; for CED, approximately 60% of the interviews were conducted in person and 40% were conducted by telephone. This practice will continue for the foreseeable future. See Attachment F - CED Advanced Letter Procedures and Diary Email Template for additional information on modifications resulting from COVID.


FRs visit or phone each household in the CEQ’s sample every 3 months for 4 consecutive quarters to collect information on the expenditures the households made during the previous 3 months. After participating in the survey for 4 quarters, the household is dropped from the survey and replaced by another household. The households in the CEQ survey are on a rotating schedule with approximately one-fourth of the households in the sample being new to the survey each quarter.


Prior to the first visit, the sample households are sent an advance letter informing them that they have been selected for the survey and asking for their cooperation. For subsequent visits in the CEQ survey, the households are sent an advance letter reminding them that it has been 3 months since they last participated in the survey and asking for their cooperation again. Field representatives enter the household’s responses into a laptop computer.


For the CED survey, field representatives visit or telephone each household in the sample two times to collect information on the expenditures they make during a 2-week period.


On the first visit in the CED survey, the field representatives introduce themselves, explain the survey, and help the households choose between filling out the diaries on paper or online. Households choosing to fill out the diaries on paper are given two weekly diary forms, one for each week of the survey period, while households choosing to fill out the diaries online are given an electronic link to the diary and an Online Diary User Guide. Households are asked to record all the expenditures they make over the 2-week survey period. For the households filling out the diaries on paper, the field representatives make a second visit to pick up the completed diaries, and thank them for participating in the survey. All the households are dropped from the survey after their 2-week period and replaced by other households.


During the COVID pandemic, procedures were modified to allow field representatives to contact households by telephone in lieu of personal visits. Whichever way the households are initially contacted, the field representatives give them three options for filling out the diaries: mailing them a diary form that allows them to fill it out by hand; emailing them a link to a diary form that allows them to fill it out online; or calling them on the telephone and having them report their expenditures orally.


After completing the second week of the CED survey and the fourth quarter of the CEQ survey, the households are sent a Thank You letter and a certificate of appreciation for their participation in the survey.


Estimation

The primary statistic calculated by the CE survey is the average annual expenditure per consumer unit. It is a weighted average whose calculation follows well-established statistical principles. The final weight for each sample CU is the product of its base weight (which is the inverse of the CU’s probability of selection); a nonresponse adjustment factor to account for noninterviews; and a calibration adjustment factor to post-stratify the weights to account for population undercoverage. A typical base weight for a CU in the CEQ is approximately 10,000, which means it represents 10,000 CUs – itself plus 9,999 other CUs that were not selected for the survey. A typical final weight is approximately 30,000, which means it represents 30,000 CUs in the population – itself plus 29,999 other CUs that were not selected for the survey and/or did not participate in the survey.
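The weight construction and the weighted average can be sketched as follows. The adjustment factors and expenditure values are hypothetical, chosen only so that a base weight of 10,000 grows to a final weight of 30,000 as in the typical case described above. The sketch shows a simple weighted mean, not CE’s full production estimator.

```python
def final_weight(base_weight, nonresponse_factor, calibration_factor):
    """Final weight = base weight x nonresponse factor x calibration factor."""
    return base_weight * nonresponse_factor * calibration_factor


def weighted_mean_expenditure(cus):
    """Weighted average expenditure per CU.

    cus: list of (final_weight, annual_expenditure) pairs (hypothetical data).
    """
    total_weight = sum(weight for weight, _ in cus)
    return sum(weight * spend for weight, spend in cus) / total_weight


# Hypothetical factors: 10,000 x 2.0 x 1.5 = 30,000.
typical_final = final_weight(10_000, nonresponse_factor=2.0, calibration_factor=1.5)

# Two hypothetical responding CUs with different final weights.
mean_spend = weighted_mean_expenditure([(30_000, 60_000.0), (10_000, 40_000.0)])
```

In the two-CU example, the CU with the larger weight pulls the mean toward its own expenditure: (30,000 × 60,000 + 10,000 × 40,000) / 40,000 = 55,000.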


For additional information on CE’s sample design and estimation methodology, please see “Chapter 16, Consumer Expenditures and Income” in the BLS Handbook of Methods (Attachment V); see the memorandum from Adam Safir to Jennifer Epps on “CE sample redesign PSU Memo for Census,” July 21, 2023 (Attachment S); and Lauren Vermeer and Sharon Krieger’s memo on “Response Rate Computations for the Consumer Expenditure Survey” (Attachment W).



3. Describe methods to maximize response rates and to deal with issues of non-response. The accuracy and reliability of information collected must be shown to be adequate for intended uses. For collections based on sampling, a special justification must be provided for any collection that will not yield "reliable" data that can be generalized to the universe studied.


Keeping the CEQ’s and CED’s response rates as high as possible requires special efforts, particularly from the Census Bureau’s field staff. The field staff are trained in a variety of techniques designed to persuade people to participate in the survey, such as “refusal conversion” techniques which are designed to change the minds of people who are hesitant to participate in the survey. If someone continues to refuse to participate in the survey, the field office sends a letter trying to persuade them to participate in the survey and a senior interviewer or supervisory field representative is assigned to the case for more refusal conversion efforts. Of course, refusal conversion efforts take time and cost money, so regional office staff try to decide which cases to work on and how much effort to put into them based on cost-effectiveness considerations.


Special computer processing techniques are also used in the CEQ to reduce respondent burden, which in turn helps keep response rates up. For example, some data collected in one interview are carried forward to subsequent interviews, such as data on household members and their personal characteristics, along with data on their properties, mortgages, vehicles, and insurance policies. Minimizing respondent burden, including interview length, is an important factor in the effort to keep response rates up.


When field staff still cannot convert noninterviews to interviews, the estimation process has a noninterview adjustment to account for them. As mentioned above, every CU in the sample has a base weight equal to the number of CUs in the population it represents. In this process the respondent CUs have their weights increased to account for the nonrespondent CUs. The total sample of CUs (both respondents and nonrespondents) is partitioned into 192 subsets based on their region, CU size, income, and number of contact attempts.3 Then within each subset the base weights of the respondents are increased by multiplying them by a factor equal to the sum of the base weights for all CUs (both respondents and nonrespondents) divided by the sum of the base weights from just the respondent CUs. This makes the final weights of the respondents add up to the total number of CUs in the population.
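The noninterview adjustment described above can be sketched as a weighting-class adjustment. The record layout and the example cell labels are hypothetical; in the actual procedure each cell is one of the 192 combinations of region, CU size, income, and number of contact attempts.

```python
from collections import defaultdict


def nonresponse_adjust(cus):
    """Inflate respondents' base weights to cover nonrespondents in the same cell.

    cus: list of dicts with keys 'cell', 'base_weight', and 'responded'.
    Returns the respondent records with an added 'adjusted_weight'.
    """
    all_weight = defaultdict(float)          # sum of base weights, all CUs
    respondent_weight = defaultdict(float)   # sum of base weights, respondents only
    for cu in cus:
        all_weight[cu["cell"]] += cu["base_weight"]
        if cu["responded"]:
            respondent_weight[cu["cell"]] += cu["base_weight"]

    adjusted = []
    for cu in cus:
        if cu["responded"]:
            factor = all_weight[cu["cell"]] / respondent_weight[cu["cell"]]
            adjusted.append({**cu, "adjusted_weight": cu["base_weight"] * factor})
    return adjusted


# Hypothetical sample: two cells, five CUs, two respondents.
sample_cus = [
    {"cell": "A", "base_weight": 10_000, "responded": True},
    {"cell": "A", "base_weight": 10_000, "responded": False},
    {"cell": "A", "base_weight": 10_000, "responded": False},
    {"cell": "B", "base_weight": 12_000, "responded": True},
    {"cell": "B", "base_weight": 10_000, "responded": False},
]
respondents = nonresponse_adjust(sample_cus)
adjusted_total = sum(cu["adjusted_weight"] for cu in respondents)
base_total = sum(cu["base_weight"] for cu in sample_cus)
```

In cell A the single respondent’s weight is multiplied by 3, so it also represents the two nonrespondents. Across both cells, the adjusted weights of the respondents sum to the same total as the base weights of all sampled CUs.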




4. Describe any tests of procedures or methods to be undertaken. Testing is encouraged as an effective means of refining collections of information to minimize burden and improve utility. Tests must be approved if they call for answers to identical questions from 10 or more respondents. A proposed test or set of tests may be submitted for approval separately or in combination with the main collection of information.


CE plans to perform the following tests if funding is available and will submit an Information Collection Request (ICR), as needed:


Records Path and Incentives Feasibility Test ~ 2024

The purpose of this project is to develop and test the proposed records path for select sections in the CE Interview survey in addition to testing protocols for providing a targeted, promised incentive for respondents to use specific records during their interview. The results of this test will attempt to determine the feasibility and impact of the proposed records path and associated incentives protocol in the CE survey. The test results will improve BLS’ understanding of the operational issues underlying the implementation of a records path with targeted incentives, including respondent and interviewer reactions, impact on interview time, and associated data quality.



Conduct a Diary Performance-Based Incentives Field Test ~ 2025

The purpose of this project is to develop and test field performance-based incentives in the diary survey with a focus on improving response, engagement, and quality. Previous results have shown that performance-based incentives are effective in increasing sample in the Interview survey, albeit moderately. However, no such test has been done of performance-based incentives in an independent diary. Further, previous results have raised some concerns that performance-based incentives may introduce bias (though the finding was not significant) and this should be evaluated in the context of the diary survey.




5. Provide the name and telephone number of individuals consulted on statistical aspects of the design and the name of the agency unit, contractor(s), grantee(s), or other person(s) who will actually collect and/or analyze the information for the agency.


The sample design is a joint effort between BLS and the Census Bureau, with the two bureaus focusing on different aspects of the sample design. BLS focuses on the PSUs, and the Census Bureau focuses on the households. For more information on the sample design or the data collection effort, you may contact the following individuals.


Sample Design:

David Swanson (BLS), (202) 691-6917

James Farber (Census), (301) 763-1844

Data Collection:

Janel Brattland (BLS), (202) 691-5427

Jennifer Epps (Census), (301) 763-5342


1 Unrelated people who share a housing unit are considered to be separate CUs if they are responsible for paying their own expenses in at least two of these three categories: food, shelter, and all other expenses. Likewise college students living away from home are considered to be separate CUs from their parents if they are responsible for paying their own expenses in at least two of these three categories.

2 The number of CUs comes from combining information about the total number of housing units in the Census Bureau’s sampling frames (the MAF) with observations made by CE’s field representatives about the number of CUs living in those housing units. CE’s observations in the field show the average number of CUs per occupied housing unit is approximately 1.015. For every 1,000 occupied housing units there are approximately 1,015 CUs. The number of CUs per stratum shown in the table comes from allocating the nationwide total of 135 million CUs by the number of people living in each stratum according to the 2020 census.

3 There are 4 regions of the country, 4 CU size classes, 3 income classes, and 4 contact attempt classes, making 192 = 4 x 4 x 3 x 4 subsets into which the sample is partitioned. For nonrespondents the number of people in the CU is obtained from data collected in previous interviews or from talking to their neighbors. For all CUs (both respondents and nonrespondents) their income is estimated from a publicly available database from the IRS which has the average household income by ZIP code. In the nonresponse adjustment process every CU is assumed to have its ZIP code’s average income value.



File Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
File Title: Changes in section A
Author: FRIEDLANDER_M
File Modified: 0000-00-00
File Created: 2023-12-12
