Consumer Expenditure Surveys
1220-0050
November 2016
Supporting Statement
Consumer Expenditure Surveys: Quarterly Interview and Diary
B. Collection of Information Employing Statistical Methods
1. Sampling Method
The Consumer Expenditure (CE) Survey is a nationwide household survey conducted jointly by the U.S. Bureau of Labor Statistics and the U.S. Census Bureau to find out how Americans spend their money. Its data are collected from a representative sample of households drawn in a two-stage sampling design. In the first stage a representative sample of counties from around the United States is selected for the survey. In the second stage a representative sample of households is selected from those counties. This two-stage sampling process is designed to generate a sample of households for which every wealth level is well-represented in the survey. The rest of this section describes these two sampling stages in more detail.
For more details, please refer to the paper by Danielle Neiman et al., “Review of the 2010 Sample Redesign of the Consumer Expenditure Survey” (Attachment V); or “Selecting a Sample of Households for the Consumer Expenditure Survey” by Susan King (Attachment P).
Consumer Units
A consumer unit (CU) is the unit from which the CE seeks expenditure reports. It consists of 1) all members of a housing unit who are related by blood, marriage, adoption, or some other legal arrangement such as foster children; 2) two or more unrelated people living together who pool their incomes to make joint expenditure decisions; 3) a single person sharing a housing unit with unrelated people but who is financially independent of them; or 4) a person living alone.1 There are approximately 128 million CUs in the CE’s universe, and approximately 97 percent of all occupied housing units are occupied by a single CU.2
The following table shows the estimated number of CUs in all 91 strata from which CE’s PSUs were selected.3 Please see Section 2 below entitled “Primary Sampling Units (PSUs)” for more information.
Estimated Number of CUs in CE’s 91 Strata
Stratum Code | Estimated Number of CUs in the Stratum |
S11A | 1,887,339 |
S12A | 8,112,274 |
S12B | 2,473,117 |
S23A | 3,922,393 |
S23B | 1,781,143 |
S24A | 1,388,373 |
S24B | 1,155,728 |
S35A | 2,336,674 |
S35B | 2,306,991 |
S35C | 2,191,776 |
S35D | 1,153,879 |
S35E | 1,123,717 |
S37A | 2,664,186 |
S37B | 2,454,491 |
S48A | 1,738,291 |
S48B | 1,054,479 |
S49A | 5,318,591 |
S49B | 1,797,370 |
S49C | 1,751,542 |
S49D | 1,426,079 |
S49E | 1,283,258 |
S49F | 563,955 |
S49G | 216,890 |
N11B | 2,075,306 |
N11C | 1,755,305 |
N12C | 1,685,635 |
N12D | 1,444,057 |
N12E | 1,627,362 |
N12F | 1,476,875 |
N23C | 1,407,856 |
N23D | 1,350,685 |
N23E | 1,558,206 |
N23F | 1,350,080 |
N23G | 1,626,948 |
N23H | 1,621,504 |
N23I | 1,552,658 |
N23J | 1,420,920 |
N24C | 1,232,971 |
N24D | 1,178,558 |
N24E | 1,363,274 |
N24F | 1,222,144 |
N35F | 1,258,315 |
N35G | 1,095,713 |
N35H | 1,255,291 |
N35I | 1,056,840 |
N35J | 1,282,928 |
N35K | 1,093,284 |
N35L | 1,281,533 |
N35M | 1,064,952 |
N35N | 1,207,732 |
N35O | 1,134,426 |
N35P | 1,285,451 |
N35Q | 1,062,611 |
N36A | 1,048,734 |
N36B | 1,029,656 |
N36C | 1,086,449 |
N36D | 1,161,406 |
N36E | 1,057,350 |
N36F | 993,880 |
N37C | 1,009,958 |
N37D | 1,166,194 |
N37E | 1,054,532 |
N37F | 1,013,583 |
N37G | 1,070,049 |
N37H | 1,142,633 |
N37I | 1,086,616 |
N37J | 1,182,361 |
N48C | 1,338,251 |
N48D | 1,544,012 |
N48E | 1,592,281 |
N48F | 1,329,461 |
N49H | 2,159,289 |
N49I | 2,140,759 |
N49J | 1,916,748 |
N49K | 1,809,097 |
R11D | 270,615 |
R12G | 342,390 |
R23K | 665,686 |
R23L | 560,289 |
R24G | 762,030 |
R24H | 641,689 |
R35R | 639,707 |
R35S | 768,510 |
R36G | 649,952 |
R36H | 583,304 |
R37K | 545,339 |
R37L | 658,333 |
R48G | 199,687 |
R48H | 165,559 |
R48I | 185,479 |
R49L | 296,175 |
Total | 128,000,000 |
Response Rates
The following table shows expected annual sample sizes in 2017-2019 for the Quarterly Interview Survey (CEQ) and the Diary Survey (CED). Each year the sample for the CEQ will include 48,000 addresses, and the sample for the CED will include 12,000 addresses. Of these addresses, 13% are expected to be “Type B/C” noninterviews, which are addresses that are not occupied housing units (they are nonexistent, nonresidential, vacant, demolished, etc.); the other 87% are expected to be occupied housing units. Of those occupied housing units, approximately 37% are expected to be “Type A” noninterviews, which are occupied housing units that do not participate in the survey; the other 63% are expected to be housing units with completed interviews. This is expected to yield approximately 26,300 completed interviews in the CEQ and approximately 13,200 (= 6,600 × 2) weekly diaries in the CED per year.
The response rates shown below are the CEQ’s and CED’s actual response rates over the past five years (2010-2014) minus 5 percentage points. Response rates have been decreasing over time, so the 5-year historical response rates are reduced by 5 percentage points to account for the downward trend.
The sample sizes shown below for 2017-2019 are the annual number of quarterly interviews for CEQ, and the annual number of bi-weekly diaries for CED.
Category | Quarterly Interview | Diary |
Total Sample Size (addresses) | 48,000 | 12,000 |
Type B and C Noninterviews (vacant, demolished, etc.) | | |
Number | 6,200 | 1,600 |
Percent of Total Sample | 13.0 | 13.0 |
Eligible Units (occupied housing units) | | |
Number | 41,800 | 10,400 |
Percent of Total Sample | 87.0 | 87.0 |
Type A Noninterviews | | |
Number | 15,500 | 3,800 |
Percent of Eligible Units | 37.0 | 37.0 |
Completed Interviews | | |
Number | 26,300 | 6,600 |
Percent of Eligible Units (Response Rate) | 63.0 | 63.0 |
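The arithmetic behind the table can be sketched as follows. This is an illustration only, not the procedure BLS used: the function names `round_to_100` and `expected_yield` are hypothetical, and rounding each stage to the nearest 100 addresses is an assumption adopted because it reproduces the figures in the table.

```python
def round_to_100(x):
    """Round to the nearest 100 addresses (assumed rounding convention)."""
    return int(round(x / 100.0)) * 100


def expected_yield(total_addresses, type_bc_rate=0.13, type_a_rate=0.37):
    """Walk a sample of addresses through the noninterview funnel.

    type_bc_rate: share of addresses that are not occupied housing units.
    type_a_rate:  share of occupied housing units that do not participate.
    """
    eligible = round_to_100(total_addresses * (1 - type_bc_rate))
    type_bc = total_addresses - eligible
    type_a = round_to_100(eligible * type_a_rate)
    completed = eligible - type_a
    return {
        "type_bc": type_bc,
        "eligible": eligible,
        "type_a": type_a,
        "completed": completed,
    }


ceq = expected_yield(48_000)   # Quarterly Interview Survey
ced = expected_yield(12_000)   # Diary Survey
weekly_diaries = ced["completed"] * 2  # two one-week diaries per household
```

Under these assumptions the sketch gives 26,300 completed CEQ interviews and 6,600 completed CED households, or 13,200 weekly diaries.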
In 2015 the CEQ and CED began drawing their samples of addresses from a new sampling frame called the Master Address File (MAF), which is essentially a list of all addresses from the 2010 census that is updated twice per year with information from the U.S. Postal Service’s Delivery Sequence File. The CEQ and CED do not yet have much experience with the MAF, but the American Community Survey (ACS) has more, so the estimated Type B/C rate of 13% comes from the ACS’s experience.
For more information on the calculation of response rates, see the memorandum from Sharon Krieger and David Swanson on “Response Rates in the Consumer Expenditure Survey” (2015) (Attachment Q).
In 2008 CE staff conducted a nonresponse bias study to determine whether the missing data from nonrespondents generated any bias in the CEQ’s published estimates. Their study was undertaken in response to an OMB directive. Results from four individual studies were synthesized, and they concluded that no bias was generated in spite of the fact that CE’s data are not “missing completely at random (MCAR).” As they said, “the results from these four studies provide a counterexample to the commonly held belief that if a survey’s data are not missing completely at random then its estimates are subject to nonresponse bias.” For more information, see “Assessing Nonresponse Bias in the Consumer Expenditure Interview Survey” (Attachment R).
2. Collection Methods
Under contract with BLS, field representatives from the U.S. Census Bureau personally visit the households in the Interview and Diary surveys’ samples to collect the data. Prior to the first household visit, respondents are sent an advance letter informing them that they have been selected for the survey and asking them for their cooperation. For subsequent household visits in the Interview survey, respondents are sent an advance letter reminding them that it has been 3 months since they last participated in the survey and asking for their cooperation again.
Field representatives visit each household in the Interview survey’s sample every 3 months for 4 consecutive quarters to collect information on the expenditures they made during the previous 3 months. The field representatives enter the household’s responses into a laptop computer. After participating in the survey for 4 quarters, the household is dropped from the survey and replaced by another household. The households in the Interview survey are on a rotating schedule, with approximately one-fourth of the households in the sample being new to the survey each quarter.
For the Diary survey, field representatives visit each household in the sample three times to collect information on the expenditures they make during a 2-week period. On the first visit the field representatives introduce themselves, explain the survey, and leave a diary in which the household members are asked to record all their expenditures for a 1-week period. On the second visit, the field representatives pick up the first week’s diary, ask whether there are any questions, and leave another diary for the second week. On the third visit, the field representatives pick up the second week’s diary and thank the household for participating in the survey. After participating in the survey for two weeks, the household is dropped from the survey and replaced by another household.
The Diary survey’s data collection procedure will be the same in 2017-2019 except for two changes. Starting in 2017 the field representatives will leave both diaries on the first visit instead of leaving one diary at a time. This is called “double placement.” It reduces data collection costs by eliminating the visit between the first and second weeks, and research shows it has no effect on the quality of the data. The other change is to the diary’s “placement window,” which is the time period allotted to the field representatives to leave or “place” the diaries with the respondents. Before 2017 each household was assigned an “earliest placement day,” which is a specific day of the month, and the field representatives were required to place the first week’s diary within seven days of that date. However, starting in 2017 the field representatives will be given a “placement month” and will be allowed to place the diaries with the respondents anytime during the month. It is hoped this increased flexibility in the placement of diaries will increase the Diary survey’s response rate. Again, research shows that it should have no effect on the quality of the data.
After completing the second week of the Diary survey and the fourth quarter of the Interview survey, the households are sent a Thank You letter and a certificate of appreciation for their participation in the survey.
Primary Sampling Units (PSUs)
The primary sampling units (PSUs) used in the CEQ and CED are small clusters of counties. The number of counties in the PSUs selected for the sample ranges from 1 to 29, with the average number being 5. The set of sample PSUs used in the two CE surveys consists of 91 PSUs, 75 of which are also used in the Consumer Price Index (CPI). The 91 PSUs fall into three categories:
PSU “size class” | Number of PSUs | Description |
S | 23 | Large Metropolitan Core Based Statistical Areas (self-representing PSUs) |
N | 52 | Small Metropolitan Core Based Statistical Areas and Micropolitan Core Based Statistical Areas (non-self-representing PSUs) |
R | 16 | Non-Core Based Statistical Areas (non-self-representing PSUs) |
The BLS selected these PSUs from a stratified sampling design in which the non-self-representing PSUs (the N and R PSUs) were stratified using a 4-variable model whose independent variables were latitude, longitude, median household income, and median household property value. Then one PSU was randomly selected from each stratum with its probability of selection being proportional to its population. For more information on the stratification, please see the paper from Susan King on “Selecting a Sample of Households for the Consumer Expenditure Survey” (Attachment P). Also, for an overview of the CE sample design and the CU selection process, please refer to the memorandum from Jay Ryan on “PSUs for the Consumer Expenditure Survey’s 2010 Census-Based Sample Design” (Attachment T).
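Selecting one PSU per stratum with probability proportional to population can be sketched as below. This is a minimal illustration of the general technique, not BLS's production procedure; the function name `select_psu` and the example PSU names and populations are hypothetical.

```python
import random


def select_psu(stratum, rng=None):
    """Pick one PSU from a stratum with probability proportional to population.

    stratum: list of (psu_name, population) pairs (hypothetical layout).
    rng:     optional random.Random instance, useful for reproducible draws.
    """
    rng = rng or random.Random()
    names = [name for name, _ in stratum]
    populations = [population for _, population in stratum]
    # random.choices draws with replacement using the given relative weights.
    return rng.choices(names, weights=populations, k=1)[0]


# Hypothetical stratum: PSU "B" is three times as likely to be drawn as "A".
example_stratum = [("A", 100_000), ("B", 300_000)]
chosen = select_psu(example_stratum, rng=random.Random(2016))
```

Repeating the draw many times would select "B" in roughly three out of four draws, which is the sense in which the selection is proportional to population.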
Sampling Within PSUs
CE selects its sample of households from the U.S. civilian non-institutional population, which includes people living in houses, condominiums, apartments, and people living in group quarters such as college dormitories or boarding houses. However, it excludes both the non-civilian and institutional portions of the population, such as military personnel living on base, nursing home residents, and prison inmates.
Addresses for the CEQ and CED are selected from two sampling frames maintained by the Census Bureau: the Unit frame and the Group Quarters (GQ) frame. Both frames are derived from the Master Address File (MAF), which is essentially a list of all residential addresses identified in the 2010 census, updated twice per year with information from the U.S. Postal Service. The Unit frame is the larger of the two; it contains both existing and new housing units, has approximately 99% of the MAF’s civilian non-institutional addresses, and is updated twice per year. The GQ frame is much smaller; it has the remaining 1% of the civilian non-institutional addresses and is updated every three years.
A “systematic sample” of households is selected from the two frames in each PSU. The first step in the selection process is sorting the households by variables that are correlated with their expenditures. The purpose of the sort is to ensure that households of every wealth level are well represented in the sample. The first household in the systematic sample is selected from the sorted list using a random number generator. After that, every k-th household down the list is selected, where “k” is the PSU’s sampling interval. The Unit and GQ frames have different sorting variables, but they have the same sampling interval.
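The selection step can be sketched as follows. This is a minimal illustration, assuming the list has already been sorted and that the random start falls within the first k entries (the usual convention for systematic sampling; the text does not spell this out). The function name `systematic_sample` is hypothetical.

```python
import random


def systematic_sample(sorted_addresses, k, rng=None):
    """Select every k-th address from a sorted list after a random start.

    sorted_addresses: addresses already sorted by the stratification variables.
    k:                the PSU's sampling interval (a positive integer).
    rng:              optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    start = rng.randrange(k)          # random start within the first interval
    return sorted_addresses[start::k]


# Hypothetical frame of 100 sorted addresses with a sampling interval of 10.
frame = list(range(100))
sample = systematic_sample(frame, k=10, rng=random.Random(7))
```

Because the list is sorted before the draw, the selected addresses are spread evenly across the sort order, which is what keeps every part of the wealth distribution represented.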
For the Unit frame, the sorting or “stratification” variable is created from the number of occupants in each household, their housing tenure (owner/renter), and the market value of their homes (for owners) or the rental value of their apartment or home (for renters). These variables are used because they are correlated with expenditures: households with more people tend to be wealthier than those with fewer people; homeowners tend to be wealthier than renters; and people living in high-price housing units tend to be wealthier than those in low-price housing units.
In Table 1 below, all the renters are at one end of the stratification and all the owners are at the other end of the stratification. The renters and owners are further subdivided into quartiles based on monthly rental and property values in order to ensure that households of every wealth level are well represented in the survey. Vacant housing units are put in the middle column for the number of household occupants because although they were vacant at the time of the decennial census, when CE’s field representatives visit them most will be occupied and they could be in any of the four non-zero categories. Thus the middle column is their “expected” location. Each cell is assigned a stratification code value, and all addresses in the Unit frame fall into one of these cells. The stratification code is a surrogate for sorting by expenditures.
Table 1. CE Unit Frame Stratification Code Values
Renter/Owner Quartile | Number of Occupants | | | | |
 | 1 person | 2 persons | Vacant | 3 persons | 4+ persons |
Renters 1st Quartile | 10 | 11 | 12 | 13 | 14 |
Renters 2nd Quartile | 25 | 24 | 23 | 22 | 21 |
Renters 3rd Quartile | 30 | 31 | 32 | 33 | 34 |
Renters 4th Quartile | 45 | 44 | 43 | 42 | 41 |
Owners 1st Quartile | 50 | 51 | 52 | 53 | 54 |
Owners 2nd Quartile | 65 | 64 | 63 | 62 | 61 |
Owners 3rd Quartile | 70 | 71 | 72 | 73 | 74 |
Owners 4th Quartile | 85 | 84 | 83 | 82 | 81 |
Other | | | 99 | | |
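The code values in Table 1 follow a regular pattern that can be expressed as a short lookup. The sketch below is inferred from the table itself and is not a documented BLS rule; the function name `stratification_code` and the string labels are hypothetical, and the “Other” code 99 is not handled.

```python
OCCUPANT_COLUMNS = ["1 person", "2 persons", "vacant", "3 persons", "4+ persons"]


def stratification_code(tenure, quartile, occupants):
    """Return the Table 1 code for a tenure, value quartile, and occupant column.

    tenure:    "renter" or "owner"
    quartile:  1 to 4 (rental value for renters, property value for owners)
    occupants: one of OCCUPANT_COLUMNS
    """
    if tenure not in ("renter", "owner") or quartile not in (1, 2, 3, 4):
        raise ValueError("unknown tenure or quartile")
    row = quartile if tenure == "renter" else quartile + 4   # rows 1..8
    col = OCCUPANT_COLUMNS.index(occupants)                  # columns 0..4
    # Odd rows count up from 0; even rows count down from 5.
    units = col if row % 2 == 1 else 5 - col
    return row * 10 + units
```

Sorting by these codes appears to trace a serpentine path through the table: code 14 (renters, 1st quartile, 4+ persons) is followed by code 21 (renters, 2nd quartile, 4+ persons), so neighboring addresses in the sorted list tend to have similar characteristics.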
To draw a systematic sample in the Unit frame, the addresses are sorted by PSU, Federal Information Processing Standards (FIPS) State code, FIPS County code, CE stratification variable (described above), Census Tract code, Census Block code, Street name, Street number, and MAFID code.
To draw a systematic sample in the Group Quarters frame, the addresses are sorted by PSU, FIPS State code, FIPS County code, Census Tract code, CHPCT, and Census Block code. The variable CHPCT is the “percent of college housing.” Research on the college housing population shows it is very different from the rest of the civilian non-institutional population in the GQ frame, so using it as a stratification variable produces a more representative systematic sample of GQ housing.
For more information on sampling within PSUs for the CE Surveys, please refer to the paper from Susan King on “Selecting a Sample of Households for the Consumer Expenditure Survey” (Attachment P).
Estimation
The estimation procedure for both the CED and CEQ follows well-established statistical principles. The final weight for each sample CU is the product of its base weight (which is the inverse of the CU’s probability of selection); an adjustment to account for noninterviews; and a calibration adjustment that post-stratifies the weights to account for population undercoverage. A typical base weight for a CU in the CEQ is approximately 10,000, which means it represents 10,000 CUs – itself plus 9,999 other CUs that were not selected for the survey. A typical final weight is approximately 18,000, which means it represents 18,000 CUs – itself plus 17,999 other CUs that were not selected for the survey and/or did not participate in the survey.
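The weight calculation described above can be sketched as a simple product. The function name `final_weight` is hypothetical, and the adjustment factors of 1.5 and 1.2 in the example are illustrative values chosen only to move a base weight of 10,000 to a final weight of 18,000; the source gives the typical base and final weights but not the individual factors.

```python
def final_weight(selection_probability, noninterview_adjustment, calibration_factor):
    """Final weight = base weight x noninterview adjustment x calibration factor.

    The base weight is the inverse of the CU's probability of selection.
    """
    base_weight = 1.0 / selection_probability
    return base_weight * noninterview_adjustment * calibration_factor


# A CU selected with probability 1 in 10,000 has a base weight of 10,000.
# The two adjustment factors below are hypothetical.
example = final_weight(1 / 10_000, noninterview_adjustment=1.5, calibration_factor=1.2)
```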
For additional information on the sample design and estimation methodology used in the CE surveys, please refer to “Chapter 16, Consumer Expenditures and Income” in the BLS Handbook of Methods (Attachment S); Jay Ryan’s memo to Richard Schwartz, “PSUs for the Consumer Expenditure Survey’s 2010 Census-Based Sample Design,” December 18, 2012 (Attachment T); and Ruth Ann Killion’s memo to Jay Ryan, “Consumer Expenditure Surveys Sample Allocation for Interview Year 2016,” February 11, 2015 (Attachment U).
3. Methods to Maximize Response Rates
In the CE Surveys, keeping the noninterview rate at a low level requires special efforts, particularly from the Census Bureau Field staff. For each refusal case, the regional office sends a special letter to the address and assigns the case for follow-up by the program supervisor, supervisory field representative, or senior interviewer, taking into account time and cost considerations.
To adjust for those noninterviews that the field staff cannot convert to interviews, the sample design provides for a noninterview adjustment in the estimation procedure. The computer processing employs special techniques in the CEQ to reference data provided in the previous interview, to keep recall problems and interview time to a minimum.
4. Testing
Plans
Subject to resource availability, CE plans to conduct the following studies (prior to the expiration of the clearance). Ideally these studies will utilize non-production sample, but funding may necessitate the use of production sample for some tests. A Non-Substantive Change Request (NCR) or full package will be submitted for all of the proposed studies should funding and resources become available.
Test | Survey | Description |
Online Diary Implementation Test | Diary | The purpose of this project is to contract with an outside vendor to test CE's online diary under predetermined protocol conditions to inform decisions about implementing the online diaries in CE production and maintaining the online diaries in the CE redesign. |
Optimal Contact Threshold Field Test (Census) | Interview | This test builds on a 2015 analysis of the optimal threshold number of contact attempts. The evaluation criteria included cost savings, indicators of reporting quality, and response rates. The findings confirmed earlier results suggesting seven as the optimal threshold for contact attempts. That is, attempting contact for sample units beyond that threshold was costly; did not substantively impact sample characteristics; did not improve measurement error as assessed by reporting quality indicators; and increased response rates but without improving sample composition with respect to household size, urbanicity, and homeownership status (and worsened sample composition if sample units’ reluctance/concerns about survey participation were included as another characteristic of interest). The purpose of this project is to evaluate the cost savings, reporting quality, and response rate impact of implementing a threshold of seven contact attempts, in a large sample size field test setting, for difficult-to-interview consumer units that display doorstep concerns related to hostility. |
Gemini Large-Scale Feasibility Test | Interview and Diary | The purpose of this project is to field a test designed to closely reflect all of the components of the redesign, incorporating lessons learned from the web and individual diaries test, the proof-of-concept test, the incentives test, cognitive lab studies, and additional research. Findings from this test will be used in planning the dress rehearsal prior to full implementation of the redesign. |
CEQ Worksheet | Interview | CE plans to test the introduction of a CEQ Worksheet to accompany the Quarterly survey. This auxiliary worksheet is intended to help respondents keep records and report certain expenditures. Since households participating in the CEQ are interviewed once every 3 months and asked to recall all of their expenses during the entire 3-month period, it may be helpful to provide them with a worksheet that they can use during the intervening months to help expedite their next interview. The objective is to develop an optional worksheet that respondents can use between, as well as during, interviews. |
5. Statistical Contacts
The Census Bureau will collect the data. Within the Census Bureau, you may consult the following individuals regarding their areas of expertise for further information.
Sample Design: Stephen Ash (301) 763-4294
Data Collection: Jennifer Epps (301) 763-5342
1 Unrelated people who share a housing unit are considered to be separate CUs if they are responsible for paying their own expenses in at least two of these three categories: shelter, food, and all other expenses. Likewise college students living away from home are considered to be separate CUs from their parents if they are responsible for paying their own expenses in at least two of the three categories.
2 The number of CUs comes from dividing the Census Bureau’s 2015 estimate of the number of people in the civilian non-institutional population (316 million) by the average number of people per CU (2.45).
3 The number of CUs per stratum comes from allocating the nationwide total of 128 million CUs by each stratum’s proportion of the nationwide population in the 2010 census.
Author | FRIEDLANDER_M |
File Created | 2021-01-23 |