The memorandum and attached document(s) were prepared for Census Bureau internal use. If
you have any questions regarding the use or dissemination of the information, please contact
the Stakeholder Relations Staff at [email protected].
2020 CENSUS PROGRAM INTERNAL MEMORANDUM SERIES: 2019.15.i
Date:
April 25, 2019
MEMORANDUM FOR: The Record
From:
Deborah M. Stempowski (signed April 25, 2019)
Chief, Decennial Census Management Division
Subject:
2020 Census Evaluation: Reengineered Address Canvassing Study Plan
Contact:
Jennifer Reichert
Decennial Census Management Division
301-763-4298
[email protected]
This memorandum releases the final version of the 2020 Census Evaluation: Reengineered Address
Canvassing Study Plan, which is part of the 2020 Census Program for Evaluations and Experiments
(CPEX). For specific content related questions, you may also contact the authors:
Nancy R. Johnson
Decennial Statistical Studies Division
301-763-3639
[email protected]
Eric V. Slud
Center for Statistical Research and Methodology
301-763-4991
[email protected]
2020 Census Evaluation:
Reengineered Address Canvassing
Operation Study Plan
Final
Author: Nancy Johnson, DSSD
Contributing: Eric Slud, CSRM
April 22, 2019
Version 1.1
Table of Contents
I. Introduction
II. Background
III. Assumptions
IV. Research Questions
V. Methodology
VI. Data Requirements
VII. Risks
VIII. Limitations
IX. Issues That Need to be Resolved
X. Division Responsibilities
XI. Review/Approval Table
XII. Document Revision and Version Control History
XIII. Glossary of Acronyms
XIV. References
XV. Appendix
List of Tables
Table 1: Sampling Strata and Sample Sizes for Suppressed Addresses
Table 2: Sampling Strata and Sample Sizes for Salted Addresses
I. Introduction
In an effort to reduce costs, the U.S. Census Bureau has reengineered address canvassing for the
2020 Census to include a suite of In-Office and In-Field Address Canvassing operations (U.S.
Census Bureau, 2018b). Because these operations include new or revised methods, evaluating
their accuracy and effectiveness is important when considering either improvements for the 2030
Census or new approaches to conducting the census (JASON, 2016).
This evaluation will focus on selected components of the reengineered Address Canvassing
operation. Specifically, it will estimate certain types of errors that can occur during In-Field
Address Canvassing and will compare these estimates to results from previous studies. In
addition, the evaluation will investigate the effectiveness of the Interactive Review and of the In-Office Address Canvassing operation as a whole.
II. Background
To conduct and tabulate the decennial census, the Census Bureau needs the address and physical
location of each living quarters in the United States and Puerto Rico. A complete and accurate
list ensures residents will be invited to participate in the census and that the census will count
residents in their correct locations. The Address Canvassing operations are key components in
the creation of an accurate address list.
Historically, Address Canvassing field staff, referred to as listers, traversed almost every block in
the United States and Puerto Rico, comparing their observations on the ground with the Census
Bureau’s address list. Listers verified or corrected addresses that were on the list, added new
addresses to the list, and deleted addresses that no longer existed. Listers also collected map spot
(coordinate) locations for each structure and added new streets. However, this method is
expensive. During the full In-Field Address Canvassing operation for the 2010 Census, 8,213
crew leaders managed 111,105 listers during production listing and 3,083 crew leaders managed
37,784 listers during quality control listing (U.S. Census Bureau, 2012) for a field execution cost
of about $445 million. Additional costs were incurred for field infrastructure and information
technology infrastructure support for a total cost of about $845 million (Holland, 2012).
As part of the revised design for the 2020 Census, the Census Bureau identified four major
innovation areas. The reengineered Address Canvassing operation is one of those innovation
areas. While Address Canvassing will cover the entire nation in 2020, the Census Bureau has
determined that a full In-Field Address Canvassing is no longer necessary (U.S. Census Bureau,
2017). Address Canvassing now includes a suite of operations, conducted both in the field and in
the office, that will update the address list and map data for the 2020 Census enumeration.
A. Address Canvassing
Overview
In-Office Address Canvassing is the process of using empirical geographic evidence (e.g., imagery
and comparison of the Census Bureau’s address list to partner-provided lists) to assess the current
address list. This process detects and identifies change using high quality imagery, administrative
data, and third-party data sources to reduce the In-Field Address Canvassing workload. In-Office
Address Canvassing includes five components: Interactive Review (IR); Active Block Resolution
(ABR); Ungeocoded Resolution; In-Office Address Canvassing Group Quarters/Transitory
Locations; and Local Update of Census Addresses (LUCA) Address Validation. An additional
process monitors potential change through “triggers” as described in the section on Change
Monitoring. Each component includes a Quality Control (QC) process. All of these components
update the address list prior to enumeration operations.
Master Address File Updates
The Master Address File (MAF) serves as the base of the census frame and was first used to
support the 2000 Census operations. Each address in the MAF is designed to be linked to a
geographic location in the Topologically Integrated Geographic Encoding and Referencing
(TIGER) database, the Census Bureau’s spatial feature database. This linkage ensures that the
census data are processed and tabulated in the correct geographic location. After 2009, when the
decennial census conducted a large In-Field Address Canvassing operation to update the
MAF/TIGER systems, the Geographic Support System (GSS) was developed to provide current,
accurate, and complete address, feature, and boundary data. The Census Bureau currently
conducts several operations to validate and update the census address list in preparation for the
2020 Census. One of the operations uses the United States Postal Service’s Delivery Sequence
File (DSF), which contains a list of all updated address information and is used to update the
MAF twice a year (U.S. Census Bureau, 2017).
Interactive Review
In-Office Address Canvassing began in September 2015 with the Interactive Review (IR) of
blocks to identify coverage issues that exist in the MAF (where the MAF counts do not reflect
the housing units observed in current imagery) and to identify stability, growth, or decline in
housing compared to the 2010 Census residential landscape. During IR, staff reviewed blocks by
comparing baseline satellite images from the time of 2010 Census operations to current images
to assess the extent to which the number of addresses in the census address list—both housing
units (HUs) and group quarters (GQs)—was consistent with the number of addresses visible in
current imagery. The staff identified blocks that contain residential growth and decline, blocks
that contain MAF overcoverage and undercoverage,[1] and the capacity of blocks to contain
additional living quarters in the future. After IR, each block received one of three high-level
statuses:
[1] Overcoverage occurs when the address list contains an address that does not exist on the ground or when there are multiple instances of an address for the same residential structure (that is, duplicates). Undercoverage occurs when the address list is missing residential addresses that exist on the ground.
1. Passive – A passive status indicates that staff did not see any change in housing between
the baseline image and the current image. In addition, the number of housing units
observed in imagery matched the number of addresses in the MAF; staff detected no
overcoverage or undercoverage.
2. On Hold – A block may be placed on hold for one of several reasons (e.g., poor imagery,
cloud cover, or detected future growth).
3. Active – An active block has some indication of residential growth or decline since the
2010 Census, or coverage differences identified in IR. That is, the IR staff set one or
more electronic markers, or pins, during their review to indicate that the MAF/TIGER
System data are inconsistent with imagery. Therefore, the block requires further
assessment to either fix the coverage concerns with a MAF/TIGER System update or to
assign the block to In-Field Address Canvassing.
Blocks placed on hold may be triggered into re-review when new imagery becomes available.
Blocks designated as active were sent to ABR, where some were resolved, and some remained in
an active status.
Active Block Resolution
Active Block Resolution (ABR) seeks to research and update areas identified with growth,
decline, undercoverage of addresses, or overcoverage of addresses in the MAF. The ABR staff
use several data sources to update the MAF and to resolve IR Active blocks. The ABR program
was in place beginning in April 2016 and was discontinued in February 2017 because of funding
uncertainty and reprioritization of critical components of the 2020 Census. Because this program
was performed leading up to the 2020 Census Address Canvassing operation as a part of the In-Office Address Canvassing activities that began in 2015, it is considered part of the 2020 Census
Address Canvassing operational design.
Ungeocoded Resolution
Ungeocoded Resolution is an activity designed to resolve ungeocoded records (addresses that are
not assigned to a block) by adding or editing spatial features and address ranges in the
MAF/TIGER System.
In-Office Address Canvassing Group Quarters/Transitory Locations
In-Office Address Canvassing Group Quarters[2]/Transitory Locations[3] is an activity designed to
identify, validate, and update living quarters that the Census Bureau classified as a group quarters or
transitory location. Staff conduct research using administrative data, local Geographic Information
System data, public and commercial information, and, in some cases, phone calls that are made to
administrative contacts. Although this activity began in 2017, the Census Bureau decided to suspend
this component of In-Office Address Canvassing from March 2018 through September 2018 because
of budget constraints.
Local Update of Census Addresses Address Validation
Local Update of Census Addresses (LUCA) Validation is an In-Office Address Canvassing activity
that reviews address lists from outside partners. The LUCA program provides the opportunity for
tribal, state, and local governments to review and comment on the Census Bureau’s address list to
ensure an accurate and complete enumeration of their communities. The LUCA Validation is
designed to use office research to validate submissions provided by these entities.
Change Monitoring
A “trigger” is an automated process or event that provides information or data that suggests a
block should be re-reviewed through IR or sent for canvassing in the field. The temporal
difference between In-Field Address Canvassing for the 2020 Census and the first IR was the
reason behind the development of “trigger” events; In-Field Address Canvassing for the 2020
Census will occur several years after the first IR was completed. In an attempt to record changes
that occur within blocks after the latest IR review or ABR review, and to re-review blocks that at
some point cause uncertainty regarding their latest In-Office Address Canvassing status, some
blocks get “triggered” for re-review to determine whether the status of the block has changed.
In-Field Address Canvassing
In-Field Address Canvassing is an operation in which listers visit specific geographic areas—
Basic Collection Units (BCUs)—to identify every place where people could live or stay. Using
the Listing and Mapping Application (LiMA[4]), listers compare what they see on the ground to
the existing census address list and either verify or correct the address and location information.
Listers knock on every door to verify address information, collect associated mailing address
information, and collect information about any additional housing units present at the address.
Listers also classify each living quarters as a housing unit, group quarters, or transitory location.

[2] Group quarters are places where people live or stay, in a group living arrangement, which is owned or managed by an entity or organization providing housing and/or services for the residents. This is not a typical household-type living arrangement. These services may include custodial or medical care as well as other types of assistance, and residency is commonly restricted to those receiving these services. People living in group quarters are usually not related to each other. Group quarters include such places as college residence halls, residential treatment centers, skilled nursing facilities, group homes, military barracks, correctional facilities, and workers' dormitories.
[3] Transitory locations are recreational vehicle parks, campgrounds, hotels, motels, marinas, racetracks, circuses, and carnivals.
[4] The LiMA is an application that aids listers with constructing or updating address lists. It contains map information that helps listers find their canvassing assignments and provides a means to collect coordinate information for each address.
Quality Control
Quality Control (QC) is the process of reviewing the work of office and field staff. For the
operational design of 2020 Census Address Canvassing, QC is an integrated part of each of the
components that make up the Address Canvassing operation. The QC program for Address
Canvassing is responsible for devising a plan to ensure quality of the In-Office Address Canvassing
and In-Field Address Canvassing work. For the QC of In-Field Address Canvassing, this means
ensuring proper execution of duties by field staff. For In-Office Address Canvassing, QC strategies
include additional review and adjudication of work. It also includes a process for informing
individual analysts of errors that is intended to reduce their future error rate.
B. Filters
The decennial census operations[5] use snapshots of the MAF, known as extracts, to retrieve the
latest information on the nation’s addresses. A set of rules, called a filter, is applied to the MAF
in an attempt to maximize the number of valid MAF units, while minimizing the number of
invalid units on the MAF extracts. These extracts provide the basis for the address frames used in
census operations. The In-Field Address Canvassing Extract is sent to the Listing and Mapping
Application (LiMA) and forms the basis for the dependent address list, which the listers update
as they canvass the BCUs. In general, the In-Field Address Canvassing filter rules rely on
categorical variables such as when a unit was added to the MAF, the source of data that added
the unit, its residential status, and outcomes from past field operations to determine whether or
not an address is valid for its operation. Some components of the In-Office Address Canvassing
also use filters either in the creation of an address list or in producing counts of addresses in
blocks.
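To make the filter concept concrete, the sketch below expresses a few such categorical rules in code. The record fields, source codes, and rule logic are hypothetical assumptions for illustration; this is not the actual 2020 Census filter specification.

```python
# A minimal sketch of filter rules applied to MAF records; the fields and
# rules below are hypothetical, not the actual 2020 Census filter.
from dataclasses import dataclass

@dataclass
class MafUnit:
    added_by: str             # source that added the unit, e.g., "DSF"
    residential: bool         # residential status flag
    last_field_outcome: str   # e.g., "VERIFIED", "DELETED", "NONE"

def passes_infield_filter(unit: MafUnit) -> bool:
    """Return True if the unit belongs on the In-Field Address
    Canvassing extract under these illustrative rules."""
    if not unit.residential:
        return False
    if unit.last_field_outcome == "DELETED":
        return False
    return unit.added_by in {"DSF", "FIELD", "LUCA"}

units = [MafUnit("DSF", True, "VERIFIED"), MafUnit("DSF", False, "NONE")]
extract = [u for u in units if passes_infield_filter(u)]  # keeps only the first
```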
C. Post-Enumeration Survey
Because several of the evaluation research questions use data from the Post-Enumeration Survey
(PES), this section provides background on the PES operations.
Overview
To measure the coverage of the 2020 Census, the U.S. Census Bureau will conduct the PES. The
2020 PES will provide estimates of census net coverage error and components of census
coverage for housing units[6] and people living in housing units in each state, the District of
Columbia, and Puerto Rico, excluding remote Alaska. The components of census coverage
include correct enumerations, omissions, and erroneous enumerations, including duplicates.
[5] The Census Bureau's surveys and other estimation programs also use extracts from the MAF that are produced from filters defined by the needs of the surveys or programs.
[6] Group quarters are out-of-scope for the PES.
The 2020 PES is a large, complex survey that collects housing unit and person information
(independent from the 2020 Census operations) for a sample of housing units in selected areas
across the country. The survey has two separate samples: 1) the Population sample (P sample)
and 2) the Enumeration sample (E sample). The P sample consists of independently listed
housing units within an area-based sample of BCUs. The source of the E sample is the census
housing units and census person enumerations in housing units geocoded to the sample of BCUs
selected for the P sample.
The PES includes several activities, such as:
• Selecting the BCU sample.
• Conducting the Independent Listing (IL).
• Subsampling small BCUs.
• Conducting the Initial Housing Unit (IHU) Matching and Followup (IHUFU).
• Identifying the P sample.
• Identifying the Person Interview (PI) Sample.
• Conducting the Person Interview.
• Identifying the E sample.
• Conducting Person Matching and Followup.
• Conducting Final Housing Unit (FHU) Matching and Followup (FHUFU).
• Estimation.
The next sections summarize the activities listed above. The majority of the description is a
summary from Kennel (2019). For details on the PES sample design, see Hill et al. (2019).
Selecting the BCU Sample
The first production operation will be the selection of the BCUs from the BCU frame, which
includes all BCUs for the 50 states, D.C., and Puerto Rico.[7] (Note: For the remainder of this
document, references to state also include D.C. unless otherwise specified.) The BCU frame will
exclude BCUs that are in remote areas of Alaska determined by the census type of enumeration
areas (TEAs) and BCUs that are fully covered by water. The eligible BCUs will be stratified by
state, size, percent of homeowners, and an American Indian Reservation indicator.
Conducting the Independent Listing
Listers will canvass the sample BCUs and construct a list of housing units. Using the LiMA,
listers will identify the location of all housing units by collecting GPS coordinates. Group
Quarters are out of scope, but housing units within transitory locations will be listed.
Approximately 564,500 housing units are expected to be listed (541,000 housing units in the
U.S. and 23,500 housing units in Puerto Rico) in approximately 10,400 BCUs (10,000 in the
U.S. and 400 in Puerto Rico).
[7] Puerto Rico is in the Update Leave Type of Enumeration Area and will not be listed by In-Field Address Canvassing.
Subsampling Small BCUs
The Independent Listing results and preliminary census housing unit counts are used to stratify
and determine the subsampling rates for small BCUs. Small BCUs are subsampled using variable
rates to minimize variance on the coverage estimates and to increase field efficiency.
Initial Housing Unit Matching and Followup
The matching starts with a computer match of the independent listings against the census records
on the Enumeration Extract within each sample BCU and one ring of surrounding BCUs. A
match results in four possible outcomes for each address:
1) Matched
2) Possibly matched
3) Not matched
4) Potential duplicate
The possibly matched addresses, not matched addresses, and potential duplicates go to the Initial
Housing Unit Before Followup Clerical Matching. The National Processing Center (NPC)
matching staff will use computer-assisted clerical matching techniques, along with maps, to
review and attempt to resolve the match status of the possibly matched addresses and the not
matched addresses. In addition, the matching staff search for duplicate census addresses. Cases
that remain unresolved after the clerical matching will go to the Initial Housing Unit Followup.
In the Initial Housing Unit Followup, the interviewers will collect additional information that
may allow resolution of the match status, and they attempt to resolve the potential duplicates.
This field operation will include a QC component.
The NPC staff will use the information from the Initial Housing Unit Followup to match the
unresolved cases. The result of this operation is a file containing match codes for listed housing
units and census housing units in the sample BCUs.
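The four-way outcome assignment described above can be sketched as a simple keyed comparison. The real PES computer match uses much richer blocking and scoring criteria, so the rule below is only a hypothetical illustration with invented matching logic.

```python
# Hypothetical sketch of the four-way match outcome assignment; the real
# PES computer match uses richer blocking and probabilistic scoring.
from collections import Counter

def match_outcome(il_address: str, census_addresses: list[str]) -> str:
    norm = lambda a: " ".join(a.upper().split())
    counts = Counter(norm(a) for a in census_addresses)
    key = norm(il_address)
    if counts[key] > 1:
        return "potential duplicate"   # same census address appears twice
    if counts[key] == 1:
        return "matched"
    # crude stand-in for fuzzy matching: same house number, different street text
    if any(key.split()[:1] == c.split()[:1] for c in counts):
        return "possibly matched"
    return "not matched"

print(match_outcome("101 Main St", ["101 MAIN ST", "103 MAIN ST"]))  # matched
```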
Identifying the P Sample
The source of the P sample housing units is the set of IL units determined to be housing units
or potential housing units after the IHU Matching and Followup operations. In BCUs containing
57 or fewer IL housing units, all the IL units are in the P sample. For BCUs having 58 or more
IL housing units, a subsample of IL units is selected for the P sample. In the American Indian
Reservation stratum, all housing units are included in the P sample.
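The take-all/subsample rule just described is simple enough to state in code. The subsampling rate below is a placeholder assumption, since the actual rates depend on the PES design.

```python
# Sketch of the P sample take-all/subsample rule; the 50 percent rate for
# large BCUs is a placeholder, not the actual PES subsampling rate.
import random

def select_p_sample(il_units: list[str], in_air_stratum: bool,
                    rate: float = 0.5, seed: int = 2020) -> list[str]:
    # Take all units in the American Indian Reservation stratum or in
    # BCUs with 57 or fewer IL housing units; otherwise subsample.
    if in_air_stratum or len(il_units) <= 57:
        return list(il_units)
    k = max(1, round(rate * len(il_units)))
    return random.Random(seed).sample(il_units, k)

units = [f"unit{i}" for i in range(80)]
print(len(select_p_sample(units, in_air_stratum=False)))  # 40
```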
Identifying the Person Interview Sample
In this phase of sampling, the addresses that go to the Person Interview (PI) will be selected. For
a BCU with 57 or fewer housing units observed, all of the housing units will be included in the
PI sample. For a BCU with 58 or more housing units observed, a subsample of segments of
contiguous housing units will be selected.
After selecting the addresses, the PI sample is expected to be approximately 171,500 housing
units in the U.S. and 7,800 in Puerto Rico. The sample will be distributed among the states
roughly proportional to the population size. However, states with small populations and
American Indian Reservations will have slight increases in sample.
Person Interview
For each sample BCU, enumerators visit people in selected housing units. During the Person
Interview (PI) operation, enumerators use an automated instrument to obtain information about:
• Current residents of the sample housing unit (name, sex, age, date of birth, race,
relationship, and Hispanic origin).
• People who moved out of the sample housing unit between Census Day and the time of
the PI (outmovers).
For QC purposes, the PI operation will include a reinterview.
Identifying the E sample
The E sample will be identified after the PI but before the Person Matching begins. A primary
goal of E sample identification is to identify changes (primarily adds and deletes) to the census
housing units in sample BCUs between In-Field Address Canvassing and the final census
universe.
The sampling frame for the E sample is the final list of valid housing units on the Census
Unedited File (CUF) that are in the same sample BCUs as the P sample. The CUF contains the
final inventory of census housing unit addresses and will likely differ from the preliminary list of
census addresses.
Person Matching and Followup
Before the Person Matching operations begin, an automated operation assigns a residence status
code to all people listed in the PI. In addition, an automated operation assigns geocodes to
alternate and inmover addresses collected during the PI. For addresses where the automated
operations cannot assign codes, clerical operations attempt to make the code assignments.
The Person Computer Matching will attempt to search for matches between people rostered at
the sample address during the PI and the people enumerated in the census in the sample BCU
and the surrounding ring of BCUs. Alternate and inmover addresses collected in the PI and
geocoded during automated or clerical geocoding provide other places to search for matches
between the PI roster and census people. In addition, the computer matching will conduct a
nationwide search for matches. However, matching people between Puerto Rico and the states is
out of scope. The computer match includes a search for duplicate people.
During the Person Before Followup Clerical Matching, the matching staff assign the status of
match, possible match, or nonmatch to the PI and census person records. In addition, the
matching staff searches for duplicates. Cases with an unresolved match status, enumeration
status, or residence status are eligible for the Person Followup (PFU).
During the PFU, interviewers contact cases to resolve issues. The PFU is sometimes conducted
outside the sample BCUs to follow up on links found during the nationwide search.
The Person After Followup Clerical matching uses the information from the PFU in an attempt
to resolve the match, enumeration, or residence status.
Each component of the Person Matching and Followup includes QC checks.
Final Housing Unit Matching and Followup
The Final Housing Unit Computer Processing will determine which housing units will go to
Final Housing Unit Clerical Matching. These housing units are:
• Housing units added to the census after the preliminary list (i.e., Enumeration Extract)
was created.
• Listed housing units matched to a census unit that was deleted from the preliminary list.
In the Final Housing Unit Before Followup Clerical Matching, the NPC matching staff use
computer-assisted matching techniques, along with the Independent Listing maps and census
maps, to assign match, possible match, or nonmatch codes to the addresses from the Final
Housing Unit Computer Processing. Unresolved addresses go to the Final Housing Unit
Followup.
In the Final Housing Unit Followup, interviewers collect information about the unresolved
addresses. The information collected will vary depending on the reason for the case going to
followup.
During the Final Housing Unit After Followup Matching, the NPC matching staff use the
information collected by the interviewers to attempt to resolve the status of the addresses. This is
the last step before PES estimation.
Each component of the Final Housing Unit Matching and Followup includes QC checks.
Post-Enumeration Survey Estimation for Housing Units
Note: Because this evaluation does not include any estimates for people, this section only
summarizes the estimation methodology for housing units.
The PES Estimation process consists of several operations, which lead to estimates of coverage
for both housing units and people in housing units. This process includes estimates of net
coverage and the components of census coverage. The estimation operations for housing units
consist of:
• Imputing missing data for the P sample housing units.
• Imputing missing data for the E sample housing units.
• Using the dual system estimate methodology to determine the housing unit net coverage error (see the sketch after this list).
• Computing the housing unit components of census coverage.
• Calculating uncertainty estimates (e.g., standard errors and mean squared errors).
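For intuition, the dual system estimate for housing units follows the standard capture-recapture form. The sketch below uses made-up weighted totals; the production PES estimator applies further adjustments (e.g., for unresolved and erroneous cases) not shown here.

```python
# Textbook dual system (capture-recapture) estimate for housing units.
# All totals below are invented weighted figures for illustration only.
p_total = 1_000.0          # weighted P sample housing units
correct_enums = 950.0      # weighted E sample correct enumerations
matches = 900.0            # weighted P/E sample matches
census_count = 980.0       # weighted census housing unit count

dse = correct_enums * p_total / matches   # estimated true housing units
net_error = census_count - dse            # negative => net undercount
print(f"DSE = {dse:.0f}, net coverage error = {net_error:+.0f} "
      f"({net_error / dse:+.1%})")
```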
D. Past Studies and Census Tests
This section describes past studies and census tests containing results relevant to this evaluation.
1990 Census Precanvass Suppression Study
For the 1990 Census Precanvass operation,[8] a sample of addresses was suppressed from the
Precanvass Address Registers. These registers contained the addresses sent to listers for
verification and updating. The analysis then determined whether the listers added the suppressed
addresses or missed them.
The findings were as follows:
• The overall miss rate was 30.0 percent with a standard error of 1.6 percent.
• The miss rate for housing units suppressed from multiunit addresses was significantly
higher than for single unit addresses—45.2 percent and 24.3 percent, respectively. The
estimated standard errors were 4.1 percent for multiunits and 1.5 percent for single units.
• Of the assignment areas sampled, 55.1 percent contained at most one miss.
Census 2000 and the 2010 Census
An evaluation in Census 2000 (Smith et al., 2003) estimated the addresses that were correctly
added (and added-in-error) and correctly deleted or duplicated (and deleted-in-error[9]) by the
enumeration operations.[10] An evaluation of the 2010 Census (U.S. Census Bureau, 2013)
expanded on the Census 2000 evaluation by including a component to estimate the number and
percent of addresses that the 2010 Census Address Canvassing operation correctly added (and
added-in-error) and correctly deleted or duplicated (and deleted-in-error).
Using results from the 2000 Post-Enumeration Survey and the 2010 Census Coverage
Measurement, these two studies yielded the following results:
• In Census 2000, enumeration operations correctly deleted about 85.6 percent of addresses, and in the 2010 Census, enumeration operations correctly deleted 74.2 percent of addresses.
• In Census 2000, enumeration operations correctly added 83.9 percent of addresses, and in the 2010 Census, the enumeration operations correctly added 79.6 percent of addresses.
• The 2010 Census Address Canvassing operation correctly deleted 95.7 percent of addresses and correctly added 83.6 percent of addresses.

[8] The 1990 Census Precanvass operation was a precensus activity that occurred in certain types of areas from May 1989 through July 1989. Field enumerators canvassed the areas to identify and add addresses missing from the precanvass registers and to update existing addresses on the registers. This Precanvass operation was an earlier form of the In-Field Address Canvassing operation.
[9] Deleted-in-error also includes duplicated-in-error.
[10] Examples of enumeration operations include Nonresponse Followup, Non-ID Processing, and GQ Enumeration.
2015 Census Address Validation Test
The purpose of the 2015 Address Validation Test (AVT) was to assess the performance of
various methods to develop the 2020 Census address frame and to determine workloads for the
2020 canvassing operation; in other words, reengineering the address canvassing for the 2020
Census. The AVT occurred between September 2014 and February 2015. (U.S. Census Bureau,
2015)
One component, the MAF Model Validation Test (MMVT), consisted of a full-block canvassing
operation intended to assess the ability of a set of statistical models to predict blocks that have
experienced address changes that are not recorded in the MAF. If effective, the statistical models
would offer an inexpensive solution to the problem of determining which census blocks require
updates and which do not. To do this, the test collected address data in an address listing operation
in a national sample of 10,100 blocks. The analysis compared the results of the fieldwork to the
predictions from the statistical models.
Key observations from the MMVT include that the statistical models:
• Did a mediocre job of identifying specific blocks with many adds or deletes.
• Were not accurate for predicting national totals of MAF coverage errors.
• Could do reasonably well at initially screening or prioritizing blocks for in-office imagery
review.
2016 Address Canvassing Test
The Address Canvassing Test occurred during the fall of 2016 in two sites: Buncombe County,
NC, and a portion of St. Louis, MO. Both sites offered opportunities to gain insight into how the
in-office and in-field operations performed. All blocks in both sites were canvassed
using both In-Field and In-Office Address Canvassing methods. (Snodgrass et al., 2018)
Two key findings from the Address Canvassing Test were as follows:
• In passive blocks, including ABR-resolved blocks, there were inconsistencies between In-Field Address Canvassing address actions and In-Office Address Canvassing results.
• Of the actions taken by Active Block Resolution in active blocks, most were consistent
with In-Field Address Canvassing. Of the blocks “identified for fieldwork,” most had
either an add action or a negative action.
2016 Master Address File Coverage Study
The MAF Coverage Study (MAFCS) was a field activity intended to measure the coverage of the
census address list, to validate the In-Office Address Canvassing operation, and to provide
updates to the MAF. Based on funding uncertainty and reprioritization of critical 2020 Census
components, the MAFCS was discontinued in fiscal year 2017. Consequently, the 2016 MAFCS
report provided the only set of estimates generated from the MAFCS program. The coverage
measures from the MAFCS were for 2016 and were likely not indicative of the address coverage
for the 2020 Census because most of the frame updating procedures specific to the decennial
census had yet to start. (Williams, 2018)
Two findings from the 2016 MAFCS were as follows:
• For 2016, the national estimate of overcoverage was about 5.5 percent, and the estimate
of undercoverage was 6.6 percent.
• Blocks classified as Active in IR had an estimated 7.7 percent overcoverage and 9.8
percent undercoverage of addresses. In Passive blocks, however, the 2016 MAFCS
estimated 4.3 percent overcoverage and 4.8 percent undercoverage.
2018 Census End-to-End Test: Evaluation of Address Canvassing
The Evaluation of Address Canvassing was conducted in Providence, RI, one of the 2018 Census
End-to-End Test sites. The main objective of the evaluation was to quantify the extent to which
In-Office Address Canvassing correctly assigned BCU statuses as either active or passive.
(Johnson and McDougall, 2019)
The key findings were as follows:
• In-Office Address Canvassing correctly classified 71 percent of the active BCUs.
• Of all the passive BCUs in sample, an estimated 67 percent were correctly classified.
• An estimated 71 percent of the triggered BCUs were correctly identified as having a
change to the inventory of residential addresses.
III. Assumptions
Below are assumptions that will enable successful completion of the design and methodology for
this evaluation.
1. The project team will obtain adequate funding to implement the evaluation as described
in this study plan.
2. The project team assumes that the Census Bureau will be able to obtain the services of a
contractor to support the implementation of Virtual Canvassing.
3. The Census Data Lake will contain 2020 Census operational data required for analysis.
4. Costs are tracked at a level that allows comparisons between the 2010 Census Address
Canvassing and the 2020 Census Reengineered Address Canvassing.
5. The PES design will provide enough sample in the evaluation domains of interest to
calculate reliable estimates.
IV. Research Questions
Listed below are the research questions for this evaluation.
1. Enumeration operations: What percentage of the housing units added by the post-Address Canvassing operations (e.g., Non-ID processing, Nonresponse Followup, New Construction, etc.) combined were correctly added and added-in-error?
2. In-Field Address Canvassing: What percentage of the housing units added during In-Field Address Canvassing were correctly added (and added-in-error)? What percentage of
the housing units identified as deleted or duplicated by the listers during In-Field Address
Canvassing were correctly deleted or duplicated (and deleted-in-error)?
3. In-Field Address Canvassing: What percentage of the suppressed housing units did the
listers add and miss adding? What percentage of the “salted” or false housing units did
the listers delete and miss deleting?
4. In-Office Address Canvassing: What percentage of the BCUs did the In-Office Address
Canvassing Interactive Review correctly classify as active and passive?
5. In-Office Address Canvassing: How accurate is the Virtual Canvassing?
6. In-Office Address Canvassing: Were the set of triggers sufficient for identifying
instances in which housing unit change occurred?
a. Were there instances of housing unit change that the triggers did not detect and that, as a result, did not send a block (or BCU) back to Interactive Review for assessment or directly to an active status?
b. Did the set of triggers result in unnecessary work in the Interactive Review and
in-field?
c. What is the effectiveness of specific trigger reasons?
7. In-Office Address Canvassing: What is the effect on the enumeration of addresses
missed by In-Office Address Canvassing in the misclassified BCUs?
a. Were the missed addresses in the BCUs that the Interactive Review misclassified as passive enumerated as valid residential units?
b. What is the cost of incorrectly classifying BCUs?
8. How effective was the filter in identifying valid living quarters for the In-Field Address
Canvassing dependent address list?
9. What is the cost of the reengineered Address Canvassing operation compared to the cost
of a 100 percent In-Field Address Canvassing?
10. Can unit level modeling support improved targeting for Address Canvassing, specifying
filter criteria, estimating hidden units, or estimating recurring MAF coverage errors?
V. Methodology
This section describes the methodology for answering the research questions.
A. Evaluation Design
Question 1 – Enumeration operations: What percentage of the housing units added by the post-Address Canvassing operations (e.g., Non-ID, Nonresponse Followup, etc.) combined were correctly added and added-in-error?
Although this question is not about Address Canvassing, it was included, along with question 2
below, in the 2010 Census Evaluation of Address Frame Accuracy and Quality (U.S. Census
Bureau, 2013). In addition, an evaluation in Census 2000 (Smith et al., 2003) provided these
estimates. Because the data to answer this question are readily available from PES and
complement question 2, it will be included in this evaluation and will provide a comparison to
the percentages from Census 2000 and the 2010 Census.
Note that the evaluations from Census 2000 and the 2010 Census included estimates for the
percentages of housing units correctly deleted or duplicated and deleted-in-error. However,
unlike Census 2000 and the 2010 Census, PES will not match or conduct field followup for
housing units deleted or duplicated by the enumeration operations.
The “Adds” will consist of the housing units that exist on the Census Unedited File (CUF) that
did not exist on the Enumeration Extract.[11] These units are part of the PES E sample.
The PES will classify the correctly added housing units as “correct enumerations” and will have
one of the following PES match codes[12] assigned:
• Match – The P sample and E sample housing units matched in the FHU operation.
• Correct Enumeration – The FHU Followup interview determined that the E sample
housing unit existed as a housing unit on Census Day and was correctly geocoded in the
BCU. The housing unit was not matched to a unit previously found by PES.
• Possible match – The code for a possible match was assigned when the E sample housing
unit was a possible match to a P sample housing unit, but the FHU followup interview
was inconclusive or incomplete.
The “Adds” added-in-error will consist of the housing units added to the census inventory that
the FHU did not find as housing units existing on Census Day. These cases will have one of the
following PES codes assigned to them:
• Not a housing unit – The FHU followup determined the E sample address was for a group
quarters, a business, or the unit was demolished, burned down, uninhabitable, or could
not be located.
• Duplicates – The E sample address was found to be a duplicate of another unit in the
census.
• Geocoding error – The E sample housing unit existed as a housing unit at the time of the FHU followup interview but was incorrectly geocoded to the BCU. As a result, the PES analysis considers the housing unit to be erroneously enumerated in the BCU.

[11] The Enumeration Extract is a file that identifies the eligible addresses for the enumeration operations. It includes the results of In-Field Address Canvassing.
[12] When this study plan was drafted, the PES match codes were not final. Therefore, the PES match codes in question 1 and question 2 are examples and may not list all the match codes that will be classified as correct or incorrect enumerations.
After the FHU operation, the match and enumeration status for some housing units may remain
unresolved. The PES imputes an enumeration status for housing units missing a status. The
analysis will either report the unresolved housing units in their own category or re-impute a
match and enumeration status for these units.
The calculated estimates will use the PES weights or a modified version of the PES weights. The
analysis may need to modify the weights because it will not use the match code results in exactly
the same way as the PES analysis will use them. When possible, estimates will be provided for
various characteristics (e.g., urban/rural, size of housing unit) to examine whether add errors are
correlated with any characteristic.
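As a minimal illustration of the weighted tabulation described above, the sketch below computes a percent-correctly-added estimate. The match codes and weights are invented, and the real analysis would use the PES (or modified) weights and the final code definitions.

```python
# Minimal sketch of the weighted percent-correctly-added computation.
# Column names, codes, and weights are hypothetical illustrations.
import pandas as pd

adds = pd.DataFrame({
    "match_code": ["M", "CE", "PM", "NOT_HU", "DUP", "GEO_ERR"],
    "weight":     [120.0, 80.0, 40.0, 25.0, 10.0, 15.0],
})
correct = {"M", "CE", "PM"}   # match, correct enumeration, possible match

w_correct = adds.loc[adds["match_code"].isin(correct), "weight"].sum()
pct_correctly_added = 100 * w_correct / adds["weight"].sum()
print(f"{pct_correctly_added:.1f} percent correctly added")  # 82.8 here
```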
Question 2 – In-Field Address Canvassing: What percentage of the housing units added during
In-Field Address Canvassing were correctly added (and added-in-error)? What percentage of the
housing units identified as deleted or duplicated by the listers during In-Field Address
Canvassing were correctly deleted or duplicated (and deleted-in-error)?
Like question 1 above, the PES will have the data to provide estimates of the correctly added and
added-in-error housing units for In-Field Address Canvassing. The 2010 Census evaluation
provided estimates of the correctly deleted units and the housing units deleted-in-error by
conducting matching and field followup on a sample of units identified during In-Field Address
Canvassing as deletes and duplicates. For the 2020 Census, the In-Field Address Canvassing
deleted units that pass the filter will go into the enumeration operations. This will give these
units a second look to verify their deleted status. The PES will include these units in its match
universe and follow up on questionable cases or potential matches. As a result, the evaluation
analysis will use the PES match codes to determine whether the units are correctly deleted or
deleted-in-error.
The universe for determining the correctly added addresses will consist of the housing units
having “Add” action codes from In-Field Address Canvassing in the PES sample BCUs. The
analysis will use the match codes from the PES Initial Housing Unit Matching and Followup
operations to determine whether the added housing units were correctly added or added-in-error.
The correctly added housing units will have one of the following PES match codes from the
Initial Housing Unit Matching and Followup operations:
• Match – The address having an “Add” action matched a P sample housing unit.
• Geocoded correctly – The address having an “Add” action matched a P sample housing unit and was found in the sample BCU during the Initial Housing Unit followup search in the BCU and the surrounding BCUs; it was correctly enumerated in the BCU. Note that although the PES analysis considers addresses found in a BCU surrounding the sample BCU to be correct enumerations, the evaluation analysis will treat these as added-in-error.
• Possible Match – The code for a possible match was assigned when the address having an “Add” action was a possible match to a P sample housing unit, but the Initial Housing Unit followup interview was inconclusive or incomplete.
The added-in-error housing units will have one of the following PES match codes from the
Initial Housing Unit Matching and Followup operations:
• Not a housing unit – The Initial Housing Unit followup determined the address with an
“Add” action was for a group quarters, a business, or the unit was demolished, burned
down, uninhabitable, or could not be located.
• Duplicates – The address with an “Add” action was found to be a duplicate of another
unit in the Census.
• Geocoding error – The address with an “Add” action existed as a housing unit at the time
of the Initial Housing Unit followup interview but was incorrectly geocoded to the BCU.
As a result, the PES analysis considers the housing unit to be erroneously enumerated in
the BCU. As noted above, the analysis will treat any addresses found in the surrounding
BCU of the sample BCU as a geocoding error instead of a PES match status of geocoded
correctly. The evaluation analysis will need to identify and recode these addresses.
After the Initial Housing Unit Matching and Followup operations, some addresses may have an
unresolved match status. The analysis will either report the unresolved housing units in their own
category or impute a match status.
The calculated estimates will use the PES weights or a modified version of the PES weights. The
analysis may need to modify the weights because it will not use the match code results in exactly
the same way as the PES analysis will use them. When possible, estimates will be provided for
various characteristics (e.g., urban/rural, size of housing unit) to examine whether add errors are
correlated with any characteristic.
Question 3 – In-Field Address Canvassing: What percentage of the suppressed housing units
did the listers add and miss adding? For a sample of “salted” or false housing units, what
percentage of the housing units did the listers delete and miss deleting?
To answer these questions, a sample of addresses from the MAF that passed the filter will be
suppressed from the dependent list that populates the LiMA for the 2019 production In-Field
Address Canvassing. In addition, a sample of false addresses will be included in the LiMA’s
dependent address list for the production In-Field Address Canvassing. Suppressed addresses,
even if listers do not add them, will still be in the Mailout operation, and the salted addresses will
not be included in the Mailout.
It is possible that some of the suppressed addresses actually do not exist or are nonresidential or
duplicates. Because these addresses will be in the Mailout, the analysis will check their status by
using both the Undeliverable As Addressed (UAA) codes and their status on the CUF. If a suppressed address has a UAA
code, then the analysis will classify the address as not correctly added by the listers (i.e., lister
error). Because it is possible for the U.S. Postal Service to successfully deliver mail to an
ineligible unit (e.g., nonresidential address), the analysis will match the MAFIDs of the
suppressed addresses to the MAFIDs on the CUF. If the lister did not add a suppressed address,
and it appears on the CUF as a valid, enumerated housing unit, then the analysis will classify the
address as a missed add (i.e., lister error).
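The two classification rules above reduce to straightforward logic. The sketch below encodes them with hypothetical input flags (UAA match, CUF lookup by MAFID) assumed to be computed upstream, and leaves all other cases unclassified.

```python
# Encodes the two stated rules for suppressed addresses; the input flags
# are assumed to be precomputed upstream and are hypothetical names.
def classify_suppressed(has_uaa_code: bool, lister_added: bool,
                        valid_enumerated_on_cuf: bool) -> str:
    if has_uaa_code:
        # Mail was undeliverable: not correctly added by the listers.
        return "lister error: not correctly added"
    if not lister_added and valid_enumerated_on_cuf:
        # Lister failed to add a unit the census enumerated as valid.
        return "lister error: missed add"
    return "no lister error under the stated rules"

print(classify_suppressed(False, False, True))  # lister error: missed add
```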
In addition, some of the false addresses may be actual addresses. These addresses will be
matched—either by the Decennial Information Technology Division (DITD) or through
Production Environment for Administrative Records Staging, Integration and Storage
(PEARSIS)—to the enumerated addresses on the CUF to determine the validity of these
addresses. If a lister did not delete a false address and it matches to a valid, enumerated address
on the CUF, then the lister took the correct action. If the address does not match, then the lister
missed deleting the address (i.e., lister error).
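Similarly, the salted-address rules can be sketched in code. The handling of addresses the lister did delete is an assumption the text implies but does not state.

```python
# Encodes the stated rules for salted (false) addresses; the treatment of
# deleted addresses as a correct action is an assumption.
def classify_salted(lister_deleted: bool, matches_valid_cuf: bool) -> str:
    if not lister_deleted:
        if matches_valid_cuf:
            return "correct action: the 'false' address was a real unit"
        return "lister error: missed delete"
    return "correct action: deleted a false address (assumed)"

print(classify_salted(lister_deleted=False, matches_valid_cuf=False))
```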
Sample Selection for Suppressed Addresses
The sample of suppressed addresses will be selected from the universe of addresses (MAFIDs)
that pass the filter in blocks identified as active, triggered, or on hold at the time of sampling.
The MAFIDs will be sorted by state, county, tract, BCU, and by address within the BCU.
The sample design will include 13 strata in four address categories and five Urban/Rural (U/R)
Types as shown in Table 1 below. The U/R Type is collapsed for some of the categories.
Table 1: Sampling Strata and Sample Sizes for Suppressed Addresses

Stratum | Address Category       | Urban/Rural Type                                   | Minimum Sample Size | Oversample Size
1       | Single unit            | Central city                                       | 2,416               | 18,200
2       | Single unit            | Suburban                                           | 2,416               | 31,010
3       | Single unit            | Exurban                                            | 2,416               | 17,000
4       | Single unit            | Small town                                         | 2,416               | 5,000
5       | Single unit            | Rural                                              | 2,416               | 22,910
6       | Multiunit              | Central city                                       | 3,378               | 22,950
7       | Multiunit              | Suburban                                           | 3,378               | 14,350
8       | Multiunit              | Exurban                                            | 3,378               | 5,000
9       | Multiunit              | Small town                                         | 3,432               | 5,000
10      | Multiunit              | Rural                                              | 3,432               | 5,000
11      | Mobile home or trailer | Central city, suburban, exurban, small town        | 3,432               | 6,300
12      | Mobile home or trailer | Rural                                              | 3,432               | 5,680
13      | Special units          | Central city, suburban, exurban, small town, rural | 3,432               | 1,600
Total Sample Size | | | 41,790 | 160,000

The four address categories include the following:
1. Single unit addresses.
2. Addresses within multiunits.
3. Mobile homes or trailers.
4. Addresses with an indication that they may be special housing situations that are not group quarters or transitory locations, such as potential “hidden” housing units, hard-to-find units, and informally subdivided housing (e.g., location description has: basement apartment, apartment over garage, unit above store, share, alley, etc.).
The five U/R types include the following:
1. Central city.
2. Suburban.
3. Exurban.
4. Small town.
5. Rural.
The minimum sample size to detect differences between two strata is calculated using the
following formula:
\[
n \ge \frac{\left(Z_{\alpha^*/2} + Z_{\beta}\right)^{2}\,\bigl(p_1(1-p_1) + p_2(1-p_2)\bigr)\,\mathit{deff}}{\delta^{2}}
\]

Where
Z_{α*/2} = critical value for the set alpha level, assuming a two-sided test
Z_β = critical value for the set beta level
p_1 = proportion for stratum 1
p_2 = proportion for stratum 2
deff = design effect due to unequal weighting
δ = minimum detectable difference
n = sample size
In a 1990 study, the Census Bureau used a similar address suppression approach and found the field listers' overall miss rate was 30 percent, with a 45.2 percent miss rate for multiunit addresses and a 24.3 percent miss rate for single units (Russell, 1992).
The minimum overall sample size for the 13 strata is 41,790. This assumes a minimum detectable difference of 0.03, a design effect of 1, and a two-sided Z-test with a power of 80 percent at an alpha level of 0.10. The formula uses the miss rates from the 1990 study for the single units (strata 1 to 5), giving a minimum sample size of 2,416 per stratum, and for multiunits (strata 6 to 8), giving a minimum sample size of 3,378 per stratum. Because the 1990 study did not provide miss rates for mobile homes or special addresses, the formula uses a conservative miss rate of 51 percent, giving a minimum sample size of 3,432 each for strata 9 to 13.
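The quoted minimums can be reproduced from the formula. The sketch below assumes the second proportion is the first shifted by the detectable difference (p2 = p1 − δ); that assumption recovers 2,416, 3,378, and 3,432 exactly.

```python
# Reproduces the minimum stratum sizes quoted above from the sample size
# formula, assuming p2 = p1 - delta (an assumption that recovers the
# published sizes for the 1990 single-unit, multiunit, and conservative
# 51 percent miss rates).
import math
from scipy.stats import norm

def min_n(p1: float, delta: float = 0.03, alpha: float = 0.10,
          power: float = 0.80, deff: float = 1.0) -> int:
    z_a = norm.ppf(1 - alpha / 2)   # two-sided test
    z_b = norm.ppf(power)
    p2 = p1 - delta
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_a + z_b) ** 2 * var * deff / delta ** 2)

print(min_n(0.243), min_n(0.452), min_n(0.51))  # 2416 3378 3432
```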
The selected sample of 160,000 addresses will be an oversample for two reasons: 1) For Field
Division (FLD) planning purposes, the Decennial Statistical Studies Division (DSSD) needs to
select the sample before Geography Division (GEO) has identified the final set of BCUs going to
In-Field Address Canvassing. The final number of BCUs going to In-Field Address Canvassing
is expected to decrease by that time. Because some sample addresses may be in BCUs no longer
going to In-Field Address Canvassing, an oversample will ensure statistical quality of the results.
2) The estimated coefficient of variation (CV) of the minimum sample size is above the Census
Bureau’s statistical standard of 0.30. The optimizer in Excel was used to obtain the stratum
sample sizes. The optimizer estimated a CV of 0.13 assuming a minimum sample of 1,600
addresses in a stratum with a total oversample size of 160,000 addresses.
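The Excel optimization can be approximated with a generic constrained minimizer. The stratum population shares and assumed miss rate below are hypothetical placeholders (and only five strata are shown rather than 13), so this sketch only illustrates the shape of the problem: minimize the CV subject to the 160,000 total and the 1,600 per-stratum floor.

```python
# Sketch of the kind of allocation the Excel optimizer performed: choose
# stratum oversample sizes minimizing the CV of an overall estimated
# rate, subject to a fixed total and a per-stratum floor. The stratum
# sizes (N) and miss rate (p) are hypothetical placeholders.
import numpy as np
from scipy.optimize import minimize

N = np.array([5e6, 3e6, 2e6, 1e6, 0.5e6])  # hypothetical stratum sizes
p = 0.3                                     # assumed miss rate
TOTAL, FLOOR = 160_000, 1_600

def cv(n):
    # variance of a stratified proportion estimator, divided by its mean
    var = np.sum((N / N.sum()) ** 2 * p * (1 - p) / n)
    return np.sqrt(var) / p

res = minimize(cv, x0=np.full(5, TOTAL / 5), method="SLSQP",
               bounds=[(FLOOR, TOTAL)] * 5,
               constraints=[{"type": "eq", "fun": lambda n: n.sum() - TOTAL}])
print(res.x.round(), cv(res.x))
```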
There is a risk that listers may add suppressed addresses in a manner that differs from the
original addresses. The processing of these addresses in the MAF matching and updating system
will introduce address duplication. Unfortunately, there is no time during production
processing for either DSSD or DITD to identify these duplicate addresses and remove them from
the production stream. As a result, both the original suppressed addresses and their duplicates
will continue into the Mailout. Ultimately, the duplicate addresses could end up in the
Nonresponse Followup (NRFU) workload.
Identifying Salted Addresses
As with the suppression sample design, the strata for the salted sample include address categories and the U/R type. The address categories include four categories for single housing units and two categories in which multiple addresses will be added to the sample. Two of the single housing unit categories will be selected from the MAF, the other two will be based on existing addresses, and the two remaining categories will consist of multiple units at false, made-up addresses.
The sample of salted or false addresses will be sorted by state, county, tract, and BCU, then
selected from the following six strata:
Single units selected from the MAF
1. Nonresidential addresses
2. Addresses that do not pass the filter (where the Unit Status variable indicates demolished units or nonexistent addresses)
Single unit, false addresses based on existing addresses
3. Single housing unit
4. False hidden units
Multiple, false addresses
5. Add entire multiunit structures
6. Add a false street with one house number or a range of false house numbers
Table 2 shows the strata and sample sizes for the salted addresses. As indicated, there will be 30
strata, and the total size with oversampling will be 150,000. The sample of salted addresses
going to In-Field Address Canvassing is expected to be less than 150,000 because the active or
passive status of BCUs is not yet final. Some active BCUs are expected to change to a passive
status.
Table 2: Sampling Strata and Sample Sizes for Salted Addresses

Stratum | Salted Category                   | Urban/Rural Type | Oversample Size
Single addresses:
1       | Nonresidential                    | Central city     | 14,672
2       | Nonresidential                    | Suburban         | 14,672
3       | Nonresidential                    | Exurban          | 5,000
4       | Nonresidential                    | Small town       | 5,000
5       | Nonresidential                    | Rural            | 5,000
6       | Addresses that do not pass filter | Central city     | 9,854
7       | Addresses that do not pass filter | Suburban         | 9,854
8       | Addresses that do not pass filter | Exurban          | 5,000
9       | Addresses that do not pass filter | Small town       | 5,000
10      | Addresses that do not pass filter | Rural            | 5,948
11      | Single unit address               | Central city     | 8,000
12      | Single unit address               | Suburban         | 8,000
13      | Single unit address               | Exurban          | 8,000
14      | Single unit address               | Small town       | 8,000
15      | Single unit address               | Rural            | 8,000
16      | Special units                     | Central city     | 2,000
17      | Special units                     | Suburban         | 2,000
18      | Special units                     | Exurban          | 2,000
19      | Special units                     | Small town       | 2,000
20      | Special units                     | Rural            | 2,000
Multiple addresses:
21      | Entire multiunit                  | Central city     | 4,000
22      | Entire multiunit                  | Suburban         | 4,000
23      | Entire multiunit                  | Exurban          | 4,000
24      | Entire multiunit                  | Small town       | 4,000
25      | Entire multiunit                  | Rural            | 4,000
26      | Entire street                     | Central city     | 4,000
27      | Entire street                     | Suburban         | 4,000
28      | Entire street                     | Exurban          | 4,000
29      | Entire street                     | Small town       | 4,000
30      | Entire street                     | Rural            | 4,000
Total Addresses Salted | | | 150,000
To be consistent with the suppressed sample, the minimum sample size is calculated using the
same formula, detectable difference, power, and alpha as the suppressed addresses. Because
there is no prior information on the miss rate for salted addresses, a conservative miss rate of 51
percent is used. Therefore, the minimum number of addresses to salt for each stratum is 3,432 for
a total of 102,960.
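For reference, a standard one-sample, two-sided power calculation of this general form is sketched below. The alpha, power, and detectable difference shown are placeholders, not the plan's actual values, which are those specified for the suppressed sample earlier in this document.

```python
# A minimal sketch of a power-based minimum sample size for detecting a
# shift in a miss rate; the parameter defaults are placeholders.
import math
from scipy.stats import norm

def min_sample_size(p, delta, alpha=0.10, power=0.80):
    """Smallest n that detects a shift from p to p + delta (two-sided test)."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    q0 = p * (1 - p)
    q1 = (p + delta) * (1 - p - delta)
    n = ((z_a * math.sqrt(q0) + z_b * math.sqrt(q1)) / delta) ** 2
    return math.ceil(n)

print(min_sample_size(p=0.51, delta=0.05))  # illustrative call only
```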
The optimizer in Excel was used to determine the oversample sizes for strata 1 to 10 (i.e., the
strata for which the addresses are selected from the MAF) that would minimize the CV, assuming
a minimum size of 5,000 addresses in a stratum and a total size of 80,000 addresses for the 10
strata. This results in a CV of 0.16.
For the salted sample, the analysis will only provide unweighted numbers. As false addresses,
the salted addresses represent just themselves. Even the addresses selected from the MAF, such
as the nonresidential units, do not represent all nonresidential addresses in the nation or in the
active blocks. The MAF update process does not routinely add nonresidential units to the MAF.
Typically, a field operation, such as a prior In-Field Address Canvassing operation, provides the
information that these addresses are nonresidential.
Question 4 – In-Office Address Canvassing: What percentage of the BCUs did the In-Office
Address Canvassing Interactive Review correctly classify as active and passive?
This question examines the effectiveness and accuracy of the In-Office Address Canvassing IR
in identifying the BCUs that need In-Field Address Canvassing. Essentially, this question asks
whether the active and passive BCUs had changes to the address inventory.
The analysis using PES results will examine the coverage components. For housing units, the
PES computes separate estimates of correct enumerations, erroneous enumerations, and
omissions. The PES correct enumerations will be of two types: 1) correctly enumerated in the
BCU and 2) correctly enumerated in the surrounding BCU. The estimates for erroneous
enumerations will have three parts: 1) structures enumerated in the census as housing units that
do not exist or were not housing units, 2) housing units enumerated more than once (duplicates),
and 3) geocoding errors.
For this evaluation, the second type of PES correct enumerations—the housing units correctly
enumerated in the surrounding BCU—will be classified as erroneous enumerations. The analysis
for the evaluation is changing this definition because the intent is to examine correct or
incorrect classification for BCUs.[13] However, the PES staff estimates correct
enumerations, erroneous enumerations, and omissions at higher levels of geography. Changing
the definition requires the evaluation analysis to code its own variables for correct enumerations
and erroneous enumerations based on the P sample and E sample matching and follow-up results.
Because the PES calculation for omissions is the Dual System Estimate minus the correct
enumerations, the analysis will need to recalculate this estimate.
If an active BCU has one or more housing units that are erroneous enumerations or omissions,
then IR correctly classified the BCU. If a passive BCU has only correct enumerations and no
erroneous enumerations or omissions, then IR correctly classified the BCU.
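In code, this classification rule reduces to comparing the IR status against the presence of PES-detected change. The function below is a sketch; its inputs are hypothetical per-BCU counts derived from the recoded PES variables.

```python
# A minimal sketch of the BCU classification rule; the inputs are
# hypothetical per-BCU counts of PES-coded housing units.
def ir_classified_correctly(ir_status, n_erroneous, n_omissions):
    """ir_status is 'active' or 'passive'."""
    has_change = n_erroneous > 0 or n_omissions > 0
    # Active BCUs should show change; passive BCUs should show none.
    return has_change if ir_status == "active" else not has_change
```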
The analysis will include distributions of addresses in the correctly and incorrectly assigned
active and passive BCUs.
Question 5 – In-Office Address Canvassing: How accurate is the Virtual Canvassing?
The operation called “Virtual Canvassing” by this evaluation will use the ABR procedures that
were revised just before the suspension of ABR. Therefore, this research question will essentially
evaluate how ABR would have performed if the operation had continued.
[13] The IR work unit is a block. So ideally, the analysis would examine correct and incorrect classification at the
block level. However, the fieldwork will use BCUs as the work unit. BCUs and blocks do not always have a
one-to-one relationship, so the analysis cannot translate a BCU-level analysis to a block-level analysis.
The revised ABR procedures rely on the placement in IR of coverage “pins.” (The IR staff will
assign a “pin” to an area on the current imagery that appears to have more or fewer housing
units than the baseline image.) As a result, a “second” or repeat IR will need to be conducted
before the Virtual Canvassing. This repeat IR will occur in the 10,000 BCUs in the PES sample
for the U.S.
In Virtual Canvassing, staff (contractors) will use in-office information to canvass the BCUs
found to be active by the repeat IR. In-office information may include imagery, local geographic
information and imagery, parcel data, local files, partner data, street-level imagery, MAF address
information, TIGER street data, and DSF data. Staff will focus on resolving the coverage issues
identified by the IR. To resolve the coverage issues, staff may have to canvass the entire BCU.
However, only the addresses needing updating will receive action codes.
Because of resource constraints during census production, the repeat IR and Virtual Canvassing
will be conducted from late August 2020 through October 2020. To mitigate temporal differences,
IR and Virtual Canvassing will use current imagery from around April 1, 2020—the PES reference
day.
The analysis will use the PES results to determine the accuracy of the Virtual Canvassing—and
thus of the revised ABR procedures. To use the PES results, the addresses added by Virtual
Canvassing will need to be matched to the PES addresses. (The other addresses should already
be in the PES match universe). Either the DSSD or the GEO will conduct this matching.
The calculated estimates will use the PES weights or a modified version of the PES weights. The
analysis may need to modify the weights because it will not use the match code results in exactly
the same way as the PES analysis will use them.
Question 6 – In-Office Address Canvassing: Was the set of triggers sufficient for identifying
instances in which housing unit change occurred?
a. Were there instances of housing unit change that triggers did not detect and that, as a
result, did not send a BCU back to IR for assessment or directly to an active status?
b. Did the set of triggers result in unnecessary work in the IR and in-field?
c. What is the effectiveness of specific trigger types?
The analysis for question 6a will examine the passive-never triggered BCUs and determine
whether these BCUs have changes to the address list. When using the PES results, the analysis
will determine whether the passive-never triggered BCUs have omissions or erroneous
enumerations.
If the sample size allows, the analysis will examine the passive BCUs that were triggered for IR
re-review and remained passive after the re-review. It is possible for change to have occurred
after the re-review that subsequent triggers did not detect. The analysis will use the Virtual
Canvassing and PES results as described in the previous paragraph.
The analysis for the first part of question 6b will show a summary from the weekly Trigger
Report. This report shows the number of blocks sent to IR for each trigger event, and as a result
of the re-review, the number and percentage of blocks that become active, passive, or placed on
hold. A trigger event with a relatively high percentage of passive blocks and low percentage of
active blocks after the re-review may indicate an ineffective trigger that resulted in unnecessary
work. (The weekly Trigger Report covers triggered blocks in the entire U.S. and Puerto Rico).
The analysis for the second part of question 6b will examine the PES results for the triggered
BCUs that became active and went to In-Field Address Canvassing. Triggered BCUs without
omissions or erroneous enumerations from PES may indicate BCUs that did not need to go to
In-Field Address Canvassing.
The analysis for question 6c will examine the effectiveness of specific trigger types or reasons
(e.g., a DSF update results in an increase or decrease to the housing unit count, a block has a
large number of housing units without map spots, etc.). To date, there have been about 50 trigger
types. A trigger type may have occurred on multiple dates or events. The analysis will examine
both trigger types and trigger events. The PES sample size may not allow estimates for each
trigger type or event. If necessary, the trigger types and events will be collapsed into smaller
categories, but any trigger types or trigger events that stand out will be noted.
Note that “effectiveness” may be defined in different ways. For example, is a trigger type
effective if it:
• Results in a “high” percentage of reviewed blocks becoming active?
• Results in a “high” percentage of changed addresses?
• Results in any changed addresses or active blocks?
The analysis will provide several tables to allow for an examination of differing meanings of the
term “effectiveness.”
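As one illustration, the competing definitions could be tabulated side by side from a per-event analysis file. The sketch below uses hypothetical column names, not the Trigger Event File's actual layout.

```python
# A minimal sketch of trigger-type effectiveness tabulation under the
# three definitions above; the file and column names are hypothetical.
import pandas as pd

events = pd.read_csv("trigger_events.csv")  # hypothetical analysis extract
summary = events.groupby("trigger_type").agg(
    blocks_reviewed=("block_id", "nunique"),
    pct_became_active=("became_active", "mean"),
    pct_changed_addresses=("had_address_change", "mean"),
    any_effect=("became_active", "any"),
)
print(summary.sort_values("pct_became_active", ascending=False))
```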
Question 7 – In-Office Address Canvassing: What is the effect on the enumeration of
addresses missed by In-Office Address Canvassing in the misclassified BCUs?
a. Were the missed addresses in the BCUs that the IR misclassified as passive
enumerated as valid, residential units?
b. What is the cost of incorrectly classifying BCUs?
The analysis will use results from both the Virtual Canvassing and the PES to answer questions
7a and 7b.
The intent of question 7a is to determine whether the missed addresses in the BCUs incorrectly
classified as passive are being added by enumeration operations, and as a result, whether the
effect of the misclassification on the final enumeration is minimized. The BCUs misclassified as
active (those that should have been passive according to the Virtual Canvassing or PES results)
will not have added addresses from Virtual Canvassing or omissions from PES. Therefore,
question 7a examines only the misclassified passive BCUs.
The addresses having add actions from Virtual Canvassing will need to be matched to the CUF
(i.e., the final address list) to determine whether they are valid, enumerated units. If the match
determines that any of the addresses are enumerated units, then the analysis will estimate the
undercoverage rate in the misclassified passive BCUs.
The analysis will use the PES results in a similar way to answer question 7a. Because the PES
matches to In-Field Address Canvassing results, which are on the Enumeration Extract, and the
results from census enumerations, which are on the CUF, no additional matching will be needed.
The analysis will calculate the omissions in the misclassified passive BCUs and estimate the
undercoverage rate.
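In terms of the recoded PES quantities, the omission recalculation and the undercoverage rate amount to the following sketch; the variable names are hypothetical stand-ins for the weighted estimates in the misclassified passive BCUs.

```python
# A minimal sketch of the omission recalculation and undercoverage rate;
# inputs are hypothetical weighted estimates for misclassified passive BCUs.
def undercoverage_rate(dse_estimate, correct_enumerations):
    # PES defines omissions as the Dual System Estimate minus the
    # (here recoded) correct enumerations.
    omissions = dse_estimate - correct_enumerations
    return omissions / dse_estimate
```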
Question 7b examines the negative actions (deletes, duplicates, moves from one BCU to another
BCU, and nonresidential) and the effect these records have on cost by potentially increasing the
NRFU workload. Virtual Canvassing results will provide the negative actions in the sample
BCUs that IR incorrectly classified. The DITD will match the addresses having negative actions
to the CUF to minimize the potential of false negatives. The analysis will remove any addresses
that match to valid, enumerated addresses.
The 2020 Census NRFU Operational Assessment report is expected to provide a cost per address
(or a cost from which a cost per address can be derived). Multiplying the estimated, weighted
number of addresses having negative actions by the NRFU cost per address will give the
estimated cost impact of incorrectly classifying BCUs as passive.
In addition, the BCUs incorrectly classified as active may increase the cost of In-Field Address
Canvassing. Using Virtual Canvassing and In-Field Address Canvassing results, the analysis will
determine the number of BCUs that have no address inventory changes. The 2020 Census
In-Field Address Canvassing Operational Assessment will provide a cost per BCU. Multiplying
this cost by the estimated, weighted number of misclassified active BCUs will provide the
estimated total cost impact for these BCUs.
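Both cost calculations are simple products of a weighted count and a unit cost, as the sketch below illustrates. Every value shown is a placeholder to be filled in from the assessment reports and the evaluation's weighted estimates.

```python
# A minimal sketch of the two cost-impact products; all values are
# placeholders, not results.
nrfu_cost_per_address = 42.00                # hypothetical, from NRFU assessment
weighted_negative_action_addresses = 10_000  # hypothetical weighted estimate
cost_of_passive_misclassification = (
    weighted_negative_action_addresses * nrfu_cost_per_address
)

infield_cost_per_bcu = 750.00                # hypothetical, from In-Field assessment
weighted_misclassified_active_bcus = 2_500   # hypothetical weighted estimate
cost_of_active_misclassification = (
    weighted_misclassified_active_bcus * infield_cost_per_bcu
)
```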
Question 8 – How effective was the filter in identifying valid living quarters for the In-Field
Address Canvassing dependent address list?
To evaluate how well the In-Field Address Canvassing filter identified housing units for the
dependent address list, the results from the PES P sample will be used for part of the analysis. As
mentioned in the background section, the P sample includes matching of the PES Independent
Listing to the Enumeration Extract—which includes the results from In-Field Address
Canvassing—and field follow up on questionable matches or unmatched housing units. (Note
that PES does not include group quarters, so the research question can only be answered for
housing units.)
The In-Field Address Canvassing Transaction File will show the added housing units that match
back to existing addresses on the MAF that did not pass the filter (i.e., matched adds). The PES
match codes will identify which of these matched adds were valid, residential addresses. In
addition, the PES results will assist in identifying the existing MAF addresses that listers failed
to add. The matched adds representing valid, residential housing units are addresses that,
ideally, should have been included on the In-Field Address Canvassing Extract. If possible, the
analysis will attempt to find one or more common characteristics of these addresses that can
define new filter rules, which would pass the addresses without substantially increasing the
number of invalid addresses that pass. Examples of characteristics the analysis may examine
include the following (a brief screening sketch in code follows the list):
• The latest source that validated the addresses.
• The original source of the addresses.
• The length of time since an operation validated the addresses.
• The geographic location of the addresses.
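A simple way to screen such characteristics jointly is to model the probability that a matched add is a valid, residential unit. The sketch below uses hypothetical file and variable names and is only one possible screening approach.

```python
# A minimal sketch of screening candidate filter-rule characteristics;
# the file and variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

adds = pd.read_csv("matched_adds.csv")  # hypothetical analysis file
model = smf.logit(
    "valid_residential ~ C(latest_source) + C(original_source)"
    " + years_since_validation + C(region)",
    data=adds,
).fit()
print(model.summary())  # large, significant effects suggest candidate rules
```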
Because the PES does not match to the In-Field Address Canvassing addresses having negative
actions (deletes, duplicates, and nonresidential), the analysis will use the results of matching
these types of addresses to the CUF. This match will show the addresses with negative actions
that are valid, enumerated housing units and were deleted in error. The PES match codes will
identify the In-Field Address Canvassing addresses that listers failed to identify as invalid, or
deleted, housing units. Based on the results from the CUF match and the PES match codes, the
analysis will identify the address records that should have been excluded from the In-Field
Address Canvassing Extract.
Question 9 – What is the cost of the reengineered Address Canvassing operation compared to the
cost of a 100 percent In-Field Address Canvassing?
Note: The methodology for this question depends on cost information from the Decennial
Budget Office.
Question 10 – Can unit level modeling support improved targeting for Address Canvassing,
specifying filter criteria, estimating hidden units, or estimating recurring MAF coverage errors?
See the appendix for a description of the methodology for answering this question.
B. Interventions with the 2020 Census
The suppression and salting of addresses during the In-Field Address Canvassing operation
requires interventions with the 2020 Census production solutions or systems.
The DITD will give the In-Field Address Canvassing Filter Flag variable a value that will cause
the suppressed addresses to not pass. By not passing the filter, the addresses will not go to
In-Field Address Canvassing. After completion of In-Field Address Canvassing, all the suppressed
addresses will be reinstated and given a value that passes the enumeration filter.
The DITD will add the salted addresses directly to the input file created for the LiMA. This will
cause the salted addresses to go to In-Field Address Canvassing. After completion of In-Field
Address Canvassing, the DITD will remove all the salted addresses before matching to the
MAF/TIGER. This will keep the salted addresses from continuing on to the enumeration
operations.
C. Implications for 2030 Census Design Decisions and Future Research and Testing
The Decennial Research Objectives and Methodology (DROM) Working Group removed the
following proposed research question because of substantial limitations associated with acquiring
results that are meaningful and not confounded:
If In-Office Address Canvassing Interactive Review was eliminated, what would be the effect on
the decennial census? If In-Field Address Canvassing was eliminated, what would be the effect
on the decennial census?
Stakeholders agreed that a more meaningful research focus for the early decade phase would
involve testing alternative operational designs to strike an optimal balance between in-office and
in-field operations for improved efficiency and accuracy.
VI. Data Requirements
The table below lists the data needed to answer the research questions along with the source of
the data, how the evaluation will use the data (purpose), and the expected delivery date.
Data File/Report | Source | Purpose | Expected Delivery Date

August 2018 Block Characteristics File | Decennial Information Technology Division (DITD) | This file provides the preliminary active, passive, on hold status of blocks and will be used for selecting the sample of suppressed and salted addresses in the In-Field Address Canvassing. | 08/03/2018

August 2018 MAF Extract | DITD | This MAF extract will be used for selecting the sample of suppressed and salted addresses in the In-Field Address Canvassing. | 08/10/2018

January 2019 Block Characteristics File | DITD | This file provides the updated active, passive, and on hold status of blocks and will be used to check how status changes affect the suppressed and salted sample. | 01/04/2019

January 2019 MAF Extract | DITD | This MAF extract will be used to check how updates affect the sample of suppressed and salted addresses in the In-Field Address Canvassing. | 01/11/2019

Dangerous Address File | FLD or DCMD | Dangerous addresses will be excluded from the suppressed sample. | 04/24/2019

Block Characteristics File for In-Field Address Canvassing | DITD | This file will give the final status (active, passive, etc.) for blocks. | 01/31/2020

BCU Table for In-Field Address Canvassing | GEO – Address and Spatial Analysis Branch | This file will give the final status (active, passive, on hold) for BCUs. | 01/31/2020

Address Canvassing Geographic Reference File – U.S. | GEO – Address and Spatial Analysis Branch | This file will show the BCUs going to In-Field Address Canvassing. | 06/28/2019

Address Canvassing MAF Extract – U.S. | DITD | This file provides all the addresses on the MAF prior to In-Field Address Canvassing and a flag will show the addresses that pass the filter for Address Canvassing. | 06/28/2019

Trigger Status File | DITD | This shows the final status of blocks or BCUs that were triggered and never triggered. | 07/31/2019

Trigger Event File | DITD | This file shows the trigger reasons and events and the results from IR for each event for each BCU. | 07/31/2019

LiMA Output (Address Update) File from In-Field Address Canvassing | Applications Development and Services Division (ADSD) | This file shows the actions and updates made by the listers and QC listers. It will aid DSSD in resolving potential duplicates. | 11/08/2019

In-Field Address Canvassing Transaction File | DITD | This file shows the lister actions and the updates to the MAF. For suppressed addresses, it will show the MAFIDs added by the listers. For salted addresses, it will show the addresses identified as deletes. | 12/17/2019

Virtual Canvassing Transaction or Address Update File | DITD | This file shows the actions taken by staff doing the Virtual Canvassing operation. | 11/20/2020

Undeliverable as Addressed File | CDL | This file shows the addresses from Mailout that were undeliverable by the U.S. Post Office. The evaluation can use this file to check the validity of addresses after In-Field Address Canvassing. | 09/30/2020

Cost Data | Decennial Budget Office and the Uniform Tracking System (UTS) | The data will allow a comparison of the 2020 Address Canvassing costs and the 2010 Address Canvassing costs. | 05/01/2020

Census Unedited File | Census Data Lake (CDL) | The CUF provides the final enumeration status of housing units in the census. | 11/30/2020

Match Files | DITD | The match files will indicate whether In-Field Address Canvassing added addresses and deleted addresses ended up being enumerated as valid, residential addresses. | 12/30/2020

In-Field Address Canvassing Operational Assessment Report | DCMD | The assessment report will provide a cost per BCU and cost per address for the In-Field Address Canvassing operation. | 12/2020

NRFU Operational Assessment Report | DCMD | The assessment report will provide a cost per address for the NRFU operation. | 03/2021

PES Files | DITD | The PES results will provide benchmarks for evaluating the In-Field and In-Office Address Canvassing operations. | 09/30/2021 [1]

[1] To avoid interference with PES analysis, the evaluation will use PES files after the PES analysis is complete.
The date shown is when PES analysis and reporting is scheduled to complete.
VII. Risks
1. If adequate resources to conduct the evaluation activities are unavailable, then activities may
be delayed or descoped.
2. If the Census Bureau cannot obtain a contractor for the Virtual Canvassing by August 29,
2019, then analysts will not be able to answer some research questions.
3. If a lister adds a suppressed sample address in a manner that differs from the original address,
then the processing may not match the added address to the original MAF address, and a
duplicate address may be created. This can result in an increase in the NRFU workload.
4. If resources are constrained because of competing work on 2020 Census production
activities, then Virtual Canvassing may be delayed or canceled.
5. If information regarding triggers is not available, then the research question regarding
triggers may not be fully answered.
VIII. Limitations
Limitations that may affect the results of this evaluation include:
1. Differences between BCU and Block – The In-Office Address Canvassing unit of
geographic measure is a block, whereas the In-Field Address Canvassing and the PES unit of
measure is a BCU. The results from the two operations need to be analyzed using the same
geographic unit, which in the case of this evaluation, will be a BCU. The relationship of a
block to a BCU is not always one-to-one. As a result, it is possible for a BCU to contain (all
or part of) an active block and (all or part of) a passive block. When a BCU contains at least
part of an active block, it becomes an active BCU and is sent for listing. Even if only the
passive portion of the BCU has one or more address inventory actions, the analysis would
consider the BCU as correctly classified. However, an analysis at the block-level would
include the block in the misclassification estimate, causing a block misclassification rate to
be higher than a BCU misclassification rate. The same is true for other BCU status
categories.
2. Temporal differences between operations - The timing of when each operation will be
conducted, including the state of the MAF when the address lists or extracts will be created,
may confound the analytic results.
IX. Issues That Need to be Resolved
Will the Experiments, Assessments, and Evaluations budget cover the cost of contractors to
conduct the Virtual Canvassing operation?
X. Division Responsibilities
The table below lists the divisions and offices involved in the development and implementation
of the evaluation and their responsibilities.
Division or Office | Responsibilities

ADSD
• Deliver LiMA output file from In-Field Address Canvassing to DSSD.

Contractors
• Conduct the Virtual Canvassing operation.

CSRM
• Conduct a cost-benefit analysis of the Address Canvassing operations.
• Conduct analysis on modeling.
• Review and provide comments on the evaluation methodology.

DCMD
• Provide overall management of experiments/evaluations, the budget, and schedule.
• Review the evaluation study plan and report.
• Provide cost data on the Address Canvassing operation.

DITD
• Deliver data files from census and PES operations.
• Ensure suppressed addresses go into the Mailout operation and salted addresses do not move on.
• Conduct additional matching of evaluation files to the final census.

DSSD
• Specify requirements for data products needed to implement the evaluation.
• Design and select the suppressed and salted samples.
• Conduct analysis to answer the research questions.
• Report on the schedule status to DCMD.
• Develop the study plan and write the report.

GEO
• Provide oversight and management of the Virtual Canvassing operation and the repeat IR.
• Write software requirements for the requested data products.
• Provide data on the status of blocks and BCUs for In-Field Address Canvassing.
• Provide data on the triggered blocks and BCUs.

NPC
• Conduct a repeat IR for the BCUs in the PES sample.
XI. Milestone Schedule

Activity Name | Orig Duration | Start | Finish

2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan

Initial Draft
Prepare Initial Draft of 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan | 15 | 05/11/2018 | 09/27/2018
Distribute Initial Draft of 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan to the Author’s Division Chief, Subject Matter Experts (SMEs) and Other Reviewers | 1 | 09/28/2018 | 09/27/2018
Incorporate Author’s Division Chief, SMEs and Other Comments to 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan | 5 | 10/10/2018 | 10/29/2018

Final Draft
Prepare Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan | 5 | 11/14/2018 | 01/30/2019
Distribute Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan to Evaluations & Experiments Coordination Branch (EXC) | 1 | 01/31/2019 | 01/31/2019
EXC Distributes Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan to the DROM Working Group for Electronic Review | 1 | 02/01/2019 | 02/01/2019
Receive Comments from the DROM Working Group on the Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan | 5 | 02/02/2019 | 02/19/2019
Schedule the 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan for the IPT Lead to Meet with the DROM Working Group | 17 | 01/14/2019 | 02/05/2019
Discuss DROM Comments on Final 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan | 1 | 02/19/2019 | 02/19/2019

FINAL
Prepare FINAL 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan | 15 | 02/20/2019 | 03/14/2019
Distribute FINAL 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan to the EXC | 1 | 03/15/2019 | 03/15/2019
EXC Staff Distributes the 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan and 2020 Memorandum to the DCCO | 3 | 03/18/2019 | 03/21/2019
DCCO Staff Process the Draft 2020 Memorandum and the 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan to Obtain Editorial Clearance (Chief Editor) | 100 | 03/22/2019 | 08/09/2019
DCCO Staff Formally Release the 2020 Census Evaluation of the Reengineered Address Canvassing Operation Study Plan in the 2020 Memorandum Series | 1 | 08/12/2019 | 08/12/2019

2020 Census Evaluation of the Reengineered Address Canvassing Operation Report

Initial Draft of Report
Receive, Verify, and Validate 2020 Census Evaluation of the Reengineered Address Canvassing Operation Data for the Suppressed and Salted Addresses | 20 | 12/16/2019 | 01/06/2020
Examine Results and Conduct Preliminary Analysis for the 2020 Census Evaluation of the Reengineered Address Canvassing Operation Suppressed and Salted Addresses | 20 | 01/07/2020 | 05/01/2020
Brief DROM on the Preliminary Results for the 2020 Census Evaluation of the Reengineered Address Canvassing Operation Suppressed and Salted Addresses | | 05/21/2020 | 05/21/2020
Conduct “Second” Interactive Review for the 2020 Census Evaluation of the Reengineered Address Canvassing Operation in PES Sample BCUs | | 08/17/2020 | 08/31/2020
Conduct Virtual Canvassing for the 2020 Census Evaluation of the Reengineered Address Canvassing Operation in PES Sample BCUs | | 09/01/2020 | 10/30/2020
Receive, Verify, and Validate 2020 Census Evaluation of the Reengineered Address Canvassing Operation Data | 20 | 08/03/2018 | 09/30/2021
Examine Results and Conduct Analysis | 20 | 10/01/2021 | 04/29/2022
Prepare Initial Draft of 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report | 15 | 04/29/2022 | 07/08/2022
Distribute Initial Draft of 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report to the Author’s Division Chief, SMEs and Other Reviewers | 1 | 07/11/2022 | 07/11/2022
Incorporate Author’s Division Chief, SMEs and Other Comments 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report | 7 | 07/12/2022 | 07/29/2022

Final Draft of Report
Prepare Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report | 8 | 08/01/2022 | 08/15/2022
Distribute Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report to Evaluations & Experiments Coordination Branch (EXC) | 1 | 08/16/2022 | 08/16/2022
EXC Distributes Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report to the DROM Working Group for Electronic Review | 1 | 08/17/2022 | 08/17/2022
Receive Comments from the DROM Working Group on the Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report | 10 | 08/18/2022 | 09/08/2022
Schedule the 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report for the IPT Lead to Meet with the DROM Working Group | 10 | 09/09/2022 | 09/26/2022
Discuss DROM Comments on Final Draft 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report | 1 | 09/27/2022 | 09/27/2022
Prepare FINAL 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report | 25 | 09/28/2022 | 11/02/2022
Deliver FINAL 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report to the EXC | 1 | 11/03/2022 | 11/03/2022
EXC Staff Distribute the FINAL 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report and 2020 Memorandum to the DCCO | 3 | 11/04/2022 | 11/09/2022
DCCO Staff Process the Draft 2020 Memorandum and the FINAL 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report to Obtain Clearances (DCMD Chief, Assistant Director, and Associate Director) | 100 | 11/10/2022 | 03/30/2023

FINAL Report
DCCO Staff Formally Release the FINAL 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report in the 2020 Memorandum Series | 1 | 03/31/2023 | 03/31/2023
EXC Staff Capture Recommendations of the FINAL 2020 Census Evaluation of the Reengineered Address Canvassing Operation Report in the Census Knowledge Management Application | 1 | 04/03/2023 | 04/03/2023
XII. Review/Approval Table

Role | Approval Date
Primary Author’s Division Chief (or designee) | 02/01/2019
Decennial Census Management Division (DCMD) ADC for Nonresponse, Evaluations, and Experiments | 02/19/2019
Decennial Research Objectives and Methods (DROM) Working Group | 02/19/2019
Decennial Census Communications Office (DCCO) |
XIII. Document Revision and Version Control History

Version/Editor | Date | Revision Description
0.1 | 08/29/2018 | Initial draft for DSSD team review.
0.2 | 09/27/2018 | Draft for DSSD Division Chief review.
0.3 | 10/25/2018 | Draft for DROM review.
1.0 | 02/01/2019 | Final draft for DROM.
1.1 | 03/15/2019 | Final study plan.
XIV. Glossary of Acronyms

Below is a list of acronyms used in this study plan.

Acronym | Definition
ABR | Active Block Resolution
ADC | Assistant Division Chief
ADSD | Applications Development and Services Division
AVT | Address Validation Test
BCU | Basic Collection Unit
CSRM | Center for Statistical Research and Methodology
CUF | Census Unedited File
DCCO | Decennial Census Communications Office
DCMD | Decennial Census Management Division
DITD | Decennial Information Technology Division
DROM | Decennial Research Objectives and Methods Working Group
DSF | Delivery Sequence File
DSSD | Decennial Statistical Studies Division
EXC | Evaluations & Experiments Coordination Branch
FHU | Final Housing Unit
FLD | Field Division
GEO | Geography Division
GQ | Group Quarters
GSS | Geographic Support System
IHU | Initial Housing Unit
IL | Independent Listing
IPT | Integrated Project Team
IR | Interactive Review
LiMA | Listing and Mapping Application
LUCA | Local Update of Census Addresses
MAF | Master Address File
MAFID | Master Address File Identification Number
MMVT | MAF Model Validation Test
NPC | National Processing Center
NRFU | Nonresponse Followup
PEARSIS | Production Environment for Administrative Records Staging, Integration and Storage
PES | Post-Enumeration Survey
PI | Person Interview
QC | Quality Control
R&M | Research & Methodology Directorate
TEA | Type of Enumeration Area
TIGER | Topologically Integrated Geographic Encoding and Referencing
UAA | Undeliverable as Addressed
U/R | Urban/Rural
UTS | Uniform Tracking System
XV. References
Hill, Courtney, T. Trang Nguyen and Laura A. Davis (2019), “2020 Post Enumeration Survey:
Sample Design,” DSSD 2020 Post-Enumeration Survey Memorandum Series #2020-C-08,
Draft, January 2019.
Holland, J. (2012), “Final Report for the 2010 Census Evaluation of Automation in Field Data
Collection in Address Canvassing,” DSSD 2010 CPEX Memorandum Series #A-05, July
2012.
JASON Program Office, The MITRE Corporation (2016), “Alternative Futures for the Conduct
of the 2030 Census,” November 2016.
Johnson, Nancy and Shannon McDougall (2019), “2018 End-to-End Census Test: Evaluation of
Address Canvassing,” Draft, March 2019.
Kennel, Timothy L. (2019), “The Design of the Post-Enumeration Survey for the 2020 Census,”
DSSD 2020 Post-Enumeration Survey Memorandum Series #2020-B-01, February 2019.
Russell, Chad (1992), “Results of the Precanvass Suppression Study,” April 1992.
Smith, Damon, Diane F. Barrett, and Michael Beaghen (2003), “Analysis of Deleted and Added
Housing Units in Census 2000 Measured by the Accuracy and Coverage Evaluation,”
Census 2000 Evaluation O.19, Decennial Statistical Studies Division, October 2003.
Snodgrass, Sally, Laura Davis, and April Avnayim (2018), “2020 Research and Testing:
Analysis Report – Address Canvassing Test,” Version 1.3, Draft, July 2018.
U.S. Census Bureau (2012), “2010 Census Address Canvassing Operational Assessment,”
January 2012.
U.S. Census Bureau (2013), “Evaluation of Address Frame Accuracy and Quality,” 2010 Census
Planning Memoranda No. 252, February 2013.
U.S. Census Bureau (2015), “2015 Census Address Validation Test,” Version 1.0, March 26,
2016, https://www.census.gov/programs-surveys/decennial-census/2020-census/planning-management/final-analysis/2015-adval-report.html.
U.S. Census Bureau (2018a), “2020 Census Operational Plan,” Version 4.0, December 2018.
U.S. Census Bureau (2018b), “2020 Census Detailed Operational Plan for the Address
Canvassing Operation,” Version 2.0, May 2018.
Ward, Justin (2012), “Final Report for the 2010 Census Evaluation: Data-Based Extraction
Processes for the Address Frame,” DSSD 2010 CPEX Memorandum Series #A-04, June
2012.
Williams, Aneesah (2018), “2020 Research and Testing: Master Address File Coverage Study
2016 Analysis Report,” 2020 Census Program Internal Memorandum Series: 2018.13.i, June
29, 2018.
XVI. Appendix
Proposal for Unit-level Modeling of MAF Changes based on MAF
History Files to Support a 2020 Census Evaluation Proposal on
Reengineered Address Canvassing
Eric Slud, CSRM, US Census Bureau
October 19, 2018
Abstract: This document describes a proposal to perform unit level modeling of Master
Address File (MAF) changes, with the goal of supporting research previously proposed in
Johnson (2018) as part of the 2020 Census Evaluation research program. Predictive modeling of
MAF adds and deletes has previously been undertaken at the block level in connection with the
2020 Targeted Address Canvassing program, but such models have been found insufficiently
predictive to guide Reengineered Address Canvassing. It is argued that block-level modeling
does not allow important distinctions to be drawn between MAF address changes that might or
might not be detectable by remote sensing and those resulting from status changes within
existing addresses. Such less-obvious changes could reflect ‘hidden units’ as discussed in recent
Census Bureau evaluative reports, changes in occupancy or changes in residential versus
commercial status, or corrections or new instances of geocoding errors. It is proposed to explore
extensive MAF history data as a source of new predictive variables to identify basic housing-unit
addresses at risk for MAF status changes, to support unit level MAF-change modeling. Such unit
level models might hope to achieve the decennial-census objective of improved targeting for
Address Canvassing, but might also serve a useful purpose in evaluating mid-decade in-office
address canvassing and in estimating hidden-unit frequencies and other types of recurring MAF
coverage errors.
1. Introduction
The 2020 Census Evaluation Research Proposal of Johnson (2018) describes a program of
research aimed at assessing error rates of MAF changes of various types arising from the
post-Address Canvassing operation, in-field address canvassing, and in-office address
canvassing. Much of this
work would be directly related to recent Decennial reengineered canvassing operations, but the
proposal recognizes that many of the same techniques are relevant for continuing assessment of
the efficacy of systems instituted in Geography Division for in-office canvassing for ongoing
maintenance of MAF as the frame for all major Census Bureau surveys from now on.
Some of the assessment methodology proposed in Johnson (2018) is based on matching and
coverage evaluations that might be undertaken with Coverage Measurement data from the
independent listing and matching operations of the Post Enumeration Survey (PES). Other
suggested assessments would be undertaken from the application of listing and canvassing
systems on geographies from which a sample of actual living quarters would be suppressed or to
which a sample of ‘salted’ or false living quarters would be added. PES results and data from
experiments of that sort would indeed provide useful information on gross error-rates in detecting
incorrect MAF entries. The Johnson (2018) proposal aims to gain information about the rates of
occurrence of types of errors known to be difficult to detect from post-office Delivery Sequence
Files (DSF) and in-field listing, by stratifying the suppressed and seeded addresses according to
address type (single- versus multiunit, mobile units, addresses with indicators associated with
‘hidden units’). However, it seems unlikely that such a design could by itself give a sufficiently
rich picture of geographical neighborhood variation of error types in terms of demographics,
population density, and terrain from design-based analyses alone.
Therefore, some sort of modeling of error rates seems unavoidable. Past efforts to model
MAF additions and deletions from address-canvassing data (Raim and Gargano 2015, Young et
al. 2016) were based on national field-canvassing data and MAF variables and attempted to
produce predictive models for block-level changes in MAF due to canvassing, Interactive
Review, etc. These models were not highly predictive of block-level counts of MAF adds and
deletes, and this may have been because the types of MAF errors that might be partially
predictable from MAF variables are of several different types that would occur differently at
different geographic locations and would be associated with different kinds of MAF variables.
The errors necessarily arise at the level of individual units of living quarters, but when modeled
at block level, only the aggregates of MAF variables to blocks can reasonably serve as predictors.
2. New Approaches
The direction of research suggested here is to model MAF adds and deletes at the unit address
level, in terms of MAF and DSF variables and auxiliary data sources. There are several reasons
to prefer unit-level models, which do not appear to have been undertaken before, for lack of data
and manpower resources. First of all, the MAF and DSF predictor variables primarily describe the
residential and mail-delivery status of individual addresses, not blocks. The predictive value of
address-level information is necessarily diluted if aggregated to block level, so one might expect
better prediction accuracy for models at unit level. A second reason to model at unit level is that
MAF errors can be distinguished by at least several different characteristics related to the origin
and potential detection of the errors: new construction and demolitions can be ascertained either
by street-level observation or remote sensing (satellite pictures); changes in residential status
related to vacancy are opaque to remote sensing but detectable and possibly predictable through
mail-delivery history; subdivision of apartments and other sorts of changes in hidden existing
units with ambiguous addresses are again inaccessible to remote sensing but may have indicators
from address-level DSF and MAF history; similarly, there may be MAF flags indicative of past
and present address-geocoding error histories. Thus, differences in address characteristics
amenable to analysis from MAF history are likely associated with different types and sources of
MAF errors. Some such errors may be particularly relevant to the success of specific procedures
of in-office canvassing, so a third reason for resolving errors at address level, by type, is the
possibility of improving the assessment of quality of those in-office canvassing procedures. This
description of address and error types further suggests that MAF longitudinal data about
individual addresses, which has hardly been used in previous modeling approaches, may be a
promising big-data resource to be mined for descriptive and predictive variables. This kind of
data mining may also involve machine-learning ideas, since appropriately combined and recoded
MAF-history variables could be developed from machine-learning strategies for classification of
addresses. Addresses found to have been misrepresented in MAF might be classified in a number
of carefully prescribed ways, for example, in terms of status changes (subdivided unit such as
garage or basement becoming or ceasing to be living quarters); geocoding error initiation or
correction; new construction or demolition; changes in accessibility (e.g., erection of a gate
around a neighborhood area or development); protracted vacancy, etc.
The sources of predictor variables will build on the analyses done previously in Virgile
(2010), Johnson and Kephart (2013), and Raim and Gargano (2015), relying primarily on MAF
and DSF files. Variables used previously will have to be modified to unit-level variables in some
cases. In addition, multi-year MAF files will be matched where possible, with the objective of
extracting longitudinal histories of variables indicating: interruptions and changes of mail
delivery, occupancy changes, indicators of change of accessibility (such as gates, or building
locks in the case of multiunit structures), flags of new or corrected geocoding errors, indicators of
changes in residency status (such as commercial use, or protracted vacancy), flags indicating new
construction or condemned building, and indicators of ‘hidden units’ by way of changes in the
residential status of subdivided units or basements or outbuildings such as sheds or garages.
Models of MAF changes in terms of predictor variables will build on previous work of Raim
and Gargano (2015) and Young et al. (2016), incorporating new unit level predictors. Some
potential models incorporating the unit-level change models, relevant to the estimation of frame
coverage errors for censuses and surveys, will be investigated following ideas initiated in Slud
(2014).
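As a concrete starting point, the unit-level modeling could begin with a simple binary-response model whose predictions are then aggregated to blocks for comparison with the earlier block-level work. The sketch below uses hypothetical stand-ins for the MAF/DSF history predictors described above; serious work would add feature construction from multi-year MAF files and model comparison.

```python
# A minimal sketch of unit-level MAF-change modeling with hypothetical
# history-based predictors; the file and variable names are not real.
import pandas as pd
import statsmodels.formula.api as smf

units = pd.read_csv("maf_history_units.csv")  # hypothetical unit-level file
model = smf.logit(
    "status_change ~ delivery_interruptions + occupancy_changes"
    " + geocode_correction_flag + hidden_unit_indicator",
    data=units,
).fit()

# Aggregate unit-level predicted probabilities to blocks so results can be
# compared with prior block-level models of MAF adds and deletes.
units["p_change"] = model.predict(units)
block_predictions = units.groupby("block_id")["p_change"].sum()
print(block_predictions.head())
```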
3. Timeline and Resources Required
The exploratory analysis and modeling activity suggested here will be heavily dependent on the
access of CSRM staff to historical MAF files and help from DSSD staff concerning the meaning
and continuity of variable designations over the years. This research would be maximally
productive if staff with MAF data expertise in DSSD, GEO and possibly ACS could be consulted
on multiple occasions to help develop indicators mirroring MAF errors of various types and
could help in the definition of meaningful types. Once the data have been made available, the
exploratory effort will be undertaken by 2 or 3 CSRM mathematical statisticians for a period
likely running 6 months to 1 year. Deliverable products of this analysis would first include new
MAF variable combinations found to be highly associated with errors of specified types: such
variables might be developed either through modeling efforts or through mapping of outputs of
Machine Learning classification to stand-alone variables. A further outcome of the exploratory
analysis would be predictive models for counts of MAF adds and deletes at unit level that might
be described and evaluated by aggregating unit-level predictions to block- and higher-level
geographic domains.
References
Johnson, N. and Kephart, K. (2013), “2010 Census Evaluation of Address Frame Accuracy and
Quality,” 2010 Census Planning Series Memoranda No. 252, U.S. Census Bureau,
February 2013.
Johnson, N. (2018), “Evaluation of the Reengineered Address Canvassing Operation,” 2020
Census Evaluation Proposal, internal document, U.S. Census Bureau.
Raim, A. and Gargano, M. (2015), “Selection of Predictors to Model Coverage
Errors in the Master Address File,” Research Report #2015-04, Center for Statistical
Research and Methodology, U.S. Census Bureau,
https://www.census.gov/srd/papers/pdf/RRS2015-04.pdf
Slud, E. (2014), “Modeling Frame Deficiencies for Improved Calibrations,” Proceedings of the
American Statistical Association, Survey Research Methods Section, JSM 2014
proceedings.
Virgile, M. (2010), “Final Report for the 2010 Census Program for Evaluations and
Experiments: Evaluation of Small Multi-Unit Structures Report,” DSSD 2010 CPEX
Memorandum Series #A-01, U.S. Census Bureau, February 2012.
Young, D., Raim, A. and Johnson, N. (2016), “Zero-inflated modelling for characterizing
coverage errors of extracts from the U.S. Census Bureau’s Master Address File,” Journal
of the Royal Statistical Society Series A 180(1), 73-97.