SUPPORTING STATEMENT FOR
PAPERWORK REDUCTION ACT SUBMISSION
Passport Demand
Forecasting Study
OMB Number 1405-0177; SV-#2012-0006
More than ever before, key decision makers rely on empirical guidance for improving the operations of their organizations. Passport Services’ study strategy relies on scientific sample design, diligent survey administration, effective analysis of survey data, and accurate interpretation of the results. The Passport Demand Forecasting Survey is a critical component of Passport Services’ strategy to marshal empirical data for making efficient, accurate management decisions because it provides Passport Services with reliable projections of demand for U.S. passports.
Another fundamental tenet of our strategy is that rigorous survey research programs are designed and implemented to minimize the so-called Total Survey Error, as depicted in the following diagram. Under this comprehensive framework, each component of error receives proper attention, because an imbalanced focus on any particular source of error allows other error components to grow and create weak links in the survey process.
Figure 1. Components of total survey error

[Figure: Total Survey Error decomposes into four categories of error, including Errors of Dissemination, spanning the survey stages of Sample Coverage, Response Rates, Instrument, Data Collection, Data Editing & Compilation, Imputation & Weighting, and Analysis of Survey Data.]
Striking an optimal balance here requires academic knowledge, hands-on experience, and transparent execution. The solid yet intuitive design that Passport Services has envisioned for this study is conducive to this objective; its main components are outlined in this section. Specifically, in what follows we discuss:
Design and selection of representative samples that are probability-based;
Collection of reliable data using a respondent-friendly protocol;
Effective data enhancement procedures; and
Reliable demand projections and timely reporting of the results.
This forecast study employs an address-based sampling (ABS) methodology to reach a probability-based sample of U.S. households every month. Increasingly, survey researchers are adopting ABS methodologies to reach the general public for data collection applications. Essentially, three main factors account for this shift:
Evolving coverage problems associated with the traditional methods of sampling;
Eroding rates of response to single modes of contact; and
Recent improvements in the databases of household addresses available to researchers.
Indeed, recent advances in databases of household addresses have provided a promising alternative for surveys that require contact with representative samples of households. The Computerized Delivery Sequence File (CDSF) of the USPS is a database that contains all delivery points in the U.S., a summary of which is provided in the following table. For this survey, the monthly sample of addresses is obtained from the enhanced ABS frame developed by Marketing Systems Group (MSG).
Table 1. Distribution of CDSF delivery point types

Delivery Point Type                              Count
City Style/Rural Routes                    115,944,396
Traditional P.O. Box                        14,239,146
Only Way of Getting Mail (OWGM) P.O. Box     1,404,611
Seasonal                                       858,184
Educational                                     94,678
Vacant                                       3,630,143
Throwback                                      275,670
Drop Points                                    731,616
Augmented addresses (by MSG)                   150,587
Total                                      137,329,031
By identifying the latitude and longitude of each address, MSG is able to create a one-to-one correspondence between the postal geographic indicators, which are suited to mail delivery, and the Census-based geographic definitions suited to sampling designs. The resulting database is then augmented with a list of geo-demographic indicators, evolving the raw CDSF into an effective sampling frame suitable for the selection of probability-based samples.
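For illustration only, the following minimal Python sketch shows the kind of postal-to-Census crosswalk this geocoding enables. The record layout, field names, and lookup table are hypothetical constructs of this sketch, not MSG's actual implementation, which relies on geocoded coordinates and Census boundary files.

    import pprint

    # Hypothetical illustration of a postal-to-Census geography crosswalk.
    # A geocoded frame record carries postal geography (for mail delivery)
    # plus coordinates that map it into Census geography (for sampling).
    address = {
        "zip_code": "19044",        # postal indicator
        "carrier_route": "C012",
        "latitude": 40.181,
        "longitude": -75.149,
    }

    # Stub lookup keyed on coordinates rounded to a coarse grid; a real
    # system would run point-in-polygon tests against Census boundary files.
    CBG_LOOKUP = {(40.18, -75.15): "421010123456"}  # 12-digit block group code

    def census_block_group(rec):
        key = (round(rec["latitude"], 2), round(rec["longitude"], 2))
        return CBG_LOOKUP.get(key)

    address["cbg"] = census_block_group(address)
    pprint.pprint(address)  # the record now carries both geographies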
For this study, a monthly sample of approximately 32,000 addresses is selected to represent the nation. This sample is selected from the MSG-enhanced ABS frame, which is updated quarterly. Each sample address is name- and telephone-matched, with the expectation that up to 55 percent of addresses will link to a landline telephone number. Moreover, each address has a set of geo-demographic data appended to it from public and commercial sources.
As detailed later, matched telephone numbers are used as part of our non-response follow-up to maximize response rates, whereas the appended ancillary data support our non-response bias analysis and help develop effective weighting procedures. While a subset of these data is available for individual addresses, other items are retrieved at higher levels of aggregation, such as Census Block Group (CBG) or ZIP Code.
On a quarterly basis, 25 regional estimates are required for this study, corresponding to Passport Services’ 25 geographically-based agencies where U.S. citizens may apply in person for a passport. To ensure that survey estimates for each of the agencies are of equal precision, each monthly sample is stratified accordingly to include an equal number of sample addresses per agency. Specifically, the total monthly sample of approximately 32,000 addresses is stratified into 25 strata of 1,280 addresses each. From the monthly sample of 1,280 addresses per agency, about 160 completed surveys are expected each month, which when aggregated results in about 480 (160 × 3) completed surveys per agency every quarter.
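As a worked illustration of this allocation, the following Python sketch reproduces the arithmetic above; the 12.5 percent yield is simply the ratio implied by 160 expected completes per 1,280 addresses, not an independently stated response rate.

    # Worked illustration of the monthly sample allocation described above.
    MONTHLY_SAMPLE = 32_000          # addresses selected per month
    AGENCIES = 25                    # geographically-based strata
    COMPLETES_PER_AGENCY = 160       # expected completed surveys per month

    addresses_per_agency = MONTHLY_SAMPLE // AGENCIES             # 1,280
    implied_yield = COMPLETES_PER_AGENCY / addresses_per_agency   # 0.125
    quarterly_completes = COMPLETES_PER_AGENCY * 3                # 480 per agency
    monthly_total = COMPLETES_PER_AGENCY * AGENCIES               # 4,000 nationwide

    print(addresses_per_agency, implied_yield, quarterly_completes, monthly_total)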
The table above provides the distribution of addresses by delivery point type. Note that less than 3 percent of all addresses are currently marked as “vacant” in the CDSF. Because a small percentage of such addresses may become occupied by the time survey administration begins for a given month, a nominal fraction of these households can be included in the sampling frame to minimize undercoverage.
Each month, about 4,000 completed surveys are secured using multiple methods of survey administration. Both web and telephone modes of data collection are used to produce the highest rates of response by making the survey experience as convenient for respondents as possible. Briefly, our survey administration protocol consists of the following main steps (an illustrative scheduling sketch follows the list):
One week before the beginning of the month, a cross-sectional random sample of approximately 32,000 addresses is selected from the latest MSG-enhanced version of the CDSF.
Five days before the beginning of the month, invitation letters are mailed to sample households in Alaska and Hawaii.
Four days before the beginning of the month, invitation letters are mailed to sample households in the Pacific and Mountain Time zones.
Three days before the beginning of the month, invitation letters are mailed to sample households in the Central and Eastern Time zones; this staggered mailing is designed to allow for actual receipt of the invitations to occur simultaneously throughout all selected households nationwide.
Sample households may begin responding to the invitation by web or inbound telephone call on the first day of the month, estimated to be concurrent with receipt of the letter.
On the first day of the month, outbound telephone calls begin to all non-responding households with telephone numbers appended to their records.
On the ninth day of the month, reminder letters are mailed to the remaining non-responding households in Alaska and Hawaii.
On the tenth day of the month, reminder letters are mailed to the remaining non-responding households in the Pacific and Mountain time zones.
On the eleventh day of the month, reminder letters are mailed to the remaining non-responding households in the Central and Eastern time zones.
Data collection closes on or around the 25th of each month upon completion of 4,000 total interviews within that month (to occur no later than the final day of the month); this total includes web responders, inbound telephone interviews, and outbound telephone interviews conducted to achieve the highest possible response rate.
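The staggered schedule above can be summarized programmatically. The following Python sketch is illustrative only; the day offsets mirror the protocol described above, and the grouping labels are assumptions of this sketch.

    from datetime import date, timedelta

    # Illustrative summary of the staggered mailing schedule above; offsets
    # are days relative to the first day of the data collection month.
    INVITATION_OFFSETS = {        # days BEFORE the first of the month
        "Alaska/Hawaii": 5,
        "Pacific/Mountain": 4,
        "Central/Eastern": 3,
    }
    REMINDER_DAYS = {             # day OF the month for the reminder mailing
        "Alaska/Hawaii": 9,
        "Pacific/Mountain": 10,
        "Central/Eastern": 11,
    }

    def mailing_dates(month_start: date, group: str):
        """Return (invitation mail date, reminder mail date) for a group."""
        invite = month_start - timedelta(days=INVITATION_OFFSETS[group])
        remind = month_start + timedelta(days=REMINDER_DAYS[group] - 1)
        return invite, remind

    print(mailing_dates(date(2012, 7, 1), "Central/Eastern"))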
Prior to the start of the month, the survey instrument is programmed and tested in two versions: one for telephone interviewing (whether inbound or outbound) and the other for self-administration via the web. Data are collected throughout the month across all available modes using a single sample file that contains both the telephone-matched records and the unmatched records. The sample file is managed as one unit regardless of the mode of survey completion. Survey responses, overall progress, progress by geographic area, and sub-groups of interest are monitored in real time, because every mode is captured in the same program and data file.
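One way to picture the unified sample file is as a single record per address that carries contact information and a mode-agnostic disposition. The field names in this Python sketch are hypothetical illustrations, not the study's actual file layout.

    from dataclasses import dataclass
    from typing import Optional

    # Hypothetical record layout for the unified sample file: one record per
    # sampled address, shared by the web, inbound, and outbound operations.
    @dataclass
    class SampleRecord:
        address_id: str
        agency_stratum: int                    # 1..25
        phone: Optional[str] = None            # present only if telephone-matched
        surname: Optional[str] = None          # present only if name-matched
        bilingual_flag: bool = False           # English/Spanish invitation letter
        disposition: str = "pending"           # e.g. pending/complete/refused/UAA
        completion_mode: Optional[str] = None  # "web", "inbound", or "outbound"

    def mark_complete(rec: SampleRecord, mode: str) -> None:
        """Record a completion so no further outbound dialing occurs."""
        rec.disposition = "complete"
        rec.completion_mode = mode

    rec = SampleRecord(address_id="A0001", agency_stratum=7, phone="555-0100")
    mark_complete(rec, "web")   # the outbound queue skips completed records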
Between ten and four days prior to the start of the month, the sample is matched to external records systems to generate as many mailings as possible addressed to the surname of the household members. The salutation strategy inserts the name when available and includes “or Current Resident” on all outgoing mail pieces.
Within each responding household eligible for this study, an adult householder familiar with the general travel habits of the household is asked to participate on behalf of all of its members. While the web is encouraged as the preferred mode when appropriate, inbound and outbound telephone are used as alternative means of completing the survey to increase response rates. The invitation, as well as the outbound and inbound phone efforts, stresses the importance of selecting a respondent within the household who is at least 18 years of age and is familiar with the general travel habits and travel document needs of all household members. Subsequently, the respondent is asked to provide a complete enumeration of the entire household and to answer questions about the travel needs of both the selected respondent and all eligible household members.
Records in high-incidence Hispanic block groups (defined as block groups that are 75% Hispanic) and those matched with Hispanic surnames are flagged so that a bilingual (English/Spanish) invitation letter is mailed to those addresses, while the remaining invitation letters are English only. The web version of the instrument is available to all responders in their choice of English or Spanish. Bilingual interviewers are always available to conduct inbound and outbound interviews in either English or Spanish, as preferred by the respondent.
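The flagging rule can be expressed compactly, as in the following Python sketch; the at-least interpretation of the 75 percent threshold and the stub surname list are assumptions of this sketch.

    from typing import Optional

    # Illustrative restatement of the bilingual-mailing rule described above.
    HISPANIC_CBG_THRESHOLD = 0.75
    HISPANIC_SURNAMES = {"GARCIA", "RODRIGUEZ", "MARTINEZ"}  # stub list

    def needs_bilingual_letter(cbg_pct_hispanic: float,
                               surname: Optional[str]) -> bool:
        """Flag an address for the English/Spanish invitation letter."""
        surname_match = surname is not None and surname.upper() in HISPANIC_SURNAMES
        return cbg_pct_hispanic >= HISPANIC_CBG_THRESHOLD or surname_match

    print(needs_bilingual_letter(0.80, None))      # True: high-incidence CBG
    print(needs_bilingual_letter(0.10, "Garcia"))  # True: surname match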
All completed surveys are managed through a unified system and sample file. This ensures that if a respondent completes the survey online or by calling the toll-free number, he or she is not contacted by an outbound dialing interviewer. Invitations determined by the USPS to be Undeliverable as Addressed are expected to be returned starting around the seventh day of the month. These records are processed and the sample file is updated accordingly. Invitations that were name-matched during the sample preparation phase and mailed to that surname’s household but returned as undeliverable by the USPS are re-addressed to “Current Resident” in the reminder mailing.
Outbound interviewing of the general public is usually conducted in the evening between 6:00 p.m. and 9:00 p.m. local time and on weekends. Daytime outbound calls are also conducted as dictated by sample needs and respondent preferences. Experienced daytime interviewers are briefed to handle inbound calls and callbacks. The data collection team keeps its facilities open and available for inbound and outbound calls from 9:00 a.m. to 11:00 p.m. Monday through Friday, 11:00 a.m. to 6:00 p.m. on Saturday, and noon to 10:00 p.m. on Sunday, Eastern Time. In addition, a west coast facility remains open to accommodate outbound dialing through 9:00 p.m. local respondent time in Alaska and Hawaii.
All interviewers used for data collection are experienced survey research interviewers with prior training in the use of CATI (computer-assisted telephone interviewing). The training of STR interviewers surpasses industry standards, averaging about 24 hours before an interviewer starts on any study. Their training includes an overview of survey research, equipment, and quality standards, as well as role-playing and internal survey work. In addition, STR has an interviewer-mentoring program whereby new interviewers are partnered with experienced supervisors and executive interviewers to allow for one-on-one training. Interviews are digitally recorded and reviewed with interviewers as part of the process of developing high-quality, dedicated interviewers.
Degree of accuracy needed for the purpose described in the justification
As mentioned earlier, Passport Services requires quarterly estimates for each of its 25 geographically-based agencies with a margin of error no larger than ±3% per stratum at the 95% confidence level, in addition to monthly national estimates with a margin of error no larger than ±2% at the 95% confidence level. These margins apply to sampling error, not measurement error. To meet these requirements, the LMI Team completes about 4,000 surveys per month. The average sampling error for the 25 quarterly agency estimates, based on a sample size of approximately 480 and including the anticipated design effect, is about ±5% at the 95% confidence level. The sampling error for the national estimates, based on a sample size of 4,000 and including the anticipated larger design effect, is about ±2% at the 95% confidence level.
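For reference, these margins follow from the standard formula MOE = z·sqrt(deff·p(1-p)/n) with z = 1.96 and the conservative p = 0.5. The design effect values in the following Python sketch are assumptions chosen to reproduce the figures above, not reported study parameters.

    import math

    def margin_of_error(n: int, deff: float, p: float = 0.5, z: float = 1.96) -> float:
        """95% margin of error for a proportion under a complex design."""
        return z * math.sqrt(deff * p * (1 - p) / n)

    # Assumed design effects, back-solved to match the stated margins.
    print(round(margin_of_error(480, deff=1.25), 3))   # ~0.050 -> about ±5%
    print(round(margin_of_error(4000, deff=1.6), 3))   # ~0.020 -> about ±2%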
Unusual problems requiring specialized sampling procedures
Passport Services desires that the survey be representative of all U.S. citizens and U.S. nationals age 16 and older. At this time, there are no anticipated problems requiring specialized sampling procedures. The LMI Team can and will make adjustments as necessary.
Any use of periodic (less frequent than annual) data collection cycles to reduce burden
Passport Services does not anticipate the use of additional periodic data collection cycles.
All practical steps will be taken to maximize response rates to this survey. A sample of such steps includes:
Building Credibility Through Use of an Advance Mailing:
Constructing the survey instrument to be as respondent-friendly as possible
Sponsor identification via signed letters by a Department of State official
Use of Department of State Graphics and color scheme
Reference to website URLs for study legitimacy validation
Provision of toll-free information line and email address to field respondents’ questions
Call to action to respond by a specific deadline
Notification of on-going efforts to be conducted through the month to reach the selected household
Outbound Calling Effort:
Up to ten attempts rotated through various times of day (early evening, late evening, and weekends)
Day time calls as requested by respondents
Specific call back scheduling
Offering respondents the most convenient mode of data collection
Reminder Attempts:
USPS-returned mail is addressed to “Current Resident” for the reminder mailing
Reminder effort reiterates the importance of the study and contains all rapport-building information from the original mailing
Establishment of legitimacy and availability of web and phone information for respondents who may have questions
The data are adjusted to deal with issues of non-response. We describe our approach to survey non-response adjustments and enhancements below.
Data from all scientific surveys must be weighted before they can be used to produce reliable estimates of population parameters. While reflecting the selection probabilities of sample units, weighting also attempts to compensate for practical limitations of a sample survey, such as differential non-response and undercoverage. Furthermore, by taking advantage of auxiliary information about the survey population, weighting can reduce the bias of survey estimates by enabling the responding subset of the sample to better represent its target universe. This is of particular importance for a tracking survey of this nature, since some of the month-to-month random variations can be minimized by weighting the data before survey estimates are produced.
Typically, the weighting process entails three major steps. The first step consists of computation of design weights as the reciprocal of selection probabilities. In the second step, design weights are adjusted for non-response – a process that is guided by a comprehensive non-response bias analysis. In the third step, non-response-adjusted weights are further adjusted to known population estimates to compensate for sampling frame inadequacies. All along, weighting adjustment steps go through a series of quality control checks to detect extreme outliers and to prevent computational inefficiencies.
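In generic survey-weighting notation, these three steps can be written as follows, where pi_i is the selection probability of unit i, c(i) its non-response weighting class, r_j a response indicator, and T_x a vector of known population totals for auxiliary variables x; the symbols are standard conventions rather than quantities defined elsewhere in this statement.

    d_i = 1 / \pi_i
    w_i^{\mathrm{NR}} = d_i \cdot \frac{\sum_{j \in c(i)} d_j}{\sum_{j \in c(i),\; r_j = 1} d_j}
    w_i = g_i \, w_i^{\mathrm{NR}}, \quad \text{with } g_i \text{ chosen so that } \sum_{i:\, r_i = 1} w_i x_i = T_x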
For this survey, we use the WTADJUST procedure of SUDAAN to weight the monthly survey data [1]. Unlike traditional raking procedures, which are based on iterative proportional fitting, this model-based approach incorporates more main effects and lower-order interactions of variables when computing weights. Moreover, because this procedure allows limits on the resulting weight adjustment factors, one can eliminate extreme weights early in the process and control the variability of the final weights. Consequently, it is generally possible to achieve balance with respect to an expanded set of control totals while at the same time reducing the variance of weighted statistics.
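For intuition, the following Python sketch implements classical raking (iterative proportional fitting) with a cap on adjustment factors. It is a simplified stand-in for, not a reimplementation of, SUDAAN's model-based WTADJUST, and the margins shown are invented.

    import numpy as np

    def rake(weights, groups, targets, cap=4.0, iters=50, tol=1e-8):
        """Classical raking with capped per-cell adjustment factors.

        weights: initial design weights, shape (n,)
        groups:  dict margin_name -> integer category code per respondent
        targets: dict margin_name -> {category: population total}
        cap:     maximum allowed adjustment factor per iteration
        """
        w = np.asarray(weights, dtype=float).copy()
        for _ in range(iters):
            max_change = 0.0
            for name, codes in groups.items():
                for cat, total in targets[name].items():
                    mask = codes == cat
                    current = w[mask].sum()
                    if current > 0:
                        factor = float(np.clip(total / current, 1.0 / cap, cap))
                        w[mask] *= factor
                        max_change = max(max_change, abs(factor - 1.0))
            if max_change < tol:
                break
        return w

    # Invented toy margins: two sex categories and two age bands.
    w0 = np.ones(6)
    groups = {"sex": np.array([0, 0, 0, 1, 1, 1]),
              "age": np.array([0, 1, 1, 0, 0, 1])}
    targets = {"sex": {0: 52.0, 1: 48.0}, "age": {0: 40.0, 1: 60.0}}
    print(rake(w0, groups, targets).round(2))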
Our non-response bias analysis includes comprehensive comparisons of the geo-demographic composition of respondents with the corresponding estimates reported by the Current Population Survey (CPS) or, for smaller geographic levels, the American Community Survey (ACS). Each month a fresh round of non-response bias analysis is carried out to provide current guidelines for weight adjustments. In addition to compensating for differential non-response patterns by data collection mode, this approach also takes seasonal variations into account when creating the final weights.
It should be noted that prior to the non-response bias analysis and the computation of survey weights, it is necessary to impute missing data resulting from item non-response and from data items that fail edit checks. Since missing data can create inefficiencies for demand projections, when appropriate we use the method of weighted sequential hot-deck [2] to impute missing survey data. By incorporating the sampling weights, this method of imputation reflects the unequal probabilities of selection in the monthly sample while controlling the expected number of times a particular respondent’s answer is used as a donor to replace missing values.
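The following Python sketch illustrates the idea behind weighted hot-deck imputation in simplified form: donors within an imputation class are drawn with probability proportional to their weights. It omits the sequential, sort-order machinery of the cited weighted sequential hot-deck, and all data shown are invented.

    import random

    def weighted_hot_deck(records, item, class_key, rng=random.Random(7)):
        """Impute missing `item` values from same-class donors,
        drawn with probability proportional to the donors' weights."""
        # Group donors (records with the item present) by imputation class.
        donors = {}
        for rec in records:
            if rec[item] is not None:
                donors.setdefault(rec[class_key], []).append(rec)
        for rec in records:
            if rec[item] is None:
                pool = donors.get(rec[class_key], [])
                if pool:
                    weights = [d["weight"] for d in pool]
                    rec[item] = rng.choices(pool, weights=weights, k=1)[0][item]
        return records

    # Invented toy data: impute a missing yes/no travel-intent item.
    data = [
        {"id": 1, "region": "NE", "weight": 2.0, "intends_travel": "yes"},
        {"id": 2, "region": "NE", "weight": 1.0, "intends_travel": "no"},
        {"id": 3, "region": "NE", "weight": 1.5, "intends_travel": None},
    ]
    print(weighted_hot_deck(data, "intends_travel", "region")[2])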
Finally, survey estimates can be interpreted properly only in light of their associated sampling errors. Since weighting often increases the variances of estimates, use of standard variance calculation formulae with weighted data can result in misleading statistical inferences. For this survey, we will use SUDAAN and SAS to compute weighted demand estimates, whose variances are approximated using the Taylor Series Linearization technique. Without this, any projections of demand are subject to confidence intervals with artificially narrow widths; that is, survey estimates would convey more confidence than they actually warrant.
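As a reference point, for a weighted mean the Taylor Series Linearization replaces the nonlinear ratio estimator with a linear substitute whose variance is straightforward to estimate. The notation below is the generic single-stage, with-replacement textbook form, not output from SUDAAN.

    \hat{\bar{Y}} = \frac{\sum_i w_i y_i}{\sum_i w_i},
    \qquad
    z_i = \frac{w_i \,(y_i - \hat{\bar{Y}})}{\sum_j w_j},
    \qquad
    \widehat{\mathrm{Var}}\left(\hat{\bar{Y}}\right) \approx \frac{n}{n-1} \sum_{i=1}^{n} (z_i - \bar{z})^2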
The survey questionnaire has been internally pre-tested for timing, content, and clarity. Moreover, STR conducted a formal initial pre-test of the survey instrument with a sample of 30 households to confirm that the screening questions and procedures, as well as all survey logic, worked as intended. Although the pre-test was designed as a confirmatory procedure, any issues uncovered with survey instructions (such as item wording or incomplete response categories) were addressed with revisions that were incorporated into the final survey materials upon receipt of agency approval.
Survey procedures are tested in several ways and will be on-going when any changes are made to the base survey instrument. These tests will involve no more than nine participants to examine the comprehensibility, structure, and order of survey questions.
Moreover, as part of our testing procedures, different survey administration options can be examined to improve response rates. While minor in nature, these options may include:
Addressee Mailing Information
Named salutations when addresses are name-matched vs. unnamed salutations using:
Current Resident
State Resident
Other salutations
Initial and Reminder Mailing Treatments
Envelope Tests for Open Rate
Letter Tests for Click Through and Response Rates
Call to action and delivery deadline
Web vs. in-bound phone information placement
Ordering of other invitation information
Logos and stationery options
Timing of Mailing
Testing Efficacy of Reminder Calls
All testing is done in the conduct of the monthly survey utilizing replicated sub-samples of the main sample file.
Based on its experience, STR has developed mailing materials and protocols, which should produce the rates of response required to reach the desired number of monthly interviews. STR plans no testing/experiments of different mailing procedures as a matter of course during the conduct of the fieldwork. Any contemplated mailing procedure changes will be based on observed response rates, mail return rates, and respondent feedback where available. Any proposed mailing procedure changes or experiments will be submitted for approval prior to implementation.
Trent Buskirk, Ph.D., (314) 695-1378, Marketing Systems Group – Sampling Design and Analysis
Gregg Kennedy, (215) 870-8656, Survey Technology and Research Center – Design and Collection
Jyothsna Prabhakaran, (571) 633-7796, LMI – Survey Design and Analysis
[1] Folsom, R.E., and Singh, A.C. (2000). “The Generalized Exponential Model for Sampling Weight Calibration.” Proceedings of the Section on Survey Research Methods of the American Statistical Association, pp. 598–603.
[2] Iannacchione, V.G. (1982). “Weighted Sequential Hot Deck Imputation Macros.” In Proceedings of the Seventh Annual SAS Users Group International Conference (pp. 759–763). Cary, NC: SAS Institute, Inc.