More than ever before, key decision makers rely on empirical guidance for improving the operations of their organizations. Passport Services’ study strategy relies on scientific sample design, diligent survey administration, effective analysis of survey data, and accurate interpretation of the results. The Passport Demand Forecasting Survey is a critical component of Passport Services’ strategy to marshal empirical data for making efficient, accurate management decisions because it provides Passport Services with reliable projections of demand for U.S. passports.
Another fundamental tenet of our strategy is that rigorous survey research programs are designed and implemented to minimize the so-called Total Survey Error, as depicted in the following diagram. Under this comprehensive framework, each component of error receives proper attention; an imbalanced focus on any one source of error allows other error components to grow and create weak links in the survey process.
Figure 1. Components of total survey error
[Figure: a diagram decomposing Total Survey Error into broad categories of error, including Errors of Dissemination, with contributing sources of Sample Coverage, Response Rates, Instrument, Data Collection, Data Editing & Compilation, Imputation & Weighting, and Analysis of Survey Data.]
Striking an optimal balance requires academic knowledge, hands-on experience, and transparent execution. Conducive to this objective is the solid yet intuitive design that Passport Services has envisioned for this study, the main components of which are outlined in this section. Specifically, we will discuss:
Design and selection of representative samples that are probability-based;
Collection of reliable data using a respondent-friendly protocol;
Effective data enhancement procedures; and
Reliable demand projections and timely reporting of the results.
This forecast study employs an address-based sampling (ABS) methodology to reach a probability-based sample of U.S. households every month. Increasingly, survey researchers are adopting ABS methodologies to reach the general public for data collection applications. Essentially, three main factors account for this shift:
Evolving coverage problems associated with the traditional methods of sampling;
Eroding rates of response to single modes of contact; and
Recent improvements in the databases of household addresses available to researchers.
Indeed, recent advances in databases of household addresses have provided a promising alternative for surveys that require contact with representative samples of households. The Computerized Delivery Sequence File (CDSF) of the USPS is a database that contains all delivery points in the United States, a summary of which is provided in the following table. For this survey the monthly sample of addresses is obtained from the enhanced ABS frame developed by Marketing Systems Group (MSG).
Table 1. Distribution of CDSF delivery point types

Delivery Point Type | Count
City Style/Rural Routes | 120,843,190
Traditional P.O. Box | 14,184,975
Only Way of Getting Mail (OWGM) P.O. Box | 1,433,416
Seasonal | 847,540
Educational | 95,691
Vacant | 2,861,124
Throwback | 233,299
Drop Points | 739,338
Augmented addresses (by MSG) | 108,452
Total | 141,347,025
By identifying the latitude and longitude of each address, MSG is able to create a one-to-one correspondence between the postal geographic indicators, which are suitable for mail delivery, and those suitable for sampling designs based on Census geographic definitions. Subsequently, the resulting database is augmented with a list of geo-demographic indicators to evolve the raw CDSF into an effective sampling frame suitable for the selection of probability-based samples.
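As an illustration of this kind of geographic linkage, the sketch below assigns a geocoded address to its Census Block Group with a point-in-polygon test. The boundary file, field names, and brute-force scan are illustrative assumptions, not MSG's actual process; a production system would use a spatial index.

```python
# Minimal sketch of linking geocoded addresses to Census geography.
# Assumes each address has already been geocoded to latitude/longitude
# and that Census Block Group (CBG) boundary polygons are available;
# the file name and fields here are illustrative.
import json
from shapely.geometry import Point, shape

with open("cbg_boundaries.geojson") as f:          # hypothetical boundary file
    cbgs = [(feat["properties"]["GEOID"], shape(feat["geometry"]))
            for feat in json.load(f)["features"]]

def assign_block_group(lat, lon):
    """Return the GEOID of the block group containing the point, if any."""
    point = Point(lon, lat)                        # GeoJSON order: (x=lon, y=lat)
    for geoid, polygon in cbgs:                    # linear scan, for clarity only
        if polygon.contains(point):
            return geoid
    return None                                    # point fell outside all polygons

# Example: tag a sampled address with its Census geography.
address = {"zip": "19104", "lat": 39.9526, "lon": -75.1652}
address["cbg_geoid"] = assign_block_group(address["lat"], address["lon"])
```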
For this study a monthly sample of approximately 28,100 addresses will be selected to represent the nation. This sample is selected from the MSG-enhanced ABS frame, which is updated on a quarterly basis. Each sample address is name and telephone matched, with the expectation that up to 40 percent of addresses link to a landline telephone number. Moreover, each address has a set of geo-demographic data appended to it by relying on public and commercial sources.
As detailed later, matched telephone numbers are used as part of our non-response follow-up to maximize response rates, whereas the appended ancillary data supports our non-response bias analysis to help develop effective weighting procedures. While a subset of such data is available for individual addresses, others are retrieved at higher levels of aggregation, such as Census Block Group (CBG) or ZIP Code.
On a monthly basis, national estimates are required for this study. To ensure that survey estimates for each of the 25 geographically-based passport agencies are of equal precision, each monthly sample is stratified accordingly to include an equal number of sample addresses per agency. Specifically, the total monthly sample of approximately 28,100 addresses is stratified into 25 strata of 1,124 addresses each.
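A minimal sketch of this equal-allocation stratified draw follows, assuming a frame in which each record carries the passport agency that serves its address; the field names are illustrative.

```python
import random

# Equal-allocation stratified sampling as described above:
# 25 passport-agency strata of 1,124 addresses each (25 x 1,124 = 28,100).
ADDRESSES_PER_STRATUM = 1124

def select_monthly_sample(frame, agencies, seed=None):
    rng = random.Random(seed)
    sample = []
    for agency in agencies:                 # one stratum per passport agency
        stratum = [rec for rec in frame if rec["agency"] == agency]
        sample.extend(rng.sample(stratum, ADDRESSES_PER_STRATUM))
    return sample                           # ~28,100 records per month
```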
The table above provides the distribution of addresses and the expected sample size. Note that less than 3 percent of all addresses are currently marked as “vacant” in the CDSF. Given that a small percentage of such addresses may become occupied by the time survey administration begins for a given month, it is possible to include a nominal fraction of such households in the sampling frame to minimize undercoverage.
Each month, about 2,500 completed surveys will be secured using multiple methods of survey administration. For this study, both web and telephone modes of data collection are used to produce the highest rates of response by making the survey experience as convenient as possible for respondents. Briefly, our survey administration protocol consists of the following main steps (a scheduling sketch follows the list):
Approximately 25 days before the beginning of the study month a cross-sectional random sample of approximately 28,100 addresses is selected from the latest MSG-enhanced version of the CDSF.
MSG will provide the sample to the survey vendor SSRS approximately 15 days prior to the study month.
Five days before the beginning of the month, invitation letters are mailed to sample households in Alaska and Hawaii.
Four days before the beginning of the month, invitation letters are mailed to sample households in the Pacific and Mountain time zones.
Three days before the beginning of the month, invitation letters are mailed to sample households in the Central and Eastern time zones; this staggered mailing is designed to allow for actual receipt of the invitations to occur simultaneously throughout all selected households nationwide.
Sample households may begin responding to the invitation by web or inbound telephone call on the first day of the month, estimated to be concurrent with receipt of the letter.
On the first day of the month, outbound telephone calls begin to all non-responding households with telephone numbers appended to their records;
On the ninth day of the month, reminder letters are mailed to the remaining non-responding households in Alaska and Hawaii;
On the tenth day of the month, reminder letters are mailed to the remaining non-responding households in the Pacific and Mountain time zones;
On the eleventh day of the month, reminder letters are mailed to the remaining non-responding households in the Central and Eastern time zones;
Data collection closes upon completion of 2,500 total interviews within the month, to occur no later than the final day of the month; this total includes web responders, inbound telephone interviews, and outbound telephone interviews conducted to achieve the highest possible response rate.
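To make the staggered schedule concrete, the sketch below derives each mailing date from the first day of the study month; the offsets and zone groupings mirror the steps above.

```python
from datetime import date, timedelta

# Mailing offsets, in days, relative to the first day of the study month.
MAIL_OFFSETS_DAYS = {
    "Alaska/Hawaii":    {"invitation": -5, "reminder": 8},   # 9th day = day 1 + 8
    "Pacific/Mountain": {"invitation": -4, "reminder": 9},   # 10th day
    "Central/Eastern":  {"invitation": -3, "reminder": 10},  # 11th day
}

def mailing_schedule(study_month_first_day):
    """Return the invitation and reminder mail dates for each zone group."""
    return {
        zone: {mailing: study_month_first_day + timedelta(days=days)
               for mailing, days in offsets.items()}
        for zone, offsets in MAIL_OFFSETS_DAYS.items()
    }

print(mailing_schedule(date(2024, 7, 1)))
```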
Prior to the start of the month, the survey instruments are programmed and tested in two versions: one for telephone interviewing (whether inbound or outbound) and the other for self-administration via the web. Data are collected throughout the month using all available modes, working from a single sample file that contains both the records with matched phone numbers and the unmatched records. The sample file is managed in a unified fashion regardless of the mode of survey completion. Survey responses, overall progress, progress by geographic area, and sub-groups of interest are monitored in real time, as every mode is captured in the same program and data file.
Approximately 25 days prior to the start of the month, the sample is matched to external records systems to generate as many mailings as possible addressed to the surname of the household members. The salutation strategy inserts a name when available, in addition to including “or Current Resident” on all outgoing mail pieces.
Within each responding household eligible for this study, an adult householder familiar with the general travel habits of the household is asked to participate on behalf of all of its members. While web response is encouraged as the preferred mode when appropriate, inbound and outbound telephone are used as alternative means of completing the survey to increase response rates. The invitation letter, as well as the outbound and inbound phone efforts, stresses the importance of selecting a respondent within the household who is at least 18 years of age and is familiar with the general travel habits and travel document needs of all household members. Subsequently, the respondent is asked to provide the number of individuals by age in the entire household, as well as to respond to questions about the travel needs of both the selected respondent and all eligible household members.
All completed surveys are managed through a unified system and sample file. This ensures that if a respondent completes the survey online or by calling the toll-free number, he or she is not contacted by an outbound dialing interviewer. Invitations determined by the USPS to be Undeliverable as Addressed (UAA) are expected to begin returning around the seventh day of the month; these records are processed and the sample file updated accordingly. Original invitations that were name-matched during sample preparation and mailed to that surname, but returned as undeliverable by the USPS, are re-addressed to “Current Resident” in the reminder mailing.
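A minimal sketch of this unified sample-file logic follows; the status and field names are illustrative, not the vendor's actual system.

```python
# A completed web or inbound case drops out of the outbound queue, and
# USPS "Undeliverable as Addressed" (UAA) returns are re-addressed to
# "Current Resident" for the reminder mailing.
def record_completion(sample_file, case_id, mode):
    rec = sample_file[case_id]
    rec["status"] = "complete"
    rec["mode"] = mode                           # "web", "inbound", or "outbound"

def record_uaa_return(sample_file, case_id):
    rec = sample_file[case_id]
    if rec["status"] != "complete":
        rec["salutation"] = "Current Resident"   # drop the matched surname
        rec["status"] = "uaa_returned"

def outbound_queue(sample_file):
    """Cases still eligible for outbound dialing: not complete, phone-matched."""
    return [cid for cid, rec in sample_file.items()
            if rec["status"] != "complete" and rec.get("phone")]
```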
Outbound calls for general-public surveys are typically dialed in the evening, between 6:00 p.m. and 9:00 p.m. local time, and on weekends. Daytime outbound calls are also conducted as dictated by sample needs and respondent preferences. Experienced daytime interviewers are briefed to handle inbound calls and callbacks. The data collection team keeps its facilities open and available for inbound and outbound calls from 9:00 a.m. to 11:00 p.m. Monday through Friday, 11:00 a.m. to 6:00 p.m. on Saturday, and noon to 10:00 p.m. on Sunday, Eastern Time. In addition, a west coast facility remains open to accommodate outbound dialing through 9:00 p.m. local respondent time in Alaska and Hawaii.
All interviewers used for data collection are experienced survey research interviewers with prior training in use of Computer Assisted Telephone Interviewing (CATI) techniques. The training of SSRS interviewers surpasses industry standards, averaging about 24 hours before starting on any study. Their training includes an overview of survey research, equipment, and quality standards, as well as role playing and internal survey work. In addition, SSRS has an interviewer mentoring program whereby new interviewers are partnered with experienced supervisors and executive interviewers to allow for one-on-one training. Interviews are digitally recorded and reviewed with interviewers as part of the process of becoming high-quality, dedicated interviewers.
Degree of accuracy needed for the purpose described in the justification:
As mentioned earlier, Passport Services requires monthly national-level estimates with a margin of error no larger than ±5% at the 95% confidence level. To meet these requirements, the Koniag Team will complete about 2,500 surveys per month. The sampling error associated with national estimates based on a sample size of 2,500, including an anticipated design effect, is about ±2% at the 95% confidence level.
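As a worked check of this claim, the margin of error for a proportion is z·sqrt(deff·p(1−p)/n). Using the conservative case p = 0.5 and n = 2,500; the design effect value of 1.2 below is purely illustrative.

```python
import math

# 95% margin of error for an estimated proportion, with a design effect.
def margin_of_error(n, p=0.5, deff=1.0, z=1.96):
    return z * math.sqrt(deff * p * (1 - p) / n)

print(round(margin_of_error(2500), 4))            # 0.0196 -> about ±2%
print(round(margin_of_error(2500, deff=1.2), 4))  # 0.0215 -> still near ±2%
```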
Unusual problems requiring specialized sampling procedures
Passport Services desires that the survey be representative of the U.S. population, age 18 and older. At this time, there are no anticipated problems requiring specialized sampling procedures. The Koniag Team can and will make adjustments as necessary.
Any use of periodic (less frequent than annual) data collection cycles to reduce burden
Passport Services does not anticipate the use of additional periodic data collection cycles.
All practical steps will be taken to maximize response rates to this survey. A sample of such steps includes:
Building Credibility Through Use of an Advance Mailing:
Constructing the survey instrument to be as respondent-friendly as possible
Sponsor identification via signed letters by a Department of State official
Use of Department of State Graphics and color scheme
Reference to website URLs for study legitimacy validation
Provision of toll-free information line and email address to field respondents’ questions
Call to action to respond by a specific deadline
Notification of ongoing efforts to be conducted throughout the month to reach the selected household
Outbound Calling Effort:
Up to eight attempts rotated through various times: early evening, late evening, and weekends
Daytime calls as requested by respondents
Specific call-back scheduling
Offering respondents the most convenient mode of data collection
Reminder Attempts:
Mail returned by the USPS is re-addressed to “Current Resident” for the reminder mailing
The reminder effort reiterates the importance of the study and contains all the rapport-building information from the original mailing
Establishment of legitimacy and availability of web and phone information for respondents who may have questions
The data are adjusted to deal with issues of non-response. We describe our approach to survey non-response adjustments and enhancements below.
Data from scientific surveys must be weighted before the resulting data can be used to produce reliable estimates of population parameters. While reflecting the selection probabilities of sample units, weighting also attempts to compensate for practical limitations of a sample survey, such as differential non-response and undercoverage. Furthermore, by taking advantage of auxiliary information about the survey population, weighting can reduce the bias of survey estimates by enabling the responding subset of the sample to better represent its target universe. This is of particular importance for a tracking survey of this nature, since some of the month-to-month random variation can be reduced by weighting the data before survey estimates are produced.
Typically, the weighting process entails three major steps. The first step consists of computation of design weights as the reciprocal of selection probabilities. In the second step, design weights are adjusted for non-response – a process that is guided by a comprehensive non-response bias analysis. In the third step, non-response-adjusted weights are further adjusted to known population estimates to compensate for sampling frame inadequacies. All along, weighting adjustment steps go through a series of quality control checks to detect extreme outliers and to prevent computational inefficiencies.
For this survey we use the WTADJUST weight-adjustment procedure of SUDAAN to weight the monthly survey data [1]. Unlike traditional raking procedures that are based on iterative proportional fitting, this model-based approach incorporates more main effects and lower-order interactions of variables when computing weights. Moreover, because this procedure allows limiting the resulting weight adjustment factors, one can eliminate extreme weights early in the process and control the variability of the final weights. Consequently, with this alternative it is generally possible to achieve balance with respect to an expanded set of control totals while at the same time reducing the variance of weighted statistics.
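WTADJUST itself is a proprietary SUDAAN procedure. As a rough illustration of the underlying idea of bounded adjustment factors only, the sketch below applies simple raking with each multiplicative step capped; it is not the generalized exponential model of Folsom and Singh [1], merely a demonstration of why bounding the factors prevents extreme weights.

```python
# Illustrative raking (iterative proportional fitting) with bounded
# adjustment factors. `records` is a list of dicts, each carrying a
# design weight and categorical calibration variables; `control_totals`
# maps each variable to its target weighted totals by category.
def rake(records, control_totals, bounds=(0.5, 2.0), iterations=20):
    lo, hi = bounds
    for _ in range(iterations):
        for var, targets in control_totals.items():   # e.g. {"region": {...}}
            # Current weighted totals per category of this variable.
            current = {}
            for rec in records:
                current[rec[var]] = current.get(rec[var], 0.0) + rec["weight"]
            for rec in records:
                factor = targets[rec[var]] / current[rec[var]]
                rec["weight"] *= min(max(factor, lo), hi)   # bounded step
    return records
```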
Our non-response bias analysis includes comprehensive comparisons of the geo-demographic composition of respondents with corresponding estimates reported by the Current Population Survey (CPS) or, for smaller geographic levels, the American Community Survey (ACS). Each month a fresh round of non-response bias analysis is carried out to provide current guidance for weight adjustments. In addition to compensating for differential non-response patterns across data collection modes, this approach also takes seasonal variations into account when creating the final weights.
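As a simple illustration of such a comparison, the sketch below contrasts unweighted respondent shares of a geo-demographic variable with external benchmark shares; the variable and benchmark values are illustrative, not actual CPS or ACS figures.

```python
# Compare the respondent distribution of a variable against a benchmark.
def respondent_shares(records, var):
    counts = {}
    for rec in records:
        counts[rec[var]] = counts.get(rec[var], 0) + 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def bias_report(records, var, benchmark_shares):
    """Return category: (respondent share - benchmark share)."""
    shares = respondent_shares(records, var)
    return {k: shares.get(k, 0.0) - b for k, b in benchmark_shares.items()}

# e.g. bias_report(respondents, "age_group",
#                  {"18-34": 0.30, "35-54": 0.33, "55+": 0.37})
```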
It should be noted that prior to the non-response bias analysis and computation of survey weights, it is necessary to impute missing data resulting from item non-response and from data items that fail edit checks. Since missing data can create inefficiencies for demand projections, where appropriate we use the method of weighted sequential hot-deck imputation [2] to impute missing survey data. By incorporating the sampling weights, this method of imputation reflects the unequal probabilities of selection in the monthly sample while controlling the expected number of times a particular respondent’s answer is used as a donor to replace missing values.
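The weighted sequential hot-deck of Iannacchione (1982) [2] additionally controls donor usage through the sampling weights; the simplified sketch below shows only the sequential carry-forward structure, and the sort keys are illustrative assumptions.

```python
# Simplified sequential hot-deck: records are sorted so that neighbors
# are similar, then each missing item is filled from the most recent
# observed (donor) value. Sort-key fields are assumed present.
def sequential_hot_deck(records, item):
    ordered = sorted(records, key=lambda r: (r["cbg_geoid"], r["hh_size"]))
    last_donor = None
    for rec in ordered:
        if rec.get(item) is not None:
            last_donor = rec[item]           # observed value becomes the donor
        elif last_donor is not None:
            rec[item] = last_donor           # impute from the preceding donor
            rec[item + "_imputed"] = True
    return ordered
```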
Finally, survey estimates can only be interpreted properly in light of their associated sampling errors. Since weighting often increases the variances of estimates, use of standard variance calculation formulae with weighted data can result in misleading statistical inferences. For this survey we will use SUDAAN and Statistical Analysis System (SAS) software to compute weighted demand estimates, the variances of which are approximated using the Taylor Series Linearization technique. Without this correction, projections of demand would carry confidence intervals with artificially narrow widths; that is, survey estimates would suggest a greater level of confidence than they actually warrant.
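As a minimal sketch of the idea (not the SUDAAN/SAS implementation), the following computes a weighted mean and its Taylor-linearized standard error for a single-stage, with-replacement design, ignoring stratification and finite population corrections for brevity.

```python
import math

# Taylor-linearized variance of a weighted mean (a ratio estimator).
def weighted_mean_and_se(values, weights):
    n = len(values)
    wsum = sum(weights)
    mean = sum(w * y for w, y in zip(weights, values)) / wsum
    # Linearized scores: z_i = w_i * (y_i - mean) / sum(w)
    z = [w * (y - mean) / wsum for w, y in zip(weights, values)]
    zbar = sum(z) / n
    var = n / (n - 1) * sum((zi - zbar) ** 2 for zi in z)
    return mean, math.sqrt(var)

mean, se = weighted_mean_and_se([0, 1, 1, 0, 1], [1.2, 0.8, 1.5, 1.0, 0.9])
print(mean, 1.96 * se)   # estimate and half-width of a 95% interval
```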
The survey questionnaire has been internally pre-tested for timing, content, and clarity. Moreover, SSRS engaged in a formal initial pre-test of the survey instrument, conducted with a sample of 30 households to confirm that the screening questions and procedures, as well as all survey logic worked as intended. Although the pre-test was designed as a confirmatory procedure, any issues uncovered with survey instructions (such as item wording or incomplete response categories) were addressed with revisions that were incorporated into the final survey materials upon receipt of agency approval.
Survey procedures are tested in several ways, and testing will be ongoing whenever changes are made to the base survey instrument. These tests will involve no more than nine participants and examine the comprehensibility, structure, and order of survey questions.
All testing is done in the conduct of the monthly survey, utilizing replicated sub-samples of the main sample file.
Based on its experience, SSRS has developed mailing materials and protocols that should produce the rates of response required to reach the desired number of monthly interviews. SSRS plans no testing or experiments with different mailing procedures as a matter of course during the conduct of the fieldwork. Any contemplated mailing procedure changes will be based on observed response rates, mail return rates, and respondent feedback where available. Any proposed mailing procedure changes or experiments will be submitted for approval prior to implementation.
Mansour Fahimi, Ph.D.
Executive Vice President, Chief Data Scientist
Jo Prabhakaran
Project Manager and SME, Data Science
[1] Folsom, R.E., and Singh, A.C. (2000). “The Generalized Exponential Model for Sampling Weight Calibration.” Proceedings of the Section on Survey Research Methods of the American Statistical Association, pp. 598–603.
[2] Iannacchione, V.G. (1982). “Weighted Sequential Hot Deck Imputation Macros.” Proceedings of the Seventh Annual SAS Users Group International Conference, pp. 759–763. Cary, NC: SAS Institute, Inc.