2019 NSCG Adaptive Survey Design Experiment Goals, Interventions, Monitoring Metrics, and Potential Intervention Points

Appendix_H.1_AdaptiveDesign_30Oct18.docx

National Survey of College Graduates (NSCG)

OMB: 3145-0141

APPENDIX H

2019 NSCG Adaptive Survey Design

Experiment Goals, Interventions, Monitoring Metrics, and Potential Intervention Points

2019 NSCG Adaptive Design Experiment Goals, Interventions, and Monitoring Metrics

The 2019 NSCG Adaptive Design Experiment (“2019 Experiment”) will be structured largely the same as the 2015 and 2017 NSCG Adaptive Design Experiments. Just as in those years, we will have experimental groups for the new sample cases (8,000) and the returning sample cases (10,000) with control groups identified for comparative purposes. Improvements will come from two directions for the 2019 Experiment:

Cases will be identified for interventions based on their ability to reduce the root mean squared error (RMSE) for key variables in the NSCG. Additionally, we will expand the data monitoring metrics that we implement during data collection to include evaluating the stability of survey estimates.
We will automate both the identification and selection of cases for interventions, as well as the delivery of the intervention file directly to the data collection modes. This will reduce the number of handoffs required to enact an intervention, making the implementation of adaptive design more efficient.

In 2015, NCSES and the Census Bureau worked to develop flow processing capabilities for the entire survey, with editing, weighting, and imputation occurring at time points during the data collection period as opposed to waiting until after data collection was over to perform the data processing. For the 2019 Experiment, we will be implementing simplified versions of flow processing to allow us to examine differences between the treatment and control groups not only with respect to representativeness and response rate, but also regarding stability of estimates and the effect of our nonresponse adjustment. These types of metrics will be considered as contributing factors in our decisions to make interventions.

Additionally, we will use past rounds of the NSCG to impute responses for non-respondents throughout data collection, along with the propensity to respond given the application of particular data collection features and the cost of those features. These simulations will allow us to determine which features are most effective at reducing the RMSE of key estimates while understanding their effect on response rates and budget. We can use these simulations to evaluate the 2019 NSCG to see if the effects of data collection features are relatively stable over time.
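As a rough illustration of the metric these simulations target, RMSE combines bias and variance: the squared RMSE equals the squared bias plus the variance. The sketch below is a minimal Python example under stated assumptions, not the NSCG implementation. The function names, the list of simulated estimates, and the benchmark value are all hypothetical.

```python
import math
from statistics import fmean


def rmse(simulated_estimates, benchmark):
    """Root mean squared error of simulated estimates around a benchmark value."""
    squared_errors = [(est - benchmark) ** 2 for est in simulated_estimates]
    return math.sqrt(fmean(squared_errors))


def bias_and_variance(simulated_estimates, benchmark):
    """Split the error into bias (mean minus benchmark) and variance around the mean."""
    mean_est = fmean(simulated_estimates)
    bias = mean_est - benchmark
    variance = fmean([(est - mean_est) ** 2 for est in simulated_estimates])
    return bias, variance


# Hypothetical simulated estimates of one key proportion under one strategy.
estimates = [0.52, 0.48, 0.50, 0.54]
bias, variance = bias_and_variance(estimates, benchmark=0.50)
print(rmse(estimates, benchmark=0.50), bias, variance)
```

Comparing this quantity across candidate data collection features, alongside their simulated cost, is one way to rank features by RMSE reduction per dollar.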

The second improvement will continue the automation of the data analytic and business rule execution that was ad hoc in nature in the adaptive design experiments from previous cycles. While some monitoring metrics, including R-indicators, were run on an automated basis, specific decisions about when and where interventions should actually occur were the result of extended conversations and incremental data analysis. While these steps were important in the early stages of adaptive design, and for understanding how large interventions would be, adaptive design cannot be implemented in a standardized, repeatable production setting while maintaining such an extremely hands-on approach. For the 2019 Experiment, we will review the analytical questions that arose during past adaptive design decision meetings and attempt to automate these types of analyses in conjunction with the data monitoring metrics.

In a general sense, the goal of the 2019 Experiment is to evaluate new methods for case identification for interventions, expand usage of and access to data monitoring metrics, and develop a baseline level of comfort with automated interventions for adaptive design in a production setting.

The remainder of this appendix discusses several reasonable adaptive design goals, the interventions that would allow the NSCG to achieve those goals, and the monitoring metrics that would inform those interventions. As noted earlier, the 2019 Experiment will be structured largely the same as the 2015 and 2017 Experiments, so the goals listed below are similar to those pursued in both earlier experiments. The major difference is that, instead of focusing on R-indicators, which require only frame data and response indicators, the selection criteria for interventions in the 2019 NSCG will use historical and current response data to intervene on cases that will reduce the RMSE of key survey estimates. However, both R-indicators and the RMSE of key estimates can be used to reduce the risk of nonresponse bias in estimates and to balance cost, so this change represents an expanded evaluation of monitoring metrics, without losing sight of our main adaptive design goals.

Goal 1: Balance Sample / Reduce Nonresponse Bias

Balancing the sample and reducing nonresponse bias both relate to maintaining data quality in the face of shrinking budgets and falling response rates. Nonresponse bias arises when the outcomes of interest (the survey estimates) for respondents are different from those of nonrespondents. This difference results in a bias because the resulting estimates represent only a portion of the total target population. Surveys often try to correct for this after data collection using weighting, post-stratification, or other adjustments. Adaptive design interventions instead attempt to correct for nonresponse bias during data collection by changing the respondent population to be more balanced on frame characteristics related to response and outcome measures.

While discussing R-indicators, Schouten et al. provide reasons why balancing on variables related to response status and outcome variables is desirable: “In fact, we view the R-indicator as a lack-of-association measure. The weaker the association the better, as this implies that there is no evidence that non-response has affected the composition of the observed data.” [3] This suggests that “selective forces…are absent in the selection of respondents” out of the sample population [2], so nonresponse approaches missing at random, reducing the risk of nonresponse bias.
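For readers unfamiliar with the metric, the R-indicator of Schouten et al. [2] is defined as R = 1 - 2S, where S is the standard deviation of the estimated response propensities. A value of 1 means all propensities are equal (fully balanced response). The Python sketch below is illustrative only. It assumes propensities have already been estimated from frame variables, and it approximates the population size by the sum of the design weights, which is a simplifying assumption of this example.

```python
import math


def r_indicator(propensities, weights=None):
    """R = 1 - 2 * S(rho), with S the (weighted) std. dev. of response propensities.

    weights are design weights; if omitted, every case gets weight 1.
    The population size is approximated by the sum of the weights (an assumption).
    """
    if weights is None:
        weights = [1.0] * len(propensities)
    total_weight = sum(weights)
    mean_rho = sum(w * p for w, p in zip(weights, propensities)) / total_weight
    variance = sum(
        w * (p - mean_rho) ** 2 for w, p in zip(weights, propensities)
    ) / (total_weight - 1)
    return 1.0 - 2.0 * math.sqrt(variance)


# Equal propensities give perfect balance; spread-out propensities lower R.
print(r_indicator([0.5, 0.5, 0.5, 0.5]))
print(r_indicator([0.2, 0.8, 0.2, 0.8]))
```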

Interventions: Interventions are used to change the type or quantity of contacts targeted at specific subgroups or individuals. Interventions that will be considered for inclusion in the 2019 Experiment include:

Sending an unscheduled mailing to sample persons;
Sending cases to computer assisted telephone interviews (CATI) prior to the start of production CATI nonresponse follow up (NRFU), to target cases with an interviewer-assisted mode rather than limiting contacts to self-response modes;
Putting CATI cases on hold, to reduce contacts in interviewer-assisted modes, while still requesting response in self-response modes;
Withholding paper questionnaires while continuing to encourage response in the web mode to reduce the operational and processing costs associated with certain groups of cases;
Withholding web invites to discourage response in certain groups of cases, while still allowing these cases to respond using previous invitations;

Sending paper questionnaires to web nonrespondents earlier than the scheduled mail date to provide two modes of self-response rather than one; and
Changing the CATI call time prioritization to increase or decrease the probability a case is called during a specific time.

Monitoring Methods:

Root Mean Squared Error of Key Estimates;
R-indicators [2], [3], [4];
Mahalanobis Distance or other distance measure [5];
Response influence [6]; and
Uncertainty/influence of imputed y-values [7].

We used R-indicators in the 2013 and 2015 Experiments and used a modified version of an R-indicator, an individual balancing propensity score, in the 2017 effort. As a metric, R-indicators were useful for measuring response balance and served their purpose as a proof of concept for data monitoring. However, employing more metrics during data collection allows us to assess the usefulness of each monitoring metric and provides more confidence that data collection interventions are targeted in the most efficient way possible. That is, if R-indicators identify subgroups that should be targeted to increase response balance, and another metric (e.g., balancing propensity, response influence, or Mahalanobis distance) identifies specific cases in those subgroups that are also likely to have an effect on nonresponse bias, then we have more confidence that those identified cases are the optimal cases for intervention, from both a response balance and a nonresponse bias perspective.
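The idea of requiring agreement between a subgroup-level metric and a case-level metric can be sketched as a simple intersection. The Python example below is a hypothetical illustration. The field names (`id`, `subgroup`, `influence_score`), the set of under-represented subgroups, and the threshold are assumptions made for this sketch and do not come from the NSCG systems.

```python
def cases_for_intervention(cases, underrepresented_groups, score_threshold):
    """Return IDs of cases flagged by both metrics.

    A case qualifies only if (1) its subgroup was flagged as under-represented
    by the subgroup-level metric and (2) its case-level score meets the threshold.
    """
    return [
        case["id"]
        for case in cases
        if case["subgroup"] in underrepresented_groups
        and case["influence_score"] >= score_threshold
    ]


# Hypothetical case records.
cases = [
    {"id": 101, "subgroup": "new_sample_S&E", "influence_score": 0.9},
    {"id": 102, "subgroup": "new_sample_S&E", "influence_score": 0.2},
    {"id": 103, "subgroup": "returning_non_S&E", "influence_score": 0.95},
]
print(cases_for_intervention(cases, {"new_sample_S&E"}, score_threshold=0.5))
```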

Goal 2: Increase Timeliness of Data Collection

Analysts and other data users that need relevant, up-to-date information to build models, investigate trends, and write policy statements rely on timely survey data. NCSES specifically focused on timeliness as a goal for the 2013 NSCG [4] and reduced the length of time from the beginning of data collection to the time of data release from 28 months to 12 months. This required a reduction in the data collection from ten months to six months. In the future, NCSES is interested in further reducing data collection, specifically, from six months to five months.

Interventions: Interventions will attempt to either encourage response to the NSCG earlier than the standard data collection pathway or will be used to stop data collection if new respondents are not changing key estimates. This could be achieved by introducing modes earlier than the standard data collection pathway, sending reminders that elicit response more quickly, or stopping data collection for all or a portion of cases and reallocating resources. Possible interventions include:

Sending cases to CATI prior to the start of production CATI NRFU, to target cases with an interviewer-assisted mode rather than limiting contacts to self-response modes;
Sending paper questionnaires to web nonrespondents earlier than the scheduled mail date to provide two modes of self-response rather than one;
Sending email reminders earlier than the scheduled dates in data collection; and
Stopping data collection for the sample or for subgroups given a sufficient level of data quality. For example, we could stop data collection if:

key estimates have stabilized, and standard errors fall within acceptable ranges, or
the coverage ratio for a subgroup of interest reaches a pre-determined threshold.

Monitoring Methods:

Propensity to Respond by Modes [8];
Change Point Analysis [9];
Stability of Estimates [10]; and
Coverage Ratios.

Ongoing NSCG research conducted by Chandra Erdman and Stephanie Coffey [8] could inform appropriate times to introduce new modes to cases ahead of the standard data collection schedule. Another possibility involves exploring change point analysis. If the number of respondents per day changes over time, showing fewer responses in a given mode, there may be cause to introduce a new mode ahead of schedule. In addition, we will be able to calculate key estimates on a weekly or semi-weekly basis. As a result, we will be able to track the stability of estimates during data collection to identify times when the data collection strategy has peaked, that is, when it yields fewer responses or responses that add little information beyond what was already collected.
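To make the change point idea concrete, the Python sketch below finds a single change in the mean of a daily response count series by trying every split and keeping the one with the smallest within-segment sum of squared errors. This is a deliberately simple least-squares search used for illustration. It is not the method of the changepoint R package cited in [9], and the daily counts are made up.

```python
from statistics import fmean


def _sse(segment):
    """Sum of squared deviations of a segment from its own mean."""
    mean = fmean(segment)
    return sum((x - mean) ** 2 for x in segment)


def single_change_point(series, min_segment=2):
    """Return the index where the second segment starts, or None if too short.

    Every admissible split is scored by the total within-segment squared error;
    the split with the lowest score is the estimated change point.
    """
    best_split, best_cost = None, float("inf")
    for split in range(min_segment, len(series) - min_segment + 1):
        cost = _sse(series[:split]) + _sse(series[split:])
        if cost < best_cost:
            best_split, best_cost = split, cost
    return best_split


# Hypothetical daily web responses: a drop begins on the fifth day (index 4).
daily_responses = [120, 115, 118, 122, 60, 58, 62, 55]
print(single_change_point(daily_responses))
```

A detected drop in daily responses for a mode could then be reviewed as one input to a decision about introducing the next mode early.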

Goal 3: Reduce Cost

Controlling costs is always a survey management goal. More recently, however, “the growing reluctance of the household population to survey requests has increased the effort that is required to obtain interviews and, thereby, the costs of data collection…[which] has threatened survey field budgets with increased risk of cost overruns” [10]. As a result, controlling cost is an important part of adaptive design. By allowing survey practitioners to reallocate resources during the data collection period, surveys can make tradeoffs to prioritize cost savings over other goals.

Interventions: Interventions will be used to encourage survey response via the web while discouraging response in more expensive modes (mail, CATI), or to eliminate contacts that may be ineffective. Possible interventions include:

Putting CATI cases on hold, to reduce contacts in interviewer-assisted modes, while still requesting response in self-response modes;
Withholding paper questionnaires while continuing to encourage response by web to reduce the operational and processing costs associated with certain groups of cases;
Withholding web invites to discourage response from certain groups of cases, while still allowing these cases to respond using previous invitations;
Prioritizing or deprioritizing cases in CATI during certain call times to increase or decrease the probability a case is called during a specific time frame without having to stop calling the case entirely; and
Stopping data collection for the sample or for subgroups if key estimates and their standard errors have stabilized.

Monitoring Methods:

Root Mean Squared Error of Key Estimates;
R-indicators;
Mahalanobis Distance or other distance measure;
Response influence;
Uncertainty/influence of imputed y-values;
Stability of estimates; and
Numbers of trips to locating.

The same indicators that are valuable for monitoring data quality also could measure survey cost reduction. If cases are in over-represented subgroups, or have low response influence, we may want to reduce or eliminate contacts on those cases.

In addition, the key estimates that are valuable for increasing timeliness are also valuable for controlling cost. When estimates stabilize and their standard errors fall within acceptable limits for subgroups or the entire survey, new respondents are providing information similar to that which we have already collected. If continuing data collection would have little effect on estimates and their standard errors, stopping data collection for all cases or for subgroups of cases would be an efficient way to control costs.
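A stopping rule of this kind can be sketched as a check on recent period-to-period changes in a key estimate plus a check on its current standard error. The Python example below is a minimal sketch under stated assumptions: the tolerance, the standard error limit, the window length, and the weekly values are all hypothetical, and the actual criteria would be those set for the experiment.

```python
def estimates_stabilized(estimates, standard_errors, abs_tol, max_se, window=3):
    """Return True if a key estimate looks stable.

    Stable here means: each of the last `window` period-to-period changes is
    at most `abs_tol` in absolute value, and the latest standard error is at
    most `max_se`. Both thresholds are illustrative assumptions.
    """
    if len(estimates) < window + 1 or not standard_errors:
        return False
    recent = estimates[-(window + 1):]
    changes = [abs(later - earlier) for earlier, later in zip(recent, recent[1:])]
    return max(changes) <= abs_tol and standard_errors[-1] <= max_se


# Hypothetical weekly estimates of a proportion and their standard errors.
weekly_estimates = [0.40, 0.46, 0.49, 0.501, 0.503, 0.502]
weekly_ses = [0.05, 0.04, 0.03, 0.02, 0.018, 0.017]
print(estimates_stabilized(weekly_estimates, weekly_ses, abs_tol=0.015, max_se=0.02))
```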

Another potential cost-saving intervention would be to limit the number of times a case can be sent to locating. If we have no contact information for a case, or previously attempted contact information has not been useful for obtaining contact, the case is sent to locating, where researchers attempt to identify new, more up-to-date contact information. This operation can be time-intensive, especially for cases repeatedly sent to locating. We could track the number of times a case is sent to interactive locating, or the length of time it spends in locating. Cases repeatedly sent to locating and cases that spend a large amount of time being researched may not ultimately be productive cases. Reallocating effort from these cases to cases that have been sent to locating fewer times may be a sensible cost-saving measure that allows us to attempt contact on more cases, rather than spending large amounts of time (and money) on the same cases.
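Tracking trips to locating and time spent there lends itself to a simple flagging rule. The Python sketch below is illustrative only. The data layout (a mapping from case ID to the hours spent on each locating trip) and both caps are assumptions made for this example, not values from the NSCG.

```python
def cases_to_stop_locating(locating_history, max_trips=3, max_hours=10.0):
    """Return sorted IDs of cases that exceed either locating cap.

    locating_history maps a case ID to a list of hours spent on each trip to
    locating. A case is flagged if it has more than `max_trips` trips or more
    than `max_hours` total hours. Both caps are hypothetical.
    """
    flagged = [
        case_id
        for case_id, trip_hours in locating_history.items()
        if len(trip_hours) > max_trips or sum(trip_hours) > max_hours
    ]
    return sorted(flagged)


# Hypothetical history: B has too many trips, C has too many total hours.
history = {"A": [1.0, 2.0], "B": [0.5, 0.5, 0.5, 0.5], "C": [6.0, 5.5]}
print(cases_to_stop_locating(history))
```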

Adaptive Design Data Collection Intervention Schedule and Intervention Criteria

To provide insight on the way that adaptive design criteria will be applied in the determination of interventions for the 2019 NSCG adaptive design experiment, NCSES is submitting a table documenting the adaptive design intervention schedule and criteria (Table H.1.).

All sample cases will be monitored beginning at week 0. Adaptive interventions will be reviewed and implemented as needed at weeks 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 23, and 24 of the data collection period. As part of the adaptive design experiment, we have identified certain adaptive interventions that may be implemented depending upon the case monitoring results that could help the NSCG meet its data collection goals. The decision to implement an adaptive intervention will be based on the evaluation of specific criteria associated with the data collection metrics. The specific criteria are described generally below, and the specifics are provided in Table H.1.

The approach for the 2019 NSCG adaptive design experiment is to use predictive models that estimate the RMSE of key survey variables, so that interventions can focus on RMSE reduction for a given cost. This will improve the data quality of survey outcomes, rather than balancing across frame variables, which is the goal of R-indicators. However, those models are currently being built and evaluated. In the event that these models do not have sufficient power, or result in poor predictions of survey outcomes, we will adapt our strategy to maximize the survey R-indicator for a given cost constraint. The propensity models underlying the NSCG R-indicators have been validated over the past three data collection cycles and include variables that are highly correlated with survey outcomes. This use of the propensity models in multiple NSCG cycles provides confidence that our interventions are improving data collection, even if the metric we use is only a proxy for nonresponse bias. NCSES, in coordination with the Census Bureau, plans to make final decisions about the models used for intervention by December 2018. This ensures the intervention methods selected have been jointly reviewed and agreed upon before data collection begins.

The interventions considered for a given week are designed to result in the largest improvement in the target metric while staying below a cost limit. This means that, generally, we do not want to apply an expensive data collection feature (like telephone calls) to a case unless we predict the case is more likely to respond to the more expensive feature than a less expensive feature (like a web invite). At each intervention point, we will be examining both cost and response properties of different data collection features (like sending cases to CATI early or withholding mailed reminders). However, because the NSCG has a sequential design, there are also overarching cost and response properties that will be kept in mind.
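One way to picture "largest improvement in the target metric while staying below a cost limit" is a greedy selection that ranks candidate interventions by predicted improvement per unit cost. The Python sketch below is a heuristic illustration under stated assumptions, not the selection procedure used for the NSCG. The candidate records, the field names (`id`, `cost`, `expected_improvement`), and the budget are hypothetical, and a greedy ranking does not guarantee the best possible set.

```python
def select_interventions(candidates, budget):
    """Greedily pick interventions by expected improvement per unit cost.

    candidates: list of dicts with 'id', 'cost', and 'expected_improvement'
    (for example, a predicted reduction in RMSE). Candidates with no predicted
    improvement, or with non-positive cost, are skipped in this sketch.
    Returns the selected IDs and the total cost spent.
    """
    eligible = [
        c for c in candidates if c["expected_improvement"] > 0 and c["cost"] > 0
    ]
    ranked = sorted(
        eligible, key=lambda c: c["expected_improvement"] / c["cost"], reverse=True
    )
    selected, spent = [], 0.0
    for candidate in ranked:
        if spent + candidate["cost"] <= budget:
            selected.append(candidate["id"])
            spent += candidate["cost"]
    return selected, spent


# Hypothetical candidates: D is predicted to make the metric worse.
candidates = [
    {"id": "A", "cost": 10, "expected_improvement": 0.5},
    {"id": "B", "cost": 2, "expected_improvement": 0.2},
    {"id": "C", "cost": 5, "expected_improvement": 0.1},
    {"id": "D", "cost": 4, "expected_improvement": -0.1},
]
print(select_interventions(candidates, budget=12))
```

Ranking by improvement per unit cost reflects the preference stated above: an expensive feature is chosen only when its predicted benefit justifies its cost relative to cheaper options.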

Early in the data collection, the adaptive interventions generally attempt to actively improve the target metric (RMSE or R-indicator) by increasing response among cases selected for intervention. Therefore, we will generally be identifying cases that should receive more data collection resources. During the middle of the data collection, some of the interventions continue to improve response for cases that require more data collection resources, for example with extra questionnaire mailings to specific groups. However, other interventions reduce effort for cases, and these interventions can be applied to cases that are either equally likely to respond to a more or less expensive data collection feature, or those cases that are just highly unlikely to respond. Finally, near the end of the data collection, using metrics such as the number of trips to locating, response propensities, and the number of call attempts, the interventions focus on controlling data collection costs.

The list of potential interventions for each week is shown in Table H.1., which includes information about metrics and criteria used. Additionally, a flowchart view of the potential data collection interventions illustrates which interventions are available each week.

References:

[1] Coffey, S. (2014, April). “Report for the 2013 National Survey of College Graduates Methodological Research Adaptive Design Experiment.” Census Bureau Memorandum for NCSES.

[2] Schouten, B., Cobben, F., Bethlehem, J. (2009, June). “Indicators for representativeness of survey response.” Survey Methodology, 35.1, 101-113.

[3] Schouten, B., Shlomo, N., Skinner, C. (2011). “Indicators for monitoring and improving representativeness of response.” Journal of Official Statistics, 27.2, 231-253.

[4] Coffey, S., Reist, B., White, M. (2013). “Monitoring Methods for Adaptive Design in the National Survey of College Graduates (NSCG).” 2013 Joint Statistical Meeting Proceedings, Survey Research Methods Section. Alexandria, VA: American Statistical Association.

[5] de Leon, A.R., Carriere, K.C. (2005). “A generalized Mahalanobis distance for mixed data.” Journal of Multivariate Analysis, 92, 174-185.

[6] Särndal, C., Lundström, S. (2008). “Assessing auxiliary vectors for control of nonresponse bias in the calibration estimator.” Journal of Official Statistics, 24, 167-191.

[7] Wagner, J. (2014). “Limiting the Risk of Nonresponse Bias by Using Regression Diagnostics as a Guide to Data Collection.” Presentation at the 2014 Joint Statistical Meetings. August 2014.

[8] Erdman, C., Coffey, S. (2014). “Predicting Response Mode During Data Collection in the NSCG.” Presentation at the 2014 Joint Statistical Meetings. August 2014.

[9] Killick, R., Eckley, I. (2014). “Changepoint: An R Package for Changepoint Analysis.” Downloaded from http://www.lancs.ac.uk/~killick/Pub/KillickEckley2011.pdf on August 8, 2014.

[10] Groves, R.M., Heeringa, S. (2006). “Responsive design for household surveys: tools for actively controlling survey errors and costs.” Journal of the Royal Statistical Society Series A: Statistics in Society, 169, 439-457.

Table H.1. Potential Intervention Points

Date	Week	Production Operation Description	Adaptive Design Interventions	How to determine to intervene using RMSE as the quality metric?	How to determine to intervene using R-indicators as the quality metric?	Other contributing factors
2/7/2019	1	Week 1 Web Invite, Incentives (If Appropriate)	No interventions.	N/A	N/A	N/A
2/14/2019	2	Week 2 Reminder, Questionnaire Mailing (If Mail Preference)	No interventions.	N/A	N/A	N/A
2/28/2019	4 - 23	Production operation varies depending on the data collection week	Activating cases in CATI early or taking cases off hold in CATI	If simulations show that sending a case to CATI early will result in a reduction in RMSE without increasing the cost beyond predefined limits.	If simulations show that sending some cases to CATI early will result in response and a higher R-indicator without increasing the cost beyond predefined limits.	- If these subgroups are low interest groups (e.g., non-S&E), we may not intervene. - If the subgroups are very large and we do not want to move all cases to CATI, use response propensity for these cases, and move over "higher" propensity cases.
2/28/2019	4 - 23	Production operation varies depending on the data collection week	Putting cases in CATI on hold	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If key estimates of interest have not stabilized in the experimental group, we may not use this intervention.
2/28/2019	4 - 23	Production operation varies depending on the data collection week	Sending an off-production-path questionnaire	If simulations show that sending questionnaires to a subset of cases will reduce the RMSE without increasing the cost beyond predefined limits.	If simulations show that sending a questionnaire to some cases will result in response and a higher R-indicator without increasing the cost beyond predefined limits.	- If these cases are in over-represented groups or if they are in low interest groups (e.g., non-S&E), we may not intervene.
3/7/2019, 3/14/2019, 4/25/2019	5, 6, 12	Weeks 5, 6, and 12 Reminder Email	Withhold email reminder	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.
3/7/2019	5	Week 5 Reminder Letter	Withhold reminder contact	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.
3/14/2019	6	Week 6 Reminder Postcard	Withhold reminder contact	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.
3/28/2019	8	Week 8 Questionnaire with Web Invite	Replace questionnaire with web invite	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If probability of responding by mail > probability of responding by web, we may apply this intervention to all cases in these subgroups. - If the most over-represented subgroups are not much different from other groups, we may not use this intervention.
4/11/2019	10	Week 10 Reminder Email	Withhold email reminder	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.
4/25/2019	12	Week 12 Pressure Sealed, Perforated Reminder (Start of CATI)	Withhold reminder contact	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.
5/23/2019	16	Week 16, Postcard Reminder	Withhold reminder contact	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.
6/6/2019	18	Week 18, Web Invite (Prior Round Respondents)	Withhold reminder contact	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.
6/6/2019	18	Week 18 Questionnaire with Web Invite (Prior Round Nonrespondents)	Replace questionnaire with web invite	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If probability of responding by mail > probability of responding by web, we may apply this intervention to all cases in these subgroups. - If the most over-represented subgroups are not much different from other groups, we may not use this intervention.
6/20/2019	20	Week 20, Web Invite, new sample, Priority envelope, questionnaire	Replace questionnaire with web invite	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If probability of responding by mail > probability of responding by web, we may apply this intervention to all cases in these subgroups. - If the most over-represented subgroups are not much different from other groups, we may not use this intervention.
7/11/2019	23	Week 23 Last Chance Email	Withhold email reminder	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.
7/11/2019	23	Week 23, Web Invite	Withhold reminder contact	If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits.	If simulations show that cost savings can be obtained without decreasing the R-indicator beyond predefined limits.	- If the most over-represented subgroups are not much different from other groups, we may not use this intervention. - If key estimates of interest have not stabilized for the experimental group, we may not use this intervention.

File Type	application/vnd.openxmlformats-officedocument.wordprocessingml.document
Author	Milan, Lynn M.
File Created	2021-01-20