
2015-16 NATIONAL POSTSECONDARY STUDENT AID STUDY (NPSAS:16)


Full Scale Student Records Abstraction Form

and Student Interview





Supporting Statement Part B

(OMB # 1850-0666 v. 17)







Submitted by

National Center for Education Statistics

U.S. Department of Education










October 2015











  1. Collection of Information Employing Statistical Methods

This submission requests clearance for the 2015-16 National Postsecondary Student Aid Study (NPSAS:16) data collection instruments and methods. Specific plans are provided below. Materials for full-scale institution contacting, enrollment list collection, and list sampling activities were submitted in a separate package and approved in July 2015.

    1.1 Respondent Universe

      1.1.1 Institution Universe

To be eligible for NPSAS:16, an institution will be required, during the 2015-16 academic year, to:

  • Offer an educational program designed for persons who have completed secondary education;

  • Offer at least one academic, occupational, or vocational program of study lasting at least 3 months or 300 clock hours;

  • Offer courses that are open to more than the employees or members of the company or group (e.g., union) that administers the institution;

  • Be located in the 50 states, the District of Columbia, or Puerto Rico;1

  • Be other than a U.S. Service Academy; and

  • Have a signed Title IV participation agreement with the U.S. Department of Education.

Institutions providing only avocational, recreational, or remedial courses, or only in-house courses for their own employees, will be excluded. The U.S. Service Academies are excluded because of their unique funding/tuition base.

      1.1.2 Student Universe

Students eligible for inclusion in the NPSAS:16 sample are those enrolled in an NPSAS-eligible institution in any term or course of instruction between July 1, 2015, and April 30, 2016, who are:

  • Enrolled in (a) an academic program; (b) at least one course for credit that could be applied toward fulfilling the requirements for an academic degree; (c) exclusively non-credit remedial coursework but who the institution has determined are eligible for Title IV aid; or (d) an occupational or vocational program that requires at least 3 months or 300 clock hours of instruction to receive a degree, certificate, or other formal award;

  • Not currently enrolled in high school; and

  • Not enrolled solely in a GED or other high school completion program.

    1.2 Statistical Methodology

      1.2.1 Institution Sample

The NPSAS:16 field test and full-scale institution samples were selected in a different manner than in previous NPSAS studies. The field test institution frame was constructed from the IPEDS:2012-13 header, Institutional Characteristics (IC), Completions, and Full-year Enrollment files; the full-scale institution frame is constructed from the corresponding IPEDS:2014-15 files. Creating separate institution frames for the field test and full-scale studies yields a more accurate and current full-scale institution sample, since the full-scale frame is constructed from the most up-to-date IPEDS files available; it also eliminates the need to freshen the institution sample. So that institutions are not burdened with both field test and full-scale data collections, we removed from the field test frame any large systems (reporters) and individual institutions likely to be selected with certainty (i.e., probability of selection equal to one) for the full-scale study. We also removed field test sample institutions from the full-scale frame and later will adjust the weights for the full-scale sample institutions so that they represent the full population of eligible institutions.2

A number of for-profit institutions and large chains of for-profit institutions have recently closed or been sold, and the sample design must take this into account. Using all available resources, we identified these closed institutions and, in creating the sampling frame, excluded institutions that remain in IPEDS but are known to be ineligible for NPSAS due to closure.

For the small number of institutions on the frames with missing enrollment information, we imputed the data using the latest IPEDS imputation procedures to ensure complete frame data. A stratified random sample of about 2,000 institutions has been selected from the full-scale frame with probabilities proportional to a composite measure of size (Folsom, Potter, and Williams 1987), the same methodology used since NPSAS:96. Institution measures of size were determined using full-year enrollment and baccalaureate completions data. Composite measure of size sampling ensures that the full-scale target sample sizes are achieved within institution and student sampling strata while also achieving approximately equal student weights across institutions.
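
To make the composite measure of size approach concrete, the following Python sketch computes a composite size from hypothetical stratum counts and target sampling rates and then draws a systematic probability-proportional-to-size sample. All names, counts, and rates below are invented for illustration and do not reflect the actual NPSAS:16 design parameters.

```python
import numpy as np

rng = np.random.default_rng(16)

def composite_size(enrollment_by_stratum, target_rates):
    """Composite measure of size: the sum, over student strata, of the
    stratum's desired sampling rate times the institution's enrollment
    count, which leads to approximately equal student weights."""
    return sum(target_rates[s] * n for s, n in enrollment_by_stratum.items())

def pps_systematic_sample(sizes, n_sample):
    """Systematic probability-proportional-to-size selection. In a real
    design, institutions whose size exceeds the sampling interval are
    removed first as certainties; this sketch assumes none do."""
    cum = np.cumsum(np.asarray(sizes, dtype=float))
    interval = cum[-1] / n_sample
    points = rng.uniform(0, interval) + interval * np.arange(n_sample)
    return np.searchsorted(cum, points)  # frame indices of selections

# Hypothetical three-institution frame and illustrative target rates.
frame = [
    {"baccalaureate": 500, "other_undergrad": 3000, "graduate": 800},
    {"baccalaureate": 0, "other_undergrad": 12000, "graduate": 0},
    {"baccalaureate": 900, "other_undergrad": 2500, "graduate": 1500},
]
rates = {"baccalaureate": 0.02, "other_undergrad": 0.005, "graduate": 0.01}
sizes = [composite_size(inst, rates) for inst in frame]
print(pps_systematic_sample(sizes, n_sample=2))
```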

We have added a stratum for institution sampling by splitting the public 4-year non-doctorate-granting institutions into two sectors: public 4-year institutions that are primarily sub-baccalaureate and those that are primarily baccalaureate. The primarily sub-baccalaureate institutions are mainly community colleges that offer a small number of bachelor’s degrees in select fields (CCBAs). Over 40 percent of students in public 4-year non-doctorate-granting institutions attend primarily sub-baccalaureate institutions, and the students and funding of CCBAs are more similar to those of 2-year institutions than to those of other 4-year institutions. Splitting the public 4-year non-doctorate-granting institutions into two sectors, rather than sampling them together and providing an indicator on the data files, allows us to oversample and control the sample size of the CCBAs and of the students in them, including baccalaureate recipients.

The institutional strata will be the ten sectors that were used for NPSAS:12, with the public 4-year non-doctorate-granting institutions split into two sectors, as described above:3

  1. Public less-than-2-year

  2. Public 2-year

3a. Public 4-year non-doctorate-granting primarily sub-baccalaureate

3b. Public 4-year non-doctorate-granting primarily baccalaureate

  4. Public 4-year doctorate-granting

  5. Private nonprofit less-than-4-year

  6. Private nonprofit 4-year non-doctorate-granting

  7. Private nonprofit 4-year doctorate-granting

  8. Private for-profit less-than-2-year

  9. Private for-profit 2-year

  10. Private for-profit 4-year

For the full-scale study, we expect an overall eligibility rate of 99 percent and an overall institutional participation (response) rate of at least 85 percent. The eligibility and response rates will likely vary by institutional stratum. Based on these expected rates, the estimated institution sample sizes and sample yield for the eleven institutional strata described above are presented in table 1.

Within each institutional stratum, additional implicit stratification is accomplished by sorting the sampling frame by the following classifications:

  • historically black colleges and universities (HBCU) indicator;

  • Hispanic-serving institutions (HSI) indicator;4

  • institutional category (INSTCAT), derived using the level of offerings reported on the IPEDS Institutional Characteristics (IC) component and the number and level of awards reported on the IPEDS Completions (C) component;

  • Carnegie classification of postsecondary institutions;5

  • Office of Business Economics (OBE) region from the IPEDS header file (Bureau of Economic Analysis, U.S. Department of Commerce);6

  • state and system for states with large systems (e.g., the SUNY and CUNY systems in New York, the state and technical colleges in Georgia, and the California State University and University of California systems in California); and

  • institution measure of size.

The objective of this implicit stratification is to approximate proportional representation of institutions on these measures.

Table 1. NPSAS:16 full-scale institution sample sizes and yield

| Institutional sector | Frame count1 | Number sampled | Number eligible | List respondents |
|---|---|---|---|---|
| Total | 6,915 | 2,000 | 1,980 | 1,683 |
| Public less-than-2-year | 237 | 22 | 22 | 19 |
| Public 2-year | 1,014 | 376 | 375 | 332 |
| Public 4-year non-doctorate-granting primarily sub-baccalaureate | 107 | 70 | 70 | 63 |
| Public 4-year non-doctorate-granting primarily baccalaureate | 178 | 96 | 95 | 86 |
| Public 4-year doctorate-granting | 353 | 353 | 352 | 308 |
| Private nonprofit less-than-4-year | 264 | 20 | 19 | 15 |
| Private nonprofit 4-year non-doctorate-granting | 890 | 325 | 325 | 277 |
| Private nonprofit 4-year doctorate-granting | 642 | 268 | 266 | 222 |
| Private for-profit less-than-2-year | 1,634 | 70 | 67 | 49 |
| Private for-profit 2-year | 908 | 120 | 117 | 93 |
| Private for-profit 4-year | 688 | 280 | 273 | 218 |

1 Institution counts based on IPEDS:2013‑14 header files.

NOTE: Detail may not sum to totals because of rounding.

      1.2.2 Student Sample

Student Enrollment List Collection

To begin NPSAS data collection, sampled institutions are asked to provide a list of all their NPSAS-eligible undergraduate and graduate students enrolled in the targeted academic year, which covers July 1 through June 30. Since NPSAS:2000, institutions have been asked to limit the lists to students enrolled through April 30. This truncated enrollment period excludes students who first enrolled in May or June, but it allows lists to be collected earlier and, in turn, data collection to be completed in under 12 months. When evaluated during NPSAS:96, the abbreviated schedule missed only about 3 percent of the target population, and weighting can account for the minimal undercoverage. The approach is being reevaluated with results from the field test data collection.

NPSAS:16 will serve as the base-year data collection for the 2016/17 Baccalaureate and Beyond Longitudinal Study (B&B:16/17) and will be used to qualify students for cohort membership. To that end, we will ask institutions that award baccalaureate degrees to identify students who are expected to complete requirements for the baccalaureate degree by June 30, 2016, the end of the NPSAS year. Instead of waiting until June for institutions to positively confirm degree awards, we will request that enrollment lists include an indicator (B&B flag) of cohort eligibility for students who have received or are expected to receive the baccalaureate degree by June 30, 2016.

As shown in table 2, the percentage of students in NPSAS:08 who were initially flagged as potential baccalaureate recipients but did not actually receive a bachelor’s degree in the NPSAS year (i.e., the false positive rate) was fairly high. The NPSAS:16 sampling rates for potential baccalaureates and other undergraduate students will therefore be adjusted to yield the appropriate sample sizes after accounting for the expected false positive and false negative rates by sector.

Table 2. Weighted false positive rate observed in baccalaureate identification, by sector: NPSAS:08

| Institutional sector | False positive rate (weighted, percent) |
|---|---|
| Public 4-year non-doctorate-granting | 34.7 |
| Public 4-year doctorate-granting | 27.2 |
| Private nonprofit 4-year non-doctorate-granting | 22.3 |
| Private nonprofit 4-year doctorate-granting | 20.7 |
| Private for-profit 4-year | 32.9 |
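
The direction of the required adjustment can be illustrated with a small hedged example: if a sector’s B&B flag has false positive rate f, then roughly target/(1 − f) flagged students must be sampled to yield a target number of confirmed baccalaureates. This sketch ignores false negatives (unflagged students who do receive the degree), which the actual adjustment also accounts for; the target below is hypothetical, while the rate is taken from table 2.

```python
def flagged_sample_size(target_yield, false_positive_rate):
    """Flagged students to sample so that, after false positives drop
    out, roughly the target number of true baccalaureates remains."""
    return round(target_yield / (1.0 - false_positive_rate))

# Illustrative target of 1,000 confirmed baccalaureates in the public
# 4-year non-doctorate-granting sector (false positive rate 34.7%):
print(flagged_sample_size(1000, 0.347))  # -> 1531
```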



Student Stratification

The student sampling strata for the full-scale will be:

  1. Baccalaureate recipients who are veterans

  2. Baccalaureate recipients from science, technology, engineering, and mathematics (STEM) programs

  3. Baccalaureate recipients from teacher education programs

  4. Baccalaureate recipients from business programs

  5. Baccalaureate recipients from other programs

  6. Other undergraduate students who are veterans

  7. Other undergraduate students

  8. Graduate students who are veterans

  9. First-time graduate students

  10. Master’s degree students in STEM programs

  11. Master’s degree students in education and business programs

  12. Master’s degree students in other programs

  13. Doctoral-research/scholarship/other students in STEM programs

  14. Doctoral-research/scholarship/other students in education and business programs

  15. Doctoral-research/scholarship/other students in other programs

  16. Doctoral-professional practice students

  17. Other graduate students.

If students fall into multiple strata, such as students who are veterans or students with double majors, the ordering of the strata above will be used to prioritize the stratification: a student is assigned to the first listed stratum that applies, as sketched below.
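
A minimal sketch of this assignment rule follows; the attribute names and the truncated stratum list are illustrative only.

```python
# Ordered (stratum, condition) pairs; the first match wins, so a
# baccalaureate recipient who is also a veteran lands in stratum 1.
STRATA = [
    (1, lambda s: s["bacc"] and s["veteran"]),
    (2, lambda s: s["bacc"] and s["major"] == "STEM"),
    (3, lambda s: s["bacc"] and s["major"] == "teacher_ed"),
    (4, lambda s: s["bacc"] and s["major"] == "business"),
    (5, lambda s: s["bacc"]),
    (6, lambda s: s["level"] == "undergrad" and s["veteran"]),
    (7, lambda s: s["level"] == "undergrad"),
    # ...graduate strata 8-17 continue in the order listed above
]

def assign_stratum(student):
    """Return the first stratum whose condition the student satisfies."""
    for number, condition in STRATA:
        if condition(student):
            return number
    raise ValueError("student matched no stratum")

print(assign_stratum({"bacc": True, "veteran": True,
                      "major": "STEM", "level": "undergrad"}))  # -> 1
```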

Several student subgroups will be intentionally sampled at rates different from their natural occurrence in the population to meet specific analytic objectives. The following groups will be oversampled:

  • Baccalaureate recipients who are veterans

  • Baccalaureate recipients from STEM programs

  • Baccalaureate recipients from teacher education programs

  • Other undergraduate students who are veterans

  • Graduate students who are veterans

  • First-time graduate students

  • Master’s degree students in STEM programs

  • Doctoral-research/scholarship/other students in STEM programs

  • Students and baccalaureate recipients in public 4-year non-doctorate-granting institutions that are primarily sub-baccalaureate

  • Undergraduate students at all award levels enrolled in for-profit institutions

  • Master’s degree students enrolled in for-profit institutions

Similarly, we anticipate the following groups will be undersampled:

  • Baccalaureate recipients from business programs

  • Master’s degree students in education and business programs

  • Doctoral-research/scholarship/other students in education and business programs

Because of their sheer numbers, sampling these last three groups in proportion to the population would make it difficult to draw inferences about the experiences of other baccalaureate recipients, master’s degree students, and doctoral students, respectively.

For baccalaureate recipients from teacher education programs, we will identify students using the CIP codes that specify particular teaching programs. We will also use Schools and Staffing Survey (SASS) data to determine other majors that are likely to produce large numbers of students going into teaching at the secondary school level.

To identify and sample veterans, we plan to match the student enrollment lists we receive against a U.S. Department of Veterans Affairs (VA) list of veterans who applied for Veterans Benefits Administration (VBA) benefits. Only veteran estimates at the baccalaureate, undergraduate, and graduate levels, not by sector, are of interest.

We also considered oversampling students in professional science master’s programs, an emerging field of interest for policy reasons. Because we think some institutions would have difficulty accurately identifying these students, we will not pursue this oversampling; it could, however, be tested in the NPSAS:20 field test if the trend continues. By oversampling master’s degree students enrolled in for-profit programs, we will nonetheless capture many students in professional science master’s programs.

Substantial differences in federal loan rates were observed in NPSAS:12 between the full-sample estimates and the poststratified estimates. Such differences increase weight variation and, more importantly, could bias the weighted estimates, depending on the reason for the discrepancy. A possible source of error is the limited ability of the sample design and sampling weights to account for financial aid application, receipt, or amounts. While we have identified some potential changes to the poststratification to help resolve this issue for NPSAS:16, we also plan to match the student lists to National Student Loan Data System (NSLDS) data and use the financial aid data for implicit stratification of students. Within the explicit student strata, we will sort students by federally aided/unaided status, which will allow the sample proportions of aided and unaided students to approximately match the population within institution and student strata.

Sample Sizes and Student Sampling

Based on experience, we expect to obtain, at minimum, 95 percent eligibility rates and 70 percent student interview response rates, overall and in each sector. We also will continue to employ a variable-based (rather than source-based) definition of study member in the full-scale study, similar to that used in NPSAS:12 and NPSAS:08. Specifically, a study member will be defined as any sample member who is determined to be eligible for the study and, at minimum, has valid data from any source7 for the following:

  • Student type (undergraduate or graduate);

  • Date of birth or age;

  • Gender; and

  • At least 8 of the following 15 variables: dependency status, income, tuition, marital status, class level, student budget, race, degree program, expected family contribution, parent education, baccalaureate status, receipt of federal aid, any dependents, months enrolled, and receipt of nonfederal aid.

We expect the rate of study membership to be about 90 percent.
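
The study member definition above can be expressed as a simple check. The sketch below is illustrative, with hypothetical variable names; a value of None stands for "no valid value from any source."

```python
KEY_VARS = ["student_type", "date_of_birth_or_age", "gender"]
ADDITIONAL_VARS = [
    "dependency_status", "income", "tuition", "marital_status",
    "class_level", "student_budget", "race", "degree_program",
    "expected_family_contribution", "parent_education",
    "baccalaureate_status", "federal_aid", "any_dependents",
    "months_enrolled", "nonfederal_aid",
]

def is_study_member(data, eligible=True):
    """Eligible, all three key variables valid, and at least 8 of the
    15 additional variables valid (from any source)."""
    if not eligible or any(data.get(v) is None for v in KEY_VARS):
        return False
    return sum(data.get(v) is not None for v in ADDITIONAL_VARS) >= 8

record = {"student_type": "undergraduate", "date_of_birth_or_age": 19,
          "gender": "female", **{v: 1 for v in ADDITIONAL_VARS[:8]}}
print(is_study_member(record))  # -> True
```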

Based on the expected interview response and eligibility rates, student sample sizes and expected yields are presented in table 3. NPSAS:16 is designed to sample about 126,300 students; table 3 does not reflect sample size adjustments for false positives and false negatives. Students will be sampled on a flow basis as student lists are received, using stratified systematic sampling procedures. Within the graduate student strata for veterans and first-time graduate students, students will be sorted by master’s and doctoral level to ensure that the sample is roughly proportional to the frame. As mentioned above, all strata will be sorted (implicitly stratified) by federally aided/unaided status to maintain proportionality between the sample and the frame. Sample yield will be monitored by institutional and student sampling strata, and the sampling rates will be adjusted early, if necessary, to achieve the desired yields.

Table 3. Student sample sizes and yields, NPSAS:16 full-scale

| Institutional sector | Sample: total | Sample: bacc. | Sample: other UG | Sample: grad. | Eligible: total | Eligible: bacc. | Eligible: other UG | Eligible: grad. | Responding: total | Responding: bacc. | Responding: other UG | Responding: grad. | Responding students per responding institution |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Total | 126,316 | 37,594 | 67,669 | 21,053 | 120,000 | 35,714 | 64,286 | 20,000 | 84,000 | 25,000 | 45,000 | 14,000 | 50 |
| Public less-than-2-year | 695 | 0 | 695 | 0 | 618 | 0 | 618 | 0 | 382 | 0 | 382 | 0 | 20 |
| Public 2-year | 21,784 | 0 | 21,784 | 0 | 19,985 | 0 | 19,985 | 0 | 13,335 | 0 | 13,335 | 0 | 40 |
| Public 4-year non-doctorate-granting primarily sub-baccalaureate | 5,755 | 2,618 | 3,055 | 81 | 5,528 | 2,490 | 2,960 | 77 | 4,145 | 1,811 | 2,278 | 56 | 66 |
| Public 4-year non-doctorate-granting primarily baccalaureate | 7,061 | 2,618 | 2,746 | 1,697 | 6,764 | 2,490 | 2,660 | 1,614 | 5,027 | 1,811 | 2,047 | 1,169 | 58 |
| Public 4-year doctorate-granting | 25,977 | 9,742 | 11,120 | 5,115 | 24,726 | 9,199 | 10,697 | 4,830 | 18,976 | 6,892 | 8,480 | 3,604 | 62 |
| Private nonprofit less-than-4-year | 889 | 0 | 889 | 0 | 853 | 0 | 853 | 0 | 543 | 0 | 543 | 0 | 36 |
| Private nonprofit 4-year non-doctorate-granting | 12,038 | 4,997 | 4,213 | 2,828 | 11,441 | 4,719 | 4,053 | 2,670 | 8,651 | 3,499 | 3,180 | 1,972 | 31 |
| Private nonprofit 4-year doctorate-granting | 14,010 | 5,565 | 3,496 | 4,950 | 13,387 | 5,293 | 3,387 | 4,707 | 10,047 | 3,920 | 2,654 | 3,472 | 45 |
| Private for-profit less-than-2-year | 3,440 | 0 | 3,440 | 0 | 3,268 | 0 | 3,268 | 0 | 1,843 | 0 | 1,843 | 0 | 38 |
| Private for-profit 2-year | 7,104 | 0 | 7,104 | 0 | 6,918 | 0 | 6,918 | 0 | 4,490 | 0 | 4,490 | 0 | 48 |
| Private for-profit 4-year | 27,563 | 12,053 | 9,127 | 6,382 | 26,511 | 11,522 | 8,887 | 6,101 | 16,562 | 7,067 | 5,768 | 3,727 | 76 |

NOTE: Bacc. = baccalaureate recipients; other UG = other undergraduate students; grad. = graduate students. Detail may not sum to totals because of rounding.

Quality Control Checks for Lists and Sampling

The number of enrollees on each institution’s student list will be checked against the latest IPEDS full-year enrollment and completions data, with comparisons made at each student level: baccalaureate, undergraduate, and graduate. Based on past experience, only counts within 50 percent of the non-imputed IPEDS counts will pass quality control (QC) and move on to student sampling. Institutions that fail QC will be re-contacted to resolve the discrepancy and to verify that the institution coordinator who prepared the student list clearly understood our request and listed the appropriate students. When we determine that the initial list provided by an institution is not satisfactory, we will request a replacement list. We will proceed with selecting sample students once we have either confirmed that the list received is correct or received a corrected list.
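
As an illustration of the count check (the level names and counts below are hypothetical), a list passes QC only if its count is within 50 percent of the corresponding non-imputed IPEDS count at each student level:

```python
def passes_qc(list_count, ipeds_count, tolerance=0.50):
    """True if the enrollment-list count is within the given
    proportional tolerance of the (non-imputed) IPEDS count."""
    if ipeds_count <= 0:
        return False  # no usable benchmark; route to manual review
    return abs(list_count - ipeds_count) / ipeds_count <= tolerance

# Compare at each student level; any failure triggers re-contact.
levels = {"baccalaureate": (410, 380), "undergraduate": (5200, 4100),
          "graduate": (900, 2100)}
failures = [lvl for lvl, (lst, ipeds) in levels.items()
            if not passes_qc(lst, ipeds)]
print(failures)  # -> ['graduate']
```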

QC is critical for sampling and all other statistical activities, and all statistical procedures will undergo thorough QC checks. We have technical operating procedures (TOPs) in place for sampling and general programming; these TOPs describe how to properly implement statistical procedures and QC checks. All statisticians will use a checklist to ensure that the appropriate QC checks are completed for student sampling.

Some specific sampling QC checks include, but are not limited to, checking that the:

  • Institutions and students on the sampling frames all have a known, non-zero probability of selection;

  • Distribution of implicit stratification for institutions is reasonable; and

  • Number of institutions and students selected match the target sample sizes.

    1.3 Methods for Maximizing Response Rates

Response rates in the NPSAS:16 full-scale study are a function of success in two basic activities: identifying and locating the sample members, then contacting them and gaining their cooperation. Two classes of respondents are involved: institutions and the students enrolled in those institutions. Institutions will be asked to provide data from institutional records for sampled students. In this section, we describe our plans for maximizing response to the request for data from institutional records, followed by our plans for maximizing response to the student survey.

The data collection contractor for this effort will be RTI International (RTI). RTI has worked with postsecondary institutions on multiple studies on behalf of the Department and has experience both in developing rapport with data providers at postsecondary institutions and in converting student nonrespondents via telephone or web interviews.

      1.3.1 Collection of Data from Institutional Records

Our plans for contacting and communicating with institutions, beginning with the process of list acquisition, are designed to secure the cooperation of as many institutions as possible and to establish rapport with institutional staff. This process includes sending the chief administrator of each institution a package of descriptive materials about the study, following up by telephone to obtain the chief administrator’s consent and cooperation, and asking the chief administrator to designate an Institutional Coordinator (IC) to serve as our primary point of contact. All contacting materials are provided in appendix D.

All institution coordinators receive information that informs them about the purposes of NPSAS, describes their tasks, and assures them of our commitment to maintaining the confidentiality of data. Written materials will be provided to coordinators explaining each phase of the study, as well as their role in each phase. Training of institution coordinators is geared toward the method of data collection selected by the institution (see below). The system used for collecting institutional record data is accessible only with an ID and password. It provides institution coordinators with instructions for all phases of study participation. Copies of all written materials, as well as answers to frequently asked questions, are available on the website. Experienced NPSAS interview staff carry out these contacts and are assigned to specific institutions, which remain their responsibility throughout the data collection process. This allows NPSAS staff members to establish rapport with the institution’s staff and provides those individuals with a consistent point of contact. Staff members are thoroughly trained in basic financial aid concepts and in the purposes and requirements of the study, which helps them establish credibility with the institution staff.

As an additional means of maximizing institutional participation, we have secured endorsements from 25 professional associations for NPSAS:16 (see appendix B).

As in prior NPSAS studies, NPSAS staff will offer several options for providing the Student Records for sampled students and will invite the coordinator to select the method that is least burdensome and most convenient for the institution. The optional methods for providing student record data are:

  • Student Records obtained via a web-based data entry interface. The web-based data entry interface displays one student at a time, and the coordinator may enter data in a top-to-bottom fashion before moving on to the next student.

  • Student Records obtained by completing an Excel workbook. An Excel workbook will be created for each institution and will be preloaded with the sampled students’ ID, name, and SSN (if available). To facilitate simultaneous data entry by different offices within the institution, the workbook contains a separate worksheet for each topic area. The user will download the Excel worksheet from the secure NPSAS institution website, enter the data, and then upload the data to the website. Validation checks occur both within Excel as data are entered and when the data are uploaded via the website.

  • Student Records obtained by uploading CSV (comma-separated values) files. Institutions with the means to export data from their internal database systems to a flat file may opt for this method of supplying Student Records; over the last several NPSAS studies, the number of institutions providing data files has increased. Institutions that select this method will be provided with detailed import specifications, and all data uploading will occur through the project’s secure website (a minimal sketch of the kind of validation such specifications imply appears after this list).
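
The sketch below illustrates the kind of validation that import specifications imply; the field names and rules are hypothetical, not the actual NPSAS:16 import specification.

```python
import csv
import io

# Hypothetical subset of an import specification: field name,
# whether it is required, and a validation function.
SPEC = {
    "student_id": (True, str.isdigit),
    "last_name": (True, lambda v: len(v) > 0),
    "ssn": (False, lambda v: len(v) == 9 and v.isdigit()),
}

def validate_csv(text):
    """Return a list of (row_number, field, value) validation errors."""
    errors = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        for field, (required, check) in SPEC.items():
            value = (row.get(field) or "").strip()
            if required and not value:
                errors.append((i, field, "missing"))
            elif value and not check(value):
                errors.append((i, field, value))
    return errors

sample = "student_id,last_name,ssn\n1001,Smith,123456789\nX2,Jones,\n"
print(validate_csv(sample))  # -> [(2, 'student_id', 'X2')]
```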

Institution coordinators who elect to use the web-based data entry interface will receive detailed instructions for accessing and using the site. The instrument content is provided in appendix E.

Prior to data collection, student records are matched to the U.S. Department of Education Central Processing System (CPS)—which contains data on federal financial aid applications—for locating purposes and to reduce the burden on the institutions for the student record abstractions. The vast majority of the federal aid applicants (about 95 percent) will match successfully to the CPS prior to Student Records data collection. During data collection, institutions will be asked to provide the student’s last name and Social Security number for the small number of federal aid applicants who did not match to the CPS on the first attempt. After Student Records data collection ends, we will submit the new names and Social Security numbers to CPS for file matching. Any new data obtained for the additional students will be delivered on the Electronic Code Book (ECB) with the data obtained prior to Student Records data collection.

      1.3.2 Student Survey: Self-Administered Web and CATI

The following sections outline methods for maximizing response to the NPSAS:16 student survey.

  1. Tracing of Sample Members

To achieve the desired response rate, we propose an integrated tracing approach designed to yield the maximum number of locates at the least expense. During the field test, we evaluated the effectiveness of these procedures for the full-scale effort. The steps of our tracing plan include the following elements:

  • Advance Tracing. The advance tracing stage includes tracing steps taken prior to the start of data collection, including batch database searches and advance intensive tracing (if necessary). Not all schools will be able to give complete or up-to-date locating information on each student, and some cases will require more advanced tracing before mailings can be sent or the cases can be worked in CATI. To handle cases for which the mailing address, phone number, or other contact information is invalid or unavailable, NPSAS staff plan to conduct advance tracing prior to lead letter mailout and data collection. As lead information is found, additional searches will be conducted through interactive databases to expand on the leads.

  • Data Collection Mailings. Data collection mailings and e-mails will be used to maintain persistent contact with sample members as needed throughout data collection. The initial letter will include a toll-free number, study website address, and study ID and password, and will request that sample members complete the web survey. Two days after the lead letter mailing, an email message mirroring the letter will also be sent to sample members.

  • Telephone Locating and Interviewing. The telephone locating and interviewing stage includes calling all available telephone numbers and following up on leads provided by parents and other contacts.

  • Pre-Intensive Batch Tracing. The pre-intensive batch tracing stage consists of the LexisNexis SSN and Premium Phone batch searches that will be conducted between the telephone locating and interviewing stage and the intensive tracing stage.

  • Intensive Tracing. The intensive tracing stage consists of tracers conducting database searches after all current telephone numbers have been exhausted. In NPSAS:12, about 71 percent of sample members requiring intensive tracing were located, and about 29 percent of those located responded to the interview. Intensive interactive tracing differs from batch tracing in that a tracer can assess each case individually to determine which resources are most appropriate and the order in which they should be used. It is also much more detailed because of the personal review of information. During interactive tracing, tracers use all previously obtained contact information to make tracing decisions about each case. These intensive interactive searches are completed using a special program that works with RTI’s case management system (CMS) to provide organization and efficiency in the intensive tracing process. Sources that may be used, as appropriate, include credit database searches (such as Experian), various public websites, and other integrated database services.

  • Other Locating Options. Other locating activities will take place as needed, including a LexisNexis e-mail search conducted toward the end of data collection for nonrespondents.


  2. Training for Data Collection Staff

Telephone data collection will be conducted at the contractor’s Research Operations Center (ROC). NPSAS staff at the ROC will include Quality Control Supervisors (QCSs) and Data Collection Interviewers (DCIs). Training programs for these staff members are critical to maximizing response rates and collecting accurate and reliable data.

Quality control supervisors, who are responsible for all supervisory tasks, will attend project-specific training for QCSs, in addition to the content of interviewer training. They will receive an overview of the study, background and objectives, and the data collection instrument through a question-by-question review. Supervisors will also receive training in the following areas: providing direct supervision during data collection; handling refusals; monitoring interviews and maintaining records of monitoring results; problem resolution; case review; specific project procedures and protocols; reviewing CATI reports; and monitoring data collection progress.

Training for DCIs is designed to help staff become familiar with and practice using the Computer-Assisted Telephone Interviewing Case Management System (CATI-CMS) and survey instrument, as well as to learn project procedures and requirements. Particular attention will be paid to quality control initiatives, including refusal avoidance and methods to ensure that quality data are collected. DCIs will receive project-specific training on telephone interviewing and on answering questions from web participants about the study or specific items within the interview. At the conclusion of training, all NPSAS ROC staff must meet certification requirements by successfully completing a certification interview. This evaluation consists of a full-length interview, with project staff observing and evaluating interviewers, as well as an oral evaluation of interviewers’ knowledge of the study’s Frequently Asked Questions.

  3. Case Management System

Student interviews will be conducted using a single web-based survey instrument for both web and CATI data collection. The data collection activities will be accomplished through the CATI-CMS, which is equipped with numerous capabilities, including: on-line access to locating information and histories of locating efforts for each case; state-of-the-art questionnaire administration module with full “front-end cleaning” capabilities (i.e., editing as information is obtained from respondents); sample management module for tracking case progress and status; and automated scheduling module which delivers cases to interviewers. The automated scheduling module incorporates the following features:

  • Automatic delivery of appointment and call-back cases at specified times. This reduces the need for tracking appointments and helps ensure the interviewer is punctual. The scheduler automatically calculates the delivery time of the case in reference to the appropriate time zone.

  • Sorting of non-appointment cases according to parameters and priorities set by project staff. For instance, priorities may be set to give first preference to cases within certain sub-samples or geographic areas; cases may be sorted to establish priorities between cases of differing status. Furthermore, the historic pattern of calling outcomes may be used to set priorities (e.g., cases with more than a certain number of unsuccessful attempts during a given time of day may be passed over until the next time period). These parameters ensure that cases are delivered to interviewers in a consistent manner according to specified project priorities.

  • Restriction on allowable interviewers. Groups of cases (or individual cases) may be designated for delivery to specific interviewers or groups of interviewers. This feature is most commonly used in filtering refusal cases, locating problems, or foreign language cases to specific interviewers with specialized skills.

  • Complete records of calls and tracking of all previous outcomes. The scheduler tracks all outcomes for each case, labeling each with type, date, and time. These are easily accessed by the interviewer upon entering the individual case, along with interviewer notes.

  • Flagging of problem cases for supervisor action or supervisor review. For example, refusal cases may be routed to supervisors for decisions about whether and when a refusal letter should be mailed, or whether another interviewer should be assigned.

  • Complete reporting capabilities. These include default reports on the aggregate status of cases and custom report generation capabilities.

The integration of these capabilities reduces the number of discrete stages required in data collection and data preparation activities and increases capabilities for immediate error reconciliation, which results in better data quality and reduced cost. Overall, the scheduler provides a highly efficient case assignment and delivery function by reducing supervisory and clerical time, improving execution on the part of interviewers and supervisors by automatically monitoring appointments and call-backs, and reducing variation in implementing survey priorities and objectives.
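
The appointment-delivery feature can be sketched as follows; the actual CATI-CMS is a proprietary system, so the class and method names here are hypothetical.

```python
from datetime import datetime, timedelta
import heapq

class AppointmentScheduler:
    """Delivers appointment cases at the scheduled moment, converting
    each sample member's local appointment time to UTC."""
    def __init__(self):
        self._queue = []  # min-heap of (utc_delivery_time, case_id)

    def add_appointment(self, case_id, local_time, utc_offset_hours):
        # local = UTC + offset, so UTC = local - offset
        utc_time = local_time - timedelta(hours=utc_offset_hours)
        heapq.heappush(self._queue, (utc_time, case_id))

    def due_cases(self, now_utc):
        """Pop and return every case whose delivery time has arrived."""
        due = []
        while self._queue and self._queue[0][0] <= now_utc:
            due.append(heapq.heappop(self._queue)[1])
        return due

sched = AppointmentScheduler()
# 6:00 p.m. local appointment in a UTC-5 time zone is 11:00 p.m. UTC.
sched.add_appointment("case-042", datetime(2016, 3, 1, 18, 0), -5)
print(sched.due_cases(datetime(2016, 3, 1, 23, 0)))  # -> ['case-042']
```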

  4. Survey Instrument Design

To prepare the student records instrument, NCES convened a technical review panel in January 2014 to discuss the challenges of responding to the NPSAS list and student records data collection requests, and approaches that might facilitate the process. In June 2014, NCES received approval to conduct focus groups with institution staff who had participated in past NPSAS student records data collections. This qualitative evaluation informed refinement of the items used in the student records instrument for which clearance is requested, as well as system functionality.

Student interview preparation has involved two meetings of the NPSAS technical review panel. The June 2014 meeting focused specifically on the design and content of the graduate student portion of the survey, while the August 2014 meeting covered the survey more broadly, across all topics and student levels. Since the August meeting, cognitive testing with graduate students has helped to clarify concepts to be used in the student survey and has highlighted differences across the various levels and degrees of graduate students. Following the field test data collection, additional cognitive and usability testing was conducted with the programmed, mobile-friendly instrument.

The NPSAS:16 instruments employ a web-based instrument and deployment system, created by RTI, known as Hatteras, which has been in use since NPSAS:08. Hatteras is a flexible system that provides multimode functionality: the survey instrument is created once and can be used for self-administration (including on mobile devices), CATI, CAPI, or data entry. The instrument is provided in appendix F.

In addition to the functional capabilities of the CMS and web instruments described above, our efforts to achieve the desired response rate will include using established procedures proven effective in other large-scale studies we have completed. These include:

  • Providing multiple response modes, including mobile-friendly self-administered and interviewer-administered options.

  • Offering incentives to encourage response (see incentive structure described below).

  • Assigning experienced CATI data collectors who have proven their ability to contact and obtain cooperation from a high proportion of sample members.

  • Training the interviewers thoroughly on study objectives, study population characteristics, and approaches that will help gain cooperation from sample members.

  • Maintaining a high level of monitoring and direct supervision so that interviewers who are experiencing low cooperation rates are identified quickly and corrective action is taken.

  • Making every reasonable effort to obtain an interview at the initial contact, while allowing respondents flexibility in scheduling interview appointments.

  • Thoroughly reviewing all refusal cases and making special conversion efforts whenever feasible (see next section).


  5. Refusal Aversion and Conversion

Recognizing and avoiding refusals is important to maximizing the response rate. We will emphasize this and other topics related to obtaining cooperation during data collector training. Supervisors will monitor interviewers closely during the early days of outbound calling and provide retraining as necessary. In addition, supervisors will review daily interviewer production reports produced by the CATI system to identify and retrain any data collectors who are producing unacceptable numbers of refusals or other problems.

Refusal conversion efforts will be delayed for at least one week after the initial refusal to give the sample member time. Attempts at refusal conversion will not be made with individuals who become verbally aggressive or who threaten to take legal or other action, and conversion efforts will never be conducted to a degree that would constitute harassment. We will respect a sample member’s right to decide not to participate and will not infringe on that right by carrying conversion efforts beyond the bounds of propriety.

    1.4 Tests of Procedures and Methods

NCES’s goal for the full-scale NPSAS:16 study is to reduce total error compared to NPSAS:12 so that informed decisions may be made given the resources provided. The NPSAS:16 field test addressed some, although not all, of the initiatives that will be implemented in the full-scale study. The following sections provide a brief synopsis of field test experiments and findings.

      1.4.1 Field Test Results: Evaluation of Burden and Motivation to Participate

A considerable challenge in any NPSAS data collection is convincing sample members to start a 30-minute survey. Motivated by the “foot in the door” approach (Freedman and Fraser, 1966), in which a small request is followed by a larger one, we divided the field test survey instrument into two modules of approximately 10 to 15 minutes each to determine whether breaking the survey into smaller tasks increased the likelihood that sample members would participate. Sample members were randomly assigned to one of three conditions:

  1. Treatment Group 1: Sample members were asked to complete a 10- to 15-minute survey as the first module for an initial incentive offer of $15. They were then offered the option to continue with the second module, the latter half of the survey, also 10 to 15 minutes in duration, to receive another $15 (for a total of $30 for the survey).

  2. Treatment Group 2: Sample members were asked to complete the same two 10- to 15-minute modules as Treatment Group 1, but were offered an initial incentive of $20, followed by an offer of $10 to complete the second module.

  3. Control Group: Sample members were asked to complete the usual 30-minute survey, as in prior NPSAS data collections, to receive a $30 incentive for the completed survey.

The modules included questions designed specifically to assess measurement error due to recall, fatigue, lack of motivation, and so on. Module 1 included questions to obtain information also available from administrative sources to evaluate accuracy (N16CFEDAMT). Some of those same items were repeated in Module 2 and, in addition, Module 2 included questions on fictitious issues (N16ASNOW and N16SPNNOW) and some questions with reversed wording (N16ACDSATIS and N16SATISACD). The control group and two module groups were evaluated on participation and response rates, breakoff, and timing. Results will be available in the field test report to be included in the full-scale data file documentation.

When comparing the control group to the combined module groups, there were no differences in participation rates (62.9 percent for the control group versus 61.4 percent for the combined module groups; χ2(1, N = 4,536) = 0.99, p = .32) or response rates (62.5 percent versus 61.1 percent; χ2(1, N = 4,497) = 0.92, p = .34). There was also no significant difference between the control and module groups in breakoff rates (χ2(1, N = 2,769) = 2.45, p = .118). While time spent in the first half of the interview differed statistically (12.51 minutes for the control group versus 12.91 minutes for the module groups; t = -1.95, p = .05), the actual difference was only 24 seconds and favored the control condition. The module approach has been dropped for the full-scale survey.
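
For readers who wish to verify comparisons of this kind, the sketch below approximately reproduces the participation-rate test. The cell counts are hypothetical, back-calculated from the reported rates under an assumed 1:2 split of the 4,536 cases between the control and the two module groups, so the statistic only approximates the reported χ2 of 0.99.

```python
from scipy.stats import chi2_contingency

# Rows are control/modules; columns are participated/did not.
table = [[951, 561],     # 951/1,512 = 62.9 percent participated
         [1857, 1167]]   # 1,857/3,024 = 61.4 percent participated
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(round(chi2, 2), round(p, 2))  # -> 0.95 0.33, near the reported 0.99
```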

      1.4.2 Field Test Results: Experiment #2 – Questionnaire Design

Another potential source of measurement error is the survey instrument itself. Given the mobile survey option launched for NPSAS:16, we introduced changes to the survey instrument design to facilitate completion on a smaller screen size with mobile phone navigation. Two types of question designs were restructured and evaluated across devices for three questionnaire items.

Questionnaire Design 1

Design Group 1: Rather than presenting the question on high school coursework as a grid, which asks and obtains responses to several questions on a single screen, a series of three yes/no questions was asked in succession.

Control Group 1: The alternative design was the standard presentation in which questions about high school courses were asked in a single grid.

This formatting experiment resulted in no missing data for the items in the experimental version, compared with item-level missingness ranging from 2 to 6 percent in the control-group grid format. Specific rates of missingness for the grid items were 2 percent for AP courses (χ2(1, N = 2,210) = 17.41, p < .001), 3 percent for college-level courses (χ2(1, N = 2,210) = 31.96, p < .001), and 6 percent for IB courses (χ2(1, N = 2,210) = 62.69, p < .001), all significantly higher than the 0 percent missingness for the experimental version of the items. The experimental version (M = 22.04 seconds, SD = 6.17) took slightly longer to administer than the control-group grid format (M = 18.14 seconds, SD = 7.24; t(2,210) = 13.62, p < .01). Based on these results, the design tested with group 1, in which the questions are presented as a series of yes/no items rather than a grid, will be used in the full-scale survey.

Questionnaire Design 2

Design Group 2: Rather than asking about parents’ education as a single question with several response options, which may not display properly on the screen of a mobile device, we used a branching design in which a general question with limited response options was asked first, followed by a series of more specific branching questions to obtain the details.

Control Group 2: The alternative design was the standard presentation in which the question of parents’ education was asked as a single question.

Ninety-three percent of respondents who received the experimental version of the items self-selected “mother” and “father” as the guardians on whom to report education. When comparing responses of those who selected “mother” in the experimental version to responses to the control-group “mother” item, there were no significant differences in the levels of education selected. However, the levels of education reported in the experimental version by those who self-selected “father” differed statistically from the education levels reported for the control-group “father” item (χ2(10, N = 2,410) = 42.15, p < .001), with more respondents selecting “don’t know” on the control item. The experimental version took significantly longer to complete (M = 40.63 seconds, SD = 11.24) than the original items (M = 18.42 seconds, SD = 6.32; t(2,487) = -61.21, p < .001). A combination of the two designs will be used in the full-scale survey: respondents will choose both a parent to reference and that parent’s highest level of education on the same screen, a design that is both flexible and efficient.

Questionnaire Design 3

Design Group 3: Among students who reported having studied abroad, a follow-up question asked respondents to select from a dropdown list all countries in which they had studied abroad.

Control Group 3: The alternative design presented respondents with a text box in which to list all countries in which they have studied abroad.

The field test data revealed that some respondents did not provide an appropriate answer in the text box of the control-group item: approximately 10 percent of respondents receiving that version provided multiple countries, cities, or unusable responses. For the experimental dropdown version, 23 percent of respondents selected the “other” continent option or left the item missing. The text box item took respondents significantly less time to complete (M = 8.74 seconds, SD = 3.40) than the dropdown item (M = 16.10 seconds, SD = 10.49; t(225) = 6.90, p < .001). A combination of the two designs will be used in the full-scale survey, allowing respondents to enter a text string and using a predictive search to return a list of matches based on the text entered.
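
A minimal sketch of such predictive searching, with a deliberately truncated country list, might look like the following; the matching rules shown are illustrative, not the instrument’s actual implementation.

```python
COUNTRIES = ["France", "Gabon", "Gambia", "Georgia", "Germany",
             "Ghana", "Greece", "Grenada", "Guatemala"]  # truncated

def suggest(prefix, limit=5):
    """Return up to `limit` countries matching the typed text,
    case-insensitively, listing prefix matches before substrings."""
    p = prefix.strip().lower()
    starts = [c for c in COUNTRIES if c.lower().startswith(p)]
    contains = [c for c in COUNTRIES
                if p in c.lower() and c not in starts]
    return (starts + contains)[:limit]

print(suggest("ge"))  # -> ['Georgia', 'Germany']
```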

      1.4.3 Full-Scale Design

NPSAS data collection is complex, in part because sample students are selected on a flow basis over a 6-month period. Historically, the NPSAS interview response rate is approximately 70 percent overall, and the study member rate is about 90 percent overall, although rates are lower in some sectors, especially the for-profit sector. NPSAS staff will implement a two-pronged responsive design approach for NPSAS:16. First, we will focus on increasing the number of study members, especially in sectors with lower rates of study membership. For a respondent to be considered a study member, data for three key variables and for at least 8 of 15 additional variables must be collected from the student interview and/or student records, in addition to any administrative data. We will identify students with almost all of the data necessary for study membership, except for a small number of variables, and direct data collection resources toward obtaining the missing data. Second, we will focus on improving data for study members for whom we have student records data but not an interview, by identifying and targeting students for whom imputation may not work well. In the later stages of data collection, we will use multiple imputation to impute key variables available mainly from the interview; students with the greatest variation evident in their imputed values will be targeted in the time remaining in data collection.

Step 1: Increasing the Number of Study Members

During the NPSAS:16 data collection, we will track collection of the specific variables needed for study membership (i.e., the three key variables and at least 8 of 15 additional variables) through the sources from which those variables are typically obtained, and we will pursue interventions that improve study membership rates. For student records, that will include asking nonresponding institutions to provide an abbreviated student records file containing only the variables needed for study membership. Likewise, for student interviews, if a sample member lacks only a few variables needed for study membership, we will offer a short survey containing just those items needed to qualify as a study member and for the case to be included on the analysis file.

Step 2: Improving Data for Study Members

To identify interview nonrespondents who are likely to have variability in their imputed data, we will impute key variables, available mainly from the interview, using multiple imputation procedures. Imputation will be conducted later in data collection and separately by sector, because sample members in some sectors enter data collection earlier than others and have different characteristics and financial aid patterns. Multiple imputation will be performed once or twice per sector. Individual sample members with high variation on key variables (threshold to be determined) will be targeted during the last month or two of data collection with an abbreviated interview containing the key items used in the multiple imputation. Items for the abbreviated interview have been identified in the main survey in appendix F. To determine these items, we first created a list of analytically important variables and then performed multiple imputation on a subset of the variables using NPSAS:08 data (the last NPSAS to spin off a B&B cohort). Nine of the 14 variables had comparatively large relative standard errors (RSEs), all greater than 25 percent. The abbreviated interview will include these 9 variables; 6 analytically important variables that are new for NPSAS:16; related variables for context; and variables for eligibility determination and incentive/locating information.
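
The targeting rule can be sketched independently of the particular imputation routine: given m completed datasets from any multiple imputation procedure, compute each nonrespondent’s relative variation across the m imputed values and flag cases exceeding a threshold. Everything below, including the threshold, is illustrative.

```python
import numpy as np

def flag_high_variation(imputations, threshold=0.25):
    """imputations: array of shape (m, n_cases), one imputed value of a
    key variable per nonrespondent per completed dataset. Flags cases
    whose relative variation (SD across the m imputations divided by
    the absolute mean) exceeds the threshold."""
    imp = np.asarray(imputations, dtype=float)
    means = imp.mean(axis=0)
    sds = imp.std(axis=0, ddof=1)
    rse = np.divide(sds, np.abs(means),
                    out=np.zeros_like(sds), where=np.abs(means) > 0)
    return np.flatnonzero(rse > threshold)

# Five imputations of, say, a loan amount for four nonrespondents;
# case 2's imputations disagree wildly, so it would be targeted for
# the abbreviated interview.
m = np.array([[5000, 0, 12000, 3000],
              [5200, 0,  2000, 3100],
              [4900, 0, 21000, 2900],
              [5100, 0,  7000, 3050],
              [5050, 0, 15000, 2950]])
print(flag_high_variation(m))  # -> [2]
```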

In addition to the abbreviated interview, contacts at this stage will draw on interventions that have been used successfully in prior NCES data collections, including additional or special contacts, such as e-mails sent from the NCES project officer or mailings sent via Priority Mail, FedEx, or UPS. As in the field test, sample members may choose between incentives paid by check or through PayPal.

Two evaluations are planned to measure how using multiple imputation to identify and target interview nonrespondents affects the key variables. First, we will perform a comparable multiple imputation at the end of data collection to determine the change in variation in the key variables before and after use of the abbreviated interview. Second, we will compute estimates and standard errors of the key variables using the data collected with the abbreviated interview, and again using imputed data in place of the abbreviated-interview data. While this second evaluation will not show which data are better, it will show how similar or different they are.

    1.5 Reviewing Statisticians and Individuals Responsible for Designing and Conducting the Study

The study is being conducted by the National Center for Education Statistics (NCES), U.S. Department of Education. The following statisticians at NCES are responsible for the statistical aspects of the study: Dr. Tracy Hunt-White, Dr. David Richards, Dr. Sean Simone, Mr. Ted Socha, and Dr. Elise Christopher. NCES’s prime contractor for NPSAS:16 is RTI. The following staff members at RTI are working on the statistical aspects of the study design: Dr. Jennifer Wine, Dr. Natasha Janson, Mr. Peter Siegel, Dr. David Wilson, Dr. T. Austin Lacy, Dr. Emilia Peytcheva, Mr. David Radwin, and Dr. Jennie Woo.

Subcontractors include Coffey Consulting; Hermes; HR Directions; Kforce Government Solutions, Inc.; Research Support Services; Shugoll Research; and Strategic Communications, Inc. Consultants are Dr. Sandy Baum, Dr. Stephen Porter, and Ms. Alisa Cunningham. Principal professional RTI staff, not listed above, who are assigned to the study include Mr. Jeff Franklin, Ms. Christine Rasmussen, Ms. Kristin Dudley, Ms. Jamie Wescott, and Ms. Tiffany Mattox.



  References

Freedman, J.L., and Fraser, S.C. (1966). Compliance Without Pressure: The Foot-in-the-Door Technique. Journal of Personality and Social Psychology, 4, 196-202.

Folsom, R.E., Potter, F.J., and Williams, S.R. (1987). Notes on a Composite Size Measure for Self-Weighting Samples in Multiple Domains. Proceedings of the Section on Survey Research Methods of the American Statistical Association, 792-796.

1 Institutions in Puerto Rico were not eligible for NPSAS:12.

2 Twenty-three field test institutions were included in the full-scale sample. These institutions are in the new public 4-year non-doctorate-granting primarily sub-baccalaureate stratum, described in this section, and are included because this stratum was oversampled.

3 The sector numbering will need to be determined for the data files.

4 A Hispanic-serving institutions indicator is no longer available from IPEDS, so we will create an indicator following the logic that was previously used for IPEDS.

5 We will decide what, if any, collapsing is needed of the categories for the purposes of implicit stratification.

6 For sorting purposes, Alaska and Hawaii are combined with Puerto Rico in the Outlying Areas region rather than in the Far West region.

7 Sample members also must have valid data for at least one of the eighteen specified variables from at least one data source other than the Central Processing System (CPS).

