2010 National Survey of Recent College Graduates (NSRCG)

OMB: 3145-0077

Revised Section B

B. COLLECTION OF INFORMATION EMPLOYING STATISTICAL METHODS

1. Respondent Universe and Sampling Methods

The 2010 NSRCG sample is based on a two-stage sample design, as in the past. The first stage is a sample of colleges and universities offering bachelor’s and master’s degrees in SEH fields; the second stage is a sample of eligible SEH graduates from the first-stage sample of schools. To be eligible, graduates must be age 75 or younger, not institutionalized, and living in the United States, Puerto Rico, or other U.S. territories.


To be eligible for the 2010 NSRCG, a school must be located in the United States, Puerto Rico, or other U.S. territories, and must have awarded at least one bachelor’s or master’s degree in any of the SEH fields in academic year 2008 (July 1, 2007, to June 30, 2008; AY08) or academic year 2009 (July 1, 2008, to June 30, 2009; AY09). The sampling frame for the 2010 NSRCG institutional sample was constructed from the AY08-AY09 Integrated Postsecondary Education Data System (IPEDS) Completions File¹, developed by the U.S. Department of Education, National Center for Education Statistics (NCES). Institutions in the frame were classified by type of control (public, private); region (northeast, north central, southeast, west); and the percentage of minority graduates in SEH fields. These characteristics are used to sort (implicitly stratify) the institutions for sampling.


The first stage sample will be based on the 302 schools in the 2008 NSRCG, as long as each school satisfies the 2010 NSRCG eligibility criteria. While reusing the same school sample reduces costs, statistical efficiency may be lost because of frame coverage errors that have accumulated over three rounds of the survey. The frame coverage error will be investigated and assessed so that it can be corrected by means of a supplemental sample, as was done in the 2008 NSRCG. Reusing the school sample may also cause unplanned high weight variation because of changes in the school sampling frame and in the distribution of SEH graduates over time. To the extent possible, NSF plans to incorporate the changes in the distribution of graduates across schools, in terms of the number of graduates in key domains, into the second-stage sample size determination in order to minimize the weight variation that results from changes in the sampling frame.


Sampling frame evaluation based on the IPEDS Completions File indicated that 37 of the 2,027 schools in the 2008 school sampling frame were no longer eligible for the 2010 survey, while 171 schools were newly eligible for the 2010 NSRCG. The total composite size measure of the “death” (no longer eligible) institutions translates into an expected sample size of 0.34 schools, while the total composite size measure of the “birth” (newly eligible) institutions translates into an expected sample size of 1.28 schools. In addition, one of the 302 sampled schools in the 2008 NSRCG merged with its main university in AY08-AY09 and will no longer be a separate sampling unit in the 2010 school sample. As a result, if the supplemental sample is to reflect the population change, the net addition to the sample would be about 0.95 schools (≈ 1.28 birth schools − 0.34 death schools). Therefore, the 2010 school sample size remained at 302, which resulted in the selection of one new school.

The distributions of graduates represented by the responding schools are examined by key second-stage sampling variables such as degree cohort, degree level, field of major, race/ethnicity, and gender using the AY08-AY09 IPEDS Completions File. The years between surveys can change the coverage properties of the sampled institutions, since “births” (newly eligible institutions) and “deaths” (ineligible institutions) occur over time and can change the population. Therefore, the sampling universe and the 2008 sample of educational institutions are evaluated to make sure they represent the target population for the 2010 NSRCG. This evaluation includes updating each institution’s composite measure of size (CMOS) and assessing the need for a supplemental institution sample for coverage purposes.

The composite measure of size for each institution required knowledge of the population counts for the analytic domains and expected sampling rates for the domains. Domains used for the composite measure of size calculation are the following:

  • Two degree levels: bachelor’s and master’s

  • Twenty-one major field categories: chemistry, physics/astronomy, other physical sciences, mathematics/statistics, computer sciences, agricultural/food/environmental sciences, aerospace engineering, chemical engineering, civil engineering, electrical engineering, industrial engineering, mechanical engineering, other engineering, biological sciences, psychology, economics, sociology/anthropology, other social sciences, political science, and two health fields

  • Six demographic groups: non-Hispanic white male; non-Hispanic white female; non-Hispanic Asian male; non-Hispanic Asian female; minority (black, Hispanic, and American Indian/Alaska Native) male; and minority (black, Hispanic, and American Indian/Alaska Native) female

The measure of size for institution $i$, $\mathrm{MOS}_i$, is defined as

$$\mathrm{MOS}_i = \sum_{d}\sum_{k}\sum_{j} f_{dkj}\, N_{idkj}\,,$$

where $f_{dkj}$ = expected sampling rate for degree $d$, major sampling category $k$, and demographic group $j$, and $N_{idkj}$ = total number of graduates of institution $i$ with degree $d$, major sampling category $k$, and demographic group $j$.
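
For illustration only, a minimal sketch in Python of how $\mathrm{MOS}_i$ can be computed from an IPEDS-style counts file. The column names and the sampling rates below are hypothetical; the actual domain rates come from the sample design.

    import pandas as pd

    # Hypothetical IPEDS-style counts: one row per institution x degree x major x demo.
    counts = pd.DataFrame({
        "inst":    ["A", "A", "B", "B"],
        "degree":  ["BA", "MA", "BA", "BA"],
        "major":   ["chemistry", "chemistry", "physics", "psychology"],
        "demo":    ["white_f", "minority_m", "asian_f", "minority_f"],
        "n_grads": [120, 15, 40, 60],
    })

    # Assumed expected sampling rates f_dkj (illustrative values only).
    rates = {
        ("BA", "chemistry", "white_f"):     0.02,
        ("MA", "chemistry", "minority_m"):  0.07,
        ("BA", "physics", "asian_f"):       0.04,
        ("BA", "psychology", "minority_f"): 0.07,
    }

    counts["f_dkj"] = [rates[k] for k in zip(counts.degree, counts.major, counts.demo)]
    # MOS_i = sum over domains of f_dkj * N_idkj
    mos = (counts.f_dkj * counts.n_grads).groupby(counts.inst).sum()
    print(mos)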



2. Statistical Procedures

The sampling frame for the SEH graduates is formed from lists of graduates from the sampled universities. Each institution’s list will be stratified by (1) two graduate cohorts—one cohort from the 2007–08 academic year (July 1, 2007-June 30, 2008) and the other cohort from the 2008–09 academic year (July 1, 2008-June 30, 2009); (2) two degree levels—bachelor’s and master’s; (3) the 20 major field of study sampling categories identified above²; (4) the three race/ethnicity groups—non-Hispanic white, non-Hispanic Asian, and minority (black, Hispanic, and American Indian/Alaska Native); and (5) two gender groups.


After sorting the sampling frame by institution and domain (cohort, degree level, major field, race/ethnicity, and gender, in that order), the sample of 18,000 graduates will be selected sequentially using probability proportional to size (PPS) sampling, with the institution-level, domain-specific sample sizes as the size measure. A total of 480 strata will be formed by the cross-classification of the above-mentioned domains. Underrepresented minorities will be selected at 3.5 times the rate of white graduates; Asian and unknown-race cases will be selected at 2.1 times the rate of white graduates.
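
As a sketch of sequential PPS selection on a sorted frame, the following assumes systematic PPS, one common sequential PPS method; the survey’s actual selection algorithm may differ, and the size measures below are made up.

    import numpy as np

    rng = np.random.default_rng(2010)

    def systematic_pps(size, n):
        # Systematic PPS: lay the size measures end to end on a line, take n
        # equally spaced points after a random start, and select the units
        # the points fall into.
        size = np.asarray(size, dtype=float)
        cum = np.cumsum(size)
        step = cum[-1] / n
        points = (rng.random() + np.arange(n)) * step
        return np.searchsorted(cum, points, side="right")

    # Frame rows assumed already sorted by institution and domain, so the
    # systematic pass also acts as implicit stratification. Sizes here are
    # made-up institution-by-domain target sample sizes.
    size_measure = np.array([3.2, 1.0, 5.5, 0.4, 2.1, 4.8, 1.7, 0.9])
    print(systematic_pps(size_measure, n=3))  # indices of the sampled rows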


Appendix C shows the 2008 sample sizes for each stratum of the cohort from the 2005–2006 and 2006–2007 academic years. The 2010 sample design will be similar to 2008. The proposed sample sizes are based on the same sampling rates used for composite size measure calculation for the school sample selection. With these proposed sample sizes, the corresponding sampling rates, defined as the ratio of the sample sizes to the IPEDS counts, are calculated. These sampling rates by stratum will be applied within each eligible responding institution and should result in sampling 18,000 graduates. The domain specific sample sizes are random variables that depend on how closely the number of graduates in the eligible fields as reported by the institutions corresponds to the IPEDS counts used for sampling; minor variation in the achieved sample size is expected.


The analysis of survey data from the 2010 NSRCG requires survey weights to account for unequal probabilities of selection, unit nonresponse, duplicates on the sampling frame, extreme weights, and coverage errors.


Constructing the Institution-Level Weight. The first step of the 2010 NSRCG weighting process will begin with the construction of the sampling weights for the postsecondary institutions. All sampled institutions will have a sampling weight equal to the inverse of the institution’s probability of selection. The nonresponse adjustment cells at the school level will be formed by a cross-classification of significant variables identified from a response propensity model on variables such as institutional control (public and private), region, representation (whether the institution is self-representing or non-self-representing), and percentage of minority graduates.
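
A minimal sketch of this kind of weighting-class nonresponse adjustment, with hypothetical selection probabilities and cells; in practice the cells would come from the response propensity model described above.

    import pandas as pd

    # Hypothetical school-level file: selection probability, response status,
    # and a nonresponse-adjustment cell.
    schools = pd.DataFrame({
        "prob":      [0.9, 0.3, 0.3, 0.1, 0.1, 0.1],
        "responded": [1, 1, 0, 1, 1, 0],
        "cell":      ["pub", "pub", "pub", "priv", "priv", "priv"],
    })
    schools["base_wt"] = 1.0 / schools["prob"]

    # Within each cell, spread the weight of nonrespondents over respondents:
    # factor = (total base weight in cell) / (respondent base weight in cell).
    tot = schools.groupby("cell")["base_wt"].transform("sum")
    resp = (schools["base_wt"] * schools["responded"]).groupby(schools["cell"]).transform("sum")
    schools["nr_wt"] = schools["base_wt"] * (tot / resp) * schools["responded"]
    print(schools[["base_wt", "nr_wt"]])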


Constructing the Graduate-Level Sampling Weights. The graduate sampling weight is the product of the institution-level, nonresponse-adjusted weight and the inverse of the conditional probability of selecting the graduate, given that the individual’s institution was selected. The next step will be a weighting adjustment to account for graduate nonresponse. The graduates will be classified as eligible respondents, eligible nonrespondents, ineligible, or of unknown eligibility. In addition, the sample can also be partitioned into two groups: located and not located. The graduate-level nonresponse adjustment will be computed in three steps: an adjustment for not-located cases, an adjustment for cases of unknown eligibility, and an adjustment for eligible nonrespondents. Consequently, the graduate nonresponse-adjusted weight is the product of these three factors (factor 1 for not-located cases, factor 2 for unknown-eligibility cases, and factor 3 for nonrespondents) and the base weight.
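
A sketch of the three sequential factors, using hypothetical disposition codes and adjustment cells, and simplifying by ignoring ineligible cases.

    import pandas as pd

    def cell_factor(wt, in_base, in_keep, cells):
        # Carry the weight of cases dropped at this step (in_base but not
        # in_keep) over to the retained cases, within adjustment cells.
        num = (wt * in_base).groupby(cells).transform("sum")
        den = (wt * in_keep).groupby(cells).transform("sum")
        return num / den

    # Hypothetical graduate file: base weight, disposition, adjustment cell.
    g = pd.DataFrame({
        "wt":   [10.0, 10.0, 12.0, 12.0, 15.0, 15.0],
        "disp": ["resp", "nonresp", "unknown", "resp", "notlocated", "resp"],
        "cell": ["x", "x", "x", "y", "y", "y"],
    })
    located = g.disp != "notlocated"
    known   = located & (g.disp != "unknown")   # eligibility determined
    resp    = g.disp == "resp"

    w = g.wt * cell_factor(g.wt, True, located, g.cell)   # factor 1: not located
    w = w * cell_factor(w, located, known, g.cell)        # factor 2: unknown eligibility
    w = w * cell_factor(w, known, resp, g.cell)           # factor 3: nonrespondents
    g["nr_wt"] = w.where(resp, 0.0)
    print(g)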


Adjustment for Multiple Chances of Selection (Multiplicity Adjustment). The next adjustment to the graduate weight involves those responding graduates who could have been sampled more than once. For example, a person who obtained a U.S. bachelor’s degree in June 2008 and a U.S. master’s degree in June 2009 (both in eligible fields) could have been sampled for either degree. If a respondent had multiple degrees from within or across sampled schools, he or she will very likely be identified before the sample selection so that no graduate will be sampled more than once. Consequently, multiple degree holders are expected to be identified in the weighting stage if they reported eligible degrees from nonsampled schools in addition to sampled schools. To make the survey estimates essentially unbiased, the weights of all responding graduates who could have been sampled multiple times (but not identified at the time of sampling) will be divided by the number of times of possible selection.
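
In symbols, if $w_i$ denotes the nonresponse-adjusted weight of responding graduate $i$ and $m_i$ the number of distinct chances of selection identified for that graduate, the multiplicity-adjusted weight is

$$w_i^{\mathrm{mult}} = w_i / m_i\,,$$

so, for example, the bachelor’s-and-master’s graduate described above ($m_i = 2$) would receive half the weight.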


Raking Adjustment. As in the past, a raking procedure will be applied to enhance the precision of the 2010 NSRCG estimates after adjusting for multiple degrees. Raking is a method of adjustment that ensures the adjusted weights of the respondents conform to each of the marginal distributions of the auxiliary variables (Deming and Stephan 1940). Raking involves an iterative adjustment of the weights in which fitting methods—such as an iterative proportional fitting algorithm or least squares—are used.
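
A minimal sketch of raking via iterative proportional fitting, with made-up respondent data and control totals.

    import numpy as np

    def rake(w, margins, targets, tol=1e-8, max_iter=100):
        # Iterative proportional fitting: rescale the weights until the
        # weighted counts match each set of marginal control totals.
        # margins: integer category labels, one array per raking dimension;
        # targets: control totals aligned with those labels.
        w = w.astype(float).copy()
        for _ in range(max_iter):
            maxdev = 0.0
            for cats, t in zip(margins, targets):
                cur = np.bincount(cats, weights=w, minlength=len(t))
                factor = np.where(cur > 0, t / np.where(cur > 0, cur, 1.0), 1.0)
                w *= factor[cats]
                maxdev = max(maxdev, np.abs(cur - t).max())
            if maxdev < tol:
                break
        return w

    # Toy example: rake 6 respondents to hypothetical gender and degree totals.
    w0     = np.ones(6)
    gender = np.array([0, 0, 0, 1, 1, 1])   # 0 = male, 1 = female
    degree = np.array([0, 1, 0, 1, 0, 1])   # 0 = bachelor's, 1 = master's
    w = rake(w0, [gender, degree], [np.array([40.0, 60.0]), np.array([55.0, 45.0])])
    print(w, np.bincount(gender, weights=w), np.bincount(degree, weights=w))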


Trimming of Outlier Weights. Raked weights will be evaluated for the existence of outlier weights. To do this, weighted counts for the present and past survey year will be compared for rare populations subject to oversampling (that is, black, Hispanic, and American Indian/Alaska Native). When rare populations are oversampled, excessive variation can occur in the population counts from year to year, particularly when members of rare populations are unexpectedly encountered in sampling a “non-rare” stratum. The large weight given to these rare cases when sampled from a non-rare stratum can cause even one such selection to distort rare population counts from one year to the next. The increase in sampling error can be substantial if the range of weights is large. In particular, extremely large sampling weights can seriously reduce survey precision.


To correct outlier problems, the weight of the outliers will be trimmed by investigating weight distributions for each analytic domain of interest. The raking adjustment will be repeated after weight trimming. This second iteration of raking will serve as a smoothing adjustment to recover the amount trimmed from the outlier weights.
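
The trimming rule itself is not specified here; as an illustration only, the sketch below caps weights at an assumed multiple of the domain median before the second raking pass recovers the trimmed amount.

    import numpy as np

    def trim_weights(w, domains, cap_mult=3.5):
        # Cap weights at cap_mult times their domain median (an assumed rule;
        # the survey's actual trimming criterion may differ). Raking is then
        # repeated so the trimmed amount is smoothed back over other cases.
        w = w.astype(float).copy()
        for d in np.unique(domains):
            idx = domains == d
            cap = cap_mult * np.median(w[idx])
            w[idx] = np.minimum(w[idx], cap)
        return w

    w = np.array([5.0, 6.0, 5.5, 48.0, 6.2])        # one extreme outlier weight
    print(trim_weights(w, np.zeros(5, dtype=int)))  # outlier capped near 3.5x median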


Constructing the Final Weight and Replicate Weights. The final analysis weight will be constructed by implementing the above-mentioned procedures: sampling, nonresponse adjustment, multiplicity adjustment, raking, and trimming. A set of replicate weights will be produced based on the jackknife replication method. The entire weighting process applied to the full sample will then be applied separately to each of the replicates to produce a set of replicate weights for each record.


Standard Errors. Variance estimation procedures similar to those used in the past will be used in 2010: the jackknife replication and generalized variance function (GVF) methods.
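
A minimal sketch of the jackknife part, using a simple delete-one-group (JK1) scheme with made-up data; the NSRCG’s actual replicate grouping and variance factors may differ (stratified JKn schemes use different factors).

    import numpy as np

    def weighted_mean(y, w):
        return np.sum(w * y) / np.sum(w)

    def jk1_variance(estimate_fn, y, rep_weights, full_weights):
        # JK1: v = ((R - 1) / R) * sum over replicates of (theta_r - theta)^2
        theta = estimate_fn(y, full_weights)
        reps = np.array([estimate_fn(y, rw) for rw in rep_weights])
        n_reps = len(reps)
        var = (n_reps - 1.0) / n_reps * np.sum((reps - theta) ** 2)
        return theta, np.sqrt(var)

    # Toy data: 4 respondents treated as 4 delete-one jackknife groups.
    y = np.array([52.0, 61.0, 47.0, 58.0])
    w = np.array([10.0, 12.0, 9.0, 11.0])
    n = len(y)
    rep_w = []
    for r in range(n):
        rw = w * n / (n - 1.0)   # inflate the retained groups...
        rw[r] = 0.0              # ...and zero out group r
        rep_w.append(rw)

    theta, se = jk1_variance(weighted_mean, y, rep_w, w)
    print(round(theta, 3), round(se, 3))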

3. Methods to Maximize Response Rates

Maximizing Response Rates

A critical issue for the NSRCG is dealing with response rates, which declined from around 85 percent in the early 1990s to 68 percent in 2003 and 2006, and 71 percent in 2008. The approach for the 2010 survey will be to reduce the number of nonrespondents through improvements in the data collection strategy. Nonresponse to most surveys is caused by two factors: (1) the inability to locate the sample member and (2) the inability to gain cooperation from the located sample member.


The lower response rates in 2003, 2006, and 2008 reflect not only that sample members were less likely to respond to the survey, but also that it has become much harder to locate and contact them. The nonresponse rate for the 2008 NSRCG was about 29 percent, of which about 20 percentage points were cases that could not be located. The NSRCG population of recent college graduates is highly mobile and likely to have only a cell phone.

Methods of maximizing response rates include offering multiple modes for completing the interview, offering incentives, addressing the cell phone issue, converting refusals effectively, and applying intensive locating efforts. The 2010 NSRCG is a multimode study, with web, mail, and telephone modes offered. NSF is planning to emphasize first the lowest marginal cost mode, the web, followed by the second lowest cost mode, mail, and finally telephone, the most expensive. In addition to this emphasis, the survey contractor, Mathematica Policy Research (MPR), will use a number of techniques to try to ensure early participation in the 2010 NSRCG, which reduces follow-up costs.


Incentive plan in 2010


To increase the response rate and minimize potential bias, NSF plans to offer a monetary incentive. Based on results from the 2008 NSRCG incentive experiment³, the plan is to offer $20 initially for a completed questionnaire, changing later in the field period to a differential amount that favors web completes.


Sample members who refuse are likely to remain nonrespondents without the offer of an incentive. Incentives are effective in increasing the survey response rate, which in turn helps to minimize possible nonresponse bias in the final survey estimates. The incentive plan will use a two-tiered incentive level. The invitation letter will offer a $20 postpaid incentive for completing the survey. In the questionnaire mailing that follows 4-5 weeks later (or when initial web returns begin to decline sharply), we will offer nonrespondents $20 for completing the enclosed paper questionnaire or a telephone interview, or $30 for a completed web questionnaire. At the end of the 2008 NSRCG incentive experiment, groups offered the differential incentive had response rates 10-13 percentage points higher than groups not offered an incentive, and several percentage points higher than those offered a flat $20 for all completes.


Locating

NSF will start locating the sample members early by obtaining the latest contact information from the alumni offices of the institution from which they received their sampled degree. These offices are often the best source of current information because they have a vested interest in maintaining contact with alumni. Early locating will mostly involve various nonintrusive locating resources to collect the best contact information on the sample members prior to the data collection.

All survey mailings will utilize the “Return Service Requested” option to ensure that the postal service will provide a forwarding address for any undeliverable mail. During the data collection field period, all cases still lacking a valid address or telephone number will be handled by the most experienced locators who will: (1) search more extensive (often more expensive) electronic databases for contact information, (2) conduct individually customized Internet searches, and (3) contact school departments from which the sample member graduated or associations in which he or she might have memberships. In addition, emerging sources of information, such as cell phone directories and search engines, will be monitored for possible use in locating NSRCG sample members.


Addresses Outside the United States

If a sample member has a current address outside the United States, NSF will institute special procedures to try to confirm that the person is still outside the United States on the reference date of October 1, 2010, and therefore ineligible for the study. This will include calling the sampled graduate and all available contacts during the week of October 1, 2010. This will be done before mailing any initial invitation. If we can identify the sampled graduate as ineligible, the case can be coded as ineligible without expending additional resources.


Telephone and Address Verification Form (TAVF)

An advance letter will be mailed to the sampled graduates four weeks prior to October 1, 2010. To increase initial contact rates, a telephone and address verification form (TAVF) will accompany the advance letter. The TAVF will collect the usual contact information, cell phone information (including the service provider), and the sampled graduate’s email address(es). MPR’s toll-free telephone number and email address will also be included for sample members who have questions. A postage-paid return envelope will facilitate returning completed TAVFs. See Appendix E for the TAVF.


Encouraging Web Completes


Web completes provide significant cost savings because they require minimal staff intervention. We will encourage web completes in two ways: (1) the initial invitation letter mailing will not include a paper questionnaire (web will be the only option), and (2) the follow-up paper questionnaire mailing will offer a differential incentive. In the 2008 NSRCG incentive experiment, WebFirst groups (no questionnaire in the first mailing) still had web completes representing 70 percent of their total completes three months into the field period, while the groups with the differential incentive had 80 percent web completes, saving paper questionnaire receipt and processing time.


Increasing Contacts with Cell-Phone-Only Households


Because a substantial proportion of the NSRCG sample will have only cell phones, NSF intends to use the email and postcard approach that proved successful in 2008 to reach sample members. In 2010, we plan to initiate the mail and email prompts earlier than in 2008 and continue a series of mailings and reminder emails throughout the field period. The reminder mailings will consist mainly of postcards, which have proved more effective than letters because the message is immediately visible, whereas a letter can be discarded unopened. Every email reminder in 2008 produced a small bump in completed interviews, especially web completes.


Data Collection

A multimode data collection protocol will be used to improve the likelihood of gaining cooperation from sample cases that are located. Sample cases will be initially offered only a web option to encourage survey completion by the web, which is the most cost effective and timely method of data collection. Recent graduates are highly web-literate, so offering a web response option is apt to be appealing to NSRCG respondents.


The follow-up mailing will offer a choice of response options by including the paper questionnaire along with web survey access information; CATI contacting will begin afterward for those who do not respond to the paper and web survey invitations.


In addition to these procedures, the following steps will be taken to maximize response rates and minimize nonresponse:

  • Developing “user friendly” survey materials that are simple to understand and use

  • Sending attractive, personalized material using priority mail, making a reasonable request of the respondent’s time, and making it easy for the respondent to comply

  • Using priority mail for targeted mailings to improve the chances of reaching respondents and convincing them that the survey is important

  • Devoting significant time to interviewer training on how to deal with problems related to nonresponse and ensuring that interviewers are appropriately supervised and monitored

  • Using refusal-conversion strategies that specifically address the reason why a potential respondent has initially refused, and then training conversion specialists in effective counterarguments

See Appendices E and F for survey mailing materials.


Dealing with Issues of Nonresponse Bias

To minimize the potential nonresponse bias in the NSRCG, weighting procedures will be used to compensate for nonrespondents in the final weighted estimates. Multivariate logistic regression analyses will be conducted to identify the sampling frame variables that might have affected the sample members’ response propensity.


However, NSF remained concerned with the lower than expected survey response rate in the NSRCG and conducted a nonresponse bias study for the first stage and a study of the effects of late respondents for the second stage of past NSRCG data collections (see Appendix G). The research on first-stage nonresponse bias showed that while we need to continue to achieve high school-level response rates, such as 99 percent, we can be flexible enough to allow the school-level response rate to fall to 90 percent. Because of the late start of the first-stage list collection in 2010, it may be necessary to accept a lower first-stage response rate in order to avoid delaying the start of the second-stage data collection (and to maintain a data collection schedule similar to that of the other two SESTAT surveys).


Any potential bias will be assessed before list collection is stopped at a response rate below 99 percent. NSF will monitor in real time at several points before list collection is closed, including when the response rate reaches 90 percent, examining nonrespondent and respondent institutions in terms of key school-level characteristics, including geographic location, type of institutional control, and percentage of minority graduates. NSF will examine whether there are systematic differences on key variables (e.g., graduate counts) between nonresponding and responding schools (for example, historically black colleges have in the past been more likely to be nonresponding schools) in order to decide whether to continue with list collection or whether the nonresponse can be taken into account through weighting. Research based on the 2003 and 2006 data found no significant bias in the final data, and the small differences found were properly addressed in the nonresponse weighting adjustments.


In 2010, the base weights will be adjusted for nonresponse using the procedures described above. Also, NSF may consider looking at a few other sampling variables for the weighting strategy, such as school or respondent location, to see if the nonresponse adjustment weighting can be fine-tuned. Careful selection of factors for constructing the weighting classes will reduce the potential for nonresponse bias. Weights will also be adjusted to control distributions for some variables to known totals from the sampling frame, as described above. An assessment will be made of the extent of remaining bias by comparing weighted survey estimates for variables observed on the sampling frame (e.g., degree field, degree level, and gender) to the corresponding values for the population the weighted sample is intended to represent.
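
As a sketch of such an assessment, one can compare the weighted respondent distribution of a frame variable with its known frame distribution; all values below are made up.

    import numpy as np

    # Hypothetical check: weighted respondent distribution of degree level
    # versus the known frame distribution.
    frame_share  = np.array([0.70, 0.30])           # bachelor's, master's on the frame
    resp_weights = np.array([8.0, 9.0, 7.5, 30.0])  # final weights of respondents
    resp_degree  = np.array([0, 0, 1, 1])           # 0 = bachelor's, 1 = master's

    est_share = np.bincount(resp_degree, weights=resp_weights) / resp_weights.sum()
    print(est_share - frame_share)  # large residuals suggest remaining bias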

4. Testing of Procedures

Because data from all three SESTAT surveys are combined into a unified data system, the surveys must be closely coordinated to provide comparable data from each survey. Most questionnaire items in the three surveys are the same.


The SESTAT survey questionnaire items are divided into two types of questions: core and module. Core questions are defined as those considered to be the base for all three SESTAT surveys. These items are essential for sampling, respondent verification, basic labor force information, and/or robust analyses of the science and engineering workforce in the SESTAT integrated data system. They are asked of all respondents each time they are surveyed, as appropriate, to establish the baseline data and to update the respondents’ labor force status and changes in employment and other demographic characteristics. Module items are defined as special topics that are asked less frequently on a rotational basis of the entire target population or some subset thereof. Module items tend to provide the data needed to satisfy specific policy, research or data user needs.


All content items in the SESTAT survey questionnaires underwent an extensive review before they were included in the final version of the 2008 questionnaires. The 2010 NSRCG questionnaire will include no newly developed items; it will include several module items rotating in from prior survey rounds and one new category in the core item on disability, which was taken from the ACS although not previously fielded in the SESTAT surveys.


For 2010, the NSRCG questionnaire content has been revised from 2008 as follows:


  • Survey reference date changed from October 1, 2008 to October 1, 2010.

  • Rotated in a question determining if the respondent’s employer is a new business (last asked in 2003).

  • Rotated in a module on respondent’s job satisfaction with various job attributes (last asked in 2003).

  • Rotated in a module on employer-provided job benefits (last asked in 1997).

  • Rotated in a question on sources of Federal support with a reduced number of Federal agencies in the list (last asked in 2003).

  • Rotated in questions about professional meeting attendance and association membership (last asked in 2003).

  • Rotated in a module on respondent’s ranking of the importance of various job attributes (last asked in 2003).

  • Rotated in immigrant module questions (receipt year of permanent U.S. resident visa, year came to U.S., type of entry visa, reasons for coming to U.S., dual citizenship status) (last asked in 2003).

  • Added a new category on the respondent’s functional limitations with regard to concentrating, remembering, or making decisions (from the ACS questionnaire).

  • Removed some of the questions from the 2008 module on community college.

  • Rotated out a module on second job (status, job description, job category, relatedness of second job to highest degree).

  • Removed a category ‘Chronic illness or disability’ as a reason for working fewer than 35 hours per week due to too few cases reporting that reason in past survey cycles.

  • Modified the format of telephone numbers of the respondents to specify home, work, or cell number for the daytime, evening and other telephone numbers.

  • Added four new computer occupation codes and dropped one old code from the Job Category List based on the updated 2010 Standard Occupation Classifications (SOC).

  • Modified the Field of Study List based on the updated 2010 Classification of Instructional Programs (CIP).


A complete list of questions proposed to be added, dropped, or modified in the 2010 NSRCG questionnaire is included in Appendix D.


The 2010 NSRCG web survey instrument will be updated from the 2008 NSRCG web instrument in light of recommendations from the usability testing conducted on the NSCG web survey instrument by the Census Bureau’s Statistical Research Division.


2008 Survey Methodology Tests

Differential Incentive Experiment


In 2008, NSF conducted an incentive experiment to test the effects of offering different levels of incentives based on response options (see Appendix I). If cell phone listings were readily available, the task of locating and interviewing members of the NSRCG population would be relatively straightforward. However, cell phone listings are not readily available, and with the large and ever-increasing number of young adults in cell-phone-only households, making contact is often problematic. Even when contact is made, convincing these young adults to participate in a survey is another challenge. Thus, locating, contacting, and then interviewing young adults can be very labor intensive and therefore costly.


A monetary incentive can help reduce this problem by motivating people to respond who might not otherwise do so. Investigating this behavior was an important part of the motivation behind the 2008 NSRCG monetary incentive experiment. The other motivating factor was increased cost efficiency. Because web questionnaires are notably less costly to process than either paper or CATI questionnaires, exploring whether offering a differential incentive that favored web completes would significantly increase web completes was another goal of the 2008 NSRCG experiment.


The incentive experiment results showed that groups offered no incentive had response rates 10 to nearly 15 percentage points lower than groups offered incentives, and that the groups offered the differential incentive had the highest response rates as well as the highest proportion of web completes. Another finding was that not offering a paper questionnaire in the initial survey package also increased the overall proportion of web completes.


Based on these outcomes, NSF plans to incorporate a two-tiered monetary incentive in 2010, along with not including a paper questionnaire in the initial survey mailing. The plan is to offer a $20 incentive as part of the initial mailing and a $20/$30 differential incentive as part of the second mailing five weeks later, when a paper questionnaire is also enclosed. Using this incentive from the start may also shorten the field period.


Survey Methodology Tests to be Undertaken


NSF currently does not have any plans to conduct methodological tests in the 2010 NSRCG. Should NSF decide to include such tests, the plan will be submitted for OMB approval prior to implementation.



5. Contacts for Statistical Aspects of Data Collection

As mentioned, the data will be collected by MPR, a research contractor selected through an open competition. The chief consultant on statistical aspects of data collection is Donsig Jang of Mathematica, (202) 484-4246. At NSF, the contacts for statistical aspects of data collection are Stephen Cohen, SRS Chief Statistician, (703) 292-7769, and Kelly Kang, NSRCG Project Manager, (703) 292-7796.



1 The Completions File contains the number of degrees/other awards granted by the postsecondary institution in each field of study (CIP code), by level of award/degree, and race/ethnicity and gender of the recipient.

2 Two health fields will be combined to be consistent with the level of analytic domains. That is, all health fields are reported in the same reporting cell.

3 See “2008 NSRCG incentive experiment results: Internal Working Paper” in Appendix I.
