SDR OMB Package for the 2015 Cycle
Attachment E
2013 Survey of Doctorate Recipients: Sample
Design and Implementation Report
Survey of Doctorate Recipients
2013 SURVEY OF DOCTORATE
RECIPIENTS:
Sample Design and Implementation
PREPARED FOR:
Lynn Milan, SDR COTR
National Science Foundation
4201 Wilson Boulevard
Arlington, VA 22230
(703) 292-5111
Lori Thurgood
SRI International
1100 Wilson Blvd., Suite 2800
Arlington, VA 22209
(703) 247-8528
FEBRUARY 11, 2013
PREPARED BY:
Brenda G. Cox, SRA
Karen Grigorian
Michael Yang
Mike Sinclair
NORC at the
University of Chicago
55 East Monroe Street
Chicago, IL 60603
(312) 759-4000
2013 SDR | Sample Design and Implementation
This report was prepared by NORC at the University of Chicago for SRI under Subcontract 52000059, which is in turn conducted for the National Science Foundation under Prime Contract
Number NSFDACS12C1299 (Task Order 8NSFSRS083228). The NORC internal Project
Number is 7442.
This version of the report contains suppression of small cells for confidentiality reasons and may
be publicly released. Please contact NORC for further information regarding this document or
for information about the suppressed version named 2013 SDR Sample Design and
Implementation Report_11Feb2013_Final.docx.
NORC Authors
Karen Grigorian
Michael Yang
Mike Sinclair
SRA International Author
Brenda G. Cox
Table of Contents
1. Overview of the 2013 SDR Sample Design
2. Design Changes from the 2010 SDR
3. Frame Development
3.1 Sample Frame Construction
3.1.1 Frame File Layout
3.1.2 Missing Data Imputation Rules for Sampling Stratification and Sort Variables
3.2 Old Cohort Sample Frame Construction
3.2.1 NSDR Old Cohort Frame Definition
3.2.2 ISDR Old Cohort Frame Definition
3.2.3 2010 SDR Final Eligibility Status and Frame Assignment
3.2.4 Evaluation of Old Cohort Frame Strata Assignments
3.3 New Cohort Sample Frame Construction
4. Sample Stratification
4.1 NSDR Sample Stratification
4.1.1 Demographic Group Recode
4.1.2 Degree Field Recodes
4.2 ISDR Sample Stratification
5. Sample Size
5.1 NSDR Sample Size
5.2 ISDR Sample Size
6. Sample Allocation
6.1 Background on NSDR Sample Allocation Procedures
6.1.1 Introduction of the Maintenance Cut
6.1.2 The 2013 NSDR and its Derivation from 2003 and 2010 NSDR Redesigns
6.2 Allocation of the 2013 NSDR Sample to Panel Members and New Cohorts
6.2.1 The NSDR Allocation Process
6.2.2 The 2013 NSDR Allocation Results
6.2.3 Trends over Time in the NSDR Sample Allocation
6.3 ISDR Sample Allocation
6.4 NSDR and ISDR Probabilistic Rounding
7. Sample Selection
7.1 NSDR Sample Selection
7.2 ISDR Sample Selection
8. Concluding Remarks
References
Appendices
Appendix A.1 2013 SDR Birth Region Code Frame Mapped to SESTAT Geocodes and Race/Ethnicity Imputation based on Birthplace
Appendix A.2 2013 SDR Field of Study Coding Taxonomies Crosswalk
Appendix A.3 2013 SDR Data Sources Used to Develop Sampling Frame Variables
Appendix B.1 2013 NSDR Strata and Frame Counts
Appendix B.2 2013 NSDR Minimum Respondent Stratum Sample Sizes with and without Finite Population Correction Adjustment and Associated Yield Rates
Appendix B.3 2013 NSDR Final Sample Allocation
Appendix C.1 2013 ISDR Strata with Frame Population Counts and Sample Cases
Appendix D Detailed NSDR Allocation Algorithm and Final 2013 NDR allocation
1. Overview of the 2013 SDR Sample Design
Since its inception in 1950, the National Science Foundation (NSF) has been charged to “Provide
a central clearinghouse for the collection, interpretation and analysis of data on scientific and
technical resources in the United States, and provide a source of information for policy
formulation by other federal agencies” (NSF Web Site 2012). The Survey of Doctorate
Recipients (SDR) has been an important means for the NSF to accomplish this objective.
Conducted biennially since 1973, the SDR follows a sample of U.S.-trained doctorates in
science, engineering, and health (SEH) throughout their careers, from shortly after degree award
by a U.S. institution through age 75. The SDR is widely used by the U.S. Congress, Federal
agencies, universities, professional societies, and other organizations and individuals
interested in knowing more about the nation’s education, supply, and employment of doctorate
recipients in SEH fields. Employers in the university, industry, and government sectors also use
the SDR to understand and predict trends in employment opportunities and salaries for
doctorates in SEH fields.
Until the 2003 survey cycle, the SDR restricted data collection to U.S. residents. The 2003 SDR
included two methodological changes to determine whether data could be successfully collected
from U.S.-trained SEH doctorates who reside outside the U.S. The first change expanded the data
collection for the traditional SDR by completing surveys with sampled cases discovered to be
living outside the U.S. The second change attempted data collection, regardless of country of
residence, for a sample of the non-U.S. citizens with degrees awarded in the 2001 and 2002
academic years. These non-U.S. citizens were ineligible for inclusion in the new cohort portion
of the traditional SDR frame because they reported plans to emigrate after degree receipt
(Grigorian and Hoffer, 2005). Collection of data from international residents proved to be
feasible, and a formal design for the international survey and its longitudinal panel of
U.S.-trained SEH doctorates who reside outside the U.S. was developed in the 2006 cycle and
continued in the 2008 cycle (Cox, Grigorian, and Yang, 2006). Both the main and international samples targeted
U.S.-trained SEH doctorates younger than 76 years on the survey reference date, but the main
SDR target population was restricted to those residing in the U.S., and the eligible international
target population was restricted to those residing outside of the U.S. As a result of this sample
Prepared for NSF by NORC | 1
segregation, the potential analytic power of both samples was diminished in terms of coverage
and sample sizes. Survey data collected from non-U.S. residing respondents from the main SDR
sample and survey data collected from U.S. residing respondents from the international sample
were not utilized for analysis.
To address this issue, the NSF decided to integrate the two surveys in the 2010 survey cycle,
creating a unified survey of U.S.-trained SEH doctorates that gives analysts the capability of
studying and comparing U.S. and non-U.S. residents. NSF decided to refer to the integrated
data set as the Survey of Doctorate Recipients (SDR) and its two components as the National
Survey of Doctorate Recipients (NSDR) and the International Survey of Doctorate Recipients
(ISDR).1
The integrated sample design developed for the 2010 SDR has been maintained for the 2013
SDR. This report describes the 2013 SDR sample designs and implementation for both the
NSDR and ISDR sample components. We begin by summarizing the main design changes from
the 2010 SDR in Section 2. These changes were relatively minor and were generally restricted to
the development of the sampling frame variables. We then discuss in detail the main parameters
of the 2013 SDR design. Section 3 describes the frame construction process for the different
cohorts of the population. Section 4 presents the 2013 SDR stratification scheme for the two sample
components. Section 5 discusses the sample sizes of the 2013 NSDR and ISDR sample
components. Section 6 discusses the strategy and results of sample allocation across strata and
substrata. Section 7 reports the sample selection procedures. Finally, Section 8 provides
recommendations for future SDR sample design research.
1 Analysts should note that study documentation for previous survey cycles (2003 to 2008) refers to the NSDR as the
SDR and that the ISDR designation was not applied to this sample component, as it was still in the feasibility stage in the
2003 survey cycle.
2. Design Changes from the 2010 SDR
Changes in a longitudinal study such as the SDR must be documented so that study planners can
properly understand and use data from past survey cycles. In addition, analysts need this documentation to
assess whether, and to what extent, differences in study design may affect time-series
analyses. The changes between the 2010 SDR and 2013 SDR sample designs are minimal, as the
2013 design closely followed that of 2010. Most of the differences relate to
the construction of sampling frame variables. This section documents the limited
differences that do exist between the two designs.
Target Population Definition
The reference date changed from 1 October 2010 used in the 2010 SDR to 1 February
2013 for the 2013 SDR.
Frame Construction
The field of study taxonomy for the 2013 SDR new cohort included 15 new fields added
by the Survey of Earned Doctorates (SED). Those new fields and how they map to the
SESTAT fine field of study code are shown in Table 2.1.
While the definition of the location status variable (LOCSTAT) did not change from
the 2010 to the 2013 SDR, the method for creating it for old cohort nonrespondent cases was
improved in the 2013 SDR. As part of the 2010 SDR post-processing procedures, the most
current sample member location variable (RESPLO3) was created for located
nonresponse cases using the NSF-approved algorithm.2 This variable was used to assign
LOCSTAT13 for the 2013 SDR old cohort cases that finalized as located nonrespondents
in the 2010 SDR. In the 2010 cycle, the last known address was used to create LOCSTAT10 for
the located nonresponse cases, but without applying the more precise RESPLO3 program code
algorithm.
The region of birth variable used to sort ISDR cases was updated. A new region of birth
variable (BIREGION) was developed and replaces the region of birth variable used in the
2010 ISDR (ISDRCODE). The variable BIREGION more closely aligns with the other
place of birth variables used in post-processing and standard publications; its code
frame can be found in Appendix A.1.
2 For details regarding the algorithm for creating RESPLO3 for the 2010 SDR nonresponse cases, see the
memorandum entitled “2010 Survey of Doctorate Recipients Final Data Delivery” addressed to Lynn Milan (NSF)
and Dave Edson (MPR) from Karen Grigorian and Lance Selfa (NORC), dated 11 January 2013, included with the
2010 SDR final data delivery files.
Table 2.1
New SED Field of Study Codes mapped to SESTAT Codes
| New SED Field of Study Code (PHDFIELD) | New SED Field of Study Label | SESTAT Field of Study Code (NSDRMED) | SESTAT Field of Study Label |
|---|---|---|---|
| 415 | Robotics | D67 | Computer/information sciences |
| 509 | Astronomy, Other | 871 | Astronomy and astrophysics |
| 577 | Medical Physics/Radiological Science | 878 | Physics, except biophysics |
| 316 | Structural Engineering | 726 | Civil engineering |
| 168 | Virology | 637 | Microbiological sciences and immunology |
| 104 | Computational Biology | 642 | OTHER biological sciences |
| 155 | Structural Biology | 642 | OTHER biological sciences |
| 167 | Environmental Toxicology | 642 | OTHER biological sciences |
| 207 | Oral Biology/Oral Pathology | 786 | Medicine (e.g., dentistry, optometry, osteopathic, podiatry, veterinary) |
| 227 | Gerontology | 731 | OTHER health/medical sciences |
| 684 | Gerontology | 930 | OTHER social sciences |
| 806 | Urban Education and Leadership | | Not applicable; non-SEH field |
| 808 | Educational Policy Analysis | | Not applicable; non-SEH field |
| 833 | International Education | | Not applicable; non-SEH field |
| 912 | Hospitality, Food Service and Tourism Management | | Not applicable; non-SEH field |
The 2000 Census list of Hispanic surnames became available after the 2010
SDR was conducted. This updated Hispanic surname list was therefore used to impute ethnicity for 2013
SDR new cohort cases in the frame when ethnicity was not reported. In the 2010 SDR, the
1990 Census list of Hispanic surnames was used.
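The surname-based imputation described above can be sketched as follows. This is a minimal illustration only: the function name, the source labels, and the two-name surname set are hypothetical stand-ins, not the Census file or the SDR production code, and the sketch leaves unmatched cases unresolved rather than assuming what later rules do with them.

```python
def impute_hispanic(reported, surname, hispanic_surnames):
    """Return (is_hispanic, source) for one new cohort frame case.

    reported: True/False when ethnicity was reported, None when it was not.
    hispanic_surnames: set of upper-case surnames (hypothetical stand-in
    for the Census Hispanic surname list).
    """
    if reported is not None:
        return reported, "reported"        # reported data are never overridden
    if surname.strip().upper() in hispanic_surnames:
        return True, "surname match"       # logical imputation from the list
    return None, "unresolved"              # left for other imputation rules


# Toy stand-in for the surname list; not the actual Census file.
TOY_SURNAMES = {"GARCIA", "RODRIGUEZ"}

print(impute_hispanic(None, "Garcia", TOY_SURNAMES))
print(impute_hispanic(False, "Garcia", TOY_SURNAMES))
print(impute_hispanic(None, "Smith", TOY_SURNAMES))
```

Returning the source alongside the value mirrors the idea of the source flag variables in the frame file, although the labels used here are illustrative.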
While the 2013 SDR old cohort disability status frame variable was constructed as it was
in the past, the 2010 SDR questionnaire included a new disability or functional limitation
category, which contributed to an increase in the number of disabled old cohort frame cases. In
SDR frame construction, the most recently reported data are used to construct the
disability status variable. SDR respondents who report
moderate or greater difficulty in any disability category in the survey are classified as disabled in
the subsequent round’s frame file.
Prior to the 2010 cycle, respondents could choose from four disability categories (i.e.,
difficulty with seeing, hearing, walking, or lifting). Starting with the 2010 SDR, a fifth
disability category for reporting difficulty with concentrating, remembering or making
decisions was added.
As a result of the added disability category in the 2010 SDR, the number of old cohort
frame cases classified as disabled in the 2013 SDR frame file was noticeably greater. To
assess the impact of the added category, the disability status was calculated for the 2013
SDR old cohort as it was defined for the 2010 old cohort frame cases using responses
from just the four disability categories and compared to the disability status calculated
using all five disability categories. This comparison showed that the fifth new cognitive
disability category caused an increase in the number of disabled old cohort frame cases of
7.6 percent.
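The four-category versus five-category comparison can be sketched as below. This is an illustration under stated assumptions: the item names, the 1 to 5 difficulty scale with 3 meaning "moderate", the treatment of unreported items as no difficulty, and the toy records are all hypothetical, and the toy result is not the 7.6 percent reported for the actual frame.

```python
FOUR_ITEMS = ("seeing", "hearing", "walking", "lifting")
FIVE_ITEMS = FOUR_ITEMS + ("cognitive",)   # category added in the 2010 SDR
MODERATE = 3  # assumed scale: 1 = none ... 3 = moderate ... 5 = unable


def is_disabled(responses, items):
    """True if any listed item shows moderate or greater difficulty.

    Items missing from `responses` are treated as no difficulty (assumption).
    """
    return any(responses.get(item, 1) >= MODERATE for item in items)


def compare_definitions(cases):
    """Count disabled cases under both definitions and the percent increase."""
    n4 = sum(is_disabled(c, FOUR_ITEMS) for c in cases)
    n5 = sum(is_disabled(c, FIVE_ITEMS) for c in cases)
    if n4 == 0:
        raise ValueError("no disabled cases under the four-category definition")
    return n4, n5, 100.0 * (n5 - n4) / n4


# Toy records, invented for illustration only.
toy_cases = [
    {"seeing": 3},
    {"hearing": 4},
    {"cognitive": 3},                 # disabled only under five categories
    {"walking": 1, "cognitive": 2},   # not disabled under either definition
    {"seeing": 3, "cognitive": 4},
]
n4, n5, pct = compare_definitions(toy_cases)
print(n4, n5, round(pct, 1))
```

Because the five-category definition is a superset of the four-category one, the count can only stay the same or rise, which is why the comparison is expressed as a percent increase.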
This difference was limited to the old cohort frame cases. The method for deriving
disability status for the new cohort was unchanged from the prior cycle. For new cohort
frame cases, disability status is derived from the SED variable HANDICAP. If at least
one disability was indicated at HANDICAP, the new cohort frame case was coded as
disabled. For the 2013 and 2010 SDR new cohort frame cases, the SED disability
categories have consistently been blind/visually impaired, deaf/hard of hearing,
physical/orthopedic disability, learning/cognitive disability, vocal/speech disability, and
other self-specified disability.3 New cohort frame cases not reporting disability status in
SED are imputed to be non-disabled.
Sample Selection
For the 2013 ISDR, the redefined region of birth variable (BIREGION) was used for
sorting instead of the more aggregated birth region variable (ISDRCODE) that was used
in the 2010 ISDR.
3 Note that the SED definition of disability is more encompassing than the SDR definition. The SED definition
includes a response option for vocal/speech disabilities in addition to an “other specified” response option. SDR has
neither of these. Furthermore, SED does not differentiate the degree of disability difficulty; respondents simply
report having a disability or not. In SDR, only individuals with a moderate or greater degree of disability are
considered disabled.
3. Frame Development
3.1 Sample Frame Construction
The sample frames for the NSDR and ISDR components were constructed together,
reflecting the SDR’s integrated sample design. While the target population definitions for these
two sample components are different and sample selection is done separately, the frame
construction requirements for the variables included in each frame are identical. Thus, this
section discusses the frame construction process for the NSDR and
ISDR together.
The target population of the 2013 SDR covered individuals who met the following requirements,
regardless of residency location:
Received a doctoral degree in an SEH field from a U.S. institution;
Age 75 years or younger on February 1, 2013; and
Living in a noninstitutionalized setting on February 1, 2013.
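The three target population requirements can be expressed as a simple check. The sketch below is illustrative only: the function and parameter names are hypothetical, and the inputs are assumed to be already-resolved values rather than raw frame variables.

```python
from datetime import date

REFERENCE_DATE = date(2013, 2, 1)  # 2013 SDR survey reference date
MAX_AGE = 75


def age_on(birth_date, reference_date=REFERENCE_DATE):
    """Age in completed years on the reference date."""
    years = reference_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def in_target_population(us_seh_doctorate, birth_date, institutionalized):
    """Apply the three requirements; residency location is deliberately ignored."""
    return bool(
        us_seh_doctorate
        and age_on(birth_date) <= MAX_AGE
        and not institutionalized
    )


print(in_target_population(True, date(1937, 2, 2), False))
print(in_target_population(True, date(1937, 2, 1), False))
```

The two calls differ by one day of birth: the first person is still 75 on the reference date, while the second has just turned 76.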
A sampling frame was constructed to represent the NSDR and ISDR target populations,
henceforth referred to as the 2013 SDR frame. A sampling frame is a set of elements and a set of
procedures for identifying and locating the population elements. The frame usually contains
information for sample stratification and sample selection. The goals of frame construction are
twofold: one is to represent all population elements in the frame so they all have some known
non-zero probability of being selected into the sample; the other is to define auxiliary variables
for sample stratification and survey operations. The old cohort frame was developed from the
2010 SDR sample and the new cohort frames were developed from the two most recent cohorts
added to the Doctorate Records File (DRF). The DRF is a cumulative database of all U.S.-granted research doctorates constructed using data collected from the SED, an annual census of
research doctorates awarded by U.S. academic institutions since 1920.
The 2013 SDR frame was constructed as two separate databases:
1. The old cohort frame was constructed from the 2010 SDR sample (n=45,697) including
only those eligible for 2013 SDR (n=44,602) and
2. Approximately half of the new cohort frame was constructed from the 2010 SED records
(n=48,034) including only those cases eligible for the 2013 SDR (n=35,242). The other
half of the new cohort frame was constructed from the 2011 SED records (n=49,010)
including only those cases eligible for 2013 SDR (n=36,664). Because the survey
reference date was shifted from October 1 to February 1, 2013, the fully processed DRF
was available for both new cohort years (2010 and 2011) at the time of frame-building.
Unlike previous survey rounds, when new cohort frames were constructed separately
from each SED year’s database as it became available, the SDR team was able to build a
single new cohort frame file covering both SED survey years.
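As a quick check on the counts quoted above, the eligibility rates and the combined number of eligible frame cases can be computed directly. The counts are taken from the text; the dictionary layout and function names are illustrative.

```python
# (source records, records eligible for the 2013 SDR), as quoted in the text
FRAME_SOURCES = {
    "old cohort, 2010 SDR sample": (45_697, 44_602),
    "new cohort, 2010 SED": (48_034, 35_242),
    "new cohort, 2011 SED": (49_010, 36_664),
}


def eligibility_rate(source_n, eligible_n):
    """Percent of source records that were eligible for the 2013 SDR."""
    return 100.0 * eligible_n / source_n


def frame_total(sources):
    """Sum of eligible cases across all frame sources."""
    return sum(eligible_n for _, eligible_n in sources.values())


for name, (source_n, eligible_n) in FRAME_SOURCES.items():
    rate = eligibility_rate(source_n, eligible_n)
    print(f"{name}: {eligible_n:,} of {source_n:,} eligible ({rate:.1f}%)")
print(f"total eligible frame cases: {frame_total(FRAME_SOURCES):,}")
```

The eligible counts quoted in the text sum to 116,508, of which 71,906 come from the two new cohort years.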
3.1.1 Frame File Layout
While two separate files (i.e., the old and new cohort frame files) make up the total 2013 SDR
sampling frame, the file layout for each frame file is the same and is shown in Table 3.1. The
layout describes each variable and its code frame, where feasible. Code frames for variables with longer coding
taxonomies, such as the field of study variables, can be found in Appendix A. When this occurs, it
is noted in the “Values” column in Table 3.1.
Table 3.1
2013 SDR Sample Frame File Layout

Case Identifiers
| Variable | Description | Format | Length | Values |
|---|---|---|---|---|
| SU_ID | Survey ID | Char | 8 | Randomly assigned value |
| REFID | Reference ID | Char | 9 | Randomly assigned value |
| REFID_ORIG | Reference ID prior to 2010 SDR and integration (note 1) | Char | 9 | Randomly assigned value |
| DRF_ID | DRF ID | Char | 7 | Randomly assigned value |
| DRF_ID_ORIG | DRF ID initially assigned, and subsequently dropped when a duplicate DRF entry was detected after SDR sample selection. | Char | 7 | Randomly assigned value |

Location Variables
| Variable | Description | Format | Length | Values |
|---|---|---|---|---|
| LOCSTAT13 | Most current location indicator; for old cohort cases, derived from SMLOC13 and LOCSTAT10; for new cohort cases, derived from PDUS13. | Char | 1 | 1 = Located in U.S.; 2 = Located outside U.S. |
| SMLOC13 | Most current sample member location; for located old cohort cases, derived from RESPLO3; for unlocated old cohort cases, set to 999; and for new cohort cases, derived from DRF variable PDLOC or reported address (if PDLOC is missing or unspecific). | Char | 3 | See 2010 SESTAT geocode code frame in Appendix A.1 |
| PDUS13 | Post-graduation location derived from DRF variable PDUSFOR for new cohort. | Char | 1 | 1 = Located in U.S. (includes missing in PDUSFOR); 2 = Located outside U.S.; 9 = NA, old cohort |
| LOCSTAT10 | Location status indicator for old cohort cases. | Char | 1 | 1 = Located in U.S.; 2 = Located outside U.S.; 9 = NA, new cohort |
Stratification Variables
| Variable | Description | Format | Length | Values |
|---|---|---|---|---|
| DROP13 | Disposition for 2013 round sampling | Char | 3 | See DROP13 code frame in Table 3.17 |
| STRATUM13 | 2013 Stratum assignment | Char | 3 | NSDR=001-150; ISDR=A6, C7-C9, D1-D12, F43-F70 |
| NSDRSTRAT13 | 2013 NSDR Stratum assignment regardless of sample component membership | Char | 3 | NSDR=001-150 |
| ISDRSTRAT13 | 2013 ISDR Stratum assignment regardless of sample component membership | Char | 3 | ISDR=A6, C7-C9, D1-D12, F43-F70 |
| BASEWGT10 | 2010 SDR base weight | Num | 8 | Actual base weight for panel cases from 2010, 1.0-43.0 |
| NSFGRP | NSF demographic group for NSDR | Char | 1 | 1 = Hispanic, regardless of race, citizenship at birth, and disability status; 2 = NH black, regardless of citizenship at birth and disability status; 3 = U.S. born, NH Asian regardless of disability status; 4 = NH American Indian, regardless of citizenship at birth and disability status; 5 = NH Pacific Islander, regardless of citizenship at birth and disability status; 6 = U.S. born, disabled, NH white; 7 = U.S. born, not disabled, NH white; 8 = Non-U.S. born, NH white, regardless of disability status; 9 = Non-U.S. born, NH Asian, regardless of disability status |
| ISDRGRP | ISDR demographic group | Char | 1 | 1 = U.S. citizens at birth; 2 = Hispanic, non-U.S. citizen at birth; 3 = NH black, non-U.S. citizen at birth; 4 = NH Asian, non-U.S. citizen at birth; 5 = NH white, non-U.S. citizen at birth; 6 = NH other race, non-U.S. citizen at birth |

Stratification Variables (continued)
| Variable | Description | Format | Length | Values |
|---|---|---|---|---|
| PHDFIELD | Doctoral field of study from the current DRF | Char | 3 | See DRF field of study code frame in Appendix A.2 |
| PHDFIELD_ORIG | Doctoral field of study from the DRF when initially sampled | Char | 3 | See DRF field of study code frame in Appendix A.2 |
| PHDFIELD_SDR | Doctoral field of study from the DRF updated with degree changes reported in the SDR and approved by NSF | Char | 3 | See DRF field of study code frame in Appendix A.2 |
| NSDRMED13 | SESTAT field of study code; for old cohorts, this is derived from ND2MED for respondents, and NSDRMED10 for nonrespondents; for new cohorts, this is derived from PHDFIELD | Char | 3 | See NSDRMED code frame in Appendix A.2 |
| SDRFLD15 | 15-level field of study used in sampling (formerly SDRFLD) | Char | 2 | See SDRFLD15 code frame in Appendix A.2 |
| DSTFLD8 | 8-level field of study used in sampling (formerly DSTFLD) | Char | 1 | See DSTFLD8 code frame in Appendix A.2 |
| MAJFLD7 | 7-level field of study used in sampling (formerly MAJFLD) | Char | 1 | See MAJFLD7 code frame in Appendix A.2 |
| FOD3 | 3-level field of study used in sampling | Char | 1 | See FOD3 code frame in Appendix A.2 |
| SEX13 | Sex or gender indicator | Char | 1 | 1 = Male; 2 = Female |
| HCAPIN13 | Disability status indicator | Char | 1 | Y = Disabled; N = Not disabled |
| HISPANIC13 | Hispanic ethnicity indicator | Char | 1 | 1 = Hispanic; 2 = Not Hispanic |
| HISPCAT13 | Hispanic group | Char | 1 | 1 = Mexican; 2 = Puerto Rican; 3 = Cuban; 4 = Other Hispanic |
| ASIAN13 | Asian race indicator | Char | 1 | 1 = Asian; 2 = Not Asian |
| BLACK13 | Black race indicator | Char | 1 | 1 = Black; 2 = Not Black |
| NATIVE13 | American Indian race indicator | Char | 1 | 1 = American Indian; 2 = Not American Indian |
| PACIFIC13 | Pacific Islander race indicator | Char | 1 | 1 = Pacific Islander; 2 = Not Pacific Islander |
| WHITE13 | White race indicator | Char | 1 | 1 = White; 2 = Not White |
| RACE13 | Race-only indicator, independent of ethnicity | Char | 1 | 1 = Asian; 2 = Black; 3 = American Indian; 4 = Pacific Islander; 5 = White; 6 = Multi-race |
| RACETH13 | Concatenated race/ethnicity value | Char | 20 | Concatenation of Ethnicity and Race in the form of ETH-RACE; Ethnicity: HISP, NH; Race: ASIAN, BLACK, NATIVE, PACIFIC, WHITE |
| BIRCIT13 | Citizenship at birth indicator | Char | 1 | 1 = U.S. citizen at birth; 2 = Non-U.S. citizen at birth |

Sort Variables
| Variable | Description | Format | Length | Values |
|---|---|---|---|---|
| PHDFY | Fiscal (academic) year of doctorate in the current DRF | Num | 4 | 1958-2011, cases before 1958 have missing data |
| PHDFY_ORIG | Fiscal (academic) year of doctorate from the DRF when initially sampled | Num | 4 | 1958-2011, cases before 1958 have missing data |
| SDRAYR | Fiscal (academic) year of doctorate with year changes reported in the SDR and approved by NSF | Num | 4 | 1958-2011, cases before 1958 have missing data |
| BTHST13 | Geocode for state/country of birth | Char | 3 | See 2010 SESTAT geocode code frame in Appendix A.1 |
| BIREGION | Region of birth used for sorting of the new cohort; replaces ISDRCODE from 2010 SDR | Char | 6 | See Birth Region crosswalk in Appendix A.1 |
| MOB_13 | Month of birth known at start of 2013 round | Num | 2 | 1-12, -3 = missing |
| DOB_13 | Day of birth known at start of 2013 round | Num | 2 | 1-31, -3 = missing |
| YOB_13 | Year of birth known at start of 2013 round | Num | 4 | 1934-1992, -3 = missing |
| AGE13 | Age on the 2013 SDR reference date | Num | 2 | 21-75 |
| AGEYR13 | Year of birth reported and imputed | Num | 4 | 1934-1992 |
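
Data Source Variables
| Variable | Description | Format | Length | Values |
|---|---|---|---|---|
| INSDRMED13 | SESTAT field of study code source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| ISDRAYR | Fiscal year of doctorate source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| ISEX13 | Sex source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IHCAPIN13 | Disability status source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IHISPANIC13 | Hispanic ethnicity source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IHISPCAT13 | Hispanic group source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IASIAN13 | Asian race source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IBLACK13 | Black race source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| INATIVE13 | American Indian race source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IPACIFIC13 | Pacific Islander race source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IWHITE13 | White race source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| ILOCSTAT13 | Location status source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IPDUS13 | Post-graduation location source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IBIRCIT13 | Birth citizenship source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| ICURCIT13 | Current citizenship source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IBTHST13 | Birth state/country source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |
| IAGE13 | Age source flag | Char | 2 | See Source Flag code frame in Appendix A.3 |

Operational Variables
| Variable | Description | Format | Length | Values |
|---|---|---|---|---|
| SDRTYP13 | 2013 SDR sample component assignment | Char | 4 | NSDR or ISDR |
| SAMPTYPE13 | Sample Type* | Char | 2 | 01 = 2010 Refusal; 02 = 2010 Cooperative; 03 = 2010 NIR; 05 = New Cohort; 06 = New Cohort - SED SM Refusal; 07 = New Cohort - MIL/MIR/Other nonresponse |
| SURVEY10 | Completed survey in 2010 round | Char | 1 | Y = Yes, completed survey; N = No, did not complete survey; L = new cohort |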

Operational Variables (continued)
| Variable | Description | Format | Length | Values |
|---|---|---|---|---|
| STRATUM10 | Stratum assigned in 2010 round | Char | 3 | NSDR = 001-150; ISDR=A6, C7-C9, D1-D12, F43-F70; New cohort = XXX |
| SDRTYP10 | 2010 SDR sample component assignment | Char | 4 | NSDR, ISDR, or NEW (for new cohort) |
| ORIGCOMP | Sample component or frame to which a case was initially allocated. | Char | 4 | NSDR or ISDR |
| CURCIT13 | Current citizenship indicator | Char | 1 | 1 = Currently U.S. citizen; 2 = Not U.S. citizen currently |
| PDOCSTAT | Post-graduation status in the DRF. | Char | 1 | 0 = Returning to, or continuing in, predoctoral employment; 1 = Signed contract or made definite commitment; 2 = Negotiating with a specific organization, or more than one; 3 = Seeking appointment but have no specific prospects; 4 = Other full-time degree program; 5 = Do not plan to work or study; 6 = Other; A = Has postdoctoral fellowship; 9 = Missing |
| PREVDOC | Flag to indicate if the SM has earned a U.S. research doctorate before the sampled degree according to the DRF, denoted in DRF variables PHDCOUNT and PREVDRF | Char | 1 | 1 = Sampled degree is first and only doctorate; 2 = Prior doctorate is non-SEH doctorate; 3 = Prior doctorate is SEH doctorate (ineligible) |
| PROFDEG | DRF variable that indicates if a professional degree is earned or in progress | Char | 1 | 0 = MD from U.S. institution; 1 = DVM from U.S. institution; 2 = DDS, DMD from U.S. institution; 3 = Other medical from U.S. institution; 4 = All other doctorates from U.S. institution; 5 = MD from non-U.S. institution; 6 = DVM from non-U.S. institution; 7 = DDS, DMD from non-U.S. institution; 8 = Other medical from non-U.S. institution; 9 = All other doctorates from non-U.S. institution; M = No other degree reported |
| DRF_REF | Indicator of refusal to complete DRF | Char | 1 | Y = Explicitly refused to complete SED; N = Did not explicitly refuse |
| ETHN_REF_DRF | Indicator of refused ethnicity in DRF | Char | 1 | Y = Ethnicity refused in the DRF; N = Ethnicity reported in the DRF; M = SED nonrespondent |
| RACE_REF_DRF | Indicator of refused race in DRF | Char | 1 | Y = Race refused in the DRF; N = Race reported in the DRF; M = SED nonrespondent |
DRF = Doctorate Records File; NH = Non-Hispanic.
1 In 2010 SDR, REFID was reassigned for cases originally sampled for ISDR. Prior to 2010 SDR, ISDR REFIDs started with "30"; these
cases are currently assigned REFIDs starting with "2I".
3.1.2 Missing Data Imputation Rules for Sampling Stratification and Sort Variables
While there are many variables in the sampling frame file, there are only a few sampling
stratification variables which define the strata, and only five of these may have missing data.
One sort variable is also imputed when there are missing data. The six sampling stratification
and sort variables that might have missing data are as follows:
1. RACETH13, derived from ASIAN13, BLACK13, HISPANIC13, NATIVE13,
PACIFIC13, and WHITE13
2. SEX13
3. LOCSTAT13
4. BIRCIT13
5. HCAPIN13
6. AGE13
The imputation rules and the amount of missing data for each of these sampling stratification
variables in the 2013 SDR frame file are detailed below.
RACETH13. RACETH13 was constructed from the separate race/ethnicity variables
ASIAN13, BLACK13, HISPANIC13, NATIVE13, PACIFIC13, and WHITE13 after they were
fully imputed. RACETH13 is defined in the following hierarchical manner:
If a case is Hispanic or Latino, assign the case to the Hispanic value regardless of race;
If a case is not Hispanic (NH) and is black, assign the case to the NH black value
regardless of other race selections;
If a case is not Hispanic or black, and is Asian, assign the case to the NH Asian value
regardless of other race selections;
If a case is not Hispanic, black, or Asian, and is American Indian or Alaskan Native,
assign the case to the NH American Indian value regardless of other race selections;
If a case is not Hispanic, black, Asian, or American Indian, and is Native Hawaiian or
other Pacific Islander, assign the case to the NH Pacific Islander value regardless of other
race selections; and
Otherwise, assign the case to NH white.
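The hierarchy above can be sketched in Python as follows. This is a minimal illustration rather than the production frame code: the function name, the boolean argument names, and the returned labels are assumptions made for this sketch, and the inputs are taken to be the fully imputed race/ethnicity indicators.

```python
def assign_raceth(hispanic, black, asian, native, pacific, white):
    """Return a RACETH13-style label using the hierarchy described above.

    All arguments are booleans taken from the fully imputed indicators.
    The labels are illustrative; the frame file's actual codes are not
    reproduced here.
    """
    if hispanic:
        return "Hispanic"             # Hispanic regardless of race
    if black:
        return "NH black"             # regardless of other race selections
    if asian:
        return "NH Asian"
    if native:
        return "NH American Indian"
    if pacific:
        return "NH Pacific Islander"
    # 'white' is not tested: every remaining case is assigned to NH white.
    return "NH white"
```

Because each test is applied only after the earlier ones fail, a case that selects several races is always assigned to the first matching group in the list.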
Race/ethnicity variables are reported in either the SED or the SDR. When multiple reports exist,
the most current report was used. Despite attempts to obtain this information in the SED and
SDR surveys, some data remained missing. The rules used for defining the race and
ethnicity variables in the 2013 SDR frame are as follows:
1. Use reported data from the most current version of the SDR;
2. Use reported data from the SED;
3. When ethnicity is missing, use the U.S. Census Bureau Hispanic surname list and
logically impute any matches as Hispanic ethnicity (if race is also missing and the
surname is Hispanic, impute the race to white);4
4. When race is missing, and ethnicity is either missing or non-Hispanic, use the GENESYS
Asian surname list5, and logically impute any matches as NH Asian;
5. When ethnicity is still missing, but race is reported, use place of birth to logically impute
ethnicity;
6. When race and ethnicity are both still missing, use place of birth to logically impute race
and ethnicity;
7. Where hot deck imputation exists from a past survey cycle, use the hot deck imputed
values; and
8. When race and ethnicity are both still missing and place of birth is missing, impute to NH
white.
The crosswalk of birth places to race and ethnicity imputation assignments is located in
Appendix A.1. The sources for race and ethnicity data in the 2013 SDR frame files are detailed
in Tables 3.2 and 3.3. The distribution of the resulting race/ethnicity group assignments is shown
in Table 3.4.
4 The 2013 new cohort cases were updated using the Hispanic surname list based on the 2000 U.S. Census available
as of 2011 located at http://www.census.gov/genealogy/www/data/2000surnames/index.html. The 2013 old cohort
cases were updated using the Hispanic surname list based on the 1990 U.S. Census.
5 Market Systems Group provides the GENESYS Sampling Systems suite of sampling tools, which includes this
algorithm that matches surnames to an Asian surname list for a nominal fee (http://www.m-sg.com/Web/genesys/index.aspx).
Table 3.2
Race Data Sources: 2013 SDR Frame
Race Data Source                   Total Cases  2010 Panel    2010 SED    2011 SED
Self-reported                          109,494      43,486      32,396      33,612
Surname imputation (Asian)               1,467         137         604         726
Birthplace imputation                    2,175         782         738         655
Hotdeck imputation                          51          51           0           0
Default imputation (white)               3,321         146       1,504       1,671
Overall                                116,508      44,602      35,242      36,664
Table 3.3
Ethnicity Data Sources: 2013 SDR Frame
Ethnicity Data Source              Total Cases  2010 Panel    2010 SED    2011 SED
Self-reported                          110,205      44,043      32,310      33,852
Surname imputation (Hispanic)              315          24         148         143
Birthplace imputation                    1,560         299         761         500
Hotdeck imputation                          51          51           0           0
Default imputation (non-Hispanic)        4,377         185       2,023       2,169
Overall                                116,508      44,602      35,242      36,664
Table 3.4
Race/Ethnicity Assignment: 2013 SDR Frame
Race/ethnicity Group               Total Cases    2010 SDR    2010 SED    2011 SED
Hispanic                                 7,591       3,138       2,108       2,345
NH-American Indian                         771         339         210         222
NH-Asian                                33,487      10,525      11,131      11,831
NH-Black                                 5,778       2,636       1,545       1,597
NH-Pacific Islander                        290         144          72          74
NH-White                                68,591      27,820      20,176      20,595
Overall                                116,508      44,602      35,242      36,664
SEX13. Sex is primarily obtained from the SED survey data, and is very complete. However,
starting with the 2003 SDR, cases with missing sex information completing the survey in an
online mode (i.e., telephone interview or web survey) have been asked to identify their sex. If
sex information is not in the DRF or reported in the SDR, sex data are updated with results found
through Internet searches that reveal the sample member’s sex through pictures or other
unambiguous documentation (e.g., a sample member is described with female pronouns and
thanks her husband for support in her dissertation). Any remaining missing sex data cases are
imputed to be female by default, giving these cases with unknown sex a higher probability of
selection.
The sources for the sex data in the 2013 SDR frame files are detailed in Table 3.5. The
distribution of the resulting sex assignments is shown in Table 3.6.
Table 3.5
Sex Data Sources: 2013 SDR Frame
Sex Data Source                    Total Cases    2010 SDR    2010 SED    2011 SED
Self-reported                          116,437      44,562      35,231      36,644
Verified with Internet source               52          35           6          11
Default imputation (female)                 19           5           5           9
Overall                                116,508      44,602      35,242      36,664
Table 3.6
Sex Assignment: 2013 SDR Frame
Sex Assignment                     Total Cases    2010 SDR    2010 SED    2011 SED
Male                                    70,107      28,945      20,125      21,037
Female                                  46,401      15,657      15,117      15,627
Overall                                116,508      44,602      35,242      36,664
LOCSTAT13. The LOCSTAT13 variable indicates the last known residence location of the
sample member prior to 2013 SDR sampling, either in or out of the U.S. For the located 2010
SDR panel cases, this information primarily comes from the survey for respondents and
contacting data for nonrespondents. For panel cases not found in the 2010 cycle, the last known
residence location is obtained from past SDR cycles or planned post-graduation location reported
in the SED. For the new cohort frame, LOCSTAT13 is derived only from planned post-graduation
location reported in the SED. Any cases with no residency data from either the SDR or
the SED are imputed to be in the U.S. by default. The 2010 SDR was the first cycle to use this
variable.6
6 For more details about the LOCSTAT variable development for the 2010 SDR and continued for the 2013 SDR,
see the memoranda “2010 SDR Sample Frame Development Memo #3 – Sample Member Location Variable” sent
to Daniel Foley and Steve Cohen, NSF, on April 23, 2010 from Karen Grigorian, NORC, and Brenda Cox, SRA,
and “2013 SDR Frame Decisions – Frame File Layout” sent to Lynn Milan, NSF, on September 18, 2012 and
finalized October 4, 2012 from Karen Grigorian and Lance Selfa, NORC, and Brenda Cox, SRA.
The sources for the location data in the 2013 SDR frame files are detailed in Table 3.7. The
distribution of the resulting location assignments is shown in Table 3.8.
Table 3.7
Location Data Sources: 2013 SDR Frame
Location Data Source               Total Cases    2010 SDR    2010 SED    2011 SED
SDR                                     43,488      43,488           0           0
SED                                     67,820         898      32,867      34,055
Default imputation (in the U.S.)         5,200         216       2,375       2,609
Overall                                116,508      44,602      35,242      36,664
Table 3.8
Location Assignment: 2013 SDR Frame
Location Assignment                Total Cases    2010 SDR    2010 SED    2011 SED
In the U.S.                            103,087      39,132      31,300      32,655
Out of the U.S.                         13,421       5,470       3,942       4,009
Overall                                116,508      44,602      35,242      36,664
BIRCIT13. The BIRCIT13 variable indicates the sample member’s citizenship at the time of
birth, as either “U.S.” or “non-U.S.” Citizenship information is asked in each round of the SDR,
and so for the majority of panel members, this information comes from the SDR survey. For
nonrespondents to the SDR and new cohort sample members, this information is obtained from
the SED. Cases that have never reported birth citizenship were imputed to be non-U.S. born.
The sources for birth citizenship data in the 2013 SDR frame files are detailed in Table 3.9. The
distribution of the resulting birth citizenship assignments is shown in Table 3.10.
Table 3.9
Citizenship at Birth Sources: 2013 SDR Frame
Citizenship at Birth Data Source                       Total Cases    2010 SDR    2010 SED    2011 SED
Self-reported in SDR                                        42,135      42,135           0           0
Self-reported in SED                                        69,760       2,026      33,298      34,436
Citizenship imputed from DRF with BIRTHPL and PDLOC             48          13          12          23
Default imputation (non-U.S. born)                           4,565         428       1,932       2,205
Overall                                                    116,508      44,602      35,242      36,664
Table 3.10
Citizenship at Birth Assignment: 2013 SDR Frame
Citizenship at Birth Assignment    Total Cases    2010 SDR    2010 SED    2011 SED
U.S. born                               65,388      28,430      18,284      18,674
Not U.S. born                           51,120      16,172      16,958      17,990
Overall                                116,508      44,602      35,242      36,664
HCAPIN13. The HCAPIN13 variable indicates the sample member’s most current disability
status – either disabled or not disabled. Disability information is asked in each round of the
SDR, and so for the majority of panel members, this information comes from the SDR survey.
Any SDR survey respondent who reports having a moderate or greater disability of any type (e.g.,
seeing; hearing; walking; lifting; or concentrating, remembering, or making decisions) is
considered disabled. For nonrespondents to the SDR and new cohort sample members, this
disability information is obtained from the SED. If at least one disability was indicated in the
SED disability variable HANDICAP, HCAPIN13 was coded as disabled. The SED disability
categories are blind/visually impaired, deaf/hard of hearing, physical/orthopedic disability,
learning/cognitive disability, vocal/speech disability, and other self-specified disability. Cases
never reporting disability status are imputed to be non-disabled.
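The source priority described above can be sketched as follows. This is an illustrative outline only: the function name, the argument names, and the way an absent SDR or SED report is represented (as None) are assumptions made for this sketch, not the layout of the actual survey files.

```python
def derive_hcapin(sdr_moderate_or_greater=None, sed_disabilities=None):
    """Return (status, source) for an HCAPIN13-style disability flag.

    sdr_moderate_or_greater: True/False when an SDR report exists
        (True = at least one disability rated moderate or greater);
        None when the case has no SDR report.
    sed_disabilities: collection of disability categories indicated in the
        SED variable HANDICAP; None when the case never reported in the SED.
    """
    if sdr_moderate_or_greater is not None:
        status = "disabled" if sdr_moderate_or_greater else "not disabled"
        return status, "SDR"
    if sed_disabilities is not None:
        status = "disabled" if len(sed_disabilities) > 0 else "not disabled"
        return status, "SED"
    # Cases never reporting disability status default to not disabled.
    return "not disabled", "default imputation"
```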
The sources for disability status in the 2013 SDR frame files are detailed in Table 3.11. The
distribution of the resulting disability status assignments is shown in Table 3.12.
Table 3.11
Disability Status Source: 2013 SDR Frame
Disability Status Data Source      Total Cases    2010 SDR    2010 SED    2011 SED
Self-reported in SDR                    42,126      42,126           0           0
Self-reported in SED                    65,813       1,942      31,395      32,476
Default imputation (not disabled)        8,569         534       3,847       4,188
Overall                                116,508      44,602      35,242      36,664
Table 3.12
Disability Status Assignment: 2013 SDR Frame
Disability Status Assignment       Total Cases    2010 SDR    2010 SED    2011 SED
Disabled                                 5,394       3,410         960       1,024
Not disabled                           111,114      41,192      34,282      35,640
Overall                                116,508      44,602      35,242      36,664
AGE13. The AGEYR13 variable indicates the sample member’s year of birth and is used to
create AGE13 and IAGE13. The primary sources of AGEYR13 are birth year data reported on
the SED, supplemented with birth year information collected on the SDR. Any missing data on
AGEYR13 are imputed from sample members’ bachelor’s degree year, if known, or from their
doctorate award year, which is known for all sample members. The birth year imputation rules
assume that sample members earned degrees at an age somewhat lower than average for the
population; when based on bachelor’s degree award year, sample members are assumed to be 18
when earning this degree, and when based on doctorate award year, sample members are
assumed to be 21 when earning this degree. These younger age assumptions are intentional, so as to
minimize any sample undercoverage caused by eliminating doctorates with missing birth year
who may have earned a degree at a young age. During data collection, every effort is made to
collect date of birth from sample members with an imputed birth date to confirm their eligibility
for the sample. In the next survey cycle, newly obtained unimputed birth date data replace the
imputed birth year estimate in frame construction.
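The birth year imputation rule can be written as a short sketch. The function and argument names are hypothetical; only the two age assumptions (18 at the bachelor's degree, 21 at the doctorate) come from the text above.

```python
ASSUMED_AGE_AT_BA = 18   # assumed age when the bachelor's degree was earned
ASSUMED_AGE_AT_PHD = 21  # assumed age when the doctorate was earned

def impute_birth_year(reported_birth_year, ba_year, phd_year):
    """Return (birth_year, source) following the rule described above.

    reported_birth_year and ba_year may be None when unknown; phd_year is
    known for all sample members.
    """
    if reported_birth_year is not None:
        return reported_birth_year, "reported"
    if ba_year is not None:
        return ba_year - ASSUMED_AGE_AT_BA, "BA year imputation"
    return phd_year - ASSUMED_AGE_AT_PHD, "PhD year imputation"
```

For example, a case with no reported birth year and a doctorate awarded in 2011 would be assigned a birth year of 1990 under the doctorate-based rule, which keeps the case well inside the age-eligible range until a reported value is collected.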
The sources for age in the 2013 SDR frame files are detailed in Table 3.13. The distribution of
the resulting age assignments is shown in Table 3.14.
Table 3.13
Age Source: 2013 SDR Frame
Age Data Source                    Total Cases    2010 SDR    2010 SED    2011 SED
Self-reported in SDR                    29,087      29,087           0           0
Self-reported in SED                    81,994      15,014      32,875      34,105
BA Year Imputation                       1,517         151         717         649
PhD Year Imputation                      3,910         350       1,650       1,910
Overall                                116,508      44,602      35,242      36,664
Table 3.14
Age Assignment: 2013 SDR Frame
Age Assignment                     Total Cases    2010 SDR    2010 SED    2011 SED
Under 35                                48,818       2,855      21,223      24,740
35-39                                   21,575       6,169       8,375       7,031
40-44                                   12,093       6,767       2,919       2,407
45-49                                    7,995       5,843       1,122       1,030
50-54                                    6,727       5,321         743         663
55-59                                    6,036       5,102         478         456
60-64                                    5,161       4,652         264         245
65-75                                    8,103       7,893         118          92
Overall                                116,508      44,602      35,242      36,664
SUMMARY OF SAMPLING VARIABLES DATA SOURCES. Table 3.15 summarizes the
data source type for the sampling stratification and sort variables subject to imputation. These
results are shown by variable and by the three main sample frame components.
Table 3.15
Data Source for Sample Frame Variables Subject to Imputation: 2013 SDR Frame
2013 SDR Sample Frame Cases
Sample Frame Component /           Reported Values in     Imputed from a   Assigned Default
Sample Frame Variable                 the Final Frame   Non-default Rule         Imputation
2010 SDR
  Race (RACE13)                                43,486                970                146
  Ethnicity (HISPANIC13)                       44,043                374                185
  Sex (SEX13)                                  44,562                 35                  5
  Location (LOCSTAT13)                         44,386                  0                216
  Citizenship at birth (BIRCIT13)              44,161                 13                428
  Disability status (HCAPIN13)                 44,068                  0                534
  Birth year (AGEYR13)                         44,101                501                  0
2010 SED
  Race (RACE13)                                32,396              1,342              1,504
  Ethnicity (HISPANIC13)                       32,310                909              2,023
  Sex (SEX13)                                  35,231                  6                  5
  Location (LOCSTAT13)                         32,867                  0              2,375
  Citizenship at birth (BIRCIT13)              33,298                 12              1,932
  Disability status (HCAPIN13)                 31,395                  0              3,847
  Birth year (AGEYR13)                         32,875              2,367                  0
2011 SED
  Race (RACE13)                                33,612              1,381              1,671
  Ethnicity (HISPANIC13)                       33,852                643              2,169
  Sex (SEX13)                                  36,644                 11                  9
  Location (LOCSTAT13)                         34,055                  0              2,609
  Citizenship at birth (BIRCIT13)              34,436                 23              2,205
  Disability status (HCAPIN13)                 32,476                  0              4,188
  Birth year (AGEYR13)                         34,105              2,559                  0
Overall
  Race (RACE13)                               109,494              3,693              3,321
  Ethnicity (HISPANIC13)                      110,205              1,926              4,377
  Sex (SEX13)                                 116,437                 52                 19
  Location (LOCSTAT13)                        111,308                  0              5,200
  Citizenship at birth (BIRCIT13)             111,895                 48              4,565
  Disability status (HCAPIN13)                107,939                  0              8,569
  Birth year (AGEYR13)                        111,081              5,427                  0
3.2 Old Cohort Sample Frame Construction
The 2013 SDR old cohort population is composed of doctorates who received their SEH degree
prior to July 2010. The old cohort frame is a secondary frame because it is derived from the
panel sample and each frame member carries a sampling weight to represent the old cohort
population. The frame construction process for the 2013 old cohort was relatively simple. As
noted in Subsection 3.1, SDR survey responses were used to update the sample frame variables,
whenever possible, and to determine eligibility for either NSDR or ISDR frame inclusion. The
following subsections provide the NSDR and ISDR old cohort frame definitions and show the
final eligibility status of the 2010 SDR sample for the 2013 cycle.
3.2.1 NSDR Old Cohort Frame Definition
The 2013 NSDR old cohort frame was derived from the 2010 NSDR sample consisting of 40,000
cases. The 2013 NSDR old cohort frame included only cases that met the 2013 SDR target
population requirements (e.g., received a doctoral degree in an SEH field from a U.S. institution,
age 75 years or younger on the survey reference date of February 1, 2013, and living in a
noninstitutionalized setting on the reference date) and were last located in the U.S. or one of its
territories as defined by the LOCSTAT13 frame variable.
3.2.2 ISDR Old Cohort Frame Definition
The 2013 ISDR old cohort frame was derived from both the 2010 NSDR and ISDR samples. All
2010 ISDR cases that met the 2013 SDR target population requirements were included in the
2013 ISDR old cohort frame and selected for the 2013 ISDR sample with certainty.
Additionally, all 2010 NSDR cases that met the 2013 SDR target population requirements and
were last located outside of the U.S. or one of its territories (as defined by the LOCSTAT13
variable) were also included in the 2013 ISDR frame and selected for the sample with certainty.
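The old cohort frame definitions in Subsections 3.2.1 and 3.2.2 amount to a small decision rule, sketched below. The function name, the argument names, and the strings used for the 2010 sample type and for the LOCSTAT13 values ("in_us", "out_of_us") are hypothetical stand-ins chosen for this sketch.

```python
def assign_old_cohort_frame(sample_2010, meets_target_population, last_location):
    """Return 'NSDR', 'ISDR', or None for a 2010 SDR sample case.

    sample_2010: '2010 sample type, either 'NSDR' or 'ISDR'.
    meets_target_population: True if the case meets the 2013 SDR target
        population requirements.
    last_location: stand-in for LOCSTAT13, either 'in_us' or 'out_of_us'
        (hypothetical codes).
    """
    if sample_2010 not in ("NSDR", "ISDR"):
        raise ValueError(f"unexpected 2010 sample type: {sample_2010!r}")
    if last_location not in ("in_us", "out_of_us"):
        raise ValueError(f"unexpected location value: {last_location!r}")
    if not meets_target_population:
        return None                # ineligible cases enter neither frame
    if sample_2010 == "ISDR":
        return "ISDR"              # all eligible 2010 ISDR cases stay in the ISDR
    # Eligible 2010 NSDR cases are split by last known location.
    return "NSDR" if last_location == "in_us" else "ISDR"
```

Every case returned as "ISDR" by this rule is selected for the 2013 ISDR sample with certainty, as stated above.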
3.2.3 2010 SDR Final Eligibility Status and Frame Assignment
Table 3.16 shows the 2013 SDR old cohort frame status for all 2010 SDR sampled cases.
Ultimately, there are 44,602 cases included in the 2013 SDR old cohort frame – 38,424 included
in the 2013 NSDR old cohort frame and 6,178 included in the 2013 ISDR old cohort sample.
Table 3.16
2013 SDR Old Cohort Frame Status by 2010 SDR Sample Type
                                                                    2010 SDR Sample
2013 SDR Old Cohort Frame Status                            Total     NSDR     ISDR
Eligible                                                   44,602   38,968    5,634
  00  NSDR Frame Eligible                                  38,424   38,424        0
  00  ISDR Frame Eligible (selected with certainty)         6,178      544    5,634
Ineligible                                                  1,095    1,032       63
  01  Age ineligible                                          855      816       39
  07  Age ineligible, according to BA year or PhD year          1        1        0
  11  Non-SEH doctoral degree field per SDR                    13        8        5
  12  No doctorate degree earned per SDR                        1        1        0
  13  Duplicate case per SDR                                    0        0        0
  13b Double Doctorate; first SEH doctorate earned              0        0        0
      before SED 2010/2011
  14  Frame ineligible, not otherwise defined                   1        1        0
  15  Deceased per SDR                                        196      179       17
  16  Terminally ill per SDR                                   26       24        2
  19  Institutionalized two consecutive SDR cycles              2        2        0
Overall                                                    45,697   40,000    5,697
Further, Table 3.17 shows the 2013 SDR eligibility status of all 186,111 cases ever included in
the SDR sample since its inception in 1973. Cases are classified as one of three types: (1)
eligible, (2) permanently ineligible, or (3) eligible, but deselected (sampled out) in a previous
SDR cycle. Note that permanently ineligible cases met the SDR eligibility criteria at one point
in time, but due to changed circumstances became ineligible and are expected to never become
eligible again (e.g., the case is over age 75 or known to be deceased).
It is important to note that, in addition to the 44,602 cases that were eligible for either the 2013
NSDR old cohort frame or 2013 ISDR old cohort sample, there were 2,824 cases classified as
permanently ineligible that would have been age 75 or younger on the survey reference date.
These 2,824 cases were not included in sampling, but were retained for later use in response rate
calculations and weighting adjustments of the age eligible SDR population.
Table 3.17
All Cases Ever Included in SDR by 2013 SDR Frame Status
2013 SDR Old Cohort Frame Status                                   Total Cases  Percent
Eligible for Old Cohort Frame Inclusion                                 44,602   23.97%
  00  Eligible for 2013 NSDR Panel Sample Frame                         38,424   20.65%
  00  Eligible for 2013 ISDR Panel Sample Frame                          6,178    3.32%
Age Ineligible for Old Cohort Frame Inclusion                           34,734   18.66%
  01  Age ineligible                                                    29,037   15.60%
  07  Age ineligible, according to BA year or PhD year                   3,112    1.67%
Age Eligible, Otherwise Ineligible for Old Cohort Frame Inclusion        2,824    1.52%
  11  Non-SEH doctoral degree field per SDR                                 78    0.04%
  12  No doctorate degree earned per SDR                                    80    0.04%
  13  Duplicate case per SDR                                                30    0.02%
  13b Double Doc; first SEH doctorate earned before SED 2010/2011            3    0.00%
  14  Frame ineligible, not otherwise defined                               26    0.01%
  02  Permanently out of scope per SDR, not otherwise defined              110    0.06%
  15  Deceased per SDR                                                   1,108    0.60%
  16  Terminally ill per SDR                                                95    0.05%
  19  Institutionalized two consecutive SDR cycles                           6    0.00%
  04  Non-US citizen, out of country 1993-1997 (dropped in 1999)           396    0.21%
  05  Non-US citizen, out of country 1995-1997 (dropped in 1999)            71    0.04%
  06  Non-US citizen, out of country (dropped in 1997)                     391    0.21%
  17  Non-US citizens, out of country (dropped in 2003)                    128    0.07%
  18  Non-US citizens, out of country (dropped in 2001)                    297    0.16%
  20  Other permanent ineligible in 1995, not otherwise defined              5    0.00%
Deselected Through Sampling                                            103,951   55.85%
  21  Deselected in sampling 1973-1995 SDR                              51,707   27.78%
  22  Deselected in 1997 sampling                                        2,976    1.60%
  23  Deselected in 1999 sampling                                       15,256    8.20%
  24  Deselected in 2001 sampling                                        2,930    1.57%
  26  Deselected in 2003 sampling                                        2,854    1.53%
  28  Deselected in 2006 sampling                                          776    0.42%
  29  Deselected in 2008 sampling                                        4,968    2.67%
  30  Deselected in 2010 sampling                                          724    0.39%
  25  Humanities sample dropped from SDR sample                         21,760   11.69%
Overall                                                                186,111  100.00%
3.2.4 Evaluation of Old Cohort Frame Strata Assignments
In a longitudinal survey sampling frame, it is desirable for the variables used to stratify the
sample to remain consistent over time, resulting in consistent strata assignments. Changes to
stratification assignment should be justified. This is also true of the SDR.
All 2013 SDR old cohort frame cases which changed strata assignment from their 2010 strata
assignment were evaluated to ensure that the change was accurate and correct. There were a
total of 2,623 out of 44,602 old cohort frame eligible cases (5.9 percent) that changed strata
assignment from 2010 to 2013. Some changes are expected as the SDR sample design updates
stratification variables with the most current reported data and actively seeks to replace imputed
data with reported data.
As is usually the case for the SDR, the primary reason for strata assignment changes in the 2013
frame is differences in disability status coded from 2010 survey responses. Typically, an
equivalent number of old cohort cases switch disability status to and from being disabled.
However, in the 2013 SDR old cohort frame, a greater proportion of cases became disabled as a
result of the change to the disability question in the 2010 survey (for more details see Section 2).
The secondary reason for stratification assignment change resulted from a change in the sample
member’s location. Table 3.18 details the reasons why 2,623 of the 2013 SDR old cohort frame
cases changed from their 2010 SDR strata assignment.
Table 3.18
Reason for Strata Assignment Change from 2010 to 2013 SDR
                                                                                       2010 and 2013 Frame Components
Code Reason for Strata Change                                             Overall  Both NSDR  Both ISDR  NSDR to ISDR
01   Only location changes to out of U.S., no other demographic changes       511          0          0           511
02   Became disabled                                                          928        925          0             3
03   Became not disabled                                                      580        577          0             3
04   Revised sex                                                                9          7          0             0
05   Birth citizenship changed                                                115         96         14             5
06   Field of study changed                                                   164        147         11             6
07a  Race/ethnicity changed from 2010 survey                                   66         34         18            14
07b  Race/ethnicity changed with 2001 reported data                           205        191         12             2
09a  Birth citizenship and race/ethnicity changed                              24         23          1             0
09b  Birth citizenship and field of study changed                               5          5          0             0
10a  Disability status and race/ethnicity changed                              10         10          0             0
10b  Disability status, race/ethnicity, and field of study changed              2          2          0             0
11   Race/ethnicity and field of study changed                                  4          4          0             0
Total                                                                       2,623      2,021         56           544
3.3 New Cohort Sample Frame Construction
As noted previously in Subsection 3.1, the data source for constructing the 2013 SDR new cohort
frame was the two most recent doctoral cohorts included in the DRF from the 2010 and 2011
SED rounds.
As with the old cohort frame, cases considered eligible for the 2013 SDR new cohort frame
needed to first meet the 2013 SDR target population requirements of having received a doctoral
degree in an SEH field from a U.S. institution, being 75 years or younger on the survey reference
date of February 1, 2013, and living in a noninstitutionalized setting on the reference date. The
variable LOCSTAT13 was used to assign the target population eligible cases into either the
NSDR or the ISDR new cohort frames. Table 3.19 shows the 2013 SDR new cohort frame status
for all 2010 and 2011 SED cases.
Table 3.19
2013 SDR New Cohort Frame Status by SED Cohort
                                                                                   SED Cohort
2013 SDR New Cohort Frame Status                                      Total     2010     2011
Eligible                                                             71,906   35,242   36,664
  00  NSDR Frame Eligible                                            63,955   31,300   32,655
  00  ISDR Frame Eligible                                             7,951    3,942    4,009
Ineligible                                                           25,138   12,792   12,346
  01  Age ineligible                                                     10        5        5
  03  Deceased, according to the DRF                                     12        6        6
  11  Non-SEH doctoral degree field                                  25,056   12,753   12,303
  13b Double Doc; first SEH doctorate earned before SED 2010/2011        60       28       32
Overall                                                              97,044   48,034   49,010
4. Sample Stratification
Sample stratification for the 2013 SDR sample design is identical to the approach used for the
2010 SDR. The NSDR portion of the frame was stratified into 150 strata and the ISDR portion
was stratified into 44 strata. The NSDR and ISDR sampling frames are stratified and the sample
allocated separately. Cases are assigned to the NSDR or the ISDR sampling frame based on the
target population definitions, which use the predicted residency location of in or out of the U.S. (as
defined by the frame variable LOCSTAT13). For the detailed definition of LOCSTAT13, see
page 15 of Subsection 3.1.2 of this report.
4.1 NSDR Sample Stratification
The 2013 NSDR frame contained 38,424 panel and 63,955 new cohort members. The NSDR
stratification scheme is presented in Appendix Table B.1 along with the distribution of the
sampling frame by stratum. The NSDR stratification approach introduced in the 2003 cycle has
been continually implemented through the 2013 cycle with one minor exception. The 2003 and
2006 NSDR cycles included missing race strata; these strata were eliminated for the 2008, 2010,
and 2013 NSDR designs when logical imputation rules were used to impute missing
race/ethnicity data during sampling frame development when this information was not
previously reported in the SDR or SED (see page 13 in Subsection 3.1.2 of this report for the
detailed race/ethnicity imputation rules). Strata were defined based upon the cross of
demographic group by gender by degree field.
Degree field was collapsed in varying ways depending upon the population size of doctorates in
the demographic group, resulting in a total of 150 explicit strata. Within each stratum, the data
records were sorted by citizenship, disability status, degree field, and year of degree receipt prior
to sample selection. This created an implicit stratification of the sample within each explicit
stratum to ensure the sample selected is balanced on these factors.
4.1.1 Demographic Group Recode
Demographic group is a composite variable based upon U.S. citizenship at birth, race/ethnicity,
and disability status with collapsing as needed for small populations. After collapsing, the
demographic group stratification variable was defined as follows:
1. Hispanics, regardless of race, citizenship at birth and disability status;
2. NH blacks, regardless of citizenship at birth and disability status;
3. U.S. citizen at birth, NH Asians (excluding Hawaiians and Pacific Islanders) regardless
of disability status;
4. NH American Indians (including Alaskan natives), regardless of citizenship at birth and
disability status;
5. NH Pacific Islanders (including native Hawaiians), regardless of citizenship at birth and
disability status;
6. U.S. citizen at birth, disabled, NH whites;
7. U.S. citizen at birth, non-disabled NH whites;
8. Non-U.S. citizen at birth, NH whites regardless of disability status; and
9. Non-U.S. citizen at birth, NH Asians regardless of disability status.
These nine groups were defined in a hierarchical manner as the group definitions imply. For
example, all Hispanics belong to the first demographic group regardless of other demographic
characteristics. Similarly, all NH blacks belong to the second demographic group regardless of
other characteristics.
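The nine-group recode can be sketched as a single function. This is an illustration under stated assumptions: the function name, the argument names, and the race/ethnicity label strings are hypothetical, and the race/ethnicity input is taken to be the already derived RACETH13-style assignment.

```python
def demographic_group(raceth, us_citizen_at_birth, disabled):
    """Return the NSDR demographic group number (1-9) listed above.

    raceth: one of 'Hispanic', 'NH black', 'NH Asian', 'NH American Indian',
        'NH Pacific Islander', 'NH white' (illustrative labels).
    us_citizen_at_birth, disabled: booleans.
    """
    if raceth == "Hispanic":
        return 1
    if raceth == "NH black":
        return 2
    if raceth == "NH American Indian":
        return 4
    if raceth == "NH Pacific Islander":
        return 5
    if raceth == "NH Asian":
        # Disability status is ignored for NH Asians.
        return 3 if us_citizen_at_birth else 9
    if raceth == "NH white":
        if not us_citizen_at_birth:
            return 8               # disability status ignored
        return 6 if disabled else 7
    raise ValueError(f"unexpected race/ethnicity label: {raceth!r}")
```

Disability status affects the result only for NH whites who were U.S. citizens at birth, and citizenship at birth affects it only for NH Asians and NH whites, which mirrors the collapsing described above.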
4.1.2 Degree Field Recodes
As for the 2003 to 2010 NSDR, the 2013 NSDR used two degree field recodes for stratifying
different demographic groups. The first recode is the 15-category SDR degree field variable
(SDRFLD15) which was used to stratify the three largest demographic groups: (7) U.S. citizens
at birth, nondisabled NH whites; (8) non-U.S. citizens at birth, NH whites; and (9) non-U.S.
citizens at birth NH Asians. The second recode is the 7-category SESTAT major degree field
variable (MAJFLD7) that was used to stratify the remaining demographic groups except for
American Indians and Pacific Islanders which were not stratified by degree field. The mapping
of both degree field recode variables to the detailed SED degree field code frame can be found in
Appendix A.2 2013 SDR Field of Study Coding Taxonomies Crosswalk.
The SDR degree field recode has the following 15 categories:
1. Chemistry;
2. Physics/astronomy;
3. Earth/ocean/atmospheric sciences;
4. Mathematics;
5. Computer and information sciences;
6. Agricultural sciences;
7. Medical sciences;
8. National Institutes of Health (NIH) biological sciences;
9. Other biological sciences;
10. Psychology;
11. Economics;
12. Anthropology/archeology/sociology;
13. Other social sciences;
14. Electrical/electronics/communications engineering; and
15. Other engineering.
The SESTAT major degree field recode has these seven categories:
1. Information sciences/mathematics and statistics;
2. Biological and agricultural sciences;
3. Health sciences;
4. Physical and related sciences;
5. Social sciences;
6. Psychology; and
7. Engineering.
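As a consistency check, the stated total of 150 NSDR strata can be reproduced from the rules in Subsections 4.1.1 and 4.1.2. The sketch below assumes that every demographic group is crossed with two gender categories and with the full set of degree field categories named for that group, with no further collapsing; Appendix Table B.1 remains the authoritative list of strata. The constant and function names are hypothetical.

```python
GENDER_LEVELS = 2  # male, female

# Degree field categories used to stratify each demographic group (1-9).
FIELD_LEVELS_BY_GROUP = {
    1: 7,   # Hispanics: MAJFLD7
    2: 7,   # NH blacks: MAJFLD7
    3: 7,   # U.S. citizen at birth, NH Asians: MAJFLD7
    4: 1,   # NH American Indians: not stratified by degree field
    5: 1,   # NH Pacific Islanders: not stratified by degree field
    6: 7,   # U.S. citizen at birth, disabled NH whites: MAJFLD7
    7: 15,  # U.S. citizen at birth, non-disabled NH whites: SDRFLD15
    8: 15,  # Non-U.S. citizen at birth, NH whites: SDRFLD15
    9: 15,  # Non-U.S. citizen at birth, NH Asians: SDRFLD15
}

def count_nsdr_strata():
    """Count strata as demographic group x gender x degree field."""
    return sum(GENDER_LEVELS * fields for fields in FIELD_LEVELS_BY_GROUP.values())

print(count_nsdr_strata())  # 150
```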
4.2 ISDR Sample Stratification
The 2013 ISDR frame contained 6,178 panel and 7,951 new cohort members. As noted above,
the 2013 ISDR stratification approach was identical to the one used for the 2010 ISDR
developed for the integrated sample design. The 2013 ISDR stratification scheme is presented in
Appendix Table C.1 along with the distribution of the sampling frame by stratum.
The 2013 ISDR strata were defined by the cross of U.S. versus non-U.S. citizen at birth by
race/ethnicity by gender by degree field. Race/ethnicity was defined as Hispanic, NH black, NH
Asian, NH white, and NH other races, where other races combines American Indians and Pacific
Islanders. U.S. citizens at birth were collapsed over race/ethnicity in stratification. Gender was
defined as male and female. The 2013 ISDR collapsed over gender for non-U.S. citizens at birth
that were of NH-black and NH-other races. The 7-category SESTAT major degree field recode
(MAJFLD7) was used in stratification for non-U.S. citizens at birth that were NH Asians and NH
whites. A three-category degree field recode, FOD3, was used to stratify U.S. citizens at birth and
Hispanics and NH blacks that were non-U.S. citizens at birth. The 3-category degree field
recode (FOD3) has these categories:
1. Computer and information sciences, mathematics, physical sciences, and engineering;
2. Biological and agricultural sciences, and health sciences; and
3. Psychology and social sciences.
Non-U.S. citizens at birth whose race was NH other races were collapsed over field of degree as
well as gender.
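The stated total of 44 ISDR strata can be reproduced from the collapsing rules above. The sketch assumes that each cell listed below is crossed fully with its gender and degree field categories and that no other collapsing is applied; Appendix Table C.1 remains the authoritative list. The names used in the code are hypothetical.

```python
# (cell description, gender levels, degree field levels)
ISDR_CELLS = [
    ("U.S. citizen at birth, all race/ethnicity groups", 2, 3),  # FOD3
    ("Non-U.S. citizen at birth, Hispanic", 2, 3),               # FOD3
    ("Non-U.S. citizen at birth, NH black", 1, 3),               # gender collapsed, FOD3
    ("Non-U.S. citizen at birth, NH Asian", 2, 7),               # MAJFLD7
    ("Non-U.S. citizen at birth, NH white", 2, 7),               # MAJFLD7
    ("Non-U.S. citizen at birth, NH other races", 1, 1),         # gender and field collapsed
]

def count_isdr_strata():
    """Count strata as the sum of gender x degree field within each cell."""
    return sum(genders * fields for _, genders, fields in ISDR_CELLS)

print(count_isdr_strata())  # 44
```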
The sort order for frame records within each ISDR stratum was defined based upon SESTAT
major degree field and then by place of birth.
The place of birth sorting approach was introduced as a part of the 2006 ISDR redesign which
redefined the ISDR strata used in sample selection after comparing race/ethnicity to country of
origin (Cox, Grigorian, and Yang, 2006). To control for country of origin in sampling, the ISDR
used a 12-level subcontinent of birth as a sort variable for stratum records in the 2006, 2008, and
2010 cycles. The subcontinent code frame was as follows and was used in this sort order:
Oceania, Europe, Canada, Mexico and Central America, South America, Central Africa, South
Africa, North Africa, Middle East, Southeast Asia, and Northern Asia. For the 2013 cycle, the
subcontinent variable was replaced with the more detailed birth region variable shown in
Appendix A.1. The revised region of birth variable offered more control in sorting. Note that the
sort order for both the subcontinent and the birth region variables was chosen so that adjacent
locations would tend to have similar ethnicity/race characteristics.
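The effect of sorting before selection can be illustrated with a minimal sketch. It assumes a systematic selection is applied to the sorted stratum records; the field names, the made-up codes, and the fixed start value are hypothetical and are not taken from the SDR frame specification.

```python
import math

def sort_stratum(records):
    """Order frame records by degree field, then by birth region sort code."""
    return sorted(records, key=lambda r: (r["degree_field"], r["birth_region"]))

def systematic_sample(ordered, n, start):
    """Take n records at a fixed interval from an already sorted list.

    start is a fraction in [0, 1); in practice it would be drawn at random.
    """
    if not 0 <= start < 1:
        raise ValueError("start must be in [0, 1)")
    n = min(n, len(ordered))
    if n == 0:
        return []
    interval = len(ordered) / n
    last = len(ordered) - 1
    return [ordered[min(last, math.floor((start + k) * interval))] for k in range(n)]

# Hypothetical stratum of 12 records with made-up codes.
frame = [
    {"id": i, "degree_field": i % 3 + 1, "birth_region": (i * 7) % 5}
    for i in range(12)
]
ordered = sort_stratum(frame)
sample = systematic_sample(ordered, 4, 0.5)
print([r["id"] for r in sample])
```

Because the selections are spread evenly through the sorted list, the sample tends to mirror the stratum's mix of degree fields and birth regions, which is the implicit stratification the sort order is meant to provide.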
5. Sample Size
The sample size requirements for the 2013 SDR were consistent with those used in the 2010
SDR resulting in 47,078 sampled cases. The NSDR retained its usual sample size of 40,000
doctorates. As was introduced in the 2010 SDR design, the NSDR frame size was reduced by
transferring potential non-U.S. residents to the ISDR frame using the Last Location Rule.7 The
ISDR panel frame size increased as a consequence of the addition of these transferred NSDR
panel members and the ISDR continued its practice of sampling panel members with certainty.
Thus, the transfer of panel cases from the NSDR to the ISDR frame resulted in a sample size
increase for panel members in the ISDR sample component. However, the ISDR new cohort
sample size was set to 900 as it was for the 2006, 2008, and 2010 survey cycles.
5.1 NSDR Sample Size
The NSDR sample size is defined in terms of sampled doctorates after permanent ineligibles
have been removed from the frame prior to sample allocation.8 Of the cases that become
permanent ineligible, most are panel members that will be age 76 or older on the reference date
for the next survey cycle. In addition, a small number of the remaining age eligible panel
members are permanently ineligible for one or more of the following reasons:
Deceased,
Terminally ill/permanently incapacitated,
No earned doctorate,
Earned doctorate after the new cohort academic years (June 30, 2011 for 2013 SDR), or
Earned doctorate in a non-SEH field.
Of the age-eligible permanent ineligible cases, most are deceased.
7 The Last Location Rule categorizes cases, both U.S. citizens and non-U.S. citizens, as likely to be permanent non-U.S. residents when their residence was outside the U.S. in the previous survey cycle; panel frame cases previously
included in the NSDR sample component transfer with certainty to the ISDR frame if they were last found outside of
the U.S.
8 A panel member is defined to be permanent ineligible when they are not now and will never again be a member of
the SDR target population for a future survey cycle. Prior to 1999, permanent ineligibles (other than age ineligibles)
were left in the frame and the desired sample size was expanded to account for their presence.
Beginning with the 1999 NSDR, the desired sample size for most survey cycles has been 40,000.
An exception was made for the 2006 NSDR, which had a sample size of 42,955 expanded to
accommodate three new SED cohorts. For the 2008 NSDR, NSF decided to return to the
standard sample size of about 40,000 cases, which was also used for the 2010 and 2013 NSDR.
Because new cohort population counts were available for both cohorts, the final sample size was
exactly 40,000 (see Section 7 Sample Selection for details).
For the 2013 NSDR, we followed the 2010 procedures for defining the minimum desired sample
size of completed interviews per stratum. For each stratum, the NSF has specified use of a
minimum sample size that is the equivalent of 60 completed interviews, except for the two
American Indian strata where a minimum sample size equivalent of 150 interviews has been set.
This requirement recognizes that some strata are so small that after accounting for the finite
population size effect on precision, far fewer than the specified number of completed interviews
may be needed to achieve the desired precision level. In comparison to an infinite population, a
finite population of size $N$ has its variance for a sample of size $n$ reduced by a finite population
correction factor (fpc) of $(N - n)/N$. To meet this precision requirement, the minimum stratum
sample size for most strata was set to the equivalent of the number of completed interviews when
adjusted for the stratum’s fpc. Under this approach the minimum sample size allocated to
stratum $h$ is set to

$$ n_h = \frac{n_h' \, N_h}{n_h' + N_h} $$

where $N_h$ is the population size for stratum $h$, and $n_h'$ is either 60 or 150 depending on the
stratum receiving the minimum assignment. Note that $n_h$ will be smaller than $n_h'$, which reflects
the reduction in variance due to the fpc. In other words, a sample of $n_h$ cases in a stratum of
limited size is equivalent to a sample of $n_h'$ cases in an extremely large stratum. The effect of
ignoring the fpc would be to overestimate the minimum required sample size for a stratum.
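As an illustration of the fpc adjustment, the sketch below assumes the fpc has the form 1 - n/N, so that the adjusted size n_h solves (1/n_h)(1 - n_h/N_h) = 1/n'_h and equals n'_h N_h / (n'_h + N_h). The stratum population sizes are hypothetical, and rounding up to a whole case is an assumption rather than a documented rule.

```python
import math

def fpc_adjusted_minimum(n_target, stratum_pop):
    """Return n'_h * N_h / (n'_h + N_h), the fpc-adjusted equivalent of n'_h."""
    return n_target * stratum_pop / (n_target + stratum_pop)

# Hypothetical stratum population sizes with the 60-interview target.
for pop in (200, 500, 5000):
    n_h = fpc_adjusted_minimum(60, pop)
    # Rounding up to a whole case is an assumption made for this sketch.
    print(pop, round(n_h, 1), math.ceil(n_h))
```

The adjustment matters most for small strata: as the stratum population grows, the adjusted minimum approaches the unadjusted target.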
Appendix B.2 shows the estimated population size for each stratum, the desired respondent
sample size with and without fpc adjustment, and when the stratum sample size was set to the
minimum respondent sample size with fpc adjustment. Note that the minimum respondent
sample sizes were defined in terms of completed interviews with eligible doctorates. For the
2013 NSDR, yield rates (number of completed interviews with eligible doctorates divided by
sample size) were estimated for the cross of demographic group by gender based upon the 2010
NSDR data collection experience for the equivalent frame population.9 For strata with their
sample sizes set to the minimum respondent sample size, the yield rates shown in Appendix B.2
were used to determine the total sample cases to be selected from the stratum.
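The conversion from a minimum number of completed interviews to a number of cases to select can be sketched as follows. The yield rates used here are hypothetical (the actual rates are in Appendix B.2), and rounding up to a whole case is an assumption.

```python
import math

def cases_to_select(min_completes, yield_rate):
    """Inflate a completed-interview target by the expected yield rate.

    yield_rate = completed interviews with eligible doctorates / sample size.
    """
    if not 0 < yield_rate <= 1:
        raise ValueError("yield_rate must be in (0, 1]")
    return math.ceil(min_completes / yield_rate)

# Hypothetical yield rates, for illustration only.
print(cases_to_select(54, 0.75))
print(cases_to_select(60, 0.5))
```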
This minimum stratum sample size requirement was introduced in the 2003 NSDR redesign
(Cox 2003, Yang et al. 2004). That redesign also redefined the strata so that they conformed
more closely to analysis domains used in reporting, particularly with respect to the collapsing of
very small race/ethnicity groups over degree fields to achieve strata with populations of
sufficient size for reporting.10 Small race/ethnicity by sex domains such as Hispanics and NH blacks have an additional domain sample size supplement that increases the sample size for the
individual strata within the domain and for the overall domain (see Table 6.1 in the next section).
5.2 ISDR Sample Size
The actual 2013 ISDR sample component size is 7,078 cases. The ISDR new cohort sample size
remained the same as in 2010 at 900 cases and the panel sample size increased from 4,797 in
2010 to 6,178 in 2013. The historical development of the 2013 ISDR panel sample can be
described as follows:
600 cases selected for the 2003 ISDR from the 2001 and 2002 SED new cohorts that
were non-U.S. citizens reporting plans to emigrate after graduation,
900 cases selected for the 2006 ISDR from the 2003, 2004 and 2005 SED new cohorts
that were non-U.S. citizens reporting plans to emigrate after graduation,
156 non-U.S. citizen cases removed from the 2006 NSDR frame for being abroad for two
consecutive rounds and transferred to the 2006 ISDR sample,
948 cases selected for the 2008 ISDR from the 2006 and 2007 SED new cohorts that
were non-U.S. citizens reporting plans to emigrate after graduation,
9 Yield rates and not response rates were used because we had to account for loss due to ineligibility and
nonresponse.
10 Generally, the strata represent populations of size 500 or more. A few strata were allowed to have smaller
population sizes to prevent excessive collapsing over degree fields.
228 non-U.S. citizen cases removed from the 2008 NSDR frame for being abroad for two
consecutive rounds and transferred to the 2008 ISDR sample,
900 cases selected for the 2010 ISDR from the 2008 and 2009 SED new cohorts that
reported plans to emigrate (without regard to citizenship),
15 ISDR panel cases determined to be permanently ineligible in the 2003 to 2008 cycles
removed from the 2010 eligible frame;
1,980 cases with most recent location outside the U.S. transferred from the 2010 NSDR
frame to the 2010 ISDR frame,
63 ISDR panel cases determined to be permanently ineligible in the 2010 cycle removed
from the 2013 eligible frame; and
544 cases with most recent location outside the U.S. transferred from the 2013 NSDR
frame to the 2013 ISDR frame.
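The additions and removals listed above reconcile with the reported panel sizes. The short check below uses only the counts given in the list; treating the 2010 panel as everything except the 2010 new cohort selection and the 2013 changes is an interpretation made for this sketch.

```python
# (event, change in ISDR panel cases), in the order listed in the text.
history = [
    ("2003 new cohort selection", 600),
    ("2006 new cohort selection", 900),
    ("2006 transfer from NSDR frame", 156),
    ("2008 new cohort selection", 948),
    ("2008 transfer from NSDR frame", 228),
    ("2010 new cohort selection", 900),
    ("permanent ineligibles removed for 2010", -15),
    ("2010 transfer from NSDR frame", 1980),
    ("permanent ineligibles removed for 2013", -63),
    ("2013 transfer from NSDR frame", 544),
]

panel_2013 = sum(change for _, change in history)

# Assumed reading: the 2010 panel excludes the 2010 new cohort selection
# (which joins the panel in 2013) and the two 2013 changes.
excluded = {
    "2010 new cohort selection",
    "permanent ineligibles removed for 2013",
    "2013 transfer from NSDR frame",
}
panel_2010 = sum(change for event, change in history if event not in excluded)

new_cohort_2013 = 900
print(panel_2010, panel_2013, panel_2013 + new_cohort_2013)
```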
Once transferred into or sampled for the ISDR, panel cohorts have remained in the sample for
future survey cycles. At present, the intention is to build up the longitudinal ISDR panel over
several cycles and to establish a fixed sample size for this sample component when the
characteristics of international residents are better understood.
6. Sample Allocation
The 2013 SDR used essentially the same basic approach for sample allocation as the 2010 SDR.
However, one change to a sampling stratification variable noted previously did have an impact
on the 2013 SDR sample allocation. Specifically, the stratification variable measuring disability
was modified to include cognitive disabilities in the 2013 sampling frame which expanded the
number of frame members classified as disabled and increased the population sizes for the U.S.
born, non-Hispanic white disabled strata.
6.1 Background on NSDR Sample Allocation Procedures
This section provides historical background on the development of the sample allocation
procedures for the 2013 NSDR as they relate to the current sample design.
6.1.1 Introduction of the Maintenance Cut
Prior to 1995, the NSDR retained all eligible panel members in the sample with certainty and
then selected a sample from the new cohort frame for each stratum to update the sample
coverage for the current survey cycle. As a consequence the NSDR sample size increased
steadily over time resulting in unacceptable increases in the total survey costs (Mitchell,
Moonesinghe, and Cox, 1998).
In the 1995 survey cycle, the NSDR introduced the concept of a maintenance cut which required
that the total sample size of selected new cohorts and panel members be fixed to a pre-specified
number of attempted interviews in that survey cycle (Moonesinghe, 1998). Each subsequent
survey cycle has implemented a maintenance cut, although the total specified sample size has
varied over time. Since 1999, the total NSDR sample size has been fixed at 40,000 attempted
interviews, with the exception of the 2006 cycle which had a sample size of 42,955 expanded to
accommodate three new SED cohorts.
This maintenance cut only affects the total sample size being allocated and is not intended to be a
uniform cut to the number of panel members selected from each stratum. Rather the total
specified sample size is reallocated to each stratum’s new cohorts and panel members following
the sample design in place for that survey cycle keeping, for the most part, a proportional
allocation of the sample between new and panel cases based on their respective populations.
6.1.2 The 2013 NSDR and its Derivation from 2003 and 2010 NSDR Redesigns
The 2013 NSDR sample design is derived from the redesign implemented in the 2003 NSDR,
together with the 2006, 2008, and 2010 modifications to the NSDR and sample selection
procedures. The 2003 NSDR redesign redefined the strata to ensure adequate minimum
population sizes for each stratum and to better respond to analysts' data needs (Cox, 2003).
About 75 percent of the sample is allocated with probability proportional to population size to
maximize the precision in the survey estimates. The remainder of the sample is allocated
disproportionally to ensure adequate estimation capability for small minority domains and to
ensure that each stratum is allocated sufficient numbers of attempted interviews so that they can
be expected to yield the equivalent of 60 completed interviews.
The 2006 NSDR modified the 2003 NSDR design to impute missing data for stratification
variables like race/ethnicity but otherwise the design remained the same (Yang et al., 2006). The
2008 NSDR also used logical editing to impute missing data for all stratification variables
including race and ethnicity.
In 2010, the NSDR and ISDR frames were integrated into one, although the samples for the two
subpopulations are stratified and allocated separately (Cox et al., 2012b). The 2010 NSDR
followed the same sample design and allocation procedures as the 2008 NSDR except that the
2008 old cohort NSDR sample members were moved to the 2010 ISDR frame when they were
found to be living outside the U.S. The 2013 NSDR followed the 2010 NSDR sample design
procedures exactly except for the redefinition of the disabled frame variable to include the
cognitively impaired.
6.2 Allocation of the 2013 NSDR Sample to Panel Members and New Cohorts
The NSDR panel sample allocation procedure is an iterative process that first proportionally
allocates the sample to each stratum, and then increases the initial sample sizes in certain strata
to achieve the minimum sample sizes desired for the number of completed interviews and for
the specified analytical domains as needed, which in turn requires the allocation for the
remaining strata to be decreased to maintain the overall sample size. Some recycling of these
steps is required to make sure all of the sample targets are met. In addition, since the panel cases
are selected using a probability-proportionate-to-size (PPS) selection procedure (see Section 7),
once the sample is specified for each stratum, an iterative process is used to identify the certainty
selections in each stratum and then to select from the remaining cases the balance of the sample
required. For the new cohort, the sample is allocated proportionally across the strata. Since
there are no minimum sample sizes or domain target restrictions to apply, no further adjustment
is required. The new cohort sample is also selected using systematic sequentially sorted
sampling procedures rather than a PPS procedure so certainty identification is not required.
Appendix Table B.3 shows the total of 40,000 cases as they were finally allocated, including the
36,666 panel cohort sampled cases and the 1,632 and 1,702 new cohort sampled cases for the
2010 and 2011 academic years, respectively.
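The iterative identification of certainty selections under PPS sampling can be sketched as below. This is a generic illustration, not the production procedure: the measure of size per case and the example values are assumptions, and sizes are taken to be positive.

```python
def identify_certainties(sizes, n):
    """Iteratively flag units whose PPS selection probability would reach 1.

    sizes: positive measure of size for each frame case in one stratum.
    n: number of cases allocated to the stratum.
    Returns the sorted indices of the certainty selections.
    """
    certain = set()
    while True:
        remaining = [i for i in range(len(sizes)) if i not in certain]
        n_left = n - len(certain)
        if n_left <= 0 or not remaining:
            break
        total = sum(sizes[i] for i in remaining)
        # Expected number of hits for each remaining unit under PPS.
        new = {i for i in remaining if n_left * sizes[i] / total >= 1}
        if not new:
            break
        certain |= new
    return sorted(certain)

# Hypothetical sizes: removing the largest unit makes the next one a certainty.
print(identify_certainties([50, 30, 5, 5, 5, 5], 3))
```

The loop is needed because removing one certainty case lowers the remaining total, which can push another case's selection probability to 1 or above.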
6.2.1 The NSDR Allocation Process
The NSDR sample consists of two cohorts: the panel cohort and the new cohort. The new
cohort is further divided into two separate cohort groups, one for each new SED cohort defined
by the two academic years. Across the two cohorts, the total sample was allocated to the panel
cohort and new cohort proportionately based on population size. The sample allocated to the
new cohort was further subdivided by allocating it proportionately to the two new cohorts.
Within each new cohort, the sample is allocated to the strata proportionately based on the
population size per stratum. Within the old cohort, however, an iterative process was required to
allocate the sample across the strata to ensure that the minimum sample size requirements are
met for all selected domains and strata.
Specifically, the 2013 NSDR panel sample allocation consisted of five iterative steps:
1. Allocate the sample proportionally to each stratum;
2. Allocate extra sample to specific demographic groups by gender domains through
supplemental domain allocation;
3. Allocate supplemental sample to the small strata if needed to achieve the minimum
sample size requirement;
4. Adjust the allocation for the remaining strata that are not involved in steps 2 and 3 to
maintain the overall sample size; and
5. Repeat steps 2 through 4 as needed to ensure the minimum sample size requirements are
achieved for all domains and all strata.
While large strata received only the proportional allocation, the smallest strata could receive
additional sample through the stratum supplemental allocation and the domain supplemental
allocation. Both the stratum and domain supplemental allocations are designed to support
subgroup analyses with sufficient sample size. The size of the domain supplemental allocations
was the same in 2013 as it had been since 2003. The final panel sample allocation was therefore a
combination of a proportional allocation across all strata, a domain-specific supplement allocated
proportionately across strata in that domain, and a stratum-specific supplement added to each
stratum, if needed, to obtain the minimum stratum size.
Since the panel sample allocation is based on weighted population counts instead of the number
of cases on the frame, some strata did not have enough cases to support the desired allocation. In
that situation, the allocated sample size is the same as the number of cases available while the
balance of the sample is allocated to the other panel cohort strata via the iterative steps described
above. That is, as such changes took place, the iterative process was repeated as needed until all
requirements were met.
For the new cohort, the sample allocation is a straight proportional allocation based on the number of
cases per stratum.
The allocation process worked as follows: First, the domain supplemental samples totaling
4,550 sample cases overall were proportionally allocated to the strata associated with each
designated small domain defined by gender and demographic group receiving a supplemental
sample. The domain specific allocation was based upon the stratum’s estimated total population
size across all cohorts. This domain specific allocation was fixed and never changed under the
subsequent sample size iterations. Second, the remaining sample (35,450) was allocated in an
iterative process.
The iterative portion of the sample allocation process began with a proportional allocation of the
remaining 35,450 sample cases based on the estimated population size of each stratum. The next
step in the first iteration was to make additional stratum-level allocations as needed to ensure that
each stratum had its minimum sample size allocation. For each stratum, the resultant total
sample size of proportional, domain-specific, and stratum-specific allocations was further
allocated to the panel and new cohort substrata. When the stratum’s panel cohort sample
allocation exceeded the number of panel cohort frame members, the panel cohort allocation was
reduced to the number of panel cohort frame members in that stratum.
To decide if the second iteration was needed, the total sample size allocated across all strata was
compared to the desired sample of 40,000 cases. Because that total exceeded 40,000 cases (due
to the stratum-level allocations made in the first iteration), a second allocation was needed. The
second iteration began by redefining the number of sample cases to be proportionately allocated
as 35,450 minus the total number of cases allocated across all the stratum-specific allocations of
the first iteration. This reduced sample size for the proportional allocation was again
proportionately allocated across all strata in this second iteration. As before, the next step was to
make additional stratum-level allocations as needed to ensure that each stratum had its
minimum size allocation. This step might lead to additional strata needing a stratum-level
allocation as well as increasing the stratum-level allocations made in the first allocation. Again,
the revised total stratum size allocation was further allocated to the old versus new substrata and
the panel cohort substratum allocation was reduced when it exceeded the number of old cohort
frame cases.
The iteration process continued following the pattern of the second iteration until the total
sample allocated across all strata was 40,000 and all the minimum stratum-level sample size
requirements were met. Ultimately, a total of 1,555 sample cases were allocated at the stratum level to ensure that minimum stratum sample size requirements were met, leaving 34,154 cases
to be proportionately allocated to strata after 4,550 cases had been allocated at the domain level.
For further clarification of the iteration process, see Appendix D for detailed specifications and
the final 2013 NSDR allocation.
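The core of the iteration described in this section can be sketched as a fixed-point loop. This is a simplified illustration under stated assumptions, not the Appendix D specification: allocations are left as real numbers, the split into panel and new cohort substrata and the cap at available panel frame cases are omitted, and the function name and example values are hypothetical.

```python
def allocate(pops, minimums, domain_supp, total, tol=1e-9, max_iter=1000):
    """Iterate proportional allocation plus stratum top-ups to a fixed total.

    pops: estimated population size per stratum.
    minimums: minimum sample size required per stratum.
    domain_supp: fixed domain supplement already assigned to each stratum.
    total: overall sample size to allocate (40,000 in the text).
    """
    pop_total = sum(pops)
    base = total - sum(domain_supp)   # 35,450 in the text
    pool = base                       # cases allocated proportionally
    for _ in range(max_iter):
        prop = [pool * n / pop_total for n in pops]
        # Stratum-level supplement needed to reach each minimum.
        topup = [max(0.0, m - p - d)
                 for m, p, d in zip(minimums, prop, domain_supp)]
        # Shrink the proportional pool by the top-ups just made.
        new_pool = base - sum(topup)
        if abs(new_pool - pool) < tol:
            break
        pool = new_pool
    return [p + d + t for p, d, t in zip(prop, domain_supp, topup)]

# Hypothetical three-stratum example with one fixed domain supplement.
alloc = allocate([9000, 900, 100], [60, 60, 60], [0, 50, 0], 1000)
print([round(a, 2) for a in alloc])
```

Each pass reduces the proportional pool by the stratum-level supplements just made, so the loop stops once the allocations sum to the fixed total and every stratum meets its minimum.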
6.2.2 The 2013 NSDR Allocation Results
As noted earlier, the domain-specific allocation was fixed. The purpose of the domain allocation
was to maintain the sampling rates for the small domains that were achieved in previous NSDR
survey cycles. Analysts routinely combine design strata to form domains for separate estimation,
which should be duly reflected in the sample design and allocation. Without the domain
allocation, we would have allocated far more sample to the U.S.-born, non-disabled, white strata
than past surveys, and there would be insufficient old cohort cases in the frame to support such
allocation. As reported in Sample Design and Implementation for the 2003 Survey of Doctorate
Recipients, additional sample had been allocated to minority by gender subpopulations prior to
the 2003 NSDR (Yang et al., 2004). Such purposeful oversampling was carried out to support
NSDR analyses on these small domains. Similar domain allocation has been implemented in the
2006 to 2010 NSDR survey cycles.
Following this practice, the 2013 NSDR allocated 4,550 cases to ten demographic by gender
domains, with the extra sample allocated proportionally to the strata composing each domain.
This extra sample size was arbitrarily set to the sample sizes allocated in the 2003 NSDR, which
in turn was set to yield approximately the same average sampling ratio of population size to
sample size in each domain as was achieved in the 2001 NSDR, while avoiding allocation of old
cohort sample sizes in excess of the available frame cases. Table 6.1 gives the size of the
supplemental allocation to each of the domains that received such allocation.
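Table 6.1 Domain Supplemental Allocation

| Demographic Group | Sex | Supplemental Allocation |
|---|---|---|
| Hispanic | Male | 750 |
| Hispanic | Female | 750 |
| NH black | Male | 750 |
| NH black | Female | 750 |
| U.S. born, NH Asian | Male | 500 |
| U.S. born, NH Asian | Female | 500 |
| U.S. born, non-disabled NH white | Female | 250 |
| Non-U.S. born, NH white | Male | 50 |
| Non-U.S. born, NH white | Female | 50 |
| Non-U.S. born, NH Asian | Female | 200 |
| Total Supplemental Allocation | | 4,550 |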
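As a rough sketch of the domain step, the code below checks that the ten supplements in Table 6.1 sum to 4,550 and shows how one domain's supplement could be spread proportionally over its strata. The stratum population sizes are hypothetical; the actual sizes are in the appendix tables.

```python
# Supplemental allocations in the order they appear in Table 6.1.
supplements = [750, 750, 750, 750, 500, 500, 250, 50, 50, 200]
print(len(supplements), sum(supplements))

def spread_supplement(supplement, stratum_pops):
    """Allocate a domain supplement to its strata in proportion to population."""
    total = sum(stratum_pops)
    return [supplement * n / total for n in stratum_pops]

# Hypothetical domain with three strata receiving a 750-case supplement.
print(spread_supplement(750, [4000, 2500, 1000]))
```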
Overall, a total of 34,154 cases were allocated through proportional allocation and the remaining
5,846 cases were allocated through stratum or domain level supplemental allocations. The final
sample size allocated through the two supplemental allocations was smaller than the total
supplemental allocation in the first iteration because a fraction of the supplemental allocation
was added back to the proportional allocation when there was a shortage of old cohort cases in
the frame. For the same reason, it is not possible to divide the total supplemental allocation
between stratum and domain level supplemental allocations.
The sample allocation took place in November 2012, when population counts were available for
the 2010 and 2011 SED cohorts as well as the old cohorts. As a consequence, the 2013 NSDR
sample size allocated was exactly 40,000 and the sample was allocated in one step. Prior to
sample selection, allocations of less than 1 sample case to any 2010 or 2011 SED new cohort
stratum with one or more frame members were rounded up to 1, still resulting in a final 2013
new cohort sample of 3,339 instead of the 3,334 originally allocated.
The overall impact of the revised 2010 NSDR frame building procedures, which were also used in
2013 frame building, was to reduce the frame size: panel cohort cases were transferred to the ISDR,
and new cohort cases that would have been in the NSDR frame under the rules used in previous
survey cycles were incorporated into the ISDR frame. The impact was modest, given that the
major transfer of emigrants had occurred in the 2010 frame building, which had the effect of
reducing the need for stratum-specific allocated sample to 1,555 compared to the 1,591 used in
the 2010 NSDR. The proportion of the sample being allocated proportionately decreased to 85
percent for the 2013 NSDR compared to 86 percent for the 2010 NSDR. Finally, the panel
cohort stratum allocations were 101 percent of the panel cohort frame sizes, compared to 102
percent for the 2010 NSDR. The 2013 NSDR frame building procedures remained the same, so
we would have expected a modest decrease in panel cohort allocations in excess of available
panel cohort sample cases between the 2013 and 2010 NSDR samples.
6.2.3 Trends over Time in the NSDR Sample Allocation
Each survey cycle the NSDR sample of 40,000 sampled cases has about 85 percent of the total
sample allocated in proportion to current population sizes for each stratum. As a consequence,
the sample allocation changes over survey cycles to reflect trends in the distribution of SEH
doctorates by race/ethnicity, sex, and other stratification variables. This section discusses
changes observed in the 2013 NSDR sample allocation as a consequence of the changing
composition of the SEH population over time and changing definition for the disabled strata.
U.S. Citizen at Birth Males. At the inception of the NSDR, the vast majority of the nation’s
trained SEH doctorates were U.S. citizen at birth, white, and male. Since that time, there has
been an ever increasing percentage of new cohorts which are non-U.S. citizen at birth, minority
racial groups, and female. As a consequence, doctorates aging out of the NSDR population
reduce the overall proportion of the total population of U.S. citizen at birth, white, males, while
there is a somewhat reduced percentage of U.S. citizen at birth, white, male doctorates entering
the NSDR population. The reduction in the relative population size of U.S. citizen at birth, white
males led to a modest reduction in the number of old cohorts retained in the 2013 NSDR
sample—96.2 percent of eligible old cohorts—in comparison to the 95.4 percent of all eligible
old cohorts retained in the 2013 sample.
U.S. Citizen at Birth Asian Females. The overall population sizes for these strata in 2013
ranged from 12 to 30 percent when expressed as a percentage of the 2010 population sizes.
These strata are growing at a higher rate than the strata for other domains, which means that the
new cohort cases need to be assigned proportionately more of the stratum’s sample and the
subsampling rate for old cohorts increased slightly. The overall effect is stratum maintenance
cuts that range from 8 to 13 percent which is about twice as large as the overall average
maintenance cut of 5.4 percent across strata.
U.S. Citizen at Birth Disabled Whites. The disabled population presents a difficult
problem for stratification as disabled status may change from one survey cycle to another.
Disability is defined as reporting disability in the prior SDR cycle for the panel cases or in the
SED for new cohorts. Various alternative definitions for disability have been studied, but this
definition produces the best results. However, a not-insubstantial number of sample cases
stratified as nondisabled later report being disabled in the survey and vice versa. The movement
from nondisabled to disabled has the most negative consequences as these cases have large
weights in comparison to sample cases selected from the disabled strata. This type of movement
was observed in the 2013 NSDR frame in part due to the additional cognitive disability category
added to the 2010 survey. Prior to the 2010 cycle, respondents could choose from four disability
categories (i.e., difficulty with seeing, hearing, walking, or lifting). Starting with the 2010 SDR,
a fifth disability category for reporting difficulty with concentrating, remembering or making
decisions was added. As a result of the added disability category in the 2010 SDR, the number
of old cohort frame cases classified as disabled in the 2013 SDR frame file was noticeably
greater (also discussed in Section 2).
Specifically, 4.5 percent of cases initially stratified as non-disabled in the 2010 frame reported
being disabled in the 2010 survey, while 38.8 percent of cases stratified as disabled in the 2010
frame reported being nondisabled in the 2010 survey. To assess the impact of the added
category, the disability status was calculated for the 2013 SDR old cohort as it was defined for
the 2010 old cohort frame cases using responses from just the four disability categories and
compared to the disability status calculated using all five disability categories. This comparison
showed that the fifth new cognitive disability category caused an increase in the number of
disabled old cohort frame cases of 7.6 percent. However, disability status is only used to stratify
U.S. born, white cases in the NSDR frame. Table 6.2 shows the impact of the cognitive
disability category on the NSDR old cohort frame cases in the U.S. born white strata (strata 47 to
90 which include the disabled and non-disabled strata).
Table 6.2 2013 NSDR U.S. Born White Old Cohort Frame Cases by Disability Status
Derived by the 4-Category and 5-Category Disability Definition

Population estimates, by old cohort disability definition:

| Disabled Status | Based on 4 categories: Population Estimate | Percent | Based on 5 categories: Population Estimate | Percent |
|---|---|---|---|---|
| Total | 515,700 | 100.0% | 515,700 | 100.0% |
| Not disabled | 472,900 | 91.7% | 470,000 | 91.1% |
| Disabled | 42,800 | 8.3% | 45,700 | 8.9% |

Case counts, by old cohort disability definition:

| Disabled Status | Based on 4 categories: Case Count | Percent | Based on 5 categories: Case Count | Percent |
|---|---|---|---|---|
| Total | 22,032 | 100.0% | 22,032 | 100.0% |
| Not disabled | 20,153 | 91.5% | 20,027 | 90.9% |
| Disabled | 1,879 | 8.5% | 2,005 | 9.1% |
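The case-count percentages in Table 6.2 can be reproduced from the counts with a short calculation. The reclassified count and relative increase computed at the end are derived here for illustration and are not figures reported in the text.

```python
# Case counts for U.S. born white old cohort frame cases (Table 6.2).
four_cat = {"not_disabled": 20153, "disabled": 1879}
five_cat = {"not_disabled": 20027, "disabled": 2005}

def disabled_share(counts):
    """Percent of cases classified as disabled, rounded to one decimal."""
    return round(100 * counts["disabled"] / sum(counts.values()), 1)

print(disabled_share(four_cat), disabled_share(five_cat))

# Derived for illustration: cases that move to the disabled strata when the
# cognitive category is included, and the relative increase they represent.
moved = five_cat["disabled"] - four_cat["disabled"]
relative_increase = 100 * moved / four_cat["disabled"]
print(moved, round(relative_increase, 1))
```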
Finally, comparing the 2010 NSDR allocation results to the 2013 NSDR results, we see a 1.1
percentage point increase in the proportion of U.S. citizen at birth, white disabled: 2.1 percent of
the 2010 NSDR allocated sampling frame was U.S. citizen at birth, white disabled, compared with
3.2 percent of the 2013 NSDR allocated sampling frame.
Demographic Domains by Sex. Table 6.3 compares the percent of the population for each
demographic by sex domain by panel and new cohort and overall for the 2013 and 2010 SDR
population. The table also shows the relative increase or decrease in the population sizes. As
noted, the biggest proportional change observed is a decrease of 2.3 percent in the population of
U.S. citizens at birth, NH white, nondisabled males. Proportional growth can be seen in many of
the non-white domains, particularly the non-U.S. citizen at birth Asian men and women when
comparing the 2013 to 2010 SDR population distribution.
Table 6.3 Population Proportions by Demographic Domain: 2010 and 2013 NSDR

Columns show the percent of population for the 2013 SDR and the 2010 SDR, and the difference (2013 to 2010 SDR), each for the total, old cohort, and new cohort populations.

| Demographic Group Defined by NSFGRP by Sex | 2013 Total | 2013 Old | 2013 New | 2010 Total | 2010 Old | 2010 New | Diff. Total | Diff. Old | Diff. New |
|---|---|---|---|---|---|---|---|---|---|
| Hispanic males, regardless of race, citizenship at birth, and disability status | 2.2% | 2.1% | 3.4% | 2.1% | 2.0% | 3.1% | 0.1% | 0.1% | 0.3% |
| Hispanic females, regardless of race, citizenship at birth, and disability status | 1.5% | 1.4% | 2.8% | 1.3% | 1.2% | 2.5% | 0.2% | 0.2% | 0.3% |
| NH black males, regardless of citizenship at birth and disability status | 1.7% | 1.7% | 1.9% | 1.7% | 1.7% | 1.9% | 0.0% | 0.0% | 0.0% |
| NH black females, regardless of citizenship at birth and disability status | 1.5% | 1.4% | 2.5% | 1.4% | 1.3% | 2.3% | 0.1% | 0.1% | 0.2% |
| U.S. citizen at birth, NH Asian males regardless of disability status | 1.0% | 1.0% | 1.5% | 0.9% | 0.9% | 1.3% | 0.1% | 0.1% | 0.2% |
| U.S. citizen at birth, NH Asian females regardless of disability status | 0.7% | 0.7% | 1.5% | 0.6% | 0.6% | 1.3% | 0.1% | 0.1% | 0.2% |
| NH American Indian males, regardless of citizenship at birth and disability status | 0.4% | 0.4% | 0.3% | 0.4% | 0.4% | 0.3% | 0.0% | 0.0% | 0.0% |
| NH American Indian females, regardless of citizenship at birth and disability status | 0.2% | 0.2% | 0.3% | 0.2% | 0.2% | 0.3% | 0.0% | 0.0% | 0.0% |
| NH Pacific Islander males, regardless of citizenship at birth and disability status | 0.1% | 0.1% | 0.1% | 0.1% | 0.1% | 0.1% | 0.0% | 0.0% | 0.0% |
| NH Pacific Islander females, regardless of citizenship at birth and disability status | 0.1% | 0.1% | 0.1% | 0.1% | 0.1% | 0.1% | 0.0% | 0.0% | 0.0% |
| U.S. citizen at birth disabled, NH white males | 3.8% | 4.0% | 1.1% | 3.4% | 3.7% | 0.2% | 0.4% | 0.3% | 0.9% |
| U.S. citizen at birth disabled, NH white females | 1.4% | 1.4% | 0.8% | 1.3% | 1.3% | 0.3% | 0.1% | 0.1% | 0.5% |
| U.S. citizen at birth, not disabled, NH white males | 36.9% | 38.1% | 21.8% | 39.2% | 40.7% | 22.1% | -2.3% | -2.6% | -0.3% |
| U.S. citizen at birth, not disabled, NH white females | 18.4% | 18.4% | 18.7% | 18.2% | 18.2% | 18.9% | 0.2% | 0.2% | -0.2% |
| Non-U.S. citizen at birth, NH white males, regardless of disability status | 7.1% | 6.9% | 8.7% | 7.1% | 6.9% | 9.0% | 0.0% | 0.0% | -0.3% |
| Non-U.S. citizen at birth, NH white females, regardless of disability status | 2.9% | 2.7% | 5.6% | 2.8% | 2.6% | 5.8% | 0.1% | 0.1% | -0.2% |
| Non-U.S. citizen at birth, NH Asian males, regardless of disability status | 14.6% | 14.3% | 18.5% | 14.1% | 13.6% | 20.0% | 0.5% | 0.7% | -1.5% |
| Non-U.S. citizen at birth, NH Asian females, regardless of disability status | 5.6% | 5.2% | 10.5% | 5.0% | 4.5% | 10.6% | 0.6% | 0.7% | -0.1% |
| Overall | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 0.0% | 0.0% | 0.0% |

NH=non-Hispanic.

6.3 ISDR Sample Allocation
All 6,178 panel cohort ISDR cases were selected with certainty in 2013, following the practice of
the previous survey cycles. As in the 2010 survey cycle, the SED 2010 and 2011 ISDR new
cohort cases were also allocated in one pass. The 900 new cohort ISDR sample cases were
allocated proportionally to strata based upon population sizes. As introduced in the 2010 ISDR,
the 2013 ISDR used 44 new cohort strata to replace the 10 strata used for new cohorts in the
2006 and 2008 survey cycles. Any unrounded stratum allocation less than 1 was forced to be 1
to make sure these strata were represented in the sample. The frame counts and actual allocation
of the ISDR sample are shown in Appendix C.
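The proportional allocation rule described above can be illustrated with a minimal Python sketch. The stratum labels and frame counts are hypothetical and are not taken from Appendix C. The sketch assumes the minimum of 1 is applied to the unrounded allocation, and it does not rebalance the other strata afterward, because the report does not describe that step.

```python
def proportional_allocation(frame_counts, total_sample):
    """Allocate total_sample across strata in proportion to frame counts.

    frame_counts: dict mapping a stratum label to its frame population count.
    Any unrounded allocation below 1 is raised to 1 so the stratum is represented.
    """
    population = sum(frame_counts.values())
    allocation = {}
    for stratum, count in frame_counts.items():
        share = total_sample * count / population  # unrounded proportional share
        allocation[stratum] = max(share, 1.0)      # force small strata up to 1
    return allocation


# Hypothetical frame: stratum "C" is too small to earn a full case on its own.
example = proportional_allocation({"A": 5000, "B": 4990, "C": 10}, 900)
```

In this example stratum C has an unrounded share of 0.9 and is therefore raised to 1, while A and B keep their proportional shares.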
6.4
NSDR and ISDR Probabilistic Rounding
The final sample allocations were rounded to integers before they were used in sample selection.
Chromy’s probability minimum replacement sampling algorithm was used to convert each
stratum and substratum allocation to an integer while keeping the total sample size fixed to the
desired sample totals for the NSDR panel and new cohorts and for the ISDR new cohorts
(Chromy, 1979). Probabilistic rounding converts the sample sizes to integers without changing
the ultimate unconditional selection probabilities. As a consequence, except for strata with
insufficient panel cohort cases available for sampling, the ultimate unconditional probabilities of
selection based on rounded sample allocations were the same for all panel and new cohort cases
within each stratum.
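The two properties claimed for probabilistic rounding (a fixed total, and an expected rounded value equal to the unrounded allocation) can be illustrated with a short Python sketch. This is not Chromy's sequential algorithm; it is a simpler systematic rounding scheme, substituted here only because it shares those two properties. The function name and the example allocations are hypothetical, and the allocations are assumed to sum to an integer.

```python
import math
import random


def probabilistic_round(allocations, rng):
    """Round allocations to integers while keeping their total fixed.

    One uniform draw u shifts the cumulative sums; stratum i receives
    floor(C_i + u) - floor(C_{i-1} + u), which is either the floor or the
    ceiling of its allocation and has expected value equal to the allocation.
    """
    total = round(sum(allocations))  # assumed to be an integer sample size
    u = rng.random()
    rounded = []
    cumulative = 0.0
    previous_floor = 0  # floor(0 + u) is 0 because 0 <= u < 1
    for i, a in enumerate(allocations):
        cumulative += a
        if i == len(allocations) - 1:
            current_floor = total  # guards against floating-point drift
        else:
            current_floor = math.floor(cumulative + u)
        rounded.append(current_floor - previous_floor)
        previous_floor = current_floor
    return rounded


# Hypothetical unrounded stratum allocations summing to 7.
example = probabilistic_round([2.4, 0.7, 3.9], random.Random(2013))
```

Because each stratum's expected rounded size equals its unrounded allocation, the unconditional selection probability of a case (averaging over the rounding) is unchanged, which is the property the text relies on.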
7.
Sample Selection
The 2013 SDR sample selection procedures were unchanged from the prior rounds of the SDR.
7.1
NSDR Sample Selection
The 2013 NSDR sample selection was carried out separately for the panel cohort, the 2010 SED
new cohort, and the 2011 SED new cohort. Prior to the 2010 SDR, the NSDR sample was selected
in two parts, with the Part 2 selection for the most recent cohort delayed until July, when final
counts were available. However, the reference date for the 2013 SDR was changed to 1
February 2013, allowing time for both new cohort years to be available for sample selection.
Although the 2010 and 2011 SED new cohort frames could have been combined and just one
new cohort sample selected, we chose to continue the practice of selecting independent samples
for each new cohort year to maintain control over the stratum sample size selected from each
new cohort year. Within each cohort, the sample was selected independently from each stratum
based on the final allocation presented in Appendix B.3.
As for past survey cycles, the panel cohort sample was selected with probability proportional to
size (PPS) where the measure of size was the 2010 SDR sampling weight (the inverse of the
probability of selection). For each stratum, the sampling algorithm started by identifying and
removing certainty cases through an iterative procedure. A panel cohort case was selected with
certainty when its selection probability was equal to or greater than 1.0 based on its measure of
size. These certainty cases were transferred to the sample and revised selection probabilities
were calculated based upon the remaining frame cases. The measures of size of the remaining
panel cohorts were then compared to the revised selection probability and additional certainty
cases designated. Iteration terminated when all certainty selections had been identified and
removed. Next, the noncertainty cases within each stratum were sorted by citizenship, disability
status, 15-level SDR degree field (see footnote 11), and year of doctoral degree award. Finally, the
balance of the panel cohort sample (i.e., the total stratum allocation minus the number of
certainty cases) was selected from each stratum as a systematic PPS sample.
11
Prior to 2003, the DRF field of degree variable (PHDFIELD) was used in sorting with no control imposed over
year of degree receipt from 1991 to 2001. The intent had been to use SDR field of degree from 2003 on together
with year of degree receipt, but the DRF field of degree continued to be used in 2003 and 2006 due to
oversight. Use of the multi-level DRF field of degree in sorting left little potential for control over the year of
degree receipt. The oversight was corrected beginning with the 2008 survey with the 15-level degree field variable
used for sorting to reserve the potential for control over year of degree receipt within degree field as originally
planned.
The 2010 SED new cohort sample was selected at the same time as the 2011 SED new cohort
sample, using exactly the same systematic sampling procedures. Both new cohort samples were
selected using the same sampling algorithm as used for selecting the old cohort sample. Every
case in the new cohort frame was assigned 1 as the measure of size for the PPS selection. There
were no certainty selections from the new cohorts, and the new cohort sample within each
stratum was an equal probability systematic sample. Across strata, however, sampling
probabilities vary.
Both the panel cohort and new cohort samples were selected systematically from the sorted list
within each stratum, where the sorting variables operated as implicit stratification variables. The
efficiency of a systematic sample can be increased if the units on the list are sorted by
characteristics that are relevant to analysis. Sorting places similar cases next to each other on the
list so that each stratum sample includes a mix of cases representative of their population with
respect to the sorting variables. Because citizenship and disability status are of analytical interest
but were not featured in the stratification of minority demographic groups, it made sense to use
these as the first two sorting variables. Sorting by the 15-level SDR degree field variable
provides discrimination over degree field for American Indians and Pacific Islanders that are not
stratified by degree field and also greater control over the degree field distribution for minority
groups that are only stratified by the 7-level SESTAT degree field recode. Because analysts
frequently report for domains based upon age or years since degree award, the frame was also
sorted by years since degree award to control the age distribution of the final sample.
7.2
ISDR Sample Selection
The ISDR panel cohort cases were selected with certainty. The 2010 and 2011 SED new cohort
files were combined for selection purposes, using the final sample size stratum allocations
presented in Appendix C.1. The new cohort sample was selected systematically from the sorted
list within each stratum, where the sorting variables (SESTAT major degree field and
continent/region) operated as implicit stratification variables.
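The within-stratum selection steps described in this section (iterative removal of certainty cases, sorting for implicit stratification, and systematic PPS selection of the balance) can be sketched in Python as follows. This is an illustration under stated assumptions, not the production sampling program: the function name, the "id" and "size" field names, and the example measures of size are all hypothetical. Setting every measure of size to 1 gives the equal probability systematic sample used for the new cohorts.

```python
import random


def select_stratum_sample(cases, n, sort_key, rng):
    """Select n cases from one stratum: certainty cases first, then systematic PPS.

    cases: list of dicts, each with an "id" and a positive "size" (measure of size).
    sort_key: function used to sort noncertainty cases (implicit stratification).
    """
    remaining = list(cases)
    certainties = []
    # Iteratively move cases whose selection probability n_left * size / total
    # is at least 1.0 into the sample, then recompute for the remaining cases.
    while remaining and len(certainties) < n:
        n_left = n - len(certainties)
        total = sum(c["size"] for c in remaining)
        new_ids = {c["id"] for c in remaining if n_left * c["size"] / total >= 1.0}
        if not new_ids:
            break
        certainties += [c for c in remaining if c["id"] in new_ids]
        remaining = [c for c in remaining if c["id"] not in new_ids]

    n_left = n - len(certainties)
    if n_left <= 0 or not remaining:
        return certainties

    # Sort the noncertainty cases, then take a systematic PPS sample
    # with a random start and a fixed sampling interval.
    ordered = sorted(remaining, key=sort_key)
    total = sum(c["size"] for c in ordered)
    interval = total / n_left
    start = rng.random() * interval
    selected = []
    cumulative = 0.0
    for case in ordered:
        cumulative += case["size"]
        # Every noncertainty case is smaller than the interval, so it can
        # contain at most one selection point.
        if len(selected) < n_left and start + len(selected) * interval < cumulative:
            selected.append(case)
    return certainties + selected


# Hypothetical panel stratum: two large weights become certainty selections.
panel = [{"id": i, "size": s} for i, s in enumerate([100, 30, 5, 5, 5, 5, 5, 5])]
sample = select_stratum_sample(panel, 3, lambda c: c["id"], random.Random(7))
```

In practice the sort key would be a tuple of the sorting variables named in the text, for example citizenship, disability status, degree field, and year of degree award; the single-field key above only keeps the example short.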
8.
Concluding Remarks
The 2013 SDR sample design closely followed the 2010 SDR design. The process of creating
the 2010 SDR sample design which integrated the main or NSDR survey with the ISDR survey
required many design changes from the 2008 SDR program, but was well worth the effort.
Methodological research conducted using 2008 SDR data enabled the NSF to integrate the ISDR
sample cases accrued over the 2003 to 2008 cycles with the NSDR cases to improve the
coverage properties of the SDR. This in turn provided the ability for the SDR to produce
estimates for all cases graduating in the 21st century whether they were residing in or out of the
U.S. and likewise to report the estimates by this status. The integration research in 2008 also
included the development of an integrated set of sampling strata that used the predicted location
of the cases to create a more homogeneous segmentation. As a result, we expect improved
survey precision of the estimates with this revised stratification approach. Furthermore, we
aligned the strata with the cases’ expected residency, as determined in the data collection
operations of survey administration and locating. This research resulted in a new integrated
survey weighting procedure for the combined NSDR and ISDR cases that adjusted for
nonresponse using a logistic regression technique and incorporated a poststratification procedure
to ensure the weighted estimates reproduced population totals from the combined NSDR and
ISDR sampling frames. For a discussion of the integrated research and the creation of the
predicted location see (Cox et al., 2012a).
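The poststratification step mentioned above, in which weights are scaled so that weighted estimates reproduce frame totals, can be illustrated with a minimal Python sketch. The cell labels, weights, and frame totals are hypothetical, and the report does not specify the poststratification cells actually used. The logistic regression nonresponse adjustment is not shown.

```python
def poststratify(weights, cells, frame_totals):
    """Scale weights so the weighted count in each cell equals its frame total.

    weights: nonresponse-adjusted weights, one per respondent.
    cells: poststratification cell label for each respondent.
    frame_totals: dict mapping each cell label to its frame population count.
    """
    weighted = {}
    for w, cell in zip(weights, cells):
        weighted[cell] = weighted.get(cell, 0.0) + w
    # Each weight is multiplied by (frame total / current weighted total) for its cell.
    return [w * frame_totals[cell] / weighted[cell] for w, cell in zip(weights, cells)]


# Hypothetical respondents in two cells with known frame totals.
adjusted = poststratify([10, 10, 20], ["NSDR", "NSDR", "ISDR"], {"NSDR": 30, "ISDR": 25})
```

After the adjustment the weights in each cell sum to that cell's frame total, which is the sense in which the weighted estimates reproduce the population totals from the combined frames.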
No matter how carefully survey redesigns are researched and implemented, substantial design
changes need to be evaluated after a cycle or two to allow for adjustments in the event
deficiencies are recognized. For the 2003 NSDR redesign, the design strata were redefined to be
more responsive to the domains commonly used by data analysts. This process identified the
fact that the NSDR strata were often based upon imputed data for race/ethnicity. Steps were
taken to obtain the missing data in the 2003 and 2006 survey cycles, but there was still more
missing data for race/ethnicity than desirable for stratification. As a result, the 2008 NSDR
introduced a multistep imputation procedure to logically impute this missing data when it had yet
to be collected from sample members. This imputation approach was found to be reasonably
effective in predicting missing race/ethnicity (Selfa et al., 2012) and was adopted for use in the
2010 and 2013 cycles. With the 2013 cycle, the integration of the NSDR and ISDR sample has
been completed as originally planned. However, as discussed, we recommend additional steps be
taken to revisit the study objectives to determine whether the current sample design best supports
the SDR’s estimation goals. We also recommend conducting research to explore additional steps
to unify the components of the SDR samples, NSDR and ISDR, into a single sample design and
allocation methodology. The 2013 SDR sampling procedures followed the methodology
adopted in 2010 with the minor exceptions as noted in Section 2 which leveraged research
conducted on the 2008 cycle’s selected sample and data collection results. In particular, the
2008 survey cycle was the first cycle to have sufficient ISDR interviews completed to facilitate
the analysis of the two SDR components separately and together. In a related investigation,
2008 SDR integrated weights were developed to facilitate integrated analyses (Harter et al.,
2012) based on a weighting class procedure and to bridge the changes to the traditional and
integrated estimates. The weighting process was enhanced in 2010 using a logistic regression
methodology which is expected to be applied to the 2013 sample (Sinclair and Batishev, 2012).
This research as noted enables the ISDR and NSDR data sets to be used in combination to
provide insight into key analytic issues for international residents and domains that are of special
interest. We note that the 2010 NSDR and ISDR design strata were defined based upon input
from the NSF analysts and the same stratification plan was adopted for 2013 as discussed in
Section 4.
The 2013 design follows the 2010 design that adopted new procedures for the ISDR sample size
and allocation. In 2010, to build up the ISDR sample size, eligible panel members from the
previous survey cycle were taken with certainty into the ISDR sample. Most ISDR panel
members were doctorates earning their degrees in the 21st century sampled as new cohorts.
Other cases were transferred out of the NSDR frame for the 2010 survey cycle when they were
identified as being international residents in the data collection for the previous survey cycle.
Most of these transferred cases are doctorates earning their degree in the 20th century, although
there are a small number of doctorates earning their degree in the 21st century transferred from
the NSDR frame to the ISDR frame. The same approach was followed for the 2013 design. At a
future date, the ISDR may need to establish a fixed total ISDR sample size and implement a
maintenance cut in each survey cycle just as the NSDR has done since the 1995 survey cycle.
The NSF has been considering this, but a specific ISDR sample size or specific survey round for
implementing these changes has not yet been established (see footnote 12). Also at that point, we
recommend a review of the current sample allocation to ascertain whether the survey data results
are fully meeting the NSF’s analytic goals for the SDR.
The integrated SDR data set can be expected to provide valuable insights concerning migration
of U.S. trained doctorates. International residency may be becoming more attractive for recent
doctorates as well as for experienced doctorates. Some doctorates leave the U.S. permanently
but others return. Still others move back and forth repeatedly across national boundaries. The
integrated SDR data provides valuable guidance into the characteristics of doctorates who choose
to be international residents on a temporary or permanent basis.
We recommend the following next steps for future research and program improvements:
• Development of a sample design and sample allocation statistical program (possibly coded
in SAS or other portable software) that will enable the NSF and NORC to easily examine
the impact of different design choices (using different stratification and/or sample
allocation methodologies) on domain-specific sample sizes and their corresponding
precision levels. Results will generate suggested design changes to improve the
precision levels for specific domains (to be specified based on a fresh review of the study
objectives) and will evaluate the trade-offs associated with the effects of oversampling as
warranted on aggregate estimates that cover multiple domains. In particular, this
approach would suggest an optimal sample size for the international students and how
best to allocate it between the panel and new cohort cases for determining the use of
maintenance cuts.
• Evaluation of the migration patterns and citizenship status of cases initially sampled for the
ISDR to determine whether ISDR sample members currently located in the U.S. appear to
be making a permanent residency change to the U.S. and should be transferred to the NSDR
frame.
• Exploration of methods to evaluate additional unification of the ISDR and NSDR sample
designs, with the goal of creating a single SDR sampling methodology using stratification
based on the predicted location to control the sample sizes for national and international
cases.
• Generation of a reference document that describes the changes to the sample design,
survey and sampling frame eligibility standards, weighting methodology, and survey
definitions during the last two decades of the SDR program and that, once started, can be
easily updated each survey cycle.
• Research into alternative methods for handling the longitudinal aspects of the eligibility
status of the panel cases, use of the panel cases’ prior cycle eligibility status data, and how
best to use this information in light of the changes to the survey eligibility standards
between the sample members earning their degrees in the 20th and 21st centuries.
12
See the memorandum entitled “2010 SDR Integration Memo 5 – Identification of International Residents Among
21st Century Doctorates” addressed to Dan Foley and Steve Cohen (NSF) from Brenda Cox (SRA) and Karen
Grigorian dated 11 May 2010.
References
Chromy, J.R. (1979), “Sequential Sample Selection Methods,” Proceedings of the American
Statistical Association, Survey Research Methods Section, 401–406.
Cox, Brenda G. (2003). The Survey of Doctorate Recipients: Redesigned for the 21st Century.
Report submitted to the National Science Foundation by RoperASW under subcontract from
Mathematica Policy Research, Inc., Washington, DC.
Cox, Brenda G., Karen Grigorian, Fang Wang, Rebecca Wang, and Rachel Harter (2012a).
2010 Survey of Doctorate Recipients: Investigating an Integrated Design for the 21st Century.
Report submitted to the National Science Foundation by the National Opinion Research Center
at the University of Chicago, Chicago, IL.
Cox, Brenda G., Karen Grigorian, Fang Wang, and Rebecca Wang (2012b). 2010 Survey of
Doctorate Recipients: Sample Design and Implementation. Report submitted to the National
Science Foundation by the National Opinion Research Center at the University of Chicago,
Chicago, IL.
Cox, Brenda G., Karen Grigorian and Michael Yang (2006). The 2006 International Survey of
Doctorate Recipients (ISDR): Sample Design. Report submitted to the National Science
Foundation by Battelle under subcontract to the National Opinion Research Center at the
University of Chicago, IL.
Grigorian, Karen and Tom Hoffer (2005). Non-U.S. Citizen Undercoverage Feasibility Study
Report. Report submitted to the National Science Foundation by the National Opinion Research
Center at the University of Chicago, Chicago, IL.
Harter, Rachel, Michael Sinclair, Karen Grigorian, Susan Hinkins, Brenda G. Cox, Rebecca
Wang, Peter Kwok, Michael Yang, and Fang Wang (2012). 2008 Integrated Survey of
Doctorate Recipients: Weighting and Variance Estimation Report, Report submitted to the
National Science Foundation by the National Opinion Research Center at the University of
Chicago, Chicago, IL.
Mitchell, Susan, Ramal Moonesinghe, and Brenda Cox (1998). Using the Survey of Doctorate
Recipients in Time Series Analysis: 1989-1993. Final report submitted to the National Science
Foundation under a subcontract to the National Research Council. Washington, DC:
Mathematica Policy Research, Inc.
Moonesinghe, Ramal (1998). Sampling Design and Weighting Procedures for the 1995 Survey of
Doctorate Recipients. Final Report submitted to the National Science Foundation under a
subcontract to the National Research Council. Washington, DC: Mathematica Policy Research,
Inc.
National Science Foundation public website. (2012). NSF at a Glance.
http://www.nsf.gov/about/glance.jsp.
Selfa, Lance, Jessica Knoerzer, Karen Grigorian, and Lynn Milan (2012). Coping with Missing
Data: Assessing Methods for Logically Assigning Race/Ethnicity, Report presented at the
American Association of Public Opinion Research 67th Annual Conference, May 2012, Orlando,
FL.
Sinclair, Michael, and Julia Batishev. (2012). 2010 Integrated Survey of Doctorate Recipients
(NSDR/ISDR): Survey Weighting Methodology Using Logistic Modeling Procedures, Report
submitted to the National Science Foundation by NORC at the University of Chicago, IL,
September 12, 2012.
Yang, Y. Michael, Brenda G. Cox, Karen Grigorian and Scott Sederstrom. (2006). Sample
Design and Implementation for the 2006 Survey of Doctorate Recipients, Report submitted to
the National Science Foundation by the National Opinion Research Center at the University of
Chicago, Chicago, IL.
Yang, Y. Michael, Karen Grigorian, Scott Sederstrom, Rachel Harter, and Tom Hoffer. (2004).
Sample Design and Implementation for the 2003 Survey of Doctorate Recipients, Report
submitted to the National Science Foundation by the National Opinion Research Center at the
University of Chicago, Chicago, IL.
Appendices
Appendix A – Sample Frame File Coding Taxonomies
A.1 2013 SDR Birth Region Code Frame Mapped to SESTAT Geocodes and Race/Ethnicity Imputation based on Birthplace
A.2 2013 SDR Field of Study Coding Taxonomies Crosswalk
A.3 2013 SDR Data Sources Used to Develop Sampling Frame Variables
Appendix B – 2010 NSDR Stratification Scheme
B.1 2013 NSDR Strata and Frame Counts
B.2 2013 NSDR Minimum Respondent Stratum Sample Sizes with and without Finite Population Correction Adjustment and Associated Yield Rates
B.3 2013 NSDR Final Sample Allocation
Appendix C – 2010 ISDR Stratification Scheme
C.1 2013 ISDR Strata with Frame Population Counts and Sample Cases
Appendix D – Detailed Specifications, Formulas and Final 2013 NSDR Allocation
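Appendix A.1 2013 SDR Birth Region Code Frame Mapped to SESTAT Geocodes and Race/Ethnicity Imputation based on Birthplace
SDR Birth Region (BIREGION) Code | Region Name | SESTAT Location Geocode | SESTAT Location Name | Imputed Race/Ethnicity Based on Birthplace
1.1.0 | Central Africa | 401 | Angola | NH black
1.1.0 | Central Africa | 408 | Cameroon | NH black
1.1.0 | Central Africa | 410 | Central African Republic | NH black
1.1.0 | Central Africa | 411 | Chad | NH black
1.1.0 | Central Africa | 413 | Congo | NH black
1.1.0 | Central Africa | 416 | Equatorial Guinea | NH black
1.1.0 | Central Africa | 419 | Gabon | NH black
1.1.0 | Central Africa | 443 | Sao Tome and Principe | NH black
1.1.0 | Central Africa | 459 | Zaire | NH black
1.1.0 | Central Africa | 462 | Africa, not specified | NH black
1.1.0 | Central Africa | 463 | Central Africa, not specified | NH black
1.1.0 | Central Africa | 465 | Equatorial Africa, not specified | NH black
1.1.0 | Central Africa | 466 | French Equatorial Africa, not specified | NH black
1.2.0 | Western Africa | 403 | Benin (formerly Dahomey) | NH black
1.2.0 | Western Africa | 406 | Burkina Faso | NH black
1.2.0 | Western Africa | 409 | Cape Verde | NH black
1.2.0 | Western Africa | 420 | Gambia | NH black
1.2.0 | Western Africa | 421 | Ghana | NH black
1.2.0 | Western Africa | 423 | Guinea | NH black
1.2.0 | Western Africa | 424 | Guinea-Bissau | NH black
1.2.0 | Western Africa | 425 | Ivory Coast | NH black
1.2.0 | Western Africa | 429 | Liberia | NH black
1.2.0 | Western Africa | 433 | Mali | NH black
1.2.0 | Western Africa | 434 | Mauritania | NH black
1.2.0 | Western Africa | 439 | Niger | NH black
1.2.0 | Western Africa | 440 | Nigeria | NH black
1.2.0 | Western Africa | 444 | Senegal | NH black
1.2.0 | Western Africa | 447 | Sierra Leone | NH black
1.2.0 | Western Africa | 454 | Togo | NH black
1.2.0 | Western Africa | 467 | French West Africa, not specified | NH black
1.2.0 | Western Africa | 469 | Western Africa, not specified | NH black
1.3.0 | Eastern Africa | 402 | Bassas da India | NH black
1.3.0 | Eastern Africa | 405 | British Indian Ocean Territory | NH black
1.3.0 | Eastern Africa | 407 | Burundi | NH black
1.3.0 | Eastern Africa | 412 | Comoros | NH black
1.3.0 | Eastern Africa | 414 | Djibouti | NH black
1.3.0 | Eastern Africa | 417 | Ethiopia | NH black
1.3.0 | Eastern Africa | 418 | Europa Island | NH black
1.3.0 | Eastern Africa | 422 | Glorioso Islands | NH black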
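1.3.0 | Eastern Africa | 426 | Juan de Nova Island | NH black
1.3.0 | Eastern Africa | 427 | Kenya | NH black
1.3.0 | Eastern Africa | 431 | Madagascar | NH black
1.3.0 | Eastern Africa | 432 | Malawi | NH black
1.3.0 | Eastern Africa | 435 | Mayotte | NH black
1.3.0 | Eastern Africa | 437 | Mozambique | NH black
1.3.0 | Eastern Africa | 441 | Reunion | NH black
1.3.0 | Eastern Africa | 442 | Rwanda | NH black
1.3.0 | Eastern Africa | 445 | Mauritius | NH black
1.3.0 | Eastern Africa | 446 | Seychelles | NH black
1.3.0 | Eastern Africa | 448 | Somalia | NH black
1.3.0 | Eastern Africa | 453 | Tanzania | NH black
1.3.0 | Eastern Africa | 457 | Uganda | NH black
1.3.0 | Eastern Africa | 460 | Zambia | NH black
1.3.0 | Eastern Africa | 461 | Zimbabwe | NH black
1.3.0 | Eastern Africa | 464 | Eastern Africa, not specified | NH black
1.3.0 | Eastern Africa | 471 | Eritrea | NH black
1.4.0 | Southern Africa | 404 | Botswana | NH black
1.4.0 | Southern Africa | 428 | Lesotho | NH black
1.4.0 | Southern Africa | 438 | Namibia | NH black
1.4.0 | Southern Africa | 449 | South Africa | NH white
1.4.0 | Southern Africa | 450 | St. Helena | NH black
1.4.0 | Southern Africa | 452 | Swaziland | NH black
1.4.0 | Southern Africa | 455 | Tromelin Island | NH black
1.4.0 | Southern Africa | 470 | Southern Africa, not specified | NH black
1.5.0 | North Africa | 400 | Algeria | NH white
1.5.0 | North Africa | 415 | Egypt | NH white
1.5.0 | North Africa | 430 | Libya | NH white
1.5.0 | North Africa | 436 | Morocco | NH white
1.5.0 | North Africa | 451 | Sudan | NH black
1.5.0 | North Africa | 456 | Tunisia | NH white
1.5.0 | North Africa | 458 | Western Sahara | NH black
1.5.0 | North Africa | 468 | North Africa, not specified | NH black
2.1.0 | Middle East | 201 | Bahrain | NH white
2.1.0 | Middle East | 208 | Cyprus | NH white
2.1.0 | Middle East | 213 | Iraq | NH white
2.1.0 | Middle East | 214 | Israel | NH white
2.1.0 | Middle East | 216 | Jordan | NH white
2.1.0 | Middle East | 220 | Kuwait | NH white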
2.1.0 | Middle East | 222 | Lebanon | NH white
2.1.0 | Middle East | 228 | Oman | NH white
2.1.0 | Middle East | 232 | Qatar | NH white
2.1.0 | Middle East | 233 | Saudi Arabia | NH white
2.1.0 | Middle East | 237 | Syria | NH white
2.1.0 | Middle East | 240 | Turkey | NH white
2.1.0 | Middle East | 241 | United Arab Emirates | NH white
2.1.0 | Middle East | 243 | Yemen, Peoples Democratic Republic | NH white
2.1.0 | Middle East | 244 | Yemen, Unified (1991 and after) | NH white
2.1.0 | Middle East | 245 | Asia, not specified | NH white
2.1.0 | Middle East | 246 | Asia Minor, not specified | NH white
2.1.0 | Middle East | 248 | Gaza Strip | NH white
2.1.0 | Middle East | 250 | Iraq-Saudi Arabia, Neutral Zone | NH white
2.1.0 | Middle East | 251 | Mesopotamia, not specified | NH white
2.1.0 | Middle East | 252 | Middle East, not specified | NH white
2.1.0 | Middle East | 253 | Palestine, not specified | NH white
2.1.0 | Middle East | 254 | Persian Gulf States, not specified | NH white
2.1.0 | Middle East | 256 | West Bank | NH white
2.2.0 | Southwest Asia | 200 | Afghanistan | NH Asian
2.2.0 | Southwest Asia | 202 | Bangladesh | NH Asian
2.2.0 | Southwest Asia | 203 | Bhutan | NH Asian
2.2.0 | Southwest Asia | 210 | India | NH Asian
2.2.0 | Southwest Asia | 212 | Iran | NH white
2.2.0 | Southwest Asia | 225 | Maldives | NH Asian
2.2.0 | Southwest Asia | 227 | Nepal | NH Asian
2.2.0 | Southwest Asia | 229 | Pakistan | NH Asian
2.2.0 | Southwest Asia | 236 | Sri Lanka | NH Asian
2.2.0 | Southwest Asia | 257 | Southwest Asia, not specified | NH Asian
2.3.0 | Southeast Asia | 204 | Brunei | NH Asian
2.3.0 | Southeast Asia | 205 | Myanmar (formerly Burma) | NH Asian
2.3.0 | Southeast Asia | 206 | Cambodia | NH Asian
2.3.0 | Southeast Asia | 211 | Indonesia | NH Asian
2.3.0 | Southeast Asia | 221 | Laos | NH Asian
2.3.0 | Southeast Asia | 224 | Malaysia | NH Asian
2.3.0 | Southeast Asia | 230 | Paracel Islands | NH Asian
2.3.0 | Southeast Asia | 231 | Philippines | NH Asian
2.3.0 | Southeast Asia | 234 | Singapore | NH Asian
2.3.0 | Southeast Asia | 235 | Spratley Islands | NH Asian
2.3.0 | Southeast Asia | 239 | Thailand | NH Asian
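2.3.0 | Southeast Asia | 242 | Vietnam | NH Asian
2.3.0 | Southeast Asia | 249 | Indochina, not specified | NH Asian
2.3.0 | Southeast Asia | 255 | Southeast Asia, not specified | NH Asian
2.3.0 | Southeast Asia | 258 | Timor-Leste | NH Asian
2.4.0 | East Asia | 207 | China | NH Asian
2.4.0 | East Asia | 209 | Hong Kong | NH Asian
2.4.0 | East Asia | 215 | Japan | NH Asian
2.4.0 | East Asia | 217 | Korea, not specified | NH Asian
2.4.0 | East Asia | 218 | South Korea | NH Asian
2.4.0 | East Asia | 219 | North Korea | NH Asian
2.4.0 | East Asia | 223 | Macao | NH Asian
2.4.0 | East Asia | 226 | Mongolia | NH Asian
2.4.0 | East Asia | 238 | Taiwan | NH Asian
2.4.0 | East Asia | 247 | East Asia, not specified | NH Asian
3.1.0 | Eastern Europe, including FSU | 104 | Bulgaria | NH white
3.1.0 | Eastern Europe, including FSU | 105 | Czechoslovakia or Czech Republic | NH white
3.1.0 | Eastern Europe, including FSU | 117 | Hungary | NH white
3.1.0 | Eastern Europe, including FSU | 128 | Poland | NH white
3.1.0 | Eastern Europe, including FSU | 132 | Romania | NH white
3.1.0 | Eastern Europe, including FSU | 147 | Yugoslavia | NH white
3.1.0 | Eastern Europe, including FSU | 150 | Eastern Europe, not specified | NH white
3.1.0 | Eastern Europe, including FSU | 155 | Slovakia | NH white
3.1.0 | Eastern Europe, including FSU | 156 | Serbia/Montenegro/Kosovo | NH white
3.1.0 | Eastern Europe, including FSU | 157 | Slovenia | NH white
3.1.0 | Eastern Europe, including FSU | 158 | Macedonia | NH white
3.1.0 | Eastern Europe, including FSU | 159 | Bosnia-Hercegovina | NH white
3.1.0 | Eastern Europe, including FSU | 160 | Croatia | NH white
3.1.0 | Eastern Europe, including FSU | 180 | USSR | NH white
3.1.0 | Eastern Europe, including FSU | 181 | Baltic states, not specified | NH white
3.1.0 | Eastern Europe, including FSU | 182 | Estonia | NH white
3.1.0 | Eastern Europe, including FSU | 183 | Latvia | NH white
3.1.0 | Eastern Europe, including FSU | 184 | Lithuania | NH white
3.1.0 | Eastern Europe, including FSU | 185 | Moldova | NH white
3.1.0 | Eastern Europe, including FSU | 186 | Belarus (Byelarus) | NH white
3.1.0 | Eastern Europe, including FSU | 187 | Russia | NH white
3.1.0 | Eastern Europe, including FSU | 188 | Kazakhstan | NH white
3.1.0 | Eastern Europe, including FSU | 189 | Armenia | NH white
3.1.0 | Eastern Europe, including FSU | 190 | Azerbaijan | NH white
3.1.0 | Eastern Europe, including FSU | 191 | Georgia | NH white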
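3.1.0 | Eastern Europe, including FSU | 192 | Uzbekistan | NH white
3.1.0 | Eastern Europe, including FSU | 193 | Ukraine | NH white
3.1.0 | Eastern Europe, including FSU | 194 | Tajikistan | NH white
3.1.0 | Eastern Europe, including FSU | 195 | Kyrgyzstan | NH white
3.1.0 | Eastern Europe, including FSU | 196 | Turkmenistan | NH white
3.2.0 | Central Europe | 102 | Austria | NH white
3.2.0 | Central Europe | 110 | Germany, not specified | NH white
3.2.0 | Central Europe | 111 | West Germany | NH white
3.2.0 | Central Europe | 112 | West Berlin | NH white
3.2.0 | Central Europe | 113 | East Berlin | NH white
3.2.0 | Central Europe | 114 | East Germany | NH white
3.2.0 | Central Europe | 120 | Italy | NH white
3.2.0 | Central Europe | 122 | Liechtenstein | NH white
3.2.0 | Central Europe | 124 | Malta | NH white
3.2.0 | Central Europe | 146 | Vatican City | NH white
3.2.0 | Central Europe | 149 | Central Europe, not specified | NH white
3.3.0 | Western Europe | 103 | Belgium | NH white
3.3.0 | Western Europe | 109 | France | NH white
3.3.0 | Western Europe | 123 | Luxembourg | NH white
3.3.0 | Western Europe | 125 | Monaco | NH white
3.3.0 | Western Europe | 126 | Netherlands | NH white
3.3.0 | Western Europe | 137 | Switzerland | NH white
3.3.0 | Western Europe | 148 | Europe, not specified | NH white
3.3.0 | Western Europe | 154 | Western Europe, not specified | NH white
3.4.0 | Northern Europe | 106 | Denmark | NH white
3.4.0 | Northern Europe | 107 | Faroe Islands | NH white
3.4.0 | Northern Europe | 108 | Finland | NH white
3.4.0 | Northern Europe | 118 | Iceland | NH white
3.4.0 | Northern Europe | 119 | Ireland | NH white
3.4.0 | Northern Europe | 121 | Jan Mayen | NH white
3.4.0 | Northern Europe | 127 | Norway | NH white
3.4.0 | Northern Europe | 135 | Svalbard | NH white
3.4.0 | Northern Europe | 136 | Sweden | NH white
3.4.0 | Northern Europe | 138 | United Kingdom, not specified | NH white
3.4.0 | Northern Europe | 139 | England | NH white
3.4.0 | Northern Europe | 140 | Scotland | NH white
3.4.0 | Northern Europe | 141 | Wales | NH white
3.4.0 | Northern Europe | 142 | Northern Ireland | NH white
3.4.0 | Northern Europe | 143 | Guernsey | NH white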
3.4.0 | Northern Europe | 144 | Jersey | NH white
3.4.0 | Northern Europe | 145 | Isle of Man | NH white
3.4.0 | Northern Europe | 151 | Lapland, not specified | NH white
3.4.0 | Northern Europe | 152 | Northern Europe, not specified | NH white
3.5.0 | Southern Europe | 100 | Albania | NH white
3.5.0 | Southern Europe | 101 | Andorra | NH white
3.5.0 | Southern Europe | 115 | Gibraltar | NH white
3.5.0 | Southern Europe | 116 | Greece | NH white
3.5.0 | Southern Europe | 129 | Portugal | NH white
3.5.0 | Southern Europe | 130 | Azores Islands | NH white
3.5.0 | Southern Europe | 131 | Madeira Islands | NH white
3.5.0 | Southern Europe | 133 | San Marino | NH white
3.5.0 | Southern Europe | 134 | Spain | Hispanic white
3.5.0 | Southern Europe | 153 | Southern Europe, not specified | NH white
4.0.0 | South America | 375 | Argentina | Hispanic white
4.0.0 | South America | 376 | Bolivia | Hispanic white
4.0.0 | South America | 377 | Brazil | NH white
4.0.0 | South America | 378 | Chile | Hispanic white
4.0.0 | South America | 379 | Colombia | Hispanic white
4.0.0 | South America | 380 | Ecuador | Hispanic white
4.0.0 | South America | 381 | Falkland Islands | NH white
4.0.0 | South America | 382 | French Guiana | Hispanic white
4.0.0 | South America | 383 | Guyana | NH black
4.0.0 | South America | 384 | Paraguay | Hispanic white
4.0.0 | South America | 385 | Peru | Hispanic white
4.0.0 | South America | 386 | Suriname | NH black
4.0.0 | South America | 387 | Uruguay | Hispanic white
4.0.0 | South America | 388 | Venezuela | Hispanic white
4.0.0 | South America | 389 | South America, not specified | Hispanic white
5.0.0 | Caribbean | 330 | Anguilla | NH black
5.0.0 | Caribbean | 331 | Antigua and Barbuda | NH black
5.0.0 | Caribbean | 332 | Aruba | NH white
5.0.0 | Caribbean | 333 | Bahamas | NH black
5.0.0 | Caribbean | 334 | Barbados | NH black
5.0.0 | Caribbean | 335 | British Virgin Islands | NH black
5.0.0 | Caribbean | 336 | Cayman Islands | NH black
5.0.0 | Caribbean | 337 | Cuba | Hispanic white
5.0.0 | Caribbean | 338 | Dominica | NH black
5.0.0 | Caribbean | 339 | Dominican Republic | Hispanic white
5.0.0 | Caribbean | 340 | Grenada | NH black
5.0.0 | Caribbean | 341 | Guadeloupe | NH black
5.0.0 | Caribbean | 342 | Haiti | NH black
5.0.0 | Caribbean | 343 | Jamaica | NH black
5.0.0 | Caribbean | 344 | Martinique | NH black
5.0.0 | Caribbean | 345 | Montserrat | NH black
5.0.0 | Caribbean | 346 | Netherlands Antilles | NH black
5.0.0 | Caribbean | 347 | St. Barthelemy | NH black
5.0.0 | Caribbean | 348 | St. Kitts-Nevis | NH black
5.0.0 | Caribbean | 349 | St. Lucia | NH black
5.0.0 | Caribbean | 350 | St. Vincent and the Grenadines | NH black
5.0.0 | Caribbean | 351 | Trinidad and Tobago | NH black
5.0.0 | Caribbean | 352 | Turks and Caicos Islands | NH black
5.0.0 | Caribbean | 353 | Caribbean, not specified | NH black
5.0.0 | Caribbean | 354 | Antilles, not specified | NH black
5.0.0 | Caribbean | 355 | British West Indies, not specified | NH black
5.0.0 | Caribbean | 356 | Latin America, not specified | Hispanic white
5.0.0 | Caribbean | 357 | Leeward Islands, not specified | NH black
5.0.0 | Caribbean | 358 | West Indies, not specified | NH black
5.0.0 | Caribbean | 359 | Windward Islands, not specified | NH black
6.1.0 | Central America, including Mexico | 310 | Belize | Hispanic white
6.1.0 | Central America, including Mexico | 311 | Costa Rica | Hispanic white
6.1.0 | Central America, including Mexico | 312 | El Salvador | Hispanic white
6.1.0 | Central America, including Mexico | 313 | Guatemala | Hispanic white
6.1.0 | Central America, including Mexico | 314 | Honduras | Hispanic white
6.1.0 | Central America, including Mexico | 315 | Mexico | Hispanic white
6.1.0 | Central America, including Mexico | 316 | Nicaragua | Hispanic white
6.1.0 | Central America, including Mexico | 317 | Panama | Hispanic white
6.1.0 | Central America, including Mexico | 318 | Central America, not specified | Hispanic white
6.2.01 | USA - Pacific | 002 | Alaska | NH white
6.2.01 | USA - Pacific | 006 | California | NH white
6.2.01 | USA - Pacific | 015 | Hawaii | NH Asian
6.2.01 | USA - Pacific | 041 | Oregon | NH white
6.2.01 | USA - Pacific | 053 | Washington | NH white
6.2.01 | USA - Pacific | 093 | Pacific region, state suppressed | NH white
6.2.02 | USA - Mountain | 004 | Arizona | NH white
6.2.02 | USA - Mountain | 008 | Colorado | NH white
6.2.02 | USA - Mountain | 016 | Idaho | NH white
6.2.02 | USA - Mountain | 030 | Montana | NH white
Prepared for NSF by NORC | 62
2013 SDR | Sample Design and Implementation
Appendix A.1 2013 SDR Birth Region Code Frame Mapped to SESTAT
Geocodes and Race/Ethnicity Imputation based on Birthplace
SDR Birth Region (BIREGION)
Code
6.2.03
6.2.04
6.2.05
6.2.06
6.2.07
Region Name
USA - West South
Central
USA - East South
Central
USA - South
Atlantic
USA - West North
Central
USA - East North
Central
Imputed
Race/Ethnicity
base on
Birthplace
SESTAT Location (Geocode)
Geocode
SESTAT Location Name
032
Nevada
NH white
035
New Mexico
NH white
049
Utah
NH white
056
Wyoming
NH white
092
Mountain region, state suppressed
NH white
005
Arkansas
NH white
022
Louisiana
NH white
040
Oklahoma
NH white
048
Texas
NH white
091
West South Central region, state suppressed
NH white
001
Alabama
NH white
021
Kentucky
NH white
047
Tennessee
NH white
090
East South Central region, state suppressed
NH white
028
Mississippi
NH white
010
Delaware
NH white
011
District of Columbia
NH white
012
Florida
NH white
013
Georgia
NH white
024
Maryland
NH white
037
North Carolina
NH white
045
South Carolina
NH white
051
Virginia
NH white
054
West Virginia
NH white
089
South Atlantic region, state suppressed
NH white
019
Iowa
NH white
020
Kansas
NH white
027
Minnesota
NH white
029
Missouri
NH white
031
Nebraska
NH white
038
North Dakota
NH white
046
South Dakota
NH white
088
West North Central region, state suppressed
NH white
017
Illinois
NH white
018
Indiana
NH white
026
Michigan
NH white
039
Ohio
NH white
055
Wisconsin
NH white
087
East North Central region, state suppressed
NH white
Prepared for NSF by NORC | 63
2013 SDR | Sample Design and Implementation
Appendix A.1 2013 SDR Birth Region Code Frame Mapped to SESTAT
Geocodes and Race/Ethnicity Imputation based on Birthplace
SDR Birth Region (BIREGION)
Code
6.2.08
6.2.09
6.2.10
6.3.0
7.0.0
Region Name
USA - Middle
Atlantic
USA - New
England
USA - Territories
Northern North
America
Oceania
Imputed
Race/Ethnicity
base on
Birthplace
SESTAT Location (Geocode)
Geocode
SESTAT Location Name
034
New Jersey
NH white
036
New York
NH white
042
Pennsylvania
NH white
086
Middle Atlantic region, state suppressed
NH white
009
Connecticut
NH white
023
Maine
NH white
025
Massachusetts
NH white
033
New Hampshire
NH white
044
Rhode Island
NH white
050
Vermont
NH white
085
New England region, state suppressed
NH white
060
American Samoa
NH white
066
Guam
NH white
067
Johnston Atoll
NH white
069
Northern Mariana Islands
NH white
071
Midway Islands
NH white
072
Puerto Rico
076
Navassa Island
NH white
078
U.S. Virgin Islands
NH white
079
Wake Island
NH white
081
Baker Island
NH white
082
Howland Island
NH white
083
Jarvis Island
NH white
084
Kingman Reef
NH white
095
Palmyra Atoll
NH white
096
U.S. State or Territory (Puerto Rico and Island Areas)
NH white
300
Bermuda
NH black
301
Canada
NH white
302
Greenland
NH native
303
St. Pierre and Miquelon
NH black
304
North America, not specified
NH white
500
Ashmore and Cartier Islands
NH white
501
Australia
NH white
502
Christmas Island, Indian Ocean
NH white
503
Clipperton Island
NH white
504
Cocos Islands
NH white
505
Cook Islands
NH white
506
Coral Sea Islands
NH white
507
Fiji
NH white
Hispanic white
Prepared for NSF by NORC | 64
2013 SDR | Sample Design and Implementation
Appendix A.1 2013 SDR Birth Region Code Frame Mapped to SESTAT
Geocodes and Race/Ethnicity Imputation based on Birthplace
SDR Birth Region (BIREGION)
Code
7.1.0
8.2.0
Region Name
At sea/abroad
Missing
SESTAT Location (Geocode)
Geocode
SESTAT Location Name
Imputed
Race/Ethnicity
base on
Birthplace
508
French Polynesia
NH white
509
Kiribati
NH white
510
Marshall Islands
NH white
511
Micronesia
NH white
512
Nauru
NH white
513
New Caledonia
NH white
514
New Zealand
NH white
515
Niue
NH white
516
Norfolk Island
NH white
517
Palau
NH white
518
Papua New Guinea
NH white
519
Pitcairn Islands
NH white
520
Solomon Islands
NH white
521
Tokelau
NH white
522
Tonga
NH white
523
Tuvalu
NH white
524
Vanuatu
NH white
525
Wallis and Futuna Islands
NH white
526
Western Samoa
NH white
527
Oceania, not specified
NH white
528
Polynesia, not specified
NH white
529
Melanesia, not specified
NH white
550
Antarctica
NH white
551
Bouvet Island
NH white
552
French Southern and Antarctic Lands
NH white
553
Heard and McDonald Islands
NH white
554
At Sea
NH white
555
Abroad, not specified
NH white
999
Missing/Unknown
NH white
FSU = Former Soviet Union country; NH = non-Hispanic.
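The birthplace-based imputation in Appendix A.1 amounts to a lookup from SESTAT geocode to an imputed race/ethnicity category. The following is a minimal sketch of such a lookup, not the production frame-processing code. The dictionary holds only a few rows copied from the table; the function name, the fallback to "NH white" for geocodes that are not in the dictionary, and the pairing of results with data source flag codes 27 and 28 (listed in Appendix A.3) are assumptions made for illustration.

```python
from typing import Tuple

# Illustrative subset of Appendix A.1 rows (geocode -> imputed race/ethnicity).
# Geocodes are kept as strings so leading zeros such as "015" are preserved.
BIRTHPLACE_RACE = {
    "015": "NH Asian",        # Hawaii
    "301": "NH white",        # Canada
    "302": "NH native",       # Greenland
    "315": "Hispanic white",  # Mexico
    "342": "NH black",        # Haiti
}

# Flag values come from the Appendix A.3 code list; using them this way is an assumption.
FLAG_FROM_BIRTHPLACE = 27  # race/ethnicity imputed from birth place
FLAG_DEFAULT_MODAL = 28    # race/ethnicity imputed to default modal assignment
DEFAULT_RACE = "NH white"


def impute_race_ethnicity(geocode: str) -> Tuple[str, int]:
    """Return (imputed race/ethnicity, data source flag) for a birthplace geocode."""
    race = BIRTHPLACE_RACE.get(geocode)
    if race is not None:
        return race, FLAG_FROM_BIRTHPLACE
    # Hypothetical fallback: geocode not in the lookup, use the modal category.
    return DEFAULT_RACE, FLAG_DEFAULT_MODAL
```

For example, `impute_race_ethnicity("315")` returns the Mexico row's category with flag 27, while a geocode absent from the dictionary falls through to the default branch.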
Appendix A.2 2013 SDR Field of Study Coding Taxonomies Crosswalk
3-level FOS
FOD3
Code
1
Label
Computer
and
information
sciences,
mathematics,
physical
sciences,
engineering
7-level FOS
MAJFLD7
Code
1
Label
Computer
and math
sciences
8-level DST FOS
DSTFLD8
Code
1
2
Label
Computer
and
information
sciences
Mathematics
and statistics
15-level FOS
SDRFLD15
Code
5
4
Label
Computer/
information
sciences
Mathematics
Code
D67
5
Physical
sciences
1
Chemistry
Code
Label
400
Computer Science
410
Information Science/Systems
415
Robotics
419
Computer/Information Sciences, Other
Applied mathematics
420
Applied Mathematics
842
Mathematics, general
498
Mathematics/Statistics, General
843
Operations research
363
Operations Research
465
Operations Research
930
Operations Research
450
Statistics
690
Statistics
425
Algebra
430
Analysis & Functional Analysis
435
Geometry/Geometric Analysis
440
Logic
445
Number Theory
455
Topology/Foundations
460
Computing Theory & Practice
499
Mathematics/Statistics, Other
520
Analytical Chemistry
521
Agricultural/Food
522
Inorganic Chemistry
524
Nuclear Chemistry
526
Organic Chemistry
528
Medicinal/Pharmaceutical
530
Physical Chemistry
532
Polymer Chemistry
534
Theoretical Chemistry
845
Physical
sciences
Computer/information sciences
DRF FOS
PHDFIELD
841
844
4
SESTAT FOS
NSDRMED13
Label
873
Statistics
OTHER mathematics
Chemistry, except biochemistry
3-level FOS
FOD3
Code
Label
7-level FOS
MAJFLD7
Code
Label
8-level DST FOS
DSTFLD8
Code
Label
15-level FOS
SDRFLD15
Code
2
Label
Physics/
astronomy
Code
871
878
3
Earth/ocean/
atmospheric
sciences
872
SESTAT FOS
NSDRMED13
Label
Astronomy and astrophysics
Physics, except biophysics
Atmospheric sciences and
meteorology
DRF FOS
PHDFIELD
Code
Label
538
Chemistry, General
539
Chemistry, Other
500
Astronomy
505
Astrophysics
506
Astronomy/Astrophysics
509
Astronomy, Other
560
Acoustics
561
Atomic/Molecular/Chemical Physics
562
Electron Physics
563
Electromagnetism
564
Particle (Elementary) Physics
565
Biophysics
566
Fluids
567
Mechanics
568
Nuclear Physics
569
Optics/Photonics
570
Plasma/Fusion Physics
572
Polymer Physics
573
Thermal Physics
574
Condensed Matter/Low Temperature Physics
575
Theoretical Physics
576
Applied Physics
577
Medical Physics/Radiological Science
578
Physics, General
579
Physics, Other
510
Atmospheric Chemistry & Climatology
512
Atmospheric Physics & Dynamics
514
Meteorology
3-level FOS
FOD3
Code
Label
7-level FOS
MAJFLD7
Code
Label
8-level DST FOS
DSTFLD8
Code
Label
15-level FOS
SDRFLD15
Code
Label
Code
875
876
7
Engineering
8
Engineering
14
15
Electrical/
electronic/
communications
engineering
Other
engineering
SESTAT FOS
NSDRMED13
Label
Geology
Geological sciences, other
DRF FOS
PHDFIELD
Code
Label
518
Atmospheric Science/Meteorology, General
519
Atmospheric Science/Meteorology, Other
540
Geology
548
Mineralogy & Petrology
549
Mineralogy/Petrology/Geological Chemistry
550
Stratigraphy & Sedimentation
552
Geomorphology & Glacial Geology
554
Applied geology
555
Applied Geology/Geological Engineering
542
Geochemistry
544
Geophysics & Seismology
545
Geophysics, Solid Earth
546
Paleontology
547
Fuel Technology/Petroleum Engineering
558
Geological & Earth Sciences, General
559
Geological & Earth Sciences, Other
877
Oceanography
590
Oceanography, Chemical & Physical
D87
Earth sciences/other physical
sciences
585
Hydrology & Water Resources
595
Marine Sciences
599
Ocean/Marine, Other
321
Computer Engineering
372
Systems Engineering
318
Communications Engineering
322
Electrical Engineering
323
Electronics Engineering
324
Electrical, Electronics & Communications Engineering
300
Aerospace, Aeronautical & Astronautical
727
Computer and systems engineering
728
Electrical, electronics and
communications engineering
721
Aerospace, aeronautical,
astronautical engineering
3-level FOS
FOD3
Code
Label
7-level FOS
MAJFLD7
Code
Label
8-level DST FOS
DSTFLD8
Code
Label
15-level FOS
SDRFLD15
Code
Label
Code
SESTAT FOS
NSDRMED13
Label
303
Agricultural Engineering
724
306
Bioengineering & Biomedical Engineering
725
Bioengineering and biomedical
engineering
Chemical engineering
312
Chemical Engineering
726
Civil engineering
315
Civil Engineering
316
Structural Engineering
327
Engineering Mechanics
330
Engineering Physics
Engineering sciences, mechanics
and physics
333
Engineering Science
730
Environmental engineering
336
Environmental Health Engineering
731
Engineering, general
398
Engineering, General
733
Industrial and manufacturing
engineering
Materials engineering, including
ceramics and textiles
339
Industrial & Manufacturing Engineering
309
Ceramic Sciences Engineering
342
Materials Science Engineering
369
Polymer & Plastics Engineering
735
Mechanical engineering
345
Mechanical Engineering
736
Metallurgical engineering
348
Metallurgical Engineering
737
Mining and minerals engineering
351
Mining & Mineral Engineering
738
354
Naval Architecture/Marine Engineering
739
Naval architecture and marine
engineering
Nuclear engineering
357
Nuclear Engineering
740
Petroleum engineering
366
Petroleum Engineering
741
OTHER engineering
376
Engineering Management & Administration
360
Ocean Engineering
375
Textile Engineering
399
Engineering, Other
005
Agricultural Animal Breeding
007
Animal Husbandry
D74
2
Biological,
agricultural,
3
Biological,
agricultural,
6
Agricultural
sciences
Label
Agricultural engineering
734
Biological
and
Code
722
729
2
DRF FOS
PHDFIELD
605
Animal sciences
3-level FOS
FOD3
Code
Label
agricultural
sciences,
health
sciences
7-level FOS
MAJFLD7
Code
Label
and
environmental
life sciences
8-level DST FOS
DSTFLD8
Code
Label
15-level FOS
SDRFLD15
Code
Label
Code
SESTAT FOS
NSDRMED13
Label
and
environmental
life sciences
606
607
608
680
681
Food sciences and technology
Plant sciences
OTHER agricultural sciences
Environmental science or studies
Forestry sciences
DRF FOS
PHDFIELD
Code
Label
010
Animal Nutrition
012
Dairy Science
014
Animal Science, Poultry (or Avian)
019
Animal Science, Other
040
Food Sciences
042
Food Distribution
043
Food Science
044
Food Science & Technology, Other
020
Agronomy & Crop Science
023
Agricultural & Horticultural Plant Breeding
025
Agricultural & Horticultural Plant Breeding (2010 & 2011)
025
Plant Breeding/Genetics (1920-2009)
030
Plant Pathology/Phytopathology
032
Plant Protection/Pest Management
039
Plant Sciences, Other
050
Horticulture Science
045
Soil Sciences
046
Soil Chemistry/Microbiology
049
Soil Sciences, Other
098
Agriculture, General
099
Agricultural Science, Other
054
Fish and Wildlife Science
055
Fishing & Fisheries Sciences/Management
580
Environmental Science
081
Environmental Science
060
Wildlife
065
Forestry Science
066
Forest Sciences & Biology
3-level FOS
FOD3
Code
Label
7-level FOS
MAJFLD7
Code
Label
8-level DST FOS
DSTFLD8
Code
Label
15-level FOS
SDRFLD15
Code
8
Label
NIH biological
sciences
Code
Code
Label
068
Forest Engineering
070
Forest/Resources Management
072
Wood Science & Pulp/Paper Technology
074
Natural Resources/Conservation
079
Forestry & Related Science, Other
080
Wildlife/Range Management
Biochemistry and biophysics
100
Biochemistry
633
Botany
125
Plant Physiology
636
Genetics, animal and plant
115
Plant Genetics
170
Genetics/Genomics, Human & Animal
171
Genetics
110
Bacteriology
156
Microbiology/Bacteriology
157
Microbiology
168
Virology
Microbiological sciences and
immunology
639
Pharmacology, human and animal
180
Pharmacology, Human & Animal
640
Physiology and pathology, human
and animal
158
Cancer Biology
175
Pathology, Human & Animal
185
Physiology, Human & Animal
186
Animal/Plant Physiology
130
Anatomy
137
Evolutionary Biology
160
Neurosciences
642
Other biological
sciences
DRF FOS
PHDFIELD
631
637
9
SESTAT FOS
NSDRMED13
Label
OTHER biological sciences
166
Parasitology
631
Biochemistry and biophysics
105
Biophysics
632
Biology, general
198
Biology/Biomedical Sciences, General
633
Botany
120
Plant Pathology/Phytopathology
129
Botany/Plant Biology
3-level FOS
FOD3
Code
Label
7-level FOS
MAJFLD7
Code
3
Label
Health
8-level DST FOS
DSTFLD8
Code
4
Label
Health
15-level FOS
SDRFLD15
Code
7
Label
Medical
sciences
Code
SESTAT FOS
NSDRMED13
Label
DRF FOS
PHDFIELD
Code
Label
136
Cell/Cellular Biology & Histology
154
Molecular Biology
139
Ecology
Nutritional sciences
163
Nutrition Sciences
641
Zoology, general
148
Entomology
189
Zoology, Other
642
OTHER biological sciences
102
Bioinformatics
103
Biomedical Sciences
104
Computational Biology
107
Biotechnology
133
Biometrics & Biostatistics
140
Hydrobiology
142
Developmental Biology/Embryology
145
Endocrinology
151
Immunology
155
Structural Biology
167
Environmental Toxicology
169
Toxicology
199
Biology/Biomedical Sciences, Other
634
Cell and molecular biology
635
Ecology
638
781
Audiology and speech pathology
200
Speech-Language Pathology & Audiology
782
Health services administration
212
Health Systems/Service Administration
786
Medicine (e.g., dentistry, optometry,
osteopathic, podiatry, veterinary)
205
Dentistry
207
Oral Biology/Oral Pathology
225
Medical/Surgery
235
Optometry/Ophthalmology
250
Veterinary Sciences
787
Nursing (4 years or longer program)
230
Nursing Science
788
Pharmacy
240
Medicinal/Pharmaceutical Sciences
3-level FOS
FOD3
Code
Label
7-level FOS
MAJFLD7
Code
Label
8-level DST FOS
DSTFLD8
Code
Label
15-level FOS
SDRFLD15
Code
Label
Code
789
Physical therapy and other rehabilitation/therapeutic services
790
Public health (including
environmental health and
epidemiology)
791
3
Psychology,
social
sciences
5
Social
sciences
6
Social
sciences
11
12
13
Economics
Anthropology/
archeology/
sociology
Other social
sciences
SESTAT FOS
NSDRMED13
Label
OTHER health/medical sciences
DRF FOS
PHDFIELD
Code
Label
245
Rehabilitation/Therapeutic Services
210
Environmental Health
211
Environmental Toxicology
215
Public Health
219
Public Health/Epidemiology
220
Epidemiology
222
Kinesiology/Exercise Science
224
Hospital Administration
227
Gerontology
298
Health Sciences, General
299
Health Sciences, Other
601
Agricultural economics
000
Agricultural Economics
923
Economics
666
Economics
667
Economics
668
Econometrics
650
Anthropology
921
Anthropology and archaeology
773
Archaeology
922
Criminology
658
Criminology
929
Sociology
686
Sociology
620
Area and ethnic studies
652
Area/Ethnic/Cultural/Gender Studies
770
American/U.S. Studies
771
Linguistics
676
Linguistics
729
Linguistics
902
Public policy studies
682
Public Policy Analysis
924
Geography
670
Geography
925
History of science
710
History, Science & Technology & Society
927
International relations
674
International Relations/Affairs
3-level FOS
FOD3
Code
Label
7-level FOS
MAJFLD7
Code
6
Label
Psychology
8-level DST FOS
DSTFLD8
Code
7
Label
Psychology
15-level FOS
SDRFLD15
Code
10
Label
Psychology
Code
SESTAT FOS
NSDRMED13
Label
928
Political science and government
930
OTHER social sciences
704
Educational psychology
DRF FOS
PHDFIELD
Code
Label
678
Political Science & Government
679
Political Science/Public Administration
662
Demography/Population Studies
684
Gerontology
694
Urban Affairs/Studies
698
Social Sciences, General
699
Social Sciences, Other
618
Educational Psychology
822
Educational Psychology
891
Clinical psychology
600
Clinical Psychology
892
Counseling psychology
609
Counseling
893
Experimental psychology
615
Experimental Psychology
894
General psychology
648
Psychology, General
895
Industrial/Organizational psychology
621
Industrial & Organizational
896
Social psychology
639
Social Psychology
897
OTHER psychology
603
Cognitive Psychology & Psycholinguistics
606
Comparative Psychology
612
Developmental & Child Psychology
613
Human Development & Family Studies
616
Experimental/Comparative Psychology/Physiology
619
Human Engineering
620
Family Psychology
624
Personality Psychology
627
Physiological/Psychobiology Psychology
630
Psychometrics
633
Psychometrics & Quantitative Psychology
636
School Psychology
649
Psychology, Other
3-level FOS
FOD3
Code
Label
7-level FOS
MAJFLD7
Code
Label
8-level DST FOS
DSTFLD8
Code
15-level FOS
SDRFLD15
Label
Code
Label
Not applicable; non-SEH field
Code
SESTAT FOS
NSDRMED13
Label
DRF FOS
PHDFIELD
Code
Label
002
Agricultural Business/Management
657
Criminal Justice & Corrections
695
Urban/City, Community & Regional Planning
700
American History (U.S. & Canada)
703
Asian History
705
European History
706
African History
707
Latin American History
708
Middle/Near East Studies
718
History, General
719
History, Other
720
Classics
723
Comparative Literature
724
Folklore
725
English and American Literature
726
English Language and Literature
732
American Literature (U.S. & Canada)
733
English Literature (British & Commonwealth)
734
English Language
735
Creative Writing
736
Speech & Rhetorical Studies
738
Letters, General
739
Letters, Other
740
French
743
German
746
Italian
749
Spanish
752
Russian
755
Slavic (other than Russian)
758
Chinese
762
Japanese
765
Hebrew
768
Arabic
769
Other Languages & Literature
774
Art, Applied
775
Art, Fine/Applied
776
Art History/Criticism/Conservation
780
Music
785
Philosophy
786
Music Theory & Composition
787
Music Performance
788
Musicology/Ethnomusicology
789
Music, Other
790
Religion/Religious Studies
791
Religion and Theology
792
Bible/Biblical Studies
795
Drama/Theater Arts
798
Humanities, General
799
Humanities, Other
800
Curriculum & Instruction
805
Educational Administration & Supervision
806
Urban Education and Leadership
807
Educational Leadership
808
Educational Policy Analysis
810
Educational/Instructional Media Design
814
Educational Measurement & Statistics
815
Educational Statistics/Research Methods
820
Educational Assessment/Testing/Measurement
825
School Psychology
830
Social/Philosophical Foundations of Education
833
International Education
835
Special Education
840
Counseling Education/Counseling & Guidance
845
Higher Education/Evaluation & Research
850
Pre-elementary/Early Childhood Teacher Education
852
Elementary Teacher Education
854
Jr. High Education
856
Secondary Teacher Education
858
Adult & Continuing Teacher Education
860
Agricultural Education
861
Art Education
862
Business Education
864
English Education
866
Foreign Languages Education
867
Physical Education, Health and Recreation
868
Health Education
870
Family & Consumer/Human Science
872
Technical & Industrial Arts Education
874
Mathematics Education
876
Music Education
878
Nursing Education
880
Physical Education & Coaching
882
Reading Education
884
Science Education
885
Social Science Education
886
Speech Education
887
Technical Education
888
Trade & Industrial Education
889
Teacher Education & Professional Development
898
Education, General
899
Education, Other
900
Accounting
901
Finance
905
Banking/Financial Support Services
910
Business Administration & Management
912
Hospitality, Food Service and Tourism Management
915
Business/Managerial Economics
916
International Business/Trade/Commerce
917
Management Information Systems/Business Statistics
920
Marketing Management & Research
921
Human Resources Development
925
Business Statistics
935
Organizational Behavior
938
Business Management/Administration, General
939
Business Management/Administration, Other
940
Communication Research
945
Journalism
947
Mass Communication/Media Studies
950
Film, Radio, TV & Digital Communication
957
Communication Theory
958
Communication, General
959
Communication, Other
960
Architecture/Environmental Design
964
Family/Consumer Science/Human Science
968
Law
972
Library Science
974
Parks/Sports/Rec./Leisure/Fitness
976
Public Administration
980
Social Work
984
Theology/Religious Education
988
Professional Fields, General
989
Other Fields, NEC
999
Unknown Field
DRF = Doctorate Records File; DST = detailed statistical table; FOS = field of study.
NOTE: PHDFIELD degrees shown in highlight were added to the Survey of Earned Doctorates field of study taxonomy in the 2010 cycle.
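The crosswalk in Appendix A.2 can be read as a roll-up from a detailed DRF field code (PHDFIELD) to the coarser SDRFLD15, DSTFLD8, MAJFLD7, and FOD3 groupings. The sketch below illustrates that roll-up as a plain lookup. It covers only five PHDFIELD codes whose placement in the crosswalk is unambiguous; the function name, the dictionary layout, and the choice to return `None` for codes outside the subset (including non-SEH fields) are assumptions for illustration, not part of the SDR processing system.

```python
from typing import Dict, Optional

# Illustrative subset of the Appendix A.2 crosswalk, keyed by PHDFIELD code.
# Codes are strings so that values with leading zeros can be stored unchanged.
CROSSWALK: Dict[str, Dict[str, int]] = {
    "400": {"SDRFLD15": 5, "DSTFLD8": 1, "MAJFLD7": 1, "FOD3": 1},   # Computer Science
    "425": {"SDRFLD15": 4, "DSTFLD8": 2, "MAJFLD7": 1, "FOD3": 1},   # Algebra
    "322": {"SDRFLD15": 14, "DSTFLD8": 8, "MAJFLD7": 7, "FOD3": 1},  # Electrical Engineering
    "230": {"SDRFLD15": 7, "DSTFLD8": 4, "MAJFLD7": 3, "FOD3": 2},   # Nursing Science
    "600": {"SDRFLD15": 10, "DSTFLD8": 7, "MAJFLD7": 6, "FOD3": 3},  # Clinical Psychology
}


def roll_up(phdfield: str) -> Optional[Dict[str, int]]:
    """Return the coarser field-of-study codes for a PHDFIELD code.

    Returns None when the code is not in this illustrative subset, which
    here also stands in for non-SEH fields that have no SEH grouping.
    """
    return CROSSWALK.get(phdfield)
```

Under these assumptions, `roll_up("322")` yields the engineering codes at every level, and a non-SEH code such as `"968"` (Law) yields `None`.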
Appendix A.3 2013 SDR Data Sources Used to Develop Sampling Frame
Variables
Data Source Flag Code | Data Source Flag Code Description
10 | DRF, reported data for variable
11 | Citizenship imputed from DRF with BIRTHPL and PDLOC
12 | DRF, data updated in DRF after used for SDR sample
13 | Postdoc location imputed from DRF with PDEMPLOY
14 | Sex assigned from name in DRF
20 | Hispanic surname list
21 | Asian surname list
22 | Race reconciliation file (Not Imputed Data)
23 | Race reconciliation file (Imputed Data)*
24 | PI Category Not Available, Other Race Selected
26 | Sex assigned during frame processing (Not Imputed Data)
27 | Race/ethnicity imputed from birth place
28 | Race/ethnicity imputed to default modal assignment (NH white)
29 | SDR data delivery hot-deck imputation
30 | Pre-1991 source data (Not Imputed Data)
31 | 2001 SDR sampling file (Not Imputed Data)
32 | 2003 chronic unlocatables (Not Imputed Data)
33 | Master 2003 data base (Not Imputed Data)
34 | Survey administration data
35 | Permanent Ineligible database
40 | SDR 1993
41 | SDR 1995
42 | SDR 1997
43 | SDR 1999
44 | SDR 2001
4X | SDR 2001, reported data used for Hispanic indicator
45 | SDR 2003
46 | SDR 2006
47 | SDR 2008
48 | SDR 2010
80* | Age imputed from PhD year, degree earned at 21 years
81* | Age imputed from BA year, degree earned at 18 years
90 | ISEX: Missing data, imputed female
91 | IHCAPIN: Missing data, imputed not handicapped
92 | IBIRCIT: Missing data, imputed not native born
93 | ICURCIT: Missing data, imputed current U.S. citizen
94 | IPDUS: Missing data, imputed staying in U.S.
95 | IHSPIN: Missing data, imputed not Hispanic
96 | ILOCSTAT: Missing data, imputed to in U.S.
99 | Missing data
* The birth year imputation rules assume that sample members earned degrees at an age somewhat lower than average for the
population. This is intentional: it minimizes any sample undercoverage caused by eliminating doctorate recipients with missing birth
years who may have earned a degree at a young age. During data collection, every effort is made to collect date of birth from sample
members with an imputed birth date to confirm their eligibility for the sample, and in the next survey cycle the unimputed data replace
the imputed birth year estimate in frame construction.
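The two age imputation rules flagged 80 and 81 reduce to simple subtraction: birth year is the PhD year minus 21, or the BA year minus 18. The following is a minimal sketch of that arithmetic. The function name, the order in which the two rules are tried (PhD year first), and the use of flag 99 when neither year is available are assumptions for illustration; the report does not state which rule takes precedence.

```python
from typing import Optional, Tuple

ASSUMED_AGE_AT_PHD = 21  # flag 80: degree earned at 21 years
ASSUMED_AGE_AT_BA = 18   # flag 81: degree earned at 18 years


def impute_birth_year(
    phd_year: Optional[int], ba_year: Optional[int]
) -> Tuple[Optional[int], str]:
    """Return (imputed birth year, data source flag code).

    The precedence of PhD year over BA year is a hypothetical choice.
    """
    if phd_year is not None:
        return phd_year - ASSUMED_AGE_AT_PHD, "80"
    if ba_year is not None:
        return ba_year - ASSUMED_AGE_AT_BA, "81"
    # Neither degree year is known: leave birth year missing.
    return None, "99"
```

Because both assumed ages are lower than typical, the imputed birth year is later than most true birth years, so a case with a missing birth year tends to look younger and is less likely to be dropped as age-ineligible, which is the intent the footnote describes.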
Appendix B.1 2013 NSDR Strata and Frame Counts
Stratum | Demographic Group | Gender | Field of Degree | Old Cohort Sample Cases | 2010 New Cohort Population Size | 2011 New Cohort Population Size | Total Frame Size
1 | Hispanic | Male | Computer/math | 101 | 76 | 99 | 276
2 | Hispanic | Male | Biological and agri. sci. | 338 | 251 | 251 | 840
3 | Hispanic | Male | Health sci. | 60 | 18 | 33 | 111
4 | Hispanic | Male | Physical and related sci. | 237 | 126 | 140 | 503
5 | Hispanic | Male | Social sci. | 191 | 141 | 150 | 482
6 | Hispanic | Male | Psychology | 144 | 64 | 79 | 287
7 | Hispanic | Male | Engineering | 266 | 223 | 240 | 729
8 | Hispanic | Female | Computer/math | 57 | 23 | 18 | 98
9 | Hispanic | Female | Biological and agri. sci. | 309 | 273 | 297 | 879
10 | Hispanic | Female | Health sci. | 82 | 55 | 61 | 198
11 | Hispanic | Female | Physical and related sci. | 99 | 78 | 79 | 256
12 | Hispanic | Female | Social sci. | 187 | 126 | 173 | 486
13 | Hispanic | Female | Psychology | 346 | 178 | 243 | 767
14 | Hispanic | Female | Engineering | 76 | 100 | 85 | 261
15 | NH Black | Male | Computer/math | 81 | 60 | 54 | 195
16 | NH Black | Male | Biological and agri. sci. | 275 | 139 | 155 | 569
17 | NH Black | Male | Health sci. | 77 | 30 | 45 | 152
18 | NH Black | Male | Physical and related sci. | 178 | 87 | 90 | 355
19 | NH Black | Male | Social sci. | 299 | 109 | 116 | 524
20 | NH Black | Male | Psychology | 128 | 38 | 56 | 222
21 | NH Black | Male | Engineering | 235 | 147 | 132 | 514
22 | NH Black | Female | Computer/math | 61 | 24 | 29 | 114
23 | NH Black | Female | Biological and agri. sci. | 253 | 213 | 210 | 676
24 | NH Black | Female | Health sci. | 153 | 129 | 138 | 420
25 | NH Black | Female | Physical and related sci. | 69 | 57 | 61 | 187
26 | NH Black | Female | Social sci. | 252 | 143 | 165 | 560
27 | NH Black | Female | Psychology | 294 | 187 | 183 | 664
28 | NH Black | Female | Engineering | 70 | 64 | 65 | 199
29 | U.S. Born, NH Asian | Male | Computer/math | 71 | 61 | 50 | 182
30 | U.S. Born, NH Asian | Male | Biological and agri. sci. | 255 | 137 | 146 | 538
31 | U.S. Born, NH Asian | Male | Health sci. | 56 | 21 | 22 | 99
32 | U.S. Born, NH Asian | Male | Physical and related sci. | 143 | 64 | 62 | 269
33 | U.S. Born, NH Asian | Male | Social sci. | 70 | 41 | 42 | 153
34 | U.S. Born, NH Asian | Male | Psychology | 64 | 26 | 28 | 118
35 | U.S. Born, NH Asian | Male | Engineering | 201 | 150 | 156 | 507
36 | U.S. Born, NH Asian | Female | Computer/math | 54 | 17 | 20 | 91
37 | U.S. Born, NH Asian | Female | Biological and agri. sci. | 270 | 197 | 180 | 647
38 | U.S. Born, NH Asian | Female | Health sci. | 62 | 37 | 34 | 133
39 | U.S. Born, NH Asian | Female | Physical and related sci. | 75 | 41 | 49 | 165
40 | U.S. Born, NH Asian | Female | Social sci. | 81 | 56 | 52 | 189
41 | U.S. Born, NH Asian | Female | Psychology | 145 | 84 | 87 | 316
42 | U.S. Born, NH Asian | Female | Engineering | 74 | 81 | 59 | 214
43 | NH American Indian | Male | All fields | 170 | 89 | 98 | 357
44 | NH American Indian | Female | All fields | 158 | 108 | 109 | 375
45 | NH Pacific Islander | Male | All fields | 63 | 30 | 41 | 134
46 | NH Pacific Islander | Female | All fields | 67 | 38 | 28 | 133
47 | U.S. Born, Disabled NH White | Male | Computer/math | 112 | 39 | 25 | 176
48 | U.S. Born, Disabled NH White | Male | Biological and agri. sci. | 366 | 83 | 94 | 543
49 | U.S. Born, Disabled NH White | Male | Health sci. | 53 | 13 | 19 | 85
50 | U.S. Born, Disabled NH White | Male | Physical and related sci. | 307 | 56 | 74 | 437
51 | U.S. Born, Disabled NH White | Male | Social sci. | 213 | 56 | 43 | 312
52 | U.S. Born, Disabled NH White | Male | Psychology | 186 | 39 | 39 | 264
53 | U.S. Born, Disabled NH White | Male | Engineering | 193 | 64 | 66 | 323
54 | U.S. Born, Disabled NH White | Female | Computer/math | 30 | 10 | 13 | 53
55 | U.S. Born, Disabled NH White | Female | Biological and agri. sci. | 122 | 76 | 89 | 287
56 | U.S. Born, Disabled NH White | Female | Health sci. | 66 | 42 | 27 | 135
57 | U.S. Born, Disabled NH White | Female | Physical and related sci. | 44 | 21 | 28 | 93
58 | U.S. Born, Disabled NH White | Female | Social sci. | 125 | 26 | 52 | 203
59 | U.S. Born, Disabled NH White | Female | Psychology | 159 | 69 | 60 | 288
60 | U.S. Born, Disabled NH White | Female | Engineering | 29 | 9 | 9 | 47
61 | U.S. Born, Not Disabled NH White | Male | Chemistry | 1,500 | 589 | 591 | 2,680
62 | U.S. Born, Not Disabled NH White | Male | Physics/astronomy | 1,048 | 514 | 606 | 2,168
63 | U.S. Born, Not Disabled NH White | Male | Earth/ocean/atmos. | 470 | 239 | 247 | 956
64 | U.S. Born, Not Disabled NH White | Male | Math | 696 | 428 | 441 | 1,565
65 | U.S. Born, Not Disabled NH White | Male | Computer/info. sci. | 277 | 372 | 378 | 1,027
66 | U.S. Born, Not Disabled NH White | Male | Agricultural sci. | 604 | 228 | 241 | 1,073
67 | U.S. Born, Not Disabled NH White | Male | Medical sci. | 326 | 233 | 258 | 817
68 | U.S. Born, Not Disabled NH White | Male | NIH bio sci. | 1,462 | 913 | 894 | 3,269
69 | U.S. Born, Not Disabled NH White | Male | Other bio sci. | 1,386 | 852 | 815 | 3,053
70 | U.S. Born, Not Disabled NH White | Male | Psychology | 1,675 | 662 | 639 | 2,976
71 | U.S. Born, Not Disabled NH White | Male | Economics | 511 | 199 | 206 | 916
72 | U.S. Born, Not Disabled NH White | Male | Anthro/arch/sociology | 449 | 257 | 286 | 992
73 | U.S. Born, Not Disabled NH White | Male | Other social sci. | 694 | 502 | 441 | 1,637
74 | U.S. Born, Not Disabled NH White | Male | Electrical/electron/comm. | 552 | 336 | 342 | 1,230
75 | U.S. Born, Not Disabled NH White | Male | Other engineering | 1,639 | 1,121 | 1,116 | 3,876
76 | U.S. Born, Not Disabled NH White | Female | Chemistry | 400 | 320 | 343 | 1,063
77 | U.S. Born, Not Disabled NH White | Female | Physics/astronomy | 109 | 138 | 133 | 380
78 | U.S. Born, Not Disabled NH White | Female | Earth/ocean/atmos. | 146 | 178 | 160 | 484
79 | U.S. Born, Not Disabled NH White | Female | Math | 145 | 154 | 142 | 441
80 | U.S. Born, Not Disabled NH White | Female | Computer/info. sci. | 68 | 79 | 74 | 221
81 | U.S. Born, Not Disabled NH White | Female | Agricultural sci. | 174 | 193 | 224 | 591
82 | U.S. Born, Not Disabled NH White | Female | Medical sci. | 655 | 757 | 739 | 2,151
83 | U.S. Born, Not Disabled NH White | Female | NIH bio sci. | 834 | 963 | 941 | 2,738
84 | U.S. Born, Not Disabled NH White | Female | Other bio sci. | 876 | 886 | 927 | 2,689
85 | U.S. Born, Not Disabled NH White | Female | Psychology | 2,010 | 1,445 | 1,568 | 5,023
86 | U.S. Born, Not Disabled NH White | Female | Economics | 132 | 98 | 64 | 294
87 | U.S. Born, Not Disabled NH White | Female | Anthro/arch/sociology | 460 | 415 | 421 | 1,296
88 | U.S. Born, Not Disabled NH White | Female | Other social sci. | 427 | 400 | 421 | 1,248
89 | U.S. Born, Not Disabled NH White | Female | Electrical/electron/comm. | 65 | 39 | 31 | 135
90 | U.S. Born, Not Disabled NH White | Female | Other engineering | 237 | 368 | 375 | 980
91 | Non-U.S. born, NH White | Male | Chemistry | 143 | 137 | 140 | 420
92 | Non-U.S. born, NH White | Male | Physics/astronomy | 188 | 206 | 240 | 634
93 | Non-U.S. born, NH White | Male | Earth/ocean/atmos. | 65 | 68 | 52 | 185
94 | Non-U.S. born, NH White | Male | Math | 143 | 172 | 157 | 472
95 | Non-U.S. born, NH White | Male | Computer/info. sci. | 96 | 197 | 226 | 519
96 | Non-U.S. born, NH White | Male | Agricultural sci. | 66 | 54 | 45 | 165
97 | Non-U.S. born, NH White | Male | Medical sci. | 68 | 88 | 69 | 225
98 | Non-U.S. born, NH White | Male | NIH bio sci. | 145 | 193 | 221 | 559
99 | Non-U.S. born, NH White | Male | Other bio sci. | 116 | 181 | 177 | 474
100 | Non-U.S. born, NH White | Male | Psychology | 92 | 122 | 127 | 341
101 | Non-U.S. born, NH White | Male | Economics | 94 | 119 | 124 | 337
102 | Non-U.S. born, NH White | Male | Anthro/arch/sociology | 58 | 48 | 47 | 153
103 | Non-U.S. born, NH White | Male | Other social sci. | 82 | 115 | 130 | 327
104 | Non-U.S. born, NH White | Male | Electrical/electron/comm. | 214 | 267 | 324 | 805
105 | Non-U.S. born, NH White | Male | Other engineering | 420 | 475 | 552 | 1,447
106 | Non-U.S. born, NH White | Female | Chemistry | 76 | 93 | 90 | 259
107 | Non-U.S. born, NH White | Female | Physics/astronomy | 77 | 61 | 45 | 183
108 | Non-U.S. born, NH White | Female | Earth/ocean/atmos. | 66 | 43 | 37 | 146
109 | Non-U.S. born, NH White | Female | Math | 81 | 62 | 69 | 212
110 | Non-U.S. born, NH White | Female | Computer/info. sci. | 74 | 68 | 56 | 198
111 | Non-U.S. born, NH White | Female | Agricultural sci. | 68 | 42 | 27 | 137
112 | Non-U.S. born, NH White | Female | Medical sci. | 72 | 140 | 137 | 349
113 | Non-U.S. born, NH White | Female | NIH bio sci. | 107 | 209 | 227 | 543
114 | Non-U.S. born, NH White | Female | Other bio sci. | 95 | 198 | 185 | 478
115 | Non-U.S. born, NH White | Female | Psychology | 138 | 281 | 333 | 752
116 | Non-U.S. born, NH White | Female | Economics | 75 | 57 | 72 | 204
117 | Non-U.S. born, NH White | Female | Anthro/arch/sociology | 76 | 53 | 81 | 210
118 | Non-U.S. born, NH White | Female | Other social sci. | 78 | 115 | 120 | 313
119 | Non-U.S. born, NH White | Female | Electrical/electron/comm. | 79 | 68 | 62 | 209
120 | Non-U.S. born, NH White | Female | Other engineering | 80 | 160 | 195 | 435
121 | Non-U.S. born, NH Asian | Male | Chemistry | 422 | 427 | 416 | 1,265
122
Non-U.S. born, NH Asian
Male
Physics/astronomy
328
395
445
1,168
Stratum | Demographic Group | Gender | Field of Degree | Old Cohort Sample Cases | 2010 New Cohort Population Size | 2011 New Cohort Population Size | Total Frame Size
123 | Non-U.S. born, NH Asian | Male | Earth/ocean/atmos. | 82 | 76 | 93 | 251
124 | Non-U.S. born, NH Asian | Male | Math | 238 | 300 | 328 | 866
125 | Non-U.S. born, NH Asian | Male | Computer/info. sci. | 227 | 439 | 459 | 1,125
126 | Non-U.S. born, NH Asian | Male | Agricultural sci. | 103 | 106 | 118 | 327
127 | Non-U.S. born, NH Asian | Male | Medical sci. | 97 | 152 | 150 | 399
128 | Non-U.S. born, NH Asian | Male | NIH bio sci. | 321 | 431 | 443 | 1,195
129 | Non-U.S. born, NH Asian | Male | Other bio sci. | 298 | 431 | 468 | 1,197
130 | Non-U.S. born, NH Asian | Male | Psychology | 70 | 62 | 56 | 188
131 | Non-U.S. born, NH Asian | Male | Economics | 110 | 103 | 117 | 330
132 | Non-U.S. born, NH Asian | Male | Anthro/arch/sociology | 61 | 23 | 38 | 122
133 | Non-U.S. born, NH Asian | Male | Other social sci. | 74 | 78 | 88 | 240
134 | Non-U.S. born, NH Asian | Male | Electrical/electron/comm. | 588 | 781 | 927 | 2,296
135 | Non-U.S. born, NH Asian | Male | Other engineering | 1,195 | 1,393 | 1,472 | 4,060
136 | Non-U.S. born, NH Asian | Female | Chemistry | 205 | 246 | 275 | 726
137 | Non-U.S. born, NH Asian | Female | Physics/astronomy | 84 | 113 | 117 | 314
138 | Non-U.S. born, NH Asian | Female | Earth/ocean/atmos. | 80 | 59 | 60 | 199
139 | Non-U.S. born, NH Asian | Female | Math | 99 | 208 | 195 | 502
140 | Non-U.S. born, NH Asian | Female | Computer/info. sci. | 89 | 130 | 156 | 375
141 | Non-U.S. born, NH Asian | Female | Agricultural sci. | 83 | 69 | 105 | 257
142 | Non-U.S. born, NH Asian | Female | Medical sci. | 105 | 202 | 208 | 515
143 | Non-U.S. born, NH Asian | Female | NIH bio sci. | 266 | 462 | 470 | 1,198
144 | Non-U.S. born, NH Asian | Female | Other bio sci. | 287 | 500 | 482 | 1,269
145 | Non-U.S. born, NH Asian | Female | Psychology | 98 | 187 | 170 | 455
146 | Non-U.S. born, NH Asian | Female | Economics | 83 | 109 | 134 | 326
147 | Non-U.S. born, NH Asian | Female | Anthro/arch/sociology | 78 | 63 | 66 | 207
148 | Non-U.S. born, NH Asian | Female | Other social sci. | 85 | 110 | 113 | 308
149 | Non-U.S. born, NH Asian | Female | Electrical/electron/comm. | 101 | 218 | 227 | 546
150 | Non-U.S. born, NH Asian | Female | Other engineering | 206 | 417 | 480 | 1,103
Total | | | | 38,424 | 31,300 | 32,655 | 102,379
NH = Non-Hispanic.
Appendix B.2 2013 NSDR Minimum Respondent Stratum Sample Sizes with and without Finite
Population Correction Adjustment and Associated Yield Rates
Minimum Sample Size Derivation
Field of Degree
2013 Total
Population
Minimum
Respondent
Sample
Size
Unadjusted
for FPC
Minimum
Respondent
Sample
Size with
FPC
Adjustment
Supplemental
Sample
Allocated to
Meet Sample
Size
Minimum?
2010
NSDR
Weighted
Yield Rate
Used
Sample
Size
Allocation
Stratum
Demographic Group
Gender
1
2
Hispanic
Hispanic
Male
Male
Computer/math
Biological and agri. sci.
1,267
4,274
60
60
57
59
3
4
Hispanic
Hispanic
Male
Male
Health sci.
Physical and related sci.
365
2,917
60
60
52
59
5
6
Hispanic
Hispanic
Male
Male
Social sci.
Psychology
2,340
1,820
60
60
59
58
79.2%
79.2%
7
Hispanic
Male
Engineering
3,425
60
59
79.2%
8
Hispanic
Female
Computer/math
294
60
50
9
10
Hispanic
Hispanic
Female
Female
Biological and agri. sci.
Health sci.
3,300
841
60
60
59
56
78.1%
78.1%
11
12
Hispanic
Hispanic
Female
Female
Physical and related sci.
Social sci.
1,005
1,944
60
60
57
58
78.1%
78.1%
13
Hispanic
Female
Psychology
3,738
60
59
78.1%
14
Hispanic
Female
Engineering
845
60
56
78.1%
15
NH Black
Male
Computer/math
967
60
56
70.2%
16
NH Black
Male
Biological and agri. sci.
3,178
60
59
17
18
NH Black
NH Black
Male
Male
Health sci.
Physical and related sci.
792
2,026
60
60
56
58
19
20
NH Black
NH Black
Male
Male
Social sci.
Psychology
3,294
1,752
60
60
59
58
70.2%
70.2%
21
NH Black
Male
Engineering
2,669
60
59
70.2%
22
23
NH Black
NH Black
Female
Female
Computer/math
Biological and agri. sci.
376
2,810
60
60
52
59
24
25
NH Black
NH Black
Female
Female
Health sci.
Physical and related sci.
1,704
742
60
60
58
56
26
27
NH Black
NH Black
Female
Female
Social sci.
Psychology
2,677
4,057
60
60
59
59
28
NH Black
Female
Engineering
706
60
55
29
U.S. Born, NH Asian
Male
Computer/math
796
60
56
30
31
U.S. Born, NH Asian
U.S. Born, NH Asian
Male
Male
Biological and agri. sci.
Health sci.
2,731
240
60
60
59
48
32
33
U.S. Born, NH Asian
U.S. Born, NH Asian
Male
Male
Physical and related sci.
Social sci.
1,478
741
60
60
58
56
34
35
U.S. Born, NH Asian
U.S. Born, NH Asian
Male
Male
Psychology
Engineering
600
2,211
60
60
55
58
79.2%
79.2%
Yes
Yes
79.2%
79.2%
78.1%
70.2%
Yes
Yes
Yes
70.2%
70.2%
72.6%
72.6%
72.6%
72.6%
72.6%
72.6%
Yes
72.6%
80.2%
Yes
80.2%
80.2%
80.2%
80.2%
Yes
80.2%
80.2%
Minimum Sample Size Derivation
Field of Degree
Supplemental
Sample
Allocated to
Meet Sample
Size
Minimum?
2010
NSDR
Weighted
Yield Rate
Used
Sample
Size
Allocation
2013 Total
Population
Minimum
Respondent
Sample
Size
Unadjusted
for FPC
Minimum
Respondent
Sample
Size with
FPC
Adjustment
211
2,403
60
60
47
59
Yes
80.1%
80.1%
438
666
60
60
53
55
Yes
80.1%
80.1%
Stratum
Demographic Group
Gender
36
37
U.S. Born, NH Asian
U.S. Born, NH Asian
Female
Female
Computer/math
Biological and agri. sci.
38
39
U.S. Born, NH Asian
U.S. Born, NH Asian
Female
Female
Health sci.
Physical and related sci.
40
41
U.S. Born, NH Asian
U.S. Born, NH Asian
Female
Female
Social sci.
Psychology
702
1,257
60
60
55
57
80.1%
80.1%
42
U.S. Born, NH Asian
Female
Engineering
714
60
55
80.1%
43
NH American Indian
Male
All Fields
3,501
150
144
Yes
76.4%
44
NH American Indian
Female
All Fields
2,083
150
140
Yes
84.0%
45
NH Pacific Islander
Male
All fields
703
60
55
Yes
84.8%
46
NH Pacific Islander
Female
All fields
546
60
54
Yes
80.8%
47
U.S. Born, Disabled NH White
Male
Computer/math
2,671
60
59
48
49
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Male
Male
Biological and agri. sci.
Health sci.
8,895
1,027
60
60
60
57
50
51
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Male
Male
Physical and related sci.
Social sci.
7,376
5,177
60
60
60
59
81.3%
81.3%
52
53
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Male
Male
Psychology
Engineering
4,516
4,723
60
60
59
59
81.3%
81.3%
54
55
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Female
Female
Computer/math
Biological and agri. sci.
446
2,976
60
60
53
59
Yes
81.6%
81.6%
56
57
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Female
Female
Health sci.
Physical and related sci.
1,196
665
60
60
57
55
Yes
Yes
81.6%
81.6%
58
59
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Female
Female
Social sci.
Psychology
2,975
3,853
60
60
59
59
60
U.S. Born, Disabled NH White
Female
Engineering
431
60
53
61
U.S. Born, Not Disabled NH White
Male
Chemistry
36,761
60
60
80.8%
62
63
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Physics/astronomy
Earth/ocean/atmos.
26,109
11,703
60
60
60
60
80.8%
80.8%
64
65
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Math
Computer/info. sci.
17,439
7,362
60
60
60
60
80.8%
80.8%
66
67
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Agricultural sci.
Medical sci.
14,871
8,139
60
60
60
60
80.8%
80.8%
68
69
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
NIH bio sci.
Other bio sci.
36,724
34,732
60
60
60
60
80.8%
80.8%
70
U.S. Born, Not Disabled NH White
Male
Psychology
41,324
60
60
80.8%
81.3%
Yes
81.3%
81.3%
81.6%
81.6%
Yes
81.6%
Minimum Sample Size Derivation
Field of Degree
2013 Total
Population
Minimum
Respondent
Sample
Size
Unadjusted
for FPC
Minimum
Respondent
Sample
Size with
FPC
Adjustment
Supplemental
Sample
Allocated to
Meet Sample
Size
Minimum?
2010
NSDR
Weighted
Yield Rate
Used
Sample
Size
Allocation
Stratum
Demographic Group
Gender
71
72
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Economics
Anthro/arch/sociology
12,825
11,140
60
60
60
60
80.8%
80.8%
73
74
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Other social sci.
Electrical/electron/comm.
17,700
13,664
60
60
60
60
80.8%
80.8%
75
U.S. Born, Not Disabled NH White
Male
Other engineering
40,930
60
60
80.8%
76
U.S. Born, Not Disabled NH White
Female
Chemistry
9,794
60
60
81.9%
77
78
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Physics/astronomy
Earth/ocean/atmos.
2,746
3,570
60
60
59
59
81.9%
81.9%
79
80
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Math
Computer/info. sci.
3,624
1,631
60
60
59
58
81.9%
81.9%
81
82
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Agricultural sci.
Medical sci.
4,310
16,366
60
60
59
60
81.9%
81.9%
83
84
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
NIH bio sci.
Other bio sci.
21,048
21,928
60
60
60
60
81.9%
81.9%
85
86
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Psychology
Economics
49,293
3,185
60
60
60
59
81.9%
81.9%
87
88
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Anthro/arch/sociology
Other social sci.
11,328
10,571
60
60
60
60
81.9%
81.9%
89
U.S. Born, Not Disabled NH White
Female
Electrical/electron/comm.
963
60
56
90
U.S. Born, Not Disabled NH White
Female
Other engineering
6,170
60
59
81.9%
91
Non-U.S. born, NH White
Male
Chemistry
3,607
60
59
66.6%
92
Non-U.S. born, NH White
Male
Physics/astronomy
4,814
60
59
93
94
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Earth/ocean/atmos.
Math
1,317
3,666
60
60
57
59
95
96
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Computer/info. sci.
Agricultural sci.
2,647
1,319
60
60
59
57
97
98
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Medical sci.
NIH bio sci.
1,056
3,807
60
60
57
59
99
100
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Other bio sci.
Psychology
3,031
2,396
60
60
59
59
66.6%
66.6%
101
102
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Economics
Anthro/arch/sociology
2,420
875
60
60
59
56
66.6%
66.6%
103
104
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Other social sci.
Electrical/electron/comm.
2,141
5,574
60
60
58
59
105
Non-U.S. born, NH White
Male
Other engineering
10,804
60
60
106
Non-U.S. born, NH White
Female
1,508
60
58
Chemistry
Yes
Yes
81.9%
66.6%
Yes
Yes
Yes
Yes
66.6%
66.6%
66.6%
66.6%
66.6%
66.6%
66.6%
66.6%
66.6%
Yes
67.4%
Minimum Sample Size Derivation
Field of Degree
2013 Total
Population
Minimum
Respondent
Sample
Size
Unadjusted
for FPC
Minimum
Respondent
Sample
Size with
FPC
Adjustment
Supplemental
Sample
Allocated to
Meet Sample
Size
Minimum?
2010
NSDR
Weighted
Yield Rate
Used
Sample
Size
Allocation
Stratum
Demographic Group
Gender
107
108
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Physics/astronomy
Earth/ocean/atmos.
812
437
60
60
56
53
Yes
Yes
67.4%
67.4%
109
110
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Math
Computer/info. sci.
1,138
645
60
60
57
55
Yes
Yes
67.4%
67.4%
111
112
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Agricultural sci.
Medical sci.
530
1,400
60
60
54
58
Yes
Yes
67.4%
67.4%
113
114
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
NIH bio sci.
Other bio sci.
2,824
2,479
60
60
59
59
115
116
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Psychology
Economics
3,731
967
60
60
59
56
Yes
67.4%
67.4%
117
118
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Anthro/arch/sociology
Other social sci.
948
1,593
60
60
56
58
Yes
Yes
67.4%
67.4%
119
Non-U.S. born, NH White
Female
Electrical/electron/comm.
120
Non-U.S. born, NH White
Female
Other engineering
121
Non-U.S. born, NH Asian
Male
Chemistry
122
Non-U.S. born, NH Asian
Male
Physics/astronomy
123
124
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
125
126
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
127
128
67.4%
67.4%
738
60
55
Yes
67.4%
1,693
60
58
Yes
67.4%
10,870
60
60
68.9%
8,616
60
60
68.9%
Earth/ocean/atmos.
Math
1,968
6,309
60
60
58
59
Male
Male
Computer/info. sci.
Agricultural sci.
6,324
2,657
60
60
59
59
68.9%
68.9%
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Medical sci.
NIH bio sci.
2,552
8,498
60
60
59
60
68.9%
68.9%
129
130
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Other bio sci.
Psychology
8,037
783
60
60
60
56
Yes
68.9%
68.9%
131
132
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Economics
Anthro/arch/sociology
2,836
634
60
60
59
55
Yes
68.9%
68.9%
133
134
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Other social sci.
Electrical/electron/comm.
1,722
15,735
60
60
58
60
135
Non-U.S. born, NH Asian
Male
Other engineering
31,242
60
60
136
Non-U.S. born, NH Asian
Female
Chemistry
4,812
60
59
137
138
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Physics/astronomy
Earth/ocean/atmos.
1,661
679
60
60
58
55
Yes
Yes
68.1%
68.1%
139
140
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Math
Computer/info. sci.
2,482
1,679
60
60
59
58
Yes
68.1%
68.1%
141
142
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Agricultural sci.
Medical sci.
1,478
2,589
60
60
58
59
Yes
Yes
68.9%
68.9%
68.9%
68.9%
68.9%
68.1%
Yes
68.1%
68.1%
Minimum Sample Size Derivation
2013 Total
Population
Minimum
Respondent
Sample
Size with
FPC
Adjustment
Supplemental
Sample
Allocated to
Meet Sample
Size
Minimum?
2010
NSDR
Weighted
Yield Rate
Used
Sample
Size
Allocation
Stratum
Demographic Group
Gender
143
144
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
NIH bio sci.
Other bio sci.
6,535
6,986
60
60
59
59
145
146
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Psychology
Economics
2,410
1,524
60
60
59
58
Yes
68.1%
68.1%
147
148
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Anthro/arch/sociology
Other social sci.
852
1,377
60
60
56
57
Yes
Yes
68.1%
68.1%
149
Non-U.S. born, NH Asian
Female
Electrical/electron/comm.
2,559
60
59
68.1%
150
Non-U.S. born, NH Asian
Female
Other engineering
5,224
60
59
68.1%
Total
Field of Degree
Minimum
Respondent
Sample
Size
Unadjusted
for FPC
68.1%
68.1%
845,574
77.4%
NH = Non-Hispanic.
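The FPC-adjusted minimums tabulated in Appendix B.2 appear consistent with the standard finite population correction n = n0 / (1 + n0/N), rounded to the nearest integer, where n0 is the unadjusted minimum and N is the 2013 total population of the stratum. The sketch below is illustrative only: the formula is an assumption inferred from the tabulated values rather than quoted from the report, and the function name fpc_adjusted_minimum is hypothetical.

```python
def fpc_adjusted_minimum(n0: int, population: int) -> int:
    """Apply a standard finite population correction to a minimum sample size.

    Assumed formula (not quoted from the report): n = n0 / (1 + n0 / N),
    rounded to the nearest integer.
    """
    if n0 <= 0 or population <= 0:
        raise ValueError("n0 and population must be positive")
    adjusted = n0 / (1 + n0 / population)
    # Round half up; exact ties do not occur for the rows checked here.
    return int(adjusted + 0.5)


if __name__ == "__main__":
    # (unadjusted minimum, 2013 total population) pairs taken from Appendix B.2
    for n0, pop in [(60, 1267), (60, 365), (150, 3501), (60, 36761)]:
        print(n0, pop, fpc_adjusted_minimum(n0, pop))
```

Under this assumption, stratum 1 (N = 1,267) gives 57, stratum 3 (N = 365) gives 52, and stratum 43 (N = 3,501, unadjusted minimum 150) gives 144, matching the tabulated entries. Strata with large populations, such as stratum 61 (N = 36,761), remain at 60.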
Appendix B.3 2013 NSDR Final Sample Allocation
Field of Degree
2010
SED
New
Cohort
Cases
Old
Cohort
Cases
2011
SED
New
Cohort
Cases
Total
Allocation
Stratum
Demographic Group
Gender
1
Hispanic
Male
Computer/math
94
7
9
110
2
3
Hispanic
Hispanic
Male
Male
Biological and agri. sci.
Health sci.
325
56
21
21
367
65
4
5
Hispanic
Hispanic
Male
Male
Physical and related sci.
Social sci.
229
176
11
12
12
13
252
201
6
7
Hispanic
Hispanic
Male
Male
Psychology
Engineering
144
255
6
19
7
20
157
294
8
9
Hispanic
Hispanic
Female
Female
Computer/math
Biological and agri. sci.
55
281
28
31
64
340
10
11
Hispanic
Hispanic
Female
Female
Health sci.
Physical and related sci.
75
87
6
8
6
8
87
103
12
13
Hispanic
Hispanic
Female
Female
Social sci.
Psychology
170
342
13
18
18
25
201
385
14
Hispanic
Female
Engineering
68
10
9
87
15
NH Black
Male
Computer/math
78
6
5
89
16
17
NH Black
NH Black
Male
Male
Biological and agri. sci.
Health sci.
264
72
13
14
291
79
18
19
NH Black
NH Black
Male
Male
Physical and related sci.
Social sci.
169
280
8
10
9
11
186
301
20
21
NH Black
NH Black
Male
Male
Psychology
Engineering
128
219
14
12
136
245
22
23
NH Black
NH Black
Female
Female
Computer/math
Biological and agri. sci.
61
234
20
21
70
275
24
25
NH Black
NH Black
Female
Female
Health sci.
Physical and related sci.
141
64
13
6
13
6
167
76
26
27
NH Black
NH Black
Female
Female
Social sci.
Psychology
232
294
14
19
16
18
262
331
28
NH Black
Female
Engineering
62
7
7
76
29
U.S. Born, NH Asian
Male
Computer/math
67
5
5
77
30
31
U.S. Born, NH Asian
U.S. Born, NH Asian
Male
Male
Biological and agri. sci.
Health sci.
238
49
13
5
15
5
266
59
32
33
U.S. Born, NH Asian
U.S. Born, NH Asian
Male
Male
Physical and related sci.
Social sci.
132
64
6
6
144
72
34
35
U.S. Born, NH Asian
U.S. Born, NH Asian
Male
Male
Psychology
Engineering
62
185
15
16
36
U.S. Born, NH Asian
Female
Computer/math
48
5
5
58
37
38
U.S. Born, NH Asian
U.S. Born, NH Asian
Female
Female
Biological and agri. sci.
Health sci.
240
56
24
5
21
5
285
66
39
40
U.S. Born, NH Asian
U.S. Born, NH Asian
Female
Female
Physical and related sci.
Social sci.
68
70
5
7
6
6
79
83
41
U.S. Born, NH Asian
Female
Psychology
129
10
10
149
68
216
Field of Degree
2010
SED
New
Cohort
Cases
Old
Cohort
Cases
2011
SED
New
Cohort
Cases
Total
Allocation
Stratum
Demographic Group
Gender
42
U.S. Born, NH Asian
Female
43
NH American Indian
Male
44
NH American Indian
Female
45
NH Pacific Islander
Male
All fields
65
46
NH Pacific Islander
Female
All fields
67
47
U.S. Born, Disabled NH White
Male
Computer/math
108
48
U.S. Born, Disabled NH White
Male
Biological and agri. sci.
360
49
50
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Male
Male
Health sci.
Physical and related sci.
55
298
51
52
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Male
Male
Social sci.
Psychology
209
182
53
U.S. Born, Disabled NH White
Male
Engineering
191
54
U.S. Born, Disabled NH White
Female
Computer/math
55
56
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Female
Female
Biological and agri. sci.
Health sci.
119
71
57
58
U.S. Born, Disabled NH White
U.S. Born, Disabled NH White
Female
Female
Physical and related sci.
Social sci.
49
120
59
U.S. Born, Disabled NH White
Female
Psychology
156
60
U.S. Born, Disabled NH White
Female
Engineering
31
61
U.S. Born, Not Disabled NH White
Male
Chemistry
1,437
24
23
1,484
62
U.S. Born, Not Disabled NH White
Male
Physics/astronomy
1,010
21
25
1,056
63
64
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Earth/ocean/atmos.
Math
453
669
9
17
10
18
472
704
65
66
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Computer/info. sci.
Agricultural sci.
267
581
15
9
16
10
298
600
67
68
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Medical sci.
NIH bio sci.
309
1,411
10
37
10
36
329
1,484
69
70
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Other bio sci.
Psychology
1,335
1,617
35
27
33
25
1,403
1,669
71
72
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Economics
Anthro/arch/sociology
502
428
8
10
8
11
518
449
73
74
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Male
Male
Other social sci.
Electrical/electron/comm.
677
525
21
13
18
14
716
552
75
U.S. Born, Not Disabled NH White
Male
Other engineering
1,563
45
45
1,653
76
U.S. Born, Not Disabled NH White
Female
Chemistry
382
13
15
410
77
78
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Physics/astronomy
Earth/ocean/atmos.
104
135
6
7
5
7
115
149
79
80
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Math
Computer/info. sci.
140
64
7
6
153
70
81
U.S. Born, Not Disabled NH White
Female
Agricultural sci.
163
8
9
180
Engineering
68
9
7
84
All Fields
170
5
5
180
All Fields
149
9
9
167
34
Field of Degree
2010
SED
New
Cohort
Cases
Old
Cohort
Cases
2011
SED
New
Cohort
Cases
Total
Allocation
Stratum
Demographic Group
Gender
82
U.S. Born, Not Disabled NH White
Female
Medical sci.
623
32
31
686
83
84
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
NIH bio sci.
Other bio sci.
802
842
41
37
40
39
883
918
85
86
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Psychology
Economics
1,939
127
61
65
2,065
133
87
88
U.S. Born, Not Disabled NH White
U.S. Born, Not Disabled NH White
Female
Female
Anthro/arch/sociology
Other social sci.
440
408
17
17
18
18
475
443
89
U.S. Born, Not Disabled NH White
Female
Electrical/electron/comm.
90
U.S. Born, Not Disabled NH White
Female
Other engineering
227
15
91
Non-U.S. born, NH White
Male
Chemistry
138
92
Non-U.S. born, NH White
Male
Physics/astronomy
181
93
94
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Earth/ocean/atmos.
Math
65
138
95
96
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Computer/info. sci.
Agricultural sci.
97
98
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
99
100
Non-U.S. born, NH White
Non-U.S. born, NH White
101
102
64
69
16
258
6
5
149
9
10
200
7
7
73
152
92
66
8
10
110
72
Medical sci.
NIH bio sci.
68
140
7
8
5
9
80
157
Male
Male
Other bio sci.
Psychology
111
89
8
5
8
5
127
99
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Economics
Anthro/arch/sociology
90
58
5
5
100
67
103
104
Non-U.S. born, NH White
Non-U.S. born, NH White
Male
Male
Other social sci.
Electrical/electron/comm.
78
207
5
11
6
13
89
231
105
Non-U.S. born, NH White
Male
Other engineering
405
19
23
447
106
Non-U.S. born, NH White
Female
Chemistry
75
5
5
85
107
108
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Physics/astronomy
Earth/ocean/atmos.
72
64
7
7
5
6
84
77
109
110
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Math
Computer/info. sci.
75
66
5
9
5
7
85
82
111
112
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Agricultural sci.
Medical sci.
68
69
8
9
78
86
113
114
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
NIH bio sci.
Other bio sci.
102
89
9
8
9
8
120
105
115
116
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Psychology
Economics
134
72
12
5
14
7
160
84
117
118
Non-U.S. born, NH White
Non-U.S. born, NH White
Female
Female
Anthro/arch/sociology
Other social sci.
72
73
5
7
7
6
84
86
119
Non-U.S. born, NH White
Female
Electrical/electron/comm.
67
8
7
82
120
Non-U.S. born, NH White
Female
Other engineering
68
8
10
86
121
Non-U.S. born, NH Asian
Male
Chemistry
405
17
17
439
122
Non-U.S. born, NH Asian
Male
Physics/astronomy
314
16
18
348
Field of Degree
2010
SED
New
Cohort
Cases
Old
Cohort
Cases
2011
SED
New
Cohort
Cases
Total
Allocation
Stratum
Demographic Group
Gender
123
Non-U.S. born, NH Asian
Male
Earth/ocean/atmos.
77
124
125
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Math
Computer/info. sci.
230
219
12
17
13
19
255
255
126
127
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Agricultural sci.
Medical sci.
98
91
6
6
107
103
128
129
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
NIH bio sci.
Other bio sci.
308
288
18
18
18
19
344
325
130
131
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Psychology
Economics
69
106
6
5
80
115
132
133
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Anthro/arch/sociology
Other social sci.
134
135
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Male
Male
Electrical/electron/comm.
Other engineering
136
137
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
138
139
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
140
141
84
61
74
69
82
567
1,146
32
56
37
59
636
1,261
Chemistry
Physics/astronomy
193
73
11
6
13
6
217
85
Female
Female
Earth/ocean/atmos.
Math
67
94
7
9
7
9
81
112
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Computer/info. sci.
Agricultural sci.
70
74
7
8
85
84
142
143
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Medical sci.
NIH bio sci.
98
252
10
20
10
22
118
294
144
145
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Other bio sci.
Psychology
270
92
23
8
22
8
315
108
146
147
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Economics
Anthro/arch/sociology
71
70
6
6
8
6
85
82
148
149
Non-U.S. born, NH Asian
Non-U.S. born, NH Asian
Female
Female
Other social sci.
Electrical/electron/comm.
71
95
7
10
7
10
85
115
150
Non-U.S. born, NH Asian
Female
Other engineering
195
19
21
235
36,661
1,635
1,704
40,000
Total
NH = Non-Hispanic.
NOTE: Grayed out cells have been suppressed for confidentiality reasons.
Appendix C.1 2013 ISDR Strata with Frame Population Counts and Sample Cases
Stratification Schema
Stratum
Number
Demographic Group
Sex
Frame Population Size
Field of Degree
Computer & info. sciences/math/physical
sciences/engineering
Biological, agricultural, or health sciences
D1
U.S.Born; all race/ethnicities
Male
D2
U.S.Born; all race/ethnicities
Male
D3
U.S.Born; all race/ethnicities
Male
D4
U.S.Born; all race/ethnicities
Female
D5
U.S.Born; all race/ethnicities
D6
U.S.Born; all race/ethnicities
D7
Non-U.S. Born; Hispanic, any race
Male
D8
D9
Non-U.S. Born; Hispanic, any race
Non-U.S. Born; Hispanic, any race
Male
Male
D10
Non-U.S. Born; Hispanic, any race
Female
D11
Non-U.S. Born; Hispanic, any race
Female
Computer & info. sciences/math/physical
sciences/engineering
Biological, agricultural, or health sciences
Psychology or social sciences
Computer & info. sciences/math/physical
sciences/engineering
Biological, agricultural, or health sciences
D12
Non-U.S. Born; Hispanic, any race
Female
Psychology or social sciences
Sampled Cases
20th
Century
Frame
(estimate)
21st
Century
Frame
(estimate)
New
Cohort
from
2010/2011
(actual)
Total
Sample
Size
20th
Century
Old
Cohort
Cases
21st
Century
Old
Cohort
Cases
New
Cohort
from
2010/2011
Cases
5,986
3,512
1,921
553
360
164
133
63
Total
Frame
3,306
2,056
1,059
191
185
96
67
22
4,516
3,330
987
199
240
154
64
22
1,098
524
410
164
93
29
45
19
Female
Psychology or social sciences
Computer & info. sciences/math/physical
sciences/engineering
Biological, agricultural, or health sciences
2,061
965
865
231
143
53
64
26
Female
Psychology or social sciences
2,636
1,723
697
216
157
78
55
24
1,804
363
1,203
238
202
24
151
27
1,260
923
332
108
807
686
120
129
140
113
20
8
106
91
14
14
336
8
277
51
49
655
82
500
73
85
518
33
409
76
68
424
185
188
51
44
13
26
5
729
230
424
75
79
14
56
9
820
302
454
64
83
18
57
8
6
7
70
8
9
C7
Non-U.S. Born; NH-Black
All
C8
Non-U.S. Born; NH-Black
All
Computer & info. sciences/math/physical
sciences/engineering
Biological, agricultural, or health sciences
C9
Non-U.S. Born; NH-Black
All
Psychology or social sciences
F43
Non-U.S. Born; NH-Asian
Male
Computer/information sciences or mathematics
2,809
818
1,612
380
237
35
159
43
F44
Non-U.S. Born; NH-Asian
Male
Biological/agricultural/environmental life sciences
3,834
1,854
1,743
237
264
82
156
26
F45
F46
Non-U.S. Born; NH-Asian
Non-U.S. Born; NH-Asian
Male
Male
Health sciences
Physical sciences
615
4,201
205
1,704
355
2,141
54
356
61
301
10
76
44
185
7
40
F47
F48
Non-U.S. Born; NH-Asian
Non-U.S. Born; NH-Asian
Male
Male
Social sciences
Psychology
4,251
259
1,464
51
2,350
174
436
35
394
31
71
274
49
F49
Non-U.S. Born; NH-Asian
Male
Engineering
11,508
4,439
6,096
973
894
196
588
110
Stratification Schema
Frame Population Size
Sampled Cases
F50
Non-U.S. Born; NH-Asian
Stratum Number | Demographic Group | Sex | Field of Degree | Overall Total Frame | 20th Century Frame (estimate) | 21st Century Frame (estimate) | New Cohort from 2010/2011 (actual) | Total Sample Size | 20th Century Old Cohort Cases | 21st Century Old Cohort Cases | New Cohort from 2010/2011 Cases
 | | Female | Computer/information sciences or mathematics | 669 | 111 | 467 | 91 | 70 | 7 | 53 | 10
F51 | Non-U.S. Born; NH-Asian | Female | Biological/agricultural/environmental life sciences | 2,461 | 721 | 1,439 | 302 | 215 | 38 | 142 | 35
F52 | Non-U.S. Born; NH-Asian | Female | Health sciences | 823 | 88 | 624 | 111 | 94 | 5 | 77 | 12
F53 | Non-U.S. Born; NH-Asian | Female | Physical sciences | 1,219 | 337 | 736 | 146 | 118 | 20 | 81 | 17
F54 | Non-U.S. Born; NH-Asian | Female | Social sciences | 1,819 | 362 | 1,203 | 254 | 206 | 30 | 148 | 28
F55 | Non-U.S. Born; NH-Asian | Female | Psychology | 483 | 101 | 298 | 84 | 46 | 6 | 31 | 9
F56 | Non-U.S. Born; NH-Asian | Female | Engineering | 1,554 | 280 | 1,078 | 196 | 140 | 15 | 102 | 23
F57 | Non-U.S. Born; NH-White | Male | Computer/information sciences or mathematics | 2,296 | 764 | 1,319 | 213 | 207 | 36 | 147 | 24
F58 | Non-U.S. Born; NH-White | Male | Biological/agricultural/environmental life sciences | 2,303 | 1,101 | 1,064 | 138 | 190 | 53 | 122 | 15
F59 | Non-U.S. Born; NH-White | Male | Health sciences | 421 | 184 | 216 | 21 | 41 | 11 | (suppressed) | (suppressed)
F60 | Non-U.S. Born; NH-White | Male | Physical sciences | 2,799 | 1,079 | 1,528 | 192 | 230 | 52 | 156 | 22
F61 | Non-U.S. Born; NH-White | Male | Social sciences | 3,174 | 1,122 | 1,747 | 305 | 269 | 50 | 184 | 35
F62 | Non-U.S. Born; NH-White | Male | Psychology | 416 | 176 | 219 | 21 | 32 | 8 | (suppressed) | (suppressed)
F63 | Non-U.S. Born; NH-White | Male | Engineering | 4,815 | 2,185 | 2,335 | 295 | 373 | 99 | 241 | 33
F64 | Non-U.S. Born; NH-White | Female | Computer/information sciences or mathematics | 493 | 82 | 342 | 69 | 64 | 10 | 46 | 8
F65 | Non-U.S. Born; NH-White | Female | Biological/agricultural/environmental life sciences | 1,228 | 278 | 822 | 128 | 121 | 15 | 92 | 14
F66 | Non-U.S. Born; NH-White | Female | Health sciences | 327 | 12 | 273 | 42 | 40 | (suppressed) | (suppressed) | 5
F67 | Non-U.S. Born; NH-White | Female | Physical sciences | 692 | 125 | 490 | 77 | 85 | 10 | 66 | 9
F68 | Non-U.S. Born; NH-White | Female | Social sciences | 1,518 | 322 | 988 | 208 | 170 | 24 | 123 | 23
F69 | Non-U.S. Born; NH-White | Female | Psychology | 701 | 309 | 331 | 61 | 52 | 14 | 31 | 7
F70 | Non-U.S. Born; NH-White | Female | Engineering | 743 | 210 | 449 | 84 | 78 | 15 | 53 | 10
A6 | Non-U.S. Born; NH-Other Races | All | All | 132 | 24 | 97 | 11 | 14 | (suppressed) | (suppressed) | (suppressed)
Total | | | | 85,635 | 34,262 | 43,422 | 7,951 | 7,078 | 1,678 | 4,500 | 900
NOTES: Detailed case counts for the sampled cases by cohort are suppressed for confidentiality reasons. Specific grayed-out cells have been suppressed for confidentiality reasons.
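The relationship between the columns of the table above can be checked with a short script. This is only an illustrative sketch: the column reading (total frame = 20th century + 21st century + new cohort; total sample = sum of the three case columns) is inferred from the figures, and the `check_row` helper and the one-unit tolerance on the estimated frame counts are assumptions, not part of the report.

```python
# Rows transcribed from the table above, in the order:
# (stratum, total frame, 20th-century frame, 21st-century frame, new-cohort frame,
#  total sample, 20th-century cases, 21st-century cases, new-cohort cases)
rows = [
    ("F55", 483, 101, 298, 84, 46, 6, 31, 9),
    ("F56", 1554, 280, 1078, 196, 140, 15, 102, 23),
    ("F57", 2296, 764, 1319, 213, 207, 36, 147, 24),
    ("F63", 4815, 2185, 2335, 295, 373, 99, 241, 33),
]


def check_row(row, tolerance=1):
    """Return True if the frame and sample components add up to their totals.

    The frame columns are labeled as estimates, so a small tolerance
    (hypothetical, chosen for illustration) is allowed on the frame sum.
    """
    _, total, f20, f21, fnew, sample, c20, c21, cnew = row
    frame_ok = abs(total - (f20 + f21 + fnew)) <= tolerance
    sample_ok = sample == c20 + c21 + cnew
    return frame_ok and sample_ok


results = {row[0]: check_row(row) for row in rows}
print(results)
```

Rows with suppressed cells are left out, since their case counts cannot be summed.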
Appendix D. Detailed NSDR Allocation Algorithm and Final 2013 NSDR Allocation
The NSDR Allocation Algorithm
NOTATION
Let h = 1 to H denote the NSDR strata where H = 150.
Let N(h) denote the stratum h population size and N(+) the total population size across all
strata.
Let OLDN(h) denote the stratum h population size for old cohorts.
Let NEWN(h) denote the stratum h population size for new cohorts.
Let OLDCOUNT(h) denote the total old cohort cases in the stratum h frame.
Let NEWCOUNT(h) denote the total new cohort cases in the stratum h frame.
Let d = 1 to D denote the NSDR domains that receive a domain-level sample supplement.
Let DOMSAM(hd) denote the fixed domain-level allocation made to each stratum in domain
d and DOMSAM(+d) be the fixed sample size allocated across all strata in domain d and
DOMSAM(++) denote the total domain-level sample size allocated across all strata and all
domains.
Let DOMN(hd) denote the population size of stratum h in domain d and DOMN(+d) denote
the total population size of domain d.
Let PROPSAM(+i) be the total sample size set to be allocated proportionately to all strata in
Iteration i, where i = 1 to I, and let PROPSAM(hi) be the proportional sample allocated to
stratum h.
Let STSPSAM(hi) be the stratum-level sample size supplement allocated to stratum h in
Iteration i.
Let DESSAM(hi) be the desired total sample to be allocated to stratum h in Iteration i.
Let OLDDESSAM(hi) be the desired total sample to be allocated to the old cohort
substratum of stratum h in Iteration i.
Let NEWDESSAM(hi) be the desired total sample to be allocated to the new cohort
substratum of stratum h in Iteration i.
Let OLDACTSAM(hi) be the actual total sample that can be allocated to the old cohort
substratum of stratum h given the number of old and new cohort frame members in Iteration i.
Let NEWACTSAM(hi) be the actual total sample that can be allocated to the new cohort
substratum of stratum h given the number of old and new cohort frame members in Iteration i.
Let TOTACTSAM(hi) be the actual total sample allocated in Iteration i to old and new
cohorts and TOTACTSAM(+i) be the total actual sample allocated in Iteration i across all
strata.
Let MINSAM(h) be the minimum sample size to be allocated to stratum h before the finite
population adjustment.
Let ADJMINSAM(h) be the minimum sample size to be allocated to stratum h after the finite
population adjustment.
ITERATION 0
Set the values for the fixed domain-level allocation as:
[Equation not recoverable from the source.]
Define the minimum number of attempted interviews to be allocated to each stratum as:
[Equation not recoverable from the source.] (See Section 5 for details.)
Note that the starting sample size for Iteration i is determined at the end of Iteration i-1.
The exception is Iteration 1, where the starting value for the sample size to be allocated
proportionately across strata is set to PROPSAM(+1) = 40,000 - DOMSAM(++).
Each iteration from i = 1 to I then follows the steps for “Iteration i” below.
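The starting value for Iteration 1 can be illustrated with a small calculation. This is a minimal sketch: the 40,000 figure and the subtraction of DOMSAM(++) come from the text, while the domain names, the domain-level allocations, and the `starting_propsam` helper are hypothetical values and names chosen only for illustration.

```python
TOTAL_SAMPLE = 40_000  # overall figure stated in the text

# Hypothetical fixed domain-level allocations DOMSAM(+d); not taken from the report.
domsam_by_domain = {"domain_1": 1_500, "domain_2": 2_500}


def starting_propsam(total_sample, domsam_by_domain):
    """Return PROPSAM(+1) = total_sample - DOMSAM(++)."""
    domsam_total = sum(domsam_by_domain.values())  # DOMSAM(++)
    if domsam_total > total_sample:
        raise ValueError("domain-level allocation exceeds the total sample")
    return total_sample - domsam_total


propsam_1 = starting_propsam(TOTAL_SAMPLE, domsam_by_domain)
print(propsam_1)  # 36000 with the illustrative allocations above
```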
ITERATION i
Define the proportional sample to be allocated to stratum h in Iteration i as:
[Equation not recoverable from the source.]
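The proportional step described above can be sketched as follows. The report's own formula is not legible in this copy, so the sketch assumes the conventional form of proportional allocation, PROPSAM(hi) = PROPSAM(+i) × N(h) / N(+). The function name, the stratum labels, and the population sizes are hypothetical, and no rounding rule is applied because none is given here.

```python
def proportional_allocation(propsam_total, stratum_sizes):
    """Split propsam_total across strata in proportion to N(h).

    Assumes PROPSAM(hi) = PROPSAM(+i) * N(h) / N(+); this is the standard
    proportional-allocation rule, not a formula confirmed by the report.
    """
    n_total = sum(stratum_sizes.values())  # N(+)
    if n_total <= 0:
        raise ValueError("total population size must be positive")
    return {h: propsam_total * n_h / n_total for h, n_h in stratum_sizes.items()}


# Hypothetical stratum population sizes N(h), for illustration only.
stratum_sizes = {"h1": 1_000, "h2": 3_000, "h3": 6_000}
allocation = proportional_allocation(1_000, stratum_sizes)
print(allocation)
```

With these illustrative sizes, each stratum receives the same sampling fraction (10 percent), which is the defining property of proportional allocation.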
Allocate the desired stratum h sample to the old and new cohort substrata:
[Equations not recoverable from the source; only the term DOMSAM(hd) survives.]
Calculate the desired sample to be allocated to stratum h in domain d as:
[Equation not recoverable from the source.]
Define the stratum-level supplement to be allocated to stratum h as
[Equation not recoverable from the source.]
Determine the number of actual frame cases that can be allocated given the number of old
cohort frame members:
If OLDCOUNT(h)
File Type | application/pdf |
Author | Proudfoot, Steven L |
File Modified | 2015-06-19 |
File Created | 2015-06-19 |