Attachment L

Attachment L. Detailed Sampling and Weighting Plan - 06 08 2011.doc

2012 - 2014 National Youth Tobacco Survey (NYTS)


OMB: 0920-0621


D R A F T

2012 NYTS Sampling and Weighting Plan

The National Youth Tobacco Survey (NYTS) will employ a repeat cross-sectional design to develop national estimates of tobacco use behaviors and exposure to pro- and anti-tobacco influences among students enrolled in grades 6-12. The study represents the continuation of the NYTS cycles that took place in 1999, 2000, 2002, 2004, 2006, 2009 and 2011 conducted by ICF Macro.

The primary objectives of the NYTS are to develop estimates of tobacco use behaviors and exposure to pro- and anti-tobacco influences among students enrolled in middle school and high school grades; to identify differences related to demographic characteristics (age, grade, gender, and race/ethnicity); and to determine whether there are time trends in tobacco use behaviors and exposure to influences that promote or discourage tobacco use. Such information is required to support CDC’s responsibilities in providing technical assistance in the planning, monitoring, and evaluation of national, state, and local tobacco prevention and control programs.

As presented in this appendix, every effort has been made to maintain the methodology established in prior cycles of the NYTS to permit comparability across cycles. In doing so, we note that the 2009 cycle marked a change in design, as the NYTS moved to a design incorporating features of the Youth Risk Behavior Survey (YRBS) in order to support a more efficient, coordinated implementation. ICF Macro worked closely with CDC to develop a minimal set of changes to the design of the NYTS that would preserve comparability with past cycles and align the studies in a way that supports continuity of coordination.

Refinements to the NYTS design have also been made throughout the history of the study to account for changing demographics of the in-school population and to meet CDC’s policy needs. For example, current trends of increasing percentages of minority students will lead to more efficient sampling of minority students. The planned design increases the representation of black and Hispanic students by oversampling schools and areas with high concentrations of these minority groups.

This appendix first presents the sample design, including a discussion of our plan for achieving the target yields in terms of numbers of participating students. Section 2 discusses the weighting methods that support accurate, efficient estimates of student characteristics with respect to tobacco use and related behaviors. Section 3 discusses the estimation process in general, and sets out anticipated precision levels resulting from the planned sample sizes.

1. Sampling Design

The sampling universe for the 2012 NYTS will consist of all public, Catholic and other private school students in grades 6 through 12 in the 50 States and the District of Columbia. The sampling frame for schools will be constructed using files obtained from QED Inc. (now operating as MDR). The QED database, used as the basis for the sampling frame throughout the NYTS history, encompasses both private and public schools and includes the latest data from the Common Core of Data from the National Center for Education Statistics.

The NYTS design effectively oversamples black and Hispanic students by increasing the sampling intensity in the largest schools. More specifically, the oversampling is achieved by selecting schools and primary sampling units (PSUs) with probabilities proportional to size (PPS). By selecting larger schools with greater probabilities, the approach also assigns greater probabilities to high-concentration schools and areas. The oversampling of minorities is also achieved by targeting large schools for double class sampling.

The sample will be a stratified, three-stage cluster sample stratified by racial/ethnic status and urban versus rural. Within each stratum, a primary sampling unit (PSU), defined as a county or a group of counties, will be chosen without replacement at the first stage. In subsequent sampling stages, a probabilistic selection of schools and students will be made from the sample PSUs.

We plan to oversample black and Hispanic students by using PPS methods that will increase the probability of selection of large schools; these schools tend to have relatively high minority student enrollments. In addition, more classes will be selected in large schools which tend to have relatively high enrollment of minority students.

First Stage PSU

In the first stage, 100 PSUs will be randomly sampled using probability proportional to size methods. The measure of size will be student enrollment, a measure which tends to increase the yield of minority students (to the extent that larger schools have higher concentrations of minority students). Within each PSU, schools will be classified into two levels (high school, middle school) based on grade ranges and into three strata (Small, Medium, and Large) based on total enrollment, as defined in Section 1.4.

Second Stage Schools/SSU

In the second stage, we will draw two large schools from each PSU, one per level (high school and middle school). In addition, we will randomly select one small school per level and one medium school per level in randomly selected subsets of PSUs. Including the small and medium schools, a total of 244 schools will be selected, with the expectation that approximately 85%, or 208 of these schools, will participate in the survey. Because some schools may not have a complete set of grades for the school level (middle school or high school), the second-stage units (SSUs) are sometimes combinations of schools.

Third Stage Classes/Students

In large schools, we will select an average of 1.85 classes per grade by selecting 2 classes per grade in selected large schools, and one class per grade in the remaining schools. The double class sampling will take place in 80 large schools at each level (high school and middle school), randomly chosen from the sampled large schools. Exhibit 1 presents a summary of the sampling design features.

Exhibit 1 Key Sampling Design Features

| Sampling Stage | Sampling Units | Sample Size (Approximate) | Stratification | Measure of Size |
|---|---|---|---|---|
| 1 | PSUs: Counties or groups of counties | 100 | Urban vs. non-urban (2 strata); Minority concentration (8 strata) | Aggregate school size in target grades |
| 2 | Schools | 244 school selections: 200 large schools, 24 medium schools and 20 small schools | Small, Medium and Large; High-school vs. middle-school | Aggregate student enrollment |
| 3 | Classes/Students | 1 or 2 classes per grade, 24,500 participating students | | |



The following sections discuss two overarching design issues, with Section 1.1 discussing sample sizes, and Section 1.2 discussing the use of a measure of size as a means of oversampling minority students. Sections 1.3 through 1.5 discuss the three stages of sampling (PSUs, Schools or SSUs, and Students). Finally, Section 1.6 reviews replacement procedures.

1.1 Sample Sizes

This design is based on a requirement that the survey yield completed surveys from 24,500 students over the seven middle school and high school grades. Targeting 3,500 students per grade, the expected yield after attrition and non-response will be 10,500 middle school students (grades 6, 7, and 8) and 14,000 high school students (grades 9, 10, 11 and 12).

The original specifications for NYTS sample sizes were not given in terms of student yields; rather, they were specified in terms of the precision of the resulting estimates. Thus the NYTS was designed to produce key estimates accurate to within ± 5% at a 95% confidence level. Estimates by grade, gender, and grade by gender meet this standard. The same standard is used for the estimates for racial/ethnic groups by school level. Over the past several cycles of the NYTS, we have confirmed that sample sizes, and resulting student yields, were sufficient to achieve design goals in terms of precision.

For the 2012 NYTS design, we developed anticipated precision levels ensuring that this design does meet the original precision targets. The balance of this section develops the number of schools selected and students per school that, combined with anticipated response rates, will result in completed surveys from approximately 24,500 students.

As detailed in Section 1.4, “virtual” schools, or second stage units (SSUs), are constructed so as to contain a full complement of grades—6 to 8 for middle schools, and 9 to 12 for high schools. Schools are further classified by size based on grade-level enrollments; the definition of size strata is provided in Section 1.4. This allows us to ensure that a sampled school of a given size classification will be able to support the student sample sizes given in Exhibit 2.

Students are sampled from schools in intact classes. The design assumes a class size of 25 students for most schools (large schools). Large schools will support a draw of up to two standard classes per grade, and medium schools will support a draw of one. Small schools will not support the minimum draw of 25 students (one “class” per grade); in these schools we take all available students (an average of 12 per class). We also assume that the average class size is 20 students in medium schools.

Across the six cycles of the NYTS, school participation has averaged 90%, with a low of 83%. Student participation has averaged 91%, with a low of 88%. To be conservative, we have assumed slightly lower values in developing the sample design for the 2012 NYTS: 85% for schools and 85% for students.

Exhibit 2 summarizes the sample sizes planned for each school type. This table lists the number of schools drawn and participating schools. For participating schools, we list the number of students selected and, finally, the number of students responding.

Exhibit 2 Sample Size Projections for Participating Students on the 2012 NYTS

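| School Level / Size Class | Class Sampling | Number of Schools Selected | Number of Responding Schools | Classes Selected Per Grade | Number of Students Selected Per School | Number of Selected Students | Number of Responding Students | Responding Students Per Grade |
|---|---|---|---|---|---|---|---|---|
| High School, Large | Double class sampling | 80 | 68 | 2 | 200 | 13,600 | 11,560 | |
| High School, Large | Single class sampling | 20 | 17 | 1 | 100 | 1,700 | 1,445 | |
| High School, Medium | Single class sampling | 12 | 10 | 1 | 80 | 800 | 680 | |
| High School, Small | Single class sampling | 10 | 9 | n/a | 48 | 432 | 367 | |
| High School, Total Schools/SSU | | 122 | 104 | | | 16,532 | 14,052 | 3,513 |
| Middle School, Large | Double class sampling | 80 | 68 | 2 | 150 | 10,200 | 8,670 | |
| Middle School, Large | Single class sampling | 20 | 17 | 1 | 75 | 1,275 | 1,084 | |
| Middle School, Medium | Single class sampling | 12 | 10 | 1 | 60 | 600 | 510 | |
| Middle School, Small | Single class sampling | 10 | 9 | n/a | 36 | 324 | 275 | |
| Middle School, Total Schools/SSU | | 122 | 104 | | | 12,399 | 10,539 | |
| Grand Total Schools/SSU | | 244 | 208 | | | 28,931 | 24,591 | 3,513 |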

Note that schools in this table are second-stage units (SSUs) created by combining actual schools so that each virtual school unit has a complete set of grades for the level. These will be expanded to physical schools, as described below. We anticipate that the number of physical schools inducted into the sample will be roughly 270 using the historical average number of actual schools per SSU (virtual school).
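The arithmetic behind Exhibit 2 can be retraced with a short script. The following is a minimal sketch, not the production procedure: it takes the numbers of schools selected and students selected per school from the exhibit, applies the assumed 85% school and student participation rates, and rounds half up. The rounding rule and all function and variable names are assumptions chosen so that the sketch matches the published figures.

```python
# Sketch: retrace the Exhibit 2 yield projections from the stated design inputs.
# ASSUMPTION: counts are rounded half up; this rule is inferred from the exhibit.

SCHOOL_RATE_PCT = 85   # assumed school participation rate (Section 1.1)
STUDENT_RATE_PCT = 85  # assumed student participation rate (Section 1.1)

# (level, size class / class sampling, schools selected, students selected per school)
DESIGN = [
    ("HS", "large, double", 80, 200),
    ("HS", "large, single", 20, 100),
    ("HS", "medium", 12, 80),
    ("HS", "small", 10, 48),
    ("MS", "large, double", 80, 150),
    ("MS", "large, single", 20, 75),
    ("MS", "medium", 12, 60),
    ("MS", "small", 10, 36),
]


def pct_half_up(count, pct):
    """Return count * pct / 100 rounded half up, using integer arithmetic only."""
    return (2 * count * pct + 100) // 200


def project(design):
    """Accumulate responding schools, selected students and responding students by level."""
    totals = {}
    for level, _label, n_selected, per_school in design:
        resp_schools = pct_half_up(n_selected, SCHOOL_RATE_PCT)
        sel_students = resp_schools * per_school
        resp_students = pct_half_up(sel_students, STUDENT_RATE_PCT)
        row = totals.setdefault(level, {"schools": 0, "selected": 0, "responding": 0})
        row["schools"] += resp_schools
        row["selected"] += sel_students
        row["responding"] += resp_students
    return totals


TOTALS = project(DESIGN)
GRAND_TOTAL = sum(row["responding"] for row in TOTALS.values())
```

Under these assumptions the script returns 14,052 responding high school students and 10,539 responding middle school students, for a grand total of 24,591, which agrees with the exhibit.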

1.2 Measure of Size

The sampling approach will utilize probability proportional to size (PPS) sampling methods to achieve over-sampling of black and Hispanic students. In PPS sampling, when the measure of size is defined as the count of final-stage sampling units, and a fixed number of units are selected in the final stage, the result is an equal probability of selection for all members of the universe. For the NYTS, we approximate these conditions, and thus obtain a roughly self-weighting sample. This section describes the type of measure of size to be employed for selecting PSUs and schools to over-sample black and Hispanic students.
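The self-weighting property can be illustrated with a small two-stage example. This is a sketch under simplified assumptions (one stage of PPS school selection followed by a fixed take of students); the enrollments and the function name are hypothetical and are not taken from the NYTS frame.

```python
from fractions import Fraction


def student_probabilities(school_sizes, n_schools, take):
    """Overall student selection probability per school under PPS with a fixed take.

    Stage 1 selects n_schools with probability proportional to enrollment;
    the final stage selects `take` students from each selected school.
    """
    total = sum(school_sizes)
    probs = []
    for size in school_sizes:
        p_school = Fraction(n_schools * size, total)  # PPS selection of the school
        p_student = Fraction(take, size)              # fixed number of students drawn
        probs.append(p_school * p_student)
    return probs


# Hypothetical enrollments: the school size cancels out of the product.
PROBS = student_probabilities([200, 400, 1000], n_schools=1, take=25)
```

Every student has the same overall probability here, `n_schools * take / total`, regardless of the size of the school attended. The NYTS only approximates this because class sizes and the number of classes drawn vary across schools.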

As in previous NYTS cycles, the coefficients for the measure of size weighting function were developed using simulation studies that ensure that target sample sizes are met for the two racial/ethnic groups of analytic interest (black and Hispanic students). These target sample sizes are based on precision requirements stated in the initial NYTS design, and are developed in Section 3.2. The simulation studies, which investigate the relationship of various weighting functions to the resulting numbers and percentages of minority students in the obtained samples, showed that the targets could be met by a measure with equal coefficients. In other words, the measure of size was simplified to the total student enrollment.

The effectiveness of a weighted measure of size in achieving oversampling is dependent upon the distributions of black and Hispanic students in schools. For example, if U.S. schools had identical percentages of minorities in every school, then the sample of students from any sample of schools would mirror the national percentages and use of a weighted measure of size would fail to oversample blacks and Hispanics. We know this is not the case, however, as we find a great deal of clustering by race and ethnicity within schools. Application of a weighted measure of size in prior cycles of NYTS has been effective in oversampling blacks and Hispanics.

The measure of size will be used also to compute stratum sizes and PSU sizes.

1.3 First-Stage PSU Sampling

This section describes the first-stage selection of PSUs. It begins with the definition of PSUs, followed by their stratification, allocation and selection.

1.3.1 Definition of PSU

In defining PSUs, several issues are considered:

  1. Each PSU should be large enough to contain the requisite numbers of schools and students by grade, yet not so large as to be selected with near-certainty.

  2. Each PSU should be compact geographically so that field staff can go from school to school easily.

  3. There should be recent data available to characterize the PSUs.

  4. PSU definitions should be consistent with secondary sampling unit (school) definitions.

Generally, counties will be equivalent to PSUs with two exceptions: (1) low population counties are combined to provide sufficient numbers of schools and students, and (2) counties that are very large may be split to avoid becoming certainty or near-certainty PSUs. County population figures will be aggregated from school enrollment data for the grades of interest. Enrollment data are being obtained from the most recent Common Core of Data from the National Center for Education Statistics and the current school and school district data files of QED database, which are updated continuously.

The PSU frame for the 2012 NYTS will be formed directly from counties using methods developed by ICF Macro. The methods employ both student counts and geographic data to ensure that the PSUs have the correct number of schools and students, and that the PSUs are compact geographically.

1.3.2 Definition of Strata

The PSUs will be organized into 16 strata, based on urban/rural location and minority enrollment.

PSUs are classified as “urban” if they are in one of the 54 largest MSAs in the U.S.; otherwise, they are classified as “rural.” This creates two strata defined by urban status. If the percentage of Hispanic students in the PSU exceeds the percentage of black students, then the PSU is classified as Hispanic. Otherwise it is classified as black. This results in four strata defined by urban status and predominant minority.

Finally, within each of these strata, four density groupings are formed, multiplying the number of strata to 16. The approach involves the computation of optimum stratum boundaries using the cumulative square root of “f” method developed by Dalenius and Hodges. The boundaries or cutoffs change as the frequency distribution (“f”) for the racial groupings changes from one survey cycle to the next. Hispanic urban and Hispanic rural PSUs are classified into four density groupings depending upon the percentages of Hispanics in the PSU. Black urban and black rural PSUs are also classified into four groupings depending upon the percentages of blacks in the PSU.

The stratum boundaries are computed anew during the design of each cycle’s survey to reflect the changing distribution of minority concentration across schools and areas.
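The cumulative square root of “f” rule can be sketched as follows. This is an illustrative simplification, not the procedure used for the NYTS: the number of histogram bins, the choice of the first bin edge that reaches each cut point, and the input percentages are all assumptions.

```python
import math


def cum_sqrt_f_boundaries(values, n_strata=4, n_bins=20, lo=0.0, hi=100.0):
    """Stratum boundaries from the cumulative sqrt(f) rule on a binned distribution.

    ASSUMPTION: each boundary is the upper edge of the first bin whose cumulative
    sqrt(frequency) reaches k/n_strata of the total (a simplified version of the rule).
    """
    width = (hi - lo) / n_bins
    freq = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        freq[idx] += 1

    cum = []
    running = 0.0
    for f in freq:
        running += math.sqrt(f)
        cum.append(running)

    step = cum[-1] / n_strata
    boundaries = []
    for k in range(1, n_strata):
        idx = next(i for i, c in enumerate(cum) if c >= k * step)
        boundaries.append(lo + (idx + 1) * width)
    return boundaries


# Hypothetical PSU minority percentages, heavily skewed toward low concentrations.
PCT_MINORITY = [1] * 50 + [10] * 20 + [30] * 15 + [60] * 10 + [90] * 5
BOUNDARIES = cum_sqrt_f_boundaries(PCT_MINORITY)
```

Because the rule equalizes the cumulative square root of the frequencies rather than the frequencies themselves, sparse high-concentration PSUs are separated into their own density groupings instead of being absorbed into one broad upper stratum.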

1.3.3 Allocation of the PSU sample

We will design and select a sample of 100 PSUs. In order to stay as close as possible to maximum sample efficiency in terms of precision, the initial allocation of PSUs will be made proportional to student enrollment. Then, we will make adjustments to the initial allocation to meet minority targets, and evaluate these adjustments using sample simulations. This entire process is the same as that used in prior cycles of NYTS.

1.3.4 Selection of PSUs

Within each first-stage stratum, the PSUs will be sorted by five-digit zip code to attain a form of implicit geographic stratification. Implicit stratification, coupled with the probability proportional to size (PPS) sampling method, will ensure geographic sample representation.

The following systematic sampling procedures will be applied to the stratified frame to select a PPS sample of PSUs.

  1. Select 100 PSUs with a systematic, probability-proportional-to-size draw within each stratum. This method uses a sampling interval computed as the sum of the measures of size for the PSUs in the stratum divided by the number of PSUs to be selected in that stratum.

  2. Subsample at random 10 of the 100 sampled PSUs for small school sampling and 12 of the 100 sampled PSUs for medium school sampling.
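The systematic PPS draw in step 1 can be sketched for a single stratum as shown below. The frame is assumed to be already sorted (for example by zip code), and the sizes, the function name and the use of Python's `random` module are illustrative assumptions rather than the production implementation.

```python
import random


def systematic_pps(sizes, n_select, rng=random):
    """Systematic PPS selection of n_select units from a sorted frame.

    The sampling interval is the total measure of size divided by n_select;
    a random start in [0, interval) fixes all selection points.
    Units larger than the interval can be hit more than once.
    """
    total = sum(sizes)
    interval = total / n_select
    start = rng.random() * interval
    points = [start + k * interval for k in range(n_select)]

    selected = []
    cumulative = 0.0
    next_point = 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Select this unit for every selection point falling inside its size range.
        while next_point < n_select and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    return selected


# Hypothetical PSU enrollments within one stratum, in frame (sorted) order.
PSU_SIZES = [120, 300, 80, 500, 200, 400, 150, 250]
SAMPLE = systematic_pps(PSU_SIZES, n_select=3, rng=random.Random(0))
```

Because the selection points are evenly spaced along the cumulated measure of size, the draw spreads across the sorted frame, which is what gives the implicit geographic stratification its effect.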

1.4 Second Stage—Selection of Schools

Schools will be stratified by school level (middle schools and high schools) and by size. Middle schools are those that contain any of grades 6, 7, or 8, and high schools are those that contain any of grades 9 through 12. Schools that contain a mix of high- and middle-school grades will be split into two sampling units, one for each level.

We define three school size strata: small, medium and large. Small schools are defined first as those schools with 25 or fewer students in one or more of the eligible grades for the level. Of schools that are not small, medium schools are defined as having fewer than 50 students in one or more of the eligible grades for the level. The remaining schools, those that have at least 50 students in each grade, are considered large.
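The size rule above depends only on the smallest grade enrollment among the eligible grades for the level. A minimal sketch, with a hypothetical function name and made-up enrollments:

```python
def classify_school_size(grade_enrollments):
    """Classify a school (or SSU) by its smallest eligible-grade enrollment.

    small:  25 or fewer students in at least one eligible grade
    medium: not small, and fewer than 50 students in at least one eligible grade
    large:  at least 50 students in every eligible grade
    """
    smallest = min(grade_enrollments)
    if smallest <= 25:
        return "small"
    if smallest < 50:
        return "medium"
    return "large"


# Hypothetical high school enrollments for grades 9 through 12.
EXAMPLE_CLASS = classify_school_size([140, 135, 48, 120])
```

In this example a single grade with 48 students is enough to place the school in the medium stratum, even though the other grades would support double class sampling.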

Schools are classified as “whole” for high schools if they have all high-school grades 9 to 12, and whole for middle schools if they have all grades 6 to 8. Otherwise, they are considered “fragment” schools. Fragment schools are component schools that are linked with other schools (fragment or whole) to form a cluster school that has a complete set of grades for the level (four grades for high schools). This process is illustrated in Figure 1, where fragment school A is linked with whole school B to form a cluster school, or Secondary Sampling Unit (SSU) XXX. We plan to link schools before sampling using an algorithm previously used in the NYTS that combines geographically proximate schools. Cluster schools are then treated as a single school, or sampling unit, during sampling, with selection performed at the grade level as described below.

[Figure 1: Cluster School Construction and Grade Sampling for High Schools]

For large schools, one high school and one middle school will be selected with PPS systematic sampling within a PSU. The schools will be selected into the sample with probability proportional to the weighted measure of size.

Small and medium schools will be sampled independently from large schools; they will be selected by drawing two mutually exclusive subsamples of PSUs, one subsample of 12 PSUs for medium school sampling and one subsample of 10 PSUs for small school sampling. In the first subsample, one medium high school and one medium middle school will be drawn from each of the 12 PSUs. In the other subsample, one small high school and one small middle school will be drawn from each of the 10 PSUs. Medium and small schools will be selected in each sub-sampled PSU with probability proportional to size, using the weighted measure of size.

1.5 Third Stage—Selection of Students

1.5.1 Selection of Grades and Classes

Except for cluster schools, classes are selected from all eligible grades in each school. In school clusters, grade samples are selected independently with one component school being selected for each grade.

The method of selecting students will vary from school to school, depending upon the organization of that school and whether a cluster of schools is involved. The key element of the school sampling strategy is to identify a structure that partitions the students into mutually exclusive, collectively exhaustive groupings that are of approximately equal sizes and that are accessible. Beyond that basic requirement, we will do the partitioning to result in groups in which both genders and students of all ability levels are represented. In selecting classes, we will generally give preference to selecting from mandatory courses such as English. Another option is to select from all classes that meet during a particular time of day such as all second or third period classes.

We will not use special procedures to sample for minorities at the school building level for two reasons:

  • Schools do not maintain student rosters that identify students by racial/ethnic affiliation.

  • We feel this would be viewed by many schools as an offensive practice.

We plan to select one or two classes per grade from each participating school. Two classes per grade are selected in schools classified as “high minority.” In the case of school clusters, we will conduct our sampling on a grade by grade basis. At each grade, we will determine the identity of all schools in the cluster with students in that grade. If each school has enough students in the grade, then we will pick randomly one of the schools with probability proportional to grade enrollment and then select all of the classes from that school.

A “class” will be defined by our sampling team so that it meets size and composition requirements before the sampling is done. For example, two small classes may be combined and treated as one for sampling purposes, or boys’ and girls’ physical education classes may be combined. This approach is an efficient method of data collection in schools and also has the advantage of using the classroom teacher to distribute consent forms; hence, it tends to yield higher student participation rates. The disadvantage of this approach is that the sampling design tends to be less efficient because students within a class section tend to be more homogeneous than the entire student population within a school. The effect of this inefficiency has been accounted for in our estimates of the design effect of the study.

1.5.2 Selection of Students

All eligible students in a selected class will be invited to take the survey.

1.6 Replacement of Schools/School Systems

We will not replace refusing school districts, schools, classes, or students. We have allowed for school and student non-response by inflating the sample sizes to account for non-response. With this approach, all schools can be contacted in a coordinated recruitment effort, which is not possible for methods that allow for replacing schools.

2. Weighting

The final data set will be weighted to reflect the initial probabilities of selection and non-response patterns, to mitigate large variations in sampling weights, and to post-stratify the data to known control characteristics.

2.1 Initial Weighting

The basic weights will be computed as the reciprocal of the probability of selection of that case. If $k_i$ is the number of PSUs to be selected from stratum $i$, $N_i$ is the size of stratum $i$, and $N_{ij}$ is the size of PSU $j$ in stratum $i$ (in all cases size refers to our proposed measure of size), then the probability of selection of PSU $j$ is $k_i N_{ij} / N_i$.

Assuming that one school is to be selected in stratum $i$, and $N_{ijr}$ is the size of school $r$ in PSU $j$ in stratum $i$, then the conditional probability of selection of the school given the selection of the PSU is $N_{ijr} / N_{ij}$. If $C_{ijr}$ is the number of classes in school $ijr$ and $m$ is the number of classes to be selected, then the conditional probability of selection of a class is $m / C_{ijr}$. Since all students are selected, the conditional probability of selection of a student given the selection of the class is unity.

The overall probability of selection of a student in stratum $i$ is the product of the three conditional probabilities of selection:

  • $k_i N_{ij} / N_i$

  • $N_{ijr} / N_{ij}$

  • $m / C_{ijr}$

This product, the student probability of selection, simplifies to a factor proportional to
$k_i N_{ijr} / (N_i C_{ijr})$

For cluster schools, this probability is adjusted for the grade selection procedure. Therefore, the probabilities of selection will be the same for all students in a given non-cluster school, and roughly equal in cluster schools.
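For a non-cluster school, the probability of selection and the resulting base weight can be computed directly from the three factors in Section 2.1. The sketch below uses exact fractions; the function name and all numeric inputs are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction


def student_selection_probability(k_i, N_i, N_ij, N_ijr, m, C_ijr):
    """Product of the PSU, school and class selection probabilities.

    k_i   : number of PSUs selected from stratum i
    N_i   : measure of size of stratum i
    N_ij  : measure of size of PSU j in stratum i
    N_ijr : measure of size of school r in PSU j
    m     : number of classes selected in the school
    C_ijr : number of classes in the school
    """
    p_psu = Fraction(k_i * N_ij, N_i)
    p_school = Fraction(N_ijr, N_ij)
    p_class = Fraction(m, C_ijr)
    return p_psu * p_school * p_class


# Hypothetical inputs: 5 PSUs drawn from a stratum of 1,000,000 students.
P_STUDENT = student_selection_probability(
    k_i=5, N_i=1_000_000, N_ij=40_000, N_ijr=2_000, m=2, C_ijr=20
)
BASE_WEIGHT = 1 / P_STUDENT  # reciprocal of the selection probability
```

Note that the PSU size cancels out of the product, so the base weight depends on the stratum size, the school size, and the class sampling rate only.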

2.2 Non-response Adjustments

Several adjustments are planned to account for student and school non-response patterns. An adjustment for student non-response will be made using gender and grade within school. With this adjustment, the sum of the student weights over participating students within a school matches the total enrollment by grade in the school. This adjustment factor will be capped in extreme situations, such as when only one or two students respond in a school, to limit the potential effects of extreme weights (i.e., unequal weighting effects on survey variances).

The weights of students in participating schools will be adjusted to account for nonparticipation by other schools. The adjustment uses the ratio of the weighted sum of measures of size over all selected schools in the stratum (numerator of adjustment factor), and over the subset of participating schools in a stratum (denominator of adjustment factor). The adjustment factor will be computed and applied to small and large schools separately.
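The school non-response factor described above is a ratio of two weighted sums of measures of size within an adjustment cell. A minimal sketch follows; the record layout (dictionary keys) and the example values are assumptions made for illustration.

```python
def school_nonresponse_factor(schools):
    """Ratio of weighted measure of size over all selected schools to that over
    participating schools, within one adjustment cell (e.g., stratum by size class).

    Each school is a dict with hypothetical keys 'weight', 'mos' and 'participated'.
    """
    numerator = sum(s["weight"] * s["mos"] for s in schools)
    denominator = sum(s["weight"] * s["mos"] for s in schools if s["participated"])
    if denominator == 0:
        raise ValueError("no participating schools in this adjustment cell")
    return numerator / denominator


# Hypothetical cell with three selected schools, one of which refused.
CELL = [
    {"weight": 10.0, "mos": 500, "participated": True},
    {"weight": 10.0, "mos": 300, "participated": False},
    {"weight": 20.0, "mos": 200, "participated": True},
]
FACTOR = school_nonresponse_factor(CELL)
```

The factor is always at least 1, and multiplying the student weights in participating schools by it restores the weighted measure of size represented by the full selected sample in the cell.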

2.3 Weight Trimming

Extreme variation in sampling weights can cause inflated sampling variances, and offset the precision gained from a well-designed sampling plan. This variation can occur, for example, if the number of respondents in a particular school is very low, causing a large non-response adjustment factor to be computed. One strategy to compensate for this is to trim extreme weights and distribute the trimmed weight among the untrimmed weights.

The trimming procedure will be iterative. In each iteration, an optimal weight, $W_o$, is calculated from the sum of the squared weights in the sample. Then, each weight $W_i$ is marked and trimmed if it exceeds that optimal weight. The trimmed weight is summed within grade and spread out proportionally over the unmarked cases in the grade. This process is repeated until little or no weight is being trimmed. Weight trimming is done within stratum.

The trimming procedure is set to redistribute 5% of the weight on the first iteration.
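The iterative logic can be sketched for a single grade within a stratum. The plan does not give the formula for the optimal weight, so the cutoff used below, the square root of a constant times the mean squared weight, is an assumption borrowed from a commonly used trimming rule; the constant `c`, the iteration cap and the tolerance are likewise hypothetical.

```python
import math


def trim_weights(weights, c=10.0, max_iter=100, tol=1e-9):
    """Iteratively trim large weights and redistribute the excess proportionally.

    ASSUMPTION: cutoff = sqrt(c * mean of squared weights). The plan states only
    that the cutoff is derived from the sum of squared weights.
    """
    w = [float(x) for x in weights]
    for _ in range(max_iter):
        cutoff = math.sqrt(c * sum(x * x for x in w) / len(w))
        marked = [x > cutoff for x in w]
        excess = sum(x - cutoff for x, is_marked in zip(w, marked) if is_marked)
        unmarked_total = sum(x for x, is_marked in zip(w, marked) if not is_marked)
        if excess <= tol or unmarked_total == 0.0:
            break  # little or no weight left to trim
        # Spread the trimmed weight proportionally over the unmarked cases.
        scale = 1.0 + excess / unmarked_total
        w = [cutoff if is_marked else x * scale for x, is_marked in zip(w, marked)]
    return w


# Hypothetical grade cell: nineteen typical weights and one extreme weight.
ORIGINAL = [1.0] * 19 + [30.0]
TRIMMED = trim_weights(ORIGINAL)
```

Each pass preserves the sum of the weights in the cell, so the procedure reduces the variation in the weights without changing the weighted population total for the grade.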

2.4 Post-stratification to National Estimates

National estimates of racial/ethnic percentages were obtained from two sources. Private school enrollments by grade and five racial/ethnic groups were obtained from the Private School Universe Survey (PSS), and public school enrollments by grade, gender, and five racial/ethnic categories were obtained from the Common Core of Data (CCD), both produced by the National Center for Education Statistics (NCES).

These databases were combined to produce the enrollments for all schools, and to develop population percentages to use as controls in the post-stratification step. For post-stratification purposes, a unique race/ethnicity is assigned to respondents with missing data on race/ethnicity, those with an “Other” classification, and those reporting multiple races. For private schools, we use two race/ethnic classifications—white and non-white. For public schools we use the full five categories.

Given a national estimate of $R_a$ and a weighted population estimate of $P_a$ for race category $a$ in some grade, the simple post-stratification factor would be the ratio $R_a / P_a$ for each race and grade.
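The post-stratification factor is a cell-by-cell ratio of the control total to the weighted sample estimate. A minimal sketch for the race categories within one grade follows; the category labels and totals are hypothetical and serve only to show the calculation.

```python
def poststratification_factors(control_totals, weighted_estimates):
    """Ratio of the national control total to the weighted sample estimate, per cell."""
    return {
        cell: control_totals[cell] / weighted_estimates[cell]
        for cell in control_totals
    }


# Hypothetical control totals and weighted estimates for one grade (in thousands).
CONTROLS = {"white": 600.0, "black": 150.0}
ESTIMATES = {"white": 500.0, "black": 200.0}
FACTORS = poststratification_factors(CONTROLS, ESTIMATES)
```

Multiplying each respondent's weight by the factor for the respondent's cell makes the weighted totals match the national controls; a factor above 1 indicates a cell that was under-represented after the earlier adjustments.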

3. Estimators, Variance and Precision

This section describes the computation of weighted survey estimators and the appropriate measures of sampling variability, or sampling error. These measures, such as variances and standard errors, will support the construction of confidence intervals and other statistical inference such as statistical testing. We then go on to look at the precision of the resulting estimates, and demonstrate that these do meet the original NYTS design specifications.

3.1 Survey Estimators and Variance Estimators

If $w_i$ is the weight of case $i$ (the inverse of the probability of selection, adjusted for non-response and post-stratification) and $x_i$ is a characteristic of case $i$ (e.g., $x_i = 1$ if student $i$ smokes, and zero otherwise), then the mean of characteristic $x$ will be $(\sum_i w_i x_i) / (\sum_i w_i)$. A population total would be computed similarly as $\sum_i w_i x_i$. Weighted population estimates will be computed with the Statistical Analysis System (SAS) and SUDAAN software.
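The two point estimators can be written in a few lines. This sketch covers the point estimates only; it does not replace the SAS or SUDAAN procedures, which are also needed for design-based variance estimation. The weights and indicator values are hypothetical.

```python
def weighted_total(weights, values):
    """Estimated population total: sum of w_i * x_i."""
    return sum(w * x for w, x in zip(weights, values))


def weighted_mean(weights, values):
    """Estimated population mean or proportion: sum(w_i * x_i) / sum(w_i)."""
    return weighted_total(weights, values) / sum(weights)


# Hypothetical final weights and smoking indicators (1 = smokes, 0 = does not).
WEIGHTS = [100.0, 200.0, 300.0]
SMOKES = [1, 0, 1]
PREVALENCE = weighted_mean(WEIGHTS, SMOKES)
TOTAL_SMOKERS = weighted_total(WEIGHTS, SMOKES)
```

With a 0/1 indicator, the weighted mean is the estimated prevalence and the weighted total is the estimated number of students with the characteristic.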

Sampling variances that account for the complex sampling design will be estimated using the method of general linearized estimators [1] as implemented in SUDAAN [2] or the SAS Version 9.1 survey procedures. These software packages permit estimation of sampling variances for multistage stratified sampling designs, accounting for unequal weighting, and for sample clustering and stratification. These software packages require the specification of sampling stages and sampling parameters (strata, PSU).

3.2 Expected Precision

This section reviews the expected precision for survey estimates resulting from the planned sample sizes. The sample sizes conform to the requirement that the study produce an overall yield of 24,500 students, based on assumptions about the distribution of the sample across demographic subgroups.

The derivation of sample sizes is driven by the precision levels required for subgroup estimates, specifically for the smallest subgroups defined by grade and by gender. With a sample size of about 3,500 participants by grade—totals of 10,500 and 14,000 for middle school and high school grades, respectively—the design will ensure the required precision levels for design effects as large as 3.0. This section shows that estimates for subgroups of size 1,750, such as those defined by grade and by gender, will achieve the +/-5% precision levels for 95% confidence intervals.

Based on the prior NYTS studies, we can expect the following subgroup estimates to be within ± 5% at the 95% confidence level (which requires standard errors of 2.5% or less) and, therefore, to meet the precision requirements that have driven the sampling design from its inception:

  • Estimates by grade, gender, and grade by gender

  • Minority group estimates by school level (middle school versus high school) for black and Hispanic students

These precision estimates are based on the following projected numbers of participating students, also detailed in Exhibit 3:

  • Approximately 3,517 students per grade or 1,758 per grade by gender (with a 50% gender distribution)

  • At least 2,096 black students per school level; and 2,260 Hispanic students per school level

As highlighted in Exhibit 3 below, the sample design will yield 2,275 and 2,096 black students for high schools and middle schools, respectively. The exhibit also shows that higher numbers are expected for Hispanic students, 2,563 for high schools and 2,260 for middle schools.

Exhibit 3 Expected Minority Yields

Part I: High Schools

| School Strata | Number of Schools | Expected Number of Black Students | Expected Number of Hispanic Students |
|---|---|---|---|
| Large HS, Double Class | 80 | 1,850 | 2,081 |
| Large HS, Single Class | 20 | 231 | 260 |
| Medium HS | 12 | 121 | 139 |
| Small HS | 10 | 73 | 83 |
| Total | 122 | 2,275 | 2,563 |

Part II: Middle Schools

| School Strata | Number of Schools | Expected Number of Black Students | Expected Number of Hispanic Students |
|---|---|---|---|
| Large MS, Double Class | 80 | 1,734 | 1,908 |
| Large MS, Single Class | 20 | 217 | 238 |
| Medium MS | 12 | 91 | 52 |
| Small MS | 10 | 54 | 62 |
| Total | 122 | 2,096 | 2,260 |

The design was also driven by the smallest subgroups defined by minority group—black and Hispanic students—and by school level. Exhibit 4 below shows how these levels of precision are attained for these racial/ethnic subgroup estimates. The exhibit also considers estimates defined by gender separately for each school level (Middle School, High School).

The exhibit considers three design effect scenarios (DEFF values of 1, 2 and 3), and shows that expected standard errors are 2.07% or less even for the most conservative scenarios (DEFF=3). These standard errors are comfortably below the levels necessary for the required levels of precision (SE=2.5%).

Exhibit 4 Standard Error of Subgroup Estimates


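| Subgroup | School Level | Sample Size | Standard Error (Max), DEFF=1 | Standard Error (Max), DEFF=2 | Standard Error (Max), DEFF=3 |
|---|---|---|---|---|---|
| Black students | Middle School | 2,022 | 1.11% | 1.57% | 1.93% |
| Black students | High School | 2,262 | 1.05% | 1.49% | 1.82% |
| Black students | Overall | 4,284 | 0.76% | 1.08% | 1.32% |
| Hispanic students | Middle School | 2,241 | 1.06% | 1.49% | 1.83% |
| Hispanic students | High School | 2,554 | 0.99% | 1.40% | 1.71% |
| Hispanic students | Overall | 4,795 | 0.72% | 1.02% | 1.25% |
| Gender and level | | 1,750 | 1.20% | 1.69% | 2.07% |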
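The standard errors in Exhibit 4 are consistent with the usual design-effect-adjusted formula for a proportion, evaluated at the proportion that maximizes the variance. The sketch below assumes p = 0.5, which is our reading of the “(Max)” label rather than something stated in the exhibit; the function names are hypothetical.

```python
import math


def max_standard_error(n, deff=1.0, p=0.5):
    """Standard error of an estimated proportion: sqrt(deff * p * (1 - p) / n).

    ASSUMPTION: p = 0.5 gives the maximum standard error for a given n and deff.
    """
    return math.sqrt(deff * p * (1.0 - p) / n)


def margin_of_error(standard_error, z=1.96):
    """Half-width of an approximate 95% confidence interval."""
    return z * standard_error


# Smallest subgroup considered in Exhibit 4 (gender by level), most conservative DEFF.
SE_WORST = max_standard_error(1750, deff=3.0)
MOE_WORST = margin_of_error(SE_WORST)
```

For the smallest subgroup (n = 1,750) with a design effect of 3, the standard error is about 2.07% and the half-width of the 95% confidence interval is about 4.1 percentage points, inside the ± 5% requirement. A standard error of exactly 2.5% corresponds to a half-width of 4.9 points.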

It should be noted that confidence intervals vary depending upon whether an estimate represents the full population or a subset, such as a particular grade, gender, or racial/ethnic group. Within a grouping, they also vary depending on the level of the estimate and the design effect associated with the measure.



[1] Skinner CJ, Holt D, and Smith TMF, Analysis of Complex Surveys, John Wiley & Sons, New York, 1989, pp. 50.

[2] Shah BV, Barnwell GG, Bieler GS. SUDAAN: software for the statistical analysis of correlated data, release 7.5, 1997 [user’s manual]. Research Triangle Park, NC: Research Triangle Institute; 1997.



File Type: application/msword
Author: Windows User
Last Modified By: arp5
File Modified: 2011-06-08
File Created: 2011-06-01
