Attachment 7 2008

National Health and Nutrition Examination Survey (NHANES)--2009-2010

OMB: 0920-0237

Attachment 7 - Sampling Information




A. Sampling Information Tables


Table 1: Sampling domains and target yearly examination sample sizes for NHANES 2007-2010


Race/ethnicity-sex-age-income sampling domain  Target number of examined SPs
Black, non-Hispanic         M&F  0-11 mos.     50
Black, non-Hispanic         M&F  1-2 yrs.      85
Black, non-Hispanic         M&F  3-5 yrs.      85
Black, non-Hispanic         M    6-11 yrs.     85
Black, non-Hispanic         M    12-19 yrs.    92
Black, non-Hispanic         M    20-39 yrs.    105
Black, non-Hispanic         M    40-49 yrs.    53
Black, non-Hispanic         M    50-59 yrs.    53
Black, non-Hispanic         M    60+ yrs.      105
Black, non-Hispanic         F    6-11 yrs.     85
Black, non-Hispanic         F    12-19 yrs.    85
Black, non-Hispanic         F    20-39 yrs.    105
Black, non-Hispanic         F    40-49 yrs.    53
Black, non-Hispanic         F    50-59 yrs.    53
Black, non-Hispanic         F    60+ yrs.      105
Black, non-Hispanic--Overall                   1,197

Hispanic                    M&F  0-11 mos.     104
Hispanic                    M&F  1-2 yrs.      100
Hispanic                    M&F  3-5 yrs.      100
Hispanic                    M    6-11 yrs.     100
Hispanic                    M    12-19 yrs.    102
Hispanic                    M    20-39 yrs.    140
Hispanic                    M    40-49 yrs.    70
Hispanic                    M    50-59 yrs.    70
Hispanic                    M    60+ yrs.      150
Hispanic                    F    6-11 yrs.     100
Hispanic                    F    12-19 yrs.    102
Hispanic                    F    20-39 yrs.    140
Hispanic                    F    40-49 yrs.    70
Hispanic                    F    50-59 yrs.    70
Hispanic                    F    60+ yrs.      147
Hispanic--Overall                              1,565

Low Income White/Other      M&F  0-11 mos.     45
Low Income White/Other      M&F  1-2 yrs.      54
Low Income White/Other      M&F  3-5 yrs.      54
Low Income White/Other      M    6-11 yrs.     27
Low Income White/Other      M    12-19 yrs.    27
Low Income White/Other      M    20-29 yrs.    31
Low Income White/Other      M    30-39 yrs.    31
Low Income White/Other      M    40-49 yrs.    31
Low Income White/Other      M    50-59 yrs.    31
Low Income White/Other      M    60-69 yrs.    31
Low Income White/Other      M    70-79 yrs.    31
Low Income White/Other      M    80+ yrs.      20
Low Income White/Other      F    6-11 yrs.     27
Low Income White/Other      F    12-19 yrs.    27
Low Income White/Other      F    20-29 yrs.    31
Low Income White/Other      F    30-39 yrs.    31
Low Income White/Other      F    40-49 yrs.    31
Low Income White/Other      F    50-59 yrs.    31
Low Income White/Other      F    60-69 yrs.    31
Low Income White/Other      F    70-79 yrs.    31
Low Income White/Other      F    80+ yrs.      31

Non-Low Income White/Other  M&F  0-11 mos.     70
Non-Low Income White/Other  M&F  1-2 yrs.      70
Non-Low Income White/Other  M&F  3-5 yrs.      70
Non-Low Income White/Other  M    6-11 yrs.     70
Non-Low Income White/Other  M    12-19 yrs.    71
Non-Low Income White/Other  M    20-29 yrs.    79
Non-Low Income White/Other  M    30-39 yrs.    81
Non-Low Income White/Other  M    40-49 yrs.    82
Non-Low Income White/Other  M    50-59 yrs.    79
Non-Low Income White/Other  M    60-69 yrs.    80
Non-Low Income White/Other  M    70-79 yrs.    79
Non-Low Income White/Other  M    80+ yrs.      70
Non-Low Income White/Other  F    6-11 yrs.     70
Non-Low Income White/Other  F    12-19 yrs.    68
Non-Low Income White/Other  F    20-29 yrs.    75
Non-Low Income White/Other  F    30-39 yrs.    79
Non-Low Income White/Other  F    40-49 yrs.    79
Non-Low Income White/Other  F    50-59 yrs.    75
Non-Low Income White/Other  F    60-69 yrs.    72
Non-Low Income White/Other  F    70-79 yrs.    67
Non-Low Income White/Other  F    80+ yrs.      68
White/Other--Overall                           2,238

Overall Total                                  5,000



Table 2. Expected NHANES 1-year, 2-year, 3-year, and 4-year sample sizes by sampling domain with 15 PSUs per year

Race/ethnicity-sex-age-income sampling domain  1 year  2 year  3 year  4 year
Black, non-Hispanic         M&F  0-11 mos.*    50      100     150     200
Black, non-Hispanic         M&F  1-2 yrs.      85      170     255     340
Black, non-Hispanic         M&F  3-5 yrs.      85      170     255     340
Black, non-Hispanic         M    6-11 yrs.     85      170     255     340
Black, non-Hispanic         M    12-19 yrs.    92      184     276     368
Black, non-Hispanic         M    20-39 yrs.    105     210     315     420
Black, non-Hispanic         M    40-49 yrs.    53      105     158     210
Black, non-Hispanic         M    50-59 yrs.    53      105     158     210
Black, non-Hispanic         M    60+ yrs.      105     210     315     420
Black, non-Hispanic         F    6-11 yrs.     85      170     255     340
Black, non-Hispanic         F    12-19 yrs.    85      170     255     340
Black, non-Hispanic         F    20-39 yrs.    105     210     315     420
Black, non-Hispanic         F    40-49 yrs.    53      105     158     210
Black, non-Hispanic         F    50-59 yrs.    53      105     158     210
Black, non-Hispanic         F    60+ yrs.      105     210     315     420
Black, non-Hispanic--Overall                   1,197   2,394   3,591   4,788

Hispanic                    M&F  0-11 mos.*    104     208     312     416
Hispanic                    M&F  1-2 yrs.      100     200     300     400
Hispanic                    M&F  3-5 yrs.      100     200     300     400
Hispanic                    M    6-11 yrs.     100     200     300     400
Hispanic                    M    12-19 yrs.    102     204     306     408
Hispanic                    M    20-39 yrs.    140     280     420     560
Hispanic                    M    40-49 yrs.    70      140     210     280
Hispanic                    M    50-59 yrs.    70      140     210     280
Hispanic                    M    60+ yrs.      150     300     450     600
Hispanic                    F    6-11 yrs.     100     200     300     400
Hispanic                    F    12-19 yrs.    102     204     306     408
Hispanic                    F    20-39 yrs.    140     280     420     560
Hispanic                    F    40-49 yrs.    70      140     210     280
Hispanic                    F    50-59 yrs.    70      140     210     280
Hispanic                    F    60+ yrs.      147     294     441     588
Hispanic--Overall                              1,565   3,130   4,695   6,260

Low Income White/Other      M&F  0-11 mos.*    45      90      135     180
Low Income White/Other      M&F  1-2 yrs.      54      108     162     216
Low Income White/Other      M&F  3-5 yrs.      54      108     162     216
Low Income White/Other      M    6-11 yrs.     27      54      81      108
Low Income White/Other      M    12-19 yrs.    27      54      81      108
Low Income White/Other      M    20-29 yrs.    31      62      93      124
Low Income White/Other      M    30-39 yrs.    31      62      93      124
Low Income White/Other      M    40-49 yrs.    31      62      93      124
Low Income White/Other      M    50-59 yrs.    31      62      93      124
Low Income White/Other      M    60-69 yrs.    31      62      93      124
Low Income White/Other      M    70-79 yrs.    31      62      93      124
Low Income White/Other      M    80+ yrs.      20      40      60      80
Low Income White/Other      F    6-11 yrs.     27      54      81      108
Low Income White/Other      F    12-19 yrs.    27      54      81      108
Low Income White/Other      F    20-29 yrs.    31      62      93      124
Low Income White/Other      F    30-39 yrs.    31      62      93      124
Low Income White/Other      F    40-49 yrs.    31      62      93      124
Low Income White/Other      F    50-59 yrs.    31      62      93      124
Low Income White/Other      F    60-69 yrs.    31      62      93      124
Low Income White/Other      F    70-79 yrs.    31      62      93      124
Low Income White/Other      F    80+ yrs.      31      62      93      124

Non-Low Income White/Other  M&F  0-11 mos.*    70      140     210     280
Non-Low Income White/Other  M&F  1-2 yrs.      70      140     210     280
Non-Low Income White/Other  M&F  3-5 yrs.      70      140     210     280
Non-Low Income White/Other  M    6-11 yrs.     70      140     210     280
Non-Low Income White/Other  M    12-19 yrs.    71      142     213     284
Non-Low Income White/Other  M    20-29 yrs.    79      158     237     316
Non-Low Income White/Other  M    30-39 yrs.    81      162     243     324
Non-Low Income White/Other  M    40-49 yrs.    82      164     246     328
Non-Low Income White/Other  M    50-59 yrs.    79      158     237     316
Non-Low Income White/Other  M    60-69 yrs.    80      160     240     320
Non-Low Income White/Other  M    70-79 yrs.    79      158     237     316
Non-Low Income White/Other  M    80+ yrs.      70      140     210     280
Non-Low Income White/Other  F    6-11 yrs.     70      140     210     280
Non-Low Income White/Other  F    12-19 yrs.    68      136     204     272
Non-Low Income White/Other  F    20-29 yrs.    75      150     225     300
Non-Low Income White/Other  F    30-39 yrs.    79      158     237     316
Non-Low Income White/Other  F    40-49 yrs.    79      158     237     316
Non-Low Income White/Other  F    50-59 yrs.    75      150     225     300
Non-Low Income White/Other  F    60-69 yrs.    72      144     216     288
Non-Low Income White/Other  F    70-79 yrs.    67      134     201     268
Non-Low Income White/Other  F    80+ yrs.      68      136     204     272
White/Other--Overall                           2,238   4,476   6,714   8,952

Overall Total                                  5,000   10,000  15,000  20,000


*There are no explicit targets for infants (age <1 yr.). The numbers given here are the expected yield, given the estimated amount of screening.

Table 3. Expected NHANES sample size and response rates after four years (2007-2010) by sampling domains


Column key: Population = projected population in year 2008; Sample = total sample; Int. % = expected interview response rate (%); Int. SPs = expected number of interviewed SPs; Exam % = expected exam response rate (%); Exam SPs = expected number of examined SPs.

Race/ethnicity-sex-age-income sampling domain  Population   Sample  Int. %  Int. SPs  Exam %  Exam SPs
Black, non-Hispanic         M&F  0-11 mos.     719,185      218     94%     205       92%     200
Black, non-Hispanic         M&F  1-2 yrs.      1,396,763    384     92%     352       88%     340
Black, non-Hispanic         M&F  3-5 yrs.      1,993,139    407     87%     356       84%     340
Black, non-Hispanic         M    6-11 yrs.     1,861,369    397     89%     352       86%     340
Black, non-Hispanic         M    12-19 yrs.    2,639,626    422     89%     375       87%     368
Black, non-Hispanic         M    20-39 yrs.    4,836,058    548     81%     445       77%     420
Black, non-Hispanic         M    40-49 yrs.    2,388,683    292     75%     219       72%     210
Black, non-Hispanic         M    50-59 yrs.    1,993,381    292     75%     219       72%     210
Black, non-Hispanic         M    60+ yrs.      1,833,520    684     67%     456       61%     420
Black, non-Hispanic         F    6-11 yrs.     1,812,239    402     88%     354       85%     340
Black, non-Hispanic         F    12-19 yrs.    2,650,667    388     90%     349       88%     340
Black, non-Hispanic         F    20-39 yrs.    5,771,249    533     81%     433       79%     420
Black, non-Hispanic         F    40-49 yrs.    2,914,677    283     77%     219       74%     210
Black, non-Hispanic         F    50-59 yrs.    2,434,324    283     77%     219       74%     210
Black, non-Hispanic         F    60+ yrs.      2,721,468    687     67%     459       61%     420
Black, non-Hispanic--Overall                   37,966,348   6,220   81%     5,010     77%     4,788

Hispanic                    M&F  0-11 mos.     956,023      456     95%     432       91%     416
Hispanic                    M&F  1-2 yrs.      1,876,390    469     89%     419       85%     400
Hispanic                    M&F  3-5 yrs.      2,719,343    461     92%     423       87%     400
Hispanic                    M    6-11 yrs.     2,517,646    477     86%     410       84%     400
Hispanic                    M    12-19 yrs.    3,239,835    466     90%     418       87%     408
Hispanic                    M    20-39 yrs.    7,669,437    689     85%     587       81%     560
Hispanic                    M    40-49 yrs.    2,959,450    359     81%     291       78%     280
Hispanic                    M    50-59 yrs.    1,831,538    359     81%     291       78%     280
Hispanic                    M    60+ yrs.      1,625,133    816     78%     633       74%     600
Hispanic                    F    6-11 yrs.     2,422,751    467     87%     406       86%     400
Hispanic                    F    12-19 yrs.    3,096,683    457     91%     416       89%     408
Hispanic                    F    20-39 yrs.    7,083,806    690     84%     583       81%     560
Hispanic                    F    40-49 yrs.    2,862,243    346     83%     286       81%     280
Hispanic                    F    50-59 yrs.    1,921,269    346     83%     286       81%     280
Hispanic                    F    60+ yrs.      2,083,573    814     75%     612       72%     588
Hispanic--Overall                              44,865,120   7,671   85%     6,492     82%     6,260

Low-Income White/Other      M&F  0-11 mos.     456,475      196     93%     182       92%     180
Low-Income White/Other      M&F  1-2 yrs.      853,105      230     96%     220       94%     216
Low-Income White/Other      M&F  3-5 yrs.      1,247,951    235     94%     222       92%     216
Low-Income White/Other      M    6-11 yrs.     1,105,432    126     93%     117       86%     108
Low-Income White/Other      M    12-19 yrs.    1,559,851    125     87%     109       86%     108
Low-Income White/Other      M    20-29 yrs.    1,916,727    149     86%     129       83%     124
Low-Income White/Other      M    30-39 yrs.    1,253,152    171     75%     128       73%     124
Low-Income White/Other      M    40-49 yrs.    1,389,294    147     87%     128       84%     124
Low-Income White/Other      M    50-59 yrs.    1,245,643    161     80%     128       77%     124
Low-Income White/Other      M    60-69 yrs.    1,017,338    155     83%     128       80%     124
Low-Income White/Other      M    70-79 yrs.    642,642      180     74%     133       69%     124
Low-Income White/Other      M    80+ yrs.      436,802      131     75%     99        61%     80
Low-Income White/Other      F    6-11 yrs.     1,100,696    122     92%     112       89%     108
Low-Income White/Other      F    12-19 yrs.    1,537,174    119     91%     109       91%     108
Low-Income White/Other      F    20-29 yrs.    2,749,535    153     84%     128       81%     124
Low-Income White/Other      F    30-39 yrs.    1,705,231    144     88%     127       86%     124
Low-Income White/Other      F    40-49 yrs.    1,526,726    145     87%     127       86%     124
Low-Income White/Other      F    50-59 yrs.    1,485,365    152     88%     134       82%     124
Low-Income White/Other      F    60-69 yrs.    1,525,604    172     80%     138       72%     124
Low-Income White/Other      F    70-79 yrs.    1,339,664    180     71%     129       69%     124
Low-Income White/Other      F    80+ yrs.      1,466,528    227     76%     173       55%     124

Non-Low-Income White/Other  M&F  0-11 mos.     2,130,918    327     89%     292       86%     280
Non-Low-Income White/Other  M&F  1-2 yrs.      4,305,315    354     82%     289       79%     280
Non-Low-Income White/Other  M&F  3-5 yrs.      6,464,103    366     85%     310       76%     280
Non-Low-Income White/Other  M    6-11 yrs.     6,686,619    396     76%     302       71%     280
Non-Low-Income White/Other  M    12-19 yrs.    9,656,681    364     82%     299       78%     284
Non-Low-Income White/Other  M    20-29 yrs.    11,868,349   440     76%     333       72%     316
Non-Low-Income White/Other  M    30-39 yrs.    12,116,185   506     68%     345       64%     324
Non-Low-Income White/Other  M    40-49 yrs.    14,702,605   482     70%     336       68%     328
Non-Low-Income White/Other  M    50-59 yrs.    14,261,362   508     65%     329       62%     316
Non-Low-Income White/Other  M    60-69 yrs.    9,403,249    499     69%     345       64%     320
Non-Low-Income White/Other  M    70-79 yrs.    5,198,607    460     73%     337       69%     316
Non-Low-Income White/Other  M    80+ yrs.      2,813,405    462     69%     321       61%     280
Non-Low-Income White/Other  F    6-11 yrs.     6,314,206    360     82%     294       78%     280
Non-Low-Income White/Other  F    12-19 yrs.    9,186,595    342     81%     278       80%     272
Non-Low-Income White/Other  F    20-29 yrs.    11,181,398   406     78%     315       74%     300
Non-Low-Income White/Other  F    30-39 yrs.    11,997,292   453     73%     332       70%     316
Non-Low-Income White/Other  F    40-49 yrs.    14,954,947   461     72%     332       69%     316
Non-Low-Income White/Other  F    50-59 yrs.    14,638,075   441     71%     315       68%     300
Non-Low-Income White/Other  F    60-69 yrs.    9,840,533    449     68%     303       64%     288
Non-Low-Income White/Other  F    70-79 yrs.    5,931,088    490     61%     298       55%     268
Non-Low-Income White/Other  F    80+ yrs.      4,094,525    501     67%     337       54%     272
White/Other--Overall                           215,306,993  12,487  76%     9,542     72%     8,952

TOTAL                                          298,138,462  26,378  80%     21,044    76%     20,000


Table 4. Expected distribution of NHANES sample for one year of data collection by sampling domains with 15 PSUs


Sex and Age          Total Examined Sample  Black, non-Hispanic  Hispanic  White/Other
Males and Females
  Less than 1 year   269                    50                   104       115
  1-2 years          309                    85                   100       124
  3-5 years          309                    85                   100       124
Males
  6-11 years         282                    85                   100       97
  12-19 years        292                    92                   102       98
  20-39 years        467                    105                  140       222
  40-49 years        236                    53                   70        113
  50-59 years        233                    53                   70        110
  60-69 years        -                      -                    -         111
  70-79 years        566*                   105*                 150*      110
  80+ years          -                      -                    -         90
Females
  6-11 years         282                    85                   100       97
  12-19 years        282                    85                   102       95
  20-39 years        461                    105                  140       216
  40-49 years        233                    53                   70        110
  50-59 years        229                    53                   70        106
  60-69 years        -                      -                    -         103
  70-79 years        552*                   105*                 147*      98
  80+ years          -                      -                    -         99
Overall Total        5,000                  1,197                1,565     2,238

*Value for the combined 60+ years group. The Black, non-Hispanic and Hispanic domains are not subdivided above age 60, so the total is also given for 60+ years as a whole.







Table 5: Minimum sample size* required in an analytic cell to estimate p with a CV of 30 percent by various design effects**


Proportions  Design effect
p or (1-p)   1       1.25    1.5     1.75    2       2.5     3
0.01         1,100   1,375   1,650   1,925   2,200   2,750   3,300
0.02         544     681     817     953     1,089   1,361   1,633
0.05         211     264     317     369     422     528     633
0.10         100     125     150     175     200     250     300
0.15         63      79      94      110     126     157     189
0.20         44      56      67      78      89      111     133
0.25         33      42      50      58      67      83      100
0.30         30      38      45      53      60      75      90
0.40         30      38      45      53      60      75      90
0.50         30      38      45      53      60      75      90

*Sample size n = deff*(1-p)/(p*CV**2) for p<=0.25; n = 30*deff for p>0.25

**deff=1 for SRS
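
The footnote formula for Table 5 can be checked with a few lines of code. The following is a minimal sketch, not part of the original attachment: the function names are illustrative, the table values are assumed to be rounded half up (for example, 37.5 becomes 38), and the use of the smaller of p and 1-p is an assumption based on the column heading "p or (1-p)".

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer with halves rounded up (37.5 -> 38).

    A tiny epsilon guards against floating-point values that fall
    just below an exact half.
    """
    return math.floor(x + 0.5 + 1e-9)


def min_sample_size(p: float, deff: float = 1.0, cv: float = 0.30) -> int:
    """Minimum analytic-cell size to estimate a proportion p at the given CV.

    Follows the Table 5 footnote: n = deff*(1-p)/(p*CV**2) for p <= 0.25,
    and n = 30*deff for p > 0.25.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    p = min(p, 1.0 - p)  # assumption: the table is indexed by p or (1-p)
    if p > 0.25:
        return round_half_up(30 * deff)
    return round_half_up(deff * (1.0 - p) / (p * cv ** 2))


# Reproduce the p = 0.10 row of Table 5 across the tabulated design effects.
row = [min_sample_size(0.10, deff) for deff in (1, 1.25, 1.5, 1.75, 2, 2.5, 3)]
print(row)  # [100, 125, 150, 175, 200, 250, 300]
```

Under these assumptions the function reproduces the tabulated entries; a different rounding rule would change only the cells that fall exactly on a half.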

Table 6: Minimum sample size* required in an analytic cell to estimate difference in p with a CV of 30 percent by various design effects**


Estimated proportions  Design effect
p1      p2      1       1.25    1.5     1.75    2       2.5     3
0.05    0.10    611     764     917     1,069   1,222   1,528   1,833
0.05    0.15    194     243     292     340     389     486     583
0.05    0.20    102     128     154     179     205     256     307
0.10    0.15    967     1,208   1,450   1,692   1,933   2,417   2,900
0.10    0.20    278     347     417     486     556     694     833
0.10    0.25    137     171     206     240     274     343     411
0.15    0.20    1,278   1,597   1,917   2,236   2,556   3,194   3,833
0.15    0.25    350     438     525     613     700     875     1,050
0.15    0.30    167     208     250     292     333     417     500
0.20    0.25    1,544   1,931   2,317   2,703   3,089   3,861   4,633
0.20    0.30    411     514     617     719     822     1,028   1,233
0.20    0.35    191     239     287     335     383     478     574
0.25    0.30    1,767   2,208   2,650   3,092   3,533   4,417   5,300
0.25    0.35    461     576     692     807     922     1,153   1,383
0.25    0.40    211     264     317     369     422     528     633
0.30    0.35    1,944   2,431   2,917   3,403   3,889   4,861   5,833
0.30    0.40    500     625     750     875     1,000   1,250   1,500
0.30    0.45    226     282     339     395     452     565     678
0.40    0.45    2,167   2,708   3,250   3,792   4,333   5,417   6,500
0.40    0.50    544     681     817     953     1,089   1,361   1,633
0.40    0.55    241     301     361     421     481     602     722
0.50    0.55    2,211   2,764   3,317   3,869   4,422   5,528   6,633
0.50    0.60    544     681     817     953     1,089   1,361   1,633
0.50    0.65    236     295     354     413     472     590     707

*n = deff*(p1*q1 + p2*q2)/((CV*(p1-p2))**2), where q = 1-p

**deff=1 for SRS
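
The Table 6 footnote can be evaluated in the same way. This is a minimal sketch under stated assumptions: the function name is illustrative, and half-up rounding is assumed because it matches the tabulated values (for example, 437.5 becomes 438).

```python
import math


def min_sample_size_diff(p1: float, p2: float,
                         deff: float = 1.0, cv: float = 0.30) -> int:
    """Minimum analytic-cell size to estimate the difference p1 - p2 at the given CV.

    Follows the Table 6 footnote:
    n = deff*(p1*q1 + p2*q2) / ((CV*(p1 - p2))**2), with q = 1 - p.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ; the CV of a zero difference is undefined")
    q1, q2 = 1.0 - p1, 1.0 - p2
    n = deff * (p1 * q1 + p2 * q2) / ((cv * (p1 - p2)) ** 2)
    # Round half up; the small epsilon absorbs floating-point error near exact halves.
    return math.floor(n + 0.5 + 1e-9)


# First row of Table 6 (p1 = 0.05, p2 = 0.10) under simple random sampling.
print(min_sample_size_diff(0.05, 0.10))  # 611
```

The required sample size grows quickly as the two proportions approach each other, because the squared difference appears in the denominator.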


Table 7a. NHANES examined sample (n=10,000) after two years. Estimated CVs for a 10-percent statistic, assuming a design effect of 1.5

Sex and Age          Total Examined Sample  Black, non-Hispanic  Hispanic  White/Other
Males and Females
  Less than 1 year   0.158                  0.367                0.255     0.242
  1-2 years          0.148                  0.282                0.260     0.233
  3-5 years          0.148                  0.282                0.260     0.233
Males
  6-11 years         0.155                  0.282                0.260     0.264
  12-19 years        0.152                  0.271                0.257     0.262
  20-39 years        0.120                  0.254                0.220     0.174
  40-49 years        0.169                  0.359                0.311     0.244
  50-59 years        0.170                  0.359                0.311     0.248
  60-69 years        -                      -                    -         0.247
  70-79 years        0.109*                 0.254*               0.212*    0.248
  80+ years          -                      -                    -         0.274
Females
  6-11 years         0.155                  0.282                0.260     0.264
  12-19 years        0.155                  0.282                0.257     0.267
  20-39 years        0.121                  0.254                0.220     0.177
  40-49 years        0.170                  0.359                0.311     0.248
  50-59 years        0.172                  0.359                0.311     0.252
  60-69 years        -                      -                    -         0.256
  70-79 years        0.111*                 0.254*               0.214*    0.262
  80+ years          -                      -                    -         0.261
Overall Total        0.037                  0.075                0.066     0.055

*Value for the combined 60+ years group. The Black, non-Hispanic and Hispanic domains are not subdivided above age 60, so the total is also given for 60+ years as a whole.

 

 

 

 

 

























Table 8a. NHANES examined sample (n=10,000) after two years. Estimated CVs for a 5-percent statistic, assuming a design effect of 1.5

Sex and Age          Total Examined Sample  Black, non-Hispanic  Hispanic  White/Other
Males and Females
  Less than 1 year   0.230                  0.534                0.370     0.352
  1-2 years          0.215                  0.409                0.377     0.339
  3-5 years          0.215                  0.409                0.377     0.339
Males
  6-11 years         0.225                  0.409                0.377     0.383
  12-19 years        0.221                  0.394                0.374     0.381
  20-39 years        0.175                  0.368                0.319     0.253
  40-49 years        0.246                  0.521                0.451     0.355
  50-59 years        0.248                  0.521                0.451     0.360
  60-69 years        -                      -                    -         0.358
  70-79 years        0.159*                 0.368*               0.308*    0.360
  80+ years          -                      -                    -         0.398
Females
  6-11 years         0.225                  0.409                0.377     0.383
  12-19 years        0.225                  0.409                0.374     0.387
  20-39 years        0.176                  0.368                0.319     0.257
  40-49 years        0.248                  0.521                0.451     0.360
  50-59 years        0.250                  0.521                0.451     0.367
  60-69 years        -                      -                    -         0.372
  70-79 years        0.161*                 0.368*               0.311*    0.381
  80+ years          -                      -                    -         0.379
Overall Total        0.053                  0.109                0.095     0.080

*Value for the combined 60+ years group. The Black, non-Hispanic and Hispanic domains are not subdivided above age 60, so the total is also given for 60+ years as a whole.

 

 

 

 

 




















Table 7b. NHANES examined sample (n=20,000) after four years. Estimated CVs for a 10-percent statistic, assuming a design effect of 1.5











Sex and Age          Total Examined Sample  Black, non-Hispanic  Hispanic  White/Other
Males and Females
  Less than 1 year   0.112                  0.260                0.180     0.171
  1-2 years          0.105                  0.199                0.184     0.165
  3-5 years          0.105                  0.199                0.184     0.165
Males
  6-11 years         0.109                  0.199                0.184     0.187
  12-19 years        0.108                  0.192                0.182     0.186
  20-39 years        0.085                  0.179                0.155     0.123
  40-49 years        0.120                  0.254                0.220     0.173
  50-59 years        0.120                  0.254                0.220     0.175
  60-69 years        -                      -                    -         0.174
  70-79 years        0.077*                 0.179*               0.150*    0.175
  80+ years          -                      -                    -         0.194
Females
  6-11 years         0.109                  0.199                0.184     0.187
  12-19 years        0.109                  0.199                0.182     0.188
  20-39 years        0.086                  0.179                0.155     0.125
  40-49 years        0.120                  0.254                0.220     0.175
  50-59 years        0.122                  0.254                0.220     0.178
  60-69 years        -                      -                    -         0.181
  70-79 years        0.078*                 0.179*               0.152*    0.186
  80+ years          -                      -                    -         0.185
Overall Total        0.026                  0.053                0.046     0.039

*Value for the combined 60+ years group. The Black, non-Hispanic and Hispanic domains are not subdivided above age 60, so the total is also given for 60+ years as a whole.

 

 

 

 

 







Table 8b. NHANES examined sample (n=20,000) after four years. Estimated CVs for a 5-percent statistic, assuming a design effect of 1.5

Sex and Age          Total Examined Sample  Black, non-Hispanic  Hispanic  White/Other
Males and Females
  Less than 1 year   0.163                  0.377                0.262     0.249
  1-2 years          0.152                  0.290                0.267     0.240
  3-5 years          0.152                  0.290                0.267     0.240
Males
  6-11 years         0.159                  0.290                0.267     0.271
  12-19 years        0.156                  0.278                0.264     0.270
  20-39 years        0.124                  0.260                0.226     0.179
  40-49 years        0.174                  0.368                0.319     0.251
  50-59 years        0.175                  0.368                0.319     0.255
  60-69 years        -                      -                    -         0.253
  70-79 years        0.112*                 0.260*               0.218*    0.255
  80+ years          -                      -                    -         0.281
Females
  6-11 years         0.159                  0.290                0.267     0.271
  12-19 years        0.159                  0.290                0.264     0.274
  20-39 years        0.124                  0.260                0.226     0.182
  40-49 years        0.175                  0.368                0.319     0.255
  50-59 years        0.177                  0.368                0.319     0.259
  60-69 years        -                      -                    -         0.263
  70-79 years        0.114*                 0.260*               0.220*    0.270
  80+ years          -                      -                    -         0.268
Overall Total        0.038                  0.077                0.067     0.056

*Value for the combined 60+ years group. The Black, non-Hispanic and Hispanic domains are not subdivided above age 60, so the total is also given for 60+ years as a whole.
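
Tables 7a through 8b do not state the formula behind the estimated CVs. The sketch below assumes it is the Table 5 footnote solved for CV, that is, CV = sqrt(deff*(1-p)/(p*n)); this assumption reproduces the overall-total entries, but the function name and the derivation are illustrative rather than taken from the attachment.

```python
import math


def expected_cv(p: float, n: int, deff: float = 1.5) -> float:
    """Approximate CV of an estimated proportion p from n examined persons.

    Assumed relation (the Table 5 formula solved for CV):
    CV = sqrt(deff * (1 - p) / (p * n)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(deff * (1.0 - p) / (p * n))


# Overall totals: 10,000 examined after two years, 20,000 after four years.
print(round(expected_cv(0.10, 10_000), 3))  # 0.037 (Table 7a)
print(round(expected_cv(0.05, 20_000), 3))  # 0.038 (Table 8b)
```

Because n enters through a square root, doubling the sample from two to four years reduces the CV by a factor of about 1.41, not 2.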

 

 

 

 

 














  1. NHANES Analytic and Reporting Guidelines (available online at http://www.cdc.gov/nchs/about/major/nhanes/nhanes2003-2004/analytical_guidelines.htm)

Last Update: December, 2005

Last Correction: September, 2006


Introduction

This document presents analytic and reporting guidelines that should be used for NHANES data analyses and publications. It represents the latest information from the National Center for Health Statistics on recommended approaches for analysis of all NHANES data, but with a particular focus on data collected in the continuous NHANES (since 1999). Previous versions of NHANES analytic guidelines (the NHANES III Analytic Guidelines http://www.cdc.gov/nchs/about/major/nhanes/nhanes3/nh3gui.pdf and the NHANES 1999-2000 Addendum to the NHANES III Analytic Guidelines http://www.cdc.gov/nchs/data/nhanes/guidelines1.pdf) can still be used. These analytic guidelines will be modified and updated on a periodic basis as new information is acquired and as new statistical techniques for analysis of complex sample surveys are introduced. Users should regularly visit the NHANES website to see if a new version of these latest analytic guidelines has been released.

Summary recommendations


The following is the current list of analytic and reporting guidelines for NHANES public release data. Future updates may add further guidelines, as well as more detailed information and examples for some of the existing guidelines.

1. The first and overriding analytic guideline is that the data user, prior to any analysis of the data, should read all relevant documentation for the survey and for the specific data items to be used in an analysis.


Many analytic problems and misinterpretations of the data can be avoided by reading the documentation, examining the data collection protocols and data collection instruments, and conducting a preliminary descriptive evaluation of the data. The documentation will indicate how the data were collected, how the data are coded, and the amount of missing data. The documentation will also indicate if a data item was collected on all or a sub-sample of sample persons, if it was collected on a limited age range, or if exclusion criteria were applied for a specific examination component. Specific information on laboratory tests and quality control for these tests is available. For trend analysis, the current documentation can be compared with documentation from past NHANES surveys to determine if a specific data item is comparable with a similar data item collected in previous surveys.


Data collected in NHANES come from interviews, examinations, and laboratory tests based on blood and urine samples. There may also be measures taken in the home, such as dust or tap water collection. The source of a data item (interview, MEC, sera) is important both for assessing the quality of the information and for determining the appropriate sampling weight to be used for producing statistical estimates.


As with any data set, NHANES data are subject to sampling and non-sampling errors (including measurement error). Interview (questionnaire) data are based on self-reports and are therefore subject to non-sampling errors such as recall problems, misunderstanding of the question, and a variety of other factors. Examination data and laboratory data are subject to measurement variation and possible examiner effects. The NHANES program maintains high standards to ensure that non-sampling and measurement errors are minimized. Prior to data collection, extensive protocols are developed and reviewed by the public health and scientific community. Prior to and during data collection, NHANES field staff participate in comprehensive training, and interviewers and MEC staff receive annual refresher training. As data are processed, extensive quality control procedures are applied. Despite the rigorous quality control standards, estimates produced from any data set are subject to sampling and non-sampling variation, and interpretation of any analysis must proceed accordingly.


Data content and data collection protocols may change over time; this is another reason to read the documentation in order to understand any issues in comparability of data over time. Changes in methods may occur at any time and the user should not assume they have


2. NHANES has changed from a periodic survey to a continuous survey and the release of public use data files (and their format) has changed as well.

In the past, NHANES surveys were conducted on a periodic basis and the data were released as single, multiyear data sets. For example, NHANES III covered the 6 calendar years 1988-1994 and is generally analyzed as one 6-year survey. In addition, previous NHANES public use data files tended to be large and few in number. Since 1999, NHANES has been planned and conducted as a continuous annual survey. For a variety of reasons, including disclosure issues, the continuous NHANES survey data are released on public use data files in two-year increments (e.g. NHANES 1999-2000, NHANES 2001-2002, NHANES 2003-2004, etc.). Since the inception of the continuous NHANES, public use data files have been released on an ongoing basis as many smaller component-specific data files. For a two-year analysis, the sample size is smaller and the number of geographic units in the sample is more limited than in, for example, NHANES III. Sample size and statistical power considerations should be used to determine if a two-year sample is sufficient for a particular analysis or if 4 (or even 6) years of the survey need to be combined to produce statistically reliable analysis. This is addressed more fully later in this document.

3. Be aware of the complex survey design and sample weighting methodology.

NHANES is a complex sample survey. The overall sample design and weighting methodology has been similar over the history of the survey. The sample design and weighting methodology for NHANES 1999-2004 is very similar to past NHANES data releases. Primary Sampling Units (PSUs) are generally single counties, although small counties are sometimes combined to meet a minimum population size. In the years 1999-2001, NHANES was based on a design linked to the National Health Interview Survey (NHIS). The NHANES PSUs were a subset of the PSUs previously selected for the NHIS. An independent set of PSUs was selected for 2002-2006; the sampling frame for this design was all counties in the United States.


The additional stages of selection in the probability design for NHANES 1999-2004 remain very similar to past NHANES designs. Clusters of households are selected and each person in a selected household is screened for demographic characteristics. One or more persons per household may be selected for the sample. For NHANES 1999-2000, there were 12,160 persons selected for the sample, 9,965 of those were interviewed (81.9 percent) and 9,282 (76.3 percent) were examined in the MEC. For NHANES 2001-2002, there were 13,156 persons selected for the sample, 11,039 of those were interviewed (83.9 percent), and 10,477 (79.6 percent) were examined in the MEC. For NHANES 2003-2004, there were 12,761 persons selected for the sample, 10,122 of those were interviewed (79.3 percent) and 9,643 (75.6 percent) were examined in the MEC.
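
The response rates quoted above follow directly from the counts, with the number of persons selected as the denominator for both the interview and the examination rate. A minimal sketch of that arithmetic (the function and variable names are illustrative):

```python
# Counts quoted in the paragraph above: (selected, interviewed, examined in the MEC).
CYCLES = {
    "1999-2000": (12160, 9965, 9282),
    "2001-2002": (13156, 11039, 10477),
    "2003-2004": (12761, 10122, 9643),
}


def response_rates(selected: int, interviewed: int, examined: int) -> tuple:
    """Return (interview rate, exam rate) as percentages of persons selected."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    return (round(100.0 * interviewed / selected, 1),
            round(100.0 * examined / selected, 1))


for cycle, counts in CYCLES.items():
    interview_pct, exam_pct = response_rates(*counts)
    print(f"NHANES {cycle}: interviewed {interview_pct}%, examined {exam_pct}%")
```

Note that the examination rate is unconditional: it is computed relative to all selected persons, not only those who were interviewed.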


As with any complex probability sample, the sample design information should be explicitly used when producing statistical estimates or undertaking statistical analysis of the NHANES data. In particular, sample weights and the first stage of the cluster design need to be considered. The sampling weights provided must be used to produce unbiased national estimates. The sample weights for NHANES 2003-2004 reflect the unequal probabilities of selection, non-response adjustments, and adjustments to independent population controls. The proper sample weight must be used. If only data from the interviewed sample are used, then the appropriate SAS variable is WTINT2YR. If data from the MEC examination are used, then the appropriate SAS variable is WTMEC2YR.


Because NHANES is a complex probability sample, analytic approaches based on data from a simple random sample are usually inappropriate. Ignoring the complex design can lead to biased estimates and overstated significance levels. Sample weights and the stratification and clustering of the design must be incorporated into an analysis to get proper estimates and standard errors of estimates.
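
The effect of ignoring the weights can be seen in a small worked example. The records below are hypothetical, not NHANES data; only the variable name WTMEC2YR comes from the guidelines, and the function is an illustrative sketch of a weighted point estimate (it does not produce a standard error).

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * y for w, y in zip(weights, values)) / total_weight


# Hypothetical examined persons as (WTMEC2YR, indicator of some condition).
# The two persons with the condition come from an oversampled group and
# therefore carry smaller weights.
records = [(1000.0, 1), (3000.0, 0), (3000.0, 0), (1000.0, 1)]
weights = [w for w, _ in records]
values = [y for _, y in records]

unweighted = sum(values) / len(values)     # 0.5: treats the sample as self-weighting
weighted = weighted_mean(values, weights)  # 0.25: reflects the population shares
print(unweighted, weighted)
```

In this illustration the unweighted proportion is twice the weighted one, which is the kind of bias that oversampling of particular domains would introduce if the weights were ignored.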


Data are sometimes collected on sub-samples of the full design for any NHANES survey. These data are available but public release of these files may lag behind the main data release for any two-year period due to extra time needed for processing and quality assurance review. In addition, each subsample involves another stage of selection and separate sample weights that account for that stage of selection and additional non-response. For analysis of subsample data, appropriate subsample weights must be used and they are included on any data file where relevant.

4. Be aware of, and utilize, proper variance estimation procedures.

The procedure for variance estimation (sampling errors) is the same for 2003-2004 as for 2001-2002. This method creates Masked Variance Units (MVUs), which can be used as if they were stratified PSUs to estimate sampling errors (similar to past NHANES). The MVUs on the NHANES demographic data files are not the “true” design PSUs. They are a collection of secondary sampling units that are aggregated into groups called Masked Variance Units for the purpose of variance estimation. The MVUs produce variance estimates that closely approximate the variances that would have been estimated using the “true” design structure. These MVUs have been created for each two-year cycle of NHANES and can be used for any combination of two-year data cycles without recoding by the user.


For NHANES 2001-2002 and 2003-2004, the two-year weights and MVUs are included in the Demographics data file. The NHANES 1999-2000 Demographic file was updated to include MVUs and four-year sample weights. Only the NHANES 1999-2002 data have special four-year sample weights (as described in the NHANES Analytic Guidelines section on how and when to combine years of data). At this time, the preferred approach for calculating sampling errors is to use the MVUs and to ignore the JK-1 technique that served as an interim approach for variance estimation when the NHANES 1999-2000 data were released.


The stratum variable is SDMVSTRA and the PSU variable is SDMVPSU. Software designed specifically for survey data, such as SUDAAN, or software that has specific survey procedures, such as STATA and SAS, can be used to estimate sampling errors by the Taylor series (linearization) method. Typically, the data set should first be sorted by SDMVSTRA and SDMVPSU. For NHANES 1999-2000, SDMVSTRA is numbered 1-13; for NHANES 2001-2002, SDMVSTRA is numbered 14-28; and for NHANES 2003-2004, SDMVSTRA is numbered 29-43. Therefore, these files can be combined without any recoding of this variable. This procedure will also hold for combining NHANES 2001-2002 and 2003-2004 data files, as well as future two-year NHANES files. There are no replicate weights provided for NHANES 2003-2004. Replication techniques can still be used to estimate sampling errors if the software, such as WESVAR, computes its own set of replicate weights based on the nested MVU/PSU within stratum design.
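The Taylor series (linearization) method mentioned above can be sketched in a few lines of Python for the simplest case, a weighted mean. This is an illustrative sketch, not the algorithm used by SUDAAN, STATA, or SAS. It assumes that MVUs are treated as PSUs sampled with replacement within strata and that every stratum contains at least two of them. The field name `y`, the function name, and the toy rows are hypothetical; SDMVSTRA, SDMVPSU, and WTMEC2YR are the variable names given in these guidelines.

```python
from collections import defaultdict
import math


def linearized_se_of_mean(rows, y="y", w="WTMEC2YR",
                          stratum="SDMVSTRA", psu="SDMVPSU"):
    """Return (weighted mean, Taylor-series standard error).

    Assumes with-replacement sampling of PSUs within strata and at
    least two PSUs in every stratum.
    """
    w_sum = sum(r[w] for r in rows)
    mean = sum(r[w] * r[y] for r in rows) / w_sum

    # Linearized score of the ratio estimator, totalled by PSU within stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        psu_totals[r[stratum]][r[psu]] += r[w] * (r[y] - mean) / w_sum

    variance = 0.0
    for totals in psu_totals.values():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        z_bar = sum(totals.values()) / n_h
        # Between-PSU variability of the score totals within this stratum.
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in totals.values())
    return mean, math.sqrt(variance)


# Toy data (not real NHANES values): two strata, two PSUs each, equal weights.
rows = [
    {"SDMVSTRA": 29, "SDMVPSU": 1, "WTMEC2YR": 1.0, "y": 1.0},
    {"SDMVSTRA": 29, "SDMVPSU": 1, "WTMEC2YR": 1.0, "y": 3.0},
    {"SDMVSTRA": 29, "SDMVPSU": 2, "WTMEC2YR": 1.0, "y": 5.0},
    {"SDMVSTRA": 29, "SDMVPSU": 2, "WTMEC2YR": 1.0, "y": 7.0},
    {"SDMVSTRA": 30, "SDMVPSU": 1, "WTMEC2YR": 1.0, "y": 2.0},
    {"SDMVSTRA": 30, "SDMVPSU": 1, "WTMEC2YR": 1.0, "y": 4.0},
    {"SDMVSTRA": 30, "SDMVPSU": 2, "WTMEC2YR": 1.0, "y": 6.0},
    {"SDMVSTRA": 30, "SDMVPSU": 2, "WTMEC2YR": 1.0, "y": 8.0},
]
mean, se = linearized_se_of_mean(rows)
```

Because the strata are numbered uniquely across cycles, rows from several two-year files could be passed to such a function together without recoding SDMVSTRA. For real analyses, established survey software remains the safer choice.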


Variance estimates for NHANES I, NHANES II, HHANES, and NHANES III utilized the true design PSUs. Pseudo strata and pseudo PSU variables were included on each public use data file for those surveys and the same software can be used to estimate sampling errors for each of those surveys.


5. Combining two or more 2-year cycles of the continuous NHANES is encouraged and strongly recommended in order to produce estimates with greater statistical reliability for demographic sub-domains and rare events.

For two-year cycles, the sample size may be too small to produce statistically reliable estimates for very detailed demographic sub-domains (e.g. sex-age-race/ethnicity groups) or for relatively rare events. The sample design for NHANES makes it possible to combine two or more “cycles” to increase the sample size and analytic options. Each two-year cycle and any combination of those two-year cycles is a nationally representative sample.


When combining cycles of data, it is extremely important that the user (1) verify that the data items collected in all combined years were comparable in wording and methods and (2) use a proper sampling weight. Beginning in 2003, the survey content for each two-year period is held as constant as possible to be consistent with the data release cycle. In the first four years of the continuous survey, this was not always the case, and some special data release and data access procedures had to be developed and used for selected survey content collected in “other than two-year” intervals (http://www.cdc.gov/nchs/data/nhanes/nhanes_release_policy.pdf).

6. The decision on how many years of NHANES data are required for a particular analysis can be summarized by the concept of minimum sample size required.

The minimum sample size is determined by the statistic to be estimated (e.g. mean, total, proportion…), the reliability criteria (e.g. 20 or 30 percent relative standard error), the Design Effect for the statistic (DEFF, defined as the variance inflation factor), and the degrees of freedom for the standard error estimate. For example, consider the minimum sample size to estimate a 10 percent prevalence with a relative standard error of 30 percent or less, a survey DEFF of 1.5, and greater than 16 degrees of freedom for the standard error. The required minimum sample size is 150. Now consider the following simplified example (not real data).


Table 1. Sample Size by Data Cycle and Sub-domain

Sub-domain      1999-2000   2001-2002   2003-2004   Combined 4 years   Combined 6 years
Total                 210         210         210                420                630
Males                 110         110         110                220                330
  age < 40             60          60          60                120                180
  age > 40             50          50          50                100                150
Females               100         100         100                200                300


In this example, one could estimate the proportion for the total population in each of the 2-year data cycles, but none of the sub-domains meets the minimum sample size requirement. Combining the data from two cycles to produce a 4-year dataset (in this case, a 1999-2002 or a 2001-2004 dataset) allows the proportion to be reliably estimated for both males and females. For a more detailed domain, however, such as males less than 40 years of age, 6 years of data are required.
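The figure of 150 and the conclusions drawn from the table can be checked with a short Python sketch. The guidelines do not state the formula used, so the one below is an assumption: it takes the relative standard error of a proportion p under simple random sampling, sqrt((1 - p) / (n p)), inflates the variance by DEFF, and solves for n. It happens to reproduce 150 for the stated inputs but does not address the degrees-of-freedom criterion. The function name and dictionary layout are hypothetical.

```python
from fractions import Fraction
import math


def min_sample_size(p, rse, deff):
    """Smallest n with sqrt(deff * (1 - p) / (n * p)) <= rse (assumed formula)."""
    # Exact rational arithmetic avoids floating-point round-off before ceil().
    p, rse, deff = (Fraction(str(x)) for x in (p, rse, deff))
    return math.ceil(deff * (1 - p) / (p * rse ** 2))


required = min_sample_size(p=0.10, rse=0.30, deff=1.5)

# Illustrative sample sizes from the table above (not real data).
sizes = {
    "Total":           {"2 years": 210, "4 years": 420, "6 years": 630},
    "Males":           {"2 years": 110, "4 years": 220, "6 years": 330},
    "Males, age < 40": {"2 years": 60,  "4 years": 120, "6 years": 180},
    "Males, age > 40": {"2 years": 50,  "4 years": 100, "6 years": 150},
    "Females":         {"2 years": 100, "4 years": 200, "6 years": 300},
}

# Shortest span of data whose sample size reaches the requirement, per domain.
years_needed = {
    domain: next((span for span, n in cells.items() if n >= required), None)
    for domain, cells in sizes.items()
}
```

Under this assumed formula, the total population qualifies with a single two-year cycle, males and females each need four years, and both male age groups need six years.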

Earlier NHANES surveys were conducted for four or more years and, thus, have larger samples than the two-year cycles of the continuous NHANES. However, in each of the NHANES conducted prior to 1999, many sub-domains did not meet minimum sample size requirements and in those cases, the above concerns were (and still are) relevant.

7. When combining two or more two-year cycles of continuous NHANES data, the user should use the following procedure for calculating the appropriate combined sample weights.

When two or more 2-year cycles of the continuous NHANES are combined, the user must calculate new sample weights before analyzing the data. NCHS does not calculate and release sample weights for all possible combinations of multiple two-year cycles of the continuous survey because it would be impractical to produce them and include them on all public release files.

The sample weights for NHANES 1999-2000 were based on population estimates developed by the Bureau of the Census before the Year 2000 Decennial Census counts became available. The two-year sample weights for NHANES 2001-2002 were based on population estimates that incorporate the year 2000 Census counts. The two population estimates were not strictly comparable. Therefore, appropriate four-year sample weights (comparable to Census 2000 counts) were calculated and added to the demographic data files for both 1999-2000 and 2001-2002. The four-year sample weights have the same variable name in each file. For example, the four-year examination sample weight in both files is WTMEC4YR. Thus, users of the earlier release of the NHANES 1999-2000 demographic file must use the updated demographic file to appropriately analyze the combined four-year data 1999-2002. Because NHANES 2003-2004 uses the same year 2000 Census counts as were used for NHANES 2001-2002, there is no need to create special four-year weights for 2001-2004.

For a four-year estimate for 2001-2004, one can create a new variable for a four-year weight by assigning ½ of the 2-year weight for 2001-2002 if the person was sampled in 2001-2002 or assigning ½ of the 2-year weight for 2003-2004 if the person was sampled in 2003-2004. This is possible because the 2-year weights for 2003-2004 are comparable to the 2001-2002 weights (in terms of a population basis). For an estimate for the 6 years 1999-2004, a 6-year weight variable can be created by assigning 2/3 of the 4-year weight for 1999-2002 if the person was sampled in 1999-2002 or assigning 1/3 of the 2-year weight for 2003-2004 if the person was sampled in 2003-2004. This is possible because the 2003-2004 weights are also comparable (on a population basis) to the combined four-year weights specifically created for 1999-2002.
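The weight construction described above can be expressed as a small Python function. This is a minimal sketch for the examination weights only. The `cycle` field, the `span` argument, and the function name are hypothetical conveniences; WTMEC2YR and WTMEC4YR are the variable names given in these guidelines, and the same pattern would apply to the interview weights.

```python
def combined_weight(record, span):
    """Return the multi-cycle examination weight for one sampled person.

    record["cycle"] is a hypothetical field holding "1999-2000",
    "2001-2002", or "2003-2004"; span is "2001-2004" or "1999-2004".
    """
    cycle = record["cycle"]
    if span == "2001-2004":
        # Each two-year cycle contributes half of the four-year period.
        if cycle in ("2001-2002", "2003-2004"):
            return record["WTMEC2YR"] / 2
    elif span == "1999-2004":
        # 1999-2002 covers four of the six years, so it gets 2/3 of its
        # special four-year weight; 2003-2004 gets 1/3 of its two-year weight.
        if cycle in ("1999-2000", "2001-2002"):
            return record["WTMEC4YR"] * 2 / 3
        if cycle == "2003-2004":
            return record["WTMEC2YR"] / 3
    raise ValueError(f"cycle {cycle!r} is not part of span {span!r}")


# Hypothetical persons, one from each cycle.
person_1999 = {"cycle": "1999-2000", "WTMEC4YR": 15000.0}
person_2001 = {"cycle": "2001-2002", "WTMEC2YR": 20000.0, "WTMEC4YR": 15000.0}
person_2003 = {"cycle": "2003-2004", "WTMEC2YR": 30000.0}

w4_2001 = combined_weight(person_2001, "2001-2004")
w6_1999 = combined_weight(person_1999, "1999-2004")
w6_2003 = combined_weight(person_2003, "1999-2004")
```

Raising an error for a cycle outside the requested span is a design choice that guards against silently applying, for example, a 1999-2000 record to a 2001-2004 analysis.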


Summary comments and future additions to the NHANES Analytic Guidelines.

This document summarizes the most recent analytic and reporting guidelines that should be used for most NHANES analyses and publications. It is important for users to understand the entire document and to become familiar with statistical issues in the analysis of complex survey data.


These suggested guidelines provide a framework to users for producing estimates that conform to the analytic design of the survey. Because statistical methods for analyzing complex survey data are continually evolving, these recommendations may differ slightly from those used by analysts for previous NHANES surveys.


It is important to remember that the statistical guidelines in this document are not absolute. When conducting analyses, the analyst needs to use his/her subject matter knowledge (including methodological issues), as well as information about the survey design. The more one deviates from the original analytic categories and original analytic objectives defined in the planning documents, the more important it is to evaluate the results carefully and to interpret the findings cautiously.

Future versions of these NHANES Analytic and Reporting Guidelines will include additional topics, such as sample sizes and response rates for each NHANES survey, hypothesis testing, multivariate analysis, and a discussion of the concept of statistical versus practical significance.

These are guidelines, not standards. Depending upon the subject matter and statistical efficiency, specific analyses may depart from these guidelines. The burden of proof for statistical efficiency and for appropriate data interpretation is on the data analyst.

One final reminder for NHANES data users is that the NHANES data files, documentation, and Analytic Guidelines may be edited and/or updated to reflect new information and corrected or edited data. NHANES data users are encouraged to check the NHANES website periodically (available at: http://www.cdc.gov/nchs/about/major/nhanes/NHANES99_00.htm) to determine if new or revised data files and analytic guidelines have been released by NCHS for the data of interest. Data users are encouraged to subscribe to the NHANES listserv (available at: http://www.cdc.gov/nchs/about/major/nhanes/nhaneslist.htm) to receive information updates.
