This document concerns the non-response bias analysis requested by OMB.
In a second request, OMB asked the lead statistician to expand on the
previous response by providing more detail.

Graduate Medical Education

Studying the Effects of ACGME Duty Hours
Limits on Resident Satisfaction: Results From
VA Learners’ Perceptions Survey
T. Michael Kashner, PhD, JD, Steven S. Henley, MS, Richard M. Golden, PhD,
John M. Byrne, DO, Sheri A. Keitz, MD, PhD, Grant W. Cannon, MD,
Barbara K. Chang, MD, MA, Gloria J. Holland, PhD, David C. Aron, MD,
Elaine A. Muchmore, MD, Annie Wicker, and Halbert White, PhD

As the Accreditation Council on
Graduate Medical Education (ACGME)
deliberates over further limiting duty
hours of graduate medical education
(GME) trainees, few large-scale studies
have shown residents to be satisfied with
the effect the 2003 standards have had
on clinical care, education outcomes, or
working environments. This study
measures the effect of the 2003 duty
hours limits on resident-reported
satisfaction with GME training during
their rotations through the Department
of Veterans Affairs (VA) medical centers
from 2001 through 2007.

Self-reported satisfaction with clinical care and education environments
was assessed by comparing responses to VA’s annual Learners’ Perceptions
Survey administered before 2003 with responses administered after 2003.
To measure duty hours effects on satisfaction, before–after differences
were adjusted for covariate biases modeled after an exhaustive covariate
search with 10-fold cross-validation. Because nonteaching controls are
not available in satisfaction studies, we used a robust differencing
variable technique to control before–after differences for trend biases
in the simultaneous presence of missing data and possible model
misspecification.

There were 19,605 responders. Adjusting for covariate and trend biases,
after the 2003 ACGME standards, 25% more residents in medicine
specialties reported satisfaction with VA clinical environment and 11%
more with VA preceptors and faculty. For surgery, 33% more residents
reported satisfaction with VA clinical environment and 12% more with
VA preceptors and faculty. Satisfaction with working environment was
mixed.

The 2003 ACGME duty hours standards were associated with improved
satisfaction for resident clinical training and learning
Acad Med. 2010; 85:1130–1139.

Standards governing duty hours limits have generally been considered
necessary in graduate medical education (GME) to protect the safety of
both patients1 and residents.2 Resident sleep deprivation as a result of
long duty hours has been linked to higher rates of medical errors,3
poorer clinical performance,4 adverse events,5 and attentional failures6
in observational, pre–post, and experimental studies. Longer duty hours
have also been linked to resident motor-vehicle-related injuries,7
obstetric complications,8 depression,9 burnout,10 poorer quality of
life11 and neuropsychological performance,12 including memory loss and
reduced response times.13

In response to these safety concerns, the Accreditation Council for
Graduate Medical Education (ACGME) implemented mandatory standards on
July 1, 2003, that limited duty hours for medical residents in
accredited U.S. GME programs.14 Although benefits from ACGME duty hours
limits continue to be debated,15 few studies have described how duty
hours limits may be affecting clinical training environments, trainee
learning, resident access to preceptors and faculty, and resident
education.16 For instance, residents have complained that mandatory duty
hours rules interfere with continuity of care,17 increase cross-coverage
errors,18 shift the education focus away from professionalism,19 create
fear that new regulations will add additional training years,20 and
cause frustration when residents are faced with heavy workloads and must
reconcile actual hours against ACGME duty hours rules.21 Underscoring
these concerns are the contrasting missions of the teaching hospitals,
which need staff to provide professional care; faculty, who balance
attending, practice, service, and research responsibilities; and
residents, who need access to faculty and supervised clinical
experiences to properly prepare them to enter independent practice.
To assess the effects of ACGME duty
hours rules on training environments,
researchers have surveyed residents using
post and pre–post survey designs. Post
surveys were administered after ACGME
duty hours rules became mandatory.
These surveys asked residents and
fellows10,22–25 about their views of the
success or failure of the mandatory
standards. Although informative about

Academic Medicine, Vol. 85, No. 7 / July 2010


how residents perceived duty hours
limits, postsurvey results are often
colored by memory loss, cohort
confounds when all of the responders
who have prelimits experiences have
become upper-level residents by the time
the survey is administered, and reporting
biases when residents mimic faculty
attitudes and beliefs in their survey responses.

Pre–post designs26–28 compare responses
to surveys that were administered in 2003
and earlier with responses to the same
surveys readministered in 2004 and later.
Pre–post designs are subject to covariate
biases whenever responders who took the
survey prelimits (2003 and earlier) differ
significantly from responders who took
the survey postlimits (2004 and later).
Pre–post designs are also subject to trend
biases whenever naturally occurring time
trends in the data confound pre–post
differences. Covariate biases have been
addressed by computing outcomes that
are adjusted to reflect the influences of
variations in responder characteristics.
Trend biases are addressed by difference-in-differences methods,29
where effect sizes are computed as the
pre–post difference in mean responses
among physician residents rotating
through “effect” settings minus the pre–post
difference in mean responses
computed for comparable residents who
rotated through “control” settings.
Control settings have been identified as
(1) nonteaching hospitals where duty
hours limits are irrelevant,30–32 (2)
training programs in teaching hospitals
where duty hours limits were openly not
enforced,26 or (3) responders whose duty
schedules were not changed, for whatever
reason, by duty hours limits. Facility-level
controls are limited to outcomes that can
be observed in both teaching and
nonteaching settings, such as patient
outcomes and medical errors, and are
thus not practical for resident satisfaction
surveys. Program-level controls are often
difficult to implement because few
program directors openly defy ACGME
standards. Responder-level controls can
be identified by asking respondents if the
2003 duty hours limits had any impact on
their actual duty schedules. However,
such questions were not answerable
before 2003, when ACGME duty hours
limits were first implemented. We call
this the “missing-data problem.”
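
The difference-in-differences calculation described above can be sketched in a few lines. This is a minimal illustration only: the function name and the four mean ratings are hypothetical and are not taken from the LPS data, and the estimator actually used in this study works on odds rather than on raw means.

```python
def difference_in_differences(effect_pre, effect_post, control_pre, control_post):
    """Pre-post change in the effect group minus the pre-post change in controls."""
    return (effect_post - effect_pre) - (control_post - control_pre)


# Hypothetical mean satisfaction ratings on a 0-100 scale (not study data).
did = difference_in_differences(
    effect_pre=70.0, effect_post=78.0,
    control_pre=71.0, control_post=74.0,
)
print(did)  # 5.0: an 8-point change in the effect group minus a 3-point trend
```

Subtracting the control change removes any time trend that is shared by both groups, which is why the choice of a credible control group matters so much in the designs discussed here.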


For this report, we introduce and apply a
methodology that uses responder-level
controls to assess the influence of the
2003 mandatory ACGME duty hours
limits on how physician residents
perceived their clinical training
environments in the Department of
Veterans Affairs (VA) medical centers
between July 1, 2000 and June 30, 2007.
The study addresses covariate confounds,
trend biases, and missing-data problems
in three ways. First, we used
the Learners’ Perceptions Survey (LPS), a
structured interview administered
annually by the VA Office of Academic
Affiliations (VA-OAA) to residents
rotating through VA medical centers.
Second, respondents were classified into
effects or control groups based on LPS
survey questions that asked respondents
whether duty hours limits actually
changed their hours worked during
scheduled VA rotations. Third, we
adjusted for covariate and trend biases
using a robust differencing variable
technique, an advanced statistical method
designed to handle the missing-data
problem caused by failing to identify
controls among pre-2003 responders.

Data collection
We obtained resident satisfaction data
from the VA LPS, which has been
described elsewhere.33,34 Elements of each
satisfaction domain and ACGME duty
hours limits questions are listed in List 1. The
analyses for this study were conducted for
administrative purposes by, and were
under the direct supervision of, the VA-OAA, under review by OMB Information
Collection (#2900-0691) approved for
VA Form #10-0439, for all data collected
through January 2010.
VA-OAA has administered the LPS annually, using it as a
performance metric since 2001, to all trainees who rotate
through VA medical centers. To reflect
overall satisfaction, respondents rate
“clinical training … on a scale from 0 to
100, where 100 is a perfect score and 70 is
a passing score.” We dichotomized
overall satisfaction responses from all
available surveys (2001–2007) into
satisfied (≥70) or otherwise (<70),
where 70 was defined by VA as a
“passing” score.
Respondents also rated each of five
domains on a five-point scale (List 1).

List 1
Elements Comprising Satisfaction in
the Veterans Affairs Annual Learners’
Perceptions Survey
1. Faculty/Preceptors Domain
• Clinical skills
• Teaching ability
• Interest in teaching
• Research mentoring
• Accessibility
• Approachability
• Feedback timeliness
• Evaluation fairness
• Role models
• Mentoring
• Patient-oriented
• Faculty quality
• Evidence-based practice
2. Learning Environment Domain
• Time with patients
• Supervision
• Autonomy
• Noneducational “scut” work
• Interdisciplinary approach
• Preparation for clinical practice
• Future training
• Business aspects
• Learning time
• Access to specialty expertise
• Teaching conferences
• Care quality
• Patient safety
• Spectrum of patient problems
• Patient diversity
3. Clinical Environment Domain
• Work hours
• Number inpatients admitted for your care
• Outpatients seen
• Timely availability of outpatient
• Timely performance of necessary
• Timely admission of patient
• Ability to use emerging
• How well physicians and nurses work
• How well physicians and ancillary staff
work together
• Tests done in timely manner on weekends
• Tests done in timely manner at nights
• Accessing patient records
• Backup system for electronic records
• Amount of paperwork
• Ability to work within system to get
best care for patients


4. Working Environment Domain
• Morale of faculty
• Support staff
• Peer group
• Services of laboratory
• Radiology
• Ancillary/support staff
• Library
• Call schedule
• Computerized patient record system
• Computer access
• Internet access
• Orientation program
• Workspace
5. Physical Environment Domain
• Convenience of facility location
• Parking
• Personal safety
• Availability of phones
• Needed equipment
• Food services
• Equipment maintenance
• Facility maintenance
• Lighting
• Heating and air conditioning
• Cleanliness/housekeeping
• Call rooms
6. Accreditation Council for Graduate
Medical Education Duty Hours Limits
• Colleague support
• Personal reward
• Relationship with patients
• Appreciation of work by faculty
• Appreciation of work by patient
• Balance of professional and personal
• Work enjoyment
• Job stress
• Work fatigue
• Continuity of care
• Patient-care responsibility
• Quality of care
• Enhancement of clinical skills

For these analyses, to compute the odds
that a resident reported being satisfied,
we dichotomized responses into
“satisfied” (very satisfied, somewhat
satisfied) and “otherwise” (neither
satisfied nor dissatisfied, somewhat
dissatisfied, very dissatisfied). Since 2001,
domains included satisfaction with
clinical faculty/preceptors, learning,
working, and physical environments. A
fifth domain, clinical environment, was
added with the 2003 survey.
The VA added an ACGME duty hours
limits question beginning with the 2004
survey (List 1). The question read, “In
July 2003, the Accreditation Council for
Graduate Medical Education instituted
changes in requirements in duty hours/
scheduling for resident education. In
your opinion, what effect have these
changes had on your educational
experience at the VA facility…?”
Respondents rated their answers on a
five-point scale. To adjust pre–post
differences for time trends, we constructed
a differencing variable by dichotomizing
responses to this question to classify
each responder as either a no-effect control
(response of “no effect”) or an effect-responder (response of “very positive,”
“somewhat positive,” “somewhat negative,”
or “very negative” effect).
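
The two recodings described above (satisfaction into satisfied/otherwise, and the duty hours question into the differencing variable) can be sketched as follows. This is an illustration under stated assumptions: the function names and the exact response strings are hypothetical, since the actual LPS data coding is not given here.

```python
SATISFIED = {"very satisfied", "somewhat satisfied"}
EFFECT = {"very positive", "somewhat positive", "somewhat negative", "very negative"}


def satisfied_indicator(response):
    """Return 1 for 'satisfied' (very or somewhat satisfied), 0 for 'otherwise'."""
    return 1 if response.strip().lower() in SATISFIED else 0


def differencing_variable(response):
    """Return 1 for an effect-responder, 0 for a no-effect control.

    None is passed through unchanged; it stands for surveys that did not
    carry the duty hours question (those before 2004).
    """
    if response is None:
        return None
    answer = response.strip().lower()
    if answer == "no effect":
        return 0
    if answer in EFFECT:
        return 1
    raise ValueError(f"unrecognized response: {response!r}")


print(satisfied_indicator("Somewhat satisfied"))   # 1
print(differencing_variable("Very negative"))      # 1
print(differencing_variable("No effect"))          # 0
```

Note that positive and negative perceptions map to the same value of the differencing variable, which mirrors the decision to pool them into a single effect-responder group.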
Several scenarios could explain why
residents may have claimed that ACGME
duty hours limits had no effect on their
VA clinical training settings. For
example, a resident may have worked a
schedule of hours that was within
the duty hours limits whether or not the
training program was enforcing the
ACGME-mandated duty hours rules.
Alternatively, the training program may
have ignored duty hours rules, at least
during the respondent’s rotation.
To adjust for trend biases, we constructed
the differencing variable to combine
responders who reported either positive
or negative perceptions of duty hours
effects. We combined the two groups
because our purpose was to measure how
duty hours limits influenced residents’
ratings of their training environment, and
not to determine how residents actually
perceived duty standards. Although
important, the latter research lies outside
the scope of the current study.
The LPS also obtained demographic
information about each respondent,
including gender, training level
(postgraduate year), and specialty. For
our analyses, we grouped resident
specialties into medicine (internal
medicine, neurology, physical and
rehabilitation medicine), surgery
(surgery, anesthesiology), psychiatry
(including psychiatric subspecialties),
and ancillary care (diagnostic specialties,
radiology, pathology). Beginning in 2003,
the LPS collected information on
medical school graduation (date
graduated, U.S. versus foreign medical
school) and the mix of patients seen
during the respondent’s VA rotation. We
estimated patient mix from survey
responses as the percentage of patients
the respondent reported seeing during
“an average week, at the VA…” for each
of seven patient categories: 65 years of
age or older, chronic mental illness,
chronic medical illness, multiple medical
illnesses, alcohol/substance dependence
conditions, low-income socioeconomic
status, and no social/family support.
To be comparable with other studies, we
computed the effect of ACGME duty
hours limits on resident satisfaction for
each satisfaction domain as a ratio of
odds ratios (ROR) describing whether the
resident reported satisfaction or
otherwise. The ROR numerator is
calculated for effect-responders and
equals the odds that these respondents
would have reported satisfaction in the
postperiod divided by the odds that these
same respondents would have reported
satisfaction in the preperiod. The
denominator is calculated in the same
way, but only for no-effect controls.
ROR = 1 indicates that pre–post changes
in satisfaction rates among effect-responders
were no different from the
pre–post changes among no-effect
controls, and suggests that no duty-limit
effect on satisfaction was observed. On
the other hand, ROR > 1 indicates that
pre–post changes in satisfaction rates
among effect-responders were greater than
those of their no-effect counterparts, and suggests
that duty hours limits were associated with
higher satisfaction rates. Similarly, ROR < 1
suggests that duty hours limits were associated
with lower satisfaction rates.
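
As a simple unadjusted illustration of the ROR definition above, the following sketch computes the ratio from four satisfaction rates. The rates and function names are hypothetical. It also assumes all four rates are directly observed, which is not the case in this study because effect-responders cannot be identified in the prelimits period.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def ratio_of_odds_ratios(effect_pre, effect_post, control_pre, control_post):
    """Post/pre odds ratio for effect-responders divided by that for controls."""
    numerator = odds(effect_post) / odds(effect_pre)
    denominator = odds(control_post) / odds(control_pre)
    return numerator / denominator


# Hypothetical satisfaction rates (not study data).
ror = ratio_of_odds_ratios(effect_pre=0.60, effect_post=0.80,
                           control_pre=0.60, control_post=0.60)
print(round(ror, 2))  # 2.67
```

Here the control group shows no pre–post change, so the ROR equals the effect group's own odds ratio; any shared time trend in the controls would be divided out.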
ROR is adjusted for both covariate and
trend biases using a robust differencing
variable technique that extends
difference-in-differences analyses29 to
logistic regression35 by (1) using resident-level training environments as the unit
of analysis, (2) identifying control
respondents to adjust for trend biases,
(3) accounting for missing data without
imputation noise, (4) performing an
exhaustive model search to adjust for
covariate biases, and (5) computing
estimates of effect sizes and confidence
intervals (CIs) that are robust to model
misspecification (see Mathematical Appendix,


Supplemental Digital Content 1,

Table 1
Description of Veterans Affairs Learners’ Perceptions Survey Respondents by
Reporting Period, 2001–2007*

                                      No. (%)       No. (%) reporting   No. (%) reporting
                                      from all      from pre–duty       from post–duty
                                                    hours limits        hours limits
                                                    period              period

Female                                7,102 (39)    2,639 (39)          4,463 (39)
Medical school†
                                      10,575 (75)   1,947 (76)          8,628 (74)
Entered residency
                                      2,174 (16)    411 (16)            1,763 (16)
Training level
  First-year residents                5,498 (28)    1,910 (27)          3,588 (28)
                                      4,446 (23)    1,554 (22)          2,892 (23)
                                      4,289 (22)    1,523 (22)          2,766 (22)
                                      2,964 (15)    1,080 (16)          1,884 (15)
                                      1,444 (7)     541 (8)             903 (7)
                                      706 (4)       275 (4)             431 (3)
                                      258 (1)       81 (1)              177 (1)
Medical specialty
                                      11,990 (67)   4,731 (70)          7,259 (65)
                                      3,804 (21)    1,433 (21)          2,371 (21)
                                      1,615 (9)     402 (6)             1,213 (11)
                                      424 (2)       126 (2)             298 (3)
Patients seen each week
Over 65 years of age§
                                      406 (3)       31 (1)              375 (3)
                                      700 (5)       83 (3)              617 (5)
                                      1,518 (11)    175 (7)             1,343 (11)
                                      4,833 (34)    792 (31)            4,041 (34)
                                      5,310 (37)    1,125 (45)          4,185 (37)
                                      1,416 (10)    315 (13)            1,101 (10)
Mental illness§
                                      2,304 (16)    512 (21)            1,792 (15)
                                      3,703 (26)    683 (28)            3,020 (26)
                                      3,359 (24)    540 (22)            2,819 (24)
                                      2,553 (18)    397 (16)            2,156 (19)
                                      1,418 (10)    209 (8)             1,209 (10)
                                      789 (6)       123 (5)             666 (6)
Medical illness§
                                      292 (2)       28 (1)              264 (2)
                                      301 (2)       48 (2)              253 (2)
                                      771 (5)       104 (4)             667 (6)
                                      2,299 (16)    325 (13)            1,974 (17)
                                      5,017 (35)    888 (36)            4,129 (35)
                                      5,485 (39)    1,110 (44)          4,375 (38)

To account for covariate biases between
pre- and postperiods, and effect- and
control responders, we adjusted
satisfaction outcomes based on responder
characteristics and clinic experience. To
account for nonlinear associations, we
used a maximum-likelihood recoding
strategy to transform all continuous and
ordinal variables into binary covariates.
Specifically, each continuous and ordinal
variable was independently dichotomized
using nonparametric, bootstrapped,
maximum-likelihood cut-point estimates
for each of the five domains and overall
satisfaction score.36 For each dependent
variable, we determined a model
containing the most predictive covariates
from an exhaustive model search37 using
the generalized Akaike information
criteria38 based on data from postlimits
periods. We then validated these
empirically motivated models using a
10-fold cross-validation approach.39
Next, we constructed a theoretically
motivated model to contain three specific
variables. First, a period indicator
variable assumed a value of zero if the
respondent answered the LPS survey in
prelimits years (2001–2003), or a value of
one if the respondent had answered the
survey during postlimits years (2004–2007).
Second, a differencing variable was
constructed to assume a value of zero if
the respondent was a no-effect control,
and a value of one if the respondent
reported either a positive or negative
effect to the ACGME duty hours limits
question (effect-responder). Third, a
period × differencing variable
interaction term was computed by
multiplying the period indicator and
differencing variables for each respondent.


Multiple illnesses§
                                      255 (2)       19 (1)              236 (2)
                                      274 (2)       57 (2)              217 (2)
                                      858 (6)       121 (5)             737 (6)



We constructed a final model by
combining the terms that made up the
theoretically and empirically motivated
models. All models included a constant
term. We then used a nonnested model
selection test40–43 to compare the fit of all
three models. If the final model fit the
data better than either the theoretically or
empirically motivated model, then duty
hours limits effects were estimated by
exponentiating the estimated coefficient
of the period × differencing variable
interaction term in the final model. To
semantically interpret the interaction


Table 1

                                      No. (%)       No. (%) reporting   No. (%) reporting
                                      from all      from pre–duty       from post–duty
                                                    hours limits        hours limits
                                                    period              period

                                      2,505 (18)    357 (14)            2,148 (18)
                                      5,338 (38)    967 (39)            4,371 (38)
                                      4,924 (35)    972 (39)            3,952 (34)
Alcohol/substance abuse§
                                      889 (6)       152 (6)             737 (6)
                                      3,029 (21)    566 (23)            2,463 (21)
                                      4,310 (31)    680 (27)            3,630 (31)
                                      3,776 (27)    712 (29)            3,064 (26)
                                      1,738 (12)    302 (12)            1,436 (12)
                                      403 (3)       71 (3)              332 (3)
Low income
                                      445 (3)       75 (3)              370 (3)
                                      1,905 (14)    318 (13)            1,587 (14)
                                      4,356 (31)    730 (30)            3,626 (31)
                                      4,047 (29)    724 (29)            3,323 (29)
                                      2,649 (19)    485 (20)            2,164 (19)
                                      721 (5)       129 (5)             592 (5)
No social support
                                      969 (7)       173 (7)             796 (7)
                                      3,404 (24)    572 (23)            2,832 (24)
                                      4,622 (33)    781 (32)            3,841 (33)
                                      3,350 (24)    596 (24)            2,754 (24)
                                      1,475 (11)    269 (11)            1,206 (10)
                                      292 (2)       59 (2)              233 (2)
Satisfaction outcome
Summary score
  70 or above                         16,349 (90)   5,998 (89)          10,351 (90)
Clinical preceptor§
  Very satisfied                      8,593 (45)    2,664 (40)          5,929 (48)
  Somewhat satisfied                  7,909 (42)    2,901 (44)          5,008 (41)
  Neither satisfied nor dissatisfied  1,175 (6)     498 (8)             677 (6)
  Somewhat dissatisfied               861 (5)       371 (6)             490 (4)
  Very dissatisfied                   379 (2)       153 (2)             226 (2)

  Very satisfied                      5,449 (29)    1,475 (22)          3,974 (33)
  Somewhat satisfied                  9,457 (50)    3,599 (54)          5,858 (48)
  Neither satisfied nor dissatisfied  2,123 (11)    912 (14)            1,211 (10)
  Somewhat dissatisfied               1,416 (8)     545 (8)             871 (7)
  Very dissatisfied                   478 (3)       178 (3)             300 (3)


term, we assumed that the adjusted
impact of the differencing variable on
satisfaction is invariant with time
(see Mathematical Appendix,
Supplemental Digital Content 1, http://
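
To make the role of the period × differencing variable interaction term concrete, the sketch below fits a logistic regression with a constant, a period indicator, a differencing variable, and their product, and then exponentiates the interaction coefficient. It is a simplified illustration and not the robust estimator used in this study: the cell counts are hypothetical, every respondent's differencing value is assumed known (no missing data), no other covariates are included, and the small Newton-Raphson fitter is written out only to keep the example self-contained.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_logistic(rows, iterations=25):
    """Maximum-likelihood logistic fit; rows are (x_vector, y, weight)."""
    k = len(rows[0][0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y, w in rows:
            z = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(k):
                grad[i] += w * (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * p * (1.0 - p) * x[i] * x[j]
        step = solve(hess, grad)  # Newton-Raphson update
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Hypothetical counts: (period, differencing, satisfied, not satisfied).
cells = [(0, 0, 60, 40), (0, 1, 60, 40), (1, 0, 60, 40), (1, 1, 80, 20)]
rows = []
for period, diff, n_sat, n_not in cells:
    x = [1.0, float(period), float(diff), float(period * diff)]
    rows.append((x, 1, n_sat))
    rows.append((x, 0, n_not))

beta = fit_logistic(rows)
ror = math.exp(beta[3])  # exponentiated interaction coefficient
print(round(ror, 2))     # 2.67
```

With only these three terms the model is saturated, so the exponentiated interaction coefficient reproduces the ratio of odds ratios computed directly from the four cells, (80/20)/(60/40) divided by (60/40)/(60/40).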
The LPS did not ask respondents about
duty hours limits in prelimits periods
when ACGME rules were not enforced
(missing-data problem). However, the
concept of a no-effect control in prelimits
periods is still relevant. Although one can
only speculate about the actions of VA
staff during 2001–2003, it is possible that
some residents were assigned to work
schedules that complied naturally with
the duty hours rules. Residents may also
have been supervised by attending
physicians who would have done business
as usual and ignored duty hours limits
had such rules been mandatory.
Assigning values to the differencing
variable for prelimits responders is
treated as a missing-at-random problem.
That is, by knowing the year of the
survey, one knows whether the value of
the differencing variable is missing.44
Concerning missing data, the period ×
differencing variable interaction term will
always equal “zero” during prelimits
periods. Thus, only the differencing
variable as a main effects term will have
missing data. Rather than using
imputation, we computed maximum-likelihood
estimates by taking into
account all possible patterns of values for
the missing data. Missing values among
covariates caused by later additions to the
LPS survey were also treated in this way.
Thus, model coefficients were computed
directly from the observable component
of the data without imputation noise.
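
The idea of accounting for all possible values of a missing binary variable, rather than imputing one, can be sketched for a single respondent as follows. This is a simplified illustration and not the method documented in the Mathematical Appendix: the function names are hypothetical, `prob_effect` is an assumed probability that a respondent is an effect-responder, and only the outcome part of the likelihood is shown.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood_contribution(y, period, diff, beta, prob_effect):
    """Log-likelihood of one response y (1 = satisfied, 0 = otherwise).

    beta = (constant, period, differencing, period x differencing).
    diff is 0 or 1 when observed and None when missing.
    """
    def outcome_prob(d):
        z = beta[0] + beta[1] * period + beta[2] * d + beta[3] * period * d
        p = logistic(z)
        return p if y == 1 else 1.0 - p

    if diff is not None:
        return math.log(outcome_prob(diff))
    # Missing value: sum over both possibilities, weighted by their probability.
    mixture = prob_effect * outcome_prob(1) + (1.0 - prob_effect) * outcome_prob(0)
    return math.log(mixture)


beta = (0.2, 0.1, 0.5, 0.8)  # hypothetical coefficients
observed = log_likelihood_contribution(1, 1, 1, beta, prob_effect=0.75)
missing = log_likelihood_contribution(1, 0, None, beta, prob_effect=0.75)
print(observed, missing)
```

Because the interaction term is multiplied by the period indicator, it drops out of the prelimits contribution (period = 0), so only the main-effect coefficient is involved when the differencing value is missing.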





Final models were tested for fit, the
presence of model misspecification, and
multicollinearity. Because of the potential
for misspecification, robust estimation
methods that are valid in the presence of
model misspecification were used to
compute both parameters and CIs.29,45,46






  Very satisfied                      3,221 (22)    336 (13)            2,885 (24)
  Somewhat satisfied                  6,400 (45)    1,149 (46)          5,251 (44)
  Neither satisfied nor dissatisfied  2,452 (17)    555 (22)            1,897 (16)
  Somewhat dissatisfied               1,726 (12)    337 (13)            1,389 (12)
  Very dissatisfied                   592 (4)       131 (5)             461 (4)



Table 1 presents characteristics and
satisfaction scores for 19,605 LPS
physician resident responders classified
by reporting period. Variation in
responder characteristics underscores the
need to adjust for covariate biases.
Compared with all residents in ACGME-accredited programs in 2008–2009,47 the

Table 1

                                      No. (%)       No. (%) reporting   No. (%) reporting
                                      from all      from pre–duty       from post–duty
                                                    hours limits        hours limits
                                                    period              period

  Very satisfied                      4,516 (25)    1,289 (19)          3,227 (28)
  Somewhat satisfied                  8,403 (46)    3,065 (46)          5,338 (46)
  Neither satisfied nor dissatisfied  3,112 (17)    1,308 (20)          1,804 (15)
  Somewhat dissatisfied               1,861 (10)    795 (12)            1,066 (9)
  Very dissatisfied                   427 (2)       178 (3)             249 (2)

  Very satisfied                      4,672 (25)    1,182 (18)          3,490 (29)
  Somewhat satisfied                  8,749 (47)    3,211 (48)          5,538 (46)
  Neither satisfied nor dissatisfied  2,826 (15)    1,259 (19)          1,567 (13)
  Somewhat dissatisfied               1,890 (10)    773 (12)            1,117 (9)
  Very dissatisfied                   611 (3)       244 (4)             367 (3)
Duty hours limits¶
  No effect                                                             2,585 (24)
                                                                        8,068 (76)
  Very negative                                                         96 (1)
  Somewhat negative                                                     465 (4)
  Somewhat positive                                                     3,916 (37)
  Very positive                                                         3,591 (34)

* Total sample (n = 19,605) includes those reporting during pre–duty hours limits periods representing academic
years 2001 (n = 1,752), 2002 (n = 2,531), and 2003 (n = 2,681) and those reporting during post–duty hours
limits periods representing academic years 2004 (n = 2,793), 2005 (n = 3,101), 2006 (n = 3,792), and 2007
(n = 2,955).
† First introduced beginning with the FY2003 survey.
‡ Time between graduating from medical school and beginning residency program is greater than four years.
§ Indicates statistically significant at P < .05 based on two-sided Pearson chi-square test.
¶ First introduced beginning with the FY2004 survey.

LPS sample had slightly fewer females at
7,102 of 18,323 (39%) versus 48,823 of
108,176 (45%) residents in ACGME-accredited
programs, had fewer
international medical school graduates at
3,602 of 14,177 (25%) versus 29,488 of
108,176 (27%) ACGME residents, and
fewer first-year residents at 5,498 of
19,605 (28%) versus 38,404 of 108,176
(36%) ACGME residents.

Table 2 reports estimates of duty hours
limits effects measured as an ROR based
on the robust differencing variable
technique. The wide CIs reflect the
uncertainty associated with working with
incomplete datasets.

Overall, respondents tended to report
higher satisfaction with their VA clinical
training environment when duty hours
limits applied. For instance, respondents
overall were 2.46 times (95% CI [1.49,
4.05], P < .001) more likely to report
satisfaction with VA as a clinical training
environment under duty hours limits
than without such standards. These
findings held across each of the five
domains, for all residents taken together,
and for medicine residents only. Surgery
residents tended to report higher levels of
satisfaction only for clinical faculty or
preceptors and clinical environment.
Estimates for ancillary and psychiatry
specialties were inconclusive.

To understand its relevance to education,
we recalculated ROR estimates of duty
hours limits effect sizes (Table 2) to
reflect the adjusted estimate of the
percentage of respondents who would
change their response from “not
satisfied” to “satisfied” as duty hours
limits became mandatory (Table 3). The
largest change occurred in the clinical
environment domain. For surgery
residents (ROR = 9.10, 95% CI [2.62,
31.61], P = .0005), satisfaction rates for
clinical environments increased from a
prelimits period rate of 60% (Table 1) to
an expected 93% under mandatory
limits, adjusted to reflect differences in
the mix of respondents and other time
trends in the data. That is, we estimate
that 33 out of 100 respondents, who
otherwise would not have been satisfied,
would have reported satisfaction under
mandatory duty hours limits. For
medicine (ROR = 3.46, 95% CI [1.37,
8.70], P = .0084), the prelimits period
satisfaction rate of 58% increased to 83%,
for an adjusted net increase of 25% under
the mandatory duty hours rules.
Similarly, these data suggest that an
expected 12% more surgery residents and
11% more medicine residents would have
reported satisfaction with faculty or
preceptors under ACGME mandatory
duty hours rules than without such rules.


To show the importance of adjusting for
covariate mix and time trends, we note that the unadjusted
pre–post period change in satisfaction with
VA training environments is OR = 1.00 (95%
CI [0.91, 1.11], P = .96). There was also little
adjusted cross-sectional difference in overall
satisfaction between effect-responders and
no-effect controls (OR = 1.12, P = .66). This
finding is comparable with those of studies
showing few differences in patient outcomes
between teaching and nonteaching VA
hospitals.48 Females were generally more
likely to report overall satisfaction for VA
training (OR = 1.12, P = .038) as well as
clinical (OR = 1.09, P = .039) and working
(OR = 1.15, P < .001) environments. The
higher rates of satisfaction among females
are consistent with other surveys.49
Respondents who reported that 50% or
more of the patients they saw were without
family support, or were substance abusers,
were only 56% (OR = 0.56, P < .0001) and
73% (OR = 0.73, P < .0001), respectively,
as likely to report satisfaction with VA
clinical training environments as their
counterparts who saw fewer than 50% of
such patients.
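
The percentage changes reported in the Results for the clinical environment domain can be checked with a short calculation that applies the ROR to the prelimits odds of satisfaction. This is an illustrative sketch only: the function name is hypothetical, and although the arithmetic agrees with the reported 93% and 83%, the conversion actually used for Table 3 is not spelled out here and may include further adjustments.

```python
def expected_rate(pre_rate, ror):
    """Apply an odds multiplier to a baseline rate and convert back to a rate."""
    odds_after = ror * pre_rate / (1.0 - pre_rate)
    return odds_after / (1.0 + odds_after)


surgery = expected_rate(0.60, 9.10)    # prelimits rate 60%, ROR = 9.10
medicine = expected_rate(0.58, 3.46)   # prelimits rate 58%, ROR = 3.46
print(round(surgery, 2), round(medicine, 2))  # 0.93 0.83
```

The differences from the prelimits rates, 33 and 25 percentage points, correspond to the "33% more" and "25% more" figures quoted for surgery and medicine residents.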


Using advanced statistical techniques to
adjust for trend and covariate biases, we
found that the 2003 ACGME standards
significantly and materially enhanced
learning satisfaction rates for medicine
and surgery residents rotating through
VA medical centers. The statistical tools,
along with our large sample size and
robust survey, provided a comprehensive
estimate of the impact of duty hours


Graduate Medical Education

Table 2
Effect of Accreditation Council for Graduate Medical Education Duty Hours
Limits on Resident Satisfaction With Clinical Rotations Through Veterans Affairs
Medical Centers Between 2001 and 2007

95% CI

P† GAIC/2n‡



Overall clinical training


All specialties¶


1.49–4.05 .0004




1.80–4.95 .0000



1.26 0.14–11.46 .8349


1214 16,774


915 11,315










1.42 0.03–73.54 .8612




Clinical faculty/preceptors


All specialties¶


1.84–4.72 .0000


1060 16,394



2.09–5.79 .0000


979 11,047


4.76 1.68–13.49 .0034






0.48–1.42 .4918





1.63 0.06–46.56 .7762





Learning environment


All specialties¶


1.47–3.38 .0001


1650 17,236



1.59–3.90 .0001


1239 11,616



0.79–6.75 .1237





2.21 0.42–11.55 .3471










duty hours limits, and to understand how
residents’ satisfaction with their training
environments can improve as duty hours
limits rules are enforced.
These findings were consistent with
subanalyses conducted across domain
elements, and when satisfaction scales
were “cut” at different levels. However,
our results both compared to and
contrasted with those of previous studies.
Specifically, these findings are consistent
with reported associations between
reduced work hours and residents’
perceptions of more time to read and
learn independently,24,28,50 greater
attending supervision,28,51 and attending
physicians’ increased role in patient
care.52 In contrast, these findings differ
from postsurveys22–25,53 and pre–post
surveys26 –28 that reported clinical
experiences and patient-care quality
remained unchanged, or even worsened,
with fewer duty hours.


0.17–2.35 .4899

Clinical environment


All specialties¶


2.05–7.55 .0000


5037 12,935



1.37–8.70 .0084





9.10 2.62–31.61 .0005





4.90 0.99–24.33 .0520










0.00–4.07 .1789

Physical environment


All specialties¶


1.27–3.21 .0029


2406 16,710



1.34–3.67 .0020


1772 11,294



0.89–7.94 .0798





0.80 0.03–22.81 .8976










0.00–0.22 .0036

Working environment


All specialties¶


2.04–4.27 .0000


1547 17,081



2.13–4.90 .0000


1393 11,500



0.60–5.58 .2843





5.42 1.83–16.07 .0023





4.37 0.27–70.62 .2985





There are several possible reasons for the
disparity between these survey findings and
ours. First, the robust differencing variable
technique applied here was designed to
adjust for time trends using respondent-level
controls with pre–post survey data.
Such corrections, in fact, had an important
effect on our study findings. For example,
we found no ACGME duty hours limits
effect on satisfaction rates (OR = 1.00, 95%
CI [0.91, 1.11], P = .96) with LPS data
when effect sizes were based entirely on
unadjusted pre–post differences. Adjusting
for time trends alone, the estimated effect
size increased to an ROR of 2.13 (95% CI
[1.27, 3.58], P = .004), and to 2.46 (95% CI
[1.49, 4.05], P < .001) (Table 2) when
estimates were further adjusted to account
for differences in responder mix across
periods and duty hours limits effect


Exponentiation of period × differencing variable interaction parameter (i.e., ratio of odds ratios), with effect
size, confidence interval, standard error, and model fit computed using robust missing data methods46,66 (see
also Mathematical Appendix).
Computed from robust standard errors.45,46,65
Estimate of model fit.38
Measure of multicollinearity computed as the maximum condition number (CN) for the Hessian and outer
product gradient of the variance/covariance matrices, where CN = |λmax(matrix)/λmin(matrix)| and λ denotes the
eigenvalues of the corresponding matrix.
P < .01.
Cannot be estimated.
Misspecified model.45,46,65

limits on residents’ satisfaction with their
educational environment. Understanding
these effects can provide useful


information to government agencies,
accrediting bodies, teaching hospitals, and
program directors in assessing the effects of

A second explanation for the
discrepancies may involve differences in
survey designs. For purposes of
identifying control respondents, the LPS
survey asked responders to rate
satisfaction about current clinical
rotations and whether duty hours limits
(including limits on schedules and shifts)
had an effect (good or bad) on the
respondent’s actual VA training
environment. Postsurvey designs often
focused on previous clinical training
experiences and actual hours worked,
which are subject to underreporting

Academic Medicine, Vol. 85, No. 7 / July 2010


Table 3
Adjusted* Estimates in Satisfaction Rates for Medicine and Surgery Resident
Respondents to the Veterans Affairs Learners’ Perceptions Survey (LPS) After
Accreditation Council for Graduate Medical Education Duty Hours Limits,
Columns: Percentage of respondents who report satisfaction before duty hours limits†;
Change in percentage satisfied after duty hours limits‡ (Estimate; 95% CI)






4.8% to 9.0%
−34.4% to 9.1%
7.4% to 12.4%
5.8% to 14.4%
7.9% to 17.7%
−4.2% to 17.1%
7.4% to 34.4%
19.7% to 38.0%
6.4% to 22.6%
−2.6% to 25.9%
12.0% −12.2% to 25.3%
14.9% to 25.3%


Adjusted for covariate and trend biases.
Computed as the percentage of respondents for academic years 2001, 2002, or 2003 who reported “very
satisfied” or “satisfied” on the LPS survey, computed from data in Table 1.
Adjusted percentage of respondents [p2] satisfied during post–duty hours periods computed from effect sizes
(Table 2) [R] and the percentage of respondents satisfied during pre–duty hours limits periods (column 1 of this
table) [p1], or g = [p1 / (1 − p1)] × [R], and p2 = g / (1 + g).
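
As a worked illustration of the conversion described in the footnote above, the following minimal sketch applies g = [p1 / (1 − p1)] × R and p2 = g / (1 + g). The function name and the pre-limits satisfaction rate of 80% are hypothetical and chosen only for illustration; R = 2.46 with limits 1.49 and 4.05 is the overall clinical training estimate quoted in the text.

```python
def post_limits_rate(p1: float, ror: float) -> float:
    """Convert a pre-limits satisfaction rate p1 and a ratio of odds
    ratios R into the adjusted post-limits rate p2."""
    if not 0.0 < p1 < 1.0:
        raise ValueError("p1 must lie strictly between 0 and 1")
    if ror <= 0.0:
        raise ValueError("the ratio of odds ratios must be positive")
    g = p1 / (1.0 - p1) * ror   # post-limits odds of satisfaction
    return g / (1.0 + g)        # convert odds back to a proportion


# Hypothetical pre-limits rate (not taken from Table 3).
p1 = 0.80

# Point estimate and 95% CI limits for overall clinical training (from the text).
for label, r in [("lower", 1.49), ("estimate", 2.46), ("upper", 4.05)]:
    p2 = post_limits_rate(p1, r)
    print(f"{label}: p2 = {p2:.1%}, change = {p2 - p1:+.1%}")
```

Because the conversion works on the odds scale, the same R produces a smaller change in percentage points when the pre-limits rate is already high, which may help explain why the changes in Table 3 vary across domains.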

A third difference may be attributable to
the sample and the survey design. One-third of the nation’s residents rotate
through VA medical centers under VA
affiliation agreements with 107 U.S.
medical schools,55 with VA second only
to Medicare and Medicaid as the largest
funder of residency training in the United
States.56 Although VA teaching medical
centers likely differ from non-VA
teaching hospitals, this is the largest
survey of physician resident satisfaction
to date and involves a variety of facility
sizes and medical school affiliations in
diverse geographic areas across the
United States. Furthermore, the
confidential LPS survey is administered
by a federal agency under strict rules of
confidentiality enforced under federal
oversight by the Office of Management
and Budget. Promoted as an
administrative tool designed to improve
VA as a clinical training environment,33,34
the LPS survey began with the 2001
academic year, three years before duty
hours limits were first implemented, and
one full year after full implementation of
VA’s quality improvement initiatives had
been completed.57,58

Fourth, by classifying respondents
individually into “effect” respondents and
“no-effect” controls, we avoided
aggregation errors created when
respondents were grouped by educational
program or facility. Overall, 36% of LPS
respondents claimed that duty hours limits
did not impact their VA clinical rotations
during postlimits academic years
(2004–2007). Such reports occurred across
programs, specialties, and facilities,
indicating the diversity of experiences
residents encountered within the same
programs and teaching facilities.
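
The ratio of odds ratios (ROR) that results from comparing “effect” respondents with “no-effect” controls can be illustrated with a simplified, unadjusted sketch. It assumes, unlike the study itself, that group membership is fully observed in both periods and that no covariate or missing-data adjustment is needed. All counts and function names below are hypothetical.

```python
import math


def odds(satisfied: int, total: int) -> float:
    """Odds of satisfaction from a count of satisfied respondents."""
    if not 0 < satisfied < total:
        raise ValueError("need 0 < satisfied < total for finite, nonzero odds")
    return satisfied / (total - satisfied)


def ratio_of_odds_ratios(effect_pre, effect_post, control_pre, control_post):
    """Each argument is a (satisfied, total) tuple for one group and period."""
    or_effect = odds(*effect_post) / odds(*effect_pre)      # pre-post change, "effect" group
    or_control = odds(*control_post) / odds(*control_pre)   # pre-post change, "no-effect" controls
    return or_effect / or_control


# Hypothetical counts, for illustration only (not study data).
ror = ratio_of_odds_ratios(
    effect_pre=(800, 1000),    # odds 4
    effect_post=(900, 1000),   # odds 9
    control_pre=(800, 1000),   # odds 4
    control_post=(800, 1000),  # odds 4
)

# On the log-odds scale the ROR is the period-by-group interaction term.
interaction = math.log(ror)
print(f"ROR = {ror:.2f}, log ROR = {interaction:.3f}")
```

Dividing by the controls' odds ratio removes any pre–post trend shared by both groups, which is the intuition behind using “no-effect” respondents as individual-level controls.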

Finally, it may not be unusual to find
“no-effect” environments after 2003
because some training programs had
failed on occasion to adhere to
mandatory duty hours rules. In one
study, respondents reported exceeding
the 80-hour rule at least once during six
months in surgical (89%) and
nonsurgical (74%) specialties while
underreporting their work hours to their
program directors (73% and 38%,
respectively).54 In a national survey of
interns after ACGME implementation,
67% reported working shifts beyond the
30-hour rule, 43% more than the 80-hour rule, and 44% less than the one-in-seven-day rule.59 Despite having regulated
resident duty hours since 1989, New York
State found 54 of the state’s 82 teaching
hospitals were in noncompliance.60

The present study has certain limitations.
First, VA clinic rotations may not necessarily
represent experiences at non-VA
locations. Second, respondents may not
know when duty hours limits affected
their training environments, thus leading
to overreporting of “no effect” on the
ACGME duty hours limits question.
However, overreporting “no-effect”
would bias estimates of duty hours limits
effect sizes toward zero. Third, it is
unknown whether resident satisfaction
with clinical training is related to
objective measures of education
outcomes, such as in-service
competencies examinations, board
scores, and attending physician
evaluations. Fourth, covariates we used to
adjust for differences in respondent mix
may not have controlled for all relevant
factors that drive satisfaction rates. The
study did not address why satisfaction
may have changed, but this shift could be
explained by many factors in addition to
duty hours limits, including changes in
workload, work life,61 resident cross-coverage, night-float systems,
redistribution of workload, reassignment
of noneducational tasks to midlevel and
lower-level providers,62 clinical schedules
that minimize sleep interruption,63 or
reduced in-house on-call duties. Fifth,
the results are based on resident
perceptions and may not necessarily
reflect true differences in the quality of
patient care or the effectiveness of the
teaching environment. Finally, it is
unknown whether further restrictions on
duty schedules will continue to improve
resident satisfaction.




In summary, applying advanced
statistical methods to robust survey data,
we found the 2003 ACGME mandatory
duty hours limits were associated with
improved training satisfaction rates. With
the prospect that ACGME may adopt
new standards for resident duty hours,16
education researchers may wish to
consider using the LPS survey design with
robust differencing analyses to assess the
impact of new standards across U.S.
teaching hospitals.64

