Estimation Considerations for the Occupational
Requirements Survey
Bradley D. Rhein, Chester H. Ponikowski, and Erin McNulty
U.S. Bureau of Labor Statistics, 2 Massachusetts Ave., NE, Room 3160,
Washington, DC 20212
Abstract
The Bureau of Labor Statistics (BLS) is working with the Social Security Administration
(SSA) to carry out a series of tests to determine the feasibility of using the National
Compensation Survey (NCS) platform to accurately and reliably capture data that are
relevant to the SSA's disability program. The proposed new Occupational Requirements
Survey (ORS) is envisioned to be an establishment survey that collects information on the
vocational preparation and the cognitive and physical requirements of occupations in the
U.S. economy, as well as the environmental conditions in which those occupations are
performed. Many of the data elements are collected on the basis of presence and then are
measured by duration (length of time). Some data elements collected are conditional on the
presence of another data element. This paper discusses the considerations for developing
the estimates to be produced for ORS and how the micro data may be adjusted to account
for unit and item non-response.
Key Words: estimation, binomial data, point estimates, establishment survey,
descriptive statistics
1. Introduction
In the summer of 2012, the Social Security Administration (SSA) and the Bureau of Labor
Statistics (BLS) signed an interagency agreement to begin the process of testing the
collection of data on occupations. As a result, the Occupational Requirements Survey
(ORS) began testing in late 2012. The goal of ORS is to collect and publish occupational
information that will replace the outdated data currently used by SSA. All ORS products
will be made public for use by non-profits, employment agencies, state or federal agencies,
the disability community, and other stakeholders. More information on the background of
ORS can be found in the next section.
An ORS interviewer attempts to collect close to 70 data elements related to the
occupational requirements of a job. The following four groups of information will be
collected:
Physical demand characteristics/factors of occupations (e.g. strength, hearing, or
stooping)
Educational requirements
Cognitive elements required to perform work
Environmental conditions in which the work is completed
During the last two years of survey testing, there have been a number of changes to the
details of collection. Some data elements were not considered necessary and were dropped.
New data elements were added to the list. Many data element definitions were refined,
while continuing to rely on definitions from the Revised Handbook for Analyzing Jobs
(RHAJ) [1]. In some cases, the method of collection was changed. For instance, initially
the duration of an occupational activity was captured categorically using ranges of time
defined by SSA. Currently, duration is captured in hours or as a percentage of the day; the
fallbacks are a range specified by the respondent or a duration scale.
This paper explores the estimation and some of the non-response options for ORS data.
Section 2 provides background information on the Occupational Requirements Survey.
Section 3 explains the types of ORS data elements captured and available for use in
estimation. Section 4 explores the estimation possibilities for each data element, and
Section 5 provides a glimpse of some non-response adjustment possibilities. The paper
ends with a conclusion and description of further research to be completed.
2. Background Information on ORS
In addition to providing Social Security benefits to retirees and survivors, the Social
Security Administration (SSA) administers two large disability programs, which provide
benefit payments to millions of beneficiaries each year. Determinations for adult disability
applicants are based on a five-step process that evaluates the capabilities of workers, the
requirements of their past work, and their ability to perform other work in the U.S.
economy. In some cases, if an applicant is denied disability benefits, SSA policy requires
adjudicators to document the decision by citing examples of jobs the claimant can still
perform despite restrictions (such as limited ability to balance, stand, or carry objects) [2].
For over 50 years, the Social Security Administration has turned to the Department of
Labor's Dictionary of Occupational Titles (DOT) [3] as its primary source of occupational
information to process the disability claims [4]. SSA has incorporated many DOT
conventions into its disability regulations. However, the DOT was last updated in its
entirety in the late 1970s, although a partial update was completed in 1991. Consequently,
the SSA adjudicators who make the disability decisions must continue to refer to an
increasingly outdated resource because it remains the most compatible with their statutory
mandate and is the best source of data at this time.
When an applicant is denied SSA benefits, SSA must sometimes document the decision by
citing examples of jobs that the claimant can still perform, despite their functional
limitations. However, since the DOT has not been updated for so long, there are some jobs
in the American economy that are not even represented in the DOT, and other jobs, in fact
many often-cited jobs, no longer exist in large numbers in the American economy. For
example, a job that is often cited is “envelope addressor,” because it is an example of a
low-skilled job from the DOT with very low physical demands. There are serious doubts
about whether or not this job still exists in the economy.
SSA has investigated numerous alternative data sources for the DOT such as adapting the
Employment and Training Administration’s Occupational Information Network (O*NET)
[5], using the BLS Occupational Employment Statistics program [6] (OES), and
developing their own survey. But SSA was not successful with any of these potential data
sources and turned to the National Compensation Survey program at the Bureau of Labor
Statistics.
3. Captured Data Elements
ORS is designed to capture occupational information on educational requirements,
cognitive and physical demands, and exposures to environmental conditions. Each of the
data elements falls into one of two data types: categorical or continuous. Data elements that are
categorical have a set of predetermined values, one of which will be selected as a response.
Continuous data may be limited by a minimum, such as zero hours, or a maximum, such
as 100 percent.
Educational requirements are part of a category called “Specific Vocational Preparation”
(SVP). SVP measures the amount of time it takes a typical worker to learn the techniques
and acquire the information needed for average job performance. There are four
components:
1. Minimum level of education, including literacy
2. Previous job experience required
3. Time spent in earning certifications and licenses (pre- or post-employment
training)
4. On-the-job training (expressed as the time to average performance).
Level of education is measured by the receipt of a degree, and an amount of time is assigned
based on the degree. The four components are used to identify the following three
occupational requirements, listed in Tables 1-3 below, desired by the SSA.
Table 1: Specific Vocational Preparation

| SVP Level | Amount of Specific Vocational Preparation Time |
|---|---|
| 1 | Short demonstration only |
| 2 | Anything beyond short demonstration up to and including 1 month |
| 3 | Over 1 month up to and including 3 months |
| 4 | Over 3 months up to and including 6 months |
| 5 | Over 6 months up to and including 1 year |
| 6 | Over 1 year up to and including 2 years |
| 7 | Over 2 years up to and including 4 years |
| 8 | Over 4 years up to and including 10 years |
| 9 | Over 10 years |

Table 2: Job Zone

| Job Zone | Preparation Level | SVP Level(s) |
|---|---|---|
| 1 | Little or no preparation needed | 1-3 |
| 2 | Some preparation needed | 4-5 |
| 3 | Medium preparation needed | 6 |
| 4 | Considerable preparation needed | 7 |
| 5 | Extensive preparation needed | 8-9 |

Table 3: Skill Level

| Skill Level | SVP Levels |
|---|---|
| Unskilled | 1-2 |
| Semi-skilled | 3-4 |
| Skilled | 5-9 |

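The mappings in Tables 2 and 3 can be read as simple lookups from an SVP level. The following is a minimal sketch of that reading, not a BLS or SSA implementation; the function names `job_zone` and `skill_level` are hypothetical and chosen only for illustration.

```python
def job_zone(svp_level: int) -> int:
    """Map an SVP level (1-9) to a Job Zone (1-5), following Table 2."""
    if not 1 <= svp_level <= 9:
        raise ValueError("SVP level must be between 1 and 9")
    if svp_level <= 3:
        return 1  # Little or no preparation needed
    if svp_level <= 5:
        return 2  # Some preparation needed
    if svp_level == 6:
        return 3  # Medium preparation needed
    if svp_level == 7:
        return 4  # Considerable preparation needed
    return 5      # Extensive preparation needed


def skill_level(svp_level: int) -> str:
    """Map an SVP level (1-9) to a Skill Level, following Table 3."""
    if not 1 <= svp_level <= 9:
        raise ValueError("SVP level must be between 1 and 9")
    if svp_level <= 2:
        return "Unskilled"
    if svp_level <= 4:
        return "Semi-skilled"
    return "Skilled"
```

For instance, an occupation coded at SVP level 4 would fall in Job Zone 2 and be classed as Semi-skilled under these tables.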
Cognitive abilities are captured entirely in categories. Information is gathered about the
complexity of the job, the amount of oversight and controls associated with the job, the
occurrence of deviations in work tasks, work schedules, and work location as well as the
types and frequency of personal interactions required to perform typical job duties.
The job complexity data element measures the level of decision-making, comprehension,
memory, and application of information needed to perform the typical duties of an
occupation. Work controls refer to the level of supervision and requirement for workers to
adhere to established guidelines. Several data elements measure the work routine of an
occupation, including the deviation in tasks, work schedule, and work location. Finally, the
cognitive data elements explore the type and frequency of worker communications with
established working relationships, labeled regular contacts, and other contacts.
Table 4 summarizes the cognitive data elements and possible responses.
Table 4: Cognitive Data Element Responses

| Cognitive Measure | Possible Response |
|---|---|
| How complicated are the tasks of the occupation? | Very simple, simple, moderate, complex, very complex |
| How closely controlled is the occupation's work? | Very closely, closely, moderately, loosely, very loosely |
| How often are there deviations from the norm in work tasks? | Hourly, daily, weekly, monthly, less than monthly |
| How often are there deviations from the norm in work schedule? | Hourly, daily, weekly, monthly, less than monthly |
| How often are there deviations from the norm in work location? | Hourly, daily, weekly, monthly, less than monthly |
| How often does the occupation verbally interact (work related) with regular contacts? | Hourly, daily, weekly, monthly, less than monthly |
| What type of work-related interactions does the occupation have with regular contacts? | Very structured, structured, semi-structured, unstructured, very unstructured |
| How often does the occupation verbally interact (work related) with people other than regular contacts? | Hourly, daily, weekly, monthly, less than monthly |
| What type of work-related interactions does this occupation have with people other than regular contacts? | Very structured, structured, semi-structured, unstructured, very unstructured |

Physical demands are captured in two ways: presence and duration. A physical demand is
considered present if the worker is required to perform an activity as part of the job.
Duration measures the amount of time in a typical work day a worker spends performing a
physical demand. Some physical demands will only have information on presence, such as
the demands for hearing or vision. The full list of physical demands is as follows:
Sitting
Standing/Walking
Sitting vs. Standing at Will
Lifting/Carrying
Reaching (overhead and at or below the shoulder)
Pushing and pulling with arms, legs, or feet only
Climbing stairs and ramps (job related or structural)
Climbing ladders and scaffolds
Crouching, kneeling, crawling, stooping
Use of hands (gross manipulation)
Use of fingers (fine manipulation)
Use of one or both feet or legs to move controls on machinery or equipment
Keyboarding (10-key, traditional, touch screen, other)
Hearing
Vision (near/far acuity, peripheral)
Driving and Vehicle Type
Communicating verbally (one-on-one, group, telephone, or other sounds)
Some physical demands have sub-questions. One example is the reaching data element.
Once reaching is found to be present and a duration is captured, an additional piece of
information is collected: does the reaching require one arm or both arms? Other physical
demands that have sub-questions include pushing and pulling, use of hands, and use of
fingers. An example for reaching is found in Table 5.
Table 5: Reaching with one or both arms

| Data Element | Response |
|---|---|
| Reaching | 2 hours |
| One arm or both arms | One arm |

Lifting and carrying weight is an exception. One data element measures the maximum
amount a worker would have to lift and/or carry. After determining the most that a worker
would have to lift, the amount of weight being lifted within a certain duration, such as
between one-third and two-thirds of the time, would be captured. SSA uses the categories
in Table 6, most of which are defined by the RHAJ [1].
Table 6: SSA Categories for Lifting and Carrying Weight

| How Often | Ranges for Amount of Weight | | | | |
|---|---|---|---|---|---|
| Never | None | None | None | None | None |
| Seldom (not in the RHAJ) | 0 to 10 lbs | 11 to 20 lbs | 21 to 50 lbs | 51 to 100 lbs | More than 100 lbs |
| Up to 1/3 of the time | 0 to 10 lbs | 11 to 20 lbs | 21 to 50 lbs | 51 to 100 lbs | More than 100 lbs |
| 1/3 up to 2/3 of the time | Negligible | 0 to 10 lbs | 11 to 25 lbs | 26 to 50 lbs | More than 50 lbs |
| 2/3 of the time or more | None | Negligible | 0 to 10 lbs | 11 to 20 lbs | More than 20 lbs |

One other physical demand data element measures the strength required for a job. This
data element incorporates the amount of weight a worker must lift, push, and/or pull.
Currently, the weight thresholds for each category of the strength data element are still
under development.
Environmental conditions are the specific surroundings and circumstances in which an
occupation is typically performed. Data are captured on eleven environmental conditions.
Ten such conditions are captured by presence and duration. Exposure should be coded as
experienced with the use of personal protective equipment. Environmental conditions data
are collected for the amount of time a worker is exposed to:
Outdoors
Humidity
Extreme heat
Extreme cold
Heavy vibration
High, exposed places
Wetness
Proximity to moving parts
Toxic, caustic chemicals
Fumes, noxious odors, dusts, gases
The eleventh environmental condition is noise intensity level. This condition is captured in
four categories - quiet, moderate, loud, and very loud - considering the job’s typical noise
intensity exposure and accounting for any protective equipment.
4. Estimation Possibilities
Estimation options will depend on the type of data element. For any categorical data
element, a percentage of workers that fall into a given category could be calculated. For
continuous data elements, descriptive statistics could be calculated, including the mean
amount of time and percentiles. If the continuous data were placed within pre-specified
ranges, they would become categorical values, and percentages could be calculated. Table
7 presents many of the ORS data elements and their respective estimation possibilities.
Table 7: Sample ORS Data Elements and Possible Estimates
Table 8 presents SSA’s requested categories for the duration of any activity or exposure.
Table 8: SSA Duration Categories

| SSA Duration Category | Activity/Exposure Occurrence in a Work Day |
|---|---|
| Not Present | Not present |
| Seldom | Occurs less than 2% of the time |
| Occasionally | Occurs 2% to less than 1/3 of the time |
| Frequently | Occurs 1/3 to less than 2/3 of the time |
| Constantly | Occurs 2/3 of the time or more |

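Converting a continuous duration into the SSA categories of Table 8 amounts to a threshold lookup. The sketch below illustrates one way to do this, assuming the duration has already been expressed as a fraction of the work day and that a duration of exactly zero is treated as "Not Present"; both the function name and that zero-handling rule are assumptions made for illustration, not part of the survey specification.

```python
def ssa_duration_category(fraction_of_day: float) -> str:
    """Classify a share of the work day (0.0 to 1.0) using the Table 8 thresholds."""
    if not 0.0 <= fraction_of_day <= 1.0:
        raise ValueError("fraction_of_day must be between 0 and 1")
    if fraction_of_day == 0.0:
        return "Not Present"   # assumption: zero duration means the element is absent
    if fraction_of_day < 0.02:
        return "Seldom"        # less than 2% of the time
    if fraction_of_day < 1 / 3:
        return "Occasionally"  # 2% to less than 1/3 of the time
    if fraction_of_day < 2 / 3:
        return "Frequently"    # 1/3 to less than 2/3 of the time
    return "Constantly"        # 2/3 of the time or more


# Example: 2 hours of reaching in an 8-hour day is one quarter of the day.
category = ssa_duration_category(2 / 8)
```

Once durations are categorized this way, the percentage formula for categorical data applies to each category.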
For data elements that can be expressed as a category (see Table 7), the percentage of
workers falling into each category will be calculated relative to the total number of
workers.
The percentage will be calculated over some specified domain, such as an occupation, an
occupation within an industry, or a group of occupations. The formula for the percentage
of workers in the domain is as follows.
\[
\frac{\sum_{i=1}^{I} \sum_{g=1}^{G_i} \mathit{OccFW}_{ig} \, X_{ig} \, Z_{ig}}
     {\sum_{i=1}^{I} \sum_{g=1}^{G_i} \mathit{OccFW}_{ig} \, X_{ig}} \times 100
\]
where:
i = Establishment
I = Total number of establishments
g = Occupational quote within establishment i
Gi = Total number of occupational quotes in establishment i
OccFWig = Final occupational quote weight for occupation g in establishment i (a weight
denoting the number of employees in the frame that are represented by occupation g in
establishment i)
Xig = 1 if occupational quote ig is in the domain; 0 otherwise
Zig = 1 if the data element is present for occupational quote ig; 0 otherwise
To calculate the percentage of employees for whom a data element is present out of all
employees in the domain, add the final quote weights across only those occupational quotes
in the domain for whom the data element is present. Then divide that number by the sum
of the final quote weights across quotes in the domain. Multiply the final quotient by 100
to yield a percentage estimate.
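The calculation just described can be sketched in a few lines of code. This is an illustrative sketch only: the `Quote` record and the `percent_present` function are hypothetical names, and the weights in the example are made up rather than taken from ORS data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Quote:
    weight: float    # OccFWig, the final occupational quote weight
    in_domain: bool  # Xig: True if the quote is in the domain
    present: bool    # Zig: True if the data element is present


def percent_present(quotes: Iterable[Quote]) -> float:
    """Weighted percentage of workers in the domain for whom the element is present."""
    quotes = list(quotes)
    denominator = sum(q.weight for q in quotes if q.in_domain)
    if denominator == 0:
        raise ValueError("no weighted quotes in the domain")
    numerator = sum(q.weight for q in quotes if q.in_domain and q.present)
    return 100.0 * numerator / denominator


# Hypothetical quotes: the third is outside the domain and does not contribute.
sample = [
    Quote(weight=100.0, in_domain=True, present=True),
    Quote(weight=300.0, in_domain=True, present=False),
    Quote(weight=50.0, in_domain=False, present=True),
]
estimate = percent_present(sample)  # 100 * 100 / 400 = 25.0
```

Quotes outside the domain drop out of both sums, which is the role the indicator Xig plays in the formula.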
An example of a percentage estimate follows. Each estimate will have an associated
measure of error. Calculation of standard errors will be left for a future paper.
Example 1: Education Level for Salespersons
Some data elements depend on the presence of a related data element. For instance, as
noted above, once an occupation is known to require reaching overhead, an additional
question captures whether the job requires the worker to reach overhead with one hand or
with both hands. All of these circumstances necessitate a calculation of a percentage. A
complete list of such data elements follows.
Reaching overhead
Reaching from the shoulder
Pushing or pulling
Use of hands
The following example illustrates a sub-category of the reaching data element. Once the
presence of reaching has been established for a particular occupation, a follow-up question
records whether the reaching can be done with one hand/arm only, or if reaching with both
hands/arms is necessary. In the example below, of the 75% of workers required to reach
overhead, 20% could perform the reaching with one hand or arm.
Example 2: Presence of Reaching Overhead for Salespersons
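To make the relationship between a presence estimate and its dependent sub-question concrete, the sketch below uses made-up quote weights chosen so that they reproduce the 75% and 20% figures quoted in the text. The data, the tuple layout, and the function name `weighted_percent` are all hypothetical; the point is only that the sub-question percentage is computed over workers for whom reaching is present, while the overall share uses all workers in the domain.

```python
def weighted_percent(weights_in_category, weights_in_base):
    """Weighted share of the base group that falls in the category, as a percentage."""
    return 100.0 * sum(weights_in_category) / sum(weights_in_base)


# Hypothetical quotes: (final quote weight, reaching present, one arm only)
quotes = [
    (60.0, True, True),
    (240.0, True, False),
    (100.0, False, False),
]

all_weights = [w for w, _, _ in quotes]
reaching_weights = [w for w, reaches, _ in quotes if reaches]
one_arm_weights = [w for w, reaches, one_arm in quotes if reaches and one_arm]

pct_reaching = weighted_percent(reaching_weights, all_weights)              # 75.0
pct_one_arm_given_reaching = weighted_percent(one_arm_weights, reaching_weights)  # 20.0
pct_one_arm_overall = weighted_percent(one_arm_weights, all_weights)        # 15.0
```

Under these assumed weights, 20% of the 75% who reach overhead corresponds to 15% of all workers in the domain.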
Finally, a number of data elements are captured as continuous data. Mean and percentile
values will be calculated for all continuous data elements (see Table 7). In addition, some
continuous data will be converted into a range for use by SSA. These ranges will act as
categories and a percentage of workers will be calculated, as described earlier for
categorical data.
A mean for a data element will be calculated over some specified domain, such as an
occupation, and the formula is as follows:
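\[
\frac{\sum_{i=1}^{I} \sum_{g=1}^{G_i} \mathit{OccFW}_{ig} \, X_{ig} \, Z_{ig} \, Q_{ig}}
     {\sum_{i=1}^{I} \sum_{g=1}^{G_i} \mathit{OccFW}_{ig} \, X_{ig} \, Z_{ig}}
\]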
where:
i = Establishment
I = Total number of establishments in the survey
g = Occupational quote within establishment i
Gi = Total number of occupational quotes in establishment i
OccFWig = Final quote weight for occupation g in establishment i
Xig = 1 if occupational quote ig is in the domain; 0 otherwise
Zig = 1 if the data element is present for occupational quote ig; 0 otherwise
Qig = Value of the continuous data element for occupational quote ig
To calculate the mean value of a continuous data element, multiply the final quote weight
and the value of the element for those occupational quotes in the domain for whom the
element is present; add these values across all contributing quotes to create the numerator.
Divide this number by the sum of the final quote weights across only those quotes in the
domain for whom the element is present.
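The mean calculation described above can be sketched as follows. The tuple layout and the function name `weighted_mean` are hypothetical, and the numbers are illustrative rather than ORS results.

```python
def weighted_mean(quotes):
    """Weighted mean of a continuous element over quotes in the domain where it is present.

    Each quote is a tuple (weight, in_domain, present, value), standing in for
    OccFWig, Xig, Zig, and Qig respectively.
    """
    numerator = 0.0
    denominator = 0.0
    for weight, in_domain, present, value in quotes:
        if in_domain and present:
            numerator += weight * value
            denominator += weight
    if denominator == 0:
        raise ValueError("no contributing quotes in the domain")
    return numerator / denominator


# Hypothetical hours of standing per day; the last two quotes do not contribute.
sample = [
    (100.0, True, True, 2.0),
    (300.0, True, True, 4.0),
    (50.0, True, False, 0.0),   # element not present
    (80.0, False, True, 8.0),   # outside the domain
]
mean_hours = weighted_mean(sample)  # (100*2 + 300*4) / 400 = 3.5
```

Note that quotes where the element is not present are excluded from the denominator as well, so the mean describes only workers who actually perform the activity.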
The following percentiles, designated as p, would be calculated: 10th, 25th, 50th (median),
75th, and 90th. The p-th percentile is the value of the continuous data element Qig such that:
the sum of final occupational quote weights (OccFWig) across quotes with a value
less than Qig is less than p percent of all final quote weights, and
the sum of final occupational quote weights (OccFWig) across quotes with a value
more than Qig is less than (100 - p) percent of all final quote weights.
It is possible that no specific quote ig satisfies both of these properties.
This occurs when there exists a quote for which the sum of OccFWig across records whose
value is less than Qig equals p percent of all final quote weights. In this situation, the p-th
percentile is the average of Qig and the value on the record with the next lowest value. Include
only occupational quotes in the domain for whom the element is present, i.e., where:
\[
X_{ig} \, Z_{ig} = 1
\]
where:
Xig = 1 if occupational quote ig is in the domain; 0 otherwise
Zig = 1 if the data element is present for occupational quote ig; 0 otherwise
i = Establishment
g = Occupational quote within establishment i
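The percentile rule above, including the averaging step for the tie case, can be sketched as follows. This is one reading of the definition, assuming strictly positive weights and that the input has already been restricted to quotes in the domain where the element is present; the function name `weighted_percentile` and the example values are hypothetical.

```python
import math


def weighted_percentile(quotes, p):
    """p-th weighted percentile of (weight, value) pairs, with 0 < p < 100.

    Assumes positive weights and that only contributing quotes are passed in.
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(quotes, key=lambda q: q[1])  # ascending by value
    if not ordered:
        raise ValueError("no contributing quotes")
    total = sum(weight for weight, _ in ordered)
    target = total * p / 100.0

    cumulative = 0.0
    i = 0
    while i < len(ordered):
        value = ordered[i][1]
        # Pool the weight of every record that shares this value.
        while i < len(ordered) and ordered[i][1] == value:
            cumulative += ordered[i][0]
            i += 1
        if math.isclose(cumulative, target) and i < len(ordered):
            # Tie case: weight at or below this value is exactly p percent,
            # so average this value with the next higher one.
            return (value + ordered[i][1]) / 2.0
        if cumulative > target:
            # Weight below is under p percent and weight above is under (100 - p) percent.
            return value
    return ordered[-1][1]


equal_weights = [(100.0, 1.0), (100.0, 2.0), (100.0, 3.0), (100.0, 4.0)]
median = weighted_percentile(equal_weights, 50)  # tie at 200 of 400: (2.0 + 3.0) / 2
```

In the tie case, the pair being averaged is the same pair the text describes: the value whose lower-valued weight equals p percent, and the next lowest value beneath it.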
5. Initial Non-response Adjustment Ideas
Both unit and item non-response are expected to occur in the collection of the Occupational
Requirements Survey. Unit non-response would entail a sampled establishment or
occupation refusing to provide ORS data, while item non-response would refer to
incomplete information within an occupation about the data elements described in the
earlier sections.
There is not much information on the expected amount of unit non-response for ORS at
this point. Each of the survey tests to date has focused on collection
feasibility and protocols. ORS interviewers collected data from cooperating establishments
found on an over-sampled list of units that was not fully representative of the sample frame.
Notes detailing the reasons for refusal were taken for feasibility tests occurring in 2014,
but refusal turnaround was not attempted. Additionally, ORS interviewers were not limited
to a strict probability selection of occupations within an establishment but, instead, could
collect data on occupations at the convenience of the respondent. However, a
probability selection of establishments and occupations is vital for a statistically sound
survey, and the additional respondent burden it imposes will generally increase non-response.
Similar to the National Compensation Survey (NCS) [7], ORS plans to adjust for unit
non-response by re-weighting establishments and occupations within a pre-specified
non-response adjustment cell. Once a weighting adjustment has been made to the
establishments and occupations within a cell where a unit refused, the weights of all viable
units may be further adjusted by benchmarking the weights to the current national
employment to obtain the Occupational Final Quote Weight used in the formulas above.
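The paper does not give formulas for these two steps, so the following is only a sketch of a common weighting-class approach, offered as an assumption rather than the ORS procedure: within each cell, respondent weights are inflated by the ratio of the total sampled weight to the total responding weight, and the adjusted weights are then scaled by a single ratio so that they sum to an employment benchmark. The function names, dictionary keys, and numbers are hypothetical.

```python
from collections import defaultdict


def adjust_for_unit_nonresponse(units):
    """Return adjusted weights, in input order; non-respondents receive weight 0.

    Each unit is a dict with keys 'cell', 'weight', and 'responded' (assumed layout).
    """
    sampled = defaultdict(float)
    responding = defaultdict(float)
    for unit in units:
        sampled[unit["cell"]] += unit["weight"]
        if unit["responded"]:
            responding[unit["cell"]] += unit["weight"]

    adjusted = []
    for unit in units:
        if unit["responded"]:
            # Respondents carry the weight of non-respondents in the same cell.
            factor = sampled[unit["cell"]] / responding[unit["cell"]]
            adjusted.append(unit["weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted


def benchmark(weights, employment_total):
    """Scale weights by one ratio so that they sum to the employment benchmark."""
    current_total = sum(weights)
    return [w * employment_total / current_total for w in weights]


units = [
    {"cell": "A", "weight": 10.0, "responded": True},
    {"cell": "A", "weight": 10.0, "responded": False},
    {"cell": "B", "weight": 5.0, "responded": True},
    {"cell": "B", "weight": 5.0, "responded": True},
]
adjusted = adjust_for_unit_nonresponse(units)             # [20.0, 0.0, 5.0, 5.0]
final_weights = benchmark(adjusted, employment_total=60.0)  # [40.0, 0.0, 10.0, 10.0]
```

A cell with no respondents at all would lose its weight entirely under this sketch, which is one reason cells may need to be collapsed in practice.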
Early testing of ORS has shown a low incidence of item non-response. Once an
establishment agreed to cooperate, nearly all information for all items and quotes was
collected in a single appointment. However, these circumstances are considered unusual, as
ORS interviewers will be attempting collection from less cooperative establishments in a
production environment. As a result, methods must be developed to address the
item non-response expected as ORS moves into production.
One option currently under investigation is to group similar ORS data elements and impute
all items in a group where at least one item is missing. The data elements naturally split
into several imputation groups, though some groups are larger than others. For instance, all
of the data elements regarding the lifting and carrying of weight could appear in an
imputation group. Using groups may help keep imputed responses consistent with
collected responses. Also, having a small number of elements within imputation groups
would be ideal as less collected data would be over-written in the process of imputing for
a single missing value within a group. Imputation could employ the nearest neighbor
approach, where the nearest neighbor is determined by the establishment employment size.
Imputation cells may need to be collapsed for situations where a cell lacks a viable number
of donors. Further research on item non-response will occur in the next phase of survey
testing.
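As a rough illustration of the grouped nearest-neighbor idea, the sketch below imputes every item in a group from the complete record whose establishment employment is closest to that of the recipient. It is a sketch under stated assumptions, not the method BLS will adopt: the record layout, the item names, and the function name `impute_group` are hypothetical, and imputation cells are ignored for brevity.

```python
def impute_group(records, group_items):
    """Fill a whole imputation group from the nearest donor by employment size.

    Each record is a dict with an 'employment' key plus one key per item in
    group_items; a missing item is represented by None (assumed layout).
    """
    donors = [r for r in records if all(r[item] is not None for item in group_items)]
    if not donors:
        raise ValueError("no donors available; cells may need to be collapsed")

    result = []
    for record in records:
        filled = dict(record)  # copy so the input records are left unchanged
        if any(filled[item] is None for item in group_items):
            donor = min(donors, key=lambda d: abs(d["employment"] - filled["employment"]))
            # The entire group is taken from the donor, so collected values in the
            # group are over-written to keep the responses mutually consistent.
            for item in group_items:
                filled[item] = donor[item]
        result.append(filled)
    return result


lifting_group = ["max_weight_lbs", "frequent_weight_lbs"]  # hypothetical item names
records = [
    {"employment": 50, "max_weight_lbs": 20, "frequent_weight_lbs": 10},
    {"employment": 500, "max_weight_lbs": 50, "frequent_weight_lbs": 25},
    {"employment": 60, "max_weight_lbs": 30, "frequent_weight_lbs": None},
]
imputed = impute_group(records, lifting_group)
```

In this example the third record takes both lifting values from the 50-employee donor, which shows why smaller groups are preferable: its collected maximum weight of 30 lbs is replaced along with the missing item.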
6. Conclusion and Next Steps
Estimation for ORS data will likely include the use of descriptive statistics, dependent on
whether the data element type is categorical or continuous. Some data element estimates
are dependent on the presence of another data element. Percentages of workers in a
category will be calculated for categorical data; the mean amount of time and percentiles
will be calculated for continuous data. Standard errors, while not discussed in this paper,
will accompany all estimates of percentages, means, and percentiles.
While non-response is expected to be present in ORS, there are currently no final
procedures for non-response adjustment. The behavior of ORS data is largely unknown
and must be studied further. As a starting point, unit-level non-response will be adjusted
for by adjusting the weights of both establishments and occupations. Possible methods for
item non-response adjustment are under review. Before the production of estimates from
the first full-scale production sample, an imputation method will be tested involving nearest
neighbor imputation by employment size and within cell definitions based on available
variables. Once a greater amount of ORS data has been amassed, more information will be
available for use in developing appropriate item non-response adjustments.
References/Footnotes
[1] U.S. Department of Labor (1991), Revised Handbook for Analyzing Jobs (RHAJ),
Washington, DC: Government Printing Office.
[2] Social Security Administration, Occupational Information System Project,
http://www.ssa.gov/disabilityresearch/occupational_info_systems.html.
[3] U.S. Department of Labor, Employment and Training Administration (1991),
"Dictionary of Occupational Titles, Fourth Edition, Revised 1991."
[4] Occupational Information Development Advisory Panel, 2010,
http://www.socialsecurity.gov/oidap/index.htm
[5] U.S. Department of Labor, O*Net Online, http://www.onetonline.org/
[6] Bureau of Labor Statistics, Occupational Employment Statistics Program,
http://www.bls.gov/oes/
[7] McCarthy, Christi, Ferguson, Gwyn R., and Ponikowski, Chester (2011), "The
Weighting Process Used in the Employer Costs for Employee Compensation
Series for the National Compensation Survey," 2011 Proceedings of the Section
on Survey Research Methods, Alexandria, VA: American Statistical Association.
Any opinions expressed in this paper are those of the authors and do not constitute
policy of the Bureau of Labor Statistics or the Social Security Administration.