
Survey of Occupational Injuries and Illnesses

1220-0045

January 2017

SUPPORTING STATEMENT, Part B

Updated for the Household Survey of Occupational Injuries and Illnesses


B. Collection of information employing statistical methods.

The statistical methods used in the sample design of the survey are described in this section. The documents listed below are attached or available at the hyperlink provided. These documents are either referenced in this section or provide additional information.


Overview of the Survey of Occupational Injuries and Illnesses Sample Design and Estimation Methodology – Presented at the 2008 Joint Statistical Meetings (10/27/08)-- http://www.bls.gov/osmr/pdf/st080120.pdf

Deriving Inputs for the Allocation of State Samples (05/01/13)

The growth in cases with Restricted Activity or Job Transfer (08/2011)

Methods Used To Calculate the Variances of the OSHS Case and Demographic Estimates (2/22/02)

Variance Estimation Requirements for Summary Totals and Rates for the Annual Survey of Occupational Injuries and Illnesses (6/23/05)

BLS Handbook of Methods – Occupational Safety and Health Statistics (September 2008) -- http://www.bls.gov/opub/hom/pdf/homch9.pdf

Nonresponse Bias in the Survey of Occupational Injuries and Illnesses (August, 2013) -- http://www.bls.gov/osmr/pdf/st130170.pdf

Sample Allocation to Increase the Expected Number of Publishable Cells in the Survey of Occupational Injuries and Illnesses (August, 2015) -- http://www.bls.gov/osmr/pdf/st150070.pdf


Household Survey of Occupational Injuries and Illnesses (HSOII)


The proposed Pilot Household Survey of Occupational Injuries and Illnesses (HSOII) will employ statistical methods to analyze the information collected from respondents. The following sections describe the procedures for respondent sampling and data tabulation. The proposed HSOII will be administered using a dual-frame (landline and cell phone) random digit dial (RDD) sampling frame. The response mode will be Computer-Assisted Telephone Interviewing (CATI).

The survey will include a pretest of 50 completed interviews prior to the full administration of the survey. Details regarding this pretest are provided in Section B.4.

Data collection for the pilot phase will be conducted by ICF International. ICF International has extensive experience with data collection, data analysis, and statistical methods, particularly with respect to national surveillance systems.

Respondents will be selected through Random Digit Dialing (RDD) of landlines and cell phones. According to the most recently available population estimates of the cell-only population as measured by the National Health Interview Survey (NHIS), 97% of the population lives in a household with a landline and/or cell phone.1


1. Description of universe and sample.


Universe


The main source for the SOII sampling frame is the BLS Quarterly Census of Employment and Wages (QCEW) (BLS Handbook of Methods, Chapter 5, http://www.bls.gov/opub/hom/homch5_a.htm). The QCEW is a near-quarterly census of employers that collects employment and wages by ownership, county, and six-digit North American Industry Classification System (NAICS) code. States have the option either to use the QCEW or to supply public sector sampling frames for State and local government units. Some states provide their own frames for state or local government establishments, and one provides a private sector frame (only Guam, where data are not available in the QCEW). The numbers of states that do so are provided in Table 1:




Table 1: Number of states providing frames by ownership type

  Year    State Frame    Local Frame    Private Frame
  2014         6              5               1
  2015         4              3               1
  2016         4              3               1


The potential number of respondents (establishments) covered by the scope of the survey is approximately 8.4 million, although only about 800,000 employers keep records on a routine basis, due to recordkeeping exemptions defined by OSHA for employers in low-hazard industries and employers with fewer than 11 employees, or because they have no recordable cases. The occupational injury and illness data reported through the annual survey are based on records that employers in the following North American Industry Classification System (NAICS) industries maintain under the Occupational Safety and Health Act:


  Sector       Description
  11           Agriculture, Forestry, Fishing, and Hunting
  21           Mining, Quarrying, and Oil and Gas Extraction
  22           Utilities
  23           Construction
  31, 32, 33   Manufacturing
  42           Wholesale Trade
  44, 45       Retail Trade
  48, 49       Transportation and Warehousing
  51           Information
  52           Finance and Insurance
  53           Real Estate and Rental and Leasing
  54           Professional, Scientific, and Technical Services
  55           Management of Companies and Enterprises
  56           Administrative and Support and Waste Management and Remediation Services
  61           Educational Services
  62           Health Care and Social Assistance
  71           Arts, Entertainment, and Recreation
  72           Accommodation and Food Services
  81           Other Services (except Public Administration)



Excluded from the national survey collection are:

  • Self-employed individuals;

  • Farms with fewer than 11 employees (Sector 11);

  • Employers regulated by other Federal safety and health laws;

  • United States Postal Service; and

  • Federal government agencies.


Mining and Railroad industries are not covered as part of the sampling process. The injury and illness data from these industries are furnished directly from the Mine Safety and Health Administration and the Federal Railroad Administration, respectively, and are used to produce State and national level estimates.


Data collected for reference year 2008 and published in calendar year 2009 marked the first time state and local government agency data were collected for all states and published for all states and the nation as a whole.

The SOII is a Federal-State cooperative program in which the Federal government and participating states share the costs of participating state data collection activities. State participation in the survey may vary by year. Sample sizes are determined by the participating states based on budget constraints, and independent samples are selected for each state annually. Data are collected by BLS regional offices for non-participating states.


For the 2016 survey, 41 states plus the District of Columbia plan to participate. For the remaining nine states, which are referred to as Non-State Grantees (NSG), a smaller sample is selected to provide data that contribute to national estimates only.


The nine NSG States for 2016 are:


Colorado

Florida

Idaho

Mississippi

New Hampshire

North Dakota

Oklahoma

Rhode Island

South Dakota



Additionally, estimates are tabulated for three U.S. territories (Guam, Puerto Rico, and the Virgin Islands), but data from these territories are not included in the tabulation of national estimates.


Sample

The SOII utilizes a stratified probability sample design, with strata defined by state, ownership, industry, and size class. The first characteristic enables all the State grantees participating in the survey to produce estimates at the State level. Ownership is divided into three categories: state government, local government, and private industry. The degree of industry stratification varies within each State. This is desirable because some industries are more prevalent in some states than in others. Also, some industries can be relatively small in employment but have high injury and illness rates, which makes them likely to be designated for estimation. Thus, each state determines which industries are most important for publication, and the extent of industry stratification is set independently within each state. BLS sets minimal levels of desired industry publication to ensure sufficient coverage for national estimates, so states can only set industry detail that is more specific than the levels set by BLS. These industry classifications are defined using the North American Industry Classification System (NAICS, http://www.census.gov/eos/www/naics/) and are referred to as Target Estimation Industries (TEI). The industry classifications set by the national office are referred to as National Target Estimation Industries (NTEI) and are not used as sampling strata.



Finally, establishments are classified into five size classes based on average annual employment and defined as follows:


  Size Class    Average Annual Employment
  1             10 or less
  2             11-49
  3             50-249
  4             250-999
  5             1,000 or greater



After each establishment is assigned to its respective stratum, a systematic sample with equal probability is selected from each sampling cell (stratum). As mentioned earlier, a sampling cell is defined by state/ownership/TEI/size class. Prior to sample selection, units within a sampling cell are sorted by employment and then by Reporting Unit number (a unique identifier assigned to each reporting unit on the QCEW) to ensure consistent representation of all employment levels in each stratum. Full details of the survey design are provided in Section 2.
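For illustration, a minimal sketch in Python of equal-probability systematic selection within one sorted sampling cell (the function and data are illustrative, not the BLS production selection system):

    import random

    def systematic_sample(sorted_units, n):
        """Equal-probability systematic selection of n units from a list
        sorted by employment and reporting unit number, so the sample
        spreads across all employment levels in the cell."""
        N = len(sorted_units)
        interval = N / n                      # sampling interval
        start = random.uniform(0, interval)   # random start in [0, interval)
        return [sorted_units[int(start + i * interval)] for i in range(n)]

    # Example: select 50 of 1,000 establishments in one sampling cell.
    cell = list(range(1000))  # stand-in for a sorted frame stratum
    print(len(systematic_sample(cell, 50)))  # 50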


For survey year 2016, the sample size will be approximately 240,000 establishments, or 2.9 percent of the total 8.4 million establishments across state, local, and private ownerships.


Response rate. The survey is a mandatory survey, with the exception of State and local government units in the States listed below:


Alabama

Arkansas

Colorado

Delaware

District of Columbia

Florida

Georgia

Idaho

Illinois

Kansas

Louisiana

Mississippi

Missouri

Montana

Nebraska

New Hampshire

North Dakota

Ohio

Pennsylvania

Rhode Island

South Dakota

Texas






Each year, respondents in the SOII are notified of their requirement to participate via mail. All non-respondents are sent up to two non-response mailings as a follow-up to the initial mailing. Some states choose to send a third or fourth non-response mailing to non-respondents late in the collection period. For Survey Year 2014, approximately half of the states sent an optional third non-response mailing to a majority of the remaining non-respondents at that point in time, and fewer than five percent of the states sent a fourth non-response mailing. In addition, states may contact respondents via telephone for additional non-response follow-up. No systematic establishment-level data on the number of telephone non-response follow-up contacts are captured.


As mentioned earlier, public sector establishments were included in the 2008 survey for all states, including those from which no public sector data had been collected in the past. In these states, public sector establishments have no mandate to provide data to the SOII; their participation is voluntary. For SY 2008, the response rates for both state and local government decreased, primarily due to the addition of the voluntary state and local government establishments.


In 2010, an in-depth response rate analysis was undertaken. Aggregate response rates in the SOII were shown to be above 90%, due to the mandatory nature of the survey and the efforts of BLS state and regional partners to obtain survey data. However, the analysis also showed that response rates for government units were low in states where reporting by state and local governments is voluntary. In subsequent years, this study was updated to continually monitor item and establishment non-response. As of the most recent update, there have been no significant changes.


The table below illustrates the establishment-level response rates from 2003-2014:

[Establishment-level response rate table, 2003-2014, not reproduced in this text version.]
Although response rates for the SOII program have historically been high, the expansion of public sector collection in voluntary states resulted in a response rate of 75 percent for state government in 2008. Per OMB statistical guidelines, a nonresponse bias study was initiated and completed in 2013 (see "Nonresponse Bias in the Survey of Occupational Injuries and Illnesses" in the supporting documents). This work concluded that in states where participation is voluntary, there is statistically significant evidence that case counts for establishments identified by a model as 'likely' to respond are lower than for establishments identified as 'unlikely' to respond. Similarly, mean case rates for establishments identified as 'likely' to respond were higher than for those identified as 'unlikely' to respond. This apparent contradiction between the two measures was explained by changes in the estimates of hours worked that enter into the rate calculation. Given that these voluntary state/local units comprised 1.3% of the total survey, efforts to address the observed biases were deferred due to resource constraints.


Additional response analyses are being conducted for several key data elements collected for each establishment in the survey. Data elements for NAICS industry, SOC occupation, and the source, nature, part of body, and event for each case with days away from work are coded by BLS regional staff and/or state partners; as such, these fields are always available for collected data. Other data elements, such as ethnicity, whether the event occurred before/during/after the work shift, the time of the event, and the time the employee began work, may be missing from collected data. BLS has initiated a response analysis for these other data elements to identify specific response rates and the characteristics of respondents versus non-respondents for these variables.


Regional offices are also working with States on collection practices to improve response for voluntary units.


BLS will continue to monitor the response rates in the next three years for all segments of the survey scope. BLS will update the analysis each year and make recommendations for improvements in the data collection process based on the results of the analysis. If response rates at the establishment level remain below 80% for any group of establishments, BLS will conduct additional non-response bias studies. If response rates for any specific data element within establishments are below 70%, BLS will also implement additional non-response bias studies. Details for these studies will be documented as the studies begin.


HSOII


Universe

The respondent universe is the population of workers age 18 and older residing in residential households within the United States (all 50 States and the District of Columbia). An eligible respondent is defined as someone who worked for pay or profit, either as an employee or as a self-employed contractor, in the prior 12 months.


Sample design

The sample size for the HSOII is 5,500 qualified workers, selected from a national dual-frame landline and cell phone RDD survey. The dual-frame sample is an overlap design, where dual-users (those with a landline and cell phone) will be interviewed if selected in the landline sample or the cell phone sample.

A sample size of 5,500 is sufficient to provide national occupational injury and illness estimates with a margin of error of +/-1.6% at a 95% confidence level. This level of precision includes a design effect of 1.5, typical for dual-frame RDD surveys with overlap. The design effect represents inefficiencies inherent in the sample design and operations. Since the sample design is based on an RDD telephone sample of landline and cell phones, there will be dual-frame weighting adjustments as well as weighting adjustments for non-response. Given the optimal allocation (described below), a small design effect of 1.05 is expected from combining the two frames.2 For the nonresponse weighting effect, data from the 2015 Behavioral Risk Factor Surveillance System were used; the increase due to the population weighting is 1.40. Combining these two effects results in a design effect of about 1.5 (1.05 x 1.40 = 1.47).
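These precision figures can be verified directly; a minimal sketch in Python, assuming the conventional worst-case proportion of 0.5 (an assumption, since the document does not state the proportion used):

    import math

    def margin_of_error(n, deff, p=0.5, z=1.96):
        """95% margin of error for a proportion, inflated by a design
        effect; p = 0.5 is the conservative (worst-case) assumption."""
        return z * math.sqrt(deff * p * (1 - p) / n)

    # 5,500 completes with deff = 1.05 x 1.40 (rounded to 1.5)
    print(round(margin_of_error(5500, 1.5), 3))  # 0.016, i.e. +/-1.6%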


RDD Sample Allocation

The sample allocation is optimized to minimize the variance of the dual-frame composite estimator, as outlined in Lohr and Brick3 (2014). The allocation is based on:

  1. A cell-to-landline cost ratio4 of $1.75 to $1

  2. 54% of adults working at a job or business are cell-only5

  3. An expectation that 60% of cell phone interviews will be with cell-only adults and 15% of landline interviews will be with landline-only adults.

The optimal sample allocation is 30% landline (n = 1,650) and 70% cell phone (n = 3,850). With this allocation, 42% of respondents are expected to be cell-only (60% of the 3,850 cell phone interviews, or 2,310 of the 5,500 total), weighted to represent the estimated 54% of the working population (see Exhibit 2).

Exhibit 2. Expected Distribution by Phone Status

                          18+ Population    Cell sample    Landline sample
  Cell-only households        54.0%            42.0%             N/A
  Landline households         46.0%            28.0%            30.0%


Selecting the RDD Samples


The landline and cell phone RDD samples will be selected through Marketing Systems Group's Genesys Sampling System. The RDD frame is constructed based on information from the North American Numbering Plan Administration, which governs the assignment of 1,000-blocks to service providers. A 1,000-block is a series of 1,000 telephone numbers that share all but the last three digits of a 10-digit phone number (NPA-NXX-Z000 through NPA-NXX-Z999). The 1,000-blocks dedicated to cell service or landline service are identified by codes from the Telcordia® LERG (Local Exchange Routing Guide). Those dedicated to landline service comprise the landline frame, while those dedicated to cellular service comprise the cell phone frame.


Landline

The landline sample will be selected via RDD using the equal probability of selection method (EPSEM) from working banks. A "working" bank is a 100-block (NPA-NXX-ZZ00 through NPA-NXX-ZZ99) in which at least one telephone number is assigned to residential service. Note that this frame definition improves on traditional list-assisted frames, which included only blocks with one or more "listed" telephone numbers. The traditional list-assisted frame excluded zero-blocks, which typically excluded about 5 percent of residential households.6 The assignment-based frame includes households that would otherwise have been excluded.



Respondent Screening and Selection

Once a landline telephone is answered, we will read the introductory text and confirm that we have contacted a private residence. After that, we will ask if the person we’re speaking to is 18 years of age or older; if not, we will ask to speak to an adult in the household. Once an adult is on the phone, we will:

  • Conduct a household roster in which the adult informant gives information on the total number of adults and the number of those adults who worked for pay in the past 12 months.

  • We will select up to three respondents per household. The order of the interviews will prioritize workers who are currently available:

    • First, we will determine whether the screener adult is an eligible worker. If yes, we will conduct the interview with this person.

    • Second, if the screener adult is not eligible, or the screener adult has completed the survey, we will ask to speak with the oldest/youngest (rotated) worker currently at home.

    • Finally, we will schedule a call back to reach any remaining workers in the household.

Cell Phone

The cell phone sample will be selected using RDD with EPSEM. All telephone numbers from the cell phone frame will be manually dialed, in accordance with laws that prohibit calling cell numbers with an automated dialer.


Respondent Selection

Once a cell phone is answered, the introductory text will be read and it will be confirmed that it is safe for the respondent to talk on the phone. After that, it will be determined whether the person speaking is 18 years of age or older; if not, the interview will be terminated. If speaking to an adult, it will be determined whether the respondent has worked in the past 12 months.


Estimation

The completed interviews will be weighted using dual-frame methods for combining landline and cell phone samples. First, the sampling weight, the inverse of the selection probability, will be computed for the landline and cell phone samples. The sampling weight is the total number of records on the frame (NRECSTR) divided by the total number of records selected (NRECSEL). For the landline sample, this weight is adjusted for multiple-landline households by dividing by the number of telephone lines recorded during the survey (PHONES).


For the adult landline survey, we will interview more than one worker per household. The household weights will be equal to the total number of workers enumerated in the household (WHH) divided by the number of workers responding to the survey (WR).

In summary, the design weights are calculated as follows:


Landline: DESIGN_WT = (NRECSTR/NRECSEL) x (1/PHONES) x (WHH/WR)


Cell: DESIGN_WT = (NRECSTR/NRECSEL)



To account for the overlapping landline and cell phone dual-frame design, a composite weight will be used, averaging the dual users from the cell phone sample and the dual users from the landline sample. The composite factor is a ratio of effective sample sizes, c = neff1 / (neff1 + neff2), where neff = n/deff is the effective sample size, deff = n x Σ(DESIGN_WT²) / (Σ DESIGN_WT)² is a measure of the variability of the design weights (DESIGN_WT), and n is the sample size for each group.
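A minimal sketch in Python of the composite factor computation under these definitions (the weights shown are illustrative, not production values):

    def unequal_weighting_deff(weights):
        """deff = n * sum(w^2) / (sum(w))^2 for a set of design weights."""
        n = len(weights)
        return n * sum(w * w for w in weights) / sum(weights) ** 2

    def composite_factor(landline_dual_wts, cell_dual_wts):
        """c = neff1 / (neff1 + neff2), where neff = n / deff."""
        neff1 = len(landline_dual_wts) / unequal_weighting_deff(landline_dual_wts)
        neff2 = len(cell_dual_wts) / unequal_weighting_deff(cell_dual_wts)
        return neff1 / (neff1 + neff2)

    # Dual users receive factor c in the landline sample, (1 - c) in cell.
    print(round(composite_factor([1.2, 0.8, 1.0], [1.5, 0.5, 1.0, 1.0]), 3))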


As the final weighting step, the sample will be post-stratified into demographic categories (race/ethnicity, gender, age group, and education), and the weights will be ratio-adjusted so that the final weighted sample matches the population with respect to those demographic characteristics. A raking algorithm will be used for these adjustments. The raking will be integrated with weight trimming to control the variance impact of large weight differentials. The weight trimming will be conducted after each iteration of the raking, based on the individual and global cap value (IGCV) algorithm presented by Izrael.7 This method reduces extreme weights by not allowing an individual weight to exceed thresholds based on the individual's weight and the average of the sample weights.
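For illustration, a minimal sketch in Python of raking integrated with a simple per-iteration weight cap (the cap rule here is a simplification, not the exact IGCV algorithm):

    from collections import defaultdict

    def rake(records, weights, margins, cap=5.0, iterations=25):
        """Rake weights to population margins, trimming any weight above
        `cap` after each iteration (simplified stand-in for IGCV trimming).
        records: list of dicts, e.g. {"sex": "F", "age": "18-34"}
        margins: {dimension: {category: target weighted total}}"""
        w = list(weights)
        for _ in range(iterations):
            for dim, targets in margins.items():
                totals = defaultdict(float)
                for rec, wt in zip(records, w):
                    totals[rec[dim]] += wt
                w = [wt * targets[rec[dim]] / totals[rec[dim]]
                     for rec, wt in zip(records, w)]
            w = [min(wt, cap) for wt in w]  # per-iteration trimming step
        return w

    recs = [{"sex": "F", "age": "18-34"}, {"sex": "M", "age": "18-34"},
            {"sex": "F", "age": "35+"},  {"sex": "M", "age": "35+"}]
    margins = {"sex": {"F": 2.0, "M": 2.0}, "age": {"18-34": 1.0, "35+": 3.0}}
    print(rake(recs, [1.0] * 4, margins))  # -> [0.5, 0.5, 1.5, 1.5]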


Non-response bias analysis

Survey nonresponse bias occurs when respondents are substantively different from the non-respondents. Response rates are often used as a measure of data quality because they are thought to reflect the degree to which non-response bias exists in the data, but this connection is tenuous.8 9 Instead, response rates are a measure of the risk of nonresponse bias: high response rates reflect a low risk of nonresponse bias, while low response rates increase that risk. In the absence of high response rates, a nonresponse analysis helps to justify the accuracy of the survey data.

To mitigate the risk of non-response bias, weighting adjustments will be developed to increase the sample representativeness relative to the population. The representativeness will be evaluated by comparing the results of the RDD sample to benchmarks such as the American Community Survey (ACS) and/or the Current Population Survey (CPS). This comparison will focus on key demographic variables such as race/ethnicity, gender, age group, and education.


The demographic variables found to be significant in this analysis will be candidates for defining weight adjustment classes to ensure that weight adjustments minimize the potential for non-response bias.


To the extent possible, questions from the HSOII that overlap with the ACS, CPS, or other data sources will be compared. These questions could include demographics and work status.






2. Statistical methodology.


Survey design. The survey is based on probability survey design theory and methodology at both the national and state levels. This methodology provides a statistical foundation for drawing inference to the full universe being studied.

Research was done to determine what measure of size was most appropriate for the allocation module. Discussion with Occupational Safety and Health Statistics (OSHS) program management narrowed the choices to the rates for Total Recordable Cases (TRC); Cases with Days Away from Work (DAFW); and Cases with Days Away from Work, Job Transfer, or Restriction (DART).


Rates from the 2003 SOII were studied for all 1251 TEIs for each of the above case categories. The average case rate, standard deviation (SD), and coefficient of variation (CV) for each set of rates were calculated. The CV is the standard deviation divided by the estimate, which is commonly used to compare estimates in relative terms. The results are shown below:



  Description    Ave. Rate      SD      CV
  DAFW            1.5540      1.078    0.69
  DART            3.0479      2.000    0.66
  TRC             5.5300      3.229    0.58


Based on this information it was recommended that the TRC rate be used as the measure of size for the sample allocation process for the survey. The lower CV indicates that it is the most stable indicator.
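The coefficients of variation above can be reproduced directly from the table; a minimal sketch in Python:

    # CV = SD / mean; the lowest CV identifies the most stable measure.
    rates = {"DAFW": (1.5540, 1.078),
             "DART": (3.0479, 2.000),
             "TRC":  (5.5300, 3.229)}
    for name, (mean, sd) in rates.items():
        print(name, round(sd / mean, 2))  # DAFW 0.69, DART 0.66, TRC 0.58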


Additionally, to fulfill the needs of users of the survey statistics, the sample provides industry estimates. A list of the industries for which estimates are required is compiled by the BLS after consultation with the principal Federal users. The sample is currently designed to generate national data for all targeted NAICS levels that meet publication standards.


Allocation procedure. The principal feature of the survey's probability sample design is its use of stratified random sampling with Neyman allocation. The characteristics used to stratify the units are state, ownership (private industry, state government, or local government), industry code, and employment size class. Because these characteristics are highly correlated with the characteristics that the survey measures, stratified sampling provides a gain in precision and thus allows a smaller sample size.


Using Neyman allocation, optimal sample sizes are determined for each stratum within each State. Historical case data are applied to compute sampling errors used in the allocation process. Details about this process can be found in Deriving Inputs for the Allocation of State Samples (05/01/13).


The first simplifying assumption for allocation is that for each TEI size class stratum h, the employment in each establishment is the same, denoted by e_h. BLS also ignores weighting adjustments. In addition, BLS assumes that the sampling of establishments in each stratum is a simple random sample with replacement. (It is actually without replacement, of course, but this is a common assumption made to simplify the formulas.)


One consequence of these assumptions is that the estimate of the overall employment is constant, and as a result the estimated incidence rate of recordable cases in the universe is the estimated number of recordable cases divided by this constant. Therefore, the optimal allocations for the total number of recordable cases and for the incidence rate of recordable cases are the same. BLS will only consider the optimal allocation for the total number of recordable cases.


BLS introduces the following notation. For sampling stratum h let:


N_h denote the number of frame units

n_h denote the number of sample units

w_h = N_h / n_h denote the sample weight

E_h = N_h e_h denote the total employment in stratum h

r_h denote the incident rate for total recordable cases

c_h denote the unweighted sample number of recordable cases


Also let:


\hat{C} denote the estimated number of recordable cases in the entire universe.


Then


\hat{C} = \sum_h w_h c_h = \sum_h (N_h / n_h) c_h    (1)


V(\hat{C}) = \sum_h (N_h / n_h)^2 V(c_h)    (2)


where V denotes variance.


Now BLS will obtain V(c_h) under two different assumptions. Assumption (a) is:


(a) All employees in stratum h have either 0 or 1 recordable cases, and the probability that an employee has a recordable case is r_h.


In this case c_h can be considered to have a binomial distribution with n_h e_h trials and probability of success r_h in each trial, and consequently


V(c_h) = n_h e_h r_h (1 - r_h)    (3)


Assumption (b) is:


(b) The total recordable case rate for the sample establishments in stratum h, c_h / e_h, has a binomial distribution with n_h trials and probability of success r_h in each trial. In that case


V(c_h) = n_h e_h^2 r_h (1 - r_h)    (4)


Although BLS will derive the optimal allocations under both assumptions, BLS prefers assumption (b), since under assumption (a) BLS believes the variance of the recordable case rate among establishments in stratum h will be unrealistically small, particularly for strata with large e_h.


To derive the optimal allocation under assumption (a), BLS substitutes (3) into (2), obtaining


V(\hat{C}) = \sum_h (N_h^2 / n_h) e_h r_h (1 - r_h)    (5)


Viewing (5) as a function of the variables n_h and minimizing (5) with respect to these variables, subject to a fixed total sample size, by means of the method of Lagrange multipliers from advanced calculus, BLS obtains that (5) is minimized when the n_h are proportional to


N_h \sqrt{e_h r_h (1 - r_h)}    (6)


As for the preferred assumption (b), to derive the optimal allocation BLS similarly substitutes (4) into (2), obtaining


V(\hat{C}) = \sum_h (N_h^2 / n_h) e_h^2 r_h (1 - r_h)    (7)


Minimizing (7) as BLS minimized (5), BLS obtains that (7) is minimized when the n_h are proportional to


N_h e_h \sqrt{r_h (1 - r_h)}    (8)


which is the preferred allocation.
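For illustration, a minimal sketch in Python of the preferred allocation (8), scaled to a fixed total sample size (the strata shown are illustrative, not actual SOII inputs):

    import math

    def neyman_allocation(strata, total_n):
        """Allocate total_n sample units proportionally to
        N_h * e_h * sqrt(r_h * (1 - r_h)), i.e. allocation (8).
        strata: list of (N_h, e_h, r_h) tuples. In practice each n_h
        would also be capped at N_h, and rounding may require a final
        adjustment to hit total_n exactly."""
        shares = [N * e * math.sqrt(r * (1 - r)) for N, e, r in strata]
        total = sum(shares)
        return [round(total_n * s / total) for s in shares]

    # Illustrative strata: (frame units, employment per unit, case rate)
    strata = [(5000, 15, 0.03), (800, 120, 0.06), (60, 1200, 0.04)]
    print(neyman_allocation(strata, 500))  # -> [129, 229, 142]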




Sample procedure. Once the sample is allocated, the specific units are selected by applying systematic selection with equal probability independently within each sampling cell. Because the frame is stratified by employment size within each TEI before sample selection, equal-probability sampling was considered appropriate rather than probability-proportional-to-size (PPS) selection. PPS selection is often applied to frames that are not stratified by size, so in this case no additional value would be gained by selecting the sample by PPS.


The survey is conducted by mail questionnaire through the BLS Washington and regional offices and participating state statistical grant agencies. Respondents are able to provide responses via the internet, an Adobe fillable form, or a paper questionnaire. In a limited number of cases, data are collected by participating State statistical grant agencies or BLS regional office employees through telephone conversations with respondents. Starting with survey year 2016, the survey will use email both to notify respondents of their responsibility to participate and for data collection, in accordance with BLS policy on the use of email for data collection.


Estimation procedure. The survey's estimates of the number of injuries and illnesses for the population are based on the Horvitz-Thompson estimator, which is an unbiased estimator. The estimates of the incidence of injuries or illnesses per 100 full-time workers are computed using a ratio estimator. The estimates of the incidence rates are calculated as


Incidence rate = (C / EH) x 200,000


where:


C = number of injuries and illnesses

EH = total hours worked by all employees during a calendar year

200,000 = base for 100 full-time equivalent workers (working 40 hours per week, 50 weeks per year).
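As a quick check of this formula, a minimal sketch in Python (the numbers are illustrative):

    def incidence_rate(cases, hours_worked):
        """Injuries and illnesses per 100 full-time equivalent workers."""
        return cases / hours_worked * 200_000

    # Example: 30 recordable cases over 1,200,000 hours worked -> 5.0
    print(incidence_rate(30, 1_200_000))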


The estimation system has several major components that are used to generate summary estimates. The first four components generate factors that are applied to each unit’s original weight in order to determine a final weight for the unit. These factors were developed to handle various data collection issues. The original weight that each unit is assigned at the time the sample is drawn is multiplied by each of the factors calculated by the estimation system to obtain the final weight for each establishment. The following is a synopsis of these four components.


When a unit cannot be collected as assigned, it is assigned a Reaggregation factor. For example, if XYZ Company exists on the sample with 1,000 employees but the respondent reports for only one of two locations with 500 employees each, it is treated as a reaggregation situation. The Reaggregation factor is equal to the target (or sampled) employment for the establishment divided by the reported employment for collected establishments. It is calculated for each individual establishment.


In cases where a sampled unit is within scope of the survey but does not provide data, it is treated as a nonrespondent. Units within scope are considered viable units. This would include collected units as well as nonrespondents. The Nonresponse adjustment factor is the sum of the weighted viable employment within the sampling stratum divided by the sum of the weighted usable employment for an entire sampling stratum. The nonresponse adjustment factor is applied to each unit in a stratum.


In some cases, collected data are so extreme that they stand apart from the rest of the observations. For example, suppose that in a dental office (historically a low-incidence industry for injuries and illnesses), poisonous gas gets into the ventilation system, causing several employees to miss work for several days. This is a highly unusual circumstance for that industry, and the situation would be deemed an outlier for estimation purposes and handled with the outlier adjustment. If any outliers are identified and approved by the national office, the system calculates an Outlier adjustment factor so that the outlier represents only itself. In addition, the system calculates outlier adjustment factors for all other non-outlier units in the sampling stratum. This ensures that the re-assigned weight is distributed equally among all other units in the stratum.


Benchmarking is done in an effort to account for the time lapse between the sampling frame used for selecting the sample and the latest available frame information. Thus, a factor is computed by dividing the target employment (latest available employment) for the sampling frame by the weighted reported employment for collected units.


The system calculates a final weight for each unit. The final weight is the product of the original weight and all four of the factors. All estimates are sums of the weighted (final-weight) characteristics of all the units in a stratum.
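For illustration, a minimal sketch in Python of how the four factors combine into a final weight (the factor values are illustrative; the definitions follow the synopsis above):

    def final_weight(original_weight, reagg_factor, nonresponse_factor,
                     outlier_factor, benchmark_factor):
        """Final weight = original weight times the reaggregation,
        nonresponse, outlier, and benchmark adjustment factors."""
        return (original_weight * reagg_factor * nonresponse_factor
                * outlier_factor * benchmark_factor)

    # Example: weight 40, a reaggregated unit (2000/1000), modest
    # nonresponse (1.09) and benchmark (1.07) adjustments, no outlier.
    print(round(final_weight(40.0, 2000 / 1000, 1.09, 1.0, 1.07), 2))  # 93.3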


In 2010 a pilot study to measure rates of Days of Job Transfer or Restriction (DJTR) for selected industries was begun using data from the 2011 survey reference year. The first public release of the case and circumstances data for DJTR cases from this pilot occurred on April 25, 2013. BLS is analyzing the results of this test to determine the value of the information and is looking at how best to implement the collection of these data as well as days away from work cases in future survey years. Updates to this DJTR pilot study are continuing by changing the industries of interest. See the testing section below for details.


HSOII



Data Collection Staff

Survey data will be collected by trained interviewers employed by ICF International. ICF has created a public health interviewing team for its BRFSS-protocol surveys. To be selected for the team, individuals must meet minimum standards with respect to tenure, response rate, non-response conversion capabilities, and interviewer performance based on monitoring sessions. To retain membership, interviewers are required to attend regular retraining sessions in refusal avoidance, non-response conversion, and general interview technique. ICF maintains a core group of at least 110 public health interviewers at any time, and it is from this group that interviewers will be selected to collect data for the HSOII.


During data collection, ICF project management staff will check the CATI system settings to ensure that the call attempt and call-back protocols are being met. Also during data collection, ICF will maintain a database of all CATI calls that took place over the prior 14 days in order to conduct live monitoring as well as additional quality control (QC) tasks using recorded interviews. During data processing, the ICF project management team will review open-ended and "other, specify:" responses in the first few weeks of data collection, and then periodically throughout fielding, to identify potential coding or training issues. Prior to delivering the dataset, data will be cleaned and examined for missing data or errors in skip patterns, similar to the checks performed during questionnaire programming. ICF will also perform a variety of other checks using SAS programs designed specifically for this survey.


Monitoring Daily Activity

Strict processes maximize the number of completed interviews. In addition to monitoring interviewers’ performance, the sample is monitored throughout the 30-day calling window. Each day, the Contract Manager reviews protocol and production reports from the prior day. A set of automated routines ensures that records are dialed the correct number of times at the right times of day, that scheduled call-backs are honored, and that other protocol requirements are achieved. The Contract Manager will also monitor reports daily to identify specific challenges or unusual occurrences. If the sample is not performing as expected, the Contract Manager can increase attempts, assign records to be dialed, or employ any number of other custom tactics.


Dialing Protocol

Telephone calls will be rotated throughout different times of day and across days of the week, including evenings and weekends. This protocol is recognized as best practice for rigorous landline and cell phone surveys to maximize response rates and minimize non-response bias. As shown in Exhibit 6, extending landline and cell phone attempts to 15 and 8, respectively, maximizes the number of completed interviews obtained while balancing efficiency. These calling procedures have been instituted on CDC BRFSS-protocol surveys and have generated a limited number of complaints relative to the number of dialing attempts made annually under these protocols.

Dialing protocols are flexible and can be adjusted should it be determined that the optimal time of day to reach this population differs from these best practices. For the landline sample, interviewers will make a minimum of 15 attempts over a 30-day period to reach an eligible household and interview 1 eligible adult for each telephone number in the sample frame. Call attempts will be spread over 3 calling periods: weekdays 9:00 a.m.–5:00 p.m., weekday evenings 6:00 p.m.–9:00 p.m., and weekends 9:00 a.m.–9:00 p.m. The first 9 attempts will be made to meet a minimum of 3 attempts (20%) within each calling period. For remaining active records, the final 6 attempts will be targeted either in the evening or on the weekend based on which calling times prove most productive. For the cell phone sample, there will be weekday, weeknight, and weekend calling occasions, with 2 attempts during weekdays (25%), 3 attempts on weeknights (37.5%), and 3 attempts on weekends (37.5%). All cell phone records will be dialed in compliance with Telephone Consumer Protection Act regulations.




3. Statistical reliability.


Survey sampling errors.


The survey utilizes a full probability survey design that makes it possible to determine the reliability of the survey estimates. Standard errors are produced for all injury and illness counts and case and demographic data, as well as for all data directly collected by the survey.


The variance estimation procedures are described in detail in the attached documents mentioned earlier:


Methods Used To Calculate the Variances of the OSHS Case and Demographic Estimates (2/22/02)

Variance Estimation Requirements for Summary Totals and Rates for the Annual Survey of Occupational Injuries and Illnesses (6/23/05)



HSOII


Spanish-language Barriers

The survey will include wording in Spanish for those who are entirely or predominantly Spanish speaking so that they are not excluded from the survey.

Callback Procedures

There are specific procedures for call-backs in order to increase the likelihood that respondents complete an interview. Interviewers are skilled at minimizing the need for callbacks. For example, when a respondent expresses initial hesitation about doing the interview, interviewers make their best attempt to begin the interview anyway; respondents who start an interview are more likely to complete it on a follow-up call. If the interview is not completed in a single call, the interviewer probes for a specific date and time to complete it. Setting an appointment is a very effective way to re-engage a respondent. Within the CATI system, scheduled appointments receive high priority. Honoring scheduled call-backs results in reaching willing respondents more reliably; therefore, the call center runs daily reports that list the times of all scheduled call-backs for the day to ensure that the project is always staffed to accommodate them.

In addition:

  • Eligible persons initially refusing to participate will be re-contacted one additional time for attempted conversion; anyone who communicates that they do not want to take the survey at that point will not be contacted again.

  • If an answering machine is reached, messages will be left on every third attempt, conveying the study’s importance and leaving a toll-free number for verifying the project’s legitimacy and/or to complete the survey.

  • Trained bilingual interviewers will be available on every shift to conduct interviews with selected respondents who speak Spanish.

  • Systematic, unobtrusive electronic monitoring (at least 15% of all interviews) will be a routine and integral part of survey procedures for all interviewers.

Refusal Conversion

Declining response rates are an industry-wide trend affecting all modes of data collection.10 The methodology is based on best practices for maximizing response in RDD CATI research, such as:

  • Using highly trained interviewers (including bilingual Spanish speakers) with effective interviewing techniques

  • Using a sample management approach that ensures a high number of contact attempts (15 for landline numbers and up to 8 for cell phone numbers)

  • Calls distributed across days and times (day, evening) with increased scheduling during peak times

  • Dedicated nonresponse conversion team

Exhibit 3 details ICF’s strategies for maximizing response rate.

Exhibit 3. Techniques for Maximizing Response Rates

Strategy: Focus on Minimizing Partially Completed Interviews
Description: Separate the mid-terminate suspended records and put them into a special study, and create a report that shows how far each record is from completion. Records with selected respondents and non-terminal dispositions are attempted up to the maximum number of attempts.
Outcome: A call center floor supervisor or Quality Assurance (QA) specialist calls these respondents and lets them know how much we appreciate the time they have already invested, and how close they are to allowing their responses to be counted. This strategy improves cooperation and overall response, and reduces the number of partial completes.

Strategy: Collect Data With a Dedicated Public Health Interviewing Team
Description: Maintain a group of highly skilled interviewers specifically trained to conduct BRFSS-protocol surveys.
Outcome: A dedicated team of high performers understands the importance of obtaining high response rates, and their familiarity with the survey and respondent questions and concerns enables them to respond effectively, promoting cooperation.

Strategy: Use Dedicated Non-Response Conversion Staff
Description: Use a group of specially trained interviewers/floor supervisors/QA specialists to call back 100% of soft refusals and partial completes.
Outcome: Deft interviewers have proven their abilities to convert respondents, or have shown exceptional refusal aversion methods on non-conversion attempts, resulting in more completed interviews.

Strategy: Prioritize Scheduled Appointments
Description: Run daily reports that list the times of all scheduled call-backs for the day to ensure that the project is always staffed to accommodate all call-backs.
Outcome: Honoring scheduled call-backs results in reaching willing respondents more reliably.

Strategy: Create a CATI-Programmed Frequently Asked Questions (FAQ) Screen
Description: Enable interviewers to access project information with a few simple keystrokes so they can address respondent questions quickly, uniformly, and accurately.
Outcome: Increasing respondent confidence results in increased cooperation.

Strategy: Allow Appointments Outside Usual Calling Hours
Description: Schedule appointments when a respondent requests one outside of normal calling hours. These records are retrieved manually to ensure that no other telephone numbers that did not request a call at that time are attempted.
Outcome: Increasing respondent convenience results in more completed surveys.

Strategy: Implement an Interactive Voice Response (IVR) Respondent Help Line
Description: Develop an in-language IVR system that includes options for talking to a floor supervisor or the project manager (or a representative from the Department, if desired), learning about participant confidentiality, etc.
Outcome: Promotes informed survey response and provides 24-hour survey information.

Strategy: Display Caller Identification
Description: Display a caller ID number linked to the IVR system.
Outcome: We reach respondents with call-block and privacy-manager devices; informing respondents of the importance of the research effort is critical to achieving a representative survey sample.

Strategy: Focus on First Contacts
Description: Develop a dedicated group of exceptional interviewers to make the first few critical call attempts.
Outcome: Because the majority of completed interviews occur on the first or second attempt, a small group of interviewers with proven success on first contacts will result in more completed interviews.


Expected Response Rates

As noted, it is expected that the survey will achieve a 48.7% landline and a 40.5% cell phone response rate based on AAPOR's response rate #4 (RR4).11 These response rates match the median response rates for all BRFSS states and territories in 2014. As stated above, it is acknowledged that these response rates do not meet the OMB standard of an 80% response rate. Methods to maximize response rates are outlined above. The plans to analyze the survey data for non-response and representativeness, and to develop weighting adjustments to increase the representativeness of the sample, are described below.





Non-Response Bias

To mitigate the risk of non-response bias, weighting adjustments will be developed to increase the sample representativeness relative to the population.

4. Testing procedures.


The survey was first undertaken in 1972 with a sample size of approximately 650,000. Since then the BLS has made significant progress toward reducing respondent burden by employing various statistical survey design techniques; the present sample size is approximately 240,000. The BLS is continually researching methods that will reduce the respondent burden without jeopardizing the reliability of the estimates.


Responding to concerns of data users and recommendations of the National Academy of Sciences, in 1989, the BLS initiated its efforts to redesign the survey by conducting a series of pilot surveys to test alternative data collection forms and procedures. Successive phases of pilot testing continued through 1990 and 1991. Cognitive testing of that survey questionnaire with sample respondents was conducted at that time. The objective of these tests was to help develop forms and questions that respondents easily understand and can readily answer.


In survey year 2006, the SOII program conducted a one-year quality assurance (QA) study focused primarily on the magnitude of employer error in transcribing data from their OSHA forms to the different types of BLS collection forms and methods. The results showed no systematic under-reporting or over-reporting by employers, and no strong dependence between error rates and collection methods.


Beginning in survey year 2007, the QA program introduced in 2006 was extended and modified to evaluate the quality of the data collected in terms of proper collection methods, with the goal of minimizing curbstoning and collector adjustments made without respondent contact. If improper collection methods or procedures were uncovered, they were corrected. A byproduct of this program was that each data collector would know that any form they processed could be selected for review.


In 2003, the BLS introduced the Internet Data Collection Facility (IDCF) as an alternative to paper collection of data. This system has edits built in which help minimize coding errors. The system is updated annually to incorporate improvements as a result of experience from previous years.


In 2008, extensive cognitive testing was completed on the IDCF collection system. In addition to an overall review, this testing provided detailed analysis of the site's usability and eye-tracking results. The summary (Summary of Expert Review of SOII IDCF Web Pages) provided extensive feedback, as well as a rating system that classified changes as "short-term" (wording changes), "mid-term" (changes that affect the order of pages (flow) but seemed simple to execute), and "long-term" (changes to skip patterns or associated buttons that appear to be more complex and would require more testing). The implementation of these changes went through a prioritization process that took into account the BLS staff resources needed to implement them.


In 2009, extensive cognitive testing was completed on the IDCF Adobe fillable form. Recommendations were provided (OSMR Review of the Revised SOII Adobe Form), and efforts were made to incorporate them in a timely manner.


In 2012, extensive follow-up cognitive testing was completed on the IDCF collection system. This testing (Results of the SOII Edits Usability Test) showed a vast improvement over previous studies and noted limited issues in three main areas:

  1. Respondents had difficulty understanding what to enter in the 'total hours worked by all employees' field, and in using the optional worksheet that accompanies this field.

  2. Respondents could be confused and/or frustrated by the way the information about the average hours worked per employee is derived and presented on the screen.

  3. Respondents missed, or had negative reactions to, the error message that appears on the detailed "cases with days away from work" reporting page.

These issues are currently being prioritized for future implementation based on the level of perceived need and available resources.


In 2015, an option was added to the IDCF collection system that would allow users to ‘opt-in’ to receive future communications with BLS via email. Extensive cognitive testing was performed on this additional module to ensure understanding and ease of use.


Current plans call for technical improvements to several systems to be in place by 2017 that will allow contact via email for those respondents who have agreed to receive correspondence via email.


Since 2008, BLS has been conducting research concerning the completeness of estimates from the SOII. This multiyear research effort provided results in 2012 which were used to guide the selection of further research.


During an examination into the causes of the high incidence of 'unpublishable' estimates (i.e., estimates that for various reasons were deemed too volatile or in violation of confidentiality agreements), it was discovered that some sampling strata exhibit a high degree of 'sampling inefficiency' (i.e., sampled items not being usable for estimation for any number of reasons). In 2013, a research project began to determine whether it would be feasible to 'oversample' these strata in a way that would minimally impact the optimal sizes produced by the Neyman allocation. This research is ongoing and is showing promising results (see "Sample Allocation to Increase the Expected Number of Publishable Cells in the Survey of Occupational Injuries and Illnesses").


The BLS also utilizes statistical quality control techniques to maintain the system's high level of reliability.


Undercount Research

The Bureau of Labor Statistics (BLS) is conducting ongoing research to investigate the completeness of the injury and illness counts from the Survey of Occupational Injuries and Illnesses (SOII). The purpose of this research is to better understand a potential undercount of occupational injuries and illnesses reported by the SOII and to investigate possible reasons behind it. Several articles and papers describing this research are available at http://www.bls.gov/iif/undercount.htm.


The BLS continues to evaluate the results of the undercount research completed. These efforts include evaluating reporting practices employed by establishments and testing the feasibility of collection of injury and illness data directly from workers.


Employer reporting practices are being investigated through a follow-back study of a subsample of respondents to the 2013 SOII. The results of this study should be released to the public within the next year.


The feasibility of collecting injury and illness data directly from workers is being evaluated through an incumbent survey. BLS plans to conduct a pilot test of a worker/incumbent survey in 2016-2017. The test will be a large-scale, nationally representative household pilot survey that will allow BLS to test the collection of information over one calendar year and to produce broad industry and occupation estimates comparable to the SOII. These tests will continue BLS research into ways to improve the completeness of injury and illness measures. A nonsubstantive change with further details will be submitted prior to the start of this test.


Computer Assisted Coding

BLS is constantly looking for ways to upgrade data collection that will minimize the impact of human error. Because much of the occupational data are provided in narrative form, BLS and its state partners must manually translate these narratives into codes. While BLS has incrementally developed rules for identifying coding errors, consistency remains a concern. In 2012, BLS began researching the concept of using computer learning algorithms to “autocode” free-form written case narratives from survey respondents. The initial results proved promising and indicated that computer-assisted coding would be feasible.


Currently, BLS is using the research output as part of the annual review of the codes state coders have assigned to occupation and case circumstances for more than a quarter million nonfatal injuries and illnesses. BLS will continue to develop and evaluate computer-assisted coding with the twin goals of improving consistency and freeing personnel for more complex assignments where staff expertise is critically needed.


For the 2014 SOII, BLS began automatically assigning occupation codes. BLS found that it could successfully automatically assign codes to about one-quarter of 2014 SOII cases. With the 2015 SOII, autocoding was expanded to include nature of injury or illness and part of body affected. With this expansion, SOII anticipates autocoding about 500,000 codes. A small portion of the autocoded values will be withheld from the coders and will be manually coded. The manually assigned codes will be compared to the autocoder assigned values for quality assurance measurement purposes.


Days of Job Transfer or Restriction Testing

Beginning with the 2011 survey year, BLS began testing the collection of case and demographic data for injury and illness cases that require only days of job transfer or restriction. The purpose of this ongoing pilot study is to evaluate the collection of these cases and to learn more about occupational injuries and illnesses that result in days of job transfer or work restriction. The results of the first three years of collection were successful and demonstrated that these data could be collected and processed accurately for a limited set of industries. The most recent results from the DJTR study are available at http://www.bls.gov/iif/days-of-job-transfer-or-restriction.htm.


BLS is analyzing the results of this test to determine the value of the resulting information and is looking at how best to implement the collection of these data as well as days away from work cases in future survey years. BLS regards the collection of these cases with only job transfer or restriction as significant in its coverage of the American workforce.


To retain the level of case and demographic characteristic detail currently published for cases with days away from work, and to publish similar estimates for cases with job transfer or restriction, a greater number of cases will need to be collected from employers. BLS has maintained the subsampling process that limits to 15 the number of cases each employer needs to submit. BLS is continuing to examine this issue to determine an optimal number of cases to collect for each type of case while limiting the burden on the employer and on the participating State agencies.


OSHA Electronic Recordkeeping


The Occupational Safety and Health Administration (OSHA) requires large establishments in manufacturing and in selected high-risk industries outside of manufacturing to record and retain data similar to those collected by the BLS injury and illness survey. OSHA requires establishment-specific data to target interventions such as inspections, consultations, and technical assistance.


OSHA recently amended its recordkeeping regulations to add requirements for the electronic submission of certain injury and illness information employers are already required to keep under OSHA’s regulations. The proposed rule does not add to or change any employer’s obligation to complete and retain injury and illness records under OSHA’s regulations for recording and reporting occupational injuries and illnesses. The proposed rule modifies employers’ obligations to transmit information from these records to OSHA or OSHA’s designee. The proposed rule does not change any employer’s obligation to complete the SOII. BLS will form a working group with OSHA to assess data quality, including timeliness, accuracy, and public use of the collected data, as well as align the collection with the SOII.


HSOII


Prior to the regular interviewing phase, 50 CATI pretest interviews will be conducted. Pretest participants will be recruited using a nonprobability convenience sample through requests for participation on social media, online advertisements, or hiring pre-testers through Amazon Mechanical Turk (MTurk). This approach will yield a sufficient number of test cases who have experienced an occupational injury, whereas a random sample would be unlikely to yield an adequate number of participants who have experienced an occupational injury or illness. At least 20 participants who have experienced an occupational injury or illness will be recruited within the pretest group of 50. Initially, free recruitment methods will be used, such as requests for participants on social media through groups likely to contain people who have experienced occupational injury; paid methods such as advertisements or MTurk will be used next, if needed. Participants will be given a small noncash incentive or a payment of $10 to participate, as the level of effort for pretesting is marginally higher than participation in the interview alone, since follow-up questions will be asked after the interview.


Both the paradata and respondent data collected during the pretest will be reviewed, and questionnaire revisions will be made as needed. As part of that review, item non-response will be assessed to identify any potentially problematic questions or survey sections that may be contributing to respondent break-off, and the datasets will be carefully reviewed for aberrant or inconsistent responses that might signal problems related to comprehension, recall, or reporting in either the questions or the response categories.


During the pretest, live calls and interviews will be monitored, and respondent debriefing questions will be incorporated, to identify respondent comprehension problems or challenging passages in the script. Any programming changes will trigger another iteration of the programming QA process.




5. Statistical responsibility.


The Statistical Methods Group (Gwyn Ferguson, Chief) is responsible for the sample design, which includes selection and estimation. The sample design of the survey conforms to professional statistical standards and to OMB Circular No. A-46.

HSOII


The following individuals at ICF have reviewed the technical and statistical aspects of the procedures that will be used to conduct the Pilot Household Survey of Occupational Injuries and Illnesses (HSOII):


John Boyle, Ph.D., Senior Vice President, Survey Research


Robert Tortora, Ph.D., Senior Survey Methodologist


Ronaldo Iachan, Ph.D., Senior Survey Statistician


Randy ZuWallack, M.S., Survey Statistician


Bradford Booth, Ph.D., Principal


1 Blumberg SJ, Luke JV. Wireless substitution: Early release of estimates from the National Health Interview Survey, January-June 2015. National Center for Health Statistics; 2015.

2 The design effect due to the dual-frame adjustment is based on the weighting required to combine the landline and cell phone samples. Since people with a cell phone and a landline (“dual-users”) have a chance of selection in both the landline sample and the cell phone sample, they have an increased chance of being selected for the survey. The increased probability of selection for the dual-users causes an unequal weighting effect that increases the variability of survey estimates.

3 Lohr, Sharon L, and J M Brick. 2014. "Allocation For Dual Frame Telephone Surveys with Nonresponse." Journal of Survey Statistics and Methodology 388-409.

4 Guterbock TM, Lavrakas PJ, Thompson T, ZuWallack R. 2013. Cost and productivity ratios in dual-frame RDD telephone surveys. Survey Practice. 4(2).

5 Blumberg, S. J., & Luke, J. V. (2016). Wireless substitution: Early release of estimates from the National Health Interview Survey, July–December 2015 [National Health Interview Survey Early Release Program report]. Retrieved from http://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless201605.pdf

6 Boyle J, Bucuvalas M, Piekarski L, Weiss A. Zero banks: Coverage error and bias in RDD samples based on hundred banks with listed numbers. Public Opinion Quarterly. 2009;73(4):729-50.

7 Izrael, D, Battaglia, MP, Frankel, MR. 2009. Extreme Survey Weight Adjustment as a Component of Sample Balancing (a.k.a. Raking). Proceedings from the 2009 SAS Global Forum, Washington, DC.

8 Curtin, R., S. Presser, and E. Singer, The effects of response rate changes on the index of consumer sentiment. Public opinion quarterly, 2000. 64(4): p. 413-428.

9 Groves, R.M., Nonresponse rates and nonresponse bias in household surveys. Public Opinion Quarterly, 2006. 70(5): p. 646-675.

10 Czajka JL, Beyler A. Declining response rates in federal surveys: Trends and implications. Report submitted to the Office of the Assistant Secretary for Planning and Evaluation, US Dept of Health and Human Services; 2016.
