ERS REIS Supporting Statement - PART B (Revised)


Rural Establishment Innovation Survey (REIS) (Also Known as National Survey of Business Competitiveness)

OMB: 0536-0071


SUPPORTING STATEMENT

Revised 5/6/2014


U.S. Department of Agriculture

Economic Research Service

Rural Establishment Innovation Survey (REIS)

OMB Control No. 0536-XXXX


Part B. Collection of Information Employing Statistical Methods


  1. Universe and Respondent Selection

For the Rural Establishment Innovation Survey (REIS), the sample will be selected from the business establishment list maintained by the Bureau of Labor Statistics (BLS) as part of its Quarterly Census of Employment and Wages (QCEW) program for those states whose employment security departments grant approval, and from a proprietary business list frame (Dun and Bradstreet) for states that do not grant approval. Forty-six states and the District of Columbia have agreed to participate; 5 states have declined.


The sample will exclude business establishments with fewer than 5 employees, establishments that are not privately owned and establishments not included in ‘tradable industries’ defined as mining, manufacturing, wholesale trade, transportation and warehousing, information, finance and insurance, professional/scientific/technical services, arts, and management of businesses.


Sampling stratification will be based on North American Industry Classification System (NAICS) code, metropolitan/nonmetropolitan location, employment size class, and whether or not the state has agreed to release its QCEW list frame through BLS for production. Establishments from the same strata in participating and nonparticipating states will have identical target sampling rates. The strata table below provides cell sizes for the study population and drawn sample for the combined BLS Quarterly Census of Employment and Wages and proprietary frames.


Establishment populations by strata are provided in the table below. The full study sample will have an initial sample size of 60,000; roughly 4,000 establishments from a proprietary sample frame will receive a telephone screening survey, and the roughly 56,000 establishments from the BLS sample will not be pre-screened because the pilot study identified a very low share of ineligible establishments in that frame. This is the number of businesses that could be contacted and re-contacted multiple times and in multiple ways in a mixed-mode survey protocol while staying within the survey budget. The target sampling rates were initially computed by compiling population establishment totals across the 9 target industries for nonmetropolitan counties and metropolitan counties:


Nonmetropolitan Sample Rate = (0.66667 x 60,000) / Nonmetropolitan Establishment Total


Metropolitan Sample Rate = (0.33333 x 60,000) / Metropolitan Establishment Total



Examination of the establishment population data made it clear that the sample sizes for Management of Businesses (Headquarters) and Performing Arts Companies, Museums, Historical Sites, and Similar Institutions (Arts & Museums) would be insufficient to provide reliable statistics. In addition, the Finance and Insurance (Finance) establishment population is very large, particularly with respect to potentially tradable services in rural areas. Oversampling of Headquarters and Arts & Museums by a factor of 3.3 ensures reliable statistics and is offset by an undersampling of Finance establishments by a factor of 0.33.




Table 1. Population Universe by Strata for Rural Establishment Innovation Survey

Stratum: Industry     Stratum: Geography  Stratum: Estab. Size  Estab. Population(1)  Target Sampling Rate  Anticipated Sample
Mining                Nonmetro            5-19                  4200                  0.2845                1195
Mining                Nonmetro            20-99                 2508                  0.2887                724
Mining                Nonmetro            100+                  588                   0.5884                346
Mining                Metro               5-19                  5096                  0.0232                118
Mining                Metro               20-99                 2979                  0.0235                70
Mining                Metro               100+                  841                   0.0488                41
Manufacturing         Nonmetro            5-19                  15573                 0.3178                4949
Manufacturing         Nonmetro            20-99                 10625                 0.3163                3361
Manufacturing         Nonmetro            100+                  4953                  0.6239                3090
Manufacturing         Metro               5-19                  75618                 0.0245                1851
Manufacturing         Metro               20-99                 52144                 0.0260                1358
Manufacturing         Metro               100+                  17778                 0.0514                913
Wholesale Trade       Nonmetro            5-19                  18629                 0.2891                5386
Wholesale Trade       Nonmetro            20-99                 5723                  0.2939                1682
Wholesale Trade       Nonmetro            100+                  389                   0.5707                222
Wholesale Trade       Metro               5-19                  122693                0.0227                2781
Wholesale Trade       Metro               20-99                 45369                 0.0230                1043
Wholesale Trade       Metro               100+                  6429                  0.0464                298
Transportation        Nonmetro            5-19                  10366                 0.2933                3040
Transportation        Nonmetro            20-99                 3895                  0.2924                1139
Transportation        Nonmetro            100+                  448                   0.5915                265
Transportation        Metro               5-19                  37847                 0.0230                869
Transportation        Metro               20-99                 20003                 0.0230                461
Transportation        Metro               100+                  4632                  0.0466                216
Information           Nonmetro            5-19                  6964                  0.2885                2009
Information           Nonmetro            20-99                 2134                  0.2854                609
Information           Nonmetro            100+                  144                   0.5417                78
Information           Metro               5-19                  29635                 0.0222                657
Information           Metro               20-99                 17247                 0.0223                384
Information           Metro               100+                  4722                  0.0449                212
Finance               Nonmetro            5-19                  20395                 0.0916                1868
Finance               Nonmetro            20-99                 3334                  0.0918                306
Finance               Nonmetro            100+                  212                   0.1792                38
Finance               Metro               5-19                  121239                0.0073                880
Finance               Metro               20-99                 27559                 0.0072                199
Finance               Metro               100+                  6437                  0.0146                94
Prof/Sci/Tech Serv.   Nonmetro            5-19                  16742                 0.2839                4753
Prof/Sci/Tech Serv.   Nonmetro            20-99                 2373                  0.2840                674
Prof/Sci/Tech Serv.   Nonmetro            100+                  214                   0.5748                123
Prof/Sci/Tech Serv.   Metro               5-19                  181087                0.0225                4068
Prof/Sci/Tech Serv.   Metro               20-99                 56302                 0.0227                1279
Prof/Sci/Tech Serv.   Metro               100+                  9838                  0.0453                446
Headquarters          Nonmetro            5-19                  1332                  0.9437                1257
Headquarters          Nonmetro            20-99                 728                   0.9451                688
Headquarters          Nonmetro            100+                  149                   1.0000                149
Headquarters          Metro               5-19                  10530                 0.0756                796
Headquarters          Metro               20-99                 7637                  0.0757                578
Headquarters          Metro               100+                  3349                  0.1514                507
Arts & Museums        Nonmetro            5-19                  921                   0.9197                847
Arts & Museums        Nonmetro            20-99                 444                   0.9302                413
Arts & Museums        Nonmetro            100+                  47                    1.0000                47
Arts & Museums        Metro               5-19                  4085                  0.0729                298
Arts & Museums        Metro               20-99                 2045                  0.0738                151
Arts & Museums        Metro               100+                  608                   0.1612                98
Totals                                                          1007779               0.0595                59924



The relatively small cell sizes for some of the “Large” establishment strata raise the possibility of oversampling those strata. However, the main reason for including all establishment size classes is to ensure the ability to make inferences about the tradable sector nationally; the main focus of the study is innovation in small and medium-sized establishments. Data collection during the pilot study revealed that large establishments were responding at half the rate of small and medium-sized establishments. Thus, for the main study, the large establishment strata are oversampled by a factor of 2.
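The rate adjustments described in this section can be sketched as follows. This is a minimal illustration, not the procedure actually used to build Table 1: the function names, the dictionary layout, and the example base rate of 0.29 are assumptions made for the sketch. Only the factors 3.3, 0.33, and 2, and the fact that a rate cannot exceed 1.0, come from the text and the table.

```python
# Illustrative sketch only: names and the example base rate are assumptions.

# Industry adjustment factors stated in the text.
INDUSTRY_FACTOR = {
    "Headquarters": 3.3,    # oversampled
    "Arts & Museums": 3.3,  # oversampled
    "Finance": 0.33,        # undersampled
}

LARGE_SIZE_FACTOR = 2.0  # large (100+) strata oversampled by a factor of 2


def adjusted_rate(base_rate, industry, size_class):
    """Apply industry and size adjustments to a base rate, capped at 1.0."""
    rate = base_rate * INDUSTRY_FACTOR.get(industry, 1.0)
    if size_class == "100+":
        rate *= LARGE_SIZE_FACTOR
    return min(rate, 1.0)


def expected_sample(population, rate):
    """Expected number of sampled establishments in a stratum."""
    return round(population * rate)


if __name__ == "__main__":
    base = 0.29  # hypothetical nonmetro base rate, for illustration only
    for industry in ("Mining", "Finance", "Headquarters"):
        for size in ("5-19", "100+"):
            print(industry, size, round(adjusted_rate(base, industry, size), 4))
```

The published rates vary slightly from cell to cell, so the actual computation evidently involved more than a single base rate per geography; the sketch only shows how the stated factors combine.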


The sample for the pilot study was comprised of roughly 2,600 respondents from the previous 1996 ERS Rural Manufacturing Survey and 2,874 respondents drawn from the BLS sample frame.


For the states that do not approve BLS providing sample, the Dun and Bradstreet (DB) sample is expected to be less current and of lower quality than the BLS sample. It is anticipated that the DB list is updated less frequently and with less authority than the BLS-provided list, which is the reason for the pre-screening effort. The screening survey is very short, and since anyone answering the phone can provide the requested contact information, there will be few barriers to responding (Attachment J). The screening survey design and full study questions for the REIS are very similar to those of the 1996 Rural Manufacturing Survey, which was administered and validated by SESRC.


  2. Procedures for Collecting Information


For participating establishments, REIS will be a one-time survey collection and will occur mainly in 2014. This is a voluntary government-sponsored survey and will be conducted by an academic survey organization based at a Land Grant University. Establishments drawn from the proprietary sampling frame will be contacted through an initial telephone screening effort to determine whether they are eligible for the study (currently “in-business” and having 5 or more employees). During this contact, information will be obtained to identify a knowledgeable and appropriate respondent for the business and to collect all of this individual’s contact information (Attachment J). The results from the pilot survey demonstrated that the number of ineligibles in the BLS sample was very small and that identifying a specific contact within the establishment did not significantly improve response rates. For these reasons prescreening will not be done for the BLS sample.


A letter of introduction signed by ERS Administrator Mary Bohman will be sent to the BLS sample and to eligible establishments from the proprietary sample that complete the telephone prescreening (Attachment D). The purpose of this advance letter is to notify businesses about the study and explain why we need their participation. The second page of this letter contains a brief list of frequently asked questions regarding confidentiality, how the respondent was identified, and the estimated burden for completing the survey. The mailing also includes an advance letter from Danna Moore, the study director at SESRC, which provides a web link to the survey and explains the token incentive as a gesture of reciprocity.


For the REIS, respondents will be asked to complete questionnaires in at least one of three possible survey modes (telephone, web, or mail, Attachments A, B and C). All survey instruments across modes will be carefully aligned to provide the same information and explanations of the survey. The web version of the survey is to be located on the SESRC WSU website with a specific URL. Each question screen will carry a banner with the survey title “National Survey of Business Competitiveness” and USDA ERS sponsor. The telephone survey introduction will be used by interviewers to explain the purpose and the sponsorship of the study. The mail surveys will use a cover letter to provide this information. All modes of contacting respondents will provide information on how to contact SESRC or ERS if they have questions or need clarifications about the study.


The survey methodology literature over the last decade has addressed the use of incentives as a means to improve response rates in household and person-based surveys. However, there remain gaps in this literature with respect to detailed description of the establishment survey response process, the effectiveness of survey mode sequencing, and how incentives interact within these processes to affect establishment survey respondents. The aspects of survey implementation shown to increase response rates in business surveys are, in order of importance: 1) a “Response Required By Law” message; 2) multiple contacts; and 3) cash incentives. A pilot study will use an experimental design to test various interventions that can be used to improve survey response. The experimental testing framework used in this study (see Table 2 and Table 3) is important because it will offer insights into how non-mandatory (voluntary) survey response is affected by process components and strategies. Five objectives will be tested: 1) alternative survey mode sequencing (telephone sequence first versus mail sequence first); 2) the effectiveness of each mode; 3) the combination of postal class and packaging (first class postage versus two-day priority mail, and cardboard mailer versus brown paper envelope); 4) early stage, later stage, and repetitive application of a small token $2 cash incentive with the mail questionnaire; and 5) the timing of offering the web mode as an alternative response option for survey completion. Depending on the experimental group assignment and intervention, the business respondent will be contacted by telephone and/or by mail and will be offered one of three ways (telephone, mail, or web) to complete the survey.


These results collected within a voluntary survey environment reflect a more generalizable survey structure than those realized under mandatory government collections. We hope to capitalize on respondents’ awareness of web surveys and the offering of a choice as a means to accommodate completing the survey in a mode of their preference to determine if this is an important element of survey strategy. There is research in the household respondent survey literature that suggests offering more than one survey mode at a time can decrease survey response rather than enhance response (Millar and Dillman, 2011). This is an aspect that has not been tested in the establishment survey arena. This study specifically incorporates the idea of offering a web link at specific junctures in the contact process and then following this with email augmentation to those respondents with an email address that was gained during the telephone prescreening contact of the business.


The mode sequence selected for the full study will be contingent on findings from the pilot, and the factors surrounding this decision will be fully elaborated in the pilot study assessment report submitted to OMB. Three general outcomes are anticipated: a statistically significantly higher response rate for one mode sequence over all others; statistically significantly higher response rates for two or more mode sequences over the remaining mode sequences, without identification of a clearly dominant mode sequence; or failure to discern statistically significant differences in response rates across all mode sequences. The mode sequence selected in the first and last cases would be the one with the highest response rate or the lowest cost, respectively. In the middle case, mode sequences with statistically significantly lower response rates would be abandoned and the survey would be administered by allocating an equal share of potential respondents to the remaining mode sequences. The likelihood that the pilot study will identify substantive differences between mode sequences, if they exist, is high: the power of detecting a difference in response rates of 0.05 between two mode sequences with a sample size of 1600 is 0.953 at the 0.01 level of significance.
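A power calculation of this kind can be sketched with a normal approximation for the difference between two proportions. The sketch below is illustrative only: the baseline response rates are hypothetical because the statement does not report them, the sample size is treated as the number of cases per mode sequence, and a two-sided test is assumed. It is therefore not expected to reproduce the 0.953 figure exactly.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.01):
    """Approximate power of a two-sided z-test for a difference in proportions.

    p1, p2      - assumed true response rates in the two groups (hypothetical)
    n_per_group - number of sampled cases in each group
    alpha       - significance level of the test
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standard error of the difference under the alternative (unpooled).
    se = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    shift = abs(p2 - p1) / se
    # Probability that the test statistic falls beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical baselines; power depends strongly on where the rates sit.
    for base in (0.10, 0.20, 0.30):
        print(base, round(power_two_proportions(base, base + 0.05, 1600), 3))
```

Because the variance of a proportion is largest near 0.5, the same 0.05 difference is easier to detect when the baseline response rate is low than when it is moderate.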


For the web survey version, the website for the survey will be secure and respondents can only access the website by entering their specific project assigned identification code. It is anticipated most respondents will be able to complete the questionnaire in one session. However, business respondents will be allowed multiple reentries to the survey website if needed to complete the questionnaire in multiple sessions.


Upon receipt of completed questionnaires, SESRC will download, enter, compile, and aggregate survey responses from each survey (mode version and interventions) and analyze all survey responses. All respondents will be asked the same survey questions about their business environment, activities, and revenues, thus providing uniform data across survey modes.


All contact materials and survey questionnaires have benefited from expert consultation (internal and external) and peer review by stakeholder groups. Cognitive interviews to test the survey questionnaire were conducted in September 2013 (Attachment F). The letters and reminders were developed in collaboration with internal and external survey methodologists.


DATA EDITING PROCEDURES


Telephone screening and telephone interviewing


Survey data for all REIS samples (landline, listed, and cell) will be collected using the same computer-assisted telephone interview (CATI) system for both the telephone screening phase and the extended full interviews collected over the telephone. While the screening interview may vary somewhat by sample, the same editing procedures will be followed for all REIS cases. In a CATI environment, the data collection and interview process is controlled using a series of computer programs to ensure consistency and quality. At SESRC WSU, the commercial CATI software used is Voxco, which has been in use for more than 15 years. SESRC has more than 25 years of experience with CATI software. For the telephone survey administration, the CATI system programming determines which questions are asked based on business characteristics, composition, respondent characteristics, or preceding answers, and the order in which the questions are presented to interviewers. The system also presents the response options that are available for recording answers. CATI range and logic edits do much to help ensure the integrity of the data during the collection process by telephone. This editing at the time of the interview greatly reduces the need for post-interview editing and allows most questionable entries to be reviewed in real time with the respondent as part of the collection process. Although the CATI system virtually eliminates out-of-range responses and many other anomalies, some consistency and edit issues may arise. For example, interviewers may note concerns or problems that must be handled by data analysts or preparation staff after the interview is complete. Updating activities require that both manual and machine editing procedures be developed to correct interviewer, respondent, and CATI program errors and to check that updates made by data management staff were input correctly.
Because data editing may result in changes to the survey data, specific quality control procedures will be implemented. REIS survey data will be carefully examined and edited before delivering final data files to ERS USDA.


Additional data quality assurance occurs through supervision of interviewer performance. Quality checking is carried out by survey monitors and survey supervisors, who listen to live interviews and visually check on screen how interviewers code respondents’ answers while the interviews are being conducted. Any problems in question delivery, interview performance, or entry will be noted and the interviewer will be notified of performance problems. SESRC has a performance management scoring system for interviewers. This process includes meeting with each interviewer to discuss performance, review outcomes, and plan for improvement. Interviewers are routinely monitored with a goal of once a week during calling, within the first few days of calling on any given project, and as needed to meet contractual agreements. Routinely, as part of contractual agreements, SESRC monitors between 5% and 10% of all interviews for quality. If needed, an interviewer will be retrained and systematically monitored for improvement. If an interviewer is not capable of meeting performance objectives, they are terminated from calling. If an error in the data recorded by an interviewer is detected, a data correction will be made to the case. If the errors detected are severe, then all cases completed by the interviewer in question will be reviewed for completeness and accuracy. If any cases are suspect, then cases will be recalled and/or particular answers verified with the business.


One critical step during the data collection process for telephone interviews includes a process whereby at the completion of an interview, each interviewer answers a set of questions about the interview. If the interviewer detects concerns with quality such as compromised respondent ability, extreme distractions, or other issues these are noted at this time. Survey supervisors routinely review these results to detect poor or suspect interviews. Quality control procedures associated with data corrections may also involve limiting the number of staff who make updates, using the CATI specifications to resolve issues in complex questionnaire sections, carefully checking updates, and performing computer runs to identify inconsistencies or illogical patterns in the data associated with the current questionnaire.


The data editing procedures for REIS will consist of five main tasks: (1) managing and resolving problem cases (error checking); (2) reading and using interviewer comments to make data updates; (3) coding questions with open-ended text strings (i.e., “other, specify” responses); (4) verifying data editing updates; and (5) survey supervisor review of interviewer response outcomes on interviews. The final step will be to convert the edited data from the CATI system to the SAS data delivery files.


Mail returns, review, hand coding, and hand data entry


For completed mail questionnaires, the data entry process consists of three main stages: 1) initial data entry by one clerical staff member; 2) verification (second-pass data entry) performed by a different clerical staff member; and 3) final validation, which accounts for all questionnaires by ID number, ensures that all observations have been entered and verified, and corrects any errors that may have occurred during this process. The data entry program is a computerized online system that prompts clerical personnel for valid responses to every question in the survey. It has the same range checks and question branching/skip logic as the CATI questionnaire software.
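The verification and validation stages can be illustrated with a short sketch that compares two independent entry passes and then accounts for questionnaires by ID. The data layout (a dictionary keyed by questionnaire ID) and the function names are assumptions made for this example and do not describe the SESRC data entry system.

```python
def compare_passes(first_pass, second_pass):
    """Return discrepancies between two independent data-entry passes.

    Each pass maps questionnaire ID -> {question: entered value}.
    """
    discrepancies = []
    for qid in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(qid)
        b = second_pass.get(qid)
        if a is None or b is None:
            # The questionnaire was keyed in only one of the two passes.
            discrepancies.append((qid, None, "missing from one pass"))
            continue
        for question in sorted(set(a) | set(b)):
            if a.get(question) != b.get(question):
                discrepancies.append(
                    (qid, question, (a.get(question), b.get(question)))
                )
    return discrepancies


def unaccounted_ids(returned_ids, entered):
    """IDs of returned questionnaires that have not yet been entered."""
    return sorted(set(returned_ids) - set(entered))


if __name__ == "__main__":
    first = {101: {"q1": 1, "q2": 250}, 102: {"q1": 2, "q2": 40}}
    second = {101: {"q1": 1, "q2": 205}, 102: {"q1": 2, "q2": 40}}
    print(compare_passes(first, second))
    print(unaccounted_ids([101, 102, 103], first))
```

Each flagged discrepancy would be resolved against the paper questionnaire before the record is treated as verified.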


Prior to the initial data entry, data editing and data cleaning will occur once a large number of completed questionnaires have been returned and a coding manual has been developed. During this initial phase several hundred paper questionnaires and question answers will be reviewed for: 1) respondents’ adherence to question branching and skip instructions; 2) marks and comments written in the margins or on questions; 3) completeness and open-ended numeric answers with anomalies; 4) straight-lining on question banks; 5) selective checking in question banks; and 6) any other types of errors indicating the need for data cleaning and data edits. Once a large number of paper questionnaires have been reviewed, a coding manual will be drafted and reviewed with principal investigators and researchers at ERS prior to hand coding and data entry. Cleaning decisions will be documented in the coding manual, and instructions for specific questions and problems will be developed for coders. Coding will be performed by a limited number of coders to ensure accuracy and consistency. A data manager/analyst will review coding. Once questionnaires are coded, data entry will be performed by data entry staff.


Web surveys


In the web survey environment, all questions allow voluntary responses; the web questionnaire program does not require an answer in order for the respondent to progress through the survey. This also meets best practices for human subjects research. Allowing the respondent not to answer a question also prevents abandonment of the interview, as it reduces the respondent’s frustration when they are unwilling to answer a given question. To reduce instances of questions being skipped without an answer, special screen prompts will be programmed to ask for an answer. The goal of this functionality is to persuade the respondent to answer the question by describing the importance of the response or the purpose of the question. The types of questions that most often experience item non-response are open-ended numeric questions. These questions will be carefully reviewed and pretested to determine if they require specific instructions for inclusions or exclusions. If it is found during the early stages of the study that respondents are skipping particular questions, these questions will be reviewed for sensitivity, wording, and/or comprehension issues. If needed, the question will be changed or information added, such as an instruction, definition, or screen prompt.


Estimation procedures


The analytical approaches for addressing the study’s central research questions are discussed below:


  1. What percentage of rural establishments in tradable industries introduced product, process or practice innovations in the previous 3 years?

  2. What percentage of self-reported innovative establishments also demonstrates behaviors consistent with substantive innovation?

  3. How do self-reported and ostensibly substantive innovation rates differ by urban/rural location, industry and establishment age?

  4. What establishment and community characteristics are associated with self-reported and ostensibly substantive innovation?

  5. Do ostensibly substantive innovators demonstrate faster rates of employment growth or higher survival rates than claimed innovators and non-innovators?


Questions 1-3 will be addressed using descriptive analysis. Questions 4-5 will be addressed using multivariate regression techniques. In addition, questions 2-5 will require a method for classifying innovative establishments as either claimed innovators or substantive innovators.


To address the first question, estimation of the percentage of rural respondents that report product, process or practice innovations will incorporate information from the complex sample design for the entire sample to produce valid estimates of the mean and variance, and will use pseudo-maximum likelihood methods to generate population-weighted frequency tables. Within the rural stratum, comparison of innovation rates across settlement types ranging from micropolitan counties to entirely rural counties will use domain analysis to take into account the randomness of the sample size across settlement types. As the first quantitative assessment of rural innovation in the U.S., valid variance estimation will be critical in describing the phenomenon across the rural continuum.
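As a simplified illustration of design-based estimation, the sketch below computes a stratified estimate of a population proportion and its standard error using the textbook stratified simple random sampling formulas with a finite population correction. It is an assumption-laden sketch rather than the planned estimator: it ignores nonresponse adjustment, domain estimation, and the pseudo-maximum likelihood methods mentioned above, and the input layout and example counts are hypothetical.

```python
from math import sqrt


def stratified_proportion(strata):
    """Design-based estimate of a population proportion and its standard error.

    `strata` is a list of dicts with keys:
      N - stratum population size
      n - number of respondents in the stratum (must be at least 2)
      x - respondents reporting an innovation
    """
    total_population = sum(s["N"] for s in strata)
    estimate = 0.0
    variance = 0.0
    for s in strata:
        weight = s["N"] / total_population  # stratum share of the population
        p = s["x"] / s["n"]                 # within-stratum sample proportion
        fpc = 1 - s["n"] / s["N"]           # finite population correction
        estimate += weight * p
        variance += weight**2 * fpc * p * (1 - p) / (s["n"] - 1)
    return estimate, sqrt(variance)


if __name__ == "__main__":
    # Hypothetical respondent counts, for illustration only.
    example = [
        {"N": 100, "n": 10, "x": 5},
        {"N": 300, "n": 30, "x": 6},
    ]
    print(stratified_proportion(example))
```

Strata sampled with certainty contribute no sampling variance under this formula because the finite population correction is zero when every establishment in the stratum is observed.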


However, past efforts examining measures of self-reported innovation in the European Union have identified a problem of over-reporting (North and Smallbone 2000). Lacking the resources to qualitatively assess the innovativeness of each respondent, the analysis will utilize auxiliary information on various establishment characteristics believed to be strongly associated with substantive innovation. For example, a question designed to correct for social desirability bias will ask about failed innovations at the establishment. Comparing the percentage of claimed innovators that acknowledge failed innovations to the percentage of claimed innovators that do not acknowledge failed innovations will provide one measure of possible over-reporting. Other characteristics, such as safeguards for protecting intellectual property or practices that facilitate data-driven decision-making, may also differentiate substantive innovators from claimed innovators. Variation in these observed variables may reflect variation in an unobserved factor related to substantive innovation.


Mixture models such as latent class models are well-suited to the problem of describing and analyzing observations hypothesized to come from different unobserved subgroups in the population. The two conceptual classes of most interest are substantive innovators and nominal innovators with non-innovators identified as respondents opting out of the innovation questions. However, the data could support four subgroups in the population with a subgroup of advanced non-innovators being identified; i.e., respondents that did not introduce new or significantly improved products but did utilize data-driven decision-making tools or possessed intellectual property worth protecting. Recent research examining the use of latent class models with complex survey design data (Patterson, et al. 2002; Vermunt 2007; Wedel, et al. 1998) has made it possible to apply these tools when the assumption of simple random sampling is violated.


The validity of the latent class structure will be assessed in the short-run by comparing the industry distribution of substantive innovators with known innovation intensive industries. If ostensible substantive innovators are much more likely to be in innovation intensive industries, then this would provide prima facie evidence of the validity of the class structure. In the long-run, linking REIS to the Business Employment Dynamics data at BLS (see below) will provide longitudinal performance data to compare substantive with nominal innovators that would provide outcome based evidence of the validity of the class structure.


Questions 2 and 3 will apply the relevant innovator classification to all respondents and then estimate mean and variance of percentages as was done for the self-reported innovation variable in Question 1. Domain analysis will be used when estimating parameters across groups such as settlement type, industry or establishment age.


Question 4 will be addressed using a binary response model to investigate the relationship between innovative activity and establishment and community characteristics. Nonlinear logit or probit models able to incorporate complex survey design information are available in statistical software packages allowing unbiased estimation of parameter variance. Domain analysis will allow investigating similarities or differences with respect to innovative activity across settlement types or industry groups providing critical information for designing rural innovation policy.


The analysis will also provide an assessment of the value of the ostensibly substantive innovation classification. It is anticipated that the explanatory power of the substantive innovation model will be significantly higher than the self-reported innovation model since the latter is thought to include establishments over-reporting innovative activity due to social desirability bias. Alternatively, if the substantive innovation model does not demonstrate better explanatory power then it is less likely that the observed characteristics thought to be related to substantive innovation are correlated with the hypothesized unobserved factor.


Questions 1-4 will be addressed as soon as cleaned data from the REIS becomes available. Addressing Question 5 will not be possible until several years later when a sufficient amount of quarterly employment data is available to support survival analysis. It is anticipated that the REIS will be linked with the Business Employment Dynamics data at the Bureau of Labor Statistics that will allow examining the medium and long-term effects of innovative activity on establishment survival and employment growth.


To examine employment growth we will use a two-stage model that incorporates information from an establishment exit model to correct for the nonrandom selection of surviving establishments. This model has been widely adopted in manufacturing studies (Doms et al., 1995; Jarmin, 1999; Acs, 2002). The two stages are specified as:


(1)

(2) ,


where is the parameter vector from the exit equation, is the parameter vector from the growth equation, is the covariance between the disturbance terms of the two equations and is the inverse Mills ratio—derived from the first stage regression and used as an instrument to control for selection bias in the second stage. We estimate equation (1) using standard limited dependent variable techniques. We identify equation (2) via the nonlinearity of the Mills ratio as do Evans, 1987, and Doms et al., 1995.
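For orientation, a two-stage selection model of the kind described here is commonly written in the form below. The notation is an assumption introduced for this sketch and may differ from that of equations (1) and (2): $s_i$ indicates that establishment $i$ survives, $g_i$ is its employment growth, $z_i$ and $x_i$ are covariate vectors, $\gamma$ and $\beta$ are the parameter vectors of the first-stage (exit) and growth equations, $\sigma_{u\varepsilon}$ is the covariance between the two disturbance terms (with the first-stage disturbance normalized to unit variance), and $\lambda_i$ is the inverse Mills ratio computed from the first-stage estimates.

```latex
\begin{align}
\Pr(s_i = 1 \mid z_i) &= \Phi(z_i'\gamma) \\
E[\, g_i \mid x_i,\ s_i = 1 \,] &= x_i'\beta + \sigma_{u\varepsilon}\,\lambda_i ,
\qquad \lambda_i = \frac{\phi(z_i'\hat{\gamma})}{\Phi(z_i'\hat{\gamma})}
\end{align}
```

Here $\phi$ and $\Phi$ are the standard normal density and distribution functions. When the same covariates appear in both stages, the second equation is identified only through the nonlinearity of $\lambda_i$.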


Establishment survival will be assessed using a proportional hazard specification that is widely used and designed to account for the censored nature of the data. Our dependent variable, whether an establishment is continuing or has exited, is reported quarterly for each establishment and is modeled as:


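(3) $\Pr(y_{it} = 1) = \dfrac{\exp(x_{it}\theta)}{1 + \exp(x_{it}\theta)}$,


where i = 1, …, N establishments, t = 1, …, T quarters during the specified period, $x_{it}$ is the vector of covariates with parameter vector $\theta$, and $y_{it}$ is 0 or 1.


The quarterly dependent variables are regarded as a panel of binary variables; each quarter, for each establishment, there is an indicator variable for whether or not the establishment has any employees. Each establishment is viewed as contributing several observations to a larger logit likelihood function, the product of the logit models in equation (3):

(4) $L = \prod_{i=1}^{N} \prod_{t=1}^{T} \Pr(y_{it} = 1)^{y_{it}} \, [1 - \Pr(y_{it} = 1)]^{1 - y_{it}}$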


Treating the data as a panel data set facilitates estimating flexible hazard functions because the complicated likelihood maximization problem is replaced with a familiar logit estimation problem (equation 4), which can be estimated with standard software.
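
The panel treatment described above can be sketched as follows. This is an illustration under stated assumptions: exit is coded as 1 in the final observed quarter, the helper names are hypothetical, and a constant linear index of -2.0 stands in for fitted covariate effects.

```python
import math


def person_period_rows(quarters_observed: int, exited: bool) -> list[int]:
    """Expand one establishment into quarterly 0/1 exit indicators.

    The establishment contributes one row per observed quarter: 0 while it
    continues and 1 in its last quarter if it exited. A censored
    establishment (still operating at the end) contributes only zeros.
    """
    if quarters_observed < 1:
        raise ValueError("need at least one observed quarter")
    rows = [0] * quarters_observed
    if exited:
        rows[-1] = 1
    return rows


def logit_log_likelihood(panel: list[tuple[int, float]]) -> float:
    """Sum the Bernoulli log-likelihood over (outcome, linear index) rows."""
    total = 0.0
    for outcome, index in panel:
        p = 1.0 / (1.0 + math.exp(-index))  # logit probability of exit
        total += math.log(p) if outcome == 1 else math.log(1.0 - p)
    return total


# Hypothetical spells: (quarters observed, exited?)
spells = [(3, True), (4, False)]
panel = [(y, -2.0) for q, e in spells for y in person_period_rows(q, e)]
print(len(panel), round(logit_log_likelihood(panel), 4))
```

Because every establishment-quarter is just one more row, censored establishments need no special handling: they simply contribute rows with outcome 0 for as long as they are observed.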

Integrating complex survey design information into the analysis required to address Question 5 is possible using the svyset functionality in Stata 11. Both two-stage selection models and proportional hazard models can be estimated with the svy prefix, which incorporates survey design information and supports domain analysis on selected subpopulations with valid variance estimates.


Degree of Accuracy Needed


Comparing innovation rates between urban and rural establishments is a primary focus of the study. The most challenging aspect of this question with respect to sample size is comparing conventional measures of innovative or inventive activity, such as patent application rates, as these events tend to be rare in both urban and rural environments. Unfortunately, we have not been able to locate previous studies that have examined patent application rates at the establishment level. However, it is possible to combine information from different sources to arrive at a reasonable estimate of the differences in patent application rates we would expect to observe. We would want a sample large enough to detect a significant difference between these expected application rates.

We combine survey results from the 1996 Rural Manufacturing Survey with the 2008 BRDIS results to arrive at an expected patent application rate for manufacturing establishments. We then use European data on differences in patent application rates between manufacturing and services to estimate patent application rates for our entire sample. We incorporate information from both differences in rural and urban patent application rates and differences in the mix of manufacturing and services to arrive at expected patent application rates for urban and rural areas among all tradable sectors.

The results from the 2008 BRDIS suggest that 1 in 5 firms with R&D units applied for at least one patent. Findings from the Rural Manufacturing Survey indicate that 30% of urban establishments and 22% of rural establishments had an R&D unit. For manufacturing we would therefore expect 6% of urban establishments (20% of 30%) to have applied for a patent, compared with 4.4% of rural establishments (20% of 22%). Given a likely rural manufacturing sample of 4,907 and urban manufacturing sample of 1,738, and assuming a 60% response rate, the power of the test for two proportions is 0.655, below the conventional threshold of 0.8 for a powerful test. This example is instructive because manufacturing is the one industry for which we have the best information and also where the events are anticipated to be less rare. However, the low power within manufacturing is not a problem for the study objective of comparing rural and urban innovation rates for the tradable sector as a whole.


The POWER Procedure
Pearson Chi-square Test for Two Proportions

Fixed Scenario Elements
  Distribution                  Asymptotic normal
  Method                        Normal approximation
  Number of Sides               1
  Group 1 Proportion            0.06
  Group 2 Proportion            0.044
  Group 1 Sample Size           1043
  Group 2 Sample Size           2944
  Null Proportion Difference    0
  Alpha                         0.05

Computed Power
  Power                         0.655

To apply the power analysis to the entire sample we use patent application data from Europe to arrive at a reasonable ratio of services to manufacturing patent application rates. We then apply this ratio to our estimates of rural and urban US manufacturing patent application rates to derive the services patent application rates. The assumption is that the ratio between manufacturing and services application rates is the same in Europe and the US, which avoids the more restrictive assumption that patent application rates in Europe and the US are equal.

The services patent application rate in Europe is 41.5% of the manufacturing patent application rate. Thus, in the US we estimate that the rural services patent application rate would be 0.01826 (41.5% of 4.4%) and the urban services patent application rate would be approximately 0.025 (41.5% of 6%). The fact that manufacturing makes up a larger share of the tradable sector in rural areas reduces the expected difference between rural and urban patent application rates overall. The expected patent application rate is 0.03183 for the urban tradable sector overall and 0.024575 for the rural tradable sector. Assuming a 60% response rate, an initial sample size of 30,000 yields 18,000 responding establishments and a test with adequate power of 0.872.
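
The arithmetic behind these rates can be checked with a short sketch. The function names are hypothetical, and the manufacturing shares are not reported in this statement: the shares printed here are back-solved from the quoted overall rates under the assumption that each overall rate is a simple weighted average of the manufacturing and services rates.

```python
def services_rate(mfg_rate: float, ratio: float = 0.415) -> float:
    """Services patent application rate implied by the manufacturing rate."""
    return ratio * mfg_rate


def implied_mfg_share(overall: float, mfg_rate: float, ratio: float = 0.415) -> float:
    """Back-solve s from overall = s * mfg_rate + (1 - s) * services_rate."""
    svc = services_rate(mfg_rate, ratio)
    return (overall - svc) / (mfg_rate - svc)


# Rates quoted in the text: (manufacturing rate, overall tradable-sector rate).
areas = {"urban": (0.06, 0.03183), "rural": (0.044, 0.024575)}
for name, (mfg, overall) in areas.items():
    print(name,
          "services rate:", round(services_rate(mfg), 5),
          "implied manufacturing share:", round(implied_mfg_share(overall, mfg), 3))
```

Under that assumption the implied manufacturing share is larger for rural than for urban areas, which agrees with the statement that manufacturing makes up a larger share of the rural tradable sector.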


The POWER Procedure
Pearson Chi-square Test for Two Proportions

Fixed Scenario Elements
  Distribution                  Asymptotic normal
  Method                        Normal approximation
  Number of Sides               1
  Group 1 Proportion            0.03183
  Group 2 Proportion            0.024575
  Group 1 Sample Size           6000
  Group 2 Sample Size           12000
  Null Proportion Difference    0
  Alpha                         0.05

Computed Power
  Power                         0.872
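
The two power figures reported above can be approximated with the usual normal approximation for a one-sided test of two proportions, using a pooled variance under the null and unpooled variance under the alternative. This is a sketch, not the SAS implementation: the function name is hypothetical and the formula is assumed to correspond to the "Normal approximation" method shown in the output. The results should come out close to 0.655 and 0.872.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1: float, p2: float, n1: int, n2: int,
                          alpha: float = 0.05) -> float:
    """Approximate power of a one-sided pooled z-test for two proportions."""
    norm = NormalDist()
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))  # under H0
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)     # under H1
    z_crit = norm.inv_cdf(1 - alpha)
    return norm.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


# Manufacturing only: 60% response from 1,738 urban and 4,907 rural
# sampled establishments gives the group sizes in the first scenario.
n_urban, n_rural = round(0.6 * 1738), round(0.6 * 4907)
print(n_urban, n_rural,
      round(power_two_proportions(0.06, 0.044, n_urban, n_rural), 3))

# All tradable sectors: group sizes as in the second scenario.
print(round(power_two_proportions(0.03183, 0.024575, 6000, 12000), 3))
```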


By positing the magnitude of innovation events that we expect to be rare in the sample, we are able to demonstrate that an initial sample size of 30,000 will be sufficient for detecting the expected difference between rural and urban establishments.

  3. Methods to Maximize Response

Efforts to maximize response while remaining within the survey budget will use token cash incentives ($2), higher class postage, and distinctive mailers in the mail modes of contact. For all modes and mode sequences, this study will use multiple contacts as a best practice to reach the respondent and achieve response. The mixed mode design, combining a telephone sequence with 20 call attempts and a mail sequence, is also a known strategy for increasing survey response. In addition, in the mailing portion of the study, a special contact will be mailed to sampled businesses that refused during telephone contact or by mail. This letter will be specially designed to appeal and persuade, drawing on known psychological messaging to emphasize the importance of the survey request.



  4. Tests of Procedures or Measures


After the initial design phase, the telephone version of the questionnaire was tested through expert review by internal SESRC and ERS staff and through mock telephone interviews between SESRC and ERS USDA staff. The CATI telephone instrument was then tested with one ineligible, known innovative business from the local WA State population to assess questionnaire length, usability, workability, and question understanding, and to behavior-code respondent clarifications.


After the initial testing, mail and telephone versions of the survey were tested using cognitive interviewing protocols with 6 establishments (see Attachment F for the detailed report). A special focus of the cognitive interviewing was auxiliary questions that will be used to differentiate substantive from nominal innovators. All of the auxiliary questions were easily understood and answered by the six respondents. The cognitive interviewing was also invaluable for assessing how industries outside of manufacturing would respond to questions and resulted in significant modifications to the survey instruments. Finally, the cognitive interviewing helped identify opportunities for decreasing respondent burden (e.g., allowing firms with no debt to avoid questions on borrowing).

The questionnaires will undergo comprehensive testing and usability testing by internal SESRC experts, supervisors, and interviewers during pretesting with actual respondents in a pilot phase of this study after OMB clearance. Usability pretesting during the pilot will include monitoring interviews to observe participants’ probes and clarification behaviors, noting difficulties and comments, and conducting post-testing interviews with interviewers to gain qualitative feedback about potential confusion. In addition, quantitative measures will be gathered, including time to complete the survey, paradata, and navigation patterns from the web questionnaire.


The pilot study will also be used to assess item nonresponse along with problems of very limited response variation. A focus of this analysis will be to identify systematic nonresponse within particular industry or establishment size strata. With the proposed sample size of 4,000, only two of the 54 strata have empty cells and four strata have an initial sample of two. This coverage should be sufficient to identify significant nonresponse problems prior to the full study.

This study includes a pilot study with experimental components designed to evaluate impacts on less cooperative respondents who require more contacts to gain cooperation. The pilot tests the impact of survey mode sequencing (mail, telephone, and web) and interactions with other interventions, as shown in Tables 2 and 3. The pilot sample is randomly assigned to experimental groups 1 to 5. Each group varies in the sequence and timing of treatments. Two-fifths of the sample is assigned to first receive the telephone sequence of survey contacts, followed by questionnaire mailings for main data collection. Three-fifths of the sample will be contacted first by mailings with questionnaires, followed by telephone follow-ups for survey completion. The groups also vary in when (which specific day and mailing) a web link is offered to complete the survey over the internet. For respondents with an email address, an email contact will follow that augments the postal letter contact by offering a web link that can be clicked to go directly to the survey. The $2 cash incentive, the use of two-day priority mail rather than first class postage, and the number of applications of each will also be varied across phases of the multi-contact sequence. The overall goal is to evaluate whether any of these interventions comparatively improve response propensity and/or bring in more of the “hard to reach” establishment respondents.
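
The random assignment described above can be sketched as follows, assuming a pilot sample of 4,000 establishments split into five equal groups of 800. The function name, the seed, and the integer identifiers are hypothetical; the starting mode for each group follows the mode sequence column of Table 2.

```python
import random
from collections import Counter

# Starting mode by experimental group, per the mode sequence test in Table 2.
START_MODE = {1: "mail", 2: "telephone", 3: "mail", 4: "mail", 5: "telephone"}


def assign_groups(establishment_ids, n_groups: int = 5, seed: int = 12345) -> dict:
    """Shuffle the ids, then deal them round-robin into groups 1..n_groups."""
    ids = list(establishment_ids)
    random.Random(seed).shuffle(ids)  # seed is arbitrary, fixed for repeatability
    return {eid: (pos % n_groups) + 1 for pos, eid in enumerate(ids)}


# Hypothetical identifiers standing in for the 4,000 pilot establishments.
assignment = assign_groups(range(1, 4001))
group_sizes = Counter(assignment.values())
mail_start = sum(1 for g in assignment.values() if START_MODE[g] == "mail")

print(sorted(group_sizes.items()))
print("share starting by mail:", mail_start / len(assignment))
```

Dealing the shuffled list round-robin guarantees equal group sizes, which a separate random draw for each establishment would not.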


Table 2 shows the overall tests for each group and inclusion of specific treatments. Table 3 shows each group and the specific details of implementation by days across data collection. Early responders from the screening portion of the survey will not be allowed in the pilot so that all respondents experience the experimental treatments. Early responders will be encouraged in the full study.



Figure 1: Overview of the Rural Establishment Innovation Survey Pilot Study Implementation Process

[Flowchart; the text of its boxes is listed below.]

Sample frame: 2011-2012 Establishment Listed Sample Frame; 1996 Mnf. D&B

Experiment: split assignment of sample, Tel start vs. Mail start

3/5 of sample will have a Mail start

  • Screening telephone contact (1-5 attempts)

  • Pre-notification letter

  • 1st mail questionnaire w/ cover letter

  • Postcard reminder/thank you to all respondents

  • 2nd mail questionnaire w/ cover letter (experimental random assignment to variations on postage, packaging, cash incentive)

  • At 8 weeks, switch mode to telephone: 1-10 telephone contacts to non-responders

  • Special refusal mailing (Attachment K)

2/5 of sample will have a Telephone start

  • Screening telephone contact (1-5 attempts)

  • Pre-notification letter

  • 1-10 telephone survey contacts

  • At 8 weeks, switch mode to mail

  • 1st mail questionnaire to non-respondents (this would be a special refusal mailing to telephone refusals)

  • Postcard reminder/thank you to all respondents

  • 2nd mail questionnaire to non-respondents (experimental random assignment to variations on postage, packaging, cash incentive)

  • Special refusal mailing to telephone refusers (Attachment K)


Table 2. REIS Pilot Study Experimental Design and Stimuli

| Stimulus | G1 | G2 | G3 | G4 | G5 |
|---|---|---|---|---|---|
| Tel pre-screen | Yes | Yes | Yes | Yes | Yes |
| Pre-notification letter | Yes | Yes | Yes | Yes | Yes |
| Pre-notification letter has web link | No | No | Yes | No | No |
| Mode sequence test | Mail first | Tele first | Mail first | Mail first | Tele first |
| Web link timing | Early | Late | Very early | Early | Late |
| Web link day | Day 7 | Day 42 | Day 1 | Day 7 | Day 42 |
| When web link is offered | 1st mail qstn and after | 1st mail qstn and after | Advance letter and all mailings after | 1st mail qstn and after | 1st mail qstn and after |
| Web link times | 3 | 3 | 4 | 4 | 4 |
| Email augmentation | Day 14 | Day 49 | Day 7 | Day 7 | Day 49 |
| $2 incentive yes/no | Yes | Yes | Yes | Yes | No |
| Incentive times | 2x | 2x | 2x | 2x | None |
| Incentive day(s) | Day 7 & day 35 | Day 42 & day 56 | Day 1 & day 28 | Day 7 & day 35 | None |
| Incentive timing | Early | Late | Very early | Early | None |
| Priority mail | 1x | 1x | 1x | 2x | None |
| Priority mail timing | Late, day 35, 2nd qstn | Late, day 56 | Day 28, 1st qstn | Day 7 1st qstn and day 35 2nd qstn | None |


Table 3. REIS Pilot Study Experimental design specific interventions and details of implementation procedures across data collection.

| Phase (timing) | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
|---|---|---|---|---|---|
| Exp. design | Mail first | Telephone first | Early web push, mail first | All stimulus, mail 1st qstn | Control: tel first, no cash, first class only |
| Sample size | 800 | 800 | 800 | 800 | 800 |
| Tel prescreen (4 weeks) | Tel prescreen | Tel prescreen | Tel prescreen | Tel prescreen | Tel prescreen |
| Phase 1 (Day 1) | Advance letter¹, NO web link | Advance letter, NO web link | Advance letter, web link, $2 | Advance letter, NO web link | Advance letter, NO web link |
| Phase 2 (Day 7) | 1st qstn, web link, $2, first class | Tel 1-2 | Email augm | 1st qstn, web link, $2, priority mail | Tel 1-2 |
| Phase 3 (Day 14) | Email augm | Tel 3-4 | | Email augm | Tel 3-4 |
| Phase 4 (Day 21) | | Tel 5-6 | Paper follow-up reminder letter | | Tel 5-6 |
| Phase 5 (Day 28) | Postcard thank you reminder | Tel 7-8 | 1st qstn, web link, $2, priority mail | Postcard reminder | Tel 7-8 |
| Phase 6 (Day 35) | 2nd qstn, web link, $2, priority mail | Tel 9-10 | Postcard reminder | 2nd qstn, web link, $2, priority mail | Tel 9-10 |
| Phase 7 (Day 42) | Tel 1-2 | 1st qstn, web link, $2, first class | Tel 1-2 | Tel 1-2 | 1st qstn, NO web, no cash, first class |
| Phase 8 (Day 49) | Tel 3-4 | Email augm | Tel 3-4 | Tel 3-4 | Email augm |
| Phase 9 (Day 56) | Tel 5-6 | Postcard thank you reminder | Tel 5-6 | Tel 5-6 | Postcard thank you reminder |
| Phase 10 (Day 63) | Tel 7-8 | 2nd qstn, web link, $2, priority mail | Tel 7-8 | Tel 7-8 | 2nd qstn, NO web link, no cash, first class |
| Phase 11 (Day 70-77) | Tel 9-10 | Refusal mailing | Tel 9-10 | Tel 9-10 | Refusal mailing |

¹ All advance contacts will have an enclosure from the ERS Administrator Mary Bohman.


  5. Contact(s) for Statistical Aspects and Data Collection

For questions on statistical methods described above, please contact:

Timothy R. Wojan

Regional Economist

Farm and Rural Business Branch

Economic Research Service, USDA

355 E Street SW

Washington, DC 20024

Tel. 202-694-5419

[email protected]


For questions on the data collection described above, please contact:

Danna L. Moore

Social and Economic Sciences Research Center

Washington State University

Pullman WA 99164-4014

Tel. 509-335-1117

[email protected]


Attachments


Attachment A Draft Rural Establishment Innovation Survey (sent out as National Survey of Business Competitiveness)

Attachment B Final CATI Script

Attachment C Screen shots of the Rural Establishment Innovation Survey Internet Application

Attachment D Draft Rural Establishment Innovation Survey Letters

Attachment F Cognitive interview Report 12-051: National Survey of Business Competitiveness

Attachment J Pre-screening Telephone Script

Attachment K Mail Short Form for Telephone Refusals

Attachment Not Referenced in Supporting Statement


Attachment G ERS Response to NASS Comments











References


Acs, Z. 2002. Innovations and the growth of cities. Northampton, MA: Edward Elgar.


Doms, M., Dunne, T. and Roberts, M.J. 1995. “The role of technology use in the survival and growth of manufacturing plants,” International Journal of Industrial Organization 13:523-542.


Evans, D.S. 1987. “The relationship between firm growth, size, and age: estimates for 100 manufacturing industries,” The Journal of Industrial Economics. 35(4):567-581.


Hsieh, F.Y. 1989. “Sample size tables for logistic regression,” Statistics in Medicine 8:795-802.


Jarmin, R.S. 1999. “Government technical assistance programs and plant survival: the role of plant ownership type,” CES Discussion Paper 99-2 February.


Millar, M.M. and Dillman, D.A. 2011. “Improving response rates to web and mixed-mode surveys,” Public Opinion Quarterly 75(2):249-269.


North, D. and Smallbone, D. 2000. “The innovativeness and growth of rural SMEs during the 1990s,” Regional Studies 34(2):145-157.


Patterson, B., Dayton, C.M., and Graubard, B.I. 2002. “Latent Class Analysis of Complex Survey Data: Application to Dietary Data,” Journal of the American Statistical Association 97(459): 721-741.


Vermunt, J.K. 2007. “Latent Class Analysis with Sampling Weights: A Maximum Likelihood Approach,” Sociological Methods and Research 36(1):87-111.


Wedel, M., ter Hofstede, F. and Steenkamp, J.-B.E.M. 1998. “Mixture Model Analysis of Complex Samples,” Journal of Classification 15(5):225-244.


1 Combination of Quarterly Census of Employment and Wages (2013Q2) and proprietary business registry from SSI for states not available through QCEW.



File Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
File Title: SUPPORTING STATEMENT
Author: love0313
File Modified: 0000-00-00
File Created: 2021-01-27
