National Agricultural Workers Survey
Summary of Third Party Methodological Reviews
Supporting Documentation for Part B of the Information Collection Request
Office of Management and Budget Control No. 1205-0453
October 25, 2019
This document summarizes two reviews of the survey methodology for the National Agricultural Workers Survey (NAWS) conducted between fiscal years 2009 and 2014. The Employment and Training Administration (ETA) contracted with Mathematica Policy Research (the reviewer) for this effort. Because the two reviews overlapped in content, this memo is organized by the aspect of the methodology under review; for each aspect, it identifies the methods reviewed, how they were assessed, the results of the assessment, the recommendations made by the reviewer, and the actions that ETA took in response.
The Office of Management and Budget (OMB) initiated the reviews to address questions and concerns arising from its assessment of Part B of the 2008 Information Collection Request (ICR) for the NAWS. The scope later grew to include additional issues that the reviewer raised during the course of the two reviews. The 2009 review assessed methods for computing point estimates, including the use of sampling weights, nonresponse adjustments, and post-sampling weights. The 2011-2014 review examined the implementation of recommendations from the 2009 review; again reviewed the calculation of point estimates and their reporting in draft publications; reviewed possible sources of design effects; revisited the issue of nonresponse adjustments; and conducted site visits to observe field procedures, particularly worker sampling. The second review also included an assessment of the 2015 Part B supporting statement.
Over the course of the reviews, the reviewer identified areas where the NAWS methodology could be strengthened. ETA reviewed the findings and recommendations and, in consultation with the NAWS contractor, implemented several changes to the NAWS methodology.
Specifically, in response to the reviews the NAWS contractor:
Improved the clarity of the Part B supporting statement's descriptions of sampling probabilities, nonresponse adjustments, and weights;
Improved the specification of standard errors in programs calculating point estimates;
Implemented suppression standards in draft documents, using the relative standard error (RSE) to identify estimates that should be reported with caution or suppressed;
Strengthened field sampling documentation;
Provided additional training for field interviewers and field monitors on field sampling and documentation procedures;
Verified that quota sampling of workers was no longer used by the NAWS interviewers;
Simplified the procedure for sampling crews at the worksite;
Changed Farm Labor Area (FLA) sampling to select the FLA roster using probability proportional to size (PPS) and to select FLAs from the roster for each cycle using an equal probability selection process;1
Changed the way ZIP code regions are formed to reduce weights for this level of sampling;
Incorporated nonresponse adjustments at various stages in the design and revised the employer nonresponse adjustment; and
Implemented corrections to the calculation of the FLA and ZIP code region weights and of the post-stratification day weight.
The reviews were flexible, and their scope was expanded at certain points as new areas of concern were identified and prioritized. However, the reviews raised, but did not resolve, some hypotheses about potential biases and the statistical efficiency of the NAWS methodology. ETA addressed this limitation by authorizing subsequent studies; their results can be found in the Summary of NAWS Nonresponse and Design Studies.
1. Accuracy of point estimates
The 2009 review assessed the accuracy of point estimates for selected variables from data collected in fiscal years 2001-2008. The reviewer checked the statistical programs used to calculate the survey's weights and found that the weight for the first stage of FLA sampling was omitted. The reviewer then compared point estimates calculated using the original weight against those using a revised weight that included the first-stage FLA selection probability, and found no significant or substantive differences. ETA's NAWS contractor added the first-stage FLA weight to its analyses beginning in fiscal year (FY) 2010.
During the 2011-2014 review, the reviewer recreated point estimates from the first drafts of two prospective publications, one reporting 2003-2004 data and the other 2005-2006 data. The reviewer again noted that omitting the first-stage FLA weight may have introduced bias, and also noted that the analysis program failed to correctly specify the stratification variables. When the original and revised estimates were compared, the results again showed that NAWS point estimates were not sensitive to minor changes in the weights or analysis specifications. ETA agreed with these findings. As noted above, the first-stage weight was added to the analysis beginning in FY 2010; analyses of NAWS data correctly specified the stratification variables beginning in FY 2012.
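To illustrate why the stratification specification matters, the following minimal Python sketch computes a Taylor-linearized standard error for a weighted mean using generic stratum and primary sampling unit (PSU) identifiers. It is a textbook ultimate-cluster estimator with hypothetical data, not the NAWS production program.

```python
# Generic Taylor-linearization SE for a weighted mean (ultimate-cluster,
# with-replacement approximation). Illustrative only.
from collections import defaultdict

def weighted_mean_se(y, w, stratum, psu):
    """Weighted mean and its linearized standard error."""
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Sum linearized scores to PSU (ultimate-cluster) totals.
    z = defaultdict(float)
    for yi, wi, h, j in zip(y, w, stratum, psu):
        z[(h, j)] += wi * (yi - mean) / total_w
    # Between-PSU variance within each stratum.
    per_stratum = defaultdict(list)
    for (h, _), z_hj in z.items():
        per_stratum[h].append(z_hj)
    var = 0.0
    for zs in per_stratum.values():
        n_h = len(zs)
        if n_h > 1:
            zbar = sum(zs) / n_h
            var += n_h / (n_h - 1) * sum((zj - zbar) ** 2 for zj in zs)
    return mean, var ** 0.5

# Hypothetical toy data: two strata with two PSUs each.
y = [10, 12, 8, 14, 9, 11, 13, 7]
w = [1.5, 1.5, 2.0, 2.0, 1.0, 1.0, 2.5, 2.5]
strata = ["A", "A", "A", "A", "B", "B", "B", "B"]
psus = [1, 1, 2, 2, 1, 1, 2, 2]
print(weighted_mean_se(y, w, strata, psus))
```

Omitting the stratum identifiers is equivalent to treating every record as belonging to a single stratum, which generally misstates the variance and hence the RSEs used in suppression decisions.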
The reviewer further recommended that estimates be suppressed in first drafts, rather than in final drafts, and that the suppression criteria exclude estimates with an RSE greater than 50 percent. ETA concurred, and the contractor implemented the recommended practices.
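As a concrete illustration of the suppression rule, the sketch below flags any estimate whose RSE exceeds 50 percent. The additional 30 percent "caution" threshold is a hypothetical example; the review specified only the 50 percent suppression criterion.

```python
# Illustrative sketch only -- not the NAWS production code.

def suppression_flag(estimate, standard_error,
                     caution_rse=0.30, suppress_rse=0.50):
    """Return 'report', 'caution', or 'suppress' for a point estimate."""
    if estimate == 0:
        return "suppress"  # RSE is undefined for a zero estimate
    rse = abs(standard_error / estimate)
    if rse > suppress_rse:
        return "suppress"
    if rse > caution_rse:
        return "caution"
    return "report"

# An estimate of 12.0 with a standard error of 7.2 has RSE = 0.60,
# so it would be suppressed in the first draft.
print(suppression_flag(12.0, 7.2))  # -> suppress
```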
2. Sampling
The 2009 review included an assessment of the NAWS sampling design. The 2011-2014 review followed up on the 2009 review’s recommendation to observe NAWS field procedures and to assess worker sampling procedures at farms and ranches where interviews were conducted.
Prior to the 2009 review, OMB had raised concerns that NAWS interviewers were implementing quota sampling by stopping short of completing the full employer allocation at the final employer in the county once the county interview allocation had been met. The 2009 review verified that quota sampling was no longer occurring.
The 2009 review found that FLAs were sampled twice using PPS. The reviewer recommended that only the FLA roster be selected using PPS and that the sample selected from the roster for each cycle be an equal probability sample. This recommendation was implemented, and the new FLA weight was confirmed during the 2011-2014 review.
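A minimal sketch of the revised two-stage FLA selection follows, assuming a hypothetical size measure for each FLA. Stage one draws the roster by sequential PPS without replacement (one common approximation); stage two draws an equal-probability sample from the roster for a single cycle. The actual NAWS selection procedure is the one documented in the Part B supporting statement.

```python
# Two-stage FLA selection sketch with hypothetical FLA names and sizes.
import random

def pps_roster(fla_sizes, roster_size, seed=1):
    """Probability-proportional-to-size roster, drawn sequentially."""
    rng = random.Random(seed)
    remaining = dict(fla_sizes)
    roster = []
    for _ in range(roster_size):
        total = sum(remaining.values())
        r = rng.uniform(0, total)
        cum, chosen = 0.0, None
        for fla, size in remaining.items():
            cum += size
            if r <= cum:
                chosen = fla
                break
        if chosen is None:   # guard against floating-point rounding
            chosen = fla
        roster.append(chosen)
        del remaining[chosen]
    return roster

def cycle_sample(roster, n, seed=2):
    """Equal-probability selection of FLAs from the roster for a cycle."""
    return random.Random(seed).sample(roster, n)

sizes = {"FLA-A": 500, "FLA-B": 300, "FLA-C": 150, "FLA-D": 50}  # hypothetical
roster = pps_roster(sizes, roster_size=3)
print(cycle_sample(roster, n=2))
```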
Finally, the 2009 review examined the appropriateness of the NAWS survey design and concluded that the design was sound and resulted in a nationally representative equal probability sample of migrant and seasonal crop workers.
For the 2011-2014 review, a bilingual member of the reviewer's staff accompanied teams of interviewers as they conducted worker sampling in four locations. Three of the four teams correctly conducted the worker selection raffle; the fourth struggled when attempting it. Additionally, at least one team failed to correctly count worker refusals. As a result, the interviewers received additional training, and they continue to receive ongoing training on field sampling from NAWS statisticians. ETA's contractor also revised its field monitoring protocols to better assess interviewers' ability to conduct worker sampling independently.
The reviewer's report made clear that interviewers, as well as the observer, had trouble understanding the interview allocation instructions when multiple crews were selected. Given that multiple-crew selection occurs only a handful of times per year, ETA and its contractor simplified the process by allowing only one crew to be sampled at each worksite. Interviewers received training on the new procedure and additional training on documenting crew selection.
In the 2009 review, the reviewer noted that the field sampling documentation should be strengthened. ETA concurred with this recommendation. In response, the NAWS contractor developed and implemented new forms for worker sampling documentation that provided much greater detail on the worksite, crew selection, and worker selection. To further encourage accurate documentation, the NAWS interviewers began using a software application that guides interviewers through the selection process and checks the completeness and accuracy of the documentation.
3. Weights
The 2011-2014 review examined the NAWS weight calculations to assess their potential for causing design effects. The reviewer examined both the statistical programs used to calculate the weights and the sections of the Part B supporting statement describing the calculations, and then recreated the NAWS sampling, nonresponse, and post-stratification weights for 2010 and 2011. The results focused on three areas. First, the reviewer noted that the Part B supporting statement was difficult to follow and provided clarifying algebra and text. Second, the review identified needed corrections to the calculation of the probabilities for the FLA and ZIP code region weights, as well as to the weight that adjusts for part-time and full-time workers. Third, the NAWS weights varied widely in size, and some observations had large weights; the reviewer hypothesized that these large weights were a possible cause of the design effects. ETA concurred with the changes to the weights, and the NAWS contractor implemented the recommended changes.
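For readers unfamiliar with multistage weighting, the following sketch shows how a final analysis weight is typically composed: the product of inverse selection probabilities at each stage, multiplied by nonresponse adjustments and a post-stratification factor. The stage probabilities and adjustment factors below are illustrative, not the NAWS formulas.

```python
# Hedged sketch of composing a final analysis weight. Stage names and
# values are hypothetical; the NAWS stages are FLA, county, ZIP code
# region, employer, and worker.

def final_weight(stage_probs, nonresponse_adjs, poststrat_factor=1.0):
    """Inverse-probability base weight times adjustments."""
    w = 1.0
    for p in stage_probs:
        w /= p              # inverse of each stage selection probability
    for a in nonresponse_adjs:
        w *= a              # e.g., (eligible sample) / (respondents)
    return w * poststrat_factor

# Hypothetical worker: selected with probability 0.2 (FLA), 0.5 (county),
# 0.25 (employer), 0.1 (worker); two nonresponse adjustments of 1.25 and 1.1.
print(final_weight([0.2, 0.5, 0.25, 0.1], [1.25, 1.1]))  # -> 550.0
```

The composition makes the reviewer's concern visible: a small selection probability at any single stage, or a large adjustment factor, multiplies directly into the final weight.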
4. Nonresponse adjustments
The reviewer then focused on the NAWS nonresponse adjustments as a possible source of the large weights. To initiate the nonresponse analysis, the reviewer independently calculated the NAWS nonresponse rate and found this rate to be similar to the reported NAWS nonresponse rate, differing by less than a percentage point.
The reviewer then turned to identifying appropriate nonresponse adjustments. In both the 2009 and 2011-2014 reviews, the reviewer assessed the appropriateness of incorporating nonresponse adjustments at each level of sampling versus using global nonresponse adjustments at the cycle-region level. The reviewer concurred that, due to the diversity of worker populations across counties and FLAs within a cycle-region stratum, it was best to apply the nonresponse adjustment at the cycle-region level rather than at the FLA and county levels.
In terms of adjustments at the worker and employer levels, the 2009 review advocated limiting other nonresponse adjustments, particularly at the employer-within-county level. However, after further consideration of the design effect implications during the 2011-2014 review, the reviewer concluded that nonresponse adjustments were appropriate at the worker and employer levels. Further, due to the large weights at the employer level, the reviewer proposed a nonresponse adjustment designed to reduce the large employer weights. ETA concurred, and the new nonresponse adjustment was included in the 2015 ICR submission. The NAWS contractor implemented the change when OMB approved the ICR.
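The mechanics of a weighting-class nonresponse adjustment can be sketched as follows. This is a generic textbook version in which respondents absorb the base weight of nonrespondents in the same class; it is not necessarily the specific employer adjustment adopted in the 2015 ICR.

```python
# Generic weighting-class nonresponse adjustment; illustrative only.
from collections import defaultdict

def adjust_for_nonresponse(records):
    """records: dicts with keys 'class' (weighting class), 'weight'
    (base weight), and 'respondent' (bool). Respondents in each class
    absorb the base weight of that class's nonrespondents."""
    base = defaultdict(float)
    resp = defaultdict(float)
    for r in records:
        base[r["class"]] += r["weight"]
        if r["respondent"]:
            resp[r["class"]] += r["weight"]
    adjusted = []
    for r in records:
        if r["respondent"]:
            factor = base[r["class"]] / resp[r["class"]]
            adjusted.append({**r, "weight": r["weight"] * factor})
    return adjusted

recs = [  # hypothetical employer records
    {"class": "cycle1-region2", "weight": 40.0, "respondent": True},
    {"class": "cycle1-region2", "weight": 60.0, "respondent": False},
    {"class": "cycle1-region2", "weight": 25.0, "respondent": True},
]
# Respondents' weights are scaled by 125/65, preserving the class total.
print(adjust_for_nonresponse(recs))
```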
5. Design effects
For the 2011-2014 review, the reviewer calculated the design effects for several variables used in calculating the National Farmworker Jobs Program (NFJP) population estimate and found those effects to be large. The reviewer's working hypothesis was that the locus of the design effects was the weight computations, the sampling design, or both.
The reviewer calculated design effects using (a) the original weights, (b) weights with the recommended corrections and the new nonresponse adjustment, and (c) weights using an alternative post-stratification procedure. While the nonresponse adjustment did reduce the design effects slightly, none of the changes to the weights made a sizeable reduction in the design effects.
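One standard diagnostic connecting weight variation to design effects is Kish's approximate design effect due to unequal weighting, deff_w = n * sum(w^2) / (sum(w))^2. The sketch below is one plausible way to examine the large-weights hypothesis; the reviewer's exact computations are not described in this summary.

```python
# Kish's approximate design effect from unequal weighting; a standard
# diagnostic, not the reviewer's specific calculation.

def kish_deff(weights):
    n = len(weights)
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return n * s2 / (s1 * s1)

# Equal weights give deff_w = 1.0; a few very large weights inflate it.
print(kish_deff([1.0] * 100))              # -> 1.0
print(kish_deff([1.0] * 95 + [50.0] * 5))  # -> about 10.6
```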
Since the design effects were relatively impervious to revisions of the weights, the reviewer examined the sampling design for sources of large weights. The nonresponse adjustment discussed above was meant to reduce the large grower weights within ZIP code regions. The reviewer then shifted focus to the other geographic weights. As discussed above, due to large variations in crop worker demographics at the local level, further geographic nonresponse adjustments could not be implemented.
Additionally, the reviewer advocated tighter control in releasing the employer sample at the county and ZIP code region levels to reduce large weights and potentially reduce design effects. That is, the reviewer thought that interviewers should finish all the growers in a ZIP code region, to avoid the high weights that result when only a small fraction of employers is contacted. The reviewer did some preliminary analysis on whether this approach reduced employer and ZIP code region weights but did not assess the hypothesized impact on design effects. At the time, ETA deferred action on this recommendation because it was not practical given existing NAWS field conditions. Before further consideration, ETA requested that the NAWS contractor undertake a study of the locus of design effects in the NAWS sampling design. The results of this study can be found in the Summary of NAWS Nonresponse and Design Studies.
6. Regional versus national estimates
The scope of the 2011-2014 review included addressing the sample sizes needed for both regional and national estimates, particularly in the context of estimating the size of the population eligible for participation in the NFJP. The reviewer noted the tradeoff between having an efficient national estimate and having sufficient sample size in small regions to calculate reliable regional estimates. The reviewer recommended that ETA identify precision levels for both the regional and national estimates and calculate the resulting sample sizes, given the design effects that remained after the weight modifications. ETA asked the NAWS contractor to undertake that analysis; the study is currently in progress.
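The kind of calculation the recommendation implies can be sketched as follows: a simple-random-sampling sample size for a target precision, inflated by the design effect. The precision target and design effect in the example are hypothetical.

```python
# Back-of-envelope sample-size calculation; inputs are hypothetical.

def required_n(margin, deff, p=0.5, z=1.96):
    """Sample size for a proportion p estimated within +/- margin at
    95 percent confidence, under a given design effect."""
    n_srs = (z ** 2) * p * (1 - p) / margin ** 2  # simple random sampling
    return int(round(n_srs * deff))

print(required_n(margin=0.05, deff=1.0))  # ~384 under SRS
print(required_n(margin=0.05, deff=3.0))  # ~1,152 with deff = 3
```

The same arithmetic, applied region by region, shows why small regions dominate the total sample requirement when reliable regional estimates are needed.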
7. Clarifying the Part B supporting statement
A focus of both phases of the review was clarifying the Part B supporting statement. OMB had raised concerns about the equations for the probability of selecting workers as presented in the 2008 statement, which collapsed the equations for the worker selection probability and for the nonresponse adjustment into a single equation. The 2009 review verified that both equations were included separately in the subsequent Part B documentation. Further, the revised document correctly included the weights for the revised FLA selection, reflecting a PPS roster and an equal-probability selection from the roster for each cycle.
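For illustration only, the distinction can be written as two separate expressions: the worker's overall selection probability as a product of conditional stage probabilities, and a nonresponse adjustment applied afterward to the base weight. The notation below is hypothetical shorthand, not the NAWS algebra, which appears in the Part B supporting statement.

```latex
% Illustrative notation only; stage labels are hypothetical shorthand.
\[
  \pi_i = \pi_{\text{FLA}} \cdot \pi_{\text{county}\mid\text{FLA}}
          \cdot \pi_{\text{ZIP}\mid\text{county}}
          \cdot \pi_{\text{emp}\mid\text{ZIP}}
          \cdot \pi_{\text{worker}\mid\text{emp}},
  \qquad
  w_i^{\text{base}} = \frac{1}{\pi_i}
\]
\[
  w_i = a_c \, w_i^{\text{base}},
  \qquad
  a_c = \frac{\sum_{j \in s_c} w_j^{\text{base}}}
             {\sum_{j \in r_c} w_j^{\text{base}}}
  \quad \text{for adjustment cell } c \text{ (} s_c \text{ sampled, } r_c \text{ responding)}
\]
```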
During the 2011-2014 review, the reviewer found that the weights documentation in the Part B supporting statement was not sufficiently detailed to allow the weights to be recreated, and proposed more detailed algebra. Subsequent Part B supporting statements included more detail on the nonresponse adjustments, including the new employer nonresponse adjustment and the reviewer's recommended algebra. After the review process, the reviewer was asked to read and comment on the 2015 Part B supporting statement and verified that the requested changes had been implemented.
8. Conclusion
Both the 2009 and the 2011-2014 reviews identified ways to strengthen the NAWS sampling, weighting, and point estimate calculations. Where feasible, ETA and the NAWS contractor implemented the reviewer's recommendations, as reflected in Part B of the 2015 and 2019 supporting statements for this information collection.
At the same time, the reviews left open hypotheses about potential nonresponse bias and the locus of the NAWS design effects. To investigate these hypotheses, ETA tasked the NAWS contractor with four employer nonresponse studies, which are described in Part B of the 2015 supporting statement, and two additional studies: one calculating optimal allocation at the region level and the other examining the locus of the survey's design effects. The status of these six studies and their results can be found in the Summary of NAWS Nonresponse and Design Studies. For FY 2020, the optimal allocation study is being expanded to explore tradeoffs between regional sample size and the precision of the population estimates for the NFJP.
1 The NAWS has a complex sampling design that includes stratification across 12 regions and 3 cycles each year. Each region is divided into FLAs, the primary sampling units, which are composed of a single large county or a cluster of smaller counties with similar crop labor patterns. Within each region, clustering occurs at the level of the FLA, county, ZIP code region, and employer.