PROGRAMME
FOR THE INTERNATIONAL
ASSESSMENT OF ADULT COMPETENCIES
(PIAAC)
2010 FIELD TEST AND 2011/2012
MAIN STUDY DATA
COLLECTION
REQUEST FOR OMB CLEARANCE
Supporting Statement Part B
Prepared by:
National Center for Education Statistics
U.S. Department of Education
Washington, DC
February 16, 2010
DRAFT
B. Collection of Information
The PIAAC target population consists of non-institutionalized adults who at the time of the survey reside in the U.S. (i.e., whose usual place of residence is in the country) and who at the time of the interview are between the ages of 16 and 65 years, inclusive. Adults are to be included regardless of citizenship, nationality, or language. The target population excludes persons not living in households or non-institutional group quarters (for example, military personnel who live in barracks or on bases, and persons who live in institutional group quarters such as jails, prisons, hospitals, or nursing homes). The target population includes full-time and part-time members of the military who do not reside in military barracks or on military bases; adults in other non-institutional collective dwelling units, such as workers’ quarters or halfway homes; and adults living at school in student group quarters, such as a dormitory, fraternity, or sorority.
The main study will consist of a probability-based, nationally representative sample of 5,000 persons. The standard PIAAC design requires random selection methods with calculable probabilities of selection at each stage of sampling for the main study; thus, each person in the target population will have a known, non-zero probability of selection. A four-stage sample design will be employed in which the primary sampling units (PSUs) will be counties or groups of contiguous counties. The second stage will be segments (census blocks or combinations of blocks), the third stage will be dwelling units (DUs), and the fourth stage will involve selecting one or two eligible adults per household. Once dwelling units are selected, a screener interview will be conducted to identify the eligible persons within selected households. A sampling algorithm implemented within the CAPI system will select one or two sample persons among those identified as eligible. Once selected, each sample person completes the background questionnaire (BQ) interview as well as the assessment. The assessments are based on a matrix design, whereby blocks of items have been distributed across multiple test booklets and the booklets are distributed across sample persons. Each person completes only one test booklet; therefore, no participant is asked to complete all cognitive items developed for PIAAC. The booklets for the assessment will be randomly pre-assigned for up to two selected persons in each sampled dwelling unit.
The field test sample will be a non-probability sample of 1,530 persons selected in four stages (discussed further in section B.2.2). For the field test, upon completion of the BQ, the selected person will answer the Information and Computer Technology (ICT) module items. If the respondent passes the ICT Core, the respondent will be randomly assigned either a paper-and-pencil or a computer-based assessment booklet. Those who do not pass the ICT Core, or who refuse it, will be given a paper-and-pencil assessment booklet. This process is required for the psychometric analysis, which relies on random assignment of those who pass the ICT Core to either the paper-based or the computer-based assessment.
For the main study, an initial sample size of close to 10,000 dwelling units is needed to ensure that the target number of completed assessments (5,000) can be achieved. The initial sample size must account for ineligibility (dwelling units without a person 16 to 65 years old, and vacant dwelling units) and screener nonresponse, as well as nonresponse to the BQ module and the assessment. We expect the response rates to meet the NCES standards for response rate goals. In light of response rate declines over the past decade, we expect the overall response rate to be 65 percent, slightly lower than the weighted response rate of 68 percent experienced in the 2003 Adult Literacy and Lifeskills (ALL) survey.
The occupancy rate is expected to be about 85.8 percent. The corresponding 14.2 percent vacancy rate is slightly higher than the actual 2003 ALL vacancy rate (13.6 percent) and the rate in the Census Bureau’s American Community Survey (about 12 percent for 2005-2007), and at the same level as observed in the 2003 National Assessment of Adult Literacy and in recent years in the National Health and Nutrition Examination Survey conducted by Westat. A screener eligibility rate of 85 percent is assumed, based on the proportion of dwelling units that have at least one individual between 16 and 65 years old, inclusive; it is expected to be the same as the 2003 ALL eligibility rate. Table 4 provides a summary of the sample sizes and the eligibility and response rate assumptions at each sampling stage, and a worked calculation following the table shows how the rates combine to produce the expected yields.
Table 4. Main study sample sizes and assumed eligibility and response rates¹

Survey and sampling stages | Eligibility and projected rates | Sample yield
Number of selected PSUs | | 80
Number of selected segments | | 900
Number of selected dwelling units | | 9,947
Occupied dwelling unit rate | 85.8% |
Screener response rate | 90.0% |
Eligibility rate | 85.0% |
Percentage of dwelling units with two sampled persons | 6.0% |
Number of attempted BQs | | 6,921
BQ response rate | 85.0% |
Number of persons with completed BQs | | 5,883
Assessment completion rate | 85.0% |
Number of completed or partially completed assessments | | 5,000
1 The screener, BQ and assessment response rates are consistent with those specified in the NCES standards; the occupied dwelling unit rate is consistent with the rate in the 2003 ALL sample. The eligibility rate and average number of sample persons per dwelling unit were computed from the 2008 Current Population Survey (a joint effort between the Bureau of Labor Statistics and the Census Bureau).
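For illustration only, the sketch below reproduces the expected yields in Table 4 by multiplying the initial dwelling unit sample by the assumed rates; the 1.06 factor reflects the 6 percent of dwelling units expected to contribute a second sampled person. Small differences from the table are due to rounding, and the sketch is not part of the production sampling system.

```python
# Illustrative reconstruction of the Table 4 yields (rounding may differ slightly).
initial_dus = 9_947

occupied_rate = 0.858        # occupied dwelling unit rate
screener_rr = 0.90           # screener response rate
eligibility_rate = 0.85      # at least one person aged 16-65
two_person_factor = 1.06     # 6% of DUs contribute a second sampled person
bq_rr = 0.85                 # background questionnaire response rate
assessment_rr = 0.85         # assessment completion rate

attempted_bqs = initial_dus * occupied_rate * screener_rr * eligibility_rate * two_person_factor
completed_bqs = attempted_bqs * bq_rr
completed_assessments = completed_bqs * assessment_rr

print(round(attempted_bqs))          # ~6,921 attempted BQs
print(round(completed_bqs))          # ~5,883 completed BQs
print(round(completed_assessments))  # ~5,000 completed assessments
```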
For the field test, the initial sample size needs to ensure that the target number of completed assessments (1,500) can be achieved, as well as the required number of respondents who pass the ICT Core (1,300). To do so, the initial sample size needs to account for sample attrition due to ineligibility and nonresponse.
We expect the response rates for the field test to be similar to, but slightly lower than, those experienced in the household component of the 2003 National Assessment of Adult Literacy (NAAL) field test. The data collection period for the PIAAC field test is only three months, slightly shorter than the four-month NAAL field test data collection period. As mentioned in section A.12, even in three months’ time we will be able to incorporate most of the main study approaches for achieving high response rates. (The purpose of this field test is not to pretest cooperation or response rates, as this has been done successfully in previous adult literacy surveys.) The expected response rates and eligibility rates are conservatively estimated in order to calculate the field test sample size; we feel it is operationally essential to have more than enough cases available to work in the field, since there will be no time to release new sample cases if an unexpected shortfall occurs. The response rates are not related to the quality of the non-probability field test sample. The conservative expected overall response rate of 44 percent is computed as the product of the expected screener response rate, BQ/JRA response rate, and assessment response rate; that is, using the rates in Table 5, .44 = .65 × .80 × .85. The occupancy rate is expected to be about 85.8 percent, as in the main study.
We also expect that approximately 85 percent of the 1,530 completed assessments will pass the ICT Core. This rate was derived using the PSU-level computer usage data collected during the 2003 NAAL survey. Table 5 provides a summary of the sample sizes needed at each sampling stage in order to arrive at the target number of completed assessments; a worked calculation following the table shows how the assumed rates combine.
Table 5. Field test sample sizes and assumed eligibility and response rates

Sample Size Component | Rate Component | Sample size | Rate
Number of Primary Sampling Units | | 25 |
Number of Secondary Sampling Units | | 300 |
Initial Dwelling Unit Sample Size | | 4,478 |
 | Dwelling Unit Occupancy Rate | | 85.8%
 | Screener Response Rate | | 65.0%
 | Screener Eligibility Rate (age 16-65) | | 85.0%
 | Percentage of Dwelling Units with Two Sampled Persons | | 6.0%
Initial Person Sample Size | | 2,250 |
 | BQ/JRA Response Rate | | 80.0%
 | Assessment Response Rate | | 85.0%
Expected Number of Completed Assessments | | 1,530¹ |
 | ICT Core Passing Rate | | 85.0%
Expected Number Passing the ICT Core | | 1,300 |
1 The Consortium’s assessment target sample size of 1,500 was inflated by 2 percent (from 1,500 to 1,530) so that the minimum of 1,300 respondents who pass the ICT Core is achieved under the assumption that 85 percent will pass the ICT Core.
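As with Table 4, the following sketch is illustrative only; it reproduces the Table 5 yields and the overall 44 percent response rate from the assumed stage-by-stage rates, with minor differences due to rounding.

```python
# Illustrative reconstruction of the Table 5 yields (rounding may differ slightly).
initial_dus = 4_478

occupied_rate = 0.858
screener_rr = 0.65
eligibility_rate = 0.85
two_person_factor = 1.06      # 6% of DUs contribute a second sampled person
bq_jra_rr = 0.80
assessment_rr = 0.85
ict_pass_rate = 0.85

overall_rr = screener_rr * bq_jra_rr * assessment_rr   # product of the three stage response rates
initial_persons = initial_dus * occupied_rate * screener_rr * eligibility_rate * two_person_factor
completed_assessments = initial_persons * bq_jra_rr * assessment_rr
passing_ict_core = completed_assessments * ict_pass_rate

print(round(overall_rr, 2))          # 0.44 overall response rate
print(round(initial_persons))        # ~2,250 initial person sample
print(round(completed_assessments))  # ~1,530 completed assessments
print(round(passing_ict_core))       # ~1,300 expected to pass the ICT Core
```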
The following sections describe the sample designs for the PIAAC main study and field test. A multi-stage design will be employed for both the main study and the field test, and the sample selection approach is described for each sampling stage.
As mentioned in section B.1, the PIAAC target population consists of non-institutionalized adults 16 to 65 years old who reside in the United States at the time of interview. To arrive at a minimum of 5,000 completed cases, a four-stage, stratified area probability sample is planned that involves the selection of (1) primary sampling units (PSUs) consisting of counties or groups of contiguous counties, (2) secondary sampling units (referred to as segments) consisting of area blocks, (3) dwelling units (DUs), and (4) eligible persons (ultimate sampling unit) within DUs. Random selection methods will be used, with calculable probabilities of selection at each stage of sampling.
For the initial stage of sampling, a total of 80 PSUs will be selected. The PSUs will be formed by combining adjacent counties to reach a minimum population size, respecting state and metropolitan statistical area boundaries, and taking into consideration the travel distance for data collectors. A stratified probability-proportionate-to-size (PPS) sample will be selected, where the measure of size (MOS) is the estimated non-institutionalized population, adjusted from the resident population estimates in the most recent (2008 or 2009) Census Bureau population estimates¹ available for each county. The PSUs with the largest MOS will be selected with certainty (with probability equal to one) before stratification, using a certainty cutoff determined from PPS sampling. One PSU will be selected per stratum, where strata will be formed from variables relating to census region, metropolitan statistical area status, race/ethnicity, poverty, English-speaking ability, and educational attainment. Westat conducted an extensive search for county-level variables for a Small Area Estimation (SAE) task using NAAL data (Mohadjer et al., 2009), and the key predictors of literacy proficiency were related to race/ethnicity, poverty, English-speaking ability, educational attainment, and census division. Strata will be close to equal in size in order to reduce the variation in workload and also to control the variances of the estimates. County data are available from the Census Bureau’s Population Estimates Program and from other sources, including the American Community Survey and the Census Bureau’s Small Area Income and Poverty Estimates program.
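For readers unfamiliar with certainty cutoffs in PPS sampling, the minimal sketch below shows one common textbook construction: PSUs whose measure of size exceeds the sampling interval are taken with certainty, and one PSU is then drawn per stratum with probability proportional to size. The PSU records, stratum labels, and MOS values are hypothetical placeholders; this is not the production selection program.

```python
import random

def certainty_psus(psus, n_selections):
    """Iteratively flag PSUs whose MOS meets or exceeds the sampling interval;
    these are selected with certainty (probability one)."""
    remaining, certainties = list(psus), []
    while remaining and len(certainties) < n_selections:
        interval = sum(p["mos"] for p in remaining) / (n_selections - len(certainties))
        large = [p for p in remaining if p["mos"] >= interval]
        if not large:
            break
        certainties.extend(large)
        remaining = [p for p in remaining if p["mos"] < interval]
    return certainties, remaining

def pps_one_per_stratum(noncertainty, strata):
    """Select one PSU per stratum with probability proportional to MOS."""
    sample = []
    for stratum in strata:
        members = [p for p in noncertainty if p["stratum"] == stratum]
        weights = [p["mos"] for p in members]
        sample.append(random.choices(members, weights=weights, k=1)[0])
    return sample

# Toy usage: 12 hypothetical PSUs, 5 selections.
frame = [{"id": i, "stratum": None, "mos": mos}
         for i, mos in enumerate([900, 120, 80, 75, 60, 55, 50, 45, 40, 35, 30, 10])]
certain, rest = certainty_psus(frame, n_selections=5)
# In practice the non-certainty PSUs would be grouped into (5 - len(certain))
# roughly equal-sized strata; here they are simply assigned round-robin.
n_strata = 5 - len(certain)
for j, p in enumerate(rest):
    p["stratum"] = j % n_strata
selected = certain + pps_one_per_stratum(rest, strata=range(n_strata))
```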
For the second stage of sampling, we propose to select a PPS sample of 900 segments from within the 80 sample PSUs. The segments will consist of at least 60 dwelling units (DUs) in area blocks² (as defined by the 2000 census) or combinations of two or more nearby blocks. The frame of segments will be created within the selected PSUs using the Census 2000 Summary File 1 (SF1) block data. The timing of the data collection for PIAAC is such that the corresponding Census 2010 block data and the first release of the American Community Survey 5-year block group estimates will not yet be available for the segment formation process. If the PIAAC timeline is delayed, it may become possible to use more current data. Within each PSU, the block data from the SF1 files will be sorted by tract, block group, and block number before creating the segments. Blocks with no DUs and no population will be included so that all areas, some of which may contain DUs constructed after the 2000 census, will be included in the formation process.
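The block-combining step can be sketched as follows, assuming blocks are processed in the sorted order described above and accumulated until the 60-DU minimum is reached. The function and data layout are illustrative placeholders; the actual segment formation rules (geographic adjacency, treatment of zero-DU blocks, and size targets) are more involved.

```python
MIN_DUS_PER_SEGMENT = 60

def form_segments(blocks):
    """Greedy illustration: accumulate consecutively sorted blocks into a segment
    until the running dwelling-unit count reaches the minimum, then start a new segment.
    `blocks` is a list of (block_id, du_count) pairs sorted by tract/block group/block."""
    segments, current, du_total = [], [], 0
    for block_id, du_count in blocks:
        current.append(block_id)
        du_total += du_count
        if du_total >= MIN_DUS_PER_SEGMENT:
            segments.append(current)
            current, du_total = [], 0
    if current:                      # fold a short remainder into the last segment
        if segments:
            segments[-1].extend(current)
        else:
            segments.append(current)
    return segments
```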
The third stage of sampling for PIAAC will involve an initial sample of about 9,950 DUs from the frames of addresses in the selected segments in order to arrive at 5,000 completed assessments. All DUs within each selected segment will be listed by trained Westat listers. Given the actual number of listed DUs and the derived sampling rates for each segment, dwelling units will be selected from the listing sheets at the home office. The listers will contact Westat whenever the number of listed DUs falls outside the expected range and will provide any apparent reasons for the discrepancy. From the listings, the address and ID number of each selected DU will be keyed and verified.
The fourth stage of selection involves listing the age-eligible household members (aged 16 to 65) for each selected dwelling unit during the screener interview. Subsequently, one person will be selected at random within dwelling units with three or fewer eligible persons, and two persons will be selected if the dwelling unit has four or more eligible persons. The enumeration and selection of persons will be performed using the CAPI system, which will collect information via the screener instrument, including the age and gender of persons in the dwelling unit, and randomly select eligible respondents. The design selects two persons in dwelling units with a large number of eligible persons to prevent a substantial increase in the variation of the resulting sampling weights.
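The within-dwelling-unit selection rule described above (one person when one to three members are eligible, two when four or more are eligible) could be sketched as below. The actual CAPI algorithm and its randomization scheme are more elaborate; this function is illustrative only.

```python
import random

def select_persons(eligible_members):
    """Randomly select one person if 1-3 household members are age-eligible (16-65),
    or two persons if four or more are eligible."""
    if not eligible_members:
        return []
    n_to_select = 1 if len(eligible_members) <= 3 else 2
    return random.sample(eligible_members, n_to_select)

# Example: a screened household with five eligible members yields two sampled persons.
print(select_persons(["person_A", "person_B", "person_C", "person_D", "person_E"]))
```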
Household members who are away at college (staying in college dormitories) will be considered part of their family’s household. If it is not possible to reach these students at the family home during the data collection period, an interview will be arranged with them at college if they reside within or adjacent to one of the 80 main study PSUs. Westat successfully applied the same procedure in the 2003 ALL survey.
For the PIAAC field test, we plan to use the electronic address listing files from the 2003 NAAL, which will significantly reduce the time and resources needed to select dwelling units. Westat created electronic files of all address listing sheets during the NAAL survey. Therefore, in order to make use of these listings for the multi-stage field test sample, the 2003 NAAL sample of 100 PSUs will be used as the sampling frame for the selection of PIAAC field test PSUs. Coverage inadequacies are not a concern, since the field test is not intended to produce a nationally representative sample; therefore, the 2003 NAAL listings will be used without coverage updating. The following paragraphs discuss the multi-stage sample in more detail.
For the first stage of selection, a non-probability sample of 25 PSUs will be selected from the 2003 NAAL sample of 100 PSUs. The 25 PSUs for the field test will be chosen with the goal of satisfying the demographic requirements of the psychometric testing, which is to arrive at a sample that is fairly evenly distributed across categories of key variables such as age, gender, education, income, and race/ethnicity. To achieve this objective, the 100-PSU 2003 NAAL sample will be stratified by four dichotomous variables: educational attainment (percentage of the population with a high school education or less), income (percentage of the population below 150 percent of the poverty level), race (percentage of the population who are Black non-Hispanic), and ethnicity (percentage of the population who are Hispanic). The two levels of each stratifying variable will be determined by the median across the PSUs. At least one PSU will be chosen from each of the 16 cells formed by the four variables. In addition, the sample will be spread across the four census regions, and four PSUs classified as non-metropolitan statistical areas (non-MSA) will be selected, keeping the MSA distribution of PSUs proportionate to the population (4 of 25 PSUs, or 16 percent).
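The median-split stratification described above might be implemented along the lines of the sketch below, which labels each PSU with a tuple of four above/below-median indicators, yielding up to 16 cells. The variable names and records are hypothetical; the actual stratification is carried out by the sampling statisticians.

```python
from statistics import median

# Placeholder names for the four PSU-level stratifiers described above.
VARIABLES = ["pct_hs_or_less", "pct_below_150_poverty", "pct_black_nonhisp", "pct_hispanic"]

def dichotomize(psus, variables=VARIABLES):
    """Split each PSU-level percentage at its median across PSUs (1 = above median),
    then label each PSU with the tuple of indicators; four variables give 16 possible cells."""
    cutoffs = {v: median(p[v] for p in psus) for v in variables}
    for p in psus:
        p["stratum"] = tuple(int(p[v] > cutoffs[v]) for v in variables)
    return psus
```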
In the second stage of selection, segments selected for 2003 NAAL will constitute the frame for the segment sample. We will select 12 segments from the NAAL sample within each of the 25 chosen PSUs for a total of 300 segments. This will result in an equal workload design with an average of five completed cases per segment.
The third stage of sample selection will involve a sample of dwelling units from the 2003 NAAL listing of addresses in each sample segment, with an average of about 15 selected dwelling units in each of the 300 selected segments. The listing of dwelling units will be stratified by segment and sorted within each segment by line number (the order listed). To avoid overlap, any dwelling units included in the 2003 NAAL sample will be removed from the frame. A systematic random sample will then be selected within each segment.
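Systematic selection within a segment, after removing addresses already used in the 2003 NAAL, might look like the following sketch. The sampling interval, random start, and address identifiers are illustrative; the production system would apply segment-specific sampling rates.

```python
import random

def systematic_sample(addresses, n_wanted):
    """Select an equal-probability systematic sample from a list of addresses
    sorted by their original listing order (line number)."""
    if n_wanted <= 0 or not addresses:
        return []
    interval = len(addresses) / n_wanted
    start = random.uniform(0, interval)
    return [addresses[min(int(start + k * interval), len(addresses) - 1)]
            for k in range(n_wanted)]

# Example: draw about 15 dwelling units from a segment listing of 90 addresses,
# after dropping hypothetical addresses that were sampled for the 2003 NAAL.
listing = [f"address_{i:03d}" for i in range(90)]
naal_2003 = {"address_010", "address_042"}
frame = [a for a in listing if a not in naal_2003]
print(systematic_sample(frame, 15))
```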
In the fourth stage, interviewers will screen dwelling units to determine whether they include any eligible respondents, as defined by the PIAAC target population, and to select one or two persons for the interview and assessment. All household members in selected dwelling units will be enumerated as part of the screener interview, conducted through the CAPI system. Household members who are away at college (staying in college dormitories) will be considered part of their family’s household. If it is not possible to reach these students at the family home during the data collection period, an interview will be arranged with them at college if they reside within or adjacent to one of the 25 field test PSUs. Westat successfully applied the same procedure in the 2003 ALL survey.
A random selection algorithm will be programmed into CAPI, and a sample of those that are age eligible will be selected. As in the main study, the algorithm will select one person at random within dwelling units with three or fewer eligible persons, and two persons will be selected if the dwelling unit has four or more eligible persons.
For the main study, sampling weights will be produced to facilitate the estimation of the target population parameters. Replicate weights will be computed to facilitate variance estimation, and will capture the variation due to the sample design and selection, as well as weighting adjustments.
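As a generic illustration of how replicate weights support variance estimation, the sketch below forms paired-jackknife (JK2) replicates by doubling the weight of one first-stage unit in each variance pair and zeroing the other, then estimating the variance from the spread of replicate estimates. This is a textbook construction, not the PIAAC weighting specification, and all names are placeholders.

```python
def jk2_replicate_weights(records, pairs):
    """records: list of dicts with 'unit' (variance unit id) and 'weight'.
    pairs: list of (unit_a, unit_b) tuples defining variance strata.
    Returns one list of replicate weights per pair."""
    replicates = []
    for unit_a, unit_b in pairs:
        rep = []
        for r in records:
            if r["unit"] == unit_a:
                rep.append(2.0 * r["weight"])   # double one half-sample
            elif r["unit"] == unit_b:
                rep.append(0.0)                 # drop the other half-sample
            else:
                rep.append(r["weight"])         # leave all other units unchanged
        replicates.append(rep)
    return replicates

def jk2_variance(full_estimate, replicate_estimates):
    """JK2 variance: sum of squared deviations of the replicate estimates
    from the full-sample estimate."""
    return sum((est - full_estimate) ** 2 for est in replicate_estimates)
```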
The estimation procedures for the PIAAC data are prescribed by and are the responsibility of the international sponsoring agency. The United States will comply with these procedures and policies by delivering masked data (note that a disclosure analysis will be conducted prior to submitting the data to the international contractor so as to comply with current federal law), and documentation of sampling and weighting variables. All data delivered to the PIAAC Consortium will be devoid of any data that could lead to the identification of individuals.
There are no anticipated problems that would require specialized sampling procedures, nor will there be any use of periodic data collection cycles to reduce burden.
For the field test, the minimum expected sample size mentioned above is required for the Differential Item Functioning (DIF) analysis. The DIF analysis will be conducted to identify and correct items that perform poorly (including problems with translation and scoring procedures) and to examine item characteristics for establishing comparability, that is, to evaluate the equivalence of item parameters in two respects: the linking of items from ALL to PIAAC and the linking between the paper-and-pencil and computer formats. The analysis will identify items that should be examined more closely by subject-matter specialists for possible bias and for possible omission from the testing instrument. Items with potential bias are those that are differentially difficult for members of different subgroups who have comparable overall scores.
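One standard screen for DIF of this kind is the Mantel-Haenszel procedure, sketched below: examinees are stratified by total score, the common odds ratio of answering the studied item correctly (reference versus focal group) is estimated across strata, and values far from 1 (or large values on the ETS delta scale) flag items for expert review. This sketch is illustrative only and is not the Consortium’s specified DIF methodology.

```python
import math
from collections import defaultdict

def mantel_haenszel_dif(responses):
    """responses: list of (group, total_score, correct) where group is 'ref' or 'focal'
    and correct is 0/1 on the studied item. Returns (common odds ratio, ETS delta-scale DIF)."""
    tables = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})  # [correct, incorrect]
    for group, score, correct in responses:
        tables[score][group][0 if correct else 1] += 1

    num = den = 0.0
    for cell in tables.values():
        a, b = cell["ref"]     # reference group: correct, incorrect
        c, d = cell["focal"]   # focal group: correct, incorrect
        t = a + b + c + d
        if t == 0:
            continue
        num += a * d / t
        den += b * c / t
    if den == 0 or num == 0:
        return float("nan"), float("nan")
    alpha_mh = num / den
    mh_d_dif = -2.35 * math.log(alpha_mh)   # ETS delta metric
    return alpha_mh, mh_d_dif
```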
In order to meet the PIAAC response rate goals, NCES will rely on procedures and approaches that have been used successfully over many years of conducting household studies. Building good response rates begins with hiring field staff who have the experience and skills to persuade people to cooperate, and training them not only to administer the instrument and follow the study procedures, but also to convince respondents to participate.
NCES views gaining respondent cooperation as an integral part of a successful data collection effort and will invest the resources necessary to ensure that the procedures are well developed and implemented. We will use an advance contact strategy that has been successfully employed on many large in-person household studies. Advance materials, including a letter and an informative brochure (provided in Appendix D), will be mailed to all selected households in advance of the data collector’s initial visit. These advance materials will inform potential respondents of NCES enabling legislation; the purposes for which the PIAAC data are needed; uses that may be made of the data; and the methods of reporting the data to ensure confidentiality. All project materials will include the study’s web site address and a toll-free telephone number for respondents to obtain additional information about the study. The materials will also mention the respondent incentive and will include the study logo for legitimacy purposes. It is very important for the data collector to establish legitimacy at the door, which can be accomplished by the use of a strong introductory statement during which the data collector shows their ID badge and a copy of the advance materials.
Effective contact patterns are another important component of achieving high response rates. Completion rates improve when data collectors attempt contact on different days of the week and at varying times of the day. We propose that data collectors make four well-timed attempts to contact a household before reviewing the case with the supervisor to identify another pattern of contact. Other contact strategies may include telephone calls, FedEx letters, or leaving messages with neighbors. We plan to staff each PSU with two data collectors. It is advantageous to have multiple data collectors in a PSU because it allows better matching between data collectors and sampled households and provides coverage in case of data collector illness or unavailability. In carrying out efforts to achieve high response and participation rates, we propose to organize our data collection using a phased approach that allows for refusal conversion.
Each data collector will receive a laptop computer loaded with the Interviewer Management System (IMS). This system allows data collectors to launch all CAPI instruments and permits tracking of their work and time. Data collectors will use the electronic record of call (EROC) feature of the IMS to collect information about each visit to a household that did not result in a completed interview. EROC information will include: contact date and time, contact result or disposition code, appointment information, and general data collector comments. The EROC data are very helpful in documenting the results of contact attempts for nonresponding households, and in helping to design a more directed and effective campaign to convert the nonresponding households. All nonresponse followup and refusal conversion efforts also will be tracked and documented in the IMS.
Whenever a refusal or breakoff is encountered, the data collector will complete an automated noninterview report (NIR) that captures information about the reason for the refusal. Automated EROC and NIR information is available to the supervisors via data transmission from the data collectors to the home office and subsequent transmission to the supervisors. Contact and refusal information will be collected, coded, and included in the biweekly data collection progress report. NCES believes that frequent, open communication among all levels of field staff is required for a successful data collection effort. Supervisors will primarily use email for day-to-day communication with their staff. Scheduled weekly conference calls will also be used at all levels. All supervisory staff will be available daily by telephone and email for questions or other issues that arise.
The U.S. is participating in a full field test for PIAAC. The PIAAC field test will provide an opportunity for testing several facets of sampling. The main objectives of the sampling activities are to:
Provide a sample of adults that will be used to validate the test items to be included in the psychometric assessment;
Test the within-household sample selection process;
Train field staff in the sampling activities;
Test the Quality Control (QC) sampling-related procedures; and
Test the flow of materials and the sample data from sample selection to the delivery of the Sample Design International File (SDIF) at the end of the data collection.
The following are responsible for the statistical design of PIAAC:
Leyla Mohadjer, PIAAC Consortium/Westat; and
Kentaro Yamamoto, PIAAC Consortium/Educational Testing Service.
Westat will be the contractor responsible for sampling activities:
Leyla Mohadjer, Vice President; and
Tom Krenzke, Senior Statistician.
Analysis and reporting will be performed by:
Kentaro Yamamoto, Educational Testing Service.
1 The U.S. Census Bureau Population Estimates Program produces estimates of the resident population at the county level. An adjustment will be made to estimate the non-institutionalized population for each county.
2 Blocks are very fine partitions of the United States, formed using visible semi-permanent features such as roads, railroad tracks, mountain ridges, bodies of water, and power lines. The only invisible boundaries used are county, state, and national boundaries. Minor civil division boundaries and property lines are ignored. A block group is a small group of contiguous blocks. A tract is a collection of contiguous block groups, all within the same county.