Supporting Statement B for
EVALUATION OF NIAID’S HIV VACCINE RESEARCH EDUCATION INITIATIVE
HIGHLY IMPACTED POPULATION SURVEY
(NIAID)
November 18, 2009
Project Officer:
Katharine Kripke, Ph.D.
Assistant Director, Vaccine Research Program
Division of AIDS, NIAID, NIH, DHHS
6700 B Rockledge Drive, Room 5144
Bethesda, MD 20892
Telephone: 301-594-2512
Fax: 301-402-3684
E-mail: [email protected]
Supporting Statement Section B
Table of Contents
B.1. Respondent Universe and Sampling Methods
B.2. Procedures for the Collection of Information
B.3. Methods to Maximize Response Rates and Deal with Non-Response
B.4. Test of Procedures or Methods to Be Undertaken
B.5. Individuals Consulted on Statistical Aspects and Individuals Collecting and/or Analyzing Data
Collections of Information Employing Statistical Methods
The proposed survey will be conducted with adults living in the United States, with particular focus on specific populations that are highly impacted by HIV/AIDS: African Americans, Hispanic/Latinos, and men who have sex with men (MSM). A random stratified sample of residential addresses will form the basis of a General Population sample; augment samples of African Americans, Hispanic/Latinos, and MSM will be created to improve the precision of the estimates associated with these groups. Data will be collected primarily by telephone or online survey.
1.1 Respondent Universe and Survey Objectives
The populations of interest for this study are all English- or Spanish-speaking adults age 18 or older residing in the 50 United States, as well as specific subpopulations known to be highly impacted by HIV/AIDS:
(1) African Americans (AA)
(2) Hispanic/Latinos (H/L)
(3) Men who have sex with men (MSM).
Table B.1-1 provides information from U.S. census estimates on the expected population distribution for African Americans and Hispanic/Latinos within the U.S. population.1 Estimates of MSM are more difficult to obtain, but the most often cited reference is a 1994 book by Michael et al., which places the number of men who have had sex with men since the age of 18 at about 5 percent of the U.S. adult male population.2
Table B.1-1 2008 Census Estimates of the Adult Population

| | Estimate | Proportion of General U.S. Population |
|---|---|---|
| U.S. Population | 221,419,638 | 100% |
| African American | 27,836,291 | 12.6% |
| Hispanic/Latino | 30,851,076 | 13.9% |
The primary objective for this survey is to provide point estimates with respect to knowledge, attitudes, and beliefs for the general population and for each of the three highly impacted populations. For the highly impacted populations, the survey will provide point estimates with a sampling error of +/- 3 percentage points at the 90 percent confidence level.
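The precision target above can be related to a sample size under the usual normal approximation for a proportion. The following is a minimal sketch, not the contractor's calculation: it assumes the most conservative proportion (p = 0.5), simple random sampling, and a two-sided 90 percent critical value of 1.645. The function names are illustrative only.

```python
import math

Z_90 = 1.645  # assumed two-sided 90 percent critical value of the standard normal


def margin_of_error(n_effective, p=0.5, z=Z_90):
    """Half-width of the confidence interval for a proportion (normal approximation)."""
    return z * math.sqrt(p * (1.0 - p) / n_effective)


def required_effective_n(margin, p=0.5, z=Z_90):
    """Smallest effective sample size whose margin of error is at most `margin`."""
    return math.ceil(z ** 2 * p * (1.0 - p) / margin ** 2)


# An effective sample of 750 gives a margin very close to 3 percentage points.
print(margin_of_error(750))
# Exact requirement under these assumptions.
print(required_effective_n(0.03))
```

Under these assumptions the requirement comes out at roughly 750 effective cases, which is the figure used for each highly impacted population later in this statement.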
The study will involve four samples. The first, the General Population Sample, will be a random stratified sample of U.S. households. In order to provide estimates at the level specified, additional cases will be provided through three augment samples: (1) the African American Augment; (2) the Hispanic/Latino Augment; and (3) the MSM Augment. The samples are drawn from two different survey frames, as described below.
1.2 Survey Frames
The survey relies on an Address-Based Sampling (ABS) frame for the General Population sample, the African American augment sample, and the Hispanic/Latino augment sample. These three samples will be drawn from an ABS frame based on the U.S. Postal Service’s Delivery Sequence File, a list of all residential addresses in the United States. The MSM augment sample will be generated using online panels. Because the proportion of MSM in the country is relatively small, and because screening large numbers of persons for sexual orientation is impractical and introduces bias, samples from online panels maintained by Knowledge Networks and Survey Sampling International will serve as the basis for the MSM sample. The justification for these two frames follows.
Address-Based Sample. ABS frames are developed from a comprehensive database of all residential household addresses in the United States. Their use has grown considerably in recent years as an alternative to more traditional random digit dial (RDD) approaches that have been in use for the past 30 years. The growing popularity of ABS frames is related to the increasing problems associated with the exclusion of cell phone listings from RDD samples and the avoidance of such problems when employing ABS frames.
The proportion of Americans who rely solely or mostly on a cell phone has been growing steadily over the past decade, while the proportion of persons who regularly use a household landline phone to accept their incoming calls has been declining. Since RDD samples are developed exclusively from the area codes and telephone exchanges applicable to household-based landline phone numbers, an increasing proportion of households is not being reached through this sampling approach.
The National Center for Health Statistics (NCHS) reported that in 2008 “cell phone-only” households missed by RDD landline samples comprise about 18.4 percent of all U.S. households, more than double the level of just 3 years earlier.3 In addition, over the past 2 years, NCHS has also begun tracking another related threat derived from the expanded use of cell phones. This relates to the growing proportion of U.S. households that have both landline telephones and cell phones but that report that they receive all or nearly all of their incoming calls on their cell phones. NCHS estimates that such households, which are referred to as “cell phone-mostly” households, now constitute about 14.4 percent of all U.S. households.
The demographics of the people at risk of being excluded from RDD sample frames show that they have distinctive characteristics, so their exclusion reduces the representation of certain population subgroups, especially younger adults. Thus, the exclusion of cell phone-only and cell phone-mostly households from RDD landline samples introduces the potential for non-coverage bias into surveys developed exclusively from this frame.
This coverage bias is minimized when developing household samples from an ABS frame. With over 99 percent coverage, one of the most compelling attributes of ABS is that it provides samples with virtually zero selection bias. This is because ABS samples are drawn through a random selection of residential addresses from the U.S. Postal Service’s Delivery Sequence File. ABS samples include nearly all types of households, including those with regular city style address listings, those with P.O. Box listings, as well as those with drop-point listings, which include multiple residential units with a single street address. Deliberately excluded are non-permanent addresses (such as seasonal or vacation homes), group quarters (i.e., prisons, barracks, and dormitories), and throw back units, which are households that have both a P.O. Box and city style address but only want mail sent to their P.O. Box.
Samples of households developed from an ABS frame can be used effectively for surveys like this one, which employs telephone interviews as one mode of data collection, because they capture each of the various types of telephone households, including landline-only telephone households, cell phone-only households, and cell phone-mostly households. Implementing telephone surveys from an ABS sample frame allows most households to be approached through multiple modes. Persons at addresses with matching landline telephone numbers (estimated at about 60 percent of the sample) can be recruited through both telephone calls and mailers, while persons at addresses without matching telephone numbers must be recruited through mailings alone. Additional details about data collection procedures may be found in Section B.2.
Knowledge Networks (KN). To augment MSM cases found in the course of collecting data from the ABS sample, we propose to approach adult gay/bisexual males who were previously identified after enrollment in KN’s KnowledgePanel®. KN provides access to a national online panel of 50,000 U.S. residents, age 18 and older, that is based on a nationally representative sample. The panel size fluctuates because of the addition of panelists from the ongoing recruitment and because of voluntary withdrawals and retirements of panelists reaching the end of their panel tenure. Individuals stay on the panel for an average of 2 years, and the panel is replenished by ongoing recruitment efforts each quarter. Up until about one year ago, the sample was drawn from an RDD frame. The sample is now shifting to an ABS sample frame to resolve issues related to cell phone use, and currently about one-third of the panel is derived from an ABS sample.
By providing computers and Internet access to sampled individuals who would not otherwise be able to respond online, KN addresses the issue of the so-called “digital divide,” which prevents some 30 percent of U.S. households from taking part in online panels. By recruiting individuals from a random, representative sample, where each sampled person has a known statistical probability of selection, KN attempts to minimize the effects of “professional respondents” who self-select into “opt-in” online panels. The concern is that a sample composed entirely of persons who are interested in participating in opt-in online surveys may be biased when compared to a representative, random sample of persons, some of whom are not interested and would need to be heavily recruited and some of whom would not participate at all. In an opt-in sample, the number and characteristics of reluctant responders are unknown, and their responses are impossible to estimate. In the KN panel, the number and characteristics of nonresponders are known and can be investigated and estimated.
The demographic composition of the KN panel compares very well to that of the general population. Year after year, KnowledgePanel® closely tracks benchmarks from the Current Population Survey (CPS), published jointly by the U.S. Census Bureau and the Bureau of Labor Statistics. Attachment E, comparing KnowledgePanel® adult members to the December 2008 Current Population Survey (CPS), shows this close approximation with CPS benchmarks.
The use of the KN sample for studies of public and health policy is not new. Founded in 1998, KN conducts a wide range of research spanning the fields of public policy, health policy and services, epidemiology, environmental protection, political science, sociology, and social psychology. Researchers in these and other fields have conducted text-based and multimedia surveys using the web-enabled Knowledge Networks Panel because it is based on a probability sample of U.S. households designed to be representative of the U.S. population. KN customers include academic researchers from Stanford University, Duke University, Harvard University, and New York University, as well as government researchers at the FDA, NOAA, EPA, USDA, CDC, and other Federal agencies.
Initial cooperation rates for the KN panel are in the 15–20 percent range overall. Specific survey cooperation rates for recent studies with minimal field periods and followup range from 55 to 80 percent. Thus, overall response rates have been in the 8–16 percent range. We expect to increase the survey response rate through extending the field period, through sending a second invitation e-mail, and through two reminder telephone calls. (See Section B.3. Methods to Maximize Response Rates and Deal with Non-Response.) In addition, nonresponse analysis will be conducted at the end of the data collection period. Substantial information about nonresponders is available from KN at both the initial cooperation level and the survey specific level, so no additional data collection will be required from nonresponders.
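The overall response rate range quoted above follows from multiplying the two stages of cooperation. A short sketch of that arithmetic, using only the rates stated in this paragraph (the function name is illustrative):

```python
def overall_response_rate(panel_cooperation, survey_cooperation):
    """Overall rate = share who joined the panel x share of panelists who complete the survey."""
    return panel_cooperation * survey_cooperation


# Low end: 15 percent panel cooperation and 55 percent survey cooperation.
low = overall_response_rate(0.15, 0.55)
# High end: 20 percent panel cooperation and 80 percent survey cooperation.
high = overall_response_rate(0.20, 0.80)

print(f"overall response rate between {low:.4f} and {high:.4f}")
```

The products are 0.0825 and 0.16, which correspond to the 8–16 percent range cited.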
About one-third of the MSM augment sample will be obtained from a second online panel. The panel developed by Survey Sampling International (SSI) is recruited through multiple online methods and includes individuals with cell phone service as well as those with landline telephone service. Though a convenience sample, this panel is very large, with about 3,000,000 Americans. A random sample of 263,000 males has been previously screened for sexual orientation; approximately 9,000 have identified themselves as gay/bisexual.4 Data from the SSI sample will be benchmarked against the KN data, which are based on a random sample and for which extensive response and nonresponse data are available.
While the use of the KN/SSI panel sample is not a perfect solution, we recommend the panel approach because of the significant drawbacks to the alternative methods of gathering the MSM population data (i.e., RDD or ABS). Given that the incidence of MSM in the general population is only about 5 percent,5 there is no method for selecting a national sample of MSM that is both bias-free and cost-effective. Screening the general population for sexual orientation by mail or telephone is problematic because information provided on such short data collection instruments is likely to be biased. Furthermore, screening the entire general population is cost-prohibitive. Of course, the most expensive and inefficient option is collecting data from all randomly sampled U.S. households and then discarding the more than 90 percent of responses that are not from MSM. We do not recommend that strategy for obvious reasons.
An alternative used in the past is to sample MSM from enclaves that have a high concentration of gays and bisexuals. Most large cities have such enclaves where the incidence of the MSM population rises into the 20–30 percent range. However, such enclaves are very small islands in the large ocean of America; hence, only a small portion of MSM live within them. Moreover, the men in these areas are likely to be different from MSM residing outside of the enclaves. We think this is particularly true with regard to HIV/AIDS, given that the enclaves were devastated by the epidemic in the 1980s and 1990s. In contrast, note that the MSM in the KN panel are geographically diverse, as shown in Attachment F.
A disproportionate sampling alternative, where all parts of the country are sampled but the enclaves are over-sampled, is also problematic since the cost of screening outside the enclaves is very high and the cases from the enclaves have to be weighted down considerably in order to keep design effects within reasonable bounds.
This section discusses sampling procedures for each of the four samples, as well as data collection procedures.
2.1 Sampling Procedures
As indicated earlier, the full target population for this study consists of adults ages 18 and older living in the 50 United States. The U.S. Postal Service’s Delivery Sequence File will be used as the Address-Based Sampling frame. The key advantage of this frame is that it allows sampling of almost all U.S. households: an estimated 99 percent of U.S. households are covered.
Data will be collected in two waves. In the first wave, we will release approximately two-thirds of the sample we anticipate needing for the entire study. After we have been in the field about 10 weeks, we will evaluate response rates and target sample sizes. The remaining sample, just enough to complete the survey, will be released soon thereafter for each stratum and each sample.
Each of the three highly impacted population estimates will require an effective sample size of 750 cases to provide a point estimate with a sampling error of +/- 3 percentage points at the 90 percent confidence level. However, weighting often increases the variances of survey estimates. The inflation due to weighting, which is commonly referred to as the design effect, can be approximated by:

$$\text{deff} \approx \frac{n \sum_{i=1}^{n} W_i^{2}}{\left(\sum_{i=1}^{n} W_i\right)^{2}}$$

where $W_i$ represents the final weight of the $i$th respondent and $n$ is the number of respondents. Consequently, the effective sample size will decrease as the variability in applied weights increases. While the resulting design effect for a particular survey estimate will not be known until the final survey weights have been computed, effective stratification and mindful oversampling will help maintain a control on this effect. It is our expectation that, for this study, the overall design effect for most survey estimates will be less than 2, and the target sample sizes are based on this assumption.
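To illustrate how weight variability reduces the effective sample size, the sketch below assumes the widely used Kish approximation for the design effect due to weighting (the number of respondents times the sum of squared weights, divided by the square of the summed weights). The weights shown are hypothetical and are not taken from this study.

```python
def kish_design_effect(weights):
    """Approximate design effect due to unequal weights (Kish approximation, assumed here)."""
    n = len(weights)
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    return n * sum_of_squares / (total * total)


def effective_sample_size(weights):
    """Nominal sample size deflated by the design effect."""
    return len(weights) / kish_design_effect(weights)


# Hypothetical example: half of 1,000 respondents carry weight 1, half carry weight 3.
example_weights = [1.0] * 500 + [3.0] * 500
print(kish_design_effect(example_weights))      # design effect for these weights
print(effective_sample_size(example_weights))   # effective number of cases
```

With equal weights the design effect is exactly 1; in the hypothetical example it is 1.25, so 1,000 completed interviews behave like 800 for precision purposes.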
Table B.2-1 Summary of Sampling Strategies for the Highly Impacted Population Survey

| Sample | Sample Frame | Target Number of Completed Surveys |
|---|---|---|
| U.S. Population | Stratified random sample from ABS | 1,000 |
| African American Augment | ABS sample from neighborhoods with high AA density (yields cases for both the AA and Hispanic/Latino estimate) | 801 |
| Hispanic/Latino Augment | ABS sample from neighborhoods with high Hispanic density (yields cases for both the AA and Hispanic/Latino estimate) | 801 |
| Gay/Bisexual Men Augment | Knowledge Networks | 450 |
| | Survey Sampling International | 250 |
In the following sections we describe procedures for selecting respondents. Different strategies are employed to create the four samples: (1) General Population; (2) African American Augment; (3) Hispanic/Latino Augment; and (4) MSM Augment. The strategies are summarized in Table B.2-1.
2.1.1 General U.S. Adult Population
We propose surveying 1,000 adults age 18 or older living in the United States in English and Spanish as part of the general adult cross-section sample. As a first step, a random stratified sample of U.S. household addresses will be drawn from an ABS sample frame. The sample will be stratified by Census Region. This sample will then be matched against available telephone directory and commercial telephone-matching services to identify households for which a telephone number can be found. We estimate that about 60 percent of all address listings nationwide will likely yield a telephone match, while the remainder will not. The procedures for selection and for collecting data will differ according to whether the telephone number has been matched.
Matched sample. Each household with a telephone number will be mailed a pre-notification card and then will be called by Field Research telephone interviewers. Persons who prefer to complete the survey in Spanish will be offered that option or be directed to the Spanish-language version of the online questionnaire.
Consistent with established RDD procedures for randomly selecting persons within a household, if the person who answers the telephone reports that more than one adult resides in the household, the person who is the target of the survey will be randomly selected. We will employ a method that minimizes the use of intrusive questions that can negatively affect survey participation. The procedure begins by asking the individual how many adults reside within the household and takes advantage of the fact that approximately three-quarters of all households have only one or two adults in residence. In households where only one adult resides, no respondent selection procedure is required, and interviews are attempted with that adult. In households where two adults reside, CATI randomly selects either the initial adult contacted or the household’s other adult. In households where three or more adults reside, CATI determines whether the adult being screened should be the randomly selected respondent after first giving each adult an equal chance of being selected. If that adult is not selected, then the “most recent birthday” method is used to identify which of the other household adults should be the selected respondent. Once an adult is selected, repeated attempts are made to reach and complete an interview with that individual.
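The within-household selection rule described above can be sketched as follows. This is an illustration only, not the CATI system's actual logic: the function name and return labels are hypothetical, and it assumes the screened adult is selected with probability one over the number of adults in the household.

```python
import random


def select_respondent(num_adults, rng=random):
    """Decide who in the household is the target respondent.

    Returns one of three hypothetical labels:
      "screened"                   - the adult on the phone is selected
      "other"                      - the household's only other adult is selected
      "other_most_recent_birthday" - ask which other adult had the most recent birthday
    """
    if num_adults < 1:
        raise ValueError("a household must contain at least one adult")
    if num_adults == 1:
        return "screened"  # no selection step is needed
    # Give the screened adult the same chance as every other adult.
    if rng.random() < 1.0 / num_adults:
        return "screened"
    if num_adults == 2:
        return "other"
    return "other_most_recent_birthday"


# Example: simulate the rule for a four-adult household with a seeded generator.
demo_rng = random.Random(2009)
print(select_respondent(4, demo_rng))
```

Only the number of adults is asked up front; the birthday question is needed only when the screened adult is not selected in a household of three or more adults, which keeps intrusive questions to a minimum.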
The a priori random selection procedure within households is implemented only for the outgoing telephone interviews, which are expected to comprise the majority of completes. For other data collection procedures, we propose to implement a post hoc procedure to investigate bias related to lack of randomization.
Once a household is selected by random sample, persons within both matched and unmatched samples who complete surveys online will not be randomly selected because of feasibility concerns and a decrease in response rate that is likely to occur. For example, it would be difficult to set up an online system to ensure that someone different from the initial online respondent signs on and completes the survey if the initial respondent were not selected. We intend to collect information from everyone willing to respond online, including information on household composition. Using the same randomization algorithm implemented for the matched telephone survey, each online respondent will be assigned a flag indicating whether they would or would not have been randomly selected. Post hoc analyses will be conducted to determine whether cases that are not in the randomized sample would introduce bias. If bias were shown to exist, study estimates would be limited to respondents flagged as randomly selected. Because we expect that bias will be negligible and that instituting an online randomization strategy could compromise the validity of the data and result in the loss of data from all households where the initial respondent is not selected, we prefer the post hoc analysis strategy.
Unmatched Sample. Households for which a matching telephone number cannot be found will be sent multiple recruitment mailings printed in both English and Spanish. Persons from the unmatched sample will respond to the survey online or will set up an appointment for an interview. Both English and Spanish versions will be available. We do not propose to make use of random household selection among the unmatched sample. We do expect to implement the post hoc analysis scheme for identifying bias discussed in the previous paragraph.
2.1.2 African American Augment Sample
In order to obtain a precision of three percentage points at the 90 percent confidence level, an effective sample size6 of about 750 African Americans after statistical weighting is needed, assuming no design effect. African American respondents will come from three sources: (1) African American adults identified and surveyed as part of the overall adult sample, (2) African American adults identified from an augment sample of U.S. households in high-density African American areas, and (3) African American adults identified from the augment sample of households in high-density Hispanic/Latino areas. Since African American adults now comprise about 12 percent of the U.S. adult population, we expect that our random sample of 1,000 U.S. adults will retrieve interviews with approximately this proportion, yielding slightly more than 100 African Americans nationwide.7
In addition, we propose to complete an additional 801 interviews with African Americans nationwide by gathering data from an ABS sample of U.S. households that is limited to census blocks in which 40 percent or more of the population is black or African American. It is estimated that over 55 percent of the African American population lives in communities that are more than 40 percent African American.
We will also include African Americans from the Hispanic/Latino augment sample. Based on population data, we expect that out of 900 interviews completed from the high-density Hispanic/Latino sample, 99 interviews will be completed with African American adults. Persons with neither a Hispanic/Latino nor an African American background will be screened out.
This will bring the total number of interviews completed among African Americans to at least 1,000.
However, because this part of the sample will exclude African American adults living in those parts of the country where less than 40 percent of the population is African American, some post-survey statistical weighting will be required when combining these results with the other African American sample interviews. This statistical adjustment will have the effect of slightly reducing this sample’s effective sample size, but even after such weighting we estimate that the 900 African American respondents from both augment samples will yield an effective sample size of 650 African Americans nationwide. Thus, when combined with the other 100 interviews completed from the random sample of U.S. adults, this will yield survey results with an effective sample size large enough to meet the desired +/- 3 percentage point sampling error threshold at the 90 percent confidence level with a design effect less than 2.0.
The methods used to develop random samples of African Americans from the sample augment will be similar to those developed for the U.S. adult sample. ABS sample listings will first be matched against available telephone directory listings and called by telephone whenever a match is found. Listings for which a matching telephone number could not be found will be mailed a letter printed in English and Spanish instructing recipients that the survey can be completed either online or by telephone. The same protocol of pre-notification and reminder mailings used for the general population will be used for this sample.
Because the goal of the augmented sample is to find only eligible African American adults, household spokespersons will be asked several eligibility questions before completing the survey. Those being called by telephone or who attempt to complete the survey online will be asked their racial background and ethnicity, and the survey will continue with those identifying themselves as black or African American. Those mailing back a response card (i.e., from the unmatched households) will also be asked to indicate their racial and ethnic background on the response card (along with several other demographic questions), with callbacks attempted only with those who indicated that they were black or African American. Only individuals completing the full survey will be provided with the $20 incentive.
2.1.3 Hispanic/Latino Augment Sample
To obtain a precision of three percentage points at the 90 percent confidence level, we estimate that an effective sample size of about 750 Hispanic/Latinos is needed, assuming no design effect. We propose to develop the Hispanic/Latino sample from three sources: (1) Hispanic/Latino adults identified and surveyed as part of the overall U.S. adult sample, (2) Hispanic/Latino adults identified from the augmented sample of African Americans living in high-density African American areas, and (3) another augmented sample of U.S. households whose occupants live in high-density Hispanic/Latino neighborhoods. The expected number of completed interviews that would be captured from each of these sample sources is outlined below.
Since Hispanic/Latinos comprise about 14 percent of the U.S. adult population, we expect that the adult public sample will retrieve approximately this same proportion, thereby yielding a random sample of about 140 Hispanic/Latino adults nationwide.
Additional Hispanic/Latino cases will be obtained from the African American augment sample. According to population data, the proportion of Hispanic/Latino adults living in high-density African American areas (i.e., 40 percent or more African American) is slightly lower than the national average. Again, we will only interview African American, Black, and Hispanic/Latinos from the augmented samples. We estimate that out of 900 interviews completed from the African American augment, 99 Latino adults will be reached and interviewed in addition to the 801 African Americans.
In addition, we propose completing an additional 801 interviews with Hispanic/Latino adults nationwide by implementing a third ABS sampling of U.S. households, but one limited to households in census blocks in which at least 40% of the population is Hispanic/Latino. Thus, the total number of interviews conducted with Hispanic/Latino adults from all three sample sources will be at least 1,000.
Because a large number of the Hispanic/Latino interviews will be completed with Hispanic/Latinos living in high-density African American or Hispanic/Latino areas, some post-survey statistical weighting will be required to combine results across the three samples, reducing the effective sample size. After such weighting, we estimate that the sample of 1,000 completed interviews will produce survey results with an effective sample size large enough to meet the desired +/- 3 percentage point sampling error threshold at the 90 percent confidence level with a design effect less than 2.0.8
The methods used to develop random samples of Hispanic/Latino adults from the two Hispanic/Latino sample augments will be similar to those described for the U.S. adult sample and the African American sample augment. ABS sample listings will first be matched against available telephone directory listings and called by telephone whenever a match is found. The same protocol of pre-notification and follow-up/reminder mailings used in the other samples will be used for this sample in order to boost response rates.
Because the goal of the two augmented samples is to find only eligible Hispanic/Latinos, surveys will only be completed among households in which a Hispanic/Latino resides. Those being called by telephone or who attempt to complete the survey online will be asked their racial background and ethnicity, and the survey will only continue with those identifying themselves as Latino or of Hispanic origin. Those mailing back a response card will also be asked to indicate their racial and ethnic background on the card (along with several other demographic questions), with callbacks only attempted with those who indicated that they are Latino or of Hispanic origin. Only individuals completing the full survey would be provided the $20 incentive.
2.1.4 MSM Adult Augment Sample
Developing a nationwide sample of adult MSM presents significant challenges. The following is a description of how we propose to implement this portion of the survey.
First, we estimate that about one-half of interviews completed with the adult public sample survey of 1,000 will be conducted with men, and that about 5 percent of these males will report (in the survey) having had sex with other men. This would yield a sample of about 25 adult MSM. Using the same logic, we expect approximately 23 additional MSM will come from the African American over-sample and 23 from the Latino over-sample.
To augment these cases, we propose to use adult gay/bisexual males enrolled in KnowledgePanel®, a nationally representative, probability-based panel developed by KN through a dual recruitment strategy of RDD and ABS. MSM will be drawn from those panelists who selected the “gay” or “bisexual” response option to “Do you consider yourself to be …”9 Like most general population surveys/panels, KN does not ask about the gender of sex partners and, hence, cannot pre-identify MSM who do not self-identify as gay or bisexual. This issue is discussed in some detail below.
KnowledgePanel® contains 658 gay/bisexual men; from this total, we expect to get 450 completed interviews for the study. We propose to augment these with approximately 250 additional men from the Survey Sampling International (SSI) online panel, enough to bring us to the total number needed for the study.10 Out of the 9,000 gay/bisexual men previously identified from a random sample of 263,000 men in the panel, 3,000 potential respondents will be randomly selected as the sample for the study. A series of sub-samples will be developed and released in groups in order to obtain the target number of completed interviews. Not all 3,000 individuals are likely to be invited to participate.
Because the SSI panel is an opt-in panel, we will use statistical weighting procedures to ensure that the combined data set is appropriately constituted. To calculate these weights we will use demographic data from the KN panel sample as benchmarks. This approach has been used successfully in a number of previous projects including the Joint Advertising, Market Research & Studies Advertising Tracking Study (JAMRS) conducted for the U.S. Department of Defense. The methodology has also been the subject of multiple papers and presentations at the meetings of the American Association for Public Opinion Research and the Joint Statistical Meetings.
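The specific weighting procedure is not detailed here. As a rough illustration of the general idea, the sketch below assumes simple cell-based post-stratification on a single demographic variable, in which opt-in panel cases are weighted so that their distribution matches benchmark shares taken from the probability-based panel. The category labels, shares, and function name are hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_categories, benchmark_shares):
    """Weight each case by (benchmark share) / (sample share) of its category."""
    counts = Counter(sample_categories)
    n = len(sample_categories)
    weights = []
    for category in sample_categories:
        sample_share = counts[category] / n
        weights.append(benchmark_shares[category] / sample_share)
    return weights


# Hypothetical opt-in sample: 6 younger and 4 older respondents.
opt_in_sample = ["18-34"] * 6 + ["35+"] * 4
# Hypothetical benchmark distribution from the probability-based panel.
benchmarks = {"18-34": 0.4, "35+": 0.6}

weights = poststratification_weights(opt_in_sample, benchmarks)
print(weights[0], weights[-1])
```

In this hypothetical case the over-represented younger group is weighted down and the under-represented older group is weighted up, while the weights still sum to the number of cases. In practice more than one benchmark variable would typically be used.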
2.1.5 Summary of Sampling Plan
Table B.2-2 displays a summary of the sampling plan for the HIP survey. The first two columns display the name of the sample and the methods used to derive the sample. The next four columns display the number of cases from each sample that will be used to compute the general population, African American, Hispanic/Latino, and MSM estimates. The total number of completed surveys is expected to be 3,500. Adding column totals across the bottom row leads to a higher number, since the same cases contribute to more than one estimate. For example, individuals in the General Population Sample contribute to all three highly impacted population estimates, and the African American and Hispanic/Latino augment samples contribute to the MSM estimate.
Table B.2-2 Source of Cases for Each of the HIP Estimates
(The four right-hand columns show the Number of Cases Contributing to Estimate.)

| Sample | Method | General U.S. Pop | African American | Hispanic/Latino | MSM |
|---|---|---|---|---|---|
| General Population | All households in ABS frame | 1,000 | 100* | 140* | 25* |
| African American Augment | High Density AA Augment from ABS frame | 0 | 801 | 99 | 23* |
| Hispanic/Latino Augment | Hispanic Surname from ABS frame | 0 | 99 | 801 | 23* |
| Augment MSM | KN/SSI panel of Gay/Bisexual Men | 0 | 0 | 0 | 700 |
| Total | | 1,000 | 1,000 | 1,040 | 771 |
* These cases provide information for more than one estimate.
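The relationship between the column totals and the 3,500 completed surveys can be checked with a short calculation. The sketch below copies the cell values from Table B.2-2 and takes the completed surveys per sample from Table B.2-3 (for example, 600 matched plus 400 unmatched for the General Population sample). The short dictionary keys are labels introduced here for illustration.

```python
# Cases contributing to each estimate, copied from Table B.2-2.
contributions = {
    "General Population":       {"gen": 1000, "aa": 100, "hl": 140, "msm": 25},
    "African American Augment": {"gen": 0,    "aa": 801, "hl": 99,  "msm": 23},
    "Hispanic/Latino Augment":  {"gen": 0,    "aa": 99,  "hl": 801, "msm": 23},
    "Augment MSM":              {"gen": 0,    "aa": 0,   "hl": 0,   "msm": 700},
}

# Completed surveys per sample, from the "# Complete" column of Table B.2-3.
completes = {
    "General Population": 600 + 400,        # matched + unmatched
    "African American Augment": 540 + 360,  # matched + unmatched
    "Hispanic/Latino Augment": 540 + 360,   # matched + unmatched
    "Augment MSM": 450 + 250,               # KnowledgePanel + SSI
}

estimates = ("gen", "aa", "hl", "msm")
column_totals = {
    est: sum(row[est] for row in contributions.values()) for est in estimates
}

unique_total = sum(completes.values())                 # each case counted once
overlap = sum(column_totals.values()) - unique_total   # extra counting of shared cases
```

The difference between the summed column totals and the number of completed surveys reflects the cases marked with an asterisk, which are counted under more than one estimate.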
2.2 Sample Size
Sample size and sample yield estimates are shown in Table B.2-3. The estimates are based on various data sources, contractor experience with similar studies, and our plan for substantial follow-up with nonresponders.
Table B.2-3 Sample Yields

| Sample | # in Sample | # Screened | # Eligible a | # Complete |
|---|---|---|---|---|
| Matched b | | | | |
| General Population | 2,000 | 790 | 750 | 600 |
| African American Augment | 2,250 | 900 | 675 | 540 (481 AA/59 Hispanic/Latino) |
| Hispanic Augment | 2,250 | 900 | 675 | 540 |
| Unmatched c | | | | |
| General Population | 3,509 | 702 | 667 | 400 |
| African American Augment | 4,000 | 800 | 600 | 360 (320 AA/40 Hispanic/Latino) |
| Hispanic Augment | 4,000 | 800 | 600 | 360 |
| MSM Panels d | | | | |
| Knowledge Panel | 643 | 643 | 643 | 450 |
| SSI Panel | 2,500 e | 2,500 | 2,500 | 250 |
| Total | 20,875 | 8,219 | 7,164 | 3,500 |
a. Eligibility assumptions are based on U.S. Census data and information from Marketing Systems Group, our sample vendor.
b. Calculations for matched households assume (1) a 40 percent response rate using AAPOR Response Rate 3; (2) approximately 20 percent of the initial sample will include out-of-scope telephone numbers; (3) an 80 percent cooperation rate among identified eligibles.
c. Calculations for unmatched households assume (1) 20 percent of households are screened, either by returning their response card or going online; (2) 60 percent of eligible respondents that return a response card will complete the survey; (3) about 40 percent will be unreachable in seven attempts.
d. For the online panels, respondents are pre-screened in order to be selected for the panel. Cooperation rates have been provided by Knowledge Networks and SSI based on previous experience with the panels.
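The completion counts in Table B.2-3 follow from the rates stated in footnotes b and c. The sketch below is a minimal check under those stated assumptions: it applies the 80 percent cooperation rate to matched eligibles, and the 20 percent screening rate and 60 percent completion rate to unmatched households. The function and variable names are introduced here for illustration, and the panel completes are taken directly from the table.

```python
MATCHED_COOPERATION = 0.80    # footnote b, assumption (3)
UNMATCHED_SCREENING = 0.20    # footnote c, assumption (1)
UNMATCHED_COMPLETION = 0.60   # footnote c, assumption (2)


def expected_count(base, rate):
    """Apply a rate to a count and round to the nearest whole case."""
    return round(base * rate)


samples = ["General Population", "African American Augment", "Hispanic Augment"]

# Values copied from Table B.2-3.
matched_eligible = dict(zip(samples, [750, 675, 675]))
unmatched_in_sample = dict(zip(samples, [3509, 4000, 4000]))
unmatched_eligible = dict(zip(samples, [667, 600, 600]))
panel_completes = {"Knowledge Panel": 450, "SSI Panel": 250}

matched_completes = {
    s: expected_count(n, MATCHED_COOPERATION) for s, n in matched_eligible.items()
}
unmatched_screened = {
    s: expected_count(n, UNMATCHED_SCREENING) for s, n in unmatched_in_sample.items()
}
unmatched_completes = {
    s: expected_count(n, UNMATCHED_COMPLETION) for s, n in unmatched_eligible.items()
}

total_completes = (
    sum(matched_completes.values())
    + sum(unmatched_completes.values())
    + sum(panel_completes.values())
)
```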
2.3 Weighting
Virtually all survey data are weighted before they can be used to produce reliable estimates of population parameters. While reflecting the selection probabilities of sampled units, weighting also attempts to compensate for practical limitations of survey sampling, such as differential non-response and under-coverage. Furthermore, by taking advantage of auxiliary information about the target population, weighting can render the sample more representative of the target universe. The weighting process for this survey includes the following major steps:
Calculation of Design Weights will be carried out in the first step to reflect the design-imposed disproportional allocation of the sample. Here, base weights will be calculated as the reciprocals of the selection probabilities. This will be necessary because the needed sample will be selected from different strata, with varying selection probabilities in each stratum to increase the efficiency (hit rates) for minority subgroups of interest. Specifically, at this step adjustments will be made to compensate for over-sampling of Hispanic/Latino and African American respondents.
Adjustment for Frame Multiplicity will be required to the extent there is any overlap between the various frames that will be used for sample selection. While all address-based samples selected from the Delivery Sequence File (DSF) of the Postal Service will be generated from mutually exclusive frames, selection of males who have sex with other men will entail use of an overlapping frame. In order to adjust for the increased chance of selection for respondents from such households, their design weights will be adjusted accordingly.
Adjustments for Non-response and Under-coverage are an essential part of survey weight calculations. For this purpose, the above weights will be further adjusted so that the aggregated final weights match reported counts for the eligible population with respect to the available demographics. In this step, an iterative proportional fitting (raking) procedure will be used to simultaneously adjust the multiplicity-adjusted design weights to the counts of eligible adults, which will be obtained from the Current Population Survey (CPS).
Given the absence of Census and other official data on the gay/bisexual male population, we will consider the possibility of weighting the MSM sample data to bring the distribution of demographic characteristics into line with consensus views about the actual distribution of these characteristics in the U.S. population, based on the existing body of LGBT population socio-demographic research. A final decision will be made after a review of the sample and its most important characteristics.
Several additional steps will be taken while calculating the survey weights, including imputation of missing data and trimming of extreme weights. Missing data for demographic variables needed during the weighting process will be imputed in such a way as to maintain the observed distribution of such items. In order to avoid undue inflation of the variances of survey estimates, extreme base weights will be trimmed prior to the final raking adjustment. These standard steps will be taken to construct final survey weights that can produce unbiased estimates for the population parameters of interest.
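The weighting steps described above (base weights as reciprocals of selection probabilities, trimming of extreme weights, and raking to population counts) can be sketched as follows. This is a minimal illustration, not the contractor's production procedure: the respondents, selection probabilities, trimming cap, and population targets are all hypothetical, and real targets would come from the CPS.

```python
def trim_weights(weights, cap):
    """Cap extreme weights at a (hypothetical) upper bound."""
    return [min(w, cap) for w in weights]


def weighted_totals(records, weights, var):
    """Sum the weights within each category of one variable."""
    totals = {}
    for rec, w in zip(records, weights):
        totals[rec[var]] = totals.get(rec[var], 0.0) + w
    return totals


def rake(records, weights, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting (raking).

    records: list of dicts holding each respondent's category labels
    weights: starting weights (here, trimmed base weights)
    targets: {variable: {category: population total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        largest_change = 0.0
        for var, totals in targets.items():
            current = weighted_totals(records, w, var)
            # Scale weights so this variable's margins match its targets.
            for i, rec in enumerate(records):
                factor = totals[rec[var]] / current[rec[var]]
                w[i] *= factor
                largest_change = max(largest_change, abs(factor - 1.0))
        if largest_change < tol:
            break
    return w


# Hypothetical respondents; the AA stratum is sampled at a higher rate.
records = [
    {"sex": "M", "group": "AA"},
    {"sex": "F", "group": "AA"},
    {"sex": "M", "group": "Other"},
    {"sex": "F", "group": "Other"},
    {"sex": "F", "group": "Other"},
    {"sex": "M", "group": "AA"},
]
selection_prob = [0.02, 0.02, 0.005, 0.005, 0.005, 0.02]

base = [1.0 / p for p in selection_prob]     # design (base) weights
trimmed = trim_weights(base, cap=150.0)      # hypothetical trimming cap

# Hypothetical population counts for the raking margins.
targets = {
    "sex": {"M": 480.0, "F": 520.0},
    "group": {"AA": 130.0, "Other": 870.0},
}
final = rake(records, trimmed, targets)
```

After raking, the weighted totals for each category of each raking variable agree with the targets to within the convergence tolerance, even though the starting weights reflected the oversampling.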
Variance Estimation for Weighted Data from Complex Surveys. Survey estimates can only be interpreted properly in light of their associated sampling errors. Since weighting often increases variances of estimates, use of standard variance calculation formulae with weighted data can result in misleading statistical inferences. With weighted data, two general approaches for variance estimation can be distinguished. One is Taylor Series linearization, in which a nonlinear estimator is approximated by a linear one, and then the variance of this linear proxy is estimated using standard variance estimation methods. The second method of variance estimation is replication, in which several estimates of the population parameters under the study are generated from different, yet comparable parts of the original sample. The variability of the resulting estimates is then used to estimate the variance of the parameters of interest using one of several replication techniques, such as Balanced Repeated Replication (BRR) and Jackknife. There are several statistical software packages that can be used to produce design-proper estimates of variances using linearization or replication methodologies, including:
SAS: http://www.sas.com
SUDAAN: http://www.rti.org/sudaan
WesVar: http://www.westat.com/westat/statistical_software/wesVar
Stata: http://www.stata.com
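To make the replication approach concrete, the following minimal sketch computes a delete-one jackknife variance for a weighted mean. It is illustrative only: it treats every respondent as its own replicate unit, whereas a design-proper analysis would form replicates from the strata and sampling units of the actual design using one of the packages listed above. The outcome values and weights are hypothetical.

```python
def weighted_mean(y, w):
    """Weighted mean of y with weights w."""
    return sum(yi * wi for yi, wi in zip(y, w)) / sum(w)


def jackknife_variance(y, w):
    """Delete-one jackknife variance of the weighted mean.

    Each replicate drops one respondent and recomputes the estimate;
    the spread of the replicates around the full-sample estimate
    gives the variance estimate.
    """
    n = len(y)
    full = weighted_mean(y, w)
    replicates = []
    for k in range(n):
        y_k = y[:k] + y[k + 1:]
        w_k = w[:k] + w[k + 1:]
        replicates.append(weighted_mean(y_k, w_k))
    return (n - 1) / n * sum((r - full) ** 2 for r in replicates)


# Hypothetical yes/no survey item (1 = yes) with final survey weights.
answers = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
weights = [1.2, 0.8, 2.5, 0.6, 1.0, 1.9, 0.7, 1.3]

estimate = weighted_mean(answers, weights)
std_error = jackknife_variance(answers, weights) ** 0.5
```

With equal weights this estimator reduces to the familiar sample variance divided by the sample size, which is a useful sanity check on the implementation.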
2.4 Survey Procedures
The Highly Impacted Population (HIP) survey will be conducted using both telephone and online data collection methods. Data collection procedures for the HIP survey involve the management of three different data collection processes for three different groups: (1) ABS Matched Sample, (2) ABS Unmatched Sample, and (3) KN/SSI Panel. For the ABS Matched Sample, a telephone number is located that is matched to the address. This allows us to make telephone calls directly to the household. For the ABS Unmatched Sample, there is no telephone number available. Data collection for this group will occur either online or by telephone if the respondent calls in or sends in a card requesting an interview. For the KN/SSI panel, contact with participants is handled through a Panel Vendor, and e-mails or reminder calls are sent to respondents who have self-identified as gay or bisexual. Regardless of group, all of the recruitment materials will convey the information in both English and Spanish, and respondents can choose to complete the survey by telephone or online in English or Spanish as well.
ABS Matched Sample. Households with matched telephone numbers will receive a pre-notification postcard as well as two subsequent reminders, one in a postcard format and one in letter format. These recruitment materials may be found in Attachment D.
The pre-notification postcard will alert households that the contractor will be contacting them about an important study being conducted for NIH. The postcard will also contain information about the $20 incentive and the website at which the survey can be completed online. The postcard will include a toll-free call-in number at which respondents can leave a message with a preferred time to be called back for an interview. If at all possible, respondents calling in will be interviewed immediately, without an appointment.
In order to gain attention, the card will be oversized and printed in a bright color. A postcard is considered to be the best initial contact because it does not require the opening of an envelope to get its message across to the potential respondent.
After the postcard, we will also make up to five attempts to reach and screen these households by phone. An additional seven phone calls will be made to complete interviews with respondents we have identified as eligible.
If there is no response after five call attempts, a nonresponse letter will be sent. This letter will be printed on NIH letterhead and will carry an NIH insignia on an outer envelope developed in accordance with standards specified by the NIAID communications office. The letter will include a request for participation from the NIAID Project Officer. The letter will include, as before, the website address, information about the $20 incentive, and a telephone number to create an appointment for an interview. The letter will also provide additional information about the survey as well as a telephone number to call for more information.
After a waiting period of a few days, five more calls will be made to the respondent. If these calls are ineffective, a final postcard will be sent requesting participation. The final postcard will be similar to the initial postcard but will include more urgently worded language as well as the online survey website.
Most cases in the Matched Sample will have landline numbers, although some individuals calling in may request a cell phone appointment. For landline numbers, initial telephone contact attempts will be made during the afternoon and early evening hours on weekdays and throughout the day on weekends to maximize the chances of including both working and non-working adults. Callbacks will be made at different times and on different days to increase the probability of finding qualified adults available for the interview. All calls to cell phones will be initiated at the preferred times provided by potential respondents on their response cards; all of these respondents will come from the households without matched phone numbers. As contact efforts unfold, appointments for callbacks will be made for the convenience of all potential study respondents regardless of which part of the sample they come from.
For those telephone numbers where we repeatedly encounter answering machines, we will leave a message on the answering machine after the fourth attempt, explaining the purpose of the call and our desire to include that household or respondent in the survey. A second message will be left after the seventh attempt to the listing. Subsequent messages will be left as appropriate given the call history for the listing. Each message will reference the availability of the toll-free 800 number that potential respondents can call to contact the contractor directly to complete the survey, as well as the website at which the survey can be completed online. The contractor has developed a number of techniques to maximize telephone survey response, and these techniques are described in detail in Section B.3.
ABS Unmatched Sample. Households without matched phone numbers will receive a pre-notification postcard like the one that is being sent to the Matched Sample. However, the card will not include language about incoming telephone calls, since we will have no valid number for unmatched addresses. Respondents will be asked to call in to set an appointment for a telephone interview or to respond online.
If there is no response, we will send out a letter invitation that is similar to the NIH letter being sent to the Matched Sample. The letter invitation will consist of a business-size envelope containing a letter with a tear-off response card at the bottom and a pre-paid business reply envelope. The response card will ask for the respondent’s name, telephone number, whether the phone number is a cell or landline number, and whether the number is a residential/personal number or a business number. It will ask for the preferred days and times for us to contact them. In addition, it will gather data on gender and age (if sent to the general population sample) and also race/ethnic identification (if sent to one of the race/ethnic over-sample populations.) Once a response card has been received by the contractor, eligible respondents will receive up to seven calls in order to get a completed interview.
If there is no response, a second, similar letter with more urgent wording will be sent. As with the Matched Sample, if there is no response after 8 weeks, a final postcard requesting assistance will be sent.
Nonresponse Data Collection. At close of data collection, we will send a mailing to a sample of 3,000 nonresponders (Matched and Unmatched). The mailing will include a $2 non-contingent incentive, along with a prepaid postcard asking the respondent to provide information about gender, age, race, ethnicity, and whether they have been close to anyone with HIV/AIDS. These data will be used to calculate nonresponse adjustments.
KN/SSI Panelists. Panelists will first receive an initial e-mail invitation that is similar to the invitation letter from NIH. The e-mail will include information on how the data are being used, where to go for further information, and the $5 incentive.
If there is no response, two reminder e-mails will be sent. Finally, 6 weeks after the start of the field period, up to two reminder calls will be made to panel nonresponders.
For those who start the contractor’s online survey but fail to complete the interview, three e-mail reminder followups will be generated to those who provide their e-mail address when they sign into the site. Those going to the site will be encouraged to provide their e-mail address as a way of helping them should they be “unable to complete the survey.”
Attachment D contains all supporting documentation for the procedures described in this section, including the pre-notification postcard, response card, invitation letter, reminder postcard and letter, initial phone call scripts, cold call refusal conversion scripts, and call with concern scripts.
Table B.2-4 summarizes the data collection process for the three groups and describes the use of the different letters, postcards, and scripts.
Table B.2-4 Data Collection Process with Respondent Contacts

| ABS Matched Sample (Addresses AND Phone Numbers) | ABS Unmatched Sample (Address with NO Phone Numbers) | Online Panel (Vendor manages e-mail contact) |
|---|---|---|
| Send Pre-Notification Cards | Send Pre-Notification Cards | Send Introductory E-mail |
| Online data collection throughout field period (see Survey Instrument) | Online data collection throughout field period (see Survey Instrument) | Online data collection throughout field period (see Survey Instrument) |
| Start calling sample with known phone number using Screener and Survey Instrument. Voicemail Message left. | Mail Invitation Letter with Response Card. Incoming cards will be screened and entered into CATI. Calls made according to Screener and Survey Instrument. Voicemail Messages left. | |
| Mail Reminder Letter to non-respondents. | Mail Reminder Card to non-respondents. | Send Follow-up E-mail |
| Calls made; Voicemail Messages left, online data collection. | Mail Second Invitation Letter and Response Card to non-respondents. | |
| Mail Reminder Card to non-respondents | Calls made; Voicemail Messages left, online data collection. | Send Follow-up E-mail |
| Final calls made | | Two Reminder Calls |
| END DATA COLLECTION | | |
| NON-RESPONSE ADJUSTMENT MAILING | | |
Though the target response rate for surveys is 80 percent, previous contractor experience with telephone studies indicates that a response rate of about 30 percent is more likely. High response rates minimize selection bias in survey findings, so several procedures will be implemented to maximize the response rate. Survey response rates are more robust when the research topic is salient to the respondent, when the questionnaire has been designed for maximum ease of administration, when multiple data collection modes are implemented, when the field period is extended, and when the data collection protocol is tailored through a variety of incentives and accommodations to acknowledge respondents’ cooperation and contribution. The presentation of the survey is also important, so that respondents can differentiate it from other mail and research requests.
The introductory pre-notification card or e-mail with the link to the survey will indicate that it is sponsored by NIAID, a prestigious NIH institute known to be at the forefront of HIV/AIDS research. The card will be colorful and oversized, so that it stands out from all other correspondence. The first contact with the potential ABS respondent will utilize a card rather than a letter since a card removes the necessity of opening an envelope, which is a potential barrier to receiving the information.
The pre-notification card/e-mail will also describe the monetary incentive, which is intended to acknowledge the respondent’s time and cooperation. The card will also carry a telephone number that the respondent can call to set an appointment for an interview.
Invitation letters will be sent on NIH letterhead and will be signed by the NIAID Project Officer. Telephone scripts include the NIH name, mention of the incentive, and a place to call. All potential respondents will receive multiple reminders delivered over several months. Two follow-up mailings will be sent to both groups of households: those with a matching telephone number and those without. Mailings will include a website address where the respondent can complete the survey online. Providing nonresponders with a choice of methods for completing the survey is expected to improve response rates.11 Online administration of the survey is expected to greatly increase the ease of data collection for persons who are computer literate.
Similarly, two reminder e-mails will be sent to the KN/SSI panelists after the initial invitations are sent. For those who start Field Research’s online survey but fail to complete the interview, three e-mail reminder followups will be generated to those who provide their e-mail address when they sign into the site. Those going to the site will be encouraged to provide their e-mail address as a way of helping them “should they be unable to complete the survey.”
Multiple calls will be made to households with telephone numbers, and these calls will be synchronized with reminder mailings to maximize response rates. Calls will be made at different times and on different days to increase the probability of finding qualified adults to complete the survey, and where possible, appointments will be made at dates and times specified by the respondent to maximize respondent convenience and cooperation. Telephone messages that include the website of the online survey will be left for nonresponders.
Refusal conversion for telephone surveys has become a critical component of successful survey efforts. Response rates have declined significantly over the past decade for virtually all telephone surveys of the general public in the United States. The following are procedures that we will employ to increase the rate of response on both the telephone and online surveys.
Most telephone refusals occur at the onset of the interview attempt. Although some are not preventable, our experience is that a certain proportion can be persuaded to cooperate. Training will be provided on how to adjust a refusal script according to the mood of the person making the refusal. For example, those who appear to be initially uninterested will be approached differently from those who are diffident and lacking confidence. This procedure, coupled with continued training on how to make initial contact and the emphasis on the importance of minimizing refusals, has proven effective in holding down the rate of initial refusals.
When refusals are encountered, procedures will be established for interviewers to record their impressions of the respondent’s reason for refusing and any other information that may be relevant in helping to gain a completed interview in a subsequent attempt, including the name of the interviewer who obtained the refusal. Initial refusals other than those adamant about not being called again (“hard refusals”) will be called again. Under this approach, even though the household resulted in an initial refusal, a second “cold call” attempt will be made to complete the interview as if the initial refusal had never occurred. Such calls are typically made at different times of day and on different days of the week than when the initial refusal occurred. It has been our experience that this form of “cold call” refusal conversion is successful in converting about 10 percent of all initial refusers.
For households that continue to refuse, or where a respondent began the survey but broke off (again excluding “hard refusals” adamant about not being called again), a specially trained team of refusal conversion interviewers will approach the household a third time, using a “call with concern” procedure. In these calls, interviewers consult the recorded details of the prior refusals for information that might help convert the refusal.
A number of techniques will be utilized to maximize survey response among non-English speakers in the telephone survey. One procedure is to make the Spanish language version available to interviewers at the onset of data collection. Our CATI systems are designed to enable an interviewer to seamlessly switch between the English and Spanish language versions of the questionnaire during the call, enabling all bilingual interviewers to make the initial household approach in either language. Thus, by employing a large number of bilingual interviewers fluent in both English and Spanish, many of these initial contacts can be handled without callbacks and can be converted immediately into completed interviews during the initial call.
Another procedure that has proven to have a positive impact on improving response rates when calling non-English language households involves the management of the sample. The sample management protocols assign in-language callbacks to interviewers fluent in each language, enabling the prompt scheduling of callbacks to households that require a non-English language interviewer. And since all interviews in all languages are conducted in-house from the survey data collection subcontractor’s (Field Research Corporation) own central location interviewing call centers, we are able to maintain maximum control over the management of the non-English samples and their assignment to appropriate interviewers.
Consistent with the response rate calculations approved by the American Association for Public Opinion Research (AAPOR), response rates for this study will be calculated as follows:
Response Rate = Number of Completed Surveys / (Number of Completed Surveys + Number of Nonrespondents)
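The response rate formula above can be expressed as a small function. This is a minimal sketch: the function name is introduced here for illustration, and the example counts are hypothetical rather than projected results for this study.

```python
def response_rate(completed, nonrespondents):
    """Completed surveys divided by completed surveys plus nonrespondents."""
    if completed < 0 or nonrespondents < 0:
        raise ValueError("counts must be non-negative")
    total = completed + nonrespondents
    if total == 0:
        raise ValueError("at least one sampled case is required")
    return completed / total


# Hypothetical example: 300 completed surveys and 700 nonrespondents.
example_rate = response_rate(300, 700)
```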
When constructing the survey instrument, items used previously in other surveys by other NIH Institutes and Centers or organizations were carefully evaluated for inclusion. The survey instrument was tested through cognitive interviews with nine individuals similar to those who will be surveyed. In response to their comments, questions were revised, dropped, or combined; response categories were added to several items; and several small wording changes were made.
A pre-test of the online program and procedures was conducted with nine individuals to verify procedures and the fidelity of the instrument.
The contractor analyzing information for the NHVREI will be NOVA Research Company (NOVA). NOVA will subcontract data collection to Field Research. Responsibility for collecting and analyzing information obtained through the methodologies described above will rest with NOVA. All data collection and analysis will be performed in compliance with OMB, Privacy Act, and Protection of Human Subjects requirements.
1 AA, H/L, and Gen Pop data accessed on August 3, 2009, at the U.S. Census http://www.census.gov/popest/national/asrh/.
2 Michael, R., Gagnon, J., Laumann, E., & Kolata, G. (1994). Sex in America: A definitive survey. Boston: Little, Brown.
3 http://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless200905.htm. Accessed August 2, 2009
4 Our main concern about the use of online panels has to do with our access to only self-identified gay/bisexual males as opposed to all MSM. MSM who do not self-identify as gay or bisexual are not identified on the panel sample. We are keenly aware of the difference between the two populations. It is important to note, however, that the vast majority of MSM self-identify as gay or bisexual; recent data from the California LGBT Tobacco Survey suggests that this non-gay/bisexual identifying segment is quite a small proportion of all MSM, in the range of 10–15 percent, depending on whether partner choice is assessed in adolescence, adulthood, or in the last 5 years. Notably, gay enclaves are not the best place to find non-gay/bisexual identifiers; the vast majority of these men reside elsewhere.
5 General population surveys with questions on sexual orientation often find a larger proportion, especially if they make use of T-ACASI interviewing methods, but the contractor, NOVA Research, has been involved with all the other stand-alone LGBT surveys that involve screening for sexual orientation, and, in this survey setting, the proportion is a smaller one.
6 The effective sample size is the sample size after weighting.
7 We are assuming that African Americans will respond at a lower rate than other respondents.
8 Our various sampling frames will be constructed to be mutually exclusive of one another. This way, proper selection probabilities will be available for each record, and all selection biases will be completely eliminated. In other words, surname households falling into the high-density African American areas will be eliminated from the surname sample frame.
9 The full set of response options to that question consists of the following: heterosexual or straight, gay, lesbian, bisexual, or other (please specify).
10 Survey Sampling International (SSI) was founded in 1977. In addition to global online samples, SSI is a leading supplier worldwide of RDD and targeted telephone samples.
11 There are some indications that in order to maximize response rate in multi-mode studies, potential respondents should be offered a choice of modes (e.g., telephone or online) only after nonresponse to initial recruitment efforts for a single mode. This is an area of rapidly evolving knowledge. Should there be expert consensus before the survey field date that delaying a choice is optimal, information about the online survey option will be omitted until the second follow-up mailing or the second set of answering machine messages for the ABS sample.