Government & Academic Research
Best Practices for ICR Supporting Documentation
for KnowledgePanel® Surveys
Prepared by J. M. Dennis
December 1, 2010
Federal agencies develop Information Collection Requests (ICRs) and submit them to
the Office of Management and Budget (OMB) for new surveys that will impose a
reporting burden on the U.S. public, as defined under the Paperwork Reduction Act of
1995 and as further required under the so-called Data Quality Act.
This document provides supporting text for Federal agencies proposing to
use KnowledgePanel sample for studies for which an ICR will be submitted for OMB
review.
Because OMB’s stated objection to KnowledgePanel is its cumulative response rate
(~10%), the information below emphasizes nonresponse bias measurement. Knowledge
Networks (KN) has led the survey research industry in providing standards for response
reporting for web-based surveys and has disclosed its response rates in Public Opinion
Quarterly.1
Best Practices for ICRs involve descriptions of these tasks:
Sampling
Data Collection Procedures
Nonresponse Bias Measurement
Statistical weighting is not covered in this document because OMB has considered KN’s
standard procedures sufficient in past ICR reviews.
Summary
Key Justification Points for using KnowledgePanel sample in Federal information
collections:
Uses probability-based sampling consistent with traditional sampling theory;
Provides single mode of data collection (web based), obviating potential for data
collection mode effects;
1
See Callegaro, Mario & Disogra, Charles (2008). Computing Response Metrics for Online Panels. Public
Opinion Quarterly. 72(5) pp. 1008-1031.
Knowledge Networks, Inc. Government & Academic Research 1350 Willow Road, Ste 102 Menlo Park, CA
94025 phone: (650) 289-2000 facsimile (650) 289-2001 www.knowledgenetworks.com/ganp/
Includes sample coverage of non-Internet households (via computer device and
ISP provision) and Spanish-language households;
Supports multimedia surveys/stated-preference methodology through the web
mode of data collection;
Supports targeted sampling for studies of subpopulations;
Supports longitudinal information collections with high follow-up cooperation
rates;
Reduces reporting burden on the U.S. public by re-using previously consented
sample and by eliminating the re-asking of previously asked survey questions;
Benefits from informed consent having already been acquired from research
subjects;
Supports cost-effective measurement of nonresponse bias;
Achieves within-panel survey cooperation rates of 70% and higher, minimizing
the potential of nonresponse bias from self-selection into a specific study.
Reduced to the essentials, the Best Practices for ICRs are as follows:
Survey-Specific Sampling
o Draw the KnowledgePanel sample exclusively from the 2008 and later
cohorts sourced from Address-Based Sampling (ABS);
o Include KnowledgePanel Latino;
o Use modified stratified sampling with completion propensity adjustment, a
sampling selection procedure that takes into account between-group
differences in survey completion rates to KnowledgePanel online surveys.
Data Collection Procedures to Maximize Within-Panel Survey Cooperation Rate
o Send pre-notification email to sampled respondents 2-3 days before
sending the actual survey invitation;
o Field the survey for two to three weeks;
o Include cash-equivalent incentives of $5 to $10 for longer surveys (25
minutes or longer);
o Use cash-equivalent incentives selectively to target nonresponders late in
the field period;
o Use email reminders and telephone-based reminder calls with
nonresponders.
Nonresponse Bias Measurement
o To identify possible self-selection effects at the panel recruitment and
retention stages, statistically compare demographic and household
characteristics of (i) the sample invited to the KN panel and (ii) the subset
of actual survey participants (i.e., the estimating sample);
o To identify possible within-panel self-selection effects, statistically
compare demographic and household characteristics of (i) the sample
invited to a specific online panel survey and (ii) the panelists participating
in the survey on which the estimates are based.
o Benchmark KN panel survey estimates by placing benchmarking survey
questions on the KN panel survey instrument and comparing the survey
estimates to benchmarks from gold-standard surveys (e.g., NHIS, GSS,
SIPP); the selection of the survey questions should be informed by a
theory that the survey measures are related to the study topics of interest
(e.g., a political ideology measure in a study on attitudes towards
government regulation of an environmental good);
o (Optional) Measure nonresponse bias directly through a
technique sometimes called “double sampling” or a “nonresponse follow-up
survey.” The procedure is to randomly subsample households that
initially were selected to join KnowledgePanel but refused to do so, as
well as households that agreed to join the panel but
subsequently refused to participate. In this approach, a subset of items
from the main survey questionnaire is administered to the selected samples
by either a mail survey or a web survey.
The next two sections provide more information on the Sampling and Nonresponse
Bias Measurement tasks.
Sampling
1. Restrict Panel Samples to ABS-Sourced Respondents
As of December 2010, approximately 40% of the active KnowledgePanel households are
sourced from a sample frame called “Address-Based Sampling” (ABS), while the remainder
are sourced from random-digit dialing (RDD). For information collections requiring
OMB review, we recommend that the KnowledgePanel sample be restricted to the
ABS-sourced sample in order to provide the most representative sample possible. The
ABS-sourced sample has the advantage of improved representation of certain segments,
particularly young adults, cell-phone-only households, and nonwhites. In addition,
restricting the sample to ABS makes valuable ancillary person-level and household-level
characteristic data available for the sample units, making possible a descriptive
comparison of the characteristics of the entire invited sample and the subset of survey
participants.
Between 1999 and April 2009, KnowledgePanel’s probability-based recruitment had
been based exclusively on a national RDD frame. In April 2009, Knowledge Networks
added the ABS frame (to supplement the RDD frame) in response to the growing number
of cell-phone-only households (CPOHHs) that are outside of the RDD frame. In January
2010, Knowledge Networks transitioned completely to ABS-sourced panel recruitment
and ceased recruitment using RDD and telephone methods, with the exception of some
Spanish-language telephone-based recruitment to support KnowledgePanel Latino.
ABS involves probability-based sampling of addresses from the U.S. Postal Service’s
Delivery Sequence File (DSF). Post office boxes and rural route addresses are included.
Business and institutional addresses (e.g., dormitories, nursing homes, group homes, and
jails) are removed from the frame, as is military housing. Also removed are those
multi-dwelling residential structures that have only a single address (called a drop point
address) and for which there is no unit-level identifying information (mail is internally
distributed).
Randomly sampled addresses are invited to join KnowledgePanel through a series of
mailings. Telephone follow-up calls are made to nonresponders when a telephone
number can be matched to the sampled address. Invited households can join the panel by
one of several means:
Completing and mailing back an acceptance form in a postage-paid envelope;
Calling a toll-free hotline staffed by bilingual recruitment agents; or
Going to a dedicated KN recruitment website and completing the recruitment
information online by using a unique PIN provided in the advance letter.
After initially accepting the invitation to join the panel, respondents are then “profiled”
online, answering key demographic questions about themselves. This profile is
maintained using the same procedures established for the RDD-recruited research
subjects. Respondents not having an Internet connection are provided a laptop computer
and free Internet service. Respondents sampled from the ABS frame, like those from the
RDD frame, are provided the same privacy terms and confidentiality protections that we
have developed over the years and that have been reviewed by dozens of Institutional
Review Boards.
The key advantage of the ABS sample frame is that it allows sampling of almost all U.S.
households: an estimated 97% of households are “covered,” in sampling nomenclature.
Regardless of household telephone status, they can be reached and contacted via the mail.
In addition, the ABS pilot project has other advantages beyond the expected improvement in
recruiting young adults from CPOHHs, such as improved sample representativeness for
minority racial and ethnic groups and improved inclusion of lower-educated and
low-income households.
2. Include KnowledgePanel Latino Sample in the Panel Sample Draws
To achieve improved sample coverage, inclusion of the Spanish-language dominant
households can be important for certain studies. Approximately 4% of the U.S. adult
population is not covered for a general population study when the sample rule excludes
Spanish-speaking adults who are insufficiently literate in English for self-administered
English-language surveys.
3. Use Modified Stratified Sampling with Completion Propensity Adjustment
Certain demographic segments have survey cooperation rates that are predictably lower
or higher than average. If these groups are sampled for a panel survey in proportion to
their share of the U.S. population, then the unweighted share of the interviews from the
low-cooperation-rate groups will be less than their share of the U.S. population, while the
unweighted share of the high-cooperation-rate groups will be higher than their share of
the U.S. population.
For studies requiring an ICR, Knowledge Networks employs a refinement to the KN
standard protocol for drawing samples from the panel (see U.S. Patent No. 7,269,570).
The modified approach is designed to improve further the demographic similarities
between the completed panel interviews and the U.S. Census population benchmarks by
factoring estimated survey completion rates for key demographic groups into the sample
draw selection probabilities. Knowledge Networks has ample experiential data upon
which to calculate reliable completion propensities for specific demographic groups.
Essentially, by oversampling groups that have consistently lower completion rates and
undersampling groups that tend to have higher rates, the valid completed interviews can
mirror the Census demographic benchmarks more closely. This approach can be
employed when it is essential to minimize the range of a study’s post-stratification
weights and the resultant design effect. This modified sampling approach is executed by
first constructing 576 cells using the following six variables and then adjusting the
fielded sample size for each cell by the response propensity for each cell: Age (18-24,
25-34, 35-44, 45-54, 55-64, 65+); Education (Less than high school, High school, Some
college, College degree +); Hispanic (Hispanic, Non-Hispanic); Race (White, Black,
Other); Gender (Male, Female); Household income (Less than $75K, $75K+).
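The cell construction and size adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented KN procedure: the function name, the simple "target divided by propensity" rule, and the example targets and propensities are hypothetical. Only the six variables and their categories come from the text.

```python
from itertools import product

# Category lists as given in the text (6 x 4 x 2 x 3 x 2 x 2 = 576 cells).
AGE = ["18-24", "25-34", "35-44", "45-54", "55-64", "65+"]
EDUCATION = ["Less than high school", "High school", "Some college", "College degree +"]
HISPANIC = ["Hispanic", "Non-Hispanic"]
RACE = ["White", "Black", "Other"]
GENDER = ["Male", "Female"]
INCOME = ["Less than $75K", "$75K+"]

CELLS = list(product(AGE, EDUCATION, HISPANIC, RACE, GENDER, INCOME))


def fielded_sample_sizes(target_completes, completion_propensity):
    """Inflate each cell's target number of completes by its propensity.

    Both arguments are dicts keyed by cell tuple. Dividing by the expected
    completion rate oversamples low-propensity cells and undersamples
    high-propensity cells. Assumption: a simple target / propensity rule.
    """
    return {
        cell: target_completes[cell] / completion_propensity[cell]
        for cell in target_completes
    }


# Hypothetical inputs: 2 target completes per cell; one cell completes at 50%,
# all other cells at 80%.
targets = {cell: 2.0 for cell in CELLS}
propensities = {cell: 0.8 for cell in CELLS}
low_cell = ("18-24", "Less than high school", "Hispanic", "Other", "Male", "Less than $75K")
propensities[low_cell] = 0.5

fielded = fielded_sample_sizes(targets, propensities)
print(len(CELLS))         # number of sampling cells
print(fielded[low_cell])  # invitations needed in the low-propensity cell
```

In practice the fractional sizes would need to be rounded to whole panelists, and sparse cells would likely need to be collapsed; neither step is shown here.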
Nonresponse Bias Measurement
This section describes basic statistical tests for measuring nonresponse bias and
one direct measurement technique. A summary of past research on nonresponse bias
measurement in the context of Knowledge Networks surveys is available.2
1. Statistical Comparison of Demographic and Household Characteristics of
the Sample Frame versus the Subset of Actual Survey Participants
This approach attempts to measure self-selection bias among the estimating sample
making up the completed interviews. The approach is possible only for general
population studies of U.S. adults where the interview sample size requirement is 5,000
interviews or less (subject to increase as the ABS-sourced sample increases over time).
The approach works best when limiting the KN panel sample draw to ABS-sourced
panelists.3 ABS sourcing is important because of a specific benefit of address-based
sampling: the ability to append to the sample frame many person-level and
household-level ancillary data associated with an address. Commercial databases (e.g.,
Experian, infoUSA, and Acxiom) are used to append to the sample frame observed and
modeled information at various levels of aggregation. These same ancillary data are also
used to analyze nonresponse bias by comparing the ancillary data available for the entire
sample invited to join KnowledgePanel with those for the small subset of recruited
panelists who participate in any given study. If the study requires a general population
adult sample, the expectation is that the estimating sample of completed interviews will
have marginal distributions on person-level and household-level characteristics that are
statistically similar to the distributions of the entire invited sample.
2
See Dennis, J. Michael. 2010. KnowledgePanel®: Processes & Procedures Contributing to Sample
Representativeness & Tests for Self-Selection Bias. The paper may be downloaded from
http://www.knowledgenetworks.com/ganp/reviewer-info.html.
3
The approach is technically possible when using the RDD-sourced portion of KnowledgePanel; however,
the ancillary data attached to the sample frame will have more unit-level missing data.
Consider the following example from an analysis of ancillary data. The figure
compares the distribution of household income for the ABS sample units making up
the entire invited sample with the distribution for the sample agreeing to join
KnowledgePanel. The data source is the ancillary data from all KN invited sample from
2008 through early 2010. All sample units fielded for panel recruitment during this time
frame are included in the analysis. The results show statistical comparability between the
total invited sample and the subset of actually recruited adults. For instance, 22.4% of
the invited sample was relatively low income (less than $25K household income)
compared to 20.4% of the recruited sample.
Distributions of Household Income Comparing the Total Sample Invited to KN
Panel Recruitment and the Subset of Actual Recruited Persons
[Figure: stacked bar chart (0% to 100%) with one bar for the invited sample and one for
the recruited sample. Segment values, in percent:]
Household income    Invited sample    Recruited sample
100k+               19.8              20.6
75k-99k             12.0              12.5
50k-74k             19.2              19.7
35k-49k             15.5              16.1
25k-34k             11.1              10.7
< 25k               22.4              20.4
Statistical comparisons for specific studies can be made between the total invited sample
for the panel recruitment and the estimating sample for these variables:
Household level
Number of adults in the household
Presence of children (yes, no)
Home ownership (own, rent)
Household income (12 levels recoded to <$25K, $25K-$49K, $50K-$74K, $75K+)
Person level
Marital status (married, single)
Education of head of household (less than high school, high school, some college,
BA, higher)
Age of householders
Race/Ethnicity (White, African American, Hispanic, Other)
The approach is explained in more detail in an article by DiSogra, Dennis, and Fahimi in
the 2010 Proceedings of the Joint Statistical Meetings.4 An aggregate error rate can be
calculated as the sum of the differences between the expected values (from the total
invited sample) and the actual values (from the estimating sample of completed
interviews).
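As a rough sketch of the aggregate error calculation, the example below sums the differences for a single variable, household income, using the percentages from the income figure in this section. Two assumptions are made here and are not confirmed by the source: absolute differences are summed (signed differences between two complete distributions cancel to zero), and the values for the middle income categories are matched to labels in the figure's legend order. Only the under-$25K values are stated explicitly in the text.

```python
# Household income shares in percent, read from the income figure in this section.
# Assumption: values are matched to categories in legend order.
invited = {
    "<25k": 22.4, "25k-34k": 11.1, "35k-49k": 15.5,
    "50k-74k": 19.2, "75k-99k": 12.0, "100k+": 19.8,
}
recruited = {
    "<25k": 20.4, "25k-34k": 10.7, "35k-49k": 16.1,
    "50k-74k": 19.7, "75k-99k": 12.5, "100k+": 20.6,
}


def aggregate_error(expected, actual):
    """Sum of absolute percentage-point differences across categories.

    Assumption: absolute values are used so that over- and
    under-representation do not cancel each other out.
    """
    return sum(abs(expected[category] - actual[category]) for category in expected)


income_error = aggregate_error(invited, recruited)
print(round(income_error, 1))  # percentage points, summed over six income categories
```

The same function could be applied to each of the household-level and person-level variables listed above and the results summed, if that is how the cited article defines the aggregate rate.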
2. Statistical Comparison of Demographic and Household Characteristics of
the Survey Participants versus the Non-Responders
For studies requiring ICRs, the within-panel survey cooperation rate will be less than
100%. There is therefore the potential for self-selection bias at the stage of inviting KN
panelists to participate in an actual survey. This technique attempts to identify
nonresponse bias resulting from a survey cooperation or return rate of 70% to 80%. It
involves a simple comparison using the approximately 20 person-level and household-level
characteristics available on all KN panelists: the sample invited to participate is
compared to the sample that does participate. If the survey topic is markedly more
attractive to some groups than to others, this technique will identify such patterns.
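One simple way to carry out such a comparison is a Pearson chi-square goodness-of-fit statistic for a single profile variable. The sketch below is illustrative only: the age categories, shares, and counts are hypothetical, and treating the invited-sample shares as fixed expected values is an approximation, because the participants are a subset of the invited sample.

```python
def chi_square_gof(invited_shares, respondent_counts):
    """Pearson chi-square statistic comparing respondent counts with the
    counts expected if respondents mirrored the invited sample."""
    n = sum(respondent_counts.values())
    statistic = 0.0
    for category, share in invited_shares.items():
        expected = n * share
        observed = respondent_counts[category]
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical example: age distribution of the invited panel sample (shares)
# and of the 700 panelists who completed the survey (counts).
invited_shares = {"18-29": 0.20, "30-44": 0.25, "45-59": 0.30, "60+": 0.25}
respondent_counts = {"18-29": 130, "30-44": 170, "45-59": 220, "60+": 180}

statistic = chi_square_gof(invited_shares, respondent_counts)
CRITICAL_VALUE_DF3_05 = 7.815  # chi-square critical value, 3 degrees of freedom, alpha = 0.05
print(round(statistic, 3), statistic > CRITICAL_VALUE_DF3_05)
```

A statistic below the critical value, as in this made-up case, would give no evidence that participants differ from the invited sample on that variable; the check would be repeated for each available characteristic.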
3. Benchmarking KN Panel Survey Estimates
KN panel survey estimates can be benchmarked by placing benchmarking survey questions on
the KN panel survey instrument and comparing the survey estimates to benchmarks
from gold-standard surveys (NHIS, GSS, SIPP, etc.). The selection of the survey questions
should be informed by a theory that the survey measures are related to the study topics of
interest (e.g., a political ideology measure in a study on attitudes towards government
regulation of an environmental good). For more examples of benchmarking studies, see
Dennis, J. Michael. 2010, “KnowledgePanel®: Processes & Procedures Contributing to
Sample Representativeness & Tests for Self-Selection Bias,”
http://www.knowledgenetworks.com/ganp/reviewer-info.html.
A limitation of this approach is that the benchmarking data usually were collected not by
the online mode of data collection but by in-person interviewing. As a result,
differences observed between the KnowledgePanel estimates and those from the
benchmarking survey could be the result of mode differences (presence versus absence
of an interviewer). Differences in mode are hypothesized to be a factor accounting for
differences between KnowledgePanel estimates and General Social Survey estimates on
some items.5
4
The article may be downloaded from http://www.knowledgenetworks.com/ganp/reviewer-info.html. The
citation is DiSogra, Charles, J. Michael Dennis, and Mansour Fahimi. 2010. On the Quality of Ancillary
Data Available for Address-Based Sampling. Conference Proceedings of the 2010 Joint Statistical
Meetings.
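The benchmark comparison described in this subsection can be sketched as a two-sample z statistic for a proportion. All numbers below are hypothetical, and the sketch assumes simple random sampling in both surveys; it ignores weighting and design effects, and a significant difference could still reflect the mode differences discussed above rather than nonresponse bias.

```python
import math


def benchmark_z(p_panel, n_panel, p_benchmark, n_benchmark):
    """z statistic for the difference between a panel estimate and a
    benchmark estimate of the same proportion (independent samples)."""
    standard_error = math.sqrt(
        p_panel * (1 - p_panel) / n_panel
        + p_benchmark * (1 - p_benchmark) / n_benchmark
    )
    return (p_panel - p_benchmark) / standard_error


# Hypothetical example: 32% of 1,000 panel respondents versus 30% of
# 20,000 respondents in a gold-standard benchmark survey.
z = benchmark_z(0.32, 1000, 0.30, 20000)
print(round(z, 2), abs(z) > 1.96)  # 1.96 is the two-sided 5% critical value
```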
4. Direct Measurement of Nonresponse Bias
Direct measurement of nonresponse bias is infrequently undertaken because of its cost
and also because of a concern that it will introduce an additional source of error.
The technique is sometimes called “double sampling” or a “nonresponse follow-up
survey.” The procedure is to randomly subsample households that initially were selected
to join KnowledgePanel but refused to do so, as well as subsample households that
agreed to join the panel but subsequently refused to participate. In this approach, a subset
of items from the main survey questionnaire is administered to the selected samples by
either a mail survey or web survey.
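The subsampling step of this procedure can be sketched as two simple random draws, one from each nonresponder group. The function name, the household identifiers, the group sizes, and the equal subsample size per group are all hypothetical choices made for illustration; the source does not specify how the subsamples are sized or drawn.

```python
import random


def draw_followup_subsample(recruitment_refusers, survey_refusers, n_per_group, seed=None):
    """Draw simple random subsamples from the two nonresponder groups.

    recruitment_refusers: households selected for the panel that declined to join.
    survey_refusers: households that joined the panel but later declined to participate.
    Assumption: the same subsample size is requested from each group.
    """
    rng = random.Random(seed)
    return {
        "recruitment_refusers": rng.sample(
            recruitment_refusers, min(n_per_group, len(recruitment_refusers))
        ),
        "survey_refusers": rng.sample(
            survey_refusers, min(n_per_group, len(survey_refusers))
        ),
    }


# Hypothetical household identifiers.
recruitment_refusers = [f"R{i:04d}" for i in range(5000)]
survey_refusers = [f"S{i:04d}" for i in range(300)]

followup = draw_followup_subsample(recruitment_refusers, survey_refusers, 200, seed=2010)
print({group: len(ids) for group, ids in followup.items()})
```

The selected households would then receive the shortened questionnaire by mail or web, as described above.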
Additional error can be introduced by this approach as a result of the need to use more
than one mode of data collection in order to achieve a satisfactory refusal conversion rate.
Because the double-sampling approach is premised on the need to interview those who
already refused to participate or else constitute non-contacted households, it is common
to supplement the web mode of data collection with telephone-based and mail-based
interviews. As a consequence, the supplemental interviews obtained in the nonresponse
follow-up interviews are from different modes, introducing measurement differences that
may be entirely attributable to the mode of data collection. The result is an inability to
isolate the cause of estimation differences resulting from sample source (KN panel
recruits, KN panel recruitment nonresponders) versus mode of data collection (online,
mail, telephone, and in-person). For more discussion and examples, see Dennis, J.
Michael. 2010, “KnowledgePanel®: Processes & Procedures Contributing to Sample
Representativeness & Tests for Self-Selection Bias,”
http://www.knowledgenetworks.com/ganp/reviewer-info.html.
5
See Smith, Tom W., and J. M. Dennis. 2005. Online Versus In-Person: Experiments with Mode, Format,
and Question Wordings. Public Opinion Pros. December issue. Available under "Past Issues" at
http://www.publicopinionpros.norc.org/index.asp.
File Type | application/pdf |
File Title | KN support letter |
Author | mdennis |
File Modified | 2010-12-03 |
File Created | 2010-12-03 |