The Supporting Statement for OMB 15XX-XXXX
Pilot Test of Consumer Tipping Survey
B. Collections of Information Employing Statistical Methods
Universe and Respondent Selection.
The potential respondent universe for this study includes all U.S. resident persons who conduct transactions where the social norm of tipping prevails. The precise size of this population is unknown, but it likely includes a majority of the U.S. adult population. Examples of settings where tipping is expected include full-service restaurants, taxis, barber shops, beauty salons, hotels, and casinos.
The private nature of transactions involving tipping makes it extremely difficult to collect reliable data that can be used to estimate total tip income. This difficulty is further compounded by the motivation of some individuals to evade tax on tips received. For these reasons, the IRS has concluded that surveying consumers about their tipping experiences is the most reliable way to collect quantitative data on tip income.
Prior IRS research on consumer tipping behavior1 found tipping rates that varied considerably by industry and by region. A 1982 study conducted by the University of Illinois for the IRS (Pearl and Sudman, 1983) found tipping rates, defined as the tip amount as a percent of expenditures on tipping occasions, to be 14 percent for restaurants, 12 percent for barber and beauty shops, 19 percent for bars, and 20 percent for taxis. On a regional basis, mean restaurant tipping rates ranged from a low of 12.5 percent in the West North Central to a high of 15 percent in the Northeast.
The observed variation in tipping rates implies that larger sample sizes are required to produce accurate estimates of tipping rates. Other things being equal, a larger sample size means greater cost. This cost constraint may be met in two ways: (1) narrowing the scope of the study to focus on fewer industries or regions, or (2) finding a lower-cost mode of data collection. For obvious reasons, the IRS believes it would be inappropriate to limit the geographic scope of the study. Limiting the study to restaurants and drinking places would provide coverage of the industry with the largest share of reported tips (about 63 percent)2 but would omit several industries with significant tipping activity, including the casino and gambling industry, which has experienced significant growth in recent decades.
A second option for lowering the cost of data collection is to use a non-probability sample. Sampling from a preexisting opt-in internet panel may cost substantially less than sampling from a telephone or mail-based frame, because it avoids the labor costs of phone contacts and the material and transportation costs of mail-based sampling. In addition, there might be additional costs or non-response associated with pushing individuals sampled from the telephone or mail frame to the internet survey instrument. The chief drawback of using a non-probability sample from an internet opt-in panel is that internet panelists may be less representative of the target population than respondents drawn from the phone or mail frames. However, given the high rates of non-response associated with sampling from these frames, it is not clear to what degree respondents from probability samples are more representative with respect to tipping behavior than respondents contacted through an internet panel, particularly after post-stratifying on observed demographic characteristics. While non-response can be mitigated through follow-up contacts, doing so widens the cost difference between the probability and non-probability sampling strategies for a sample of a given size. Consequently, given a fixed budget, it is unclear whether the reduction in bias in the estimates of mean tipping and stiffing rates that results from using a probability rather than a non-probability sample is worth the increase in the variability of these estimates that results from a smaller sample size, especially for relatively infrequent tipping transactions.
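The bias and variance tradeoff described above can be illustrated with a minimal sketch based on the standard decomposition of mean squared error (MSE) into squared bias plus sampling variance. Every number in it (budget, cost per complete, bias, standard deviation, and incidence) is a hypothetical value chosen for illustration only; none is an estimate from this study or from either panel vendor.

```python
def mean_squared_error(bias: float, sd: float, n: int) -> float:
    """MSE of a sample mean: squared bias plus sampling variance (sd^2 / n)."""
    return bias ** 2 + sd ** 2 / n


def completes_affordable(budget: int, cost_per_complete: int) -> int:
    """Number of completed surveys that a fixed budget can buy."""
    return budget // cost_per_complete


# All values below are hypothetical and serve only to illustrate the tradeoff.
BUDGET = 100_000                      # dollars
SD = 8.0                              # std. dev. of tipping rates, percentage points
PROB = {"bias": 0.2, "cost": 20}      # probability sample: less bias, higher cost
NONPROB = {"bias": 0.5, "cost": 5}    # non-probability sample: more bias, lower cost


def compare(incidence: float) -> dict:
    """MSE per design when only a share `incidence` of respondents report the transaction."""
    result = {}
    for label, design in (("probability", PROB), ("non-probability", NONPROB)):
        n_total = completes_affordable(BUDGET, design["cost"])
        n_reporting = max(1, round(n_total * incidence))
        result[label] = mean_squared_error(design["bias"], SD, n_reporting)
    return result


common = compare(incidence=1.0)   # every respondent reports the transaction
rare = compare(incidence=0.02)    # only 2 percent of respondents report it
```

Under these made-up inputs the probability design has the lower MSE for a common transaction, while the ordering reverses for the rare one. Different inputs could produce the opposite pattern, which is why the size of the tradeoff cannot be settled without data.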
Given the uncertainty about the tradeoff between variance and bias in estimated tipping rates that is associated with choosing between a probability and a non-probability sample, this study will follow OMB guidelines3 by using a pilot survey to resolve this uncertainty. Specifically, this pilot study will determine whether the results generated by two different internet-based data streams (one probability-based and one non-probability-based) are equivalent, and thus the degree to which bias differs between a non-probability and a probability sample. If the two data streams support identical conclusions about tipping behavior across industries and geographic areas, then future IRS data collection efforts on consumer tipping behavior can employ just one of these methods, perhaps the one that generates the most cases at the least cost per case.
Procedures for Collecting Information.
The pilot study will be conducted using internet panels maintained by subcontractors Ipsos and GfK, both of which have been designed to be representative of the adult population. A brief description of each of these internet panels is provided here:
GfK KnowledgePanel®: The KnowledgePanel is an internet-based panel that uses a probability-based sampling strategy in which the survey frame is derived from the USPS Delivery Sequence File. Individuals are invited to join the GfK KnowledgePanel by mail, followed by telephone calls for those who do not respond to the initial invitation. Once they have joined the panel, they are invited to surveys and other projects via email. Households are sampled without replacement, avoiding the potential bias that may result from respondents participating in the panel twice. Individuals selected for participation who lack a computer or an Internet connection are provided a netbook. The primary benefit of the KnowledgePanel relative to opt-in panels (like the Blended Online Sample described below) is that knowing the probability of selection allows researchers to estimate error. However, these estimates will always fall short of capturing all aspects of non-response left unaddressed by demographic post-stratification. Further, the procedures used to set up and maintain panel membership and participation are an additional component of error that is difficult to fully model and correct for.
Blended Online Sample (Ipsos Ampario): Ipsos’ blended sample approach combines its Ampario online sampling method with its ISAY online panel, an online panel of 800,000 members and their households. Ampario is a new non-probability sampling procedure developed by Ipsos that recruits respondents through invitations, banner ads, and other means on 100 to 400 websites that have partnered with Ipsos. These two methods are combined into a single sample using Ipsos’ proprietary Cortex routing system, which allocates and reallocates sample according to respondent eligibility. Simply put, when respondents are not eligible for one survey, they are immediately redirected to other surveys in progress. In traditional one-off opt-in surveys, ineligible respondents are lost, representing a considerable cost. Finally, a Bayesian methodology, which combines prior information about the overall sample of interest with current information, is used to form the final distribution of results. As is the case with a traditional online sample, Ipsos’ blended sampling could work with several different data collection modes, but it is best implemented with an online questionnaire, which could include a cross-sectional administration or a longitudinal diary approach. However, because of the opt-in nature of the Blended Sample, it is not possible to model the probability of response, and thus to account for that source of potential bias in survey estimates.
The IRS will obtain 20,000 complete surveys over the course of a month: 10,000 from Ipsos’ non-probability system and another 10,000 from GfK’s probability-based panel. The IRS estimates that 154,000 participants will need to be contacted in order to reach the required sample size. This estimate is based on the completion rate of 13 percent for a multiple-wave survey conducted by Ipsos using data from their report on the 2012 United States Presidential election. In that study, the Ipsos ISAY panel was used in conjunction with the Ampario blended sampling method to send out an invitation and a reminder on the same day. A similar methodology will be used for this study. As tipping expenditures likely fluctuate significantly throughout an average week, it will be important to gather a representative sample from all days of the week. A sample of 10,000 observations is the minimum size necessary to obtain estimates of the frequency of certain infrequent transactions, such as casino gambling. These frequency estimates will be used to determine the target sample size for the final survey.
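The contact estimate above follows from dividing the target number of completes by the completion rate. The sketch below reproduces that arithmetic and, as a separate illustration, shows the approximate precision that 10,000 observations would give for the frequency of an infrequent transaction. The function names are illustrative, the 2 percent incidence is a hypothetical value rather than a figure from this study, and the margin-of-error formula assumes simple random sampling, which neither panel strictly satisfies.

```python
import math


def contacts_needed(target_completes: int, completion_rate: float) -> int:
    """Contacts required to reach a target number of completes at a given rate."""
    if not 0 < completion_rate <= 1:
        raise ValueError("completion_rate must be in (0, 1]")
    return math.ceil(target_completes / completion_rate)


def proportion_margin(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95 percent margin of error for a proportion (simple random sampling)."""
    return z * math.sqrt(p * (1 - p) / n)


# 20,000 completes at the 13 percent completion rate cited in the text.
total_contacts = contacts_needed(20_000, 0.13)
rounded_contacts = round(total_contacts, -3)   # nearest thousand

# Hypothetical incidence of 2 percent for an infrequent transaction.
casino_margin = proportion_margin(p=0.02, n=10_000)
```

Dividing 20,000 by 0.13 gives roughly 153,846 contacts, which is consistent with the 154,000 figure after rounding to the nearest thousand.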
As part of its reward system for survey completion, Ipsos tracks internally which panel members have completed a survey. Therefore, only those panel members who have not completed the survey will receive a reminder email. Once 10,000 respondents from Ipsos’ sample have completed the survey, any reminder emails scheduled after that point will not be sent.
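The reminder rule described above can be sketched as follows. This is only an illustration of the stated logic, not Ipsos’ actual tracking system; the function name and the use of integer identifiers are assumptions.

```python
def reminder_recipients(invited_ids, completed_ids, quota):
    """Return the invited IDs that should receive a reminder email.

    No reminders are sent once the number of completes reaches the quota;
    otherwise only members who have not completed the survey are reminded.
    """
    completed = set(completed_ids)
    if len(completed) >= quota:
        return []
    return [member_id for member_id in invited_ids if member_id not in completed]


# Hypothetical example: four invited members, two completes so far.
pending = reminder_recipients([1, 2, 3, 4], [2, 4], quota=3)
none_due = reminder_recipients([1, 2, 3, 4], [2, 4], quota=2)
```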
Following the survey’s administration, the survey research sub-contractors will provide FMG (contractor) and the IRS responses to the survey questions with a generic respondent identifier (e.g. 1, 2, 3, 4, 5 …). The relative degree of accuracy of tipping rate estimates from the GfK probability sample and the Ipsos non-probability sample will be benchmarked to a contemporaneous nationwide sample of electronic point of service (POS) data purchased from a vendor of POS equipment. The key questions of interest that will be explored in this pilot study are the following:
Do the two data streams produce similar estimates of tipping rates by industry and region?
Do the two data streams produce similar estimates of tipping rates in comparison to electronic POS data?
Do the responses produced by the two data streams have similar distributional characteristics?
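One simple way to approach the first of these questions is to compare mean tipping rates from the two streams and ask whether a confidence interval for their difference lies entirely within a pre-specified margin. The sketch below is a minimal illustration under stated assumptions: it ignores survey weights and post-stratification, uses a normal approximation, and the sample values and the margins are hypothetical. It is not the analysis plan for this study.

```python
import math
import statistics


def mean_and_se(rates):
    """Sample mean and its standard error for a list of tipping rates."""
    mean = statistics.fmean(rates)
    se = statistics.stdev(rates) / math.sqrt(len(rates))
    return mean, se


def difference_ci(stream_a, stream_b, z=1.96):
    """Normal-approximation confidence interval for mean(a) - mean(b)."""
    mean_a, se_a = mean_and_se(stream_a)
    mean_b, se_b = mean_and_se(stream_b)
    diff = mean_a - mean_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff - z * se_diff, diff + z * se_diff


def equivalent_within(stream_a, stream_b, margin, z=1.96):
    """True if the whole interval for the difference lies inside (-margin, margin)."""
    low, high = difference_ci(stream_a, stream_b, z)
    return -margin < low and high < margin


# Hypothetical tipping rates (percent of the bill) from each data stream.
probability_stream = [15.0, 18.0, 12.0, 20.0, 16.0, 14.0, 17.0, 15.0]
nonprobability_stream = [16.0, 17.0, 13.0, 19.0, 15.0, 15.0, 18.0, 14.0]

similar_at_3_points = equivalent_within(probability_stream, nonprobability_stream, margin=3.0)
similar_at_1_point = equivalent_within(probability_stream, nonprobability_stream, margin=1.0)
```

The same comparison could be repeated by industry and region, or with each stream compared against the POS benchmark in place of the other stream. The choice of margin is a substantive decision that the sketch does not resolve.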
A research study similar in intent to the one proposed here was performed for the U.S. Census Bureau.4 A non-probability internet survey was also performed for the U.S. Environmental Protection Agency under OMB Control No. 2060-0643 (“Internet Survey Research for Improving Fuel Economy Label Design and Content”, ICR Reference No. 201005-2060-012).
Methods to Maximize Response.
We are using an established online panel for survey administration. Survey administration will include an invitation email and up to one reminder email (as needed) in an effort to maximize the response rate. The expected response rate is 13 percent, based on the response rate for a 2012 survey on the U.S. Presidential election that Ipsos conducted using its ISAY panel and Ampario system.
Testing of Procedures.
Prior to finalizing the survey instrument, FMG (contractor) will conduct a usability study in which no more than 35 adults take the survey, to ensure that respondents understand the industry/service and tipping (monetary/in-kind) attribute language and can accurately recall their tipping activity. The IRS will edit the survey as needed based on those results. The IRS expects the changes to be minimal and related only to the wording of the specific items listed above. The IRS does not expect that the changes will include any of the following: an increase in the kind or amount of information sought; an increase in coverage; an increase in the timing or frequency of reporting; a change in the sample design or collection method; or a change in the purpose for which the information is collected or required to be maintained.
The survey will be administered electronically; however, no cookies are involved. Survey participants will be provided a link/web address via a secure website. Transmission to and from the secure website for the survey will be encrypted.
Survey respondents will be selected from the subcontractor’s panel members and non-panel Internet users. Potential respondents will be sent an email invitation to participate in a survey to understand their preferences for how to get help for tax-related service needs they may encounter. Participants will be provided a link/web address to a secure website with their unique survey URL that corresponds to their survey questions. The subcontractor hosting the panel and survey will maintain a secure survey control system that will document the correspondence and track the status of all sample members by giving each sample member a unique sample ID. The sample ID is used in place of name, address, or other personally identifiable information.
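The use of a sample ID in place of personally identifiable information can be illustrated with a minimal sketch. The class, function names, status value, and field names below are hypothetical and do not describe either subcontractor’s actual survey control system. The idea is that the crosswalk from sample ID to identity stays with the subcontractor, while delivered response records carry only the sample ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelMember:
    """Identifying details held only by the subcontractor (hypothetical fields)."""
    name: str
    email: str


def assign_sample_ids(members):
    """Assign sequential sample IDs and start a status log keyed by those IDs."""
    crosswalk = {sample_id: member for sample_id, member in enumerate(members, start=1)}
    statuses = {sample_id: "invited" for sample_id in crosswalk}
    return crosswalk, statuses


def deidentified_record(sample_id, answers):
    """Build a response record that carries the sample ID but no identifying fields."""
    return {"sample_id": sample_id, **answers}


# Hypothetical panel members and one hypothetical survey answer.
crosswalk, statuses = assign_sample_ids([
    PanelMember("Respondent A", "a@example.com"),
    PanelMember("Respondent B", "b@example.com"),
])
record = deidentified_record(1, {"restaurant_tip_percent": 15})
```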
For questions regarding the study or questionnaire design or statistical methodology, contact:
Brian K. Griepentrog, Ph.D.
Director of Research Studies
Fors Marsh Group LLC
Other individuals involved in the study design include:
John P. Vidmar, Ph.D.
Head of Ipsos Public Affairs, USA
Clifford Young, Ph.D.
Managing Director, Public Sector, Ipsos
1 Pearl, Robert B. and Seymour Sudman, A Survey Approach to Estimating the Tipping Practices of Consumers, Final Report to the Internal Revenue Service under Contract TIR 81-52 (June 1983); Pearl, Robert B., Tipping Practices of American Households: 1984, Final Report to the Internal Revenue Service under Contract 82-21 (July 1985).
2 Compiled from Form 941 population data for tax year (TY) 2010.
3 See OMB (2006). “Questions and Answers when Designing Surveys for Information Collections.” Pg. 16, Section 22: “An agency may also use a pilot study to examine potential methodological issues and decide upon a strategy for the main study.”
4 See Josh Pasek and Jon A. Krosnick (2010), “Measuring Intent to Participate and Participation in the 2010 Census and Their Correlates and Trends: Comparisons of RDD Telephone and Non-probability Sample Internet Survey Data”. Statistical Research Division, U.S. Census Bureau, Survey Methodology Report # 2010-15. Online at http://www.census.gov/srd/papers/pdf/ssm2010-15.pdf
File Modified | 2015-02-10 |
File Created | 2015-02-10 |