Behavioral Risk Factor Surveillance System
Comparability of Data BRFSS 2021
(August 2022)
Introduction
The Behavioral Risk Factor Surveillance System (BRFSS) is an ongoing, state-based, random-digit-dialed telephone survey of noninstitutionalized adults 18 years of age or older, residing
in the United States.1, 2 For detailed descriptions of the BRFSS questionnaires, data, and
reports, please see the BRFSS website. In 2021, all 50 states, the District of Columbia (DC),
Guam, the Commonwealth of Puerto Rico, and the US Virgin Islands conducted both
household landline telephone and cellular telephone interviews for the BRFSS data collection.
Florida was unable to collect BRFSS data over enough months to meet the minimum
requirements for inclusion in the 2021 annual aggregate data set.
The BRFSS data collection, structure, and weighting methodology changed in 2011 to allow the
addition of data collection by cellular telephones. The BRFSS survey uses a disproportionate stratified sample (DSS) design for landline telephone samples and a random sample design for the
cellular telephone survey. The BRFSS used iterative proportional fitting (IPF)—also known as
raking—for weighting the 2021 BRFSS data. Because of sample design and the multiple
reporting areas, the BRFSS data showed some variation between states and territories for the
2021 data year. The following sections identify important similarities and differences between the 2021 data year and previous years.
A. 2021 Data Anomalies and Deviations from the Sampling Frame
The BRFSS state-based annual sample designs are fixed for the data collection year beginning
in January in all the states or territories. The samples are drawn quarterly and screened monthly
to provide a representative sample for monthly data collection. The intent of the monthly sample
is to use it for 1 month, but in most states, it took more than 1 month to complete data collection
using the monthly sample. In several instances, states used their monthly sample during a period
of several months. This deviation will disproportionately affect analyses based on monthly
(rather than annual) data. California continued to receive its landline sample quarterly rather
than monthly, allowing staff to keep their sample active across three or more months.
Several states conducted fewer than 12 monthly telephone samples for data collection during
the year. The following states did not collect 12 monthly landline samples: California, Florida,
Illinois, Kentucky, Mississippi, Nevada, North Carolina, North Dakota, Oklahoma, Oregon,
Pennsylvania, Utah, Virginia, Puerto Rico, and Virgin Islands.
The following states did not collect 12 monthly cellphone samples: California, Florida, Illinois,
Iowa, Kansas, Kentucky, Louisiana, Mississippi, Nevada, North Carolina, North Dakota,
Oklahoma, Pennsylvania, Utah, Virginia, Puerto Rico, and Virgin Islands.
Thirty-three states, Guam, Puerto Rico, and US Virgin Islands were unable to close out their 2021 sample by December 31, 2021, and continued data collection into early 2022.
California, Connecticut, District of Columbia, Georgia, Idaho, Illinois, Indiana, Michigan,
Mississippi, New Jersey, North Dakota, Pennsylvania, and Tennessee began data collection in
February. Kentucky, Louisiana, Maine, and Virginia began data collection in March. US Virgin
Islands began data collection in May. Puerto Rico began data collection in June.
The months of data collection missed in each situation will likely affect seasonal estimates (e.g., influenza vaccination). Although 49 states, the District of Columbia, Guam, Puerto Rico, and US
Virgin Islands met the minimum requirements to be included in the public-use data set for 2021,
please consider the differences in collection when comparing estimates across years.
As mentioned above, Florida was unable to collect BRFSS data over enough months to meet the minimum requirements for inclusion in the 2021 annual aggregate data set. Please note that Florida did collect data in 2021, which may be considered a point-in-time sampling of the state, with data collected from November 2021 into January 2022. The data collected in 2021 by Florida are available by request from the Florida Department of Health BRFSS webpage: https://www.floridahealth.gov/statistics-and-data/survey-data/behavioral-risk-factor-surveillance-system/index.html. Requestors will need to scroll down to the “What can the DOH provide?” box, fill out the linked data request form, and send it to the BRFSS feedback box at: [email protected].
B. Protocol Changes for 2021 Data Collection
1. Cellular Telephone Data
Telephone coverage varies by state and also by subpopulation. According to the 2019 American
Community Survey (ACS), 99% of all occupied housing units in the United States had telephone
service available and telephone noncoverage ranged from less than 1% in several states to 1.6%
in South Dakota. It was estimated that 3.5% of occupied households in Puerto Rico did not have
telephone service.3 The percentage of households using only cellular telephones has been steadily
increasing—69.0% of all adults lived in households with only cellular telephones in 2021.4 The
increased use of cellular telephones required the BRFSS to begin to include the population of
cellular telephone users in 2011. At that time, adult cellular telephone respondents who also had a landline telephone were not eligible for the survey. In 2012, the BRFSS changed the screening
process. Cellular telephone respondents were eligible—even if they had landline phones—as long
as they received at least 90% of all calls on their cell phones. Beginning in 2014, all adults
contacted through their personal (nonbusiness) phone numbers were eligible regardless of their
landline phone use (i.e., complete overlap).
2. Weighting Methodologies
Since 2011, the BRFSS has used the weighting methodology called iterative proportional fitting (IPF), or raking, to weight data. Raking allows incorporation of cellular telephone survey data,
and it permits the introduction of additional demographic characteristics that more accurately
match sample distributions to known demographic characteristics of populations at the state
level. (Refer to the BRFSS website for more information on methodologic changes). Raking
adjusts the estimates within each state using the margins (raking control variables). The raking
method applies a proportional adjustment to the weights of the cases that belong to the same
category of the margin. The iterations (up to 100) continue until the weights converge to within a target percentage difference. Since 2013, up to 16 raking margins have been used in
the following order—county by gender, county by age, county by race or ethnicity, county,
region by race or ethnicity, region by gender, region by age, region, telephone service (landline,
cellular telephone or dual user), age by race or ethnicity, gender by race or ethnicity, tenure (rent
or own), marital status, education, race or ethnicity, and gender by age.
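To make the raking step concrete, the following minimal Python sketch implements a two-margin IPF loop. It is an illustration only: the margin names, target totals, tolerance, and toy sample are invented for this example, and the production BRFSS run uses up to 16 margins in the order listed above.

import pandas as pd

def rake(df, weight_col, margins, max_iter=100, tol=0.001):
    """Iterative proportional fitting (raking) over categorical margins.

    margins maps a column name to {category: target population total}.
    Each pass rescales weights so the weighted totals match each
    margin's targets; iteration stops once every adjustment factor is
    within tol of 1.0, or after max_iter passes.
    """
    w = df[weight_col].astype(float).copy()
    for _ in range(max_iter):
        max_dev = 0.0
        for col, targets in margins.items():
            current = w.groupby(df[col]).sum()  # weighted totals per category
            for cat, target in targets.items():
                factor = target / current[cat]  # proportional adjustment
                w[df[col] == cat] *= factor
                max_dev = max(max_dev, abs(factor - 1.0))
        if max_dev < tol:  # converged on all margins
            break
    return w

# Hypothetical toy sample of six respondents with design weights.
sample = pd.DataFrame({
    "sex": ["M", "F", "F", "M", "F", "M"],
    "phone": ["cell", "cell", "landline", "dual", "landline", "cell"],
    "design_wt": [100.0, 120.0, 90.0, 110.0, 80.0, 95.0],
})
targets = {
    "sex": {"M": 320.0, "F": 280.0},  # illustrative control totals
    "phone": {"cell": 330.0, "landline": 160.0, "dual": 110.0},
}
sample["raked_wt"] = rake(sample, "design_wt", targets)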
Since 2014, the inclusion of all adult cellular telephone respondents in the survey required an
adjustment to the design weights to account for the overlapping sample frames. A compositing
factor was calculated from each of the two samples (landline and cellular sample) for dual
users—individuals who had both cellular telephone and landline phone. The BRFSS multiplied
the design weight by the compositing factor to generate a composite weight for the records in the
overlapping sample frame. Later the design weight was truncated based on quartiles within
geographic region (or state). In 2021, the truncated weight was adjusted to regional (or state)
population and the state phone source proportions prior to raking. This adjusted weight was used
as the input weight for the first raking margin. At the last step of the raking process, weight
trimming was used to increase the value of extremely low weights and decrease the value of
extremely high weights. Weight trimming is based on two alternative methods, IGCV (Individual
and Global Cap Value) and MCV (Margin Cap Value).
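The compositing step can be sketched as follows. This is a simplified illustration, assuming the compositing factor for each frame is that frame's share of the total dual-user design weight; consult the BRFSS weighting documentation for the exact formula, and note that the truncation and IGCV/MCV trimming steps are not shown.

import pandas as pd

def composite_weights(df):
    """Blend landline and cell design weights for dual users (sketch).

    Expected columns (hypothetical names): 'frame' ('landline' or
    'cell'), 'dual' (True if the respondent has both phone types),
    and 'design_wt'. Single-frame respondents keep their design
    weight; each dual user's weight is scaled by that frame's share
    of the dual-user weight total, so the frame overlap is not
    double counted.
    """
    dual = df[df["dual"]]
    totals = dual.groupby("frame")["design_wt"].sum()
    share = totals / totals.sum()  # compositing factor per frame
    out = df["design_wt"].copy()
    mask = df["dual"]
    out[mask] = df.loc[mask, "design_wt"] * df.loc[mask, "frame"].map(share)
    return out

# Toy example: two dual users, one from each frame.
df = pd.DataFrame({
    "frame": ["landline", "cell", "cell", "landline"],
    "dual": [True, True, False, False],
    "design_wt": [200.0, 300.0, 150.0, 250.0],
})
df["composite_wt"] = composite_weights(df)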
3. Other Issues
As in previous years, the data from an optional module were included if interviewers asked
module questions to all eligible respondents within a state for the entire data collection year. If a state indicated the use of an optional module but did not administer it for the entire data collection year, the data were moved into the state-added questions section. Several
states collected data with optional modules by landline telephone and cellular telephone
surveys.
CDC has also provided limited technical support for the survey data collection of multiple (up
to three in 2021) questionnaire versions. A state may ask a subset of its survey sample a
different set of questions following the core, as long as the survey meets the minimum effective
sample size (2,500 participants) for a given questionnaire version. States must use the core
instrument without making any changes to it in any of their versions of the overall
questionnaire. States can include an optional module on all versions or exclusively on a single
version but, once a state chooses to use an optional module, the state must ask the module
questions throughout the data collection year. The objective of the multiple-version
questionnaire is to ask more questions, on additional topics, within a statewide sample. In 2021,
14 states conducted multiple-questionnaire-version surveys on both their landline telephone and
cellular telephone surveys. Data users can find version-specific data sets and additional
documentation regarding module data analysis in the 2021 BRFSS Survey Data and
Documentation.
A 2012 change to the final disposition code assignment rules modified the requirements for a
partially complete interview. If a participant terminated an interview during or after the
demographics section, the BRFSS coded it as a partial-complete. The coding of questions was
discontinued at the point of interview termination. When determining which records to include
in any analysis, data users should account for participants’ missing and refused values.
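For example, BRFSS codebooks conventionally use 7 for “Don't know/Not sure” and 9 for “Refused” on one-digit response fields (77 and 99 on two-digit fields). A minimal sketch of setting those codes to missing before analysis, with the exact codes to be confirmed against the 2021 codebook:

import numpy as np
import pandas as pd

def recode_missing(series, dk, refused):
    """Set don't-know and refused response codes to missing."""
    return series.replace({dk: np.nan, refused: np.nan})

# GENHLTH (general health) uses codes 1-5 plus 7/9 in BRFSS codebooks.
df = pd.DataFrame({"GENHLTH": [1, 3, 7, 2, 9, 5]})  # toy responses
df["GENHLTH"] = recode_missing(df["GENHLTH"], dk=7, refused=9)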
Beginning in 2015, questions in the demographic section were reordered and the definition of a partial-complete changed. A partially complete disposition code in 2021 was assigned if the interview terminated before completion of the survey and the selected respondent had completed the demographics section through question 12, for both cell phone and landline interviews.
More information about survey item nonresponse can be found in the 2021 BRFSS
Summary Data Quality Report and in the respective states’ Data Quality Reports.
C. Statistical and Analytic Issues
1. Analysis Procedures
To use the BRFSS data, the researcher needs to formulate a research question, review the
existing data tabulations, develop an analytic plan, conduct the analyses, and use data for
decision making.5 Unweighted BRFSS data represent the actual responses of each respondent
before any adjustment is made for variation in the respondents’ probability of selection,
disproportionate selection of population subgroups relative to the state’s population
distribution, or nonresponse. Weighted BRFSS data represent results that have been adjusted
to compensate for these issues. Regardless of state sample design, use of the weight in
analysis is necessary if generalizations are to be made from the sample to the population.
Please note the statistical and analytic issues described in this section are the same as those of
previous years.
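As a minimal illustration of why the weight matters, the Python sketch below contrasts an unweighted and a weighted prevalence estimate. The risk-factor values and weights are invented; _LLCPWT is the final weight variable on the BRFSS public-use files.

import numpy as np
import pandas as pd

# Toy data: a risk-factor indicator (1 = yes, 0 = no) and the final
# survey weight (_LLCPWT on the BRFSS public-use files).
df = pd.DataFrame({
    "risk": [1, 0, 0, 1, 0, 1, 0, 0],
    "_LLCPWT": [150.0, 900.0, 420.0, 300.0, 680.0, 210.0, 500.0, 840.0],
})

unweighted = df["risk"].mean()  # share of respondents, not of the population
weighted = np.average(df["risk"], weights=df["_LLCPWT"])  # population estimate
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")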
2. Statistical Issues
The procedures for estimating variances described in most statistical texts and used in most
statistical software packages are based on the assumption of simple random sampling (SRS).
The data collected in the BRFSS, however, are obtained through a complex sample design;
therefore, the direct application of standard statistical analysis methods for variance estimation
and hypothesis testing may yield misleading results. There are computer programs available
that take such complex sample designs into account: SAS Version 9.4 SURVEYMEANS and
SURVEYREG procedures, SUDAAN, and Epi Info’s C-Sample are among those suitable for
analyzing BRFSS data.6,7,8 SAS and SUDAAN can be used for tabular and regression
analyses.6,7 Epi Info’s C-sample can be used to calculate simple frequencies and two-way
cross-tabulations.8 When using these software products, users must know the stratum, the
primary sampling units, and the record weight—all of which are on the public use data file. For
more information on calculating variance estimations using SAS, see the SAS/STAT® 13.1
User’s Guide.6 For information about SUDAAN, see the SUDAAN Language Manual, Release
117, and to find more about Epi Info, see Epi Info, Version 7.0.8
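For readers who want to see what those packages do under the hood, the Python sketch below computes a simplified Taylor-linearized (with-replacement, between-PSU) variance for a weighted mean from the stratum (_STSTR), primary sampling unit (_PSU), and weight variables on the public-use file. It is a teaching sketch, not a substitute for SAS, SUDAAN, or Epi Info.

import pandas as pd

def linearized_variance(df, y, w, stratum, psu):
    """Taylor-linearized variance of a weighted mean (sketch).

    Linearizes the ratio mean, totals the scores within each PSU,
    then pools the between-PSU variance within strata. Strata with a
    single PSU contribute nothing here; production software treats
    them more carefully.
    """
    wsum = df[w].sum()
    mean = (df[w] * df[y]).sum() / wsum
    z = df[w] * (df[y] - mean) / wsum  # linearized scores
    psu_totals = z.groupby([df[stratum], df[psu]]).sum()
    var = 0.0
    for _, totals in psu_totals.groupby(level=0):  # loop over strata
        n_h = len(totals)
        if n_h > 1:
            var += n_h / (n_h - 1) * ((totals - totals.mean()) ** 2).sum()
    return mean, var

# Usage (assuming a BRFSS extract in df):
#   est, v = linearized_variance(df, "risk", "_LLCPWT", "_STSTR", "_PSU")
#   se = v ** 0.5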
Although the overall number of respondents in the BRFSS is more than sufficiently large for
statistical inference purposes, subgroup analyses can lead to estimates that are unreliable.
Consequently, users need to pay particular attention to the subgroup sample when analyzing
subgroup data, especially within a single data year or geographic area.
Small sample sizes may produce unstable estimates. Reliability of an estimate depends on the
actual unweighted number of respondents in a category, not on the weighted number.
Interpreting and reporting weighted numbers that are based on a small, unweighted number of
respondents can mislead the reader into believing that a given finding is much more precise
than it actually is. The BRFSS previously followed a rule of not reporting or interpreting percentages based on a denominator of fewer than 50 respondents (unweighted sample) or on estimates for which the half-width of the 95% confidence interval was greater than 10.
In 2011, the BRFSS replaced the confidence interval limitation with the relative standard error (RSE), which is the standard error divided by the mean. A lower RSE indicates a more precise estimate because there is less variance around the mean. The BRFSS did not report percentage estimates where the RSE was greater than 30% or the denominator represented fewer than 50 respondents from an unweighted sample. Details of changes
beginning with the 2011 BRFSS are available in the Morbidity and Mortality Weekly Report
(MMWR), which highlights weighting and coverage effects on trend lines.9 Because of the
changes in the methodology, researchers are advised to avoid comparing data collected before
the changes (up to 2010) with data collected from 2011 and onward.
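The reporting rule translates directly into code. A minimal sketch, assuming the estimate, its standard error, and the unweighted denominator have already been computed:

def suppress(estimate, std_err, n_unweighted):
    """Apply the BRFSS-style rule: suppress estimates with an RSE
    above 30% or an unweighted denominator under 50 respondents."""
    rse = std_err / estimate if estimate else float("inf")
    return None if rse > 0.30 or n_unweighted < 50 else estimate

print(suppress(0.12, 0.05, 200))  # RSE ~ 0.42 -> suppressed (None)
print(suppress(0.25, 0.02, 180))  # RSE = 0.08 -> 0.25 is reported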
3. Analytic Issues
a. Advantages and Disadvantages of Telephone Surveys
Compared with face-to-face interviewing techniques, telephone interviews are easy to conduct
and monitor and are cost efficient; however, telephone interviews have limitations. Telephone
surveys may have higher levels of noncoverage than face-to-face interviews because
interviewers may not be able to reach some US households by telephone. As mentioned earlier,
approximately 99% of households in the United States have telephones.3 A number of studies
have shown that the telephone and non-telephone populations are different with respect to
demographic, economic, and health characteristics.10,11,12 Although the estimates of
characteristics for the total population are unlikely to be substantially affected by the omission
of the households without telephones, some of the subpopulation estimates could be biased.
Telephone coverage is lower for population subgroups such as people with low incomes,
people in rural areas, people with less than 12 years of education, people in poor health, and
heads of households younger than 25 years of age.13 Raking adjustments for age, race, sex, and additional demographic variables, however, minimize the impact of noncoverage, undercoverage, and nonresponse at the state level.
Surveys based on self-reported information may be less accurate than those based on physical
measurements. For example, respondents are known to underreport body weight and risky
health behaviors, such as alcohol intake and smoking. This type of potential bias arises in both telephone and face-to-face interviews; when interpreting self-reported data, data users should take the potential for underreporting into consideration.
Despite the above limitations, the BRFSS data are reliable and valid.14 The prevalence
estimates from the BRFSS correspond well with findings from surveys based on face-to-face
interviews, including the National Health Interview Survey (NHIS), and the National Health
and Nutrition Examination Survey (NHANES).15 Please visit the BRFSS website for more
information about methodological studies.
b. New Calculated Variables and Risk Factors
Not all of the variables that appear on the public use data set are taken directly from the state
files. CDC prepares a set of SAS programs that are used for end-of-year processing. These
programs prepare the data for analysis and add weighting, sample design, calculated variables,
and risk factors to the data set. The following calculated variables and risk factors, which the
BRFSS has created for the user’s convenience, are examples of results from this procedure for
2021 data:
_TOTINDA, _PNEUMO3, _RFBING5, _RFSMOK3, _RFHLTH, _CASTHM1, _RFMAM22
The procedures for calculating the variables vary in complexity. Some only combine codes,
while others require sorting and combining selected codes from multiple variables. This may
result in the calculation of an intermediate variable. For more information regarding the
calculated variables and risk factors, refer to the document entitled Calculated Variables in the
2021 Data File of the Behavioral Risk Factor Surveillance System, found in the 2021 BRFSS
Survey Data and Documentation section of the BRFSS website.
Two calculated variables (_METSTAT, _URBSTAT) have been included based on the 2013
NCHS urban–rural classification scheme for counties.16 The two variables identify
metropolitan status versus nonmetropolitan or urban versus rural within a given state. Three
states had a single county in a nonmetropolitan or rural category, thus requiring a recode of the
value to an adjacent category as a disclosure-avoidance measure. The definitions below show
the categorization of the two variables based on the sub-setting of the original six categories.
_METSTAT:
1 = _URBNRRL IN (1,2,3,4) = Metropolitan counties
2 = _URBNRRL IN (5,6) = Nonmetropolitan counties
_URBSTAT:
1 = _URBNRRL IN (1,2,3,4,5) = Urban counties
2 = _URBNRRL IN (6) = Rural counties
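These definitions translate directly into code. A sketch in Python (the production derivation is part of CDC's SAS end-of-year processing programs):

import numpy as np
import pandas as pd

def derive_geo_status(urbnrrl):
    """Map the six-level _URBNRRL code to _METSTAT and _URBSTAT."""
    metstat = np.where(urbnrrl.isin([1, 2, 3, 4]), 1, 2)  # 1 = metropolitan
    urbstat = np.where(urbnrrl.isin([1, 2, 3, 4, 5]), 1, 2)  # 1 = urban
    return metstat, urbstat

df = pd.DataFrame({"_URBNRRL": [1, 4, 5, 6, 3]})  # toy codes
df["_METSTAT"], df["_URBSTAT"] = derive_geo_status(df["_URBNRRL"])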
c. Specific Data Use Concerns
The data are reviewed by the CDC programs sponsoring content prior to public release, and a concern raised by one program about reporting the 2021 data is included below.
An e-cigarette question was included in the core questionnaire for 2021. In previous years the e-cigarette questions were collected as part of an optional module, and two questions were asked to establish use of e-cigarettes. The opportunity for inclusion in the 2021 core allowed space for only one question, and that single question does not provide a clear method to establish whether a respondent has ever used e-cigarettes. So that data users avoid drawing incorrect conclusions about “ever using” e-cigarettes, the program recommends collapsing response categories 3 (Not at all) and 4 (Never used e-cig) into one category, “Not at all/Never used e-cig,” for reporting in 2021. A calculated variable (_CURECI1) for the e-cigarette question includes the collapsed categories for reporting in 2021.
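A minimal sketch of the recommended collapse, assuming the response codes named above (the released _CURECI1 calculated variable already encodes this, so most users can rely on it directly):

import pandas as pd

# Toy responses to the 2021 core e-cigarette question. Codes 3
# ("Not at all") and 4 ("Never used e-cig") come from the text above;
# codes 1 and 2 (current use) are assumed labels for illustration.
ecig = pd.Series([1, 3, 4, 2, 4, 3])

collapsed = ecig.replace({4: 3})  # merge "Never used" into "Not at all"
labels = {1: "Every day", 2: "Some days", 3: "Not at all/Never used e-cig"}
print(collapsed.map(labels).value_counts())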
References
1. Mokdad AH, Stroup DF, Giles WH. Public health surveillance for behavioral risk factors
in a changing environment: recommendations from the Behavioral Risk Factor
Surveillance team. MMWR Recomm Rep. 2003;52(RR-9):1-12.
2. Holtzman D. The Behavioral Risk Factor Surveillance System. In: Blumenthal DS,
DiClemente RJ, eds. Community-Based Health Research: Issues and Methods. New
York, NY: Springer Publishing Company Inc; 2004:115-131.
3. Federal Communications Commission. Universal Service Monitoring Report. 2021. https://www.fcc.gov/document/oea-releases-2021-universal-service-monitoring-report Accessed August 2022.
4. Blumberg SJ, Luke JV. Wireless substitution: Early release of estimates from the National
Health Interview Survey, July–December 2021. National Center for Health Statistics. May 2022. Available from: https://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless202205.pdf
Accessed August 2022.
5. Frazier EL, Franks AL, Sanderson LM, Centers for Disease Control and Prevention.
Behavioral risk factor data. In: Using Chronic Disease Data: A Handbook for Public
Health Practitioners. Atlanta, GA: Centers for Disease Control and Prevention, US Dept of Health and Human Services; 1992.
6. SAS Institute Inc. 2013, SAS/STAT® 13.1 User’s Guide. Cary, NC: SAS Institute, Inc.
7. Research Triangle Institute. SUDAAN Language Manual, Volumes 1 and 2, Release 11. Research Triangle Park, NC: Research Triangle Institute; 2012.
8. Dean AG, Arner TG, Sunki GG, et al. Epi Info™, a database and statistics program for
public health professionals. Atlanta, GA: Centers for Disease Control and Prevention, US Dept of Health and Human Services; 2011.
9. Pierannunzi C, Town M, Garvin W, et al. Methodologic changes in the Behavioral Risk
Factor Surveillance System in 2011 and potential effects on prevalence estimates.
MMWR Morb Mortal Wkly Rep. 2012;61(22):410-413.
www.cdc.gov/mmwr/preview/mmwrhtml/mm6122a3.htm Accessed 11 August 2022.
10. Groves RM, Kahn RL. Surveys by Telephone: A National Comparison with Personal
Interviews, New York, NY: Academic Press; 1979.
11. Banks MJ. Comparing health and medical care estimates of the phone and nonphone
populations. In: Proceedings of the Section on Survey Research Methods. American
Statistical Association. 1983:569-574.
12. Thornberry OT, Massey JT. Trends in United States telephone coverage across time
and subgroups. In: Groves RM, et al, eds. Telephone Survey Methodology. New York,
NY: John Wiley & Sons; 1988:25-49.
13. Massey JT, Botman SL. Weighting adjustments for random digit dialed surveys. In:
Groves RM, et al, eds. Telephone Survey Methodology. New York, NY: John Wiley &
Sons; 1988:143-160.
14. Pierannunzi C, Hu SS, Balluz L. A systematic review of publications assessing
reliability and validity of the Behavioral Risk Factor Surveillance System (BRFSS),
2004–2011. BMC Med Res Methodol. 2013;13:49.
15. Li C, Balluz L, Ford ES, et al. A comparison of prevalence estimates for selected health
indicators and chronic diseases or conditions from the Behavioral Risk Factor
Surveillance System, the National Health Interview Survey, and the National Health
and Nutrition Examination Survey, 2007-2008. Prev Med. 2012;54(6):381-387.
16. Ingram DD, Franco SJ. 2013 NCHS Urban-Rural Classification Scheme for Counties.
National Center for Health Statistics. Vital Health Stat. 2014;2(166):1-73.