SUPPORTING STATEMENT
FOR
P.L. 89-663, Title 1, Section 106, 108, 112. - COLLECTION OF CRASH DATA
OMB Control Number: None
Describe the potential respondent universe and any sampling or other respondent selection methods to be used.
The purpose of the Crash Report Sampling System (CRSS) is to provide annual, nationally representative estimates of the number, types, and characteristics of police-reported motor vehicle crashes. The police crash report (PAR) is the sole source of data for CRSS. The CRSS universe, or sample frame, is the set of police-reported motor vehicle crashes on a trafficway (strata 2 – 10 of Table 1).
Table 1 - CRSS Analysis Strata, Target Sample Allocation, and Population Sizes
CRSS Analysis Stratum | Analysis Stratum Description | Target Percent of Sample Allocation | Estimated Population (GES 2011) | Population Percent |
1 | An in-scope Not-in-Traffic Surveillance (NiTS) crash (take all)*. | | | |
2 | Crashes not in Stratum 1 in which: Involves a killed or injured (includes injury severity unknown) non-motorist | 9% | 119,579 | 2.2% |
3 | Crashes not in Stratum 1 or 2 in which: Involves a killed or injured (includes injury severity unknown) motorcycle or moped rider | 6% | 76,513 | 1.4% |
4 | Crashes not in Stratum 1-3 in which: At least one occupant of a late model year** passenger vehicle is killed or incapacitated | 4% | 22,272 | 0.42% |
5 | Crashes not in Stratum 1-4 in which: At least one occupant of an older** passenger vehicle is killed or incapacitated | 7% | 84,659 | 1.6% |
6 | Crashes not in Stratum 1-5 in which: at least one occupant of a late model year passenger vehicle is injured (including injury severity unknown) | 14% | 330,619 | 6.2% |
7 | Crashes not in Stratum 1-6 in which: involved at least one medium or heavy truck or bus (includes school bus, transit bus, and motor coach) with GVWR 10,000 lbs. or more | 6% | 302,781 | 5.7% |
8 | Crashes not in Stratum 1-7 in which: at least one occupant of an older passenger vehicle is injured (including injury severity unknown) | 12% | 800,390 | 15.0% |
9 | Crashes not in Stratum 1-8 in which: involved at least one late model year passenger vehicle, AND No person in the crash is killed or injured | 22% | 1,511,371 | 28.4% |
10 | Crashes not in Stratum 1-9: * This includes mostly PDO crashes involving a non-motorist, MC, moped, and passenger vehicles that are not late model year and any crashes not classified in strata 1-9. | 20% | 2,078,263 | 39.0% |
*: NiTS cases are not in the scope of CRSS. They are set aside for NiTS analysis.
**: Late model year passenger vehicle: a passenger vehicle that is ≤4 years old.
Older passenger vehicle: a passenger vehicle that is 5 years old or older.
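The strata in Table 1 are hierarchical: a crash belongs to the first stratum whose condition it meets. The following Python sketch illustrates that first-match logic. The `Person` and `Crash` records and their field codes are hypothetical stand-ins for PAR data, and treating unknown injury severity as an injury when testing stratum 9 is an assumption carried over from the wording of strata 2, 3, 6, and 8.

```python
from dataclasses import dataclass, field
from typing import List, Optional

LATE_MODEL_MAX_AGE = 4  # Table 1 note: late model year means <= 4 years old
HURT = {"killed", "incapacitated", "injured", "unknown"}  # assumption: unknown counts as injured
SEVERE = {"killed", "incapacitated"}


@dataclass
class Person:
    role: str       # hypothetical codes: "non_motorist", "motorcyclist", "pv_occupant", "other"
    severity: str   # hypothetical codes: "killed", "incapacitated", "injured", "unknown", "none"
    vehicle_age: Optional[int] = None  # years; used only for passenger-vehicle occupants


@dataclass
class Crash:
    persons: List[Person] = field(default_factory=list)
    has_truck_or_bus: bool = False    # medium/heavy truck or bus, GVWR 10,000 lbs. or more
    has_late_model_pv: bool = False   # at least one late model year passenger vehicle


def _pv_occupant_hit(crash: Crash, late_model: bool, severities: set) -> bool:
    """True if an occupant of a passenger vehicle in the given age class has one of the severities."""
    for p in crash.persons:
        if p.role != "pv_occupant" or p.vehicle_age is None:
            continue
        if (p.vehicle_age <= LATE_MODEL_MAX_AGE) == late_model and p.severity in severities:
            return True
    return False


def analysis_stratum(crash: Crash) -> int:
    """Return the first matching stratum (2-10). NiTS crashes (stratum 1) are assumed removed."""
    if any(p.role == "non_motorist" and p.severity in HURT for p in crash.persons):
        return 2
    if any(p.role == "motorcyclist" and p.severity in HURT for p in crash.persons):
        return 3
    if _pv_occupant_hit(crash, True, SEVERE):
        return 4
    if _pv_occupant_hit(crash, False, SEVERE):
        return 5
    if _pv_occupant_hit(crash, True, HURT):
        return 6
    if crash.has_truck_or_bus:
        return 7
    if _pv_occupant_hit(crash, False, HURT):
        return 8
    if crash.has_late_model_pv and not any(p.severity in HURT for p in crash.persons):
        return 9
    return 10
```

Because the checks run in stratum order, a crash with an injured pedestrian and a killed passenger-vehicle occupant lands in stratum 2, not stratum 4.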
The estimated CRSS population size (strata 2 – 10 of Table 1) is about 5.3 million crashes a year. CRSS samples this population through a stratified multi-stage cluster scheme as follows:
PSU Sample Selection
Divide the country into geographic units called Primary Sampling Units (PSUs). A PSU is a county or group of counties. PSUs were formed as groups of adjacent counties subject to a minimum measure of size (MOS) condition, so that enough cases can be sampled from each PSU and the weights are approximately equal. The CRSS PSU MOS was defined as:
$$MOS_i = \sum_{s} \frac{n_s}{n} \cdot \frac{N_{is}}{N_s}$$
Here
$s$ = the PAR strata defined in Table 1.
$n$ = the desired total sample size of PARs
$n_s$ = the desired sample size of PARs in the PAR stratum $s$
$N_s$ = the estimated population counts in the PAR stratum $s$
$N_{is}$ = the estimated population counts in the PAR stratum $s$ and PSU $i$.
In the formula, $n_s / n$ is the desired PAR strata sample allocation (the “Target Percent of Sample Allocation” column in Table 1), and $N_{is} / N_s$ is the relative estimated population count of PSU $i$ for PAR stratum $s$. In this way, a PSU that holds a larger share of the crashes in the more heavily sampled PAR strata has a larger MOS.
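As an illustration of the composite MOS, the sketch below weights a unit's share of each PAR stratum by the target allocation from Table 1. The allocation and population figures come from Table 1; the crash counts for `psu_a` are made-up values, and the function name `composite_mos` is an assumption of this example.

```python
# Target allocation (share of the PAR sample) and estimated population from Table 1, strata 2-10.
TARGET_ALLOC = {2: 0.09, 3: 0.06, 4: 0.04, 5: 0.07, 6: 0.14, 7: 0.06, 8: 0.12, 9: 0.22, 10: 0.20}
POPULATION = {2: 119_579, 3: 76_513, 4: 22_272, 5: 84_659, 6: 330_619,
              7: 302_781, 8: 800_390, 9: 1_511_371, 10: 2_078_263}


def composite_mos(unit_counts, alloc=TARGET_ALLOC, population=POPULATION):
    """Sum over PAR strata of (target allocation) x (unit's share of the stratum population)."""
    return sum(alloc[s] * unit_counts.get(s, 0) / population[s] for s in alloc)


# Hypothetical crash counts by PAR stratum for one PSU (illustrative values only).
psu_a = {2: 400, 3: 150, 4: 60, 5: 200, 6: 900, 7: 700, 8: 2_000, 9: 4_000, 10: 5_000}
# Everything else in the country, treated as a single unit for the check below.
rest = {s: POPULATION[s] - psu_a[s] for s in POPULATION}

mos_a = composite_mos(psu_a)
mos_rest = composite_mos(rest)
```

Because the allocation shares sum to 1, the MOS values of any set of units that partitions the population also sum to 1, which is a convenient sanity check.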
PSU formation respects Census region and urbanicity boundaries. Some outlying areas of Alaska and small islands of Hawaii were excluded. There are a total of 707 CRSS PSUs in the PSU frame.
The PSU frame was then stratified into 8 primary PSU strata by two variables: region (Northeast, West, South, and Midwest) and urbanicity (urban and rural). Within each primary stratum, PSUs were further stratified by secondary stratification variables such as vehicle miles traveled, crash rate, truck miles traveled, and crash rate by road type. PSUs with similar characteristics were grouped into secondary strata with approximately equal MOS sizes and minimum within-stratum variances. As a result, a total of 50 PSU strata were formed.
Table 2: CRSS PSU Strata, PSU Population Counts, and Sample Size
PRIMARY STRATA | STRATID | VMT_RATE_IMP Upper | VMT_RATE_IMP Lower | TOT_CRASH_RATE Upper | TOT_CRASH_RATE Lower | TRK_MI_RATE Upper | TRK_MI_RATE Lower | ROAD_TYPE_RATE Upper | ROAD_TYPE_RATE Lower | Number of PSUs | PSU Sample Size |
1 | 101 | 1800.660 | 0.000 | | | | | 358.504 | 0.000 | 5 | 2 |
1 | 102 | 4064.065 | 1800.660 | | | | | 358.504 | 0.000 | 5 | 2 |
1 | 103 | 7159.044 | 4064.065 | | | | | 358.504 | 0.000 | 8 | 2 |
1 | 104 | 5791.034 | 0.000 | 0.028 | 0.000 | 153756.114 | 0.000 | 2175.024 | 358.504 | 6 | 2 |
1 | 105 | 8040.031 | 5791.034 | 0.028 | 0.000 | 153756.114 | 0.000 | 2175.024 | 358.504 | 7 | 2 |
1 | 106 | | | 0.028 | 0.000 | 249917.616 | 153756.114 | 2175.024 | 358.504 | 7 | 2 |
1 | 107 | | | 0.028 | 0.000 | 591240.550 | 249917.616 | 2175.024 | 358.504 | 7 | 2 |
1 | 108 | | | 0.039 | 0.028 | | | 2175.024 | 358.504 | 11 | 2 |
2 | 201 | | | | | 236700.660 | 0.000 | | | 22 | 2 |
2 | 202 | | | | | 1027525.695 | 236700.660 | | | 22 | 2 |
3 | 301 | 4134.622 | 0.000 | | | 45708.732 | 0.000 | | | 3 | 2 |
3 | 302 | 7465.060 | 4134.622 | | | 45708.732 | 0.000 | | | 8 | 2 |
3 | 303 | 9897.834 | 7465.060 | | | 45708.732 | 0.000 | | | 10 | 2 |
3 | 304 | | | | | 102553.858 | 45708.732 | | | 11 | 2 |
3 | 305 | 4443.529 | 0.000 | | | 339758.109 | 102553.858 | | | 13 | 2 |
3 | 306 | 6002.758 | 4443.529 | | | 339758.109 | 102553.858 | | | 11 | 2 |
3 | 307 | 11617.975 | 6002.758 | | | 339758.109 | 102553.858 | | | 10 | 2 |
4 | 401 | | | | | 66170.891 | 0.000 | 4344.584 | 0.000 | 28 | 2 |
4 | 402 | 6045.032 | 0.000 | | | 565024.725 | 66170.891 | 4344.584 | 0.000 | 27 | 2 |
4 | 403 | 11623.151 | 6045.032 | | | 565024.725 | 66170.891 | 4344.584 | 0.000 | 25 | 2 |
4 | 404 | | | | | | | 17641.397 | 4344.584 | 30 | 2 |
5 | 501 | 3619.866 | 0.000 | 0.048 | 0.000 | 125590.275 | 0.000 | | | 5 | 2 |
5 | 502 | 4529.728 | 3619.866 | 0.048 | 0.000 | 125590.275 | 0.000 | | | 8 | 2 |
5 | 503 | 4951.021 | 4529.728 | 0.048 | 0.000 | 125590.275 | 0.000 | | | 6 | 2 |
5 | 504 | 5016.203 | 4951.021 | 0.048 | 0.000 | 125590.275 | 0.000 | | | 3 | 2 |
5 | 505 | 5277.180 | 5016.203 | 0.048 | 0.000 | 125590.275 | 0.000 | | | 5 | 2 |
5 | 506 | 5745.576 | 5277.180 | 0.048 | 0.000 | 125590.275 | 0.000 | | | 6 | 2 |
5 | 507 | 6399.201 | 5745.576 | 0.048 | 0.000 | 125590.275 | 0.000 | | | 5 | 2 |
5 | 508 | 12826.284 | 6399.201 | 0.048 | 0.000 | 125590.275 | 0.000 | | | 8 | 2 |
5 | 509 | 5641.161 | 0.000 | 0.048 | 0.000 | 210430.008 | 125590.275 | | | 6 | 2 |
5 | 510 | 8347.640 | 5641.161 | 0.048 | 0.000 | 210430.008 | 125590.275 | | | 7 | 2 |
5 | 511 | 13892.020 | 8347.640 | 0.048 | 0.000 | 210430.008 | 125590.275 | | | 10 | 2 |
5 | 512 | | | 0.048 | 0.000 | 358684.226 | 210430.008 | | | 8 | 2 |
5 | 513 | | | 0.048 | 0.000 | 877545.577 | 358684.226 | | | 13 | 2 |
5 | 514 | | | 0.085 | 0.048 | | | | | 17 | 2 |
6 | 601 | | | | | 49853.675 | 0.000 | | | 35 | 2 |
6 | 602 | 6353.419 | 0.000 | | | 162415.067 | 49853.675 | | | 34 | 2 |
6 | 603 | 14414.922 | 6353.419 | | | 162415.067 | 49853.675 | | | 35 | 2 |
6 | 604 | | | | | 250189.692 | 162415.067 | | | 33 | 2 |
6 | 605 | 5693.121 | 0.000 | | | 1156242.173 | 250189.692 | | | 35 | 2 |
6 | 606 | 16270.533 | 5693.121 | | | 1156242.173 | 250189.692 | | | 35 | 2 |
7 | 700 | | | | | | | | | 1 | 1 |
7 | 701 | 6477.249 | 0.000 | 0.027 | 0.000 | 104521.554 | 0.000 | | | 7 | 2 |
7 | 702 | 6920.874 | 6477.249 | 0.027 | 0.000 | 104521.554 | 0.000 | | | 4 | 2 |
7 | 703 | 7860.805 | 6920.874 | 0.027 | 0.000 | 104521.554 | 0.000 | | | 5 | 2 |
7 | 704 | 5137.293 | 0.000 | 0.027 | 0.000 | 249358.333 | 104521.554 | | | 3 | 2 |
7 | 705 | 8069.657 | 5137.293 | 0.027 | 0.000 | 249358.333 | 104521.554 | | | 10 | 2 |
7 | 706 | | | 0.048 | 0.027 | 92715.811 | 0.000 | | | 9 | 2 |
7 | 707 | | | 0.048 | 0.027 | 186409.133 | 92715.811 | | | 7 | 2 |
8 | 801 | | | | | | | 3938.460 | 0.000 | 30 | 2 |
8 | 802 | | | | | | | 18292.498 | 3938.460 | 41 | 2 |
A major challenge of the CRSS sample design is the uncertainty over the future operational budget. Unknown future funding levels and the need for a stable PSU sample require NHTSA to select a scalable PSU sample, one in which the PSU sample size can be decreased or increased with minimal impact on the existing PSU sample and in which the selection probabilities can be tracked. To this end, a multi-phase sampling method was used to select the CRSS PSU sample as a sequence of nested PSU samples. In this method, a PSU sample larger than actually needed is selected as the first phase PSU sample. From the first phase PSU sample, a smaller subset is selected as the second phase PSU sample. From the second phase PSU sample, a still smaller third phase PSU sample is selected. This process continues until any further reduction would make the PSU sample size unacceptably small. In this way, a sequence of nested PSU samples is obtained. Each of these PSU samples is a probability sample and can be used for data collection (see Figure 1). According to the prevailing budget level, the sample with the appropriate sample size is picked from the nested sequence. This makes the selection probabilities easy to track and minimizes changes to the PSU sample.
Figure 1: Nested PSU Samples
For CRSS, five nested PSU samples were selected, one for each of five scenarios that differ in the number of PSU strata and the PSU sample size. Table 3 summarizes the CRSS PSU sample scenarios.
Table 3 – CRSS PSU Sample Scenarios: Number of Strata and Sample Size
Scenario | # of PSU Strata | # of Sampled Non-certainty PSU | # of Sampled Certainty PSU | Total # of Sampled PSU |
1 | 50 | 97 | 4 | 101 |
2 | 37 | 74 | 1 | 75 |
3 | 25 | 50 | 1 | 51 |
4 | 12 | 24 | 0 | 24 |
5 | 8 | 16 | 0 | 16 |
With a sample size of 100 and without stratification, one PSU was identified as a certainty PSU by the condition:
$$\frac{100 \cdot MOS_i}{\sum_{k=1}^{N} MOS_k} \ge 1$$
Here $N$ is the total number of PSUs in the PSU frame. This certainty PSU was set aside and selected with certainty. Then 2 PSUs were selected using probability proportional to size (PPS) sampling from each of the 50 scenario-1 strata. With a sample size of 2 for each PSU stratum, a total of 3 PSUs were identified as certainty PSUs from 3 of the 50 scenario-1 strata by the condition:
$$\frac{2 \cdot MOS_{hi}}{\sum_{k=1}^{N_h} MOS_{hk}} \ge 1$$
Here $N_h$ is the total number of PSUs in stratum $h$. The certainty PSUs were set aside and selected with certainty. The corresponding stratum PSU sample size was reduced by 1. Then a PPS sample of non-certainty PSUs was selected using the revised PSU stratum sample size.
The scenario-1 sample has a total of 101 PSUs. For a non-certainty PSU, the selection probability is:
$$\pi_{hi} = \frac{n_h \cdot MOS_{hi}}{\sum_{k} MOS_{hk}}$$
Here $n_h$ is the non-certainty PSU sample size for PSU stratum $h$, $MOS_{hi}$ is the MOS of PSU $i$ in stratum $h$, and the sum is over the non-certainty PSUs in stratum $h$.
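The certainty check and the PPS selection probabilities can be sketched in a few lines of Python. This is a minimal illustration, not NHTSA's production code: the function name `pps_probabilities` and the MOS values are hypothetical, and the check is repeated until no new certainty unit appears, which generalizes the single pass described above.

```python
def pps_probabilities(mos, n):
    """Return (certainty_units, selection_probabilities) for a PPS sample of size n.

    mos: dict mapping unit id -> measure of size (assumed positive).
    A unit is a certainty unit when n_remaining * MOS / (total remaining MOS) >= 1.
    """
    certainty = set()
    while True:
        remaining = {u: m for u, m in mos.items() if u not in certainty}
        n_rem = n - len(certainty)
        total = sum(remaining.values())
        if n_rem <= 0 or total <= 0:
            break
        new = {u for u, m in remaining.items() if n_rem * m / total >= 1}
        if not new:
            break
        certainty |= new

    probs = {}
    for u, m in mos.items():
        if u in certainty:
            probs[u] = 1.0
        elif n_rem > 0 and total > 0:
            probs[u] = n_rem * m / total  # PPS probability among non-certainty units
        else:
            probs[u] = 0.0
    return certainty, probs


# Hypothetical stratum with four PSUs and a sample size of 2.
certainty, probs = pps_probabilities({"A": 60, "B": 20, "C": 10, "D": 10}, n=2)
```

In this example PSU A exceeds the threshold (2 x 60 / 100 = 1.2), so it is taken with certainty and the one remaining draw is spread over B, C, and D in proportion to their MOS.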
For scenario-2, with a sample size of 74 and without stratification, one PSU was identified as a certainty PSU and was set aside. Then 13 of the scenario-1 strata were collapsed with other strata to form a total of 37 scenario-2 strata. The collapsing of strata follows these rules:
Only the secondary strata in the same primary stratum can be collapsed;
Only the contiguous secondary strata can be collapsed;
The resulting strata have similar stratum total MOS within each primary stratum.
In each scenario-2 stratum, the sampled scenario-1 PSUs were treated as the sampling frame. Each PSU was assigned a new MOS equal to its scenario-1 stratum total MOS. Then 2 PSUs were selected from each scenario-2 stratum with PPS sampling using the new MOS. In this way, the resulting selection probability of a scenario-2 PSU is still a PPS selection probability.
The other scenario samples were selected in a similar way.
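A small numeric check shows why the two-phase probability stays proportional to the original MOS. The sketch assumes no certainty PSUs, exactly 2 sampled PSUs in every scenario-1 stratum being collapsed, and the usual fixed-size PPS probability formula; the function name and the MOS totals are hypothetical.

```python
def two_phase_probability(mos_i, stratum_total, collapsed_totals, n1=2, n2=2):
    """Overall selection probability of a PSU after scenario-1 and scenario-2 selection.

    mos_i:            original MOS of the PSU
    stratum_total:    total MOS of the PSU's scenario-1 stratum
    collapsed_totals: total MOS of every scenario-1 stratum merged into the scenario-2 stratum
    """
    # Phase 1: PPS selection of n1 PSUs within the scenario-1 stratum.
    pi_1 = n1 * mos_i / stratum_total
    # Phase 2 frame: n1 sampled PSUs per collapsed stratum, each carrying its stratum total as new MOS.
    phase2_frame_total = sum(n1 * t for t in collapsed_totals)
    pi_2_given_1 = n2 * stratum_total / phase2_frame_total
    return pi_1 * pi_2_given_1


# Two hypothetical scenario-1 strata (total MOS 100 and 120) collapsed into one scenario-2 stratum.
totals = [100.0, 120.0]
pi_nested = two_phase_probability(mos_i=15.0, stratum_total=100.0, collapsed_totals=totals)
pi_direct = 2 * 15.0 / sum(totals)  # PPS probability if 2 PSUs were drawn directly from the merged stratum
```

The scenario-1 stratum total cancels between the two phases, so the nested probability equals the direct PPS probability in the collapsed stratum.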
SSU Sample Selection
The secondary sampling units (SSU) of CRSS are police jurisdictions. Within each PSU, PARs are stratified by the police jurisdictions (PJ) where the PARs are available, and the PJs become the second-stage sampling units. A composite MOS is assigned to each PJ in the selected PSUs. As with the PSU MOS definition, it is sensible to assign a larger selection probability to PJs with a desirable crash composition. To this end, crash counts of the 9 PAR strata in Table 1 were estimated for each PJ in the selected PSUs from information collected from those PJs. For PJ $j$ in the PJ frame within the sampled PSU $i$, the composite SSU MOS is defined as follows:
$$MOS_{ij} = \sum_{s} \frac{n_s}{n} \cdot \frac{N_{ijs}}{N_s}$$
where
$n$ = the desired total sample size of crashes
$n_s$ = the desired sample size of crashes in the PAR stratum $s$
$N_s$ = the estimated population number of crashes in PAR stratum $s$
$N_{ijs}$ = the estimated population number of crashes in PAR stratum $s$, PJ $j$ and PSU $i$
PJs are then stratified into two PJ strata by their MOS (largest 50% vs the rest) in addition to certainty PJs. A PJ sample is then selected from each PJ stratum using sequential Poisson sampling.
The sequential Poisson sampling method (see Ohlsson, Esbjörn (1998): Sequential Poisson Sampling, Journal of Official Statistics, Vol.14, No.2, pp. 149–162) produces an approximate PPS sample, handles frame changes, and minimizes changes to the existing sample at the same time.
The sequential Poisson sampling method was applied to the PJ sample selection for each of the non-certainty PJ strata (large MOS or small MOS stratum) within the sampled PSU $i$, as follows:
Generate a permanent uniform random number $r_j \sim U(0,1)$ for each PJ $j$ in the PJ frame.
Identify certainty PJs by the condition:
$$\frac{m \cdot MOS_j}{\sum_{k=1}^{M} MOS_k} \ge 1$$
Here $m$ is the PJ sample size and $M$ is the PJ frame size for a PJ stratum within PSU $i$, and $MOS_j$ is the PJ MOS. The identified certainty PJs are set aside, and the process is repeated on the remaining PJs with the reduced PJ sample size until no more certainty PJs are found. Let the total number of certainty PJs be $c$.
For the remaining non-certainty PJs in the frame, divide the permanent random number by the MOS to obtain the transformed random number: $\xi_j = r_j / MOS_j$. Then sort the transformed random numbers from smallest to largest:
$$\xi_{(1)} \le \xi_{(2)} \le \dots \le \xi_{(M-c)}$$
Thus, the certainty PJs plus the first $m - c$ non-certainty PJs on the above list are the PJ sample for a PJ stratum within PSU $i$.
Sequential Poisson sampling is approximately PPS. The PJ selection probability is:
$$\pi_{j|i} = \frac{m_i \cdot MOS_{ij}}{\sum_{k} MOS_{ik}}$$
Here $j$ indexes the PJ, $m_i$ is the PJ sample size for PSU $i$, and $MOS_{ij}$ is the PJ MOS. The summation is over all non-certainty PJs in the selected PSU.
TSU Sample Selection
The tertiary sampling units (TSU) of CRSS are police crash reports (PARs). The CRSS PAR sample is selected by stratified systematic sampling. For each selected SSU (PJ), PARs are obtained periodically, either through a technician’s visit to the PJ or by electronic transmission. All the PARs are listed in the order they become available and are stratified by the PAR strata identified in Table 1. Through this listing process, the PAR sampling frame in each selected PJ is prepared for PAR sample selection.
For a large PJ with too many PARs to be listed, PARs are sub-listed by systematic sampling. For example, only PARs with an even PAR number may be listed if the sub-listing factor is 2, or 1 of every 5 PARs is listed if the sub-listing factor is 5. If one of every $k_{ij}$ PARs is sub-listed in PJ $j$, PSU $i$, the sub-listing probability for all sub-listed PARs is:
$$\pi^{sub}_{ij} = \frac{1}{k_{ij}}$$
After PARs are listed, a PAR sample is selected by systematic sampling from the listed (or sub-listed) PARs by PAR stratum within each selected PJ. The PAR selection probability is:
$$\pi^{PAR}_{ijs} = \frac{n_{ijs}}{L_{ijs}}$$
Here $n_{ijs}$ is the number of PARs selected from PAR stratum $s$, and $L_{ijs}$ is the number of PARs listed in PSU $i$, PJ $j$, PAR stratum $s$.
The overall selection probability is the product of the PSU, PJ, sub-listing, and PAR selection probabilities:
$$\pi = \pi^{PSU} \times \pi^{PJ} \times \pi^{sub} \times \pi^{PAR}$$
with $\pi^{sub} = 1$ when no sub-listing is used. The design weight is the inverse of $\pi$.
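The last stage and the design weight can be illustrated with a short Python sketch. The function names and the numeric inputs are assumptions for illustration; the systematic sampler uses a random start with a fractional interval, which is one common way to draw a systematic sample and not necessarily the exact procedure used in the field.

```python
import random


def systematic_sample(listed, n, rng=None):
    """Draw n items from an ordered list using a random start and a fixed interval."""
    rng = rng or random.Random()
    total = len(listed)
    if n >= total:
        return list(listed)
    interval = total / n
    start = rng.random() * interval
    # min() guards against a floating-point index landing exactly on len(listed).
    return [listed[min(int(start + k * interval), total - 1)] for k in range(n)]


def design_weight(pi_psu, pi_pj, sublist_factor, n_selected, n_listed):
    """Inverse of the overall selection probability across all stages."""
    pi_sub = 1.0 / sublist_factor      # 1 of every `sublist_factor` PARs is sub-listed
    pi_par = n_selected / n_listed     # systematic sample within the PAR stratum
    return 1.0 / (pi_psu * pi_pj * pi_sub * pi_par)


# Hypothetical inputs: PSU prob 0.2, PJ prob 0.5, sub-listing factor 2, 10 of 100 listed PARs.
picked = systematic_sample(list(range(100)), 10, rng=random.Random(1))
weight = design_weight(0.2, 0.5, 2, 10, 100)
```

With these inputs the overall selection probability is 0.2 x 0.5 x 0.5 x 0.1 = 0.005, so each sampled PAR represents about 200 crashes before any weight adjustment.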
Sample Allocation
CRSS PSU, PJ, and PAR sample sizes are estimated by optimization: variance is minimized subject to cost, assuming three-stage simple random sampling without replacement.
The optimization model consists of the objective function, cost constraint, and variance constraints, which use the following notation.
$g$: Subscript of the identified key estimate. Here $g = 1, \dots, G$, where $G$ is the number of key estimates.
$p_g$: Identified key proportion estimate.
$(n_1, n_2, n_3)$: Optimal sample sizes of PSUs, PJs, and cases (PARs) to be determined.
$N_1$: Population size of PSUs.
$\bar{N}_2$: Average population size of PJs.
$\bar{N}_3$: Average population size of PARs.
$V(p_g)$: Variance of the identified key estimate $p_g$.
$(S^2_{1g}, S^2_{2g}, S^2_{3g})$: Variance components at PSU-, PJ-, and case-level.
$(C, c_0, c_1, c_2, c_3)$: Total, fixed, PSU-, PJ-, and crash-level cost coefficients.
$V_0(p_g)$: Variance of the identified key estimate $p_g$ in the current system (NASS GES).
$K$: Known case load.
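For orientation, a standard textbook form of the variance and cost functions under three-stage simple random sampling without replacement with equal-sized units is sketched below. It is an assumption offered for illustration, not necessarily the exact model used for CRSS. The symbols are chosen for this sketch: $(n_1, n_2, n_3)$ are the PSU, PJ, and PAR sample sizes, $N_1$, $\bar{N}_2$, $\bar{N}_3$ the corresponding population sizes, $S^2_{1g}$, $S^2_{2g}$, $S^2_{3g}$ the variance components for key estimate $p_g$, $(C, c_0, c_1, c_2, c_3)$ the cost coefficients, and $V_0(p_g)$ the variance under GES.

```latex
% Variance of a key proportion estimate under three-stage SRS without replacement
V(p_g) = \left(1 - \frac{n_1}{N_1}\right)\frac{S^2_{1g}}{n_1}
       + \left(1 - \frac{n_2}{\bar{N}_2}\right)\frac{S^2_{2g}}{n_1 n_2}
       + \left(1 - \frac{n_3}{\bar{N}_3}\right)\frac{S^2_{3g}}{n_1 n_2 n_3}

% Linear cost function used as the budget constraint
C = c_0 + c_1 n_1 + c_2 n_1 n_2 + c_3 n_1 n_2 n_3

% Precision constraints: CRSS at least as accurate as GES for each key estimate
V(p_g) \le V_0(p_g), \qquad g = 1, \dots, G
```

Under a form like this, adding PSUs reduces all three variance terms but incurs all three per-unit costs, while adding PARs within a PJ is cheap but only reduces the last term, which is the trade-off the optimization resolves.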
Standard errors for seven key estimates under the current General Estimates System (GES) were used as constraints in the above optimization model to ensure that the corresponding degree of accuracy under CRSS will be at least as good as under GES.
Under the current GES budget and the current GES cost components, NHTSA determined that the sample allocation is about 60 PSUs, 6 PJs per PSU, and 140 PARs per PJ.
Weighting Adjustments, Imputation, and Variance Estimation
After design weights are calculated, the weights need to be adjusted for the following reasons:
Refusal/non-respondent adjustments;
Frame coverage bias correction;
Matching marginal totals to other data sources – for example, total fatality to FARS;
Large weight trimming;
A calibration technique will be used as the adjustment method. The potential auxiliary information to be used for calibration includes FARS, Census population counts, and PSU-level total crash counts.
A calibration adjustment method that handles all of the above is implemented in the SUDAAN 11 WTADJX procedure. The SUDAAN WTADJX procedure will be used to create the final analysis weights.
Missing values of some key items will be imputed. Several imputation methods will be considered, depending on the missing variable and the available information. The imputation methods include, but are not restricted to, logical imputation, regression imputation, and hot deck imputation.
The resulting CRSS PSU sampling rate is quite low, so we expect that the PSU sample selection can be treated approximately as with-replacement sample selection. Standard survey software such as the SAS SURVEY procedures and the SUDAAN procedures can be used for CRSS data analysis.
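Under the with-replacement approximation, the variance of an estimated total depends only on the weighted PSU totals within each PSU stratum. The Python sketch below shows the usual stratified between-PSU estimator for illustration; the function name and the input values are hypothetical, and this simplified sketch leaves out strata with a single sampled PSU, which in practice need separate handling.

```python
def wr_variance_of_total(psu_totals_by_stratum):
    """With-replacement variance estimate of a weighted total.

    psu_totals_by_stratum: dict mapping PSU stratum -> list of weighted PSU totals,
    where each PSU total is the sum of (final weight x value) over its sampled cases.
    """
    variance = 0.0
    for totals in psu_totals_by_stratum.values():
        n_h = len(totals)
        if n_h < 2:
            continue  # single-PSU strata give no between-PSU contribution in this sketch
        mean = sum(totals) / n_h
        variance += n_h / (n_h - 1) * sum((t - mean) ** 2 for t in totals)
    return variance


# Hypothetical weighted PSU totals for two PSU strata with 2 sampled PSUs each.
example_variance = wr_variance_of_total({"101": [100.0, 120.0], "102": [50.0, 50.0]})
```

In the example, stratum 101 contributes 2 x (10^2 + 10^2) = 400 and stratum 102 contributes nothing because its two PSU totals are equal.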
Describe collection of information procedures.
Once a PAR has been selected for data collection, data coders record the information from the PAR in the computer system as appropriate. CRSS data collection is based solely on PARs.
Describe methods to maximize response rates and to deal with issues of non-response.
CRSS has a three-stage sample design. The first-stage sampling units are counties or groups of counties. A PSU becomes a non-responding PSU only if all selected police jurisdictions (PJs) within the PSU are non-responding PJs. In CRSS, PJ samples are selected using the sequential Poisson sampling method, so the whole PJ frame can be used as a replacement sample. Therefore, a PSU becomes a non-responding PSU only if all PJs in the frame are non-responding PJs. In the current GES, the PJ cooperation rate is about 100%. Therefore, by design, it is unlikely that there will be any non-responding PSUs in CRSS.
The second-stage sampling units of CRSS are PJs. A sampled PJ becomes a non-responding PJ if it refuses to cooperate. To improve the PJ cooperation rate, NHTSA plans to visit each selected PJ and meet with local law enforcement officers to gain cooperation. In the current GES, NHTSA obtained almost 100% cooperation from sampled PJs. Therefore, we expect only a few non-responding PJs in the CRSS PSUs.
The third-stage sampling units of CRSS are PARs. First, all police accident reports (PARs) in the selected PJs are listed. Then a systematic sample of PARs is selected and coded. Because the sampling units are police reports and a PAR must be available before it is listed, there is by design no unit non-response at the third stage of sampling for CRSS.
The CRSS quality control system will be designed to produce the most accurate, reliable, and complete database possible within the limits of available resources. A sample of all PARs will be given a thorough review by experienced data quality control personnel. Quality control personnel will also visit each PSU regularly to observe the data collection activities and to discuss systematic problems revealed in edit and reviews of the data collector’s cases.
Describe any tests of procedures or methods to be undertaken.
NHTSA will test new data collection procedures for six (6) months. The test will include gathering police crash reports and identifying qualified crashes, analyzing data, and monitoring for quality control.
The attached PAR example (Attachment 4) shows the data elements to be collected from the selected PARs. The electronic forms and protocols are being developed to collect this information on tablet computers.
Provide the name and telephone number of individuals consulted on statistical aspects of the design and the name of the agency unit, contractor(s), grantee(s), or other person(s) who will actually collect and/or analyze the information for the agency.
Ms. Chou-Lin Chen, National Center for Statistics and Analysis, NHTSA, 202-366-1048 is responsible for CRSS survey design.
NHTSA has decided to undertake a basic redesign of the National Automotive Sampling System that will attempt to meet new and diverse requirements through expanding its scope and making it more responsive to changing needs. Accordingly, NHTSA has contracted with Westat (contract DTNH22-12-F-00389) to help the CRSS survey design effort. The contract award date for CRSS data collection and coding is estimated to be March 16, 2015.
File Type | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
File Title | SUPPORTING STATEMENT |
Author | Ruth Isenberg |
File Modified | 0000-00-00 |
File Created | 2021-01-24 |