Bobbit - Bootstrap Method

International Price Program U.S. Export and Import Price Indexes

OMB: 1220-0025

Application of the Bootstrap Method in the International Price Program
Patrick A. Bobbitt, Steven P. Paben, MoonJung Cho, Te-Ching Chen,
James A. Himelein Jr., and Lawrence R. Ernst
U.S. Bureau of Labor Statistics
2 Massachusetts Avenue NE, Washington DC 20212 U.S.A.
bobbitt [email protected]
Abstract

The International Price Program (IPP) collects data on United States trade with foreign nations and publishes monthly indexes on the import and export prices of U.S. merchandise and services. The IPP employs a three stage PPS design in which establishments, then broad product categories traded within establishments, and finally items within a category, are selected. Certainty selections can occur in the first two stages.

We present three variations of the bootstrap rescaling method adapted to the IPP sample design: 1) sampling at the first stage, treating certainty units as probability units, 2) sampling that allows for certainties, and 3) a procedure that extends the previous method by collapsing single item strata.

Finally, we compare the stability, bias and coverage rates of the three approaches by simulating 1000 samples of a simulated universe using the IPP sampling methodology.

KEY WORDS: Variance estimation; Bias estimation; Certainty sampling units; Collapsing

1. Introduction

The International Price Program collects data on the United States' trade with foreign nations and publishes monthly indexes on the changes in import and export prices for both goods and services. Recently a research group was chartered to assess the quality of the existing variance measures[14] and, if possible, propose alternate methods. This group studied the use of various estimation methods: the Bootstrap, the Jackknife, BRR, and two methods derived from applying the Taylor series[2].

In this paper, we will present an overview of the IPP sample design, weight structure and index estimation. We will then present the three bootstrap methods adapted to the IPP sample design. Finally, we will apply the methods to seven representative strata published by the IPP.

The International Price Program of the Bureau of Labor Statistics (BLS) produces two of the major price statistics for the United States: the Import Price Indexes and the Export Price Indexes. The IPP, as the primary source of data on price change in the foreign trade sector of the U.S. economy, publishes index estimates of price change for internationally traded goods using three primary classification systems: Harmonized System (HS), Bureau of Economic Analysis End Use (BEA), and North American Industry Classification System (NAICS). IPP also publishes selected services indexes and goods indexes based upon the country or region of origin. This paper will focus only on the import goods indexes that IPP publishes monthly.

The target universe of the import indexes consists of all goods purchased from abroad by U.S. residents. Ideally, the total breadth of U.S. trade in goods in the private sector would be represented in the universe. Items for which it is difficult to obtain a consistent time span for comparable products, however, such as works of art, are excluded. Products that may be purchased on the open market for military use are included, but goods exclusively for military use are excluded.

2. Sampling in the International Price Program

The import merchandise sampling frame used by the IPP is obtained from the U.S. Customs and Border Protection (USCBP). This frame contains information about all import transactions that were filed with the USCBP during the reference year. The frame information available for each transaction includes a company identifier (usually the Employer Identification Number), the detailed product category (Harmonized Tariff number) of the goods that are being shipped and the corresponding dollar value of the shipped goods.

IPP divides the import universe into two halves referred to as panels. One import panel is sampled each year and sent to the field offices for collection, so the universe is fully re-sampled every two years. The sampled products are priced for approximately five years until the items are replaced by a newly drawn sample from the same panel. As a result, each published index is based upon the price changes of items from up to three different samples.

Each panel is sampled using a three stage sample design. The first stage selects establishments independently proportional to size (dollar value) within each broad product category (stratum) identified within the harmonized classification system (HS).

The second stage selects detailed product categories (classification groups) within each establishment-stratum using a systematic probability proportional to size (PPS) design. The measure of size is the relative dollar value adjusted to ensure adequate coverage for all three published strata across all classification systems, and known non-response factors (total company burden and frequency of trade within each classification group). Each establishment-classification group (or sampling group) can be sampled multiple times, and the number of times each sampling group is selected is then referred to as the number of quotes requested.

In the third and final stage, the field economist, with the cooperation of the company respondent, performs the selection of the actual items for use in the IPP indexes. Using the entry level classification groups selected in the second stage, a list of items is provided by the respondent to the field economist. Using a process called disaggregation, items are selected from this list with replacement to satisfy the number of item quotes requested for each entry level classification group.
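
To illustrate how a systematic PPS draw can select the same sampling group more than once, the following is a minimal sketch in Python. It is not the IPP production procedure: the function name `systematic_pps`, the example sizes, and the use of raw sizes as the measure of size (without the coverage and non-response adjustments described above) are all assumptions made for illustration.

```python
import random


def systematic_pps(sizes, n_draws, rng=None):
    """Systematic PPS selection (illustrative sketch).

    sizes   : measure of size for each unit (hypothetical values)
    n_draws : number of selections to make
    Returns the number of hits per unit; a unit whose size exceeds the
    sampling interval is hit more than once.
    """
    rng = rng or random.Random()
    step = float(sum(sizes)) / n_draws      # sampling interval
    start = rng.random() * step             # random start in [0, step)
    hits = [0] * len(sizes)
    idx = 0
    upper = float(sizes[0])                 # cumulative size through unit idx
    for k in range(n_draws):
        point = start + k * step
        # Advance to the unit whose cumulative-size interval contains point.
        while point >= upper and idx < len(sizes) - 1:
            idx += 1
            upper += sizes[idx]
        hits[idx] += 1
    return hits


# Hypothetical dollar values for four classification groups.
hits = systematic_pps([50, 30, 15, 5], n_draws=4, rng=random.Random(1))
```

In this example the sampling interval is 25, so the first group (size 50) always receives two hits; in the terminology above it would have two quotes requested.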
3. Index Estimation
IPP uses the items that are initiated and re-priced every
month to compute its price indexes. These indexes are
calculated using a modified Laspeyres index formula.
The modification used by the IPP differs from the conventional Laspeyres index by using a chained index instead of a fixed-base index. Chaining involves multiplying
an index (or long term relative) by a short term relative
(STR). This is useful since the product mix available for
calculating price indexes can differ over time.
These two methods produce identical results as long as
the market basket of items does not change over time and
each item provides a usable price in every period. In fact,
due to non-response, the mix of items used in the index
from one period to the next is often different. The benefits of chaining over a fixed base index include a better
reflection of changing economic conditions, technological
progress, and spending patterns, and a suitable means
for handling items that are not traded every calculation
month.
Below is the derivation of the modified fixed quantity
Laspeyres formula used in the IPP.

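\[
\begin{aligned}
LTR_t &= \left( \frac{\sum p_{i,t}\, q_{i,0}}{\sum p_{i,0}\, q_{i,0}} \right) (100) \\
      &= \left( \frac{\sum p_{i,0}\, q_{i,0} \left( \frac{p_{i,t}}{p_{i,0}} \right)}{\sum p_{i,0}\, q_{i,0}} \right) (100) \\
      &= \left( \frac{\sum w_{i,0}\, r_{i,t}}{\sum w_{i,0}} \right) (100) \\
      &= \left( \frac{\sum w_{i,0}\, r_{i,t}}{\sum w_{i,0}\, r_{i,t-1}} \right) \left( \frac{\sum w_{i,0}\, r_{i,t-1}}{\sum w_{i,0}} \right) (100) \\
      &= \left( \frac{\sum w_{i,0}\, r_{i,t}}{\sum w_{i,0}\, r_{i,t-1}} \right) (LTR_{t-1}) \\
      &= (STR_t)\,(LTR_{t-1})
\end{aligned}
\]

where,

$p_{i,t}$ = price of item i at time t
$q_{i,0}$ = quantity of item i in base period 0
$w_{i,0}$ = $(p_{i,0})(q_{i,0})$, the total revenue in base period 0
$r_{i,t}$ = $p_{i,t} / p_{i,0}$, or the long term relative of item i at time t
$LTR_t$ = long-term relative of a collection of items at time t
$STR_t$ = $\sum w_{i,0}\, r_{i,t} \,/\, \sum w_{i,0}\, r_{i,t-1}$
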
For each classification system, IPP calculates its
estimates of price change using an index aggregation
structure (i.e. aggregation tree) with the following form:
Upper Level Strata
  Lower Level Strata
    Classification Groups
      Weight Groups (i.e. Company-Index Classification Group)
        Items
As mentioned previously, at any given time the IPP has up to three samples of items being used to calculate each stratum's index estimate. Currently the IPP combines the data from these samples by 'pooling' the individual estimates.
Pooling refers to combining items from multiple samples at the lowest level of the index aggregation tree.
These combined sample groups are referred to as a weight
group. Different sampling groups can be selected for the
same weight group across different samples, so it is possible that multiple items from different sampling groups
can be used to calculate a single weight group index. This
weight group level aggregation is done primarily so the
Industry Analysts within IPP can perform analyses on
the index information across samples.
4. Variance Project
The variance project was chartered in the IPP to find a
variance estimation algorithm that would be useful given
the unique aspects of both the IPP sample design and the
modified Laspeyres index employed in the program. This
project would analyze several different variance methods
for both their precision and their bias through simulation. To achieve this, we drew one thousand complete samples from the sampling frame, which included all import transactions from July 2002 to June 2003. We then simulated the response data at the item level for these 1000 draws for a period of three years[2].
In this paper, we will inspect only the variance estimation of the short-term relative, specifically the short-term relative for the first month in a 'chain'. At the first month the LTR is the same as the STR. In later months the 'memory' effect of the LTR compounds errors in estimating variance, which complicates the analysis.

5. Bootstrap Methods
5.1 Literature Review

The IPP evaluated different variance estimation methods, such as Taylor series linearization, the bootstrap, the jackknife, and balanced repeated replication (BRR), for their applicability to the IPP. In this paper, we present in detail three variations of the bootstrap rescaling method which were adapted to the IPP sample design.
The bootstrap method for the iid case has been extensively studied since Efron proposed it in 1979, and it is considered the most flexible of the well known resampling methods. The original bootstrap method was then modified to handle complex issues in survey sampling, and results were extended to cases such as stratified multistage designs [10].
Rao and Wu [11] provided an extension to stratified
multistage designs but covered only smooth statistics.
The main technique which was used to apply the bootstrap method to complex survey data was scaling. The
estimate of each resampled cluster was properly scaled
so that the resulting variance estimator reduced to the
standard unbiased variance estimator in the linear case.
Sitter [13] also explored the extensions of the bootstrap
to complex survey data and proposed a mirror-matched
bootstrap method for a variety of complex survey designs.
Sitter mentioned in his study that it was difficult to compare the performance of his proposed method with that of Rao and Wu's [11] rescaling method, either theoretically or via simulation.
Later, Rao et al [12] extended the result of Rao and
Wu [11] to non-smooth statistics such as the median
by making the scale adjustment on the survey weights
rather than on the sampled values directly. Although
this method is known to overestimate the true variance,
it has an advantage that it does bootstrap-sampling only
at primary sampling unit level.
5.2 Bootstrap Rescaling Method (BRM)

This method resamples only units selected in the first stage of sampling. Despite its simplicity, it is still unbiased and consistent in the linear case. This method also works well when estimating 'non-smooth' functions such as percentiles.

To capture the variability induced by units originally sampled in the later stages, it adjusts the final stage's weight to take into account the 'other' variation that would have been captured when selected in a particular bootstrap sample. In IPP's case, this means that for each stratum, we will select establishments via simple random sampling with replacement (SRSWR), then adjust the item weight using the following formula:

\[
w^{b}_{heci} = w_{heci} \left[ \left\{ 1 - \sqrt{\frac{n^{b}_{h}}{n_{h} - 1}} \right\} + \sqrt{\frac{n^{b}_{h}}{n_{h} - 1}} \; \frac{n_{h}}{n^{b}_{h}} \; m^{b}_{he} \right]
\]

Where:

$w^{b}_{heci}$ = The new bth bootstrap weight for item i within Classif-Group c, traded by establishment e, in stratum h
$w_{heci}$ = The original item weight for i, e, c, h defined above.
$n^{b}_{h}$ = The number of units (establishments) selected in the bth bootstrap sample in stratum h.
$n_{h}$ = The number of establishments sampled within stratum h
$m^{b}_{he}$ = The number of times establishment e was selected in stratum h in bootstrap sample b.

In our case, we are going to select $n^{b}_{h} = n_{h} - 1$ units in a bootstrap sample, reducing the bootstrap weight defined above to:

\[
w^{b}_{heci} = w_{heci} \left( \frac{n_{h}}{n^{b}_{h}} \right) m^{b}_{he} = w_{heci} \left( \frac{n_{h}}{n_{h} - 1} \right) m^{b}_{he}
\]

5.3 BRM Algorithm

1. Draw $n^{b}_{h} = n_{h} - 1$ units with replacement from the $n_{h}$ establishments sampled in stratum h. Let $m^{b}_{he}$ be the number of times establishment e was selected in stratum h for bootstrap sample b.
2. Define the bootstrap weights thus:
\[
w^{b}_{heci} = w_{heci} \left( \frac{n_{h}}{n_{h} - 1} \right) m^{b}_{he}
\]
   where the terms are defined as in the above definition of the bootstrap item weight.
3. Compute $\hat{\theta}^{b}_{h}$, the price STR index for stratum h, using the bootstrap item weights found in bootstrap sample b.
4. Iterate steps 1 through 3 for b = 1, ..., 150.
5. Compute the bootstrap estimator:
\[
v_{h} = \frac{1}{150} \sum_{b=1}^{150} \left( \hat{\theta}^{b}_{h} - \hat{\theta}_{h} \right)^{2}
\]
   where $\hat{\theta}_{h}$ is the STR index for stratum h computed using the original sample.
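
The five BRM steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the IPP production code: the item records, the field names `estab`, `weight` and `relative`, and the helper `weighted_str` (a first-month STR, where every previous-period relative equals 1) are hypothetical.

```python
import random
from collections import Counter


def weighted_str(items, weights):
    """First-month STR: weighted mean of item price relatives."""
    num = sum(w * it["relative"] for it, w in zip(items, weights))
    return num / sum(weights)


def brm_variance(items, index_fn=weighted_str, n_reps=150, seed=0):
    """Bootstrap rescaling variance estimate for one stratum (sketch)."""
    rng = random.Random(seed)
    estabs = sorted({it["estab"] for it in items})
    n_h = len(estabs)
    if n_h < 2:
        raise ValueError("need at least two establishments in the stratum")
    theta_hat = index_fn(items, [it["weight"] for it in items])
    scale = n_h / (n_h - 1)
    total = 0.0
    for _ in range(n_reps):
        # Step 1: draw n_h - 1 establishments with replacement; m[e] = hits.
        m = Counter(rng.choices(estabs, k=n_h - 1))
        # Step 2: rescaled item weights w * n_h / (n_h - 1) * m[e].
        w_b = [it["weight"] * scale * m[it["estab"]] for it in items]
        # Step 3: replicate estimate of the index.
        total += (index_fn(items, w_b) - theta_hat) ** 2
    # Step 5: mean squared deviation from the full-sample estimate.
    return total / n_reps


# Hypothetical stratum: four establishments, five items.
items = [
    {"estab": "A", "weight": 10.0, "relative": 1.02},
    {"estab": "A", "weight": 5.0, "relative": 0.99},
    {"estab": "B", "weight": 8.0, "relative": 1.10},
    {"estab": "C", "weight": 12.0, "relative": 0.95},
    {"estab": "D", "weight": 6.0, "relative": 1.01},
]
v_h = brm_variance(items)
```

Items from establishments that are not drawn in a replicate receive weight zero, so all of an establishment's items enter or leave a replicate together.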
5.4 Bootstrap variance estimator using Rescaling Method with Certainties (BRMC)

This method is an extension of the BRM method seeking to address the issue of certainty selections used in the IPP sampling methodology by obtaining bootstrap samples from the first stage of variability. We achieved the adjustment for certainties by obtaining the bootstrap sample from the first stage in which there was no certainty selection.

Prior to applying the BRMC algorithm, we must first partition $S_h$, the set of all items sampled from sampling stratum h, into three groups:

$h_1$ = The set of items in $S_h$ for which the establishment was selected with a probability less than 1.
$h_2$ = The set of items in $S_h$ for which the associated establishment was a certainty selection, but the CG was selected with a probability less than 1.
$h_3$ = The set of items in $S_h$ for which both the associated establishment and CG were certainty selections.

Using this definition, we compute a bootstrap item weight, $w^{b}_{h_j i}$, from the existing item weight when $n_{h_j} > 1$:

\[
w^{b}_{h_j i} = w_{h_j i} \left[ 1 - \sqrt{\frac{n^{b}_{h_j}}{n_{h_j} - 1}} + \sqrt{\frac{n^{b}_{h_j}}{n_{h_j} - 1}} \; \frac{n_{h_j}}{n^{b}_{h_j}} \; m^{b}_{h_j} \right]
\]

Where:

$w^{b}_{h_j i}$ = The new bth bootstrap weight for item i within sampling stratum partition $h_j$
$w_{h_j i}$ = The existing item weight for item i within sampling stratum partition $h_j$
$n^{b}_{h_j}$ = The number of entities selected for bootstrap sample b in sampling stratum partition $h_j$. An entity is defined by the partition j and will be detailed below.
$n_{h_j}$ = The number of entities selected within sampling stratum partition $h_j$. The entities depend upon the partition j and will be defined below.
$m^{b}_{h_j}$ = The number of times the entity was selected in sampling stratum partition $h_j$ in bootstrap sample b.

\[
\text{entity} =
\begin{cases}
\text{establishments} & j = 1 \\
\text{CG} & j = 2 \\
\text{items} & j = 3
\end{cases}
\]
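
As a minimal sketch of the definitions above, the Python below assigns an item to its partition j, picks the entity that would be resampled for that partition, and evaluates the bracketed rescaling factor. The field names (`estab`, `cg`, `item`, `estab_certainty`, `cg_certainty`) and both function names are hypothetical and chosen only for illustration.

```python
from math import sqrt


def partition_and_entity(item):
    """Return (j, entity_key) for one sampled item.

    j = 1: probability establishment  -> entity is the establishment
    j = 2: certainty establishment, probability CG -> entity is the CG
    j = 3: certainty establishment and CG -> entity is the item itself
    """
    if not item["estab_certainty"]:
        return 1, item["estab"]
    if not item["cg_certainty"]:
        return 2, (item["estab"], item["cg"])
    return 3, (item["estab"], item["cg"], item["item"])


def rescale_factor(n, n_b, m):
    """Factor multiplying w_{h_j i}: 1 - s + s * (n / n_b) * m,
    with s = sqrt(n_b / (n - 1)). Only defined for n > 1."""
    if n <= 1:
        raise ValueError("the rescaling formula requires n > 1")
    s = sqrt(n_b / (n - 1))
    return 1.0 - s + s * (n / n_b) * m


example = {"estab": "E1", "cg": "CG7", "item": 3,
           "estab_certainty": True, "cg_certainty": False}
j, entity = partition_and_entity(example)
factor = rescale_factor(n=5, n_b=4, m=2)
```

With n_b = n - 1 the square-root term equals 1, so the factor collapses to (n / (n - 1)) * m; for n = 5 and m = 2 this gives 2.5.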

In our case, we are going to select

\[
n^{b}_{h_j} =
\begin{cases}
n_{h_j} - 1 & n_{h_j} > 1 \\
1 & n_{h_j} = 1
\end{cases}
\]

entities in bootstrap sample b, reducing the bootstrap weight defined above to:

\[
w^{b}_{h_j i} =
\begin{cases}
w_{h_j i} \left( \dfrac{n_{h_j}}{n_{h_j} - 1} \right) m^{b}_{h_j} & n_{h_j} > 1 \\
w_{h_j i} & n_{h_j} = 1
\end{cases}
\]

In the case where $n_{h_j} = 1$, we are assuming that such cases are likely absolute certainties that would have never produced any sampling variability. This is because if there is (for example) a company in which there is only a single CG beneath a certainty establishment, it is most likely the only CG that would ever have been acquired.

5.5 BRMC Algorithm

1. Draw $n^{b}_{h_j}$ entities.
   case 1: $n_{h_j} > 1$. Select the entities using a simple random sample with replacement from the $n_{h_j}$ entities in the sampling stratum partition $h_j$. Let $m^{b}_{h_j}$ be the number of times the entity was selected in sampling stratum partition $h_j$ for bootstrap sample b.
   case 2: $n_{h_j} = 1$. Select the single entity.
2. Define the bootstrap weights thus:
\[
w^{b}_{h_j i} =
\begin{cases}
w_{h_j i} \left( \dfrac{n_{h_j}}{n_{h_j} - 1} \right) m^{b}_{h_j} & n_{h_j} > 1 \\
w_{h_j i} & n_{h_j} = 1
\end{cases}
\]
   where the terms are defined above.
3. Compute $\hat{\theta}^{b}_{h}$, the price STR index for published stratum h, using the items from ALL sampling stratum partitions and the bootstrap item weights found in bootstrap sample b.
4. Iterate steps 1 through 3 for b = 1, ..., 150.
5. Compute the bootstrap estimator:
\[
v_{h} = \frac{1}{150} \sum_{b=1}^{150} \left( \hat{\theta}^{b}_{h} - \hat{\theta}_{h} \right)^{2}
\]
   where $\hat{\theta}_{h}$ is the STR index for published stratum h computed using the original sample.

5.6 Bootstrap variance estimator using Rescaling Method with Certainties and Collapsing (BRMCC)

This method extends the BRMC outlined above by including a simple collapsing algorithm for those variance strata that have only one entity available for selection. In the previous section, we chose to presume that in most cases single entity variance strata were, for all practical purposes, certainties. This section treats these single entity cases as potential areas of sampling variability and will collapse them to form a pool on which to sample.

This collapsing will be achieved by creating $\lfloor n_{h_j} / 2 \rfloor$ new variance strata, where $n_{h_j}$ is the number of single entity variance strata in the sampling stratum partition $h_j$. Each of these new variance strata will contain at least two entities, with possibly one containing three. Given the nature of the sample design we are likely to encounter this in the cases when j = 2 or j = 3. Example scenarios of this follow:

1. j=2: In sampling stratum 052 we find eleven certainty establishments that have only one sampled Classif Group. We will form five new variance strata: four with two certainty establishments, one with three certainty establishments.
2. j=3: In sampling stratum 052 we find 6 certainty establishments that also have certainty CGs with only one item. We will form three new variance strata, each with two certainty estab||certainty CGs.

We form each of these new variance strata by pairing $X_{[i]}$ and $X_{[i+1]}$, where $X_{[i]}$ is the ith order statistic of the dollar value for the entity.

5.7 BRMCC Algorithm

1. Identify the single entity variance strata. This will be the j for which $n_{h_j} = 1$.
2. Sort the group identified in step 1 by the entity dollar value.
3. Identify the new variance strata as h||j||i, where h is the sampling stratum, j designates the partitioned group identified in Section 5.4, and i is the greater rank of the entity dollar value for the pair. Example: Suppose that for sampling stratum 52 we have six single entity variance strata in the second group (i.e. certainty establishments, but non-certainty CGs). The top two establishment dollar value amounts would both be coded 5221, the third and fourth largest would be coded 5222, and the fifth and sixth largest would be coded 5223. In the case of an odd number of single entity variance strata for a given sampling stratum, the smallest group would contain three.
4. Select the bootstrap sample. This is done as in the previous bootstrap method, only using the newly created variance strata.
5. Define the bootstrap weight, again as above.
6. Compute $\hat{\theta}^{b}_{h}$, the price STR index, again as above.
7. Iterate.
8. Compute the bootstrap estimator as above.

6. Analysis
6.1 Overview

Table 1 presents the seven two-digit strata that were used in our analysis. P07 and P87 were included because they were historically volatile, P95 because its coverage was close to that of BRR, and the remaining four (P42, P61, P90 and P94) were randomly selected. This analysis will include a brief comparison of the performance of the three methods outlined to estimate the first month of a STR chain for the above strata using bias, stability, and coverage rate as criteria. For a more detailed study see (Chen, 2007).

We are limiting our analysis to only the first month of the STR chain. This study provides a more detailed analysis for individual strata than was provided in Chen's study.

Strata  Description                                              Reason included
P07     Edible vegetables, roots, and tubers                     Historically Volatile
P87     Motor vehicles and their parts                           Historically Volatile
P95     Toys, games and sports equipment; parts and accessories  Coverage close to BRR
P42     Articles of leather; travel goods, bags, etc.            Randomly Selected
P61     Articles of apparel and clothing accessories,            Randomly Selected
        knitted or crocheted
P90     Optical, photographic, measuring and medical             Randomly Selected
        instruments
P94     Furniture, stuffed furnishings; lamps, lighting          Randomly Selected
        fittings

Table 1: Studied strata descriptions

6.2 Comparison

For each of the methods we ran 150 iterations. For each method we computed the average bias, stability, and coverage in the following way. For each two-digit HS stratum k, let $y_i$ be the full vector of the entire sample i, where i = 1, ..., 1000, let $\hat{\theta}_{ki} = \hat{\theta}_k(y_i)$, and define

\[
\bar{\hat{\theta}}_{k\cdot} = \frac{1}{1000} \sum_{i}^{1000} \hat{\theta}_{ki},
\qquad
\tilde{V}_{k\cdot} = \frac{1}{1000 - 1} \sum_{i}^{1000} \left( \hat{\theta}_{ki} - \bar{\hat{\theta}}_{k\cdot} \right)^{2}
\qquad (1)
\]

and $\tilde{\sigma}_{k\cdot} = \sqrt{\tilde{V}_{k\cdot}}$.

As we do not have a true variance, for each two-digit HS stratum k we use $\tilde{V}_{k\cdot}$ and $\tilde{\sigma}_{k\cdot}$ as our sampling variance and standard deviation.
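
As a small illustration of equation (1), the empirical mean, variance and standard deviation of the index estimates across simulated samples can be computed with Python's standard `statistics` module. The five estimates below are made up for the example (the study used 1000 samples per stratum), and the function name `empirical_moments` is an assumption.

```python
import statistics


def empirical_moments(estimates):
    """Return (mean, variance, std dev) of index estimates across samples.

    statistics.variance uses the n - 1 divisor, matching equation (1).
    """
    mean = statistics.mean(estimates)
    var = statistics.variance(estimates)
    return mean, var, var ** 0.5


# Hypothetical STR estimates from five simulated samples.
theta_hats = [1.00, 1.02, 0.98, 1.04, 0.96]
theta_bar, v_tilde, sigma_tilde = empirical_moments(theta_hats)
```

For these values the mean is 1.0 and the variance is 0.004 / 4 = 0.001.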
Let $\hat{\sigma}_{mki}$ be the standard error estimator of a two-digit HS stratum k of sample i for the variance estimation method m. The relative bias of a variance estimation method of interest is calculated as

\[
\text{Relative Bias} = \frac{\frac{1}{1000} \sum_{i}^{1000} \hat{\sigma}_{mki} - \tilde{\sigma}_{k\cdot}}{\tilde{\sigma}_{k\cdot}} \times 100\%
\]

and the stability is

\[
\sigma(\hat{V}_{mki}) = \sqrt{ \frac{1}{1000 - 1} \sum_{i}^{1000} \left( \hat{V}_{mki} - \bar{\hat{V}}_{mk\cdot} \right)^{2} } \times 100\%
\]

where $\bar{\hat{V}}_{mk\cdot}$ is the average of the 1000 variance estimations for the method.

We formed 95% confidence limits using:

\[
\hat{\theta}_{kiL} = \hat{\theta}_{ki} - t_{0.975,\, \mu_{mki}/2}\; \hat{\sigma}_{mki},
\qquad
\hat{\theta}_{kiU} = \hat{\theta}_{ki} + t_{0.975,\, \mu_{mki}/2}\; \hat{\sigma}_{mki}
\]

where,

$\hat{\theta}_{ki}$ = the price relative for stratum k in sample i
$\mu_{mki}$ = the degrees of freedom for the two digit price relative

From this, we form the coverage rate thus:

\[
\hat{c} = \frac{1}{1000} \sum_{i}^{1000} I\left\{ \bar{\hat{\theta}}_{k\cdot} \in \left( \hat{\theta}_{kiL}, \hat{\theta}_{kiU} \right) \right\}
\]

where,

\[
I =
\begin{cases}
1, & \text{if } \bar{\hat{\theta}}_{k\cdot} \in \left( \hat{\theta}_{kiL}, \hat{\theta}_{kiU} \right) \\
0, & \text{otherwise}
\end{cases}
\]

and $\bar{\hat{\theta}}_{k\cdot}$ is the average of 1000 index estimates, used as the population ("true") index.

We are using the relative biases for accuracy and the percentage of the stability for the variation because the biases and stability are all very small. Note that by the way we have defined stability, a lower number means greater stability, while a higher number means less stability.

6.2.1 Relative Bias

Table 2 displays the relative bias of each method as compared to the variance observed in the 1000 samples. In this table, there seems to be little difference in the relative bias between the BRM and BRMC methods. For five of the seven strata (all but P07 and P95) both methods have a negative bias. With the exception of strata P07, P95 and P94, the BRMCC method moved the relative bias closer to zero.

For each method, the sign and magnitude of the bias seems to be independent of the size of the estimated population variance, $\tilde{\sigma}_{k\cdot}$.

Strata  sigma~_k.  BRM       BRMC      BRMCC
P07     0.0502       3.94%     4.37%     4.68%
P42     0.0015     -14.91%   -14.87%   -13.68%
P61     0.0016     -11.78%   -15.78%    -5.22%
P87     0.0056      -6.39%    -6.98%    -0.99%
P90     0.0018      -4.02%    -6.44%     0.08%
P94     0.0019      -0.31%    -0.46%     2.58%
P95     0.0019      16.98%    13.78%    18.75%

Table 2: Rel. Bias Results for 1000 samples

6.2.2 Stability

Table 3 displays the stability (or variance of the variance) of each variance method. From this table, it is clear that the stability for all methods is roughly the same.

Strata  BRM      BRMC     BRMCC
P07     0.1478%  0.1464%  0.1509%
P42     0.0002%  0.0002%  0.0002%
P61     0.0002%  0.0002%  0.0003%
P87     0.0048%  0.0047%  0.0056%
P90     0.0003%  0.0002%  0.0003%
P94     0.0003%  0.0003%  0.0004%
P95     0.0016%  0.0005%  0.0005%

Table 3: Stability Results for 1000 samples

6.2.3 Coverage Rate

Table 4 displays the coverage rates for the 1000 samples for each of the three methods presented. From this we can see that with the exception of P61 and P90 the BRMCC method provided coverage rates closer to the expected 95% for all strata, and the BRMC method offered no improvement to the BRM method in all but two cases, P07 and P95, and these were so small as to be negligible.

Strata  BRM    BRMC   BRMCC
P07     0.944  0.948  0.95
P42     0.91   0.92   0.922
P61     0.921  0.886  0.919
P87     0.899  0.892  0.906
P90     0.955  0.945  0.965
P94     0.908  0.904  0.92
P95     0.922  0.926  0.931

Table 4: Coverage results for 1000 samples

6.3 Discussion

Since IPP has a rather complete frame provided from Customs, we could compute the sampling fraction for the BRM and the BRMC methods. For the BRMCC method, we did not have actual item data available from the frame, making such a calculation impossible. Table 5 shows a summary of all sampling strata for the BRMC method. The h1 column presents the number of items sampled, PSUs, and averages in the sample and frame when the variance PSU is defined to be the establishment. The h2 column presents the number of items sampled, PSUs and averages in the sample and frame when the variance PSU is defined to be a classif group within a certainty establishment. A rough estimate of the sampling fraction can be obtained by dividing the average number of elements per variance stratum in the sample by that in the frame: for probability establishments we have a ratio of 12.4% and for certainty establishments we have about 14.4%.

Type                   h1      h2     Total
Total number Sampled   2299    1091   3390
Num. Variance PSU      72      435    507
Ave Elements in VPSU   31.93   2.51   6.69
Total number in Frame  18,512  7,519  26,031
Num. Variance PSU      72      433    505
Ave Elements in VPSU   257.11  17.36  51.55

Table 5: Average elements in variance strata

6.3.1 Impact of Sampling Methodology on Variance

Table 6 shows the number of variance PSUs available to be selected at each variance stratum, within the two-digit stratum. This table further breaks out the number selected in each of the three possible selection classes for the BRMCC method.

stratum  h1    h2   h3    total
P07      20    0    0     20
P42      45    3    14    62
P61      165   57   75    297
P87      227   104  815   1146
P90      171   26   200   397
P94      164   17   105   286
P95      114   18   86    218
other    1393  209  989   2591
n_h      2299  434  2284  5017

Table 6: Number of PSUs per strata by type

From this table we see that P07 has no certainty variance PSUs. This provides evidence as to why there was negligible impact to both the relative bias and the coverage rates. This stratum also had the fewest variance PSUs available to sample, which may explain why it had the largest stability results.

Note too that P87 and P90 have the largest proportions of their variance PSUs residing in the h3 category. That is, most of their VPSUs are items that were selected in the third stage after certainty selections in the first two stages. This provides some evidence as to why there was a clear impact on both the relative bias and coverage rates when using the BRMCC method.

The observation that strata with a high proportion of items having certainties at later stages in the sampling methodology are better suited to the BRMC and BRMCC methods is further substantiated by observing that P61 has a proportion of its variance PSUs somewhere between that of P07 and those of P87/P90, and its relative bias and coverage rates are similarly affected.

6.4 Conclusions

There is some evidence to suggest that sampling from the first level of variability in the sampling methodology, as opposed to the first stage of sampling, has a positive effect on estimating variance in the IPP. The use of the BRMC method, however, seems to be less effective than the more general BRMCC method. This is likely due to the high proportion of 'single item' certainty CGs that were set to certainties in the BRMC method, as opposed to the collapsing scheme used in the BRMCC method.

6.5 Future Work

Try adapting the BRM derived by Rao et al. [12] for the single stage without replacement sampling design, as opposed to the with replacement version we adapted here.

In our study we did not take into account the impact of imputation. While we did simulate non-response to sampling based upon models developed in the IPP, we presumed that respondents who agreed to participate would do so every month, thus negating the need to impute. Future work should include studying the impact of imputation.

In the IPP, our design was developed to support multiple aggregation structures. Since we only sampled one panel, it was difficult to address the effect that using an alternate aggregation structure would have on the variance of indexes produced for these structures. Future work will address the impact of alternate aggregation structures.

Similarly to the 'alternate' classification system question, our simulation was applied only to a single sample. In production we actually 'pool' data from multiple samples. Future work will address the impact of pooling multiple samples.

Any opinions expressed in this paper are those of the authors and do not constitute policy of the Bureau of Labor Statistics.


References
[1] Bobbitt, P., Cho, M.J. and Eddy, R. M. “Weighting Scheme Comparison in the International Price
Program.” 2005 Proceedings of the American Statistical Association. Government Statistics Section [CD-ROM], 1006-1014 (2005).
[2] Chen, Te-Ching; Bobbitt, Patrick; Himelein,
James A. Jr.; Paben, Steven P.; Cho, Moon Jung;
Ernst, Lawrence “Variance Estimations for International Price Program Indexes.” 2007 Proceedings of the American Statistical Association
[3] Cho, M. J., Chen, T-C, Bobbitt, P.A., Himelein,
J.A., Paben, S.P., Ernst, L.R., and Eltinge, J.
L.(2007), “Comparison of Simulation Methods
Using Historical Data in the U.S. International
Price Program”, 2007 Proceedings of the American Statistical Association, Third International
Conference on Establishment Surveys [CD-ROM],
Alexandria, VA: American Statistical Association:
to appear
[4] Efron, B. (1979). “Bootstrap methods: Another
Look at the Jackknife.” The Annals of Statistics.
Vol 7, No. 1, pp. 1-26
[5] Efron, B. (2000). “The Bootstrap and Modern
Statistics”, Journal of American Statistical Association, Vol 95, No. 452, pp. 1293-1296
[6] Efron, B. and Gong, G. (1983). “A Leisurely
Look at the Bootstrap, the Jackknife, and Cross-Validation.” The American Statistician. Vol. 37,
No. 1, 36-48.
[7] Efron, B., and Tibshirani, R. (1986). “Bootstrap
Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy”,
Statistical Science, Vol 1, No. 1, pp. 54-75
[8] Efron, B. and Tibshirani, R.J. (1993). “An Introduction to the Bootstrap.” Chapman & Hall.
[9] Korn, E. and Graubard, B. (1990). “Simultaneous testing of regression coefficients with complex
survey data: Use of Bonferroni t statistics.” The
American Statistician, 44, 270-276.
[10] Lahiri, P. (2003). “On the impact of bootstrap in
survey sampling and small-area estimation.” Statistical Science, Vol.18, No. 2, 199-210.
[11] Rao, J.N.K. and Wu, C.F.J. (1988). “Resampling
inference with complex survey data.” Journal of American Statistical Association, Vol.83, No. 401, 231-241.
[12] Rao, J.N.K., Wu, C.F.J. and Yue, K. (1992).
“Some recent work on Resampling methods for
complex surveys.” Survey Methodology, Vol.18,
No. 2, 209-217.

[13] Sitter, R. (1992). “A resampling Procedure for
Complex Survey Data.” Journal of American Statistical Association, Vol 87, No. 419, pp. 755-765
[14] Toftness, Richard, “Comparison of Variance Estimation Techniques for a Price Index with
Non-Independent Weights” 2002 Proceedings of
the American Statistical Association. Government
Statistics Section [CD-ROM], (2002).
[15] Wolter, K.M., Introduction to Variance Estimation. Springer-Verlag, New York 1985.

