Part B Appendix 1: Wage Items

Unemployment Insurance Data Validation Program

OMB: 1205-0431

Appendix B-1


Sampling for Wage Item Validation (Module 5)


Wage Item Validation


Module 5 of the UI DV Tax handbook describes the process for Wage Item validation. Wage items are records that identify the wages paid in a quarter to the employees of each UI-covered employer. When a state receives a record on an employee as part of the employer’s quarterly wage and contribution submission, it is called a “wage record”; once entered into the database, it becomes a “wage item.” Each wage item must contain certain information needed to identify the worker and the employer (i.e., SSN, name, employer name, amount of wages). The UI system uses this information to compute a separated employee’s monetary eligibility for UI benefits when the employee claims benefits; these records are also a means for determining whether the employer retains its status as an active employer, as well as for various research and evaluation purposes.


The count of wage items is reported on the ETA 581 report and is used as part of the workload formula for allocating UI administrative resources. The validation process involves recounting samples of wage records that have already been entered into the database to ensure that each one contains all essential identifying information. State validators draw separate samples of wage items that the state has received by different modes (e.g., on paper records, on magnetic media, by fax, or by Internet or other electronic transmission). They are asked to draw a random sample of 150 wage items for each mode. Each sample is investigated in two stages: the first stage treats 50 items as an acceptance sample; the second stage is an estimation sample of the full 150 items. The approach is the same as described in item B-2 for benefits random samples, except that the pass-fail criterion is 2% instead of the 5% used for benefits samples. Every mode must pass with an error rate of ≤2% for Wage Item validation to pass. If even one mode fails, the entire validation process involving all modes must be repeated within the next year. If it passes, the validation is good for three years.


Wage Item Sampling


This paper describes the two-stage sample based on a sample of 150 items, with an initial examination of 50 cases. The first stage of the process is an acceptance sample of size 50 to determine whether a judgment can be made at that level or whether review of the remaining cases in the sample is called for. If the result is inconclusive (or the State wishes to estimate the probable underlying error in a population that has clearly failed in the first stage) the additional 100 sampled transactions are evaluated and a judgment is made from the full 150-case sample.


The first stage procedure uses the following decision rule:


              Pass        Fail          Inconclusive
50 Cases      0 errors    ≥ 4 errors    1 - 3 errors (evaluate remaining 100 cases)
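The first-stage rule can be written as a small classification function. This is a minimal sketch; the function name and parameter names are illustrative and do not come from the handbook.

```python
def first_stage_decision(errors: int, pass_max: int = 0, fail_min: int = 4) -> str:
    """Classify a 50-item acceptance sample by its error count.

    pass_max and fail_min are hypothetical parameter names for the
    cut-offs stated in the text (0 errors to pass, 4 or more to fail).
    """
    if errors < 0:
        raise ValueError("error count cannot be negative")
    if errors <= pass_max:
        return "pass"
    if errors >= fail_min:
        return "fail"
    # 1 to 3 errors: the remaining 100 sampled items must be evaluated
    return "inconclusive"


for n_errors in (0, 2, 4):
    print(n_errors, first_stage_decision(n_errors))
```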


This decision rule (as well as the decision rule for the full sample) assumes that the samples of transactions are selected without replacement from a large population, and that each transaction in a sampled population of transactions has an equal chance of being selected into the sample of 150 and into the subsample of 50 that is used for the first stage. Based on these assumptions, the probabilities of any process passing or failing are computed using the binomial formula.¹
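One way to satisfy these selection assumptions is sketched below. It assumes the wage items for a mode are available as a list of identifiers; the function name, the identifiers, and the seed are hypothetical and are not part of the validation software.

```python
import random


def draw_two_stage_sample(item_ids, full_size=150, first_size=50, seed=None):
    """Draw a simple random sample without replacement and split it in two.

    Returns (first_stage, remaining): 50 items for the acceptance sample
    and the other 100 items needed only if the first stage is inconclusive.
    """
    items = list(item_ids)
    if len(items) < full_size:
        raise ValueError("population is smaller than the requested sample")
    rng = random.Random(seed)
    # random.sample selects without replacement and returns items in
    # selection order, so a leading slice is itself a valid random sample.
    full_sample = rng.sample(items, full_size)
    return full_sample[:first_size], full_sample[first_size:]


# Hypothetical population of 10,000 wage item identifiers for one mode.
first_stage, remaining = draw_two_stage_sample(range(10_000), seed=2011)
print(len(first_stage), len(remaining))
```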


The sampling procedure must balance the risks of (a) taking an unwarranted action to correct a process with a true error rate ≤ 2% (Type I error); and (b) allowing reporting errors to continue by failing to detect populations with underlying error rates that exceed 2% (Type II error). For this design, the probability of a Type I error is .05 and the probability of a Type II error is .10.


The decision rule for the first stage attempts to balance Type I and Type II errors. In the first stage, a process passes only with zero errors, and fails if it has 4 or more errors. To find these cut-off points (pass, fail) for the first stage, we calculate the Type I and Type II errors with the actual error rate = 0.02.² Table 1 summarizes the cumulative probabilities that the number of errors observed in the sample of 50 items is ≤ c.


Table 1. Cumulative Probabilities by Number of Errors in Sample and Population Error Rate

Sample →         50       50       50       50       50       50
Errors (c) →      0        1        2        3        4        5
Error Rate
0%           1.0000   1.0000   1.0000   1.0000   1.0000   1.0000
1%           0.6050   0.9106   0.9862   0.9984   0.9999   1.0000
2%           0.3642   0.7358   0.9216   0.9822   0.9968   0.9995
3%           0.2181   0.5553   0.8108   0.9372   0.9832   0.9963
4%           0.1299   0.4005   0.6767   0.8609   0.9510   0.9856
5%           0.0769   0.2794   0.5405   0.7604   0.8964   0.9622
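The entries of Table 1 can be recomputed from the binomial formula. The sketch below uses only the Python standard library; the function names are illustrative.

```python
from math import comb


def binom_pmf(d: int, n: int, p: float) -> float:
    """Probability of exactly d errors in n items at error rate p."""
    return comb(n, d) * p**d * (1 - p) ** (n - d)


def binom_cdf(c: int, n: int, p: float) -> float:
    """Probability of no more than c errors in n items at error rate p."""
    return sum(binom_pmf(d, n, p) for d in range(c + 1))


# One printed row per population error rate, columns c = 0..5, n = 50.
for rate in (0.01, 0.02, 0.03, 0.04, 0.05):
    row = [f"{binom_cdf(c, 50, rate):.4f}" for c in range(6)]
    print(f"{rate:.0%}", *row)
```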


To find the optimal cutoff (C1) to reject the assumption that the population error rate is ≤ 2% in the first stage sample, we compare Type I errors for different levels of C1. The larger C1 is, the smaller the Type I error is. We want to choose C1 such that the Type I error (α) is below the 0.05 threshold.


Table 1 gives the Type I errors contributed from stage one for different values of C1. From the table we can see that for a sample size of 50 and an underlying error rate of 2%, the cumulative probability of observing 3 or fewer errors is .9822. Therefore, setting the cutoff at more than 3 errors (that is, failing at 4 or more) results in a first-stage Type I error of at most .0178 (1 − .9822). To minimize the Type II error contributed from the first stage, we require that there be no errors at all to pass the test at the first stage. With zero errors required to accept the assumption that the population error rate is ≤ 2%, the first-stage Type II errors for population error rates > 2% are the values shown in the column for zero errors: at 3%, the Type II error is .2181; at 4%, .1299; and at 5%, .0769.
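The cutoff search described above can be sketched as follows, assuming the exact binomial formula with n = 50. The function names are illustrative, and the search loop is one possible way to locate the cutoff rather than the procedure the authors necessarily used.

```python
from math import comb


def binom_cdf(c: int, n: int, p: float) -> float:
    """Probability of no more than c errors in n items at error rate p."""
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))


def smallest_fail_cutoff(n: int, p0: float, alpha: float):
    """Smallest error count c such that P(errors >= c | p0) < alpha."""
    for c in range(n + 2):
        type_one = 1.0 - binom_cdf(c - 1, n, p0)
        if type_one < alpha:
            return c, type_one
    raise ValueError("no cutoff satisfies the requested alpha")


cutoff, type_one = smallest_fail_cutoff(n=50, p0=0.02, alpha=0.05)
print("fail at", cutoff, "or more errors; first-stage Type I =", round(type_one, 4))

# First-stage Type II error when zero errors are required to pass:
# the chance that a population with error rate p shows no errors at all.
for p in (0.03, 0.04, 0.05):
    print(f"{p:.0%}", round(binom_cdf(0, 50, p), 4))
```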


______________________

where d is the number of errors


If the result is inconclusive, the State must evaluate the additional 100 sampled items and make a judgment from the full 150-case sample. (The State may also wish to do this to estimate the probable underlying error in a population which has clearly failed in the first stage.)


In the second stage, the cut-offs are set to ensure that if the underlying error rate is less than or equal to 2%, the probability that a sample will fail is < .05. If the underlying error rate is greater than 2%, the probability that a sample will fail is > .05 and increases as the underlying rate increases. The Type I and Type II errors are summarized in Table 2. The second stage decision rule is to conclude that the underlying error rate is ≤ 2% if there are six or fewer errors in the full sample of 150.


Table 2. Cumulative Probabilities by Number of Errors in Sample and Population Error Rate

Sample →        150      150      150      150      150      150
Errors (c) →      4        5        6        7        8        9
Error Rate
1%           0.9799   0.9980   0.9999   1.0000   1.0000   1.0000
2%           0.7201   0.8783   0.9599   0.9902   0.9982   0.9998
3%           0.4054   0.5946   0.7636   0.8843   0.9531   0.9844
4%           0.2023   0.3385   0.5000   0.6615   0.7977   0.8944
5%           0.0949   0.1745   0.2871   0.4257   0.5743   0.7129
6%           0.0428   0.0845   0.1512   0.2458   0.3655   0.5000
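The values in Table 2 appear to match a normal approximation to the binomial (mean np, standard deviation √(np(1−p)), no continuity correction) rather than the exact binomial sum. This is an inference from the tabulated numbers, not something the source states. The sketch below computes both for comparison; the function names are illustrative.

```python
from math import comb, erf, sqrt


def normal_approx_cdf(c: int, n: int, p: float) -> float:
    """Normal approximation to P(errors <= c), without continuity correction."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return 0.5 * (1.0 + erf((c - mean) / (sd * sqrt(2.0))))


def binom_cdf(c: int, n: int, p: float) -> float:
    """Exact binomial P(errors <= c)."""
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))


# Columns: c, normal approximation, exact binomial (n = 150, error rate 2%).
for c in (4, 5, 6, 7):
    print(c, round(normal_approx_cdf(c, 150, 0.02), 4),
          round(binom_cdf(c, 150, 0.02), 4))
```

Under the exact binomial, the probability of six or fewer errors at a 2% error rate comes out near .968, so the second-stage Type I error stays below .05 under either calculation.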


To compute the overall probability of passing, one must take into account the ways in which the sample can pass. We denote the number of errors in the first stage as d1 and the number from the second stage as d2, and the rejection value for the first sample as c1 and for the second as c2 (applied to the combined count d1 + d2). For the (50 / 100) sample, where c1 = 4 and c2 = 7, a process can pass in any of four ways:


d1 = 0

d1 = 1 and d2 ≤ 5

d1 = 2 and d2 ≤ 4

d1 = 3 and d2 ≤ 3
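The four ways of passing can be combined into one overall pass probability. The sketch below assumes exact binomial probabilities for both the first 50 items and the remaining 100; the function and parameter names are illustrative.

```python
from math import comb


def binom_pmf(d: int, n: int, p: float) -> float:
    """Probability of exactly d errors in n items at error rate p."""
    return comb(n, d) * p**d * (1 - p) ** (n - d)


def binom_cdf(c: int, n: int, p: float) -> float:
    """Probability of no more than c errors in n items at error rate p."""
    return sum(binom_pmf(d, n, p) for d in range(c + 1))


def pass_probability(p: float, n1: int = 50, n2: int = 100,
                     c1: int = 4, c2: int = 7) -> float:
    """Overall probability that the two-stage sample passes."""
    total = binom_pmf(0, n1, p)  # d1 = 0: passes at the first stage
    for d1 in range(1, c1):      # d1 = 1..3: inconclusive, go to stage two
        # Passes only if the combined count d1 + d2 stays below c2.
        total += binom_pmf(d1, n1, p) * binom_cdf(c2 - 1 - d1, n2, p)
    return total


for rate in (0.01, 0.02, 0.03):
    prob = pass_probability(rate)
    print(f"{rate:.0%}", round(prob, 4), round(1 - prob, 4))
```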


Table 3 displays the joint results of the two-stage process.


Table 3. Joint Probabilities of Two-Stage Sample

                                                    Cumulative Probability
Errors →          0     1, 5     2, 4     3, 3     To Pass    To Fail
Error Rate
1%           0.6050   0.3054   0.0754   0.0120     0.9978     0.0022
2%           0.3642   0.3658   0.1764   0.0521     0.9585     0.0415
3%           0.2181   0.3100   0.2090   0.0818     0.8188     0.1812
4%           0.1299   0.2133   0.1737   0.0791     0.5960     0.4040
5%           0.0769   0.1247   0.1138   0.0567     0.3722     0.6278
6%           0.0453   0.0638   0.0626   0.0330     0.2048     0.7952
7%           0.0266   0.0291   0.0301   0.0165     0.1023     0.8977
8%           0.0155   0.0121   0.0129   0.0073     0.0478     0.9522
9%           0.0090   0.0046   0.0051   0.0029     0.0216     0.9784
10%          0.0052   0.0016   0.0018   0.0011     0.0097     0.9903


For an error rate of 2% or less, the Type I error is at most .0415. The Type II error decreases to approximately .10 at an underlying error rate of 7% and continues to fall as the error rate rises.



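¹ The probability of exactly d events (in this case errors) occurring with n trials where the population error rate is p is expressed as:

\[ P(d) = \binom{n}{d} \, p^{d} \, (1 - p)^{n - d} \]

The probability that no more than c events occur is:

\[ P(d \le c) = \sum_{d=0}^{c} \binom{n}{d} \, p^{d} \, (1 - p)^{n - d} \]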


File Type: application/msword
File Title: Burman,
Author: spisak.andrew
Last Modified By: skrable.burman
File Modified: 2011-02-14
File Created: 2011-02-10
