Supporting Statement Part B 1220-0157


National Longitudinal Survey of Youth 1997

OMB: 1220-0157


B. Collections of Information Employing Statistical Methods


1. Respondent Universe and Respondent Selection Method

This section summarizes the primary features of the sampling and statistical methods used to collect data and produce estimates for the NLSY97. Additional technical details are provided in the NLSY97 Technical Sampling Report, available online at http://www.nlsinfo.org/preview.php?filename=nlsy97techsamprpt.pdf.


Additional information about statistical methods and survey procedures is available in the NLSY97 User’s Guide at:

http://www.nlsinfo.org/nlsy97/docs/97HTML00/97guide/toc.htm


The initial sample was selected to represent (after appropriate weighting) the total U.S. population (including military personnel) 12 to 16 years of age on December 31, 1996. The sample selection procedure included an overrepresentation of blacks and Hispanics to facilitate statistically reliable analyses of these racial and ethnic groups. Appropriate weights are developed after each round so that the sample components can be combined to aggregate to the overall U.S. population of the same ages. Weights are needed to adjust for differences in selection probabilities, subgroup differences in participation rates, random fluctuations from known population totals, and survey undercoverage. Computation of the weights begins with the base weight and then adjusts for household screener nonresponse, sub-sampling, individual nonresponse, and post-stratification of the nonresponse-adjusted weights.

The number of sample cases in 1997, the first round, was 8,984. Retention rate information for subsequent rounds is shown in the table below. BLS anticipates approximately the same retention rate in Round 13 as was attained in Round 12, which was close to the Round 9 rate; retention increased in Round 10 and declined modestly in Rounds 11 and 12. Only sample members who completed an interview in Round 1 are considered in-scope for subsequent rounds. Even if NORC is unable to complete an interview with an in-scope sample member in one round, field staff attempt to complete an interview with that sample member in each subsequent round. The interview schedule is designed to pick up crucial information that was not collected in the missed interviews.
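To make the weight computation described above concrete, the following minimal Python sketch traces a single case through each adjustment. All rates and factors in the example are hypothetical illustrations, not NLSY97 values; the actual procedure is documented in the Technical Sampling Report.

def adjusted_weight(base_weight,
                    screener_response_rate,    # response rate in the case's screener adjustment cell
                    subsampling_rate,          # probability of retention at any sub-sampling step
                    individual_response_rate,  # response rate in the case's nonresponse adjustment cell
                    poststrat_factor):         # known population total / weighted sample total
    """Apply each adjustment in the order described in the text."""
    w = base_weight                # inverse of the original selection probability
    w /= screener_response_rate    # inflate for households lost at the screener
    w /= subsampling_rate          # inflate for cases dropped by sub-sampling
    w /= individual_response_rate  # inflate for individual nonresponse
    w *= poststrat_factor          # post-stratify to known population totals
    return w

# Hypothetical case: base weight 2,000; 95% screener response; no sub-sampling;
# 90% individual response; post-stratification factor 1.02.
print(round(adjusted_weight(2000, 0.95, 1.0, 0.90, 1.02), 2))  # 2385.96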


The schedule and sample retention rates of past survey rounds are shown in Table 3.


Table 3. NLSY97 Fielding Periods and Sample Retention Rates

Round | Months conducted                          | Total respondents | Retention rate | Number of deceased sample members | Retention rate excluding the deceased
1     | February–October 1997 and March–May 1998 | 8,984             | —              | —                                 | —
2     | October 1998–April 1999                  | 8,386             | 93.3           | 7                                 | 93.4
3     | October 1999–April 2000                  | 8,209             | 91.4           | 16                                | 91.5
4     | November 2000–May 2001                   | 8,081             | 89.9           | 15                                | 90.1
5     | November 2001–May 2002                   | 7,883             | 87.7           | 25                                | 88.0
6     | November 2002–May 2003                   | 7,898             | 87.9           | 30                                | 88.2
7     | November 2003–July 2004                  | 7,755             | 86.3           | 37                                | 86.7
8     | November 2004–July 2005                  | 7,503             | 83.5           | 45                                | 84.0
9     | October 2005–July 2006                   | 7,338             | 81.7           | 60                                | 82.2
10    | October 2006–May 2007                    | 7,555             | 84.1           | 77                                | 84.8
11    | October 2007–June 2008                   | 7,418             | 82.6           | 90                                | 83.4
12    | October 2008–June 2009                   | 7,400¹            | 82.4¹          | 104¹                              | 83.3¹

Note: The retention rate is defined as the percentage of base-year respondents who were interviewed in a given survey year.

¹ Projected.
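As a worked illustration of the two retention-rate definitions in Table 3, the short Python sketch below reproduces the Round 2 figures. The helper function is hypothetical; only the 8,984 base-year count and the respondent and deceased counts come from the table.

ROUND_1_RESPONDENTS = 8984  # base-year completes

def retention_rate(total_respondents, deceased=0):
    """Percentage of base-year respondents interviewed in a given round,
    optionally excluding deceased sample members from the denominator."""
    return 100 * total_respondents / (ROUND_1_RESPONDENTS - deceased)

# Round 2: 8,386 respondents, 7 deceased.
print(round(retention_rate(8386), 1))     # 93.3 (retention rate)
print(round(retention_rate(8386, 7), 1))  # 93.4 (excluding the deceased)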


2. Design and Procedures for the Information Collection

The NLSY97 includes personal interviews with all living Round 1 respondents, regardless of whether they subsequently become institutionalized, join the military, or move out of the United States. We employ a comprehensive strategy to contact and interview sample members. At each interview, detailed information is gathered about relatives and friends who could assist NORC field staff in locating respondents who cannot readily be found in a subsequent survey round. Every effort is made to locate respondents, and interviewers continue their contact attempts until respondents are reached; there is no arbitrary limit on the number of call-backs.


Before data collection begins, the NORC interviewers are carefully trained, with particular emphasis placed on resolving sensitive issues that arose in the pretest and in prior rounds. Most of the NORC interviewers have lengthy field experience from participating in earlier NLSY97 rounds as well as from involvement with the NLSY79 and other NORC surveys. All new recruits are given one day of personal training on general interviewing techniques, followed by three days of personal training on the questionnaire and field procedures. Experienced interviewers receive more than 8 hours of self-study training using specially designed materials that require study of the questionnaire and procedural specifications and include exercises on new or difficult sections and procedures.


Field interviewers are supervised by NORC Field Managers and their associates. NORC has divided the U.S. into 10 regions, each supervised by a Field Manager who is responsible for staffing and for the quality of field work in that region. A ratio of 1 supervisor to 15 interviewers is the standard arrangement. Field Managers are, in turn, supervised by one of the two Field Project Managers.


The interview content is prepared by professional staff at BLS, CHRR, and NORC. When new materials are incorporated into the questionnaire, assistance is generally sought from appropriate experts in the specific substantive area.


Because sample selection took place in 1997 in preparation for the baseline interview, sample composition will remain unchanged.


Some new activities are planned in Round 13 to supplement the main interview.


Release of Postsecondary Education Records

With BLS encouragement, a research team headed by Chandra Muller of the University of Texas at Austin has submitted a grant proposal to the National Institute of Child Health and Human Development (NICHD) to collect college transcripts and other postsecondary enrollment information for NLSY97 sample members. If this proposal is funded, BLS would seek signed permission forms from respondents granting BLS permission to obtain their transcripts. A draft permission form is provided in Attachment 7.


Permission will be sought from all respondents who have reported that they received a high school diploma or GED or completed coursework in a postsecondary degree program. We will seek permission from this broad group, rather than just the respondents who reported some college coursework, to help validate the educational attainment information that respondents have provided during the NLSY97 interviews. Some respondents who reported college coursework might not actually have completed such coursework. Similarly, some respondents who did not report any college experience may actually have attended college. For NLSY97 respondents who sign the permission form, we will obtain their transcripts and other information about college attendance through the National Student Clearinghouse (http://www.studentclearinghouse.org/).


Releases would be sought from respondents first by field interviewers at the time of the Round 13 in-person interview. A follow-up mail effort would take place after the close of the Round 13 data-collection period, requesting return of signed releases from sample members who completed the Round 13 interview by phone or did not complete the Round 13 interview at all. We estimate that, by Round 13, 7,425 respondents will have completed high school, earned a GED, or completed a term in a postsecondary degree program. Counting respondents both completing and not completing the Round 13 interview, we estimate that 6,311 will provide signed releases.


We estimate the respondent burden to read and sign the transcript release is 1.5 minutes per respondent.


We emphasize that collection of releases will take place only if NICHD awards a grant to fund the collection of the releases and the subsequent collection of college transcripts.


Web-based entry of locating data

During the Round 13 pretest, we plan a trial of web data collection in which respondents can provide selected locating data through web entry rather than during the locating section of the interview. This trial has several attractive features.


First, it is plausible that locating information is of higher quality when self-administered than when collected by an interviewer. Although interviewers can better ensure that complete data are provided and conscientiously entered, respondents are likely to be more familiar with the names, street names, and various numbers associated with their contact information. In addition, the locating section consumes several minutes of the interview interaction, and removing those minutes would likely reduce the burden associated with the interview itself. Web collection thus offers the possibility of higher data quality, reduced overall respondent burden (because most adults can read and type faster than questions can be read aloud and answers dictated), and a shorter interview session.


A second advantage is more forward looking. A strength of NLSY97 data collection strategies over time has been our emphasis on accommodating respondent preferences and convenience by being flexible and offering a range of options for interview completion. For example, interviewers will meet respondents at the location of their choosing, interviews are conducted at the time of respondents’ choosing, the interview can be broken up into several sessions, and we have made considerable use of phone interviewing as an option for respondent convenience. More recently, we have found that the use of e-mail, a respondent contact web site, and other online tools has given us yet another avenue for interacting with respondents at the times and in the modes they prefer. Although the complexity of the NLSY97 questionnaire makes it unlikely that pure self-administration could be attractive soon, it seems appropriate for the NLS program to continue exploring ways that the Internet can provide additional mode and convenience options to respondents.


Although attractive technical solutions have been devised for many of the technological problems of fielding confidential Federal surveys online, the challenges for online collection of any NLSY97 data likely would center on the human interface: Which respondents have Internet access, and where do they access it? How well could they keep track of the passwords and login information necessary for a secure site? How do we validate users? How will respondents perceive the privacy and scientific integrity of online data collection?


Specifically, we plan to offer respondents in the Round 13 pretest the option of providing selected portions of the NLSY97 locating data online in advance of the interview. To encourage respondents, we would offer a modest incentive of $5, to be added to their pretest interview respondent fee, and we would point out that the overall length of the interview session would be reduced because fewer questions would need to be administered during that session.


We would have a secure site for which respondents would need to provide login and password information. We also likely would have some preloaded information, such as whether the respondent had a driver’s license previously. We would not display any previously provided confidential information. For example, we would not say, “Aunt Sally used to live at 123 Mulberry Street. Is that still correct?”


To minimize the risk of compromised locating data, respondents’ web entries would be reviewed prior to the pretest interview. If clarification or retrieval were necessary, the interviewer would complete these tasks during the interview. For example, because we would not preload confidential information, the interviewer might need to confirm that Sara Hansen is in fact the Aunt Sally Hansen mentioned in a previous interview. Also, if a respondent entered the web form but did not provide zip codes or phone numbers, the interviewer would ask the full set of locating questions during the interview.


NIR Questionnaire

Although we continue to have excellent rates of return among sample members who missed some previous rounds, the NLSY97 now has a few hundred respondents who are extremely unlikely to complete an interview in any given round. At the start of Round 13, we expect there to be about 600 respondents who were last interviewed prior to Round 8. In Round 11, respondents who had missed at least 5 rounds completed interviews at a rate of 8.8%. Conceiving of these individuals as likely nonrespondents rather than as potential respondents may help us improve our understanding of our sample and our ability to convert long-term nonrespondents. We plan to field an experimental nonresponse questionnaire among sample members who have extremely low probabilities of completion.


The purpose of a noninterview respondent (NIR) questionnaire would be to capture key status information about the sample member’s life. This information could support nonresponse analyses that help us understand whom we are missing and what difference their absence might make in our analytic results. In addition, a brief, minimally intrusive “interview” might slightly increase these sample members’ willingness to participate in a full interview in subsequent rounds.


Operations. Respondents who had not completed the prior 5 interviews would be eligible for the NIR questionnaire. They would be asked to complete an NIR questionnaire if they refused the first three requests for a full interview or if, for some other reason, a field manager believed that the case could not result in a completed interview. We would offer a $13 incentive because it is Round 13. NIR questionnaire respondents would receive a modified thank-you letter. These cases would still be recorded as refusals but would also have an NIR questionnaire completed. We would still secure consent from respondents so that we could provide the NIR questionnaire responses in the public-use data file. Respondents completing the NIR questionnaire would be fielded for the main interview in the next round as usual, and they would still receive the full NIR premium in a future round if they completed the full interview. If the NIR questionnaire were continued in future rounds, respondents would not be permitted to complete NIR questionnaires in two consecutive rounds. Given the nature of the sample, the presumption is that almost all of these NIR questionnaires would be completed by telephone.
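For clarity, the fielding rules above can be restated as a simple eligibility check. The following Python sketch is an illustrative restatement, not project software; the function and its inputs are hypothetical.

def nir_eligible(missed_prior_five_rounds, full_interview_refusals,
                 field_manager_deems_unworkable, completed_nir_last_round):
    """A case becomes an NIR-questionnaire case only if the sample member
    missed the prior 5 interviews, did not complete an NIR questionnaire
    in the previous round, and the full-interview attempt has failed."""
    if not missed_prior_five_rounds:
        return False
    if completed_nir_last_round:  # no NIR questionnaires in consecutive rounds
        return False
    return full_interview_refusals >= 3 or field_manager_deems_unworkable

# A sample member who missed the prior 5 rounds and refused 3 full-interview requests:
print(nir_eligible(True, 3, False, False))  # True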


Given the 8.8% completion rate of the full interview among this group in Round 11, we might expect about 120 completed NIR questionnaires, based on an assumed 20% completion rate of the NIR questionnaire among the estimated 600 respondents at the start of Round 13 who were last interviewed prior to Round 8.


Questionnaire Design. These variables would simply be additional variables that would not be related to any others in the data file. If researchers wanted to combine these variables with other, similar measures, that would be the researcher’s choice and effort. To the extent possible, we would design the NIR questionnaire items to be easily merged with existing measures.


One approach to such a questionnaire would be to collect “critical” items that we would be most interested in and that could be incorporated into a future, full interview or considered a partial interview. The main argument in favor of this approach is that we should collect these pieces of data when sample members are more likely to recall them accurately. For example, names of jobs, names of spouses, and birth dates of children could be collected now, with the intention of retrieving additional information about those entities in a future, full interview. In addition, such an approach could be used to fill out key event history variables. This approach makes sense if we believe that we will eventually secure complete interviews with these respondents.


An alternative is simply to treat this questionnaire as a nonresponse indicator and an opportunity to have a pleasant, brief interview-like interaction with sample members who are unlikely to complete the full interview again. In this scenario, questions should capture key current status measures, with no eye toward providing anchors for a full event history enumeration. In subsequent interviews, if we ever gained cooperation for a full interview again, no attempts would be made to incorporate the previously reported information into the full interview. This latter approach is recommended here. If the experiment were successful, the nature of the questionnaire could change in subsequent rounds. Note that a very large fraction of papers using the NLSY actually use only these summary-type measures, so that the limitation to a handful of items may not be too restrictive.


The evaluation of the NIR questionnaire would include the following elements:

- What portion of eligible respondents completed the NIR questionnaire?

- What was the completion rate of the main interview among the treatment and control groups in this category (of sample members not interviewed since prior to Round 6)?

- What data quality do we see in the NIR questionnaire data? For example, what are item nonresponse rates? Do data appear internally consistent?

- Are we able to complete any nonresponse analyses based on these data? Given the small sample sizes, our analyses would likely be more qualitative than quantitative in nature.

- What impressionistic feedback do we get from field staff, project management staff, or sample members regarding this effort?


A draft of the NIR questionnaire is in Attachment 8.



3. Maximizing Response Rates

A number of the procedures that are used to maximize response rates have already been discussed in items 1 and 2 above. The other component of missing data is item nonresponse, which occurs when respondents refuse to answer a question or do not know the answer. Almost all items in the NLSY97 have low levels of nonresponse. For example, in prior rounds there was virtually no item nonresponse for basic questions like the type of residence respondents lived in (YHHI-4400) or the highest grade of school respondents had ever attended (YSCH-2857).


Even cognitively harder questions, such as “How many hours did you work per week?” (YEMP-23901), have low levels of nonresponse. In the hours-per-week example, 6 individuals out of 2,810 (0.2%) did not answer the question in Round 8.


Sensitive questions have the highest nonresponse. Table 4 presents examples of Round 10 questionnaire items that are among the most sensitive or cognitively difficult. Even very personal questions about sex have low rates of nonresponse. The top row of the table shows that the vast majority of respondents (over 95%) were willing and able to answer the question, “Did you have sexual intercourse since the last interview?” The third row shows that only 1.2% of respondents did not respond to the question on marijuana usage since the last interview. The fourth row shows that very few respondents (0.5%) did not answer whether they had carried a handgun since the last interview. Lastly, almost all respondents (0.6% nonresponse rate) were willing to reveal whether they had earned money from a job in the past year, but many did not know or refused to disclose exactly how much they had earned (20.4% nonresponse rate). Because high nonresponse rates were expected for the income-amount question, individuals who did not provide an exact answer were asked to estimate their income from a set of predetermined ranges. This considerably reduces nonresponse on the income question: only 6.4% of those who were asked to provide a range of income did not respond.



Table 4. Examples of Nonresponse Rates for Some Round 10 Sensitive Questions

Q Name     | Question                                      | Number Asked | Number Refused | Number Don’t Know | % Nonresponse
YSAQ2-299B | Have Sex Since Date of Last Interview?¹       | 7,460        | 283            | 25                | 4.1%
YSAQ-370C  | Use Marijuana Since Date of Last Interview?   | 7,460        | 73             | 20                | 1.2%
YSAQ-380   | Carry a Handgun Since Date of Last Interview? | 7,460        | 32             | 9                 | 0.5%
YINC-1400  | Receive Work Income in 2003?                  | 7,559        | 14             | 28                | 0.6%
YINC-1700  | How Much Income from All Jobs in 2003?        | 6,386        | 48             | 1,252             | 20.4%
YINC-1800  | Estimated Income from All Jobs in 2003?²      | 1,300        | 37             | 46                | 6.4%


¹ Asked of respondents who previously reported having sexual intercourse and who do not report a spouse or partner in the household.

² Asked of respondents who were unable or unwilling to answer the previous question (YINC-1700).
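The % Nonresponse column in Table 4 is computed as the sum of refusals and don’t-know responses divided by the number asked. The short Python sketch below, using a hypothetical helper function, reproduces two of the table’s entries.

def nonresponse_rate(asked, refused, dont_know):
    """Percent nonresponse = (refused + don't know) / number asked."""
    return 100 * (refused + dont_know) / asked

print(round(nonresponse_rate(7460, 283, 25), 1))   # 4.1  (YSAQ2-299B)
print(round(nonresponse_rate(6386, 48, 1252), 1))  # 20.4 (YINC-1700)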


4. Testing of Questionnaire Items

BLS is cautious about adding items to the NLSY97 questionnaire. Because the survey is longitudinal, poorly designed questions can result in flawed data and lost opportunities to capture contemporaneous information about important events in respondents’ lives. Poorly designed questions also can cause respondents to react negatively, making their future cooperation less likely. Thus, the NLSY97 design process employs a multi-tiered approach to the testing and review of questionnaire items.


When new items are proposed for the NLSY97 questionnaire, we often adopt questions that have been used previously in probability sample surveys with respondents resembling the NLSY97 sample. We have favored questions from the other surveys in the BLS National Longitudinal Surveys program to facilitate intergenerational comparisons. We also have used items from the Current Population Survey, the Federal Reserve Board’s Survey of Consumer Finances, the National Science Foundation-funded General Social Survey, and other Federally funded surveys.


All new questions are reviewed in their proposed NLSY97 context by survey methodologists who consider the appropriateness of questions (reference period, terms and definitions used, sensitivity, and so forth). Questions that are not well tested with NLSY97-type respondents undergo cognitive testing with convenience samples of respondents similar to the NLSY97 sample members. During Round 12 questionnaire development, for example, cognitive testing of a proposed module pertaining to fertility expectations revealed significant comprehension and interpretation difficulties among a convenience sample of 8 individuals in the NLSY97 age group. As a result of this testing, the module was excluded from the Round 12 interview. These questions have been redesigned and cognitively tested for inclusion in Round 13. We have also completed cognitive testing of the new questions for military veterans, the questions on the respondent’s favorite person, the health questions for all respondents, and the health questions specifically for respondents at age 29.


Existing questions are also reviewed each year. Respondents’ age and their life circumstances change, as does the societal environment in which the survey is conducted. Reviews of the data help us to identify questions that may cause respondent confusion, require revised response categories, or generate questionable data. Sources of information for these reviews include the questionnaire response data themselves, comments made by interviewers or respondents during the course of the interview, interviewer remarks after the interview, interviewer inquiries or comments throughout the course of data collection, other-specify coding, and comparison of NLSY97 response data to other sources for external validation. We also watch carefully the “leading edge” respondents, who answer some questions before the bulk of the sample – for example, the first respondents to attend graduate school or to get a divorce. These respondents are often atypical, but their interviews can reveal problems in question functionality or comprehensibility.


A comprehensive pretest is planned as part of this information collection request and would occur approximately four months before each round of the main NLSY97 to test survey procedures and questions. The pretest includes a heterogeneous sample of 201 respondents of various racial, ethnic, geographic, and socioeconomic backgrounds. On the basis of this pretest, the various questionnaire items, particularly those being asked for the first time, are evaluated with respect to question sensitivity and validity. When serious problems are revealed during the pretest, the problematic questions are deleted from the main NLSY97 instrument.


Although further edits to questionnaire wording are extremely rare, we monitor the first several hundred interviews each round with particular care. Based on this monitoring, field interviewers receive supplemental training on how best to administer questions that seem to be causing difficulty in the field or generating unexpected discrepancies in the data.


A. Round 13 questions that have not appeared in previous rounds of the NLSY97 include:


Workplace injury questions in the Employment section. These questions about injury or illness on the job are taken from the NLSY79, which included them in several rounds of the survey. The questions ask about the name of the employer, the date of the injury or illness, the work activity engaged in when injured, and whether the injury or illness resulted in missed workdays, loss of wages, or other changes at work. Respondents who report an injury or illness are also asked about workers’ compensation claims they may have filed.


Veterans questions in the Employment section. Respondents who report serving on active military duty will be asked a series of questions on their military service. Military veterans also will be asked about their experience with programs designed to help service members make the transition from military to civilian life. These questions are adapted from the Current Population Survey supplement on veterans.


Additional questions in the Health section. A few questions have been added to the Health section. These are designed to improve the breadth and detail of our coverage of health behaviors, which change frequently and generally cannot be elicited retrospectively after long periods have elapsed. The new questions come primarily from the NLSY79 Round 23 health section, including greater detail on exercise activity, diet and nutrition, and oral health. We have also introduced two additional questions on days of work missed due to emotional or mental health problems.


Age-29 Health module. In Round 13, we are introducing a new module for 29-year-olds, to be administered this round to the 1980 birth cohort. These questions are intended to be administered close to the respondent’s 29th birthday and are the first of what we intend to be a series of decennially administered sections that capture health outcome and preventive health practice information from respondents. The module includes questions about family history of selected common and heritable conditions, the date and cause of death of the respondent’s parents, preventive measures taken, and the SF-12, a widely used and tested health assessment that captures health functioning and limitation independent of access to the health-care system. With the exception of a few questions asking for more detail about diabetes in the respondent’s family, these items come from the NLSY79 “40-plus” and “50-plus” health modules, which also adopt a decennial-administration design for capturing selected health outcome and prevention information.


Fertility expectations questions. A new module is included this year, designed with the assistance of researchers Stacy Dickert-Conlin and Steven Haider of Michigan State University. These questions capture the respondent’s expectations of future marital status and fertility, as well as the respondent’s experience of fertility problems. They should enable researchers to study the ways in which men and women use perceptions of their own fecundity in planning sequences of education, employment, marriage, and other life stages. Expectations questions have been asked previously in the NLSY97, most notably in a Round 5 experiment designed to better understand respondents’ reports of expectations. The fertility expectations questions appear in the marriage, fertility, and second self-administered questionnaire sections.


Ladder of life questions. Questions asking respondents to place themselves on the “ladder of life” are taken from Pew Research Center and Pew Global Attitudes surveys conducted over the last two decades, most recently in 2008. These questions will be added to the Tell Us What You Think section in Round 13.


The ladder of life series of questions employs what researchers call a “self-anchoring scale.” Respondents are first asked to give a numerical rating to their present quality of life. Then, having anchored themselves in the present, they are asked to rate the past and future the same way. They are not asked whether they think the future (or past) is better or worse; they are simply asked, in succession, to rate three points in time on the same numerical scale. This battery of questions was developed by Hadley Cantril and colleagues and has been asked by a number of different organizations over the years. (See Cantril, Hadley. 1965. The Pattern of Human Concerns. New Brunswick, NJ: Rutgers University Press.)


Speech patterns. Questions about the respondent’s favorite person will be added to the Tell Us What You Think section, and a question about a visit to the doctor’s office will be added to the Health section. These questions are part of our effort to collect information on respondents’ speech patterns. This work is contingent on funding requested from the National Institutes of Health (NIH grant proposal 1R01HD061585-01) by Professor Jeffrey Grogger of the University of Chicago. Professor Grogger also is chair of the NLS Technical Review Committee.


Whites and many African Americans speak different varieties of English. Linguists have documented such differences extensively, and social psychologists have shown that speech patterns influence listeners’ attitudes toward the speaker. There has been little research into the socioeconomic consequences of racial differences in speech, however. Some recent research by Professor Grogger shows that speech differences explain a sizeable share of the racial wage gap among young workers (Grogger 2008). Specifically, black workers with distinctly African American speech patterns earned roughly 12 percent less than comparably skilled white workers. Black workers without distinctly African American speech patterns faced only a 3 percent wage penalty. The goal of the proposed research is to expand the understanding of the role speech patterns play in explaining labor market differences between blacks and whites.


For Hispanics, the link between language and labor market success has been widely studied. Numerous studies show that English proficiency is strongly related to wages and earnings, and Trejo (1997) estimates that most of the wage gap between Mexican immigrants and U.S. natives is linked to language. In contrast, the link between language and the black-white wage gap has never been analyzed, despite an extensive linguistics literature documenting differences between Standard American English (SAE) and African American English (AAE) and strong evidence linking AAE to early educational disadvantage. This study seeks to remedy this omission in the literature.


To collect speech data, respondents will be asked to answer two sets of stimulus questions, one designed to capture formal speech and the other casual speech. Many African Americans speak both AAE and SAE, using SAE in more formal settings and AAE in more casual contexts (Baugh 1983; Labov 1972). By priming respondents for formal and informal speech, we plan to construct a measure of code-switching among bidialectal respondents.


To prompt relatively formal speech, we will pose the following question:


Suppose you had the flu bad enough that you went to a clinic or doctor’s office. How would you explain to the doctor or nurse how you felt?


We expect the doctor’s office setting to prime bidialectal respondents to speak SAE because it represents a relatively formal setting, even though doctor visits are a common occurrence (95 percent of NLSY97 respondents reported having seen a doctor for a routine checkup during the past five years). Furthermore, Baugh (1983) and Rickford and McNair-Knox (1994) report that code-switching is influenced by the topic of discussion, and Linnes (1998) reports this pattern to be particularly pronounced for speakers under age 30, such as the NLSY97 respondents.


The questions to capture casual speech take the following form:


Lead-in: I want you to think of somebody who is one of your favorite people, maybe a friend or relative. (If the respondent asks: You don’t have to tell me who it is; he/she doesn’t have to be your absolute favorite, just one of your favorites.)


(1) Describe his (her) personality for me.


(2) What does he (she) do that makes him (her) one of your favorites?


(3) If you needed help with something, what would he (she) do?


The lead-in is designed to put the respondent at ease. The subject matter is chosen to put the respondent in a relaxed frame of mind and induce him/her to speak informally. Linnes (1998) and Rickford and McNair-Knox (1994) report that topics related to family and friends are particularly effective in evoking AAE features. Furthermore, the questions are designed to elicit specific high-frequency features. The first question, asking for a description, is designed to elicit copula deletion (for example, “she nice” versus “she’s nice”). The second question is designed to elicit the production of third person singular present tense forms (for example, “she say” versus “she says”). The third question is designed to elicit auxiliary deletion (for example, “he help” versus “he would help”). Of the roughly 30 features that distinguish AAE and SAE, these three features occur most frequently in adults (Linnes 1998; Washington and Craig 2002).


To analyze the effects of speech patterns, we need to transform the audio responses to our stimulus questions into measures that can be incorporated into econometric models. We propose to do this through two approaches. First, we plan to collect listener perception data of the type employed in Grogger (2008). The questionnaire that listeners will use to code their perceptions of the race, ethnicity, and other characteristics of speakers is shown in Attachment 9.


Second, we plan to construct quantitative dialect density measures (DDMs) based on linguistic analyses of the respondent’s speech. The DDM is the ratio of the number of AAE features produced by the speaker to the number of words he or she speaks. To compute the DDM, AAE features produced in response to the stimulus questions will be coded for a potential set of 24 morphological and 9 phonological types based on an established coding system (Craig & Washington, 2006). Coding will be supported by the Child Language Data Exchange System (CHILDES; MacWhinney, 1994): the frequency command (FREQ) tallies frequencies of coded behaviors, and the mean length of turn command (MLT) counts units within a turn, such as words. The Computerized Language Analysis Program (CLAN) from CHILDES automatically tallies production frequencies and rates. From the computerized tallies, the DDM will be calculated, with AAE feature frequencies (tokens), regardless of type, divided by the total number of words produced in response to the questions (Craig, Washington, & Thompson-Porter, 1998; Oetting & McDonald, 2002). For example, 5 instances of AAE features in 100 words would yield a DDM of 0.05, corresponding to the production of one feature every 20 words.


To construct the DDM, there is no questionnaire per se. Rather, technicians transcribe the respondent’s speech according to the standards of the CHILDES speech transcription system. Morphosyntactic and phonological features unique to AAE are incorporated through modifications to this system specified in Craig, Washington, and Thompson-Porter (1998). The transcripts are then processed by the CLAN component of the CHILDES system. CLAN provides counts of AAE features, which are then normalized by the total number of words spoken to produce the DDM, a numeric measure that lies between 0 and 1 and indicates the rate at which the speaker produces distinctive AAE features.
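As a minimal illustration of the calculation, the Python sketch below computes the DDM from a feature tally and a word count, reproducing the 5-in-100 example above. The counts stand in for the CHILDES/CLAN tallies, and the function is a hypothetical illustration, not project software.

def dialect_density(aae_feature_tokens, total_words):
    """DDM = AAE feature tokens (any type) / total words spoken."""
    if total_words == 0:
        raise ValueError("no transcribed speech")
    return aae_feature_tokens / total_words

print(dialect_density(5, 100))  # 0.05, i.e., one feature every 20 words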


NORC’s staff of telephone interviewers will listen to the recorded speech of NLSY97 respondents and code information about their speech patterns. Using NORC staff members provides for much greater security than could be achieved through other means. Although the audio data themselves are confidential due to their potential to identify respondents, the measures we propose to construct and make available on public-use files need not be, since they pose no significant confidentiality risk.


References


Craig, H. K., Washington, J. A., & Thompson-Porter, C. (1998). “Average c-unit lengths in the discourse of African American children from low-income, urban homes.” Journal of Speech, Language, and Hearing Research, 41, 433–444.


Trejo, S. J. (1997). “Why Do Mexican Americans Earn Low Wages?” Journal of Political Economy, 105(6), 1235–1268.




A list of all changes to the NLSY97 questionnaire from Round 12 to Round 13 is shown in Attachment 10.



5. Statistical Consultant

Kirk M. Wolter

Senior Fellow and Director

Center for Excellence in Survey Research

NORC

55 East Monroe Street

Chicago, IL 60603

(312) 759-4206


The sample design was developed by NORC, which continues to conduct the interviewing fieldwork.

