APPENDIX A: Descriptions of Methodological Testing Techniques
OFFICE OF STEM ENGAGEMENT/PERFORMANCE MEASUREMENT AND EVALUATION (TESTING) SUPPORTING STATEMENT
The respondent universe for NASA Office of STEM Engagement (OSTEM) methodological testing consists of individuals who either participate in NASA STEM Engagement activities or are staff managing STEM Engagement activities (both at NASA and funded through NASA grants, cooperative agreements, and contracts). It is difficult to anticipate and define all types of potential respondents under this generic clearance beyond its most immediate needs, but the individuals who could represent the respondent universe in this generic submission are described below:
Undergraduate and graduate students participating in NASA-funded internships and fellowships;
P-12 and informal educators and higher education faculty participating in NASA-funded educator professional development;
Precollege students participating in NASA-funded STEM Engagement projects and activities;
NASA civil servants who manage projects and activities; and
Principal investigators and managers of NASA-funded grants, cooperative agreements, and contracts.
Respondent categories, each with a corresponding estimate of the Potential Respondent Universe (N) anticipated for this generic clearance, can be found below (see Table 1). Expected Response Rate is defined as the rate of response OSTEM has observed in the past for that particular Respondent Category. For instance, precollege, undergraduate, graduate, and post-graduate students are especially responsive to OSTEM requests for information because, from the onset of their relationship with NASA, they cannot apply for NASA internships and fellowships with incomplete information. For that reason, these respondents willingly partner with OSTEM to keep their contact information current in order to access information about relevant opportunities. The OSTEM Educational Platform and Tools Team maintains IT applications that make updating contact information less burdensome: automatically delivered links to the NASA STEM Gateway system take the participant directly to a login screen appropriate to his or her project, activity, or program.
Further, because these categories of respondents are highly motivated individuals, they understand the value of submitting feedback to optimize future opportunities they may be awarded. They therefore tend to cooperate with NASA OSTEM requests for information at a rate of 60%. The same can be said for educator participants, who must complete information in our systems in order to take part in professional development opportunities or provide retrospective feedback on the NASA STEM Engagement activities they facilitate or instruct.
External program managers are required to submit information to our online data collection systems, so it is not difficult to leverage Center points of contact to obtain submitted data in a timely fashion. Therefore, 100% compliance with a request for information, even in the form of participation in data collection instrumentation, is a reasonable expectation. Note that some testing methods (e.g., focus groups, cognitive interviews) require nine participants or fewer; those numbers are not reflected below. Data collected through focus groups and cognitive interviews for testing purposes will not be used to generalize results, but rather for preliminary item and instrument development and piloting only.1 Table 1 below reflects the potential respondent universe, expected response rates, and statistically adjusted number of respondents for each respondent category.
Table 1: Respondent Universe and Relevant Numbers
Respondent Category | Potential Respondent Universe (N) | Expected Response Rate (R) | Statistically Adjusted Number of Respondents (n)
NASA STEM Gateway | | |
Students (15 and younger) | 9,200 | 0.6 | 9,200
Students (16 and older) | 9,200 | 0.6 | 9,200
Educators and Parents | 4,000 | 0.6 | 4,000
Total | 22,400 | | 22,400
1 Further description of methodological testing techniques can be found in Appendix A.
Systematic Random Sampling
For each Respondent Category, OSTEM has identified the potential respondent universe for the purposes of piloting instruments. OSTEM will systematically generate a random sample whose length corresponds to the Statistically Adjusted Number of Respondents (n), selecting every kth element from the population list, where the interval k is chosen so that the resulting sample contains n respondents (Hesse-Biber, 2010, p. 50). This process attempts to create a sampling frame whose characteristics closely approximate those of the Respondent Universe for each data collection instrument.
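To illustrate the every-kth-element selection described above, the following Python sketch draws a systematic random sample from a participant roster. The roster, function name, and interval calculation are illustrative assumptions, not part of OSTEM's systems.

```python
import random

def systematic_sample(population, n):
    """Select n elements by taking every k-th element after a random start."""
    k = max(len(population) // n, 1)   # sampling interval (illustrative choice)
    start = random.randrange(k)        # random starting point within the first interval
    return population[start::k][:n]    # every k-th element, truncated to n respondents

# Hypothetical roster of participant identifiers.
roster = [f"participant_{i:03d}" for i in range(20)]
print(systematic_sample(roster, 5))
```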
Nonprobability Purposive Sampling
For the purposes of focus groups and cognitive interviews, nonprobability purposive sampling will be used, wherein the research purpose determines the type of elements or respondents selected for the sample. This sampling strategy gathers specific informants deemed likely to exemplify patterns of behavior or characteristics reflective of the Respondent Universe from which they are drawn, as needed for the purposes of the particular data collection instrument under development (Hesse-Biber, 2010, p. 126). Even if a focus group or cognitive interview fails to yield persuasive results, the P&E Team will not interview a participant more than once; instead, the P&E Team will recruit an entirely new focus group or set of cognitive interview participants. Achieving statistical rigor later in the process begins with avoiding the introduction of confounding variables in the preliminary stages of instrument design. Interviewing a participant twice in a cognitive interview, or including him or her in a new focus group, could introduce confounding variables and is therefore avoided entirely.
PROCEDURES FOR COLLECTING INFORMATION
Describe the procedures for the collection of information including:
Statistical methodology for stratification and sample selection:
Not applicable. For the purposes of this data collection instrument development, OSTEM has no need for instrumentation specific to subgroups within any of the Respondent Universe categories of interest.
For the purposes of large-scale statistical testing, we will consider the aforementioned variables within the context of this methodological testing package to ensure that the collection of responses statistically resembles each Respondent Universe. For each instrument tested under this methodological testing package, we will identify a sample that is appropriate to the instrument and the type of collection.
Degree of accuracy needed for the purpose described in the justification:
OSTEM project activities target STEM-related outcomes. Hence, instrumentation, and the samples with which data collection instruments are tested, must support a high degree of accuracy. Moreover, because data from these instruments are used to inform policy, a high degree of accuracy must be maintained throughout the entire data collection instrument development process.
Unusual problems requiring specialized sampling procedures:
Not applicable. The P&E Team does not foresee any unusual problems with executing pilot or large-scale statistical testing via the procedures described.
2 Again, in this instance, the category "pre-college" refers to students who are over the age of consent but have not formally enrolled in a college or university. As such, this group of students applies for opportunities associated with college preparation as a means to become more competitive for college enrollment or to explore potential STEM majors prior to enrolling in a college or university.
Any use of periodic (less frequent than annual) data collection cycles to reduce burden.
Since this information collection request applies to methodological testing activities, data collection activities will occur as needed to gather statistically significant data to determine the validity and reliability characteristics of instruments and, where applicable, the psychometric properties of instrumentation.
Rigorously tested data collection instrumentation is a requirement for accurate performance reporting. If these testing activities are not conducted, NASA will not be able to conduct basic program office functions such as strategic planning and management.
Without timely and complete planning, execution, and outcome (survey) data collected by valid and reliable instruments, OSTEM will be unable to assess program effectiveness, meet federal and agency reporting requirements, or make data-informed management decisions.
Less timely and complete information would adversely affect the quality and reliability of the above-mentioned endeavors. The degradation of any single component of our data collection would jeopardize the integrity and value of the entire suite of applications and the integrity of our databases.
Information collected under the purview of this clearance will be maintained in accordance with the Privacy Act of 1974, the E-Government Act of 2002, the Federal Records Act, and, as applicable, the Freedom of Information Act in order to protect respondents’ privacy and the confidentiality of the data collected (see: http://www.nasa.gov/privacy/nasa_sorn_10EDUA.html). Further information on data security is provided in Appendix B.
Maximizing response rates and managing non-response are equally relevant concerns in recruiting participants for pilot testing and for routine administration of data collection instruments. In that regard, the P&E Team, in collaboration with the Educational Platforms and Tools Team and Center STEM Engagement Offices, intends to use these methods to reach each targeted population and yield statistically significant data from a random sample of at least 200 respondents in order to determine initial reliability coefficients and validity (Komrey and Bacon, 1992; Reckase, 2000). The same procedures will be employed during regular data collection through OMB-approved instruments: the effectiveness of participant recruitment strategies and response rates are inextricably linked, and procedures for maximizing response rates, however complex, are interdependent (Barclay, Todd, Finlay, Grande, & Wyatt, 2002). Therefore, despite the wide range of data sources being recruited for study participation (undergraduate students, graduate students, or educators, for instance), the same strategies for maximizing response apply.
Study Participant Recruiting
The P&E Team will work in collaboration with the Educational Platform and Tools Team and Center STEM Engagement Offices, using a combination of recruitment by NASA Center STEM Engagement Directors and automated email reminders adapted from Swail and Russo (2010), to maximize participant response rates for data collection instrument testing. Participant contact lists will be solicited from the appropriate Center Point of Contact (POC) for each respondent population sampled. Center POCs will have one month to identify respondents who agree to participate and to submit their contact information to the P&E Team. Bi-weekly reminders will be sent, and follow-up phone calls will be made to POCs as needed.
Participant Assignment to Study
Using random assignment, respondents will be assigned to an instrument for which their responses are appropriate, with the goal of having equal numbers of participants completing instruments across testing sites and of avoiding Center effects, that is, responses to survey instruments attributable to a participant’s Center culture.
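As a rough illustration of this balanced random assignment, the sketch below shuffles respondents within each Center and assigns instruments round-robin so that counts stay even across instruments; the respondent data, Center abbreviations, and function name are hypothetical.

```python
import random
from collections import defaultdict
from itertools import cycle

def assign_instruments(respondents, instruments, seed=0):
    """Randomly assign respondents to instruments, balanced within each Center."""
    rng = random.Random(seed)
    by_center = defaultdict(list)
    for respondent_id, center in respondents:
        by_center[center].append(respondent_id)

    assignments = {}
    for center, ids in by_center.items():
        rng.shuffle(ids)                                # randomize order within the Center
        for respondent_id, instrument in zip(ids, cycle(instruments)):
            assignments[respondent_id] = instrument     # round-robin keeps counts even
    return assignments

# Hypothetical respondents paired with their Centers.
respondents = [("r1", "JSC"), ("r2", "JSC"), ("r3", "GSFC"), ("r4", "GSFC")]
print(assign_instruments(respondents, ["Survey A", "Survey B"]))
```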
This submission is itself a request for authorization to conduct tests of data collection instruments that are in development and/or require OMB approval. The purpose of cognitive and other forms of intensive interviewing, and of the testing methods covered by this request generally, is not to obtain data, but rather to obtain information about the processes people use to answer questions and to identify any potential problems in the question items or instruments prior to piloting with a statistically relevant sample of respondents. In some cases, focus group and/or cognitive interview protocols will be submitted for OMB approval. In other cases, where the evidence base provided by the educational measurement research literature has supported a reasonable instrument draft consistent with a program activity, the instrument draft will be submitted to OMB for approval for pilot testing. The testing procedures and methodologies to be used by OSTEM and its contractors are, overall, consistent with the educational measurement research literature evidence base and with other Federal agencies engaged in STEM program performance data collection.
NASA OSTEM Leadership, along with Richard L. Gilmore Jr. (NASA Office of STEM Engagement, Performance Assessment and Evaluation Program Manager), has consulted its contractor support workforce’s subject matter experts in performance measure development; data collection instrument design; quantitative, qualitative, and mixed-methods research; inferential and descriptive statistics; user-generated content; big data analytics; and education research and analysis.
Barclay, S., Todd, C., Finlay, I., Grande, G., & Wyatt, P. (2002). Not another questionnaire! Maximizing the response rate, predicting non-response and assessing non-response bias in postal questionnaire studies of GPs. Family Practice, 19(1), 105-111.
Blalock, H. M. (1972). Social statistics. New York, NY: McGraw-Hill.
Colton, D., & Covert, R. W. (2007). Designing and constructing instruments for social research and evaluation. San Francisco: John Wiley and Sons, Inc.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 1-9.
Davidshofer, K. R., & Murphy, C. O. (2005). Psychological testing: Principles and applications (6th ed.). Upper Saddle River, NJ: Pearson/Prentice Hall.
DeMars, C. (2010). Item response theory. New York: Oxford University Press.
Fabrigar, L. R., & Wegener, D. T. (2011). Exploratory factor analysis. New York, NY: Oxford University Press.
Haladyna, T. M. (2004). Developing and validating multiple-choice test items (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Hesse-Biber, S. N. (2010). Mixed methods research: Merging theory with practice. New York: Guilford Press.
Jääskeläinen, R. (2010). Think-aloud protocol. In Y. Gambier & L. Van Doorslaer (Eds.), Handbook of translation studies (pp. 371-373). Philadelphia, PA: John Benjamins.
Komrey, J. D., & Bacon, T. P. (1992). Item analysis of achievement tests based on small numbers of examinees. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.
Kota, K. (n.d.). Testing your web application: A quick 10-step guide. Retrieved from http://www.adminstrack.com/articles/testing_web_apps.pdf.
Reckase, M. D. (2000). The minimum sample size needed to calibrate items using the three-parameter logistic model. Paper presented at the annual meeting of the American Educational Research Association, New Orleans.
Swail, W. S., & Russo, R. (2010). Instrument field test: Quantitative summary. Library of Congress- Teaching with Primary Sources: Educational Policy Institute.
Wilson, M. (2005). Constructing measures: An item response modeling approach. New York: Psychology Press.
Yamane, T. (1973). Statistics: An introductory analysis. New York: Harper & Row.
Usability testing: Pertinent here are the aspects of the web user interface (UI) that affect the user’s experience and the accuracy and reliability of the information users submit. The ease with which users navigate the data collection screens and the ease with which they access the actions and functionality available during data input are equally important. User experience is also affected by the look and feel of the web UI and the consistency of aesthetics from page to page, including font type, size, color scheme, and the way screen real estate is used (Kota, n.d.). The foundation for usability testing will be a think-aloud protocol analysis, as described by Jääskeläinen (2010), that exposes distractions to accurate data input, whereas a short Likert-scale survey with qualitative questions will determine the extent and nature of the distractions that impede accurate data input.
Think-aloud protocols (commonly referred to as cognitive interviewing): This data elicitation method is also called ‘concurrent verbalization’, meaning subjects are asked to perform a task and to verbalize whatever comes to mind during task performance. The written transcripts of the verbalizations are referred to as think-aloud protocols (TAPs) (Jääskeläinen, 2010, p. 371) and constitute the data on the cognitive processes involved in a task (Ericsson & Simon, 1984/1993). When elicited with proper care and instruction, thinking aloud does not alter the course or structure of thought processes, apart from slightly slowing them down. Although high cognitive load can hinder verbalization by occupying all available cognitive resources, that property is of no concern for the tasks under analysis, which are restricted to information actively processed in working memory (Jääskeläinen, 2010, p. 371). For the purposes of NASA OSTEM, think-aloud protocols will be especially useful for improving existing data collection screens and developing new ones, which differ in purpose from online applications. Whereas an online application is an electronic collection of fields that one either scrolls through or submits page by completed page, data collection screens represent hierarchical layers of interconnected information for which user training is required. Because user training is required for proper navigation, think-aloud protocols capture the user experience so that it can be incorporated into a more user-friendly design and implementation of this kind of technology. Lastly, data from think-aloud protocols are used to ensure that user experiences are reliable and consistent in support of collecting robust data.
Focus group interviews: With groups of nine or fewer per instrument, this qualitative approach to data collection is a matter of brainstorming to creatively solve remaining problems identified after early usability testing of data collection screens and program application form instruments (Colton & Covert, 2007, p. 37). Data from this type of research will include audiotapes obtained with participant consent, meeting minutes taken by a subject matter expert in administrative assistance, and reflective comments submitted by participants after the conclusion of the focus group. Focus group interviews may be used to refine items that failed initial reliability testing for the purposes of retesting. Lastly, focus group interviews may be used with participants as a basis for a grounded theory approach to instrument development or for refining an existing instrument to suit a specific audience.
Comprehensibility testing: Comprehensibility testing of program activity survey instrumentation will determine whether items and instructions make sense, are free of ambiguity, and are understandable by those who will complete them. For example, comprehensibility testing will determine whether items are complex or wordy, or incorporate discipline- or culturally-inappropriate language (Colton & Covert, 2007, p. 129).
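One way such wordiness screening could be partially automated is sketched below; the 25-word threshold and the sample items are illustrative assumptions, not an OSTEM standard, and such a check would supplement rather than replace human review.

```python
import re

def flag_wordy_items(items, max_words=25):
    """Return survey items whose word count exceeds max_words (illustrative threshold)."""
    return [item for item in items if len(re.findall(r"[A-Za-z']+", item)) > max_words]

# Hypothetical survey items: one concise, one likely too wordy.
items = [
    "How satisfied were you with your internship experience overall?",
    ("Thinking back over the entire duration of your participation, and taking into "
     "account every mentor interaction, facility tour, and technical task you were "
     "assigned, how would you characterize your overall level of satisfaction?"),
]
print(flag_wordy_items(items))
```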
Pilot testing: After program activity survey instruments have performed satisfactorily in readability and comprehensibility testing, the next phase is pilot testing with a sample of the target population that will yield statistically significant data, a random sample of at least 200 respondents (Komrey and Bacon, 1992; Reckase, 2000). The goal of pilot testing is to yield preliminary validity and reliability data to determine if items and the instrument are functioning properly (Haladyna, 2004; Wilson, 2005). Data gleaned from pilot testing will be used to fine-tune items and the instrument in preparation for more complex statistical analysis upon large-scale statistical testing.
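As an example of the kind of preliminary reliability evidence pilot data can yield, the sketch below computes Cronbach's alpha on synthetic responses from 200 respondents; the data and function name are illustrative, and alpha is only one of several possible reliability estimates.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    scores = np.asarray(item_scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Synthetic pilot data: 200 respondents, 10 items loading on one latent trait.
rng = np.random.default_rng(0)
trait = rng.normal(size=(200, 1))
pilot = trait + rng.normal(size=(200, 10))
print(round(cronbach_alpha(pilot), 3))
```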
Large-scale statistical testing: Instrument testing conducted with a statistically representative sample of responses from a population of interest. In the case of developing scales, large-scale statistical testing provides sufficient data points for exploratory factor analysis (EFA), a multivariate statistical method used to uncover the underlying structure of a relatively large set of variables; it is commonly used when developing a scale, a collection of questions used to measure a particular research topic (Fabrigar & Wegener, 2011). EFA is a “large-sample” procedure where generalizable and/or replicable results are a desired outcome (Costello & Osborne, 2005, p. 5). This technique is particularly relevant to examining relationships between participant traits and the desired outcomes of NASA OSTEM project activities.
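A minimal sketch of an exploratory factor analysis on synthetic large-sample data follows; the use of scikit-learn's FactorAnalysis with varimax rotation, and the simulated two-factor structure, are illustrative assumptions rather than OSTEM's specified tooling.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Synthetic large-sample responses: two latent factors driving ten items.
rng = np.random.default_rng(0)
factors = rng.normal(size=(1000, 2))                    # latent factor scores per respondent
loadings = rng.normal(size=(2, 10))                     # true item loadings
responses = factors @ loadings + rng.normal(scale=0.5, size=(1000, 10))

efa = FactorAnalysis(n_components=2, rotation="varimax")
efa.fit(responses)
print(np.round(efa.components_, 2))                     # estimated item loadings per factor
```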
Item response approach to constructing measures: Foundations for testing that address the importance of item development for validity purposes, address item content to align with cognitive processes of instrument respondents, and that acknowledge guidelines for proper instrument development will be utilized in a systematic and rigorous process (DeMars, 2010). Validity will be determined as arising from item development, from statistical study of item responses, and from exploring item response patterns via methods prescribed by Haladyna (2004) and Wilson (2005).
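For reference, the three-parameter logistic (3PL) item response function cited via Reckase (2000) can be written as a short function; the parameter values below are illustrative, and this is a sketch of the response model, not a calibration routine.

```python
import numpy as np

def three_pl(theta, a, b, c):
    """Three-parameter logistic model: P(correct) = c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# Probability of a correct response across a range of respondent abilities (illustrative item).
abilities = np.linspace(-3, 3, 7)
print(np.round(three_pl(abilities, a=1.2, b=0.0, c=0.2), 3))
```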
Split-half method: This method for determining test reliability is an efficient alternative to parallel-forms or test/retest methods. The split-half method does not require developing alternate forms of a survey, and it places a reduced burden on respondents compared with other methods, requiring participation in a single test session rather than retesting at a later date. This method involves administering a test to a group of individuals, dividing the test in half along odd and even item numbers, and then correlating scores on one half of the test with scores on the other half (Davidshofer & Murphy, 2005).
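A minimal sketch of the odd/even split with the Spearman-Brown step-up correction, using synthetic data, follows; the function name and data are illustrative.

```python
import numpy as np

def split_half_reliability(item_scores):
    """Odd/even split-half reliability with the Spearman-Brown step-up correction."""
    scores = np.asarray(item_scores, dtype=float)
    odd = scores[:, 0::2].sum(axis=1)              # total score on odd-numbered items
    even = scores[:, 1::2].sum(axis=1)             # total score on even-numbered items
    r_half = np.corrcoef(odd, even)[0, 1]          # correlation between the two halves
    return 2 * r_half / (1 + r_half)               # Spearman-Brown corrected reliability

# Synthetic data: 200 respondents, 12 items loading on one latent trait.
rng = np.random.default_rng(0)
trait = rng.normal(size=(200, 1))
scores = trait + rng.normal(size=(200, 12))
print(round(split_half_reliability(scores), 3))
```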
Information collected under the purview of this clearance will be maintained in accordance with the Privacy Act of 1974, the E-Government Act of 2002, the Federal Records Act, NPR 7100.1, and, as applicable, the Freedom of Information Act in order to protect respondents’ privacy and the confidentiality of the data collected.5
Data is maintained on secure NASA servers and protected in accordance with NASA regulations at 14 CFR 1212.605.
Approved security plans are in place for the NASA STEM Gateway system in accordance with the Federal Information Security Management Act of 2002 and Office of Management and Budget, Circular A-130, Management of Federal Information Resources.
Only authorized personnel requiring information in the official discharge of their duties are authorized access to records from workstations within the NASA Intranet or via a secure Virtual Private Network (VPN) connection that requires two-factor hardware token authentication.
NASA STEM Gateway resides in a certified NASA data center and has met strict requirements relating to application security, network security, and backup/recovery of the NASA Office of the Chief Information Officer’s security plan.
Data will be secured and removed from this server and location in accordance with guidelines set out in NRRS/1392, 68-69. Specific guidelines relevant to the OPEM system include the following:
Project management records documenting basic information about projects and/or opportunities, including basic project descriptions, funding amounts and sources, project managers, and NASA Centers, will be destroyed when 10 years old or when no longer needed, whichever is longer.
Records of participants (in any format), maintained either as individual files identified by individual name or number, or in aggregated files of multiple participants identified by name or number, including but not limited to application forms and personal information supplied by the individuals, will be destroyed 5 years after the last activity with the file.
Survey responses and other feedback (in any format) from project participants and the general public concerning NASA educational programs, including interest area preferences, participant feedback, and reports of experiences in projects, will be destroyed when 10 years old or when no longer needed, whichever is longer.
5 http://www.nasa.gov/privacy/nasa_sorn_10EDUA.html
The following Privacy Act Statement and Paperwork Reduction Act (PRA) Statement, edited per data collection source, will be posted on all data collection screens and instruments, and will be provided to participants in methodological testing activities per NPR 7100.1:
Privacy Act Statement: In accordance with the Privacy Act of 1974, as amended (5 U.S.C. 552a), you are hereby notified that this study is sponsored by the National Aeronautics and Space Administration (NASA) Office of Education, under authority of the Government Performance and Results Modernization Act (GPRMA) of 2010 that requires quarterly performance assessment of Government programs for purposes of assessing agency performance and improvement. Your participation is important to the success of this study. The information we collect will help us improve the nature of NASA education project activities and the accuracy with which NASA Office of Education can report to the stakeholders about the project activities offered.
Paperwork Reduction Act Statement: This information collection meets the requirements of 44 U.S.C. §3507, as amended by section 2 of the Paperwork Reduction Act of 1995. You do not need to answer these questions unless we display a valid Office of Management and Budget (OMB) control number. The OMB control number for this collection is 2700-0159 and expires mm/dd/yyyy. Send comments to: [email protected].