8 Quantitative Data Analysis
Yaprak Dalat Ward
Definitions of Key Terms
- Causation: A relationship between variables, wherein one causes a change in another.
- Correlation: A statistical relationship between variables, wherein they vary positively (when one goes up, the other also goes up; or when one goes down, the other also goes down) or negatively (when one goes up, the other instead goes down; or when one goes down, the other goes up).
- Correlational Design: A quantitative research design that examines relationships between variables but does not imply causation.
- Data: The plural form of the singular “datum.” Leedy and Ormrod (2005) defined data as the manifestations of what reality is. In quantitative research, numerical data are collected, but data can take many other forms.
- Descriptive Design: A quantitative research design that describes trends and characteristics (e.g., surveys, observational studies) in terms of descriptive statistics.
- Descriptive Statistics: Simple measures that describe a variable, such as mean, median, mode, standard deviation, variance.
- Experimental Design: A quantitative research design to test hypotheses, wherein (1) participants are assigned randomly but representatively to an experimental group and a control group, (2) all variables are tightly controlled, and (3) some treatment/intervention/experimental condition is implemented to compare data before/after.
- Hypothesis: An assumption to be tested that attempts to explain the relationship between certain variables.
- Inferential Statistics: Statistical analyses that attempt to demonstrate relationships among variables, such as t-tests, ANOVA, chi-square tests, and regression analysis.
- Instrument(s): Tools which are used to collect data, such as surveys.
- Null Hypothesis: The inverse of a study’s hypothesis, stating that there is no relationship between the variables being tested.
- Pilot: A small-scale trial of an instrument, such as a survey, conducted with a small group before the main study to determine the instrument’s reliability and validity.
- Prediction: Following correlational research, researchers can make predictions (forecasting) based on the relationships the research has identified.
- p-value: A measure of the statistical significance of findings: the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true. A small p-value (typically below .05) leads researchers to reject the null hypothesis.
- Quasi-Experimental Design: Like a true experiment but without full control of the variables, which can limit the power of its findings (especially in the attempt to show cause-and-effect relationships).
- Reliability: Ensures consistent results across repeated trials.
- Sample: A group that is selected (randomly, purposefully, conveniently, etc.,) from a population. The population is a large group of things that have a common trait (ex., living in the United States), and a sample is a smaller group selected from the population.
- Statistical Significance: A finding is statistically significant when it is unlikely to have occurred by chance alone, typically judged by comparing the p-value to a preset threshold (commonly .05).
- Validity: Ensures the study measures what it intends to measure.
- Variable: A thing which varies and can be measured in quantitative research. Variables can be grouped as 1) an independent variable (which is manipulated) and 2) a dependent variable (which gets measured).
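To make the p-value and null hypothesis definitions above concrete, here is a minimal sketch of a permutation test, one simple way to estimate a p-value. All scores and group names are hypothetical. The test counts how often random relabelings of the combined data produce a group difference at least as extreme as the one actually observed; if that happens rarely, the null hypothesis of "no difference" looks implausible.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=42):
    """Estimate a two-sided p-value: the proportion of random relabelings
    that produce a mean difference at least as extreme as the observed one."""
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        perm_a = pooled[:len(group_a)]
        perm_b = pooled[len(group_a):]
        if abs(statistics.mean(perm_a) - statistics.mean(perm_b)) >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical achievement-test scores for two classes
treatment = [78, 85, 90, 88, 84, 91]
control = [72, 75, 80, 77, 74, 79]
p = permutation_p_value(treatment, control)
print(f"p = {p:.4f}")  # a small p: such a gap is unlikely if the groups are equivalent
```

Note that the p-value answers "how surprising are these data if the null hypothesis is true?", not "how likely is the null hypothesis to be true."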
Plan to Conduct a Rigorous and Successful Quantitative Study (continued)
Note: Steps 1-5 are located in the previous chapter.
6. Data Analysis
This step is about developing a comprehensive data analysis plan outlining the statistical techniques and procedures you will use to analyze your data and test your hypotheses. Once we collect our numerical data, here is what we need to do.
- Data Description: At a first glance, prior to our analysis, we need to describe and summarize our data. During this stage statistics plays a crucial role. This first form of statistics, descriptive statistics includes measures of central tendency (mean, median, mode) and measures of variability (range, variance, standard deviation), providing researchers with a clear understanding of the characteristics and distribution of their data. By organizing and presenting data effectively, statistics can help researchers identify patterns, trends, and outliers, laying the foundation for further analysis.
- Data Analysis: Second, quantitative research often involves analyzing numerical data to test hypotheses, identify relationships, and make predictions. Statistical analysis techniques, such as inferential statistics, enable researchers to draw conclusions about populations based on sample data (Cohen & Swerdlik, 2002). By applying probability theory and hypothesis testing methods, we can determine the likelihood of observed differences (experimental design) or relationships (correlational design) being reflective of true population parameters. Common statistical tests include t-tests, analysis of variance (ANOVA), correlation coefficient, regression analysis, and chi-square tests, among others.
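As a quick illustration of the descriptive measures listed above, the following sketch computes the common measures of central tendency and variability with Python's standard `statistics` module; the scores are hypothetical.

```python
import statistics

# Hypothetical achievement-test scores from a single class
scores = [72, 85, 91, 68, 77, 85, 90, 73, 85, 79]

summary = {
    "mean": statistics.mean(scores),          # central tendency
    "median": statistics.median(scores),      # middle score of the distribution
    "mode": statistics.mode(scores),          # most frequently occurring score
    "range": max(scores) - min(scores),       # variability: highest minus lowest
    "variance": statistics.variance(scores),  # sample variance
    "stdev": statistics.stdev(scores),        # sample standard deviation
}
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Packages such as SPSS or R produce the same summaries; the point here is simply that each descriptive statistic is a single, easily computed number characterizing the dataset.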
Remember, quantitative data analysis yields a statistically significant “yes” or “no.” It does not provide the deeper contextual understanding that qualitative research offers.
Using Software to Analyze Quantitative Data
When we decide which statistical test to use, there are two primary software programs used for calculations: SPSS and R. We can import our numerical data, tell the program what to do (compare, correlate, describe, etc.), and obtain the results in seconds. It is, however, up to us to interpret the findings (depending on whether the value we obtain is significant or not).
- Secondary Data Analysis: Just a few words on secondary data analysis. What does this entail? Secondary data analysis involves analyzing existing datasets collected by other researchers or organizations for purposes other than the original research question, as opposed to our own collection of new data to analyze. It allows researchers to leverage existing data to address new research questions or replicate previous findings.
- Example of Secondary Data Analysis: A researcher is interested in studying the effects of socioeconomic status on educational attainment. Instead of collecting new data, they decide to use existing survey data from a national education database that includes information on students’ socioeconomic backgrounds and academic achievement.
- Existing Websites for Datasets:
- The U.S. Census Bureau: The U.S. Census Bureau provides a wide range of demographic, economic, and social datasets collected through national surveys and censuses. Researchers can access data on population characteristics, household income, education, employment, and more.
- The World Bank: The World Bank offers datasets on global development indicators, including poverty, health, education, gender equality, and environmental sustainability. Researchers can access data from various countries and regions to conduct cross-national analyses.
- The Inter-university Consortium for Political and Social Research (ICPSR): ICPSR is an international consortium that provides access to a vast archive of social science datasets. Researchers can find datasets on topics such as sociology, political science, economics, psychology, and public health.
- Kaggle Datasets: Kaggle is a platform for data science competitions and collaboration where researchers and data enthusiasts can access and share datasets on a wide range of topics. Users can explore datasets, participate in competitions, and engage with the data science community.
- Data.gov: Data.gov is the official open data portal of the United States government, providing access to thousands of datasets from federal agencies and departments. Researchers can find datasets on various topics, including agriculture, climate, energy, health, transportation, and more.
- National Center for Education Statistics (NCES): The NCES collects P-20 educational data across the nation and provides it for researchers.
- Google Dataset Search: Google Dataset Search is a tool that allows researchers to discover datasets from a wide range of sources across the web. Users can search for datasets by topic, keywords, or specific data attributes.
These are just a few examples of websites where researchers can find datasets for secondary data analysis. Depending on your research topic and discipline, there are many other data repositories and sources available for accessing and analyzing existing datasets. In addition, Web of Science (WoS) and SCOPUS are two widely used multidisciplinary databases that provide access to scholarly literature, including journal articles, conference proceedings, and other academic publications. While these databases primarily serve as platforms for accessing research articles rather than raw datasets, researchers can still utilize them in several ways for quantitative research.
- Literature Review and Background Research: Researchers can use WoS and SCOPUS to conduct comprehensive literature reviews and gather background information on their research topics. By searching for relevant keywords, authors, or topics, researchers can identify existing studies, theories, and methodologies related to their research area, helping to contextualize their own research within the broader scholarly discourse.
- Citation Analysis: WoS and SCOPUS provide citation databases that allow researchers to track citation patterns, identify influential articles, and analyze citation networks within specific fields or research domains. Citation analysis can help researchers identify key publications, researchers, and research trends, providing insights into the impact and dissemination of scholarly work over time.
- Bibliometric Analysis: Researchers interested in bibliometric analysis, scientometrics (patterns and trends in scientific literature), or research evaluation can use WoS and SCOPUS to collect bibliographic data on publications, authors, journals, and institutions. By analyzing publication patterns, citation counts, collaboration networks, and other bibliometric indicators, researchers can assess research productivity, impact, and collaboration patterns within specific disciplines or research communities.
- Data Mining and Text Analysis: While WoS and SCOPUS primarily index metadata and abstracts of publications, researchers can still access full-text articles from many journals within these databases. Researchers interested in data mining, text analysis, or natural language processing techniques can extract data from full-text articles to conduct content analysis, sentiment analysis, topic modeling, or other text mining approaches.
- Quantitative Analysis and Meta-analysis: Researchers can download metadata or citation data from WoS and SCOPUS to conduct quantitative analysis or meta-analysis studies. By aggregating data from multiple studies or publications, researchers can analyze trends, patterns, or relationships across a larger body of literature, providing empirical evidence to support their research hypotheses or research questions.

Overall, while WoS and SCOPUS may not offer raw datasets in the same way as dedicated data repositories, researchers can still leverage these databases to access scholarly literature, conduct bibliometric analyses, track citation patterns, and gather data for quantitative research studies. By combining insights from the literature with other research methods and data sources, researchers can enhance the rigor, validity, and impact of their quantitative research endeavors.
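As a small illustration of the kind of bibliometric indicator discussed above, the following sketch computes the h-index, a widely used measure of an author's productivity and citation impact, from a hypothetical list of citation counts such as one might export from WoS or SCOPUS.

```python
def h_index(citation_counts):
    """h-index: the largest h such that the author has h papers
    with at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(counts, start=1):
        if citations >= rank:
            h = rank  # this paper's citations still meet or exceed its rank
        else:
            break
    return h

# Hypothetical citation counts for one author's seven papers
citations = [25, 8, 5, 3, 3, 2, 0]
print(h_index(citations))  # prints 3: three papers have at least 3 citations each
```

Many other bibliometric indicators (total citations, citations per year, collaboration counts) can be computed the same way once the citation data are exported.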
7. Reporting
Once you complete your research, you may want to share it: reporting is all about communicating your findings and informing stakeholders, depending on the purpose. The researcher who publishes their research claims ownership of it and adds to the field of knowledge; research that is never published has little value to the field. Report writing requires general research language and writing guidelines, including the step-by-step plan. In addition, the writing needs to adhere to a publication style. In educational and social science research, the current edition (7th, currently) of the Publication Manual of the American Psychological Association (2020) is the most commonly used style guide.
Furthermore, there are many ways to report findings depending on why research is conducted in the first place and who the audience is. Is this a school report to be shared with a school community? Is this a report to make data driven decisions? Is this a proposal, the first part of a thesis/dissertation which leads to a degree? Is this a journal article to be published in a peer-reviewed journal? Or is this a conference paper to be shared with the conference community?
Let us now look into how we can effectively report our quantitative research to ensure clarity, transparency, and credibility. A well-structured research report typically follows a format which can be abbreviated as IMRaD:
- Introduction: Establishes the research problem, significance, objectives, and hypotheses. It provides a theoretical framework and a review of relevant literature.
- Methodology: Details the research design, sample selection, data collection procedures, and statistical analysis techniques. This section ensures replicability and justifies methodological choices.
- Results: Presents findings using descriptive and inferential statistics, often with tables, graphs, and figures. Data should be reported objectively, avoiding interpretation in this section.
- Discussion: Interprets results in relation to the research question, comparing findings with existing literature. Discusses implications, limitations, and potential biases. Additionally, it summarizes key findings, highlights contributions, and suggests directions for future research or practical applications.
In sum, in reporting it is fundamental to 1) use clear, concise, and precise language; 2) report statistics correctly; 3) avoid overgeneralization and acknowledge study limitations; and 4) adhere to current APA or other discipline-specific formatting guidelines.
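As one small illustration of point 2 (reporting statistics correctly), the following sketch formats a t-test result in APA style. The function name and example values are hypothetical; the formatting conventions (two decimals for t, no leading zero for p, and “p < .001” for very small values) follow common APA practice.

```python
def apa_t_report(t, df, p):
    """Format a t-test result in APA style, e.g. 't(28) = 2.31, p = .028'.
    APA drops the leading zero for p (which cannot exceed 1) and reports
    very small values as 'p < .001'."""
    p_text = "p < .001" if p < .001 else f"p = {p:.3f}".replace("0.", ".", 1)
    return f"t({df}) = {t:.2f}, {p_text}"

print(apa_t_report(2.31, 28, 0.028))   # t(28) = 2.31, p = .028
print(apa_t_report(5.40, 40, 0.0002))  # t(40) = 5.40, p < .001
```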
The Role of Statistics in Quantitative Research
As noted in the previous chapter, unlike qualitative research, which focuses on understanding subjective experiences and perspectives, quantitative research aims to produce objective, replicable findings based on quantifiable measures.
The role of statistics in quantitative research is multifaceted and essential for various aspects of the research process, but before explaining those roles, it is critical to define what statistics really is. According to Rowntree (1981), statistics has at least three different meanings: 1) it is a subject or discipline; 2) it refers to the methods used to collect, analyze, and interpret data; and 3) it may “refer to certain specially calculated figures (e.g. an average) that somehow characterize such a collection of data” (p. 17).
Additionally, according to Stigler (2016):
statistics has changed dramatically from its earliest days to the present, shifting from a profession that claimed such extreme objectivity that statisticians would only gather data—not analyze them—to a profession that seeks partnership with scientists in all stages of investigation, from planning to analysis. (p. 1)
Now let us explore the details of statistics in data description (descriptive statistics) and data analysis (inferential statistics), two fundamental branches of quantitative research after data collection. Let us also remember that “in many instances… the researcher’s primary objective is to draw conclusions that extend beyond the specific data that are collected” (Huck, 2000, p. 111).
Now let us delve into the details of descriptive and inferential statistics so that we can analyze and interpret numerical data.
Descriptive Statistics
Descriptive statistics are used to summarize and describe the main features of a dataset. They provide simple summaries about the sample and the measures. Common descriptive statistics include 1) measures of central tendency (such as the mean, median, and mode) and 2) measures of variability (such as the range, variance, and standard deviation). Descriptive statistics help researchers understand the basic characteristics of their data, such as its distribution, dispersion, and typical values. They are particularly useful for organizing and presenting data in a clear and understandable manner, making it easier to interpret and draw preliminary conclusions.
For example, first, if a school wanted to determine the average achievement test result, they would calculate the mean score by adding up all the scores and dividing by the total number of participants. Second, if they wanted to know the median of the test results, they would find the score that falls in the middle of the distribution. Third, they might calculate the mode, the most frequently occurring score. Understandably, these measures do not tell us the spread/range of the scores.
If we wanted to see how the achievement test scores were clustered, we would need to look into the measures of variability (such as range, variance, and standard deviation). For example, to calculate the standard deviation (SD), we start from the mean of the scores and then examine how far each score deviates from that mean.
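The hand calculation of the sample standard deviation can be sketched step by step as follows, using hypothetical scores; each line mirrors one step of the calculation.

```python
import math

# Hypothetical achievement-test scores
scores = [70, 75, 80, 85, 90]

mean = sum(scores) / len(scores)                 # step 1: the mean
deviations = [score - mean for score in scores]  # step 2: distance of each score from the mean
squared = [d ** 2 for d in deviations]           # step 3: square each deviation
variance = sum(squared) / (len(scores) - 1)      # step 4: sample variance (n - 1 in the denominator)
sd = math.sqrt(variance)                         # step 5: square root gives the SD

print(f"mean = {mean}, SD = {sd:.2f}")
```

A small SD means the scores cluster tightly around the mean; a large SD means they are widely spread.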
Inferential Statistics
Inferential statistics, by contrast, are used to make inferences or predictions about a population based on sample data. Unlike descriptive statistics, which summarize observed data, inferential statistics use probability theory to make estimates and predictions, drawing conclusions about the population from which the sample was drawn. Researchers use inferential statistics to test hypotheses, make predictions, and determine the likelihood that observed differences or relationships in the sample reflect true differences or relationships in the population. Common inferential statistical techniques include t-tests, analysis of variance (ANOVA), regression analysis, and chi-square tests. These techniques enable researchers to generalize findings from their sample to the broader population, providing insights into underlying relationships and patterns that may exist beyond the observed data.
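As an illustration of one of these techniques, the following sketch computes an independent-samples t statistic using a pooled variance estimate; the scores and group names are hypothetical, and in practice software such as SPSS or R would also supply the associated p-value.

```python
import statistics

def pooled_t(group_a, group_b):
    """Independent-samples t statistic with a pooled variance estimate
    (assumes roughly equal variances in the two groups)."""
    n1, n2 = len(group_a), len(group_b)
    var1, var2 = statistics.variance(group_a), statistics.variance(group_b)
    # Weight each group's variance by its degrees of freedom, then pool
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5  # standard error of the mean difference
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, n1 + n2 - 2  # t statistic and degrees of freedom

# Hypothetical scores for two instructional methods
method_a = [82, 88, 75, 90, 85]
method_b = [70, 78, 72, 80, 74]
t, df = pooled_t(method_a, method_b)
print(f"t({df}) = {t:.2f}")
```

The larger the absolute t value relative to its degrees of freedom, the less likely the observed mean difference is under the null hypothesis.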
To summarize, in quantitative research, both descriptive and inferential statistics play crucial roles in data analysis and interpretation. Descriptive statistics are often used to summarize and present the main characteristics of the data, providing an initial understanding of the variables under study. Inferential statistics, on the other hand, allow researchers to test hypotheses, make predictions, and draw conclusions about population parameters based on sample data. By combining both descriptive and inferential statistics, researchers can gain comprehensive insights into their research questions, identify significant findings, and make informed decisions based on empirical evidence.
Overall, descriptive and inferential statistics are essential tools in quantitative research, enabling researchers to analyze, interpret, and draw meaningful conclusions from numerical data. By employing these statistical methods appropriately, researchers can contribute to the advancement of knowledge in their respective fields and inform evidence-based practices and policies.
Conclusion
Quantitative research provides a structured and objective approach to investigating educational phenomena through numerical data. Its strength lies in its ability to generalize findings, identify patterns, and establish relationships or causal links using statistical methods. Common designs include descriptive, correlational, quasi-experimental, and experimental studies, each serving different research purposes. Researchers rely on descriptive and inferential statistics to analyze data, ensuring validity and reliability in their findings. While quantitative methods offer precision and replicability, they may overlook contextual nuances. Ethical considerations (see section 2 of this book entitled Values & Ethics of Social Research), such as informed consent, data integrity, and transparency, are essential to conducting rigorous research. By mastering quantitative approaches, graduate students can critically assess educational issues and contribute to evidence-based decision-making in the field.
Key Takeaways
- Since quantitative studies produce numerical data, statistical analysis is central to the findings and applications of quantitative research on social/behavioral topics.
- Descriptive statistics describe a variable (such as an average response), whereas inferential statistics seek potential relationships among the variables (such as cause-and-effect).
- Validity and reliability are key metrics of the quality of a quantitative study and its instruments.
Additional Resources
Quantitative Data Analysis Software
SPSS (https://www.ibm.com/spss)
R (https://www.r-project.org/) – free
Datasets
The Inter-university Consortium for Political and Social Research (ICPSR)
Chapter References
American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.).
Cohen, R. J., & Swerdlik, M. E. (2002). Psychological testing and assessment: An introduction to tests and measurement (5th ed.). McGraw Hill.
Huck, S. W. (2000). Reading statistics and research (3rd ed.). Longman.
Leedy, P. D., & Ormrod, J. E. (2005). Practical research. Pearson.
Popham, J. W., & Sirotnik, K. A. (1992). Understanding statistics in education. F.E. Peacock Publishers.
Rowntree, D. (1981). Statistics without tears. Penguin Books.
Stigler, S. M. (2016). The seven pillars of statistical wisdom. Harvard University Press.