7 Qualitative Data Analysis

Phillip Olt

Definitions of Key Terms

  • Category: A synthesis of codes that creates a consolidated meaning from them
  • Code: “Most often a word or short phrase that symbolically assigns a summative, salient, essence-capturing, and/or evocative attribute for a portion of language-based or visual data” (Saldaña, 2016, p. 4)
  • Methodologist: Someone who specializes in a type of research methodology
  • QDA: Qualitative Data Analysis
  • Rich, Thick Description: Qualitative writing that presents the findings of a particular study to convey understanding in both breadth and experiential detail as situated within the participant(s) and site(s)
  • Theme: “An extended phrase or sentence that identifies what a unit of data is about and/or what it means” (Saldaña, 2016, p. 199).
  • Transcribe: To create a verbatim textual rendering of data originally created in audio
  • Trustworthiness: An approach to evaluating the quality of qualitative research based on the “integrity of the data, balance between reflexivity and subjectivity, and clear communication of findings” (Williams & Morrow, 2009, p. 577).

Qualitative Data Analysis (QDA) is perhaps the area with the most fundamental difference compared to quantitative research. QDA is often extremely time-consuming, subjective, variable, and—perhaps—frustrating. This is quite the opposite of the stereotype of quantitative analysis, which involves importing raw data into a program, selecting the tests you want it to use, clicking “Run,” waiting 2 minutes, and exporting the tables into your manuscript.

QDA as Refinement

The purpose of QDA is to answer your research question(s). Saldaña (2016) discussed this inductive process as coding → categorizing → themeing (p. 14), which moves from small (codes qualifying the data) to medium (categories essentially coding the codes) to large (the general themes that describe major divisions of the categories). In that approach, then, those themes should be the answers to your research question(s).

It is easy for novice and expert qualitative researchers alike to get distracted during QDA. Perhaps you asked a research question about one thing, but as you become immersed in data collection and analysis, you find something not directly related but really interesting. At this point, you have a couple of options:

  1. Drop the rabbit trail and go back to analyzing for your original research question(s). Finish this study.
  2. If you see enough of something that seems significantly important, you can add a research question to the initial study to answer from these data (especially if closely associated), or you might spin the new research question(s) into a new study out of the same original data. Your original research question(s) and study might be completed or abandoned at your discretion as the researcher.

How should one do QDA?

Some questions to consider:

  1. What methodology are you using? Certain qualitative methodologies (most notably, grounded theory) have well-defined expectations for QDA.
  2. What type of data do you have? For example, visual data (such as young children’s paintings depicting the emotion they feel when taking a test) will be analyzed quite differently than the transcript of an interview.
  3. What makes the most sense for your study, as determined by you the researcher?

One of the unsettling things for new (and sometimes veteran) qualitative researchers is the reality that there usually is not a single “right” answer of how to do QDA. There is not a checklist of how to correctly do a qualitative study. You, as the expert, get to make reasoned selections, which you should be able to defend and justify (usually from methodological literature). Then, in any peer review process, you have to mentally prepare yourself for editors and/or reviewers who have very different opinions on how they think it should have been done. What makes this particularly difficult in qualitative research is that, normally, both perspectives are equally fine, just matters of preference or opinion.

Below, I will detail one common, major approach to doing QDA: Saldaña’s (2016) coding → categorizing → themeing. That is not the only way to do things, nor is it necessarily a right way to do QDA in a particular study. Students wanting to learn more should engage with Tier 2 or 3 qualitative texts/courses.

Coding

Saldaña (2016) defined a qualitative code as, “most often a word or short phrase that symbolically assigns a summative, salient, essence-capturing, and/or evocative attribute for a portion of language-based or visual data” (p. 4). Coding, then, is the process of developing and assigning those codes. There is an incredible variety of coding approaches, which is well beyond the scope of this course. One of the most straightforward methods is “in vivo coding,” which uses the participant’s own words as codes. Consider the example below:

  • Interview Excerpt: I can remember feeling like I was starving during my morning classes before lunch, since my parents couldn’t really afford breakfast.
  • In vivo Code: STARVING

While in vivo coding is great for extracting the most important points, it normally needs a second round coding process to construct meaning. That might involve grouping codes of similar ideas (starving, super hungry, no food, etc.), similar chronology (morning hunger, lunchtime hunger, evening hunger), and/or other relevant similarities appropriate to the specific study.
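As a rough illustration of that second round, the sketch below records a researcher's grouping of in vivo codes by similar idea. It is only a bookkeeping aid under stated assumptions: the code list (beyond STARVING), the group labels, and the helper name group_codes are all hypothetical, and the assignment of each code to a group remains the researcher's interpretive decision, not something the script works out.

```python
from collections import defaultdict

# Hypothetical first-round in vivo codes. STARVING comes from the excerpt
# above; the others are invented for illustration.
in_vivo_codes = ["STARVING", "SUPER HUNGRY", "NO FOOD", "COULDN'T AFFORD BREAKFAST"]

# The researcher's own second-round judgment about where each code belongs.
# This mapping is the analytic work; the script only records it.
second_round = {
    "STARVING": "hunger",
    "SUPER HUNGRY": "hunger",
    "NO FOOD": "hunger",
    "COULDN'T AFFORD BREAKFAST": "family finances",
}


def group_codes(codes, assignments):
    """Collect codes under the group label the researcher assigned to each."""
    groups = defaultdict(list)
    for code in codes:
        # Codes not yet assigned stay visible under UNASSIGNED instead of vanishing.
        groups[assignments.get(code, "UNASSIGNED")].append(code)
    return dict(groups)


grouped = group_codes(in_vivo_codes, second_round)
for label, members in grouped.items():
    print(f"{label}: {', '.join(members)}")
```

Keeping an explicit UNASSIGNED bucket is a design choice in this sketch: it makes it harder to lose a code that has not yet been considered in the second round.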

Miles et al. (2014) also noted the importance of clearly defining codes so that the researcher and any members of a research team will consistently apply them. This is important as a frame of reference even for a single researcher; however, I have found it indispensable when collaborating with others. What seems obvious and/or intuitive to one person may not be to another (or obvious in a completely different way). Not doing so commonly leads to significant frustrations during the analysis and interpretation process.

Categorizing

Saldaña (2016) described the refinement from codes to categories as “synthesis” that does not reduce content but rather creates “consolidated meaning” (p. 10). In a sense, this is grouping codes, but really it is more than that, in much the same way a car is more than the sum of its parts.

Themeing

The ideas of theme, themeing, and thematic analysis might be the most used and abused terms in QDA and even qualitative research more broadly. Saldaña (2016) described a theme as something that “can be an outcome of coding, categorization, or analytic reflection, but it is not something that is, in itself, coded” (p. 15). Themes of the data, which are often represented as sub-headings underneath the “Findings” Level 1 Heading, generally “emerge from analysis” (p. 16). A theme, then, is, “an extended phrase or sentence that identifies what a unit of data is about and/or what it means” (p. 199). Lochmiller (2021) also provides an excellent overview and nuts-and-bolts details for thematic analysis.

Summary

As mentioned above, I often conceptualize the entire QDA process by focusing on the research question(s) of the study. The purpose of things like coding and thematic analysis is to answer the research question(s) from the data. At the end of writing up a qualitative study, I ask myself, “If I form these themes, which are headings in my Findings section, into a single compound sentence, does that sentence directly answer my research question(s)?” Or, maybe more simply, if someone directly asked me my research question and I responded by explaining my theme(s), would that be a good answer? If there’s a “well, that doesn’t cover X” thought in my mind, I consider if I did omit something important. If parts of my Findings themes do not respond directly to the research question and/or seem extraneous, I consider if I should remove or combine them.

In my pre-pandemic phenomenological study on synchronous online education, I asked this research question: “What is the academic experience like for freshmen doing their first year of college through synchronous online education in classes blended with face-to-face students” (Olt, 2018, p. 382)? To that, I had the following themes in my Findings:

  • Ambiguity about Group Membership
  • Ambiguity about Functionality
  • Ambiguity about Place

So, if we frame this like a conversation (which it is, in practice, between the author and readers), it might look like this:

  • Question: What is the academic experience like for freshmen doing their first year of college through synchronous online education in classes blended with face-to-face students?
  • Answer: It’s ambiguous for the remote students, as they struggle to see where they fit in with regard to group membership as students in the class, whether the technology will work properly on any given day, and what place they actually exist in as students (physically in their remote location but virtually on a screen in a distant classroom).

Transcription

While there are many qualitative data sources from pictures to legal documents, the most common qualitative data sources are interviews, whether individual or in focus groups. These interviews are generally recorded by the researcher. However, how does one analyze 60 minutes of audio recording? Qualitative data of this sort first has to be transcribed from an audio format into a usable text document.

One popular way for researchers to convert audio data to text is to outsource that work. For a professor in a well-supported research position, that might come in the form of a graduate assistant. Other times, one might use an outside service, such as Rev (https://www.rev.com/). Both of these come with costs; however, more significantly, there may be ethical concerns. Perhaps a graduate assistant might get approved by the Institutional Review Board to have access to the data as part of a research team; however, there is no real oversight of an outside service. Individual transcriptionists working for the outside service are not known by the institution. While their company might swear them to confidentiality, researchers should take great care in using these outside services. Clarify any intended use in an Institutional Review Board application, and one should not even consider such a service with especially sensitive topics (ex., where there are potential criminal or social consequences associated with interview content). In either case, there is still a transcriptionist manually converting speech to text.

Ultimately, the only way to assure quality and confidentiality of transcription is via the direct efforts of the researcher. That is not to imply that tools to support that effort do not exist; one example is Otter.Ai (https://otter.ai/), which is also available embedded with Zoom Pro accounts. Such tools might provide an AI-generated preliminary transcript, which the researcher then should manually compare to the audio recording and correct as needed. Make sure to check the privacy policies of AI programs used with transcripts, as some companies use the data input by users to train their models and could spit out your data in response to a query by another user. As the researcher transcribes, they might play the audio at a slower speed and keep pace typing in a word processing program. Foot pedals connected to the computer have been a popular device to allow the researcher doing transcription to automatically rewind a predetermined number of seconds (should the researcher have difficulty understanding what was said, need to confirm, or just generally fall behind).

QDA Tools

This section will provide some historical context for the development of QDA tools and end in the present with software and AI. It is important to remember though that newer QDA tools are not actually better in substance; they do the exact same things as were done before. Different tools may just be more or less efficient and have utility with digital data. All of these tools can get a qualitative project to a full and complete end state; ultimately, what is “best” is a qualitative question with a variety of subjective answers based on personal preferences.

A Printer, Scissors, and Highlighters of Many Colors

Before the advent of tools to do QDA inside a computing device, QDA was often done with typed transcripts that were physically highlighted in different colors (representing different codes/themes) and physically cut up to be sorted into piles. This system relied on successive full readings of the data, and undoing any part of the analysis was a massive undertaking. Before the personal computer, QDA tools at this level might also include handwritten text and typewriters.

This method is not without its challenges. A colleague once told me about a time they had dozens of pages carefully laid out, highlighted, and cut up across their office floor. They came back in after a weekend, and the custodian had visited their office. While things had not been disposed of, they had been totally scattered from their order and categorization, effectively throwing away hours and hours of researcher work.

Word Processing and Spreadsheet Software

With the advent of the personal computer, software soon became available that could be used to assist with QDA, which certainly overlapped much of the print/highlight/cut era of QDA tools. While not designed specifically for QDA, word processing (Microsoft Word, Google Docs, etc.) and spreadsheet (Microsoft Excel, Google Sheets, etc.) applications provided a degree of ease to QDA. Word processing applications allowed the user to digitally highlight with a variety of colors, search documents with ease using the Ctrl+F function, make changes to QDA relatively easily, and consolidate data. Spreadsheets could perform many of the same functions; however, they were/are more commonly used to digitally organize excerpts of qualitative data in columns devoted to tags for coding, categorizing, and themeing. These could then be easily filtered to view organized data sets.
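To make the spreadsheet layout concrete, here is a minimal sketch of excerpts stored one per row with tag columns, followed by the equivalent of a spreadsheet filter. Everything specific in it is an assumption for illustration: the column names, the participant labels, the rows themselves (only the first excerpt is adapted from this chapter), and the helper name filter_rows.

```python
import csv
import io

# A tiny stand-in for a spreadsheet exported as CSV. Column names and rows
# are hypothetical; only the first excerpt is adapted from this chapter.
SHEET = """participant,excerpt,code,category
P1,"I was starving during my morning classes",STARVING,hunger
P2,"I was super hungry by second period",SUPER HUNGRY,hunger
P1,"My parents couldn't really afford breakfast",COULDN'T AFFORD,family finances
"""


def filter_rows(rows, column, value):
    """Return the rows whose tag in `column` equals `value`, like a spreadsheet filter."""
    return [row for row in rows if row[column] == value]


# Read the sheet into a list of dictionaries keyed by column name.
rows = list(csv.DictReader(io.StringIO(SHEET)))

# View only the excerpts the researcher tagged with the "hunger" category.
hunger_rows = filter_rows(rows, "category", "hunger")
for row in hunger_rows:
    print(row["participant"], "|", row["code"], "|", row["excerpt"])
```

The same filter could be applied to the code column or, once themes are added as another column, to themes. In practice most researchers would simply use the filter feature built into the spreadsheet application itself.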

These tools are still very common in QDA. Almost all qualitative projects involve some data inside of a word processing application, though there could be anywhere from zero to all of the QDA done inside of it. Again though, these tools were not designed specifically for QDA, and so software programs designed specifically for that were eventually developed and came into wide usage.

Because of the ubiquity of word processing and spreadsheet applications (free or already owned for other basic uses), this approach is still relatively common, especially for small projects or working with succinct data (such as qualitative survey responses that are manageably short). Indeed, it may even be better for those small QDA endeavors without the additional setup needed for projects in QDA software.

QDA Software

Emerging out of the need for software specifically designed for QDA, there has been an explosion of products. There are products that seek to provide a streamlined, lite tool (ex., Dedoose), while others try to provide all the bells-and-whistles possible to cover every QDA need (ex., NVivo). Preferences on QDA software can become somewhat tribalistic among qualitative methodologists, but outside of less common QDA needs, they all provide essentially the same functionality. It is really a matter of familiarity and preference, though financial considerations are also often of great practical importance for those outside of well-funded departments.

It is, however, very important that the qualitative researcher remember that these are just tools, only as effective as the artisan who wields them. This is quite different than quantitative analysis tools, which generally do all the heavy lifting for the researcher upon being given data and instructions.

Using AI in QDA

The use of artificial intelligence (AI) in QDA is, at the time of this writing (fall 2024), quite controversial. Since the introduction of software programs for QDA, there has been a built-in way for technology to “do” some of the QDA, such as counting frequencies of words. QDA software then might return a word cloud weighted by frequency. However, that word cloud must be taken with a degree of skepticism, as programs could only find exact matches, missing different words that conveyed the same meaning as well as emotional impact and context.
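The exact-match limitation can be seen in a small sketch of frequency counting. This is not how any particular QDA product is implemented; the responses are invented for illustration, and the function name word_frequencies is hypothetical.

```python
import re
from collections import Counter

# Hypothetical participant responses, invented for illustration.
responses = [
    "I was starving before lunch.",
    "Honestly I was super hungry all morning.",
    "There was no food at home, so I was starving.",
]


def word_frequencies(texts):
    """Count exact lowercase word matches across all texts."""
    words = []
    for text in texts:
        # Split into lowercase words; apostrophes are kept inside words.
        words.extend(re.findall(r"[a-z']+", text.lower()))
    return Counter(words)


freq = word_frequencies(responses)

# Each spelling is counted separately, even though all three responses
# describe the same experience of hunger.
print(freq["starving"], freq["hungry"], freq["food"])
```

All three responses are about hunger, yet the count splits that meaning across "starving," "hungry," and "food," and the word "hunger" itself never appears. A word cloud built from these counts would inherit the same blind spot.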

So, is AI actually better at QDA, or is the improvement superficial, just more of the same? This is a qualitative question with a qualitative answer that is both subjective and complex. AI is better at things like finding patterns than the former algorithms built into QDA software. It can even be trained specifically on QDA texts to generate themes with rich, thick description. One such example is Moxie Learn AI out of the Academic Insights Lab (https://moxielearn.ai/), while an increasing number of the QDA software systems include AI assistants, such as MAXQDA’s AI Assist (https://www.maxqda.com/products/ai-assist). However, AI does still struggle with understanding context, impact, and human meaning (especially when participants describe things very differently yet share a common meaning).

The use of AI for qualitative research is a bit ironic, as AI is fundamentally quantitative. It only appears qualitative on its outputs while actually using incredibly refined and complex quantitative analysis to make predictions about words. So, if one uses AI in QDA, does that then blur the lines between quantitative and qualitative, effectively doing quantitative analysis on qualitative data (which is a thing even beyond AI)? That does, in fact, seem to be the case, even though it may not feel that way seeing AI’s outputs.

Gillen (2024) asked the question, “Can we trust AI in qualitative research?” He concluded that it could have some useful applications but that, overall, it was not advisable for large-scale implementation. There are risks of hallucination, errant quantitative prediction of qualitative ideas, and bias that cannot be filtered through researcher positionality. Gillen’s concerns about security are especially significant. Should an AI gain access to data before anonymization, all ethical and confidentiality protections could be lost with no way to resolve the breach. Even with anonymized data sets (ex., interview transcripts), AI could potentially use that information to match to real-world people, and those matches, whether right or wrong, could create significant negative impacts. Practically, Gillen concluded that “it should be applied cautiously in its current form” to do “supplementary tasks” (para. 12). Waxing more philosophical than practical, Gillen argued that “much like art, qualitative research can be a celebration of humanity” and “to study humans, particularly in an open and interpretative way, requires a human touch” (para. 14).

AI is rapidly evolving. Much like Gillen, I agree it could have practical uses even in QDA. However, I do not believe it is wise to let AI tell the story of humanity. Qualitative research fundamentally tells non-fiction stories, and we as humans should do that for ourselves.

[Author’s Note: Because of how rapidly AI is evolving, this section may be regularly updated without a new edition of the book in order to keep it current and accurate.]

Reporting Results of QDA

The current American Psychological Association (2019) manual provides an extensive description of how a qualitative study should be written (pp. 93-105). However, it is important to note that disciplinary, methodological, departmental (thesis/dissertation), and journal expectations may differ from those of the APA. Beyond structural elements, I provide some guidance below on four important areas of consideration in writing up qualitative research.

Rich, Thick Description

Merriam and Tisdell (2016) described rich, thick description as, “providing enough description to contextualize the study such that readers will be able to determine the extent to which their situations match the research context and, hence, whether findings can be transferred” (p. 259). I do think that definition conveys part of the essence of what “rich, thick description” is, but it is focused on external transference rather than conveying findings. Saldaña and Omasta (2018) noted that it “does not imply lengthy narratives but a written interpretation of the nuances, complexity, and significance of a people’s actions. By focusing on the details of what we experientially witness, we can reflect on and hopefully render an account that provides insightful knowledge for readers” (p. 31). Synthesizing these ideas, I hold rich, thick description to be qualitative writing that presents the findings of a particular study to convey understanding in both breadth and experiential detail as situated within the participant(s) and site(s).

This is usually accomplished in a qualitative study by effectively balancing the direct presentation of qualitative data with researcher analysis, synthesis, and explanation. The Findings section of a qualitative study should be rich in qualitative data, such as quotes from interviews. Directly presenting such excerpts helps readers see researcher bias in analysis or interpretation, make judgments about the findings directly, and find trustworthiness in the qualitative account. Of course, participant confidentiality must be protected in this process, but participant quotes humanize the Findings narrative.

However, the researcher also must be careful not to overwhelm the Findings section with qualitative data. A good approach to an “average” qualitative study (whatever that is) would be to have one to two exemplars of qualitative data per heading of any level. These should be carefully selected to illustrate the point being made, and each should make a unique contribution to that understanding. If one quote conveys that essence entirely, it is generally unnecessary to include two; however, if, for example, there is a divergence of opinions among participants, two quotes might be used to illustrate that divergence. Excerpted quotation lengths should be no longer than they need to be to convey the necessary content, but they should not be shortened artificially. It is common that these quotations would be 20-50 words in length. The researcher should not, however, just drop participant quotes under headings and feel as if they have conveyed the findings of the study. They should discuss and explain key components from each quote within the context of the synthesis and explanation under that heading.

Qualitative Theses and Dissertations

Before embarking on the journey of writing a thesis or dissertation, one should make sure their plan is approved by their chair. If the chair is not a qualitative-focused researcher, I recommend that the student add a qualitative methodologist on their committee as a protection against methodological ignorance or bias, which is unfortunately not uncommon in segments of the social science world. As additional committee members are selected, I recommend looking for those with at least some publication history that is qualitative.

Finding a Journal

At the forefront of most decisions in selecting a journal to submit a manuscript is usually the content. For discipline-specific journals, this is relatively obvious—submitting a political science paper to an economics journal is likely a waste of time (though not always). Often, research-intensive institutions and departments will give greatest priority to those disciplinary journals. However, there are also journals that are methodologically focused, such as The Qualitative Report, which are also ranked highly.

Additionally, there are methodological considerations. Some social science journals are specific to certain methodological approaches (ex., quantitative only), and so it is important to make sure from the journal’s aim and scope that it accepts qualitative works. However, it is unfortunately also not uncommon for a journal to say that it is methodologically open but not practice that. This could be because of the current editor’s preferences or something more systemic, but it is advisable for qualitative researchers to look at the last two years of publication history at a journal before submitting to make sure there are qualitative pieces being published. Submitting to either dead end can be extremely frustrating and time-wasting for the qualitative researcher.

Those Pesky Word Count Maximums

Perhaps the most frustrating thing about being a qualitative researcher is the word count maximum set by a journal. There may be nothing more deflating than finding a “perfect” disciplinary fit, only to see there is a maximum allowable word count of 5,000. A low word count allowance (say, 6,000 or less) is often a strong sign of quantitative bias at the journal. With required introduction, literature review, discussion, and references content, low word count requirements can be very difficult for many qualitative pieces to meet without compromising quality (i.e., rich, thick description). There are also usually expectations for far longer methodology sections in qualitative pieces than quantitative, which takes up even more of the allowable word count.

It is reasonable to understand why word count maximums came into existence. In the era when all journals were in print rather than digital, more words meant more cost to print and ship. Additionally, this provides a protection for reviewers and editors in how much time will be spent reviewing. However, good reports of qualitative social research take significantly more space than quantitative reports. Rich, thick description will not be had with <1,000 words for a Findings section.

Key Takeaways

  1. QDA is a slow, iterative process requiring a significant amount of time and effort from researchers.
  2. QDA is variable and subjective to the researcher’s discretion, but it should be consistent with methodological literature.
  3. QDA tools enhance the analysis and interpretation process of qualitative research, but they do not replace it or do it independently.
  4. Good reports of qualitative research balance qualitative data and researcher analysis/interpretation in rich, thick description.

Additional Resources

Methodological Journals

The two journals below are open-access sources of peer-reviewed qualitative research and methods. They are excellent sources to find qualitative methodological guides, nuances, and considerations.

The Qualitative Report (https://nsuworks.nova.edu/tqr/)

Forum: Qualitative Social Research (https://www.qualitative-research.net/index.php/fqs)

Transcription Tools

Otter.Ai (https://otter.ai/)

Rev (https://www.rev.com/)

Qualitative Data Analysis Software

Atlas.ti (https://atlasti.com/)

Dedoose (https://www.dedoose.com/)

HyperRESEARCH (http://www.researchware.com/products/hyperresearch.html)

MAXQDA (https://www.maxqda.com/)

NVivo (https://lumivero.com/products/nvivo/)

Quirkos (https://www.quirkos.com/)

Taguette (https://www.taguette.org/) – free

QDA AI

MAXQDA AI Assist (https://www.maxqda.com/products/ai-assist)

Moxie Learn AI (https://moxielearn.ai)

Chapter References

American Psychological Association. (2019). Publication manual of the American Psychological Association (7th ed.).

Gillen, A. L. (2024, October 9). Can we trust AI in qualitative research? Inside Higher Ed. https://www.insidehighered.com/opinion/views/2024/10/09/can-we-trust-ai-qualitative-research-opinion

Lochmiller, C. R. (2021). Conducting thematic analysis with qualitative data. The Qualitative Report, 26(6), 2029-2044. https://doi.org/10.46743/2160-3715/2021.5008

Merriam, S. B., & Tisdell, E. J. (2016). Qualitative research: A guide to design and implementation (4th ed.). Jossey-Bass.

Miles, M. B., Huberman, A. M., & Saldaña, J. (2014). Qualitative data analysis: A methods sourcebook (3rd ed.). SAGE Publications.

Olt, P. A. (2018). Virtually there: Distant freshman brought into classes with synchronous online education. Innovative Higher Education, 43(5), 381-395. https://doi.org/10.1007/s10755-018-9437-z

Saldaña, J. (2016). The coding manual for qualitative researchers (3rd ed.). SAGE Publications.

Saldaña, J., & Omasta, M. (2018). Qualitative research: Analyzing life. SAGE Publications.

Williams, E. N., & Morrow, S. L. (2009). Achieving trustworthiness in qualitative research: A pan-paradigmatic perspective. Psychotherapy Research, 19(4-5), 576-582. https://doi.org/10.1080/10503300802702113
