Ross Woods, 2022-24
Thematic coding is an increasingly common way to analyze documentary data, usually transcripts of interviews or focus groups. As a qualitative methodology, it gives researchers a way to interpret data.
Thematic coding is derived from a research approach called grounded theory. In essence, this is a method of using a comprehensive set of examples to identify patterns, from which the researcher can create a theory. The theory is justified by the range of real examples.
Thematic coding is really only a systematic way of analyzing data to reach a conclusion. It has become increasingly popular in recent years, partly because it looks like a set of steps. However, the later stages are less procedural and require more thought. It is not actually a set of steps, but is more like a set of phases that can overlap. For example, you can start transcribing and analyzing data as soon as it is collected.
Thematic coding has several advantages. First, the researcher simply has to follow the method. Second, it gives a way to systematically analyze lots of data, such as when writing a longer thesis or a dissertation. Third, the researcher can use it in stages, giving an opportunity to adapt the method as needed, and perhaps hold more interviews. Fourth, it is easier if the researcher uses voice-to-text software to transcribe interviews.
It also has several disadvantages. Although it is quite flexible, it probably doesn’t allow much scope for innovation. If you do not use transcription software, it is very time-consuming to transcribe interviews by hand, or quite expensive if you have to pay someone else to do it for you.
You should already be keeping a written diary of your methodology, including what you did, why you did it, your methods, and your observations. Your description is essential to your accountability. In principle, it must be detailed enough to enable someone else to follow your method.
Add records of your reflections to your diary. You can start to informally analyze data as soon as you start collecting it. You should take note of anything relevant to your research question, for example:
In your diary, you should also write down the reasons why you interpreted the data the way you did. The description will probably be quite simple at first, but any later changes or elaborations will be significant because they indicate a better interpretation of the data.
💡 If you are writing a dissertation:
You can start formulating deductive themes very early in the whole process, even before you collect any data. It is quite permissible to derive a set of deductive themes from your literature review or your statement of the research question. However, it would be a mistake to use deductive themes exclusively because other unexpected but significant themes may emerge later on in the data.
The alternative is to use inductive themes, that is, themes that emerge from the data and that you identify as you assign codes and themes. You can even modify your system of inductive themes during data-gathering and analysis in order to get themes that better represent your data.
Some of your ongoing analysis might affect your data gathering. Qualitative research is often iterative, and this method allows you to improve your data collection and analysis as you progress:
One of these methods will help you decide when to stop collecting data:
How good does my data have to be?
How much data do I need? When have I finished collecting data?
You can start transcription as soon as you have collected data. Transcribe it word-for-word into documents, although you might be able to exclude anything clearly irrelevant to your research purpose. Warning: Some things that look irrelevant at first might appear more relevant later on when you understand the data better.
Most researchers prefer to use voice-to-text transcription software or external services to do transcriptions. A few, however, prefer to do it manually because it brings them very close to the data, even though it is horrendously time-consuming.
Start reading and re-reading all your data while it is still coming in, and make diary notes of any other questions arising. (If you transcribe manually, this will come very easily.)
When you become very familiar with your data, it might look like very little even when you have enough. Don’t worry.
You might have started collecting quotations, but now you can treat it as an extra stage. Using direct quotations from respondents in your final report has two particular benefits:
If possible, start coding as soon as you have transcriptions and are familiar with the texts, while you are still collecting data.
Mark all parts of the text that are relevant to your research question with a color-code or symbol. These might be “recurring patterns, terms, or visual elements.” (Naeem et al. p. 2.) On each part of the text that you marked, put a brief label, either a single word or a short phrase, that says what is going on. These labels are your codes.
Coding is itself part of analysis, because you are sorting raw data into structured meaning.
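If you keep your codes on a computer, marking and labeling passages can be sketched as below. This is only an illustration: the class name CodedSegment, the respondent IDs, the quotations, and the code labels are all invented for the example and do not come from any real study.

```python
from dataclasses import dataclass


@dataclass
class CodedSegment:
    """One marked passage and the code (brief label) assigned to it."""
    respondent: str  # hypothetical respondent ID
    text: str        # the passage you marked in the transcript
    code: str        # a single word or short phrase saying what is going on


# Invented sample data, for illustration only.
segments = [
    CodedSegment("R1", "I never had time to finish the readings.", "time pressure"),
    CodedSegment("R1", "My supervisor replied within a day.", "supervisor feedback"),
    CodedSegment("R2", "Work deadlines always came first.", "time pressure"),
]


def build_codebook(segments):
    """Collect every marked passage under the code it was given."""
    codebook = {}
    for segment in segments:
        codebook.setdefault(segment.code, []).append(segment)
    return codebook


for code, examples in build_codebook(segments).items():
    print(f"{code}: {len(examples)} passage(s)")
```

Keeping the original passage next to each code makes it easy to go back to the respondent's own words when you later review or change a label.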
You now have a patchwork of the meanings of everything in your data that is relevant to theory development. It is also simpler and briefer than the full text of raw data.
It is good practice to have someone else check your coding. It will help prevent or minimize personal bias in interpreting data.
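If a second person codes the same passages, one optional way to compare the two sets of codes is to calculate simple agreement and Cohen's kappa, which adjusts agreement for chance. This is an assumption added here as an illustration, not part of the method described above, and the code labels and function names are invented. A number like this only points you to where to talk; the disagreements themselves still need to be discussed and resolved.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of passages to which both coders gave the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of passages.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    observed = percent_agreement(coder_a, coder_b)
    n = len(coder_a)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same code independently.
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Invented example: codes given by two coders to the same six passages.
coder_a = ["time", "time", "support", "cost", "support", "time"]
coder_b = ["time", "support", "support", "cost", "support", "time"]

print(round(percent_agreement(coder_a, coder_b), 2))
print(round(cohens_kappa(coder_a, coder_b), 2))
```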
Group related codes together and represent them with a theme, that is, an overarching idea that represents what is happening. Themes are a higher level of abstraction.
Do your themes accurately represent the theoretical ideas in your data and codes?
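Grouping codes under themes can be sketched as a simple lookup from each code to its theme. The codes, the theme names, and the function name below are invented for illustration. The sketch also lists any code that has no theme yet, which is one practical check on whether your themes cover your codes.

```python
# Hypothetical assignment of codes to themes (a higher level of abstraction).
theme_of = {
    "time pressure": "workload",
    "competing deadlines": "workload",
    "supervisor feedback": "support",
    "peer encouragement": "support",
}


def group_by_theme(codes, theme_of):
    """Return (themes, unassigned): codes grouped by theme, plus codes with no theme."""
    themes = {}
    unassigned = []
    for code in codes:
        theme = theme_of.get(code)
        if theme is None:
            unassigned.append(code)
        else:
            themes.setdefault(theme, []).append(code)
    return themes, unassigned


codes_in_data = ["time pressure", "supervisor feedback", "competing deadlines", "family illness"]
themes, unassigned = group_by_theme(codes_in_data, theme_of)
print(themes)
print("No theme yet:", unassigned)
```

A code left without a theme is not necessarily a mistake. It might be the start of a new inductive theme, or a sign that an existing theme needs to be redefined.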
When you have created a system of themes, compare different occurrences and look for patterns in the data. By this stage, you should be able to see patterns; the sooner you spot the patterns and confirm them, the faster you make progress. You will find that you read the transcripts again and again, and become very familiar with them.
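Comparing occurrences can be partly supported by a small tally, sketched below. It shows which respondents raised each theme and which themes appear together in the same respondent's data. The respondent IDs, theme names, and function names are invented, and a tally only suggests where to look; you still need to reread the passages to confirm a pattern.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Invented sample: (respondent, theme) for each coded passage.
occurrences = [
    ("R1", "workload"), ("R1", "support"),
    ("R2", "workload"),
    ("R3", "workload"), ("R3", "support"), ("R3", "workload"),
]


def respondents_per_theme(occurrences):
    """For each theme, list the respondents who raised it at least once."""
    seen = defaultdict(set)
    for respondent, theme in occurrences:
        seen[theme].add(respondent)
    return {theme: sorted(people) for theme, people in seen.items()}


def co_occurring_themes(occurrences):
    """Count how many respondents raised each pair of themes together."""
    themes_by_respondent = defaultdict(set)
    for respondent, theme in occurrences:
        themes_by_respondent[respondent].add(theme)
    pairs = Counter()
    for themes in themes_by_respondent.values():
        for pair in combinations(sorted(themes), 2):
            pairs[pair] += 1
    return pairs


print(respondents_per_theme(occurrences))
print(co_occurring_themes(occurrences))
```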
What are the relationships between codes and themes? You can use diagrams or models to represent the relationships among these concepts. (Naeem et al. p. 4.) Can you accurately define those relationships and demonstrate them from your data?
You can try this approach as long as you don't treat it as a rigid set of steps that will meet all your conceptualization needs**:
Some mistakes are easy to make if you make incorrect assumptions about your respondents:
How many themes?
There is no rule about specific numbers of themes. The principle is that you need enough to represent the data accurately and to help you reach sound conclusions. If the number of codes hinders and confuses your analysis, you should ask whether the number of them is the cause of the difficulty.
How can I know that the data will answer my research question?
It will if your questions gather data that addresses the research question. (This is why alignment is so valuable.)
However, the answer might not be the answer you anticipate. Some students think that their data is wrong if it leads to conclusions that they didn't expect.
How can I code qualitative data from my interviews so that I work smarter, not harder?***
Organizing large amounts of data is possible with a computer, but it might not be the best way for everybody. Besides, if you make a mistake with a computer you might not notice it or might not be able to reverse it. Advice so far:
__________
* A mathematical proof of data saturation is unlikely because qualitative data is not appropriate for a mathematical proof.
** Ross Woods, 2020, '24, derived from Strauss and Corbin, 1990, pp. 99-107.
*** With thanks to Rιchαrd Scοtt Bαskαs, Rαιnεε Βrγαnt, Lγndα Dανis.
Muhammad Naeem, Wilson Ozuem, Kerry Howell, and Silvia Ranfagni. 2023. A Step-by-Step Process of Thematic Analysis to Develop a Conceptual Model in Qualitative Research. International Journal of Qualitative Methods 22: 1–18. DOI: 10.1177/16094069231205789
Ross Woods, 2020, '24. Toolkit of research methods.
Anselm Strauss and Juliet Corbin. 1990. Basics of Qualitative Research: Grounded Theory Procedures and Techniques (Newbury Park, Ca.: Sage Publications).