Unlocking Research Potential: New Framework Redefines Social Science Design
The scientific method as we know it today dates back at least to ancient Egyptian and, later, Greek philosophers such as Aristotle.
But even with centuries of practice, the scientific method is not some holy monument etched with eternal wisdom. Details matter. Scientific inquiry and progress require the capacity to improve procedures.
Which brings us to a new book by Institution for Social and Policy Studies faculty fellow Alexander Coppock; Graeme Blair, associate professor of political science at the University of California, Los Angeles; and Macartan Humphreys, director of the Institutions and Political Inequality group at the WZB Berlin Social Science Center.
In “Research Design in the Social Sciences: Declaration, Diagnosis, and Redesign,” the authors introduce a new framework for putting together a study. They call it MIDA, four letters standing for the essential components of a social science research design: a model, an inquiry, a data strategy, and an answer strategy.
We recently spoke with Coppock about how this framework operates, why each component is necessary, and how researchers can access and deploy this enhanced scientific method to better share results and advance knowledge.
ISPS: What’s new in this book? What are you contributing to the social sciences?
Alexander Coppock: There have been many research design books that focus on, say, experiments only. Or they discuss qualitative research only, or studies of observational causal inference only. Investigators within each type of research have developed their own traditions and ways to do their work. What our book does is put it all together under the same framework. Experimental designs, non-experimental designs, quantitative research, qualitative research. We agree with a growing concept in science that all research designs are similar. They are all trying to answer questions that the world is hiding from us. That’s new. And worth codifying in a way that anyone can follow.
ISPS: So how do you bring all these different types of research under one umbrella? How can researchers know they are choosing the correct design?
AC: The simple answer is by diagnosing the quality of a study design through computer simulations. In addition to proposing a new framework for everyone to follow, we have created a software language that allows you to mix and match design elements into a cohesive whole and re-use it. For example, by simulating a study, researchers can decide how many subjects or how many treatment arms they need. And if they follow our framework, if they learn our way of doing things and understand our approach, they will see the payoffs. The answer to the question about how to design your study will become self-evident.
ISPS: Let’s back up for a second. The first chapter of the book is titled “What is a research design?” Without giving away too many spoilers (or actually, please give away as much as you can), what is a research design? How do you know when you have a good one?
AC: A research design has a theoretical half and an empirical half. In the theoretical half, you imagine a model of the world in which you can ask your research question. Let’s say I know that people respond differently when someone tries to change their mind about a political issue. You can state your question in terms of a theory and ask: What is the difference? What happens if someone tries to change someone’s mind or not?
ISPS: So that would be the “M” and the “I” in MIDA. Your model (“M”) of the world and what you are targeting, through inquiry (“I”), to learn in the study. What’s the empirical half?
AC: The empirical half describes the procedures a researcher uses to gather information from the world and summarize that information. That is the “D” for data strategy and the “A” for answer strategy.
ISPS: And so good research design requires both theoretical and empirical halves to be sound?
AC: Yes, but not just sound. They need to work together. If there is a mismatch, you have a bad research design. If you have a tight connection between the two halves, the research design is stronger. One simple way of putting it is that you really need to know what you are trying to learn and then design your study in a way to learn it.
ISPS: How hard is it to construct a useful model to guide your research design?
AC: It can be really hard!
There is a great experiment that was published this summer in Science. The researchers were trying to understand whether the algorithm on Facebook is causing people to have different political attitudes as the feed plays up some stories and downweights others.
To design an experiment measuring the causal effect on people’s opinion of an algorithmic feed compared with a reverse chronological feed, you need to know the distribution of people’s attitudes in both states of the world. Many researchers don’t spend a lot of time simulating the world that way. But if you did, you would make your hypothesis that much more precise.
ISPS: As you said, this is not easy.
AC: Right. What if I’m uncertain about the way the world will be under those two regimes? Then you need to simulate a range of possibilities. You could be designing a study that’s just too small to detect what’s important and going through all this trouble so that even if you are correct, you wouldn’t necessarily learn what you want to learn. You couldn’t know what any model might tell you just by sitting around and scratching your chin.
ISPS: Which is why you need computer simulations to guide your design.
AC: Yes, and not everyone does it because it’s hard. You need to learn how to code. In our book, we provide the code. Our publisher, Princeton University Press, has put a free version of the book online, so that researchers can easily access the code. In addition, you might not know quite what to put into the simulation. The answer to that question is MIDA. So, in our book we’ve put it all together. It’s modular. Now you can identify and use design elements like a sampling procedure that someone else has used effectively. And because it’s all interoperable, the simulations are easier.
ISPS: How much work is necessary to feel confident your model is sound before going further in a research design?
AC: You have asked the hardest question. Because you never really know if the models you have considered include the real world. A researcher could be off in simulation land with all sorts of intellectual curiosities, but maybe the models don’t correspond to the real world. You could think your study is well designed, but the background assumptions in your design mean that in the end you don’t learn what you thought you were going to learn.
ISPS: How do you avoid this problem?
AC: You need to consider many, many models. You need to run simulations that consider whether the design is good under a worst-case scenario, a best-case scenario, the most likely scenario, and on and on. This is very different from how many researchers design studies. They do it the way their advisor did it. Forty units per arm of a study or 100 units. Some arbitrary number.
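To make the idea concrete, here is a minimal sketch in Python of simulating one simple two-arm experiment under several scenarios and several sample sizes. It is not the authors' software or code from the book. The effect sizes, the sample sizes beyond the 40 and 100 mentioned above, the normal outcomes, the 1.96 cutoff, and the function names `mean_and_variance`, `study_is_significant`, and `power` are all assumptions chosen for illustration.

```python
import random


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance of a list of numbers."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def study_is_significant(n_per_arm, effect, rng):
    """Simulate one two-arm experiment; True if |z| exceeds 1.96."""
    control = [rng.gauss(0, 1) for _ in range(n_per_arm)]
    treated = [rng.gauss(effect, 1) for _ in range(n_per_arm)]
    m_c, v_c = mean_and_variance(control)
    m_t, v_t = mean_and_variance(treated)
    standard_error = ((v_c + v_t) / n_per_arm) ** 0.5
    return abs((m_t - m_c) / standard_error) > 1.96


def power(n_per_arm, effect, n_sims=500, seed=0):
    """Share of simulated studies that come back statistically significant."""
    rng = random.Random(seed)
    hits = sum(study_is_significant(n_per_arm, effect, rng) for _ in range(n_sims))
    return hits / n_sims


# Assumed scenarios: effect sizes in standard-deviation units.
scenarios = {"worst case": 0.1, "most likely": 0.3, "best case": 0.5}

for n_per_arm in (40, 100, 250):
    cells = ", ".join(
        f"{name}: {power(n_per_arm, effect):.2f}"
        for name, effect in scenarios.items()
    )
    print(f"{n_per_arm:>3} units per arm -> {cells}")
```

Each printed row shows how often a study of that size would detect the effect in each imagined world. A design that looks fine in the best case can turn out to be far too small in the worst case, which is the kind of comparison a single rule-of-thumb sample size cannot provide.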
ISPS: How do you advise students?
AC: Students often ask me, “Alex, should I add another arm to my experiment? I have a theory where this variable could be important.” My answer is always: “I don’t know. Let’s simulate!” Importantly, everything is subject to logistical, financial, and technical constraints. If you only have the capacity to collect 500 responses from subjects, you need to decide what to ask them. And you will need to use the computer to help you find out the most you can get out of those subjects.
ISPS: So, how can a researcher assess a design? What criteria add up to a “good” design?
AC: For the longest time, the most important criterion was statistical power — the probability that your study comes back statistically significant. But that definition is arbitrary. If the procedure is biased, if it systematically gets the wrong answer, who cares if it has higher statistical power? That just means you have a higher chance of getting the wrong answer.
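A small simulation can show why power alone is a poor guide. The following Python sketch is an illustration only, not code from the book. It assumes an observational study in which the true effect is zero but an unobserved trait (called `motivation` here) drives both who opts into treatment and the outcome. The setup, the numbers, and the names `confounded_study` and `diagnose` are all hypothetical.

```python
import random


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance of a list of numbers."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def confounded_study(n, rng, true_effect=0.0):
    """One observational study in which motivated people opt into treatment."""
    treated, control = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)                 # unobserved confounder
        opts_in = motivation + rng.gauss(0, 1) > 0   # self-selection into treatment
        outcome = motivation + true_effect * opts_in + rng.gauss(0, 1)
        (treated if opts_in else control).append(outcome)
    m_t, v_t = mean_and_variance(treated)
    m_c, v_c = mean_and_variance(control)
    estimate = m_t - m_c                             # naive difference in means
    standard_error = (v_t / len(treated) + v_c / len(control)) ** 0.5
    return estimate, abs(estimate / standard_error) > 1.96


def diagnose(n=200, n_sims=500, true_effect=0.0, seed=0):
    """Return (bias, share of studies that are statistically significant)."""
    rng = random.Random(seed)
    results = [confounded_study(n, rng, true_effect) for _ in range(n_sims)]
    bias = sum(est for est, _ in results) / n_sims - true_effect
    power = sum(sig for _, sig in results) / n_sims
    return bias, power


bias, power = diagnose()
print(f"bias: {bias:.2f}, share significant: {power:.2f}")
```

Because the true effect in this imagined world is zero, every statistically significant result is a wrong answer. Reporting bias alongside the share of significant results makes that visible, where the share of significant results alone would make the design look excellent.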
ISPS: What should researchers be asking instead?
AC: One way to understand the quality of design that I would like to encourage is this: What is the probability the study will show my theory to be correct on four dimensions and the other theory to be wrong on a different set of dimensions? Not just one number. There should be eight different numbers. What is the probability that you are right and the other theory is wrong on those eight different dimensions? The only way to learn that is to simulate.
ISPS: How can you evaluate a study design before you have collected any data?
AC: This is the thing that often bewilders people. They might think: I haven’t done the study. I don’t know what to put in it. What’s perhaps counterintuitive is that you need to design your study before you have conducted it. You have to design in ignorance. You don’t know the distribution of responses. If you did, you wouldn’t need to do the study. So, you need to plan for a range of possibilities. Ask yourself: How can I imagine a series of possibilities and understand how my design would perform under all of them?
ISPS: How does your book help with this process?
AC: One answer is to follow the lead of others. We show how you can incorporate into your study design data that have been gathered already. And then you can simulate from that distribution. For novel studies, we describe how to use pilot studies to build up the knowledge you need to create a dispositive main study.
ISPS: After a study concludes with statistically significant published results, the work is not over, right? What are the remaining steps in the research design lifecycle? Why are they necessary?
AC: A research study’s finding exists within a body of literature. You want that finding to be integrated within it. The most obvious way to have your finding integrated is through meta-analysis, an examination of different studies on the same topic to identify trends. If you are studying the same phenomenon as others, you want to know if you are getting the same answers.
So often we as social scientists talk past each other. Or the studies don’t answer the same question exactly. There is also pressure to produce novel studies for publication. But people also don’t always do the hard work to find research designs that are commensurable with other people’s work. We want to encourage people to design studies that can be compared later and included in a meta-analysis. Researchers need to archive their work and post it online. Your duty to the scientific community does not end with publication.
ISPS: So, reading through the book’s chapter headings, we’ve got MIDA: model, inquiry, data strategy, and answer strategy. Declaration, diagnosis, and redesign. Planning, realization, and integration. Then you add three more principles to guide researchers: design early to reap the benefits of clarity, design often so you can correct course, and design to share so that you maximize transparency and contribute maximally to knowledge creation. Am I missing anything? Science is hard, huh?
AC: Well, we want to do it right if we want useful, reliable results. Research design principles are an editorial voice. With the terminology described in our book, we are envisioning a scientific community where everyone speaks the same language and shares not only their findings but their design objects.
ISPS: What’s a design object?
AC: A design object is what you get when you declare your MIDA in computer code. Put all of these elements together, and that’s a design object. Something you can simulate and share.
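As a rough illustration of what declaring the four components in code could look like, here is a minimal Python sketch. It is not the authors' software, and every name in it (`Design`, `simulate`, and the four example functions) is hypothetical. It assumes a simple randomized experiment with a constant treatment effect of 0.5.

```python
import random
import statistics
from dataclasses import dataclass
from typing import Callable


@dataclass
class Design:
    """Hypothetical container for the four MIDA components."""
    model: Callable            # M: draws one imagined world
    inquiry: Callable          # I: the true answer in that world
    data_strategy: Callable    # D: what the researcher gets to observe
    answer_strategy: Callable  # A: the estimate computed from observed data

    def simulate(self, n_sims, seed=0):
        """Run the design n_sims times; return (estimand, estimate) pairs."""
        rng = random.Random(seed)
        runs = []
        for _ in range(n_sims):
            world = self.model(rng)
            estimand = self.inquiry(world)
            data = self.data_strategy(world, rng)
            runs.append((estimand, self.answer_strategy(data)))
        return runs


def model(rng, n=200, effect=0.5):
    # Each unit has two potential outcomes: untreated (y0) and treated (y1).
    units = []
    for _ in range(n):
        y0 = rng.gauss(0, 1)
        units.append({"y0": y0, "y1": y0 + effect})
    return units


def inquiry(world):
    # Average treatment effect across all units in the imagined world.
    return statistics.fmean(u["y1"] - u["y0"] for u in world)


def data_strategy(world, rng):
    # Randomly assign half of the units to treatment; reveal one outcome each.
    treated = set(rng.sample(range(len(world)), len(world) // 2))
    return [
        {"z": int(i in treated), "y": u["y1"] if i in treated else u["y0"]}
        for i, u in enumerate(world)
    ]


def answer_strategy(data):
    # Difference in mean observed outcomes, treated minus control.
    treated = [d["y"] for d in data if d["z"] == 1]
    control = [d["y"] for d in data if d["z"] == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


design = Design(model, inquiry, data_strategy, answer_strategy)
runs = design.simulate(n_sims=500)
bias = statistics.fmean(est - truth for truth, est in runs)
print(f"average estimate minus estimand over 500 runs: {bias:.3f}")
```

Because the four pieces are separate functions, one can be swapped without touching the others, for example a different sampling or assignment procedure in `data_strategy`, and the whole object can be handed to a colleague to rerun.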
ISPS: And you want people to share their design objects even before publication, right?
AC: Yes. Sharing that object pays dividends at a lot of different points in the process. One is right before conducting a study, when researchers commonly post their plans on the internet. When researchers post the complete design object before beginning, colleagues can help keep the project on a sound footing.
ISPS: What pitfalls might researchers encounter if they do not follow your guidelines?
AC: Well, one example is that you might design studies that are too complex. Often what happens is that you have a theory that is more complicated than your study can actually deal with. I fell prey to this as a graduate student, designing a study with too many treatment arms. And because I tried to answer all those questions, I ended up answering none of them.
Another problem is that you might end up doing a study that produces a data set that you can’t use to answer the question you thought you were answering. If you had simulated the design before beginning, you would have avoided that problem. If you don’t simulate first, you are kind of driving without a seatbelt.
ISPS: We are talking about a textbook, appropriate for social scientists and people studying to become social scientists. But can you think of anything in the book or even just its approach to science and scientific literacy that a lay person might find useful in their life or understanding of the world?
AC: I think there’s an important philosophical point lay people can appreciate. When social scientists conduct a research study, we get a reflection of the world. That one reflection is not equal to the truth. Sometimes scientists in the public discourse might say they looked into something, and now we know the answer for sure. But as a scientist, I know we hardly ever know anything for sure. Data is produced by humans who are seeing the world through their own lenses. Data is produced by research designs. Were these data gathered by good research designs? We know so much about the world because of science. But I think we also need to stay vigilant and constantly improve how we even start to ask the next question.