As a professor of Epidemiology, my primary goal is to teach my students how to read, critically analyze, and apply the results of an epidemiological study to create healthy communities. I want to empower my students to read research papers from beginning to end (not just the Abstract and definitely not skipping the Methods section). I want them to confidently read the literature, determine for themselves the strengths and limitations of the study, and be able to communicate the findings and any applications of those findings to a group of community members without any knowledge of epidemiology.
And I want to invite all of you into my classroom (so to speak). I want to provide you with a guide to reading the epidemiological research and the opportunity to read, discuss, and apply the findings of epidemiological studies with me.
Together — with improved literacy and the ability to see the strengths and limitations that are inherent in every study — we can fight misinformation, spot disinformation, craft strategies to improve health, and create healthy communities.
Are you ready?
Do you want to learn how to read, analyze, and apply the epidemiological literature?
Let’s get started… (and if you missed my Rule #1 post, go back and read it first).
Today I’m highlighting three things the authors of epidemiological research papers assume you, the reader, know. And yes, I am kind of tied to (read: obsessed with) three things in my posts. I shared the genesis of Three Things Thursday here. I think growing up in a Presbyterian church where each Sunday’s sermon had three points has had a lasting impact, too (it’s like three things have been baked into me). Anyway…
The authors of epidemiology papers assume that you know -
Association does NOT equal causation
There will ALWAYS be outliers
How to interpret and apply the findings of a statistical analysis.
Association ≠ Causation
In most epidemiological research papers, we calculate measures of association: numerical values that capture the relationship, whether a risk or a benefit, between something (often a risk factor) and disease occurrence. For example, people who smoke cigarettes are 15 times more likely to develop lung cancer compared to people who do not smoke cigarettes. Or two doses of the MMR vaccine are 99% effective against measles.
These are measures of association.
They are NOT statements of causation.
This is an important distinction. While epidemiologists write their research findings with the utmost confidence and we are damn sure our statistics are perfect, we CANNOT prove causation (in some ways this is a continuation from last week’s post about no perfect study).
In epidemiology, we can quantify the association — aka we show how big a risk is or how much disease we can prevent. But this is NOT a claim of causation.
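If it helps to see the arithmetic, here is a minimal Python sketch of how a relative risk and a vaccine effectiveness can be computed from a simple cohort. The counts are invented purely for illustration (they are not from any real study), and the function names are hypothetical. The sketch assumes we know how many people in each group developed the disease.

```python
def risk(cases: int, total: int) -> float:
    """Proportion of a group that developed the disease."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cases / total


def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    unexposed_risk = risk(unexposed_cases, unexposed_total)
    if unexposed_risk == 0:
        raise ValueError("relative risk is undefined when the unexposed risk is zero")
    return risk(exposed_cases, exposed_total) / unexposed_risk


def vaccine_effectiveness(vacc_cases: int, vacc_total: int,
                          unvacc_cases: int, unvacc_total: int) -> float:
    """Proportional reduction in risk among the vaccinated: 1 - RR."""
    return 1 - relative_risk(vacc_cases, vacc_total, unvacc_cases, unvacc_total)


# ASSUMPTION: made-up cohort, 30 of 1,000 smokers and 2 of 1,000 non-smokers
# develop lung cancer.
rr_smoking = relative_risk(30, 1000, 2, 1000)

# ASSUMPTION: made-up cohort, 1 of 1,000 vaccinated and 100 of 1,000
# unvaccinated people get measles.
ve_mmr = vaccine_effectiveness(1, 1000, 100, 1000)

print(f"Relative risk: {rr_smoking:.1f}")
print(f"Vaccine effectiveness: {ve_mmr:.0%}")
```

Notice that nothing in this calculation says why the two groups differ. The numbers describe how strongly the exposure and the disease travel together, and that is all.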
According to renowned epidemiologist Dr. Kenneth Rothman, we conduct epidemiological research to —
“Identify an event, condition, or characteristic that plays an essential role in producing an occurrence of disease.”
Once these events, conditions, or characteristics are identified, we can use that information to develop policies, programs, or interventions that will improve health.
We are not in the business of proving causation. We take the preponderance of the evidence and use it to make healthy decisions.
We are looking to quantify associations and then use that information to make changes to improve public health.
Please do NOT expect us to prove causation or be absolutely sure before we take action to improve health in our communities.
We are in the business of saving lives and creating healthy communities — we take our valid and reliable associations and get to work to create healthier and safer communities for all.
There Will Always Be Outliers
Epidemiology is the study of the distribution and determinants of health in populations. We strive — through our research, as teachers, and when we are proposing new policies, programs, and interventions — to improve the health of as many people as possible.
Epidemiologists are committed to creating healthy communities for all (or as many as we can). This means that our research and its applications are aimed at improving the health of as many people as possible. For example — let’s look at seat belts…
Seatbelts save lives. And we have seatbelt laws (and car seat laws) because they make it safer to be in a car. Every time I get into the car, I buckle my seatbelt because (it’s the law and…) if I am in a crash the seatbelt will protect me from going through the windshield. I never get into my car thinking “Today I’m going to buckle my seatbelt because I am planning to crash my car.” Instead, I put my seatbelt on every time I get into the car so that I will be protected IF I get into a crash.
HOWEVER — seatbelts do cause injuries. And on VERY rare occasions (for example, a car plunging into a body of water) being seatbelted into a car can cause death. But the benefit of wearing a seatbelt and being vaccinated FAR OUTWEIGHS the risk of these rare complications occurring. I have never thought “I am not going to buckle my seatbelt today on the off chance (maybe 1 in a million) that my car plunges into the river while I cross the bridge.”
There will always be outliers — I’m sure there is a case in which a car crash victim would have survived if they hadn’t been wearing a seatbelt. Similarly, there are a few lung cancer patients who have never smoked a cigarette. And, well, you’re reading the work of a mother who religiously takes her children to their annual wellness checks, but had her kid contract rotavirus in the pediatrician’s office (he was hospitalized for more than a week).
The findings of epidemiological studies are NOT deterministic.
For example, the risk of lung cancer does increase (15-fold) for individuals who smoke cigarettes. You are WAY MORE likely to develop lung cancer if you smoke. However, sometimes individuals who smoke do not develop lung cancer and marathon runners who never touch tobacco do.
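One way to see why a 15-fold increase is not a guarantee is to turn the relative risk into absolute numbers. The sketch below does that in Python. The baseline risk of 1% among non-smokers is a placeholder I picked for illustration, not a real estimate, and the function name is hypothetical.

```python
def absolute_risk(baseline_risk: float, relative_risk: float) -> float:
    """Risk in the exposed group implied by a baseline risk and a relative risk."""
    implied = baseline_risk * relative_risk
    if not 0 <= implied <= 1:
        raise ValueError("implied risk must be between 0 and 1")
    return implied


BASELINE_RISK = 0.01  # ASSUMPTION: placeholder risk among non-smokers, not a real estimate
RELATIVE_RISK = 15    # the 15-fold increase used in this post

risk_smokers = absolute_risk(BASELINE_RISK, RELATIVE_RISK)

print(f"Per 1,000 non-smokers: about {BASELINE_RISK * 1000:.0f} develop the disease")
print(f"Per 1,000 smokers: about {risk_smokers * 1000:.0f} develop it, "
      f"and about {(1 - risk_smokers) * 1000:.0f} do not")
```

Under these assumed numbers, most smokers still never develop the disease, and a few non-smokers do. Both statements are compatible with a relative risk of 15.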
Research points us in the right direction. But it does NOT determine our health outcomes. And while the stories of those who are the outliers (the marathon runner with lung cancer or the healthy kid who gets rotavirus at the pediatrician’s office) are surprising, concerning, and DEFINITELY memorable — they cannot and should NOT dictate the decisions we make at the individual- or community-level to be healthy.
There will always be outliers.
And my challenge to each of you is to not be distracted by the outliers.
We still need to use our seatbelts, we should not smoke cigarettes, and we definitely need to visit our doctors (at least once per year).
Statistics
The vast majority of statistical findings in an epidemiology paper are summarized and the authors assume you know how to interpret them on your own. For example, our lung cancer and cigarette smoking statistic might be summarized in a paper this way — RR: 15; 95% CI: 7-55.
The authors assume you know that this means —
“The relative risk is 15, meaning individuals who smoke are 15 times more likely to develop lung cancer compared to individuals who do not smoke cigarettes. And we are 95% confident that in the population of individuals represented by our study sample the true relative risk lies somewhere between 7 and 55.”
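For the curious, here is a rough Python sketch of where a relative risk and its 95% confidence interval can come from. It uses a common large-sample approximation (the confidence interval is built on the log scale), which may not be the method any particular paper used. The counts are hypothetical, so the interval it prints will not match the 7-55 in my example, and the function name is my own invention.

```python
import math


def relative_risk_ci(exposed_cases: int, exposed_total: int,
                     unexposed_cases: int, unexposed_total: int,
                     z: float = 1.96) -> tuple[float, float, float]:
    """Return (RR, lower, upper) using a normal approximation on log(RR).

    z = 1.96 corresponds to a 95% confidence level.
    Requires at least one case in each group.
    """
    if exposed_cases <= 0 or unexposed_cases <= 0:
        raise ValueError("this approximation needs at least one case in each group")
    rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    # Standard error of log(RR) for cumulative-incidence (cohort) data.
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# ASSUMPTION: made-up cohort, 30 of 1,000 smokers and 2 of 1,000 non-smokers
# develop lung cancer.
rr, lower, upper = relative_risk_ci(30, 1000, 2, 1000)
print(f"RR: {rr:.0f}; 95% CI: {lower:.1f}-{upper:.1f}")
```

The interval is wide because the comparison group has only two cases. The fewer the cases, the less precisely the sample pins down the relative risk in the population.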
There’s a lot to know. And in the weeks ahead, we’ll cover some of the basics.
Here’s a quick primer on statistics in epidemiological studies —
Biostatistics is the art and science of collecting, organizing, describing, and analyzing data in epidemiology. Biostatistical methods provide principled and objective approaches for:
Testing scientific hypotheses
Weighing evidence
Estimating risk and other characteristics of a population
NOTE: they do NOT prove causation either.
In most epidemiological studies (both descriptive and analytic), we study a sample—a subset of a larger population—and use both descriptive and inferential biostatistical methods to make generalizations about the larger population (known as our source population).
Descriptive statistics describe important features and trends in a data sample and allow us to decide whether or not it is representative of the source population. Inferential statistics are used to investigate the research hypothesis about the source population, using information from the sample. We use the results from inferential statistical analyses to make conclusions or generalizations about the source population from which the sample was selected.
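The difference between describing a sample and making an inference about the source population can be sketched in a few lines of Python. Everything here is simulated: the population, its "true" prevalence of 20%, and the sample size of 500 are assumptions chosen for illustration, and the function name is hypothetical. The interval uses a simple normal approximation for a proportion.

```python
import math
import random


def sample_proportion_ci(sample: list[bool], z: float = 1.96) -> tuple[float, float, float]:
    """Return (sample proportion, lower, upper) for a 95% interval by default."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    p_hat = sum(sample) / n                      # descriptive: what we saw in the sample
    se = math.sqrt(p_hat * (1 - p_hat) / n)      # inferential: uncertainty about the population
    return p_hat, max(0.0, p_hat - z * se), min(1.0, p_hat + z * se)


rng = random.Random(42)   # fixed seed so the simulation is repeatable
TRUE_PREVALENCE = 0.20    # ASSUMPTION: simulated value, unknown in a real study

# Simulated source population: True means the person has the condition.
population = [rng.random() < TRUE_PREVALENCE for _ in range(100_000)]

# The study sample: a random subset of the source population.
sample = rng.sample(population, 500)

p_hat, lower, upper = sample_proportion_ci(sample)
print(f"Sample prevalence (descriptive): {p_hat:.1%}")
print(f"95% CI for population prevalence (inferential): {lower:.1%} to {upper:.1%}")
```

In a real study we never get to see the whole population, which is exactly why the interval matters: it is our statement about the source population based only on the people we sampled.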
Most epidemiological papers include 4-5 tables. As a reader, you must be able to identify where the essential results are located (the characteristics of the sample as well as the measures of association) and be able to interpret what is in those tables.
It’s a lot of work.
For now — what is important is that you know that the researchers are assuming you know a lot. In the weeks to come, I will help you to understand some of those assumptions.
I also highly recommend that you befriend a statistician. We might be a little (or a lot) nerdy, and we may find delight in sitting in front of a computer screen writing statistical code. However, we can help you detangle and understand the tables and copious amounts of data that are included in each epidemiological paper.
That’s it… you know what assumptions are being made.
Next Tuesday we’ll move on to The Canon of Epidemiology…
Be sure you are subscribed so you do not miss a thing.
Do you have questions about these assumptions? Or would you like me to share a homework assignment with you? Leave me a comment (like “sign me up” or “send the homework now”) and your email address. I’ll be sure to include an answer key!
Following on from your (excellent as always) comments - this time regarding statistics - I thought you might be interested in the following two podcasts (and their respective authors/books):
https://newbooksnetwork.com/david-spiegelhalter-the-art-of-statistics-how-to-learn-from-data-basic-2019
And
https://newbooksnetwork.com/alberto-cairo-how-charts-lie-getting-smarter-about-visual-information-norton-2019