Why is U.S. national data so terrible?


A week or two ago, I got an unexpected call on my cell phone. It was Rochelle Walensky. Yes, the Director of the CDC. Apparently she and many others at the CDC are big YLE fans! Which is… wild.

During our conversation, the topic of data came up. Why are we flying blind in the U.S.? She connected me with Caitlin Rivers. Dr. Rivers works at Johns Hopkins and was recruited by the CDC in 2021 to help open their new Center for Forecasting and Outbreak Analytics. I had so many questions for her, like: Why is our national data so abysmal? Why don’t we collect antigen tests to better understand community transmission? Why don’t we have real-time data, like the U.K. or Israel, to assess vaccine effectiveness? How can we fix all of this? She was kind enough to let me record our conversation. Thought you would appreciate the answers to these questions! The recording is above, the transcript below.

Love, YLE

P.S. I am not a Zoom whiz, so excuse the end! Also, you’ll notice the name Carol Williams on the recording. She was just listening as she is a new team member at the Center for Forecasting and Outbreak Analytics who is focusing on health communication!


Dr. Jetelina: Hi YLE Universe, this is Katelyn, Your Local Epidemiologist. You finally have a face and a voice, although I do have a cold right now, but I’m really excited because this is obviously something very different than what I’ve been doing on this platform for the past two years. I’m actually going to be playing reporter for once and talking to one of my pandemic heroes. A week or two ago, Director Walensky put me in contact with the Caitlin Rivers. Dr. Rivers is an assistant professor at the Center for Health Security at Johns Hopkins School of Public Health, and she actually focuses on public health preparedness, which we obviously desperately need going forward. But in 2021, I believe, she was snatched by the CDC to serve in a temporary role as Associate Director to their new center, which is called the Center for Forecasting and Outbreak Analytics. It’s new, but the whole point is to improve outbreak response using data, using modeling, using analytics, and also, like I heard this morning, using communication, which I’m super excited about. And boy do we need it. So, I understand that she will return to Johns Hopkins once the center has gotten off the ground. So Caitlin, welcome! Thank you so much for taking time this morning. 

Dr. Rivers: Yeah, thanks for having me! I’m glad to be here with you. And we share a namesake, so that’s always exciting.

Dr. Jetelina: Yeah, I know! Katelyn, epidemiologist, is talking to Caitlin, epidemiologist. So, I guess I’ll just dive in. So throughout this pandemic we’ve been flying blind. It’s been [for] many reasons, but one of the biggest is we just don’t have national data to drive data-driven decisions proactively in real time. I think a classic example of this is vaccine effectiveness. We just don’t know how well our vaccines are working right now in the face of new variants, over time, and we constantly really have been looking to other countries like the U.K. or Israel to help guide our decisions in the United States, [and] we are a very different population than them. So, Dr. Rivers, what’s going on? Why are we in this position right now?

Dr. Rivers: Well, we’re not where we want to be with our public health data, and that’s something we’re working really hard on at CDC. Part of the reason that we have not had the insights that our colleagues from the U.K. or Israel have had, there’s a few reasons. The first that you’ll hear, and most often repeated, is that we don’t have a national health system, unlike those other two countries, and that is a struggle. Our public health is federated—our state and local jurisdictions are each kind of independent entities that collect and share data according to their priorities and their specifications. And that is different from places where everyone receives their medical and public health services from a national network. But there’s actually another issue, that doesn’t get as much attention, that I think is a bigger problem for how we collect and share data in the United States, and that’s that the CDC doesn’t actually have the authority to direct data collection. We get all of our data through individual data-use agreements with every jurisdiction on basically every public health issue. And you can imagine, it’s a pile of paperwork that is slow and cumbersome and particularly in a fast moving health emergency, that infrastructure is just not really well suited to collecting and sharing data. 

Dr. Jetelina: I guess that means, also, we are dependent on how they collect that data, so the rigor of it. And then also what they agree and don't agree to share. Is that right? I mean they can say no.

Dr. Rivers: They can say no, yeah. Luckily we have a good relationship with our public health partners, but it's a precarious situation, because if the DUA doesn't go through, if the various people who have to sign it are on leave, if there is some sort of point of friction, they can take their data and they can go home really, so that the CDC doesn't have the data that it needs to be able to understand what's happening in the communities across the country. But, I do want to highlight one thing you said at the beginning which is the different data elements. It's a little bit of a wonky thing that like epidemiologists care a great deal about, but the importance of it might not be immediately apparent, but if every jurisdiction is collecting different data or collecting it in different ways, it's really hard to aggregate that into a national picture that really gives us a sense of what's happening across the country. And I think race and ethnicity is one example that stands out to me. We were very late in the pandemic to recognize the disparities across race and ethnicity, the disparate impacts, and that's because many jurisdictions didn't collect race and ethnicity, or they collected it in ways that didn't make sense when you aggregated it to the national picture. And the CDC doesn't have the authority to really standardize that, and that's part of the reason why there are gaps in our understanding. 

Dr. Jetelina: Yeah, the other thing that I've noticed, and I would be curious to hear your perspective, is age. So, for example, like with kid data, right? I feel like some jurisdictions report just all those under 18, and some of them report in buckets, and these are different buckets, and so when you start combining all of them, you basically don't know what the rate is for under fives compared to adolescents on a national level. And that's important because, even though some states are really rigorous in their data, we need it across the board, because we're so huge and we're so diverse that people in Texas, for example, are very different and they're in a different environment, genetics, than those in New York, for example. So, yeah, I hear that and I've seen that too. Now what data are we talking about? Are we just talking about vaccine data or is this across the board?

Dr. Rivers: It's across the board. There are a few exceptions like the Nationally Notifiable Disease System, that is a compulsory report to CDC in a standardized way. But, for basically all other data, this is how it works. What many people don't realize is that the hospitalization data for Covid, for example, is tied to the public health emergency declaration. When the declaration goes away, which it will, our ability to require reporting of the Covid hospitalization data will go away. There are a few fixes in the works, like CMS, I think has recently extended the hospitalization data, but I'm using that as a window into a wider set of problems that our data flows are really precarious and it puts us in a tight spot.
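To make the age-bucket problem described above concrete, here is a minimal sketch in Python. It is not how CDC combines data; the jurisdiction names, age bands, and counts are all invented for illustration, and the helper names `count_in_band` and `national_count` are hypothetical. It only shows why a national number for under fives cannot be computed when some jurisdictions report a single under-18 band or bands that straddle the cutoff.

```python
# Hypothetical case counts by age band for three made-up jurisdictions.
# Bands are (low, high) inclusive ages; every number here is invented.
REPORTS = {
    "Jurisdiction A": {(0, 17): 900},
    "Jurisdiction B": {(0, 4): 120, (5, 11): 300, (12, 17): 280},
    "Jurisdiction C": {(0, 9): 400, (10, 19): 650},
}


def count_in_band(report, low, high):
    """Return the count for ages low..high, or None when the reported
    bands cannot be summed to cover exactly that range.
    Assumes the bands within one report do not overlap each other."""
    total = 0
    years_covered = 0
    for (band_low, band_high), count in report.items():
        if band_low >= low and band_high <= high:
            # Band sits entirely inside the requested range.
            total += count
            years_covered += band_high - band_low + 1
        elif band_low <= high and band_high >= low:
            # Band straddles a boundary, so it cannot be split.
            return None
    return total if years_covered == high - low + 1 else None


def national_count(reports, low, high):
    """Sum the jurisdictions that can answer, and list the ones that cannot."""
    per_jurisdiction = {
        name: count_in_band(report, low, high) for name, report in reports.items()
    }
    missing = [name for name, c in per_jurisdiction.items() if c is None]
    total = sum(c for c in per_jurisdiction.values() if c is not None)
    return total, missing


print(national_count(REPORTS, 0, 4))   # under fives
print(national_count(REPORTS, 0, 17))  # everyone under 18
```

In this toy setup only one of the three jurisdictions can contribute to the under-five total, and even the under-18 total loses the jurisdiction whose bands run to age 19. Standardized data elements are what remove that problem.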

Dr. Jetelina: In the same vein right now, I think it's really difficult because we know that we're severely under-reporting cases right now and this is problematic because we are asking people to make their individual decisions, based on metrics in their county and we just don't know the level of transmission. For example, one I saw was that for every 100 cases, only 7 are officially reported. And that's for a lot of reasons, but one of them is because of at-home antigen testing. Is that the same deal? Like I know some jurisdictions have put in systems for antigen testing, but not all of them, and so is that why we don't have a national picture?

Dr. Rivers: I think the antigen testing is also a technical problem since there are not great systems to be able to capture what results people are recording at home. But to the extent that we're thinking about cases and hospitalizations, those metrics are subject to these limitations, where we get what the states have voluntarily agreed to give us.
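As a rough illustration of the under-reporting figure quoted above (about 7 of every 100 cases officially reported), the arithmetic can be sketched in Python. The function name `estimate_true_cases` is hypothetical, and applying one ratio uniformly is an assumption made only for this example; in practice the share of cases reported varies by place and over time.

```python
def estimate_true_cases(reported_cases, reported_per_100=7):
    """Scale a reported case count up to an estimated true count.

    Assumes, purely for illustration, that a fixed number of every
    100 true cases is officially reported (7 per 100 is the figure
    quoted in the conversation).
    """
    if not 0 < reported_per_100 <= 100:
        raise ValueError("reported_per_100 must be in (0, 100]")
    return reported_cases * 100 / reported_per_100


# Each reported case stands in for roughly 14 actual cases at this ratio.
multiplier = estimate_true_cases(1)
print(round(multiplier, 1))

# A county reporting 700 cases would have about 10,000 under this assumption.
print(estimate_true_cases(700))
```

The point of the sketch is the size of the multiplier: small errors in the assumed reporting share translate into large swings in the estimated level of transmission.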

Dr. Jetelina: Yeah, I would just feel like, maybe in the future I mean, can't we just put together a website on CDC and just throw out a campaign and be like, “Hey, can you guys just please report your antigen test?” so we have some understanding of that on a national level, or is that just not feasible?

Dr. Rivers: Well, that reminds me of like the Flu Near You work that Harvard Children's Hospital has been leading for years, and there's now a Covid Near You, and I think those kinds of projects can be interesting windows into what is happening in communities. It's just hard because it's subject to so many biases, like who is receiving the messages that they should be reporting? Who has access and time to be able to do that? And then without a denominator, it's also hard, like I have trouble making sense of what to do with that data, but I think these non-traditional approaches for doing surveillance are really interesting. And particularly for us, at the Center for Forecasting and Outbreak Analytics, modeling can be an interesting approach for weaving together these different data sources, sort of making them more than the individual parts.

Dr. Jetelina: Yeah, absolutely. So what I understand—we have this decentralized system in the United States, and the CDC asked for the national picture. That report, that data, gets collected at the local level and then reported up to the CDC. And so what that means is you guys are dependent on, or we are dependent on, local jurisdictions either being excited to share this data and really on the ground, or local jurisdictions being like, “No, we don't want to share this.” So what's the scoop? Are you allowed to share like which states or which local jurisdictions have been super helpful and others that haven't?

Dr. Rivers: Well, I'll just highlight that the status quo is not that great for states either. It's great that they're able to control their data and they are the ones at the front lines of the community. But for example, if you look at the cases of acute hepatitis that are kind of popping up, we're still really learning what's happening, where they're happening—it's the very beginning of the investigation. But the states are asking CDC, “What's going on with acute hepatitis?” But if we don't have—if we can't look across states, if we're waiting for the states to report voluntarily, it's slow. And so it's hard for the states too to understand what's happening with their neighbors, what's happening across the country, because the system is just a little bit gummed up. 

Dr. Jetelina: Yeah, that makes a lot of sense. And I would assume that, I mean, willingness to share over time changes with the normalization of a pandemic and people wanting, like some states or jurisdictions, wanting to move on and some others saying, “Hey, this is still a problem.” Have you seen that kind of change and willingness to share over time?

Dr. Rivers: I think the importance of public health data has changed, like it's more widely recognized now. It's like practically a national sport to check your local Covid levels. And so I think the importance of public health data has really changed over time. Willingness, I'm not sure we've seen any major differences, just because Covid is not the first time that we've tried to manage public health surveillance in this way. It's a long-standing set of relationships, but I think that our collective understanding of why it's so important to have high-quality, timely, detailed public health data is at an all-time high.

Dr. Jetelina: Finally! Finally, we epidemiologists have been shouting from the rooftop, so I guess that's, I don't know, a silver lining to the pandemic. So one thing you actually mentioned briefly at the beginning—and I wanted to dive into this a little more—is that the importance of this emergency order and how we expect that will be lifted probably this summer through whispers. But what are the implications of that, I mean, what's going to change? Are we going to be flying blind even more?

Dr. Rivers: Hopefully not, but it's possible. There are a lot of data streams tied to the public health emergency declaration. Hospitalizations is always the first one that comes to my mind because it's so important for the community burden indicators that CDC is using to tell people what the risk is in their communities. Hospitalizations is tied to the public health emergency declaration. CMS, just this week, released a rule that I think is going to extend the reporting through 2024, so maybe we have a little bit longer of a runway for that, but there's also cases, there's electronic lab data, there's all sorts of data streams—and I don't have the full list but it's quite extensive—of what is tied to emergency declaration. And so I do think that there will be changes in our ability to understand what's happening with Covid.

Dr. Jetelina: And for everyone listening, can you describe what CMS is?

Dr. Rivers: CMS is the Centers for Medicare and Medicaid Services. It is the government agency that provides Medicare and Medicaid coverage to millions of Americans, and they have an enormous role in policy making because they provide so much coverage to just so many people. They are able to have great influence on data reporting, for example.

Dr. Jetelina: So once the emergency order is lifted and, say, CMS continues to work their butts off to figure out a way about hospitalizations, but even that then is a biased sample. That's not a very good generalizable sample if we're just looking at Medicaid and Medicare, right? Because we have an entire population that's also insured through private insurance, no?

Dr. Rivers: A little bit over my skis here, but I think because CMS covers so many Americans, that when they set a policy like that, it generally is far reaching enough that we'll get data beyond the covered population. 

Dr. Jetelina: So at least some kind of general understanding of, like, how well our vaccines are working. The other thing that has been interesting is the “with Covid” and “for Covid” hospitalizations. And that seems to be very dependent on local jurisdiction as well, right? So that's why, for example, the CDC can't report that on a systematic level?

Dr. Rivers: That is tied to our inability to require reporting of certain data elements. It's up to jurisdictions to decide that they want to collect that data and to report it to us and because it's just not a priority everywhere, our picture is uneven. 

Dr. Jetelina: Yeah, okay, so my last question for you: how do we fix this? I mean, so the virus is going to continue to change. We know that, we expect it's going to continue to mutate and then, also, this isn't going to be the last virus. I mean like you said, we're seeing acute hepatitis, we are seeing Ebola over in Africa right now, we're seeing avian flu virus. I mean, how do we fix this? It seems pretty ingrained in our culture and systems in the United States.

Dr. Rivers: Yeah, so CDC, with support from Congress, is investing about 1 billion dollars in improving the public health data infrastructure at the state and local level. We want to make it as easy and as technologically modern as possible to collect and report public health data, and so we have a big project, investing a lot of money in making that possible at the state and local level. But there is still this authorities piece where we may make those investments and not necessarily receive that data, and so I think updating our authorities to be able to synchronize and standardize our data reporting would improve our ability at CDC to understand what's happening in communities across the country and improve the ability for state and local jurisdictions to understand what's happening with their neighbors and what do they need to be aware of across the country. So I think that was one outstanding piece that would really help us to improve our public health data picture.

Dr. Jetelina: Now, the first thing you said was money, which is obviously incredibly important, but because the CDC is under Congress's authority, I mean, can't that money leave too? Or is that kind of guaranteed support? I don't know how that works.

Dr. Rivers: The money that we are investing in the data modernization initiative has already been appropriated, so it’s going out.

Dr. Jetelina: Okay, awesome. But for future, it may or may not be there.

Dr. Rivers: That’s right.

Dr. Jetelina: Okay. Interesting, well that's slightly terrifying, but at least it's coming on the ground. Well, thank you so much, Caitlin, for providing your insights! Is there anything else that I missed that you want to share?

Dr. Rivers: Yeah, thanks for the opportunity. Any CMS experts out there, I’m sorry I bungled the important role that CMS policy plays in our public health and medical system, but I'm glad for the opportunity to share a little bit more about how public health data works.

Dr. Jetelina: All right, thank you so much, Caitlin. Bye, everyone!

Katelyn Jetelina