Date: 2020-12-01

Epidemiologists find meaningful patterns in data. That’s our job.

This pandemic has given us the opportunity to find disease patterns in sources I never thought possible, like stool samples in New York or mobile phone data. But this latest analysis is, by far, the most innovative (and hilarious) I’ve seen.

Enter Amazon reviews. Turns out the Yankee Candle Company has been getting some bad reviews lately. One budding scientist thought to look into whether this was because of the pandemic, since one of the main symptoms of COVID-19 is loss of smell.

So, she downloaded a random subsample of Amazon customer reviews of the 3 most popular scented candles. From January 2017 to January 2020, the average rating consistently hovered around 4.3 (out of 5). But, interestingly, there was a sharp drop in 2020 (Figure 1).
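For the curious, here is a minimal sketch of how an average rating per month could be computed. The reviews below are made-up toy data and `monthly_average` is a hypothetical helper, not her actual code or dataset.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy reviews: (review date, star rating). Not the real data.
reviews = [
    (date(2019, 11, 3), 5),
    (date(2019, 11, 20), 4),
    (date(2020, 10, 5), 3),
    (date(2020, 10, 18), 2),
    (date(2020, 10, 30), 4),
]


def monthly_average(reviews):
    """Return {(year, month): mean star rating} for the given reviews."""
    totals = defaultdict(lambda: [0, 0])  # (year, month) -> [sum, count]
    for day, stars in reviews:
        bucket = totals[(day.year, day.month)]
        bucket[0] += stars
        bucket[1] += 1
    return {
        month: total / count
        for month, (total, count) in sorted(totals.items())
    }


averages = monthly_average(reviews)
for (year, month), avg in averages.items():
    print(f"{year}-{month:02d}: {avg:.2f}")
```

Plotting those monthly averages over time is what would give a picture like Figure 1.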

But correlation doesn’t equal causation. In other words, other things could be causing this drop. She needed a comparison. So, she then downloaded data from the 3 most popular unscented candles on Amazon. There wasn’t nearly as big a drop (Figure 2).
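The logic of that comparison can be sketched in a few lines: measure the drop for scented candles, measure it for unscented candles, and see how much bigger the first one is. The numbers and the "before"/"during" split below are invented for illustration, and `rating_drop` is a hypothetical helper, not the method she used.

```python
from statistics import mean

# Hypothetical toy ratings, split by candle type and period. Not the real data.
ratings = {
    "scented": {"before": [5, 4, 4, 5], "during": [3, 4, 2, 3]},
    "unscented": {"before": [4, 5, 4, 4], "during": [4, 4, 4, 3]},
}


def rating_drop(group):
    """Mean rating before minus mean rating during (positive = ratings fell)."""
    return mean(group["before"]) - mean(group["during"])


scented_drop = rating_drop(ratings["scented"])
unscented_drop = rating_drop(ratings["unscented"])

# How much more the scented candles fell than the comparison group did.
extra_drop = scented_drop - unscented_drop

print(f"Scented drop:   {scented_drop:.2f}")
print(f"Unscented drop: {unscented_drop:.2f}")
print(f"Extra drop:     {extra_drop:.2f}")
```

If both groups had fallen by the same amount, the extra drop would be near zero, which would point to something other than smell (shipping delays, say) dragging down all candle reviews.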

THEN she downloaded the actual text of the reviews. She used statistical software to search for phrases like “lack of scent”. From January 2020 to November 2020, the proportion of reviews mentioning lack of scent grew by more than 4%.
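A text search like this could look roughly like the sketch below. The phrase list is my guess (her exact search terms aren't given here), and the reviews are invented.

```python
import re

# Hypothetical phrases that signal a "no smell" complaint.
# The original search terms are an assumption, not taken from her analysis.
NO_SCENT = re.compile(
    r"no scent|no smell|lack of scent|can't smell",
    re.IGNORECASE,
)

# Hypothetical toy review texts. Not the real data.
reviews = [
    "Lovely candle, smells like fall.",
    "No scent at all. Very disappointed.",
    "Burns evenly, but I can't smell anything.",
    "Great gift, arrived quickly.",
]


def share_mentioning_no_scent(texts):
    """Return the fraction of reviews that match a no-scent phrase."""
    if not texts:
        return 0.0
    hits = sum(1 for text in texts if NO_SCENT.search(text))
    return hits / len(texts)


share = share_mentioning_no_scent(reviews)
print(f"{share:.0%} of reviews mention a lack of scent")
```

Running the same function on each month’s reviews would show whether that share is climbing over time.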

Absolutely brilliant!

So, before you write a review like this, make sure you don’t have COVID-19 first. In fact, maybe epidemiologists should partner with Amazon to send COVID-19 testing reminders to people who write “lack of scent” reviews? (I’m half joking.)

Love, YLE

Data Source: Check out the young budding Harvard scientist on Twitter (@kate_ptrv). No, she’s not an epidemiologist, but maybe I can convince her to apply to our program.