Databrewers Project 2024
Welcome to the Databrewers Project
Welcome to the official website of the Databrewers Project for the Applied Data Analysis course at EPFL. This project aims to analyze beer review data, focusing on insights and trends in beer reviews and their characteristics.
Beers have been enjoyed for centuries across cultures, and today beer enthusiasts share their experiences and rate their favorite brews on online platforms such as BeerAdvocate and RateBeer. With a vast number of reviews available online, these platforms have become a gold mine for analyzing trends in beer preferences, consumption patterns, and the factors that influence people’s choices.
Our project focuses on how beer ratings and reviews evolve over time, especially with respect to seasonal and long-term changes. How do beer preferences shift across different seasons, and how do climate and geographical factors influence these choices? Does the alcohol content or beer style impact ratings differently depending on the time of year? Furthermore, can we identify specific beers that are consistently linked to certain seasons? By addressing these questions, we aim to provide actionable insights for brewers, marketers, and enthusiasts alike, helping them align their offerings with the dynamic preferences of consumers.
Dataset
For our study, we were provided with two distinct datasets. The first one contained data from the BeerAdvocate website, while the second contained data from the RateBeer website. After some preliminary analysis, we decided that the data from the BeerAdvocate website was enough to conduct our analysis.
We decided to focus on the data generated by users based in the United States of America. This subset represents most of the overall data and allows us to study the differences in ratings between states in more depth.
The diversity of beer styles can make comparisons difficult. To avoid redundancy between styles with similar characteristics, we simplified the styles using an article from the BeerAdvocate website, reducing the number of beer styles from 105 to 45.
Ratings
To understand the seasonal analysis, it is important to understand how the beers are rated. In both reviews and ratings dataframes, six columns contain metrics (which are scores out of 5): appearance, aroma, taste, mouthfeel, overall and rating.
The rating metric is computed based on the five other scores with the following formula:
\[\text{Rating} = 0.06 \cdot \text{Appearance} + 0.24 \cdot \text{Aroma} + 0.4 \cdot \text{Taste} + 0.1 \cdot \text{Mouthfeel} + 0.2 \cdot \text{Overall}\]
The rating metric takes into account all other metrics, weighted by their importance; therefore, for numerical study of the ratings, only the rating metric will be investigated.
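As a quick illustration of the formula above, here is a minimal Python sketch. The weights come directly from the formula; the function name `weighted_rating` and the example scores are ours and purely illustrative.

```python
# Weights taken from the rating formula; they sum to 1.
WEIGHTS = {
    "appearance": 0.06,
    "aroma": 0.24,
    "taste": 0.40,
    "mouthfeel": 0.10,
    "overall": 0.20,
}


def weighted_rating(scores):
    """Return the weighted rating for a dict holding the five sub-scores (each out of 5)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores for a single review, for illustration only.
example = {"appearance": 4.0, "aroma": 3.5, "taste": 4.5, "mouthfeel": 4.0, "overall": 4.0}
print(round(weighted_rating(example), 2))
```

Because the weights sum to 1, the rating stays on the same 0 to 5 scale as the five sub-scores, with taste and aroma contributing the most.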
1. Seasonal influence on beer ratings for ABV category
One important detail about the beers is their Alcohol By Volume (ABV), expressed as a percentage ranging from 0 to 100%. To analyze how the alcohol content of a beer might influence its preference across seasons, and by extension, different weather conditions, we categorized the beers into three groups: low (0-5.4%), middle (5.4-8.7%), and high (8.7-67.5%) ABV.
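The categorization can be sketched in a few lines of Python. The thresholds are the ones given above; treating the upper bound of each range as inclusive is our assumption, since the text does not say which category a beer at exactly 5.4% or 8.7% falls into. The function name `abv_category` is illustrative.

```python
def abv_category(abv):
    """Map an ABV percentage to 'low', 'middle' or 'high'.

    Assumption: upper bounds are inclusive (5.4 -> 'low', 8.7 -> 'middle').
    """
    if abv < 0 or abv > 100:
        raise ValueError("ABV must be a percentage between 0 and 100")
    if abv <= 5.4:
        return "low"
    if abv <= 8.7:
        return "middle"
    return "high"


# Made-up ABV values, for illustration only.
for value in (4.2, 6.5, 11.0):
    print(value, abv_category(value))
```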
Will low-alcohol beers be favored in the warm season? Or in the cold season?
To answer these questions, we group beers by season and alcohol content category, and compare their ratings. Ratings are not constant throughout the year: both low and high ABV beers show clear peaks and troughs. In winter, high ABV beers reach their highest ratings, while low ABV beers reach their lowest. In summer, on the other hand, high ABV beers receive their lowest ratings, and in spring, low ABV beers reach their highest. We can therefore conclude that during the warm seasons (spring, summer), low-alcohol beers are preferred, while as the cold season (winter) approaches, users tend to favor high-alcohol beers.
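The grouping step can be sketched as follows. This is a simplified stand-in rather than our actual pipeline: the mapping from months to seasons (meteorological seasons) is an assumption, and the records are made-up toy values.

```python
from collections import defaultdict
from statistics import mean

# Assumption: meteorological seasons (December to February is winter, and so on).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def mean_rating_by_season_and_abv(records):
    """Average the ratings of (month, abv_category, rating) records per (season, category)."""
    groups = defaultdict(list)
    for month, abv_category, rating in records:
        groups[(SEASON_BY_MONTH[month], abv_category)].append(rating)
    return {key: mean(values) for key, values in groups.items()}


# Made-up toy records, for illustration only.
toy_records = [
    (1, "high", 4.0), (2, "high", 4.5),  # winter
    (7, "high", 3.5),                    # summer
    (1, "low", 3.0),                     # winter
    (4, "low", 4.0),                     # spring
]
averages = mean_rating_by_season_and_abv(toy_records)
for key in sorted(averages):
    print(key, averages[key])
```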
2. Seasonal trend in beer ratings at the scale of the United States
Let’s dive into the seasonal trends of the most popular beer styles in the United States! To simplify our analysis, we grouped similar beer styles during pre-processing and focused on the most frequently rated styles: we only included styles that accounted for at least 2% of total ratings, highlighting the true favourites among beer enthusiasts.
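The 2% filter can be sketched like this, assuming the data is available as one style label per rating. The function name `popular_styles` and the toy counts are illustrative.

```python
from collections import Counter


def popular_styles(styles, min_share=0.02):
    """Return {style: share of all ratings} for styles reaching at least min_share."""
    counts = Counter(styles)
    total = sum(counts.values())
    return {
        style: count / total
        for style, count in counts.items()
        if count / total >= min_share
    }


# Made-up toy data: one entry per rating, for illustration only.
toy_styles = ["IPA"] * 60 + ["Stout"] * 39 + ["Rauchbier"] * 1
kept = popular_styles(toy_styles)
print(sorted(kept))
```

In this toy example, the style with a single rating out of 100 (a 1% share) is dropped, while the other two are kept.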
Pale Lager has its best ratings around spring and its worst around winter, with a difference of about 0.1. Some may not consider this a sufficient difference, but remember that there are more than 100k ratings for this beer style. For other styles, the difference is harder to see. With an ANOVA test, some trends in rating variability across seasons emerge:
- Warmer months: Lighter styles such as Pale Lager, Pale Ale, IPA, Wild Beer, Amber/Red Ale, Brown Ale, and Strong Ale perform better in the spring and summer. These styles score higher in the warmer months, consistent with the preference for lighter, more refreshing beers during this time. Pale Lager and IPA show the most improvement in summer, while Amber/Red Ale has the highest ratings in summer compared to other seasons.
- Colder months: Richer and heavier styles such as Wheat Beer, Barleywine, Strong Ale have higher ratings in winter and fall. These beers, known for their fuller body and warming qualities, are more popular in the colder months. Barleywine and Wheat Beer are especially popular in winter, while Strong Ale has its highest ratings in fall, outperforming both spring and summer.
- Minimal or no seasonal variation: Stouts and Porters show very little seasonal variation. The differences are minimal, indicating that these styles have consistent appeal throughout the year regardless of season.
That said, the magnitude of the observed differences between seasons, while statistically significant, is small (around 0.05), and very little variation is observed with ratings alone at the United States scale.
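For readers unfamiliar with ANOVA, the sketch below computes the one-way F statistic for one style's ratings split by season, using only the standard library. It is an illustration under made-up data, not our exact test code; in practice a library routine such as `scipy.stats.f_oneway` would also return the p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of ratings."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]
    k = len(groups)       # number of groups (seasons)
    n = len(all_values)   # total number of ratings

    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Variability of individual ratings around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for group, m in zip(groups, group_means) for x in group
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up ratings for three seasons, for illustration only.
toy_groups = [[3.0, 3.5, 4.0], [3.5, 4.0, 4.5], [4.0, 4.5, 5.0]]
f_stat = one_way_anova_f(toy_groups)
print(f_stat)
```

A large F value means the seasonal means differ more than the spread within each season would suggest. With very large samples, even small differences such as 0.05 can be statistically significant, which is why the magnitude of the effect matters as much as the test result.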
3. Seasonal trend in beer ratings across the states
While beer style preferences show some variation across the United States over the seasons, it remains limited overall. However, given that the U.S. is a very large country, some regions may not experience the same weather conditions across seasons. This raises the question: will this limited variation persist at the state level, or will we observe greater differences in preferences by state?
By visualizing the highest-rated beer styles by state and season, we can see more clearly how preferences evolve over the seasons. To simplify our analysis, we group the states by region as follows:
- Northeast (e.g. New York, Massachusetts)
- South (e.g. Texas, Florida)
- West (e.g. California, Arizona)
- Midwest (e.g. Minnesota, Wisconsin)
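Finding the highest-rated style per state and season can be sketched as below. The region mapping only contains the example states listed above, the records are made-up toy values, and the function name `top_style_by_state_and_season` is illustrative.

```python
from collections import defaultdict
from statistics import mean

# Partial mapping: only the example states named in the list above.
REGION_BY_STATE = {
    "New York": "Northeast", "Massachusetts": "Northeast",
    "Texas": "South", "Florida": "South",
    "California": "West", "Arizona": "West",
    "Minnesota": "Midwest", "Wisconsin": "Midwest",
}


def top_style_by_state_and_season(records):
    """For (state, season, style, rating) records, return {(state, season): (style, mean)}."""
    ratings = defaultdict(list)
    for state, season, style, rating in records:
        ratings[(state, season, style)].append(rating)

    best = {}
    for (state, season, style), values in ratings.items():
        average = mean(values)
        key = (state, season)
        if key not in best or average > best[key][1]:
            best[key] = (style, average)
    return best


# Made-up toy records, for illustration only.
toy_records = [
    ("California", "winter", "Wild Beer", 4.5),
    ("California", "winter", "Wild Beer", 4.3),
    ("California", "winter", "Stout", 4.0),
    ("Texas", "winter", "Stout", 4.2),
]
best = top_style_by_state_and_season(toy_records)
for (state, season), (style, average) in sorted(best.items()):
    print(REGION_BY_STATE[state], state, season, style, round(average, 2))
```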
- In winter, the West favors Wild Beer and Lambic-Unblended, while Montana stands out with Dark Mild Ale. The South prefers Stout and Wild Beer, while the Midwest and Northeast lean towards Dark Mild Ale and Stout, with Lambic-Fruit in the Northeast.
- In spring, the West favors Pale Mild Ale, Low Alcohol Beer, and Lambic-Fruit, while the South and Northeast also lean toward Lambic-Fruit, with the South showing a preference for Kristalweizen. The Midwest also embraces Kristalweizen, with a shift from Stout to Lambic-Unblended.
- In summer, the West continues to favor Pale Mild Ale and Lambic-Fruit, while the South shifts to Lambic-Unblended. The Northeast stays consistent with Lambic-Fruit, and the Midwest prefers Lambic-Fruit and Milk/Sweet Stout.
- In fall, the West still prefers Lambic-Fruit but starts moving toward Strong Pale Ale, especially in Oregon. The South continues with Lambic-Unblended, while some states begin leaning towards Pale Mild Ale. The Midwest remains consistent with Lambic-Fruit but shows growing interest in Stout, while the Northeast favors Lambic-Unblended and Wild Beer.
We can conclude that, as the seasons progress, beer style preferences shift: fruitier, lighter styles are preferred in warmer months, while stronger beers like Stout and Dark Mild Ale become more popular as the temperature cools.
Sentiment and semantic analysis
In addition to numerical ratings, textual reviews play a crucial role in understanding consumer preferences and perceptions of beer. Sentiment and semantic analysis are powerful techniques that allow us to extract insights from the textual data in the BeerAdvocate reviews.
1. Sentiment analysis
When it comes to beer reviews, the words that people use can tell us a lot about their preferences. In this section, we dive into the words used in beer reviews to uncover which ones are linked to positive ratings and which ones are tied to negative feedback. By analyzing these words, we can start to understand what makes a beer enjoyable for consumers, and what makes it less so.
To investigate, we categorized beers based on their ratings: beers with ratings greater than 4/5 are considered highly rated (positive), while those with ratings lower than 3/5 are seen as poorly rated (negative).
We generated two word clouds: one for the most common word pairs found in positive reviews and another for those in negative reviews. At first glance, it was clear that some word pairs were more associated with high-rated beers, while others appeared more often in lower-rated ones. These word clouds provided a valuable insight into the language of beer enthusiasts, highlighting the key aspects they appreciate or dislike in their beer experiences.
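The word-pair counts behind such word clouds can be sketched as follows. The thresholds (above 4 is positive, below 3 is negative) are the ones stated above; applying them to individual reviews, the simple lowercase tokenization, and the absence of stop-word filtering are simplifying assumptions of this sketch, and the review texts are made up.

```python
import re
from collections import Counter


def label(rating):
    """Return 'positive' above 4, 'negative' below 3, and None in between."""
    if rating > 4:
        return "positive"
    if rating < 3:
        return "negative"
    return None


def bigrams(text):
    """Split text into lowercase words and return the consecutive word pairs."""
    words = re.findall(r"[a-z]+", text.lower())
    return list(zip(words, words[1:]))


def bigram_counts(reviews):
    """Count word pairs separately for positive and negative (rating, text) reviews."""
    counts = {"positive": Counter(), "negative": Counter()}
    for rating, text in reviews:
        group = label(rating)
        if group is not None:
            counts[group].update(bigrams(text))
    return counts


# Made-up toy reviews, for illustration only.
toy_reviews = [
    (4.5, "Full bodied with dark chocolate"),
    (2.0, "Light bodied, yellow color"),
    (3.5, "Medium bodied"),
]
counts = bigram_counts(toy_reviews)
print(counts["positive"].most_common(3))
print(counts["negative"].most_common(3))
```

The resulting counters can then be passed to any word cloud tool that accepts frequencies.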
Certain words and phrases appear frequently for beers that get high marks. Word pairs like “full bodied”, “dark chocolate”, and “roasted malts” came up often in positive reviews, highlighting beer characteristics that consumers appreciate.
Other word pairs kept appearing in negative reviews. For instance, phrases like “light bodied” and “yellow color” showed up more often when beers weren’t getting high marks. These terms seem to be tied to beers that left reviewers less impressed.
Some word pairs, like “medium bodied”, appeared in both positive and negative reviews, suggesting that context plays a big role in how these terms are perceived. One possible explanation is that some characteristics of a beer are appreciated by some drinkers but not by others.
These results allow us to determine the characteristics that an ideal beer should have, and those that it should not have, in order to be enjoyed by consumers.
2. Semantic analysis
While sentiment analysis uncovers the emotional tone behind a review, semantic analysis digs deeper into the meanings of words, helping us understand how different characteristics are associated with the beers’ appeal. It’s not just about whether a beer is liked or disliked; it’s about why it stands out in the first place. By analyzing the specific language used in highly rated reviews, we can uncover the key qualities that turn a simple beer into a fan favorite. In this section, we will dive into a semantic comparison of beer reviews across seasons to explore which characteristics define the most-loved beers at each time of the year.
In each season, specific aroma preferences emerge, shaping the appeal of beers based on what people enjoy the most. Based on our analysis of aroma reviews, the following trends stand out. During winter, beers with rich, warm aromas are favored: terms such as “malt”, “caramel”, “chocolate”, and “roasted” are prominent, indicating a preference for fuller-bodied beers. In spring, there is a noticeable shift in aroma preferences: while the rich, malty notes remain popular, there is an increase in fruity terms like “citrus” and “grapefruit”. Summer, in contrast, introduces a different set of preferred aromas, with terms such as “floral”, “herbal”, and “grassy” becoming more prominent. These refreshing aromas provide a crisp and clean feeling, making beers more enjoyable in warmer weather. Finally, as autumn arrives, there is a resurgence of malt-based aromas, similar to winter, while fruity terms become less prominent compared to malty ones.
When it comes to mouthfeel, the seasonal differences are less pronounced compared to other attributes. However, there are still subtle shifts in preferences that provide insights into what consumers enjoy the most throughout the year. Across the cooler seasons, terms like “rich”, “creamy”, and “smooth” dominate the reviews, indicating a preference for beers with a full-bodied and velvety texture. These characteristics align well with the comforting appeal of beers during these times of the year. During summer, there is a noticeable decrease in the prevalence of the term “rich”. Instead, lighter and more refreshing descriptors such as “tart” and “thin” start to gain prominence.
The palate of a beer plays a significant role in its overall appeal, and seasonal variations reveal shifting preferences in flavor profiles. During the colder months, sweet terms dominate positive reviews. Words like “sweet”, “caramel”, and “toffee” are frequently mentioned, reflecting a preference for richer, sweeter flavors that provide warmth and comfort. However, as temperatures rise, acidic terms become more prevalent in positive reviews. Descriptors like “acidic”, “citrusy”, and “sour” indicate a preference for sharper, more refreshing palates that offer a crisp and invigorating experience.
Seasonal variations also reveal distinct preferences for taste. In spring, positive reviews often highlight fruity notes: words like “citrus”, “grapefruit”, and “fruit” are significantly more prevalent, showcasing a clear preference for bright, fresh flavors that resonate with the season’s renewal. In winter, malty flavors take center stage: descriptors such as “caramel”, “roasted”, and “toffee” dominate positive reviews, reflecting a craving for rich, warming tastes that pair well with the season. Autumn, a transitional season, sees a blend of fruity and sweet terms in positive reviews: words like “fruit” and “sweet” become prominent, indicating a preference for beers with both fresh and indulgent flavor profiles as temperatures cool down.
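One simple way to compare term prevalence across seasons is to compute, for each season, the share of positive reviews that mention a given term. The sketch below illustrates this idea under simplifying assumptions (exact word matching, made-up review texts); it is not necessarily the method used to produce the results above.

```python
import re
from collections import defaultdict


def term_share_by_season(reviews, terms):
    """For (season, text) reviews, return {season: {term: share of reviews containing it}}."""
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for season, text in reviews:
        words = set(re.findall(r"[a-z]+", text.lower()))
        totals[season] += 1
        for term in terms:
            if term in words:
                hits[season][term] += 1
    return {
        season: {term: hits[season][term] / totals[season] for term in terms}
        for season in totals
    }


# Made-up toy reviews, for illustration only.
toy_reviews = [
    ("winter", "Roasted malt and caramel"),
    ("winter", "Sweet caramel finish"),
    ("summer", "Grassy and floral"),
]
shares = term_share_by_season(toy_reviews, ["caramel", "floral"])
for season in sorted(shares):
    print(season, shares[season])
```

Using shares rather than raw counts keeps seasons comparable even when they contain different numbers of reviews.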
Conclusion
In this project, we explored beer reviews, analyzing patterns in ratings, styles, and consumer preferences across seasons and regions in the United States. By leveraging numerical ratings and textual reviews, we uncovered insights into how beer preferences vary with alcohol content, seasonal changes, and geographical trends. Sentiment and semantic analysis of textual reviews further enriched our understanding, shedding light on the aspects most valued by beer enthusiasts.
Our findings reveal that preferences are not only shaped by individual tastes but are also influenced by external factors like weather and cultural trends. Brewers and marketers can use these insights to better align their offerings with consumer expectations. Through this analysis, we hope to contribute to a deeper understanding of the dynamic and evolving beer landscape, offering value to researchers, breweries, and beer lovers alike. Based on our analysis, we have developed the following recommendations for making the perfect seasonal beer:
- Winter: Rich, warm aromas of malt, caramel, chocolate and roast dominate. High ABV beers are favored, along with fuller-bodied styles such as Stouts and Barleywines. Mouthfeel preferences tend towards creamy and smooth textures. Sweet flavors, including chocolate and caramel, are prominent.
- Spring: Fruity aromas such as citrus and grapefruit appear alongside malty notes. Low to medium-ABV beers gain in popularity, with lighter styles such as Pale Lager and Pale Ale highly rated. Refreshing mouthfeel descriptors such as thin and tart become common. Bright, fresh flavors dominate, reflecting the renewal of the season.
- Summer: Herbal and floral aromas such as grassy and herbal are prevalent. Low ABV beers are most preferred, with lighter and fruitier styles such as IPA and Wild Beer performing best. Mouthfeel tends towards light and thin textures, complementing sharp and acidic flavors such as citrus and sour notes.
- Autumn: Malt-based aromas resurface, with a mix of caramel and chocolate notes, while fruity notes decrease. Mid-to-high ABV beers gain traction, with rich and comforting styles such as Strong Pale Ale becoming more popular. Mouthfeel remains smooth but incorporates fuller-bodied textures. Flavors blend fruity and sweet profiles, reflecting the transitional nature of the season.
Thank you for visiting our project website, and we hope our work inspires further exploration of this rich and flavorful topic!
Contributors
This project has been conducted by:
- Cyrot Eugénie
- Kiehl Clémence
- Philippe Marin
- Renaud Cléo
- Theimer-Lienhard Pauline
Learn More
If you’d like to dive deeper into the project or explore the data, check out our GitHub Repository for the full project details, code, and datasets.
Feel free to reach out to us for any questions or collaborations!