Are beer ratings biased?

CrimeDog · Apr 21, 2017

Great post/question....

Earlier today, I bought a 4 pack of Putting Out Fires from Sand City which got me thinking about the same thing (I think)...

"Fires" is an awesome session, so in my opinion it should be judged against other sessions - not big DIPAs etc...

But at the same time I understand someone else's post about how much they enjoy the beer in front of them at that moment....damn...I'm rambling now...

Have a great weekend everyone #LGM

drtth · Apr 21, 2017

zid said: ↑

Of course ratings are biased. They always will be, even if rating "to style." It's inherent and it's fine.

This topic has been explored many times and there are multiple schools of thought on rating to style:
1) Anyone not rating to style is breaking the system and doing it wrong.
2) I'll never rate the greatest AAL higher than a great barrel aged stout because in reality there is an inherent hierarchy.
3) I rate according to taste regardless of whether or not it is to style.

None of these approaches are really wrong even if they aren't compatible. Group ratings that reflect all of the views above will be inherently more interesting than a singular approach.

Personally, I view beer through the lens of style, but I think "rating to style" is a terribly flawed idea. When @TongoRad specifically says "with style in mind" he is not being a pedant nor is he being casual with his word choice. It's a deliberate choice and it makes all the difference (for the better). You don't judge a Helles lager according to the same standard as a stout, but there is no genuine standard for a Helles lager or a stout. People tend to treat "rating to style" as a gold standard of a wide viewpoint, but it's really a narrow (or misguided) viewpoint. Rating according to personal taste is also a narrow viewpoint by definition.

Agree, "rating to style" on a site such as this one is basically flawed, if only because almost none of us have the training or experience to do it well and virtually nobody reviews blind. I'd say that's why both @TongoRad's comment and the instructions on this site about "keeping style in mind" when rating are so important.

But I would suggest that there is indeed a standard for a Helles lager or a different standard for a Stout. But for each currently used/defined style there is not a single point on each dimension used in categorization that is the only target. The categorization dimensions are fuzzy dimensions with each dimension having a range of values associated with or "typical" for each style. For example, there is a range of ABVs that are said to be typical for an APA but there are examples of APAs that fit most or all of the other dimensions of categorization but that have an ABV that falls outside the normal range. Etc.

zid · Apr 21, 2017

drtth said: ↑

Agree, "rating to style" on a site such as this one is basically flawed, if only because almost none of us have the training or experience to do it well and virtually nobody reviews blind. I'd say that's why both @TongoRad's comment and the instructions on this site about "keeping style in mind" when rating are so important.

But I would suggest that there is indeed a standard for a Helles lager or a different standard for a Stout. But for each currently used/defined style there is not a single point on each dimension used in categorization that is the only target. The categorization dimensions are fuzzy dimensions with each dimension having a range of values associated with or "typical" for each style. For example, there is a range of ABVs that are said to be typical for an APA but there are examples of APAs that fit most or all of the other dimensions of categorization but that have an ABV that falls outside the normal range. Etc.

I mainly agree, but to a point. To take it a step further - if standards of beer styles are ranges rather than points, and one is rating to style, then any beer that falls within those ranges w/o flaws (flaws can even be subjective as well) would be a perfect "5." If the ranges are large, then there's little point in using that approach. Rebel IPA, Redhook Long Hammer, Harpoon IPA, Racer 5, and Two Hearted might all fit within the acceptable ranges for American IPAs (and if they don't, then perhaps there's an issue with the ranges rather than the beer), but should everyone rate them evenly?

rgordon · Apr 21, 2017

If nothing else, various rating (systems) are an interesting yardstick for current trends. In the wine world, if, say, "Barnstormer" Zinfandel scored a 95 at the Cloverdale Fair in 1995, that 95 score redounds into eternity on bottle neckers, box graphics, and company propaganda. I take almost all ratings with a grain of salt. There are some voices that I do trust. Ratings are indeed biased.

drtth · Apr 21, 2017

zid said: ↑

I mainly agree, but to a point. To take it a step further - if standards of beer styles are ranges rather than points, and one is rating to style, then any beer that falls within those ranges w/o flaws (flaws can even be subjective as well) would be a perfect "5." If the ranges are large, then there's little point in using that approach. Rebel IPA, Redhook Long Hammer, Harpoon IPA, Racer 5, and Two Hearted might all fit within the acceptable ranges for American IPAs (and if they don't, then perhaps there's an issue with the ranges rather than the beer), but should everyone rate them evenly?

Well, first let's abandon the idea of "rating to style" which is different than rating "with style in mind" and I think we can agree that the former will never be the norm on sites like this, if only because there is no way to force people to do blind ratings.

Next let's factor in the fact that people themselves have different patterns of behavior when it comes to using rating scales, regardless of what is being rated. There are some folks who almost never use the extreme scores and there are others who seldom use the values in between. So a "perfect" 5 across all raters is simply not going to happen, and is an unrealistic expectation about human behavior.

Sure, if the ranges are wide rather than narrow there's likely to be a lot more "slop" in the system but that doesn't really mean they are not useful or that there can not be multiple beers falling into the "equivalence category" created by the range and still display differences within that range. Further, that doesn't mean their scores/ratings, etc. are not useful.

Rather it means people have to live with variability. Which is actually something we do regularly and quite well in many circumstances. To use a loose analogy, we often speak about people who are "redheads" and are well understood when we say that, but there is actually a wide range of shades that fall into that category. To allow anything else would put us in the position of having to have a different category name for every individual different shade and the location of a line would still be difficult to decide.

So our choices boil down to either a few potentially imperfect categories with fuzzy boundaries and ranges that may be of different widths or it leaves us with an almost totally unworkable multiplication of descriptors, each referring to a different category. In addition we'd still have a problem determining which criterion of difference-sameness to use. For the example of hair color, would it be human ratings limited by the sensitivity of the human eye and differences in terminology, or would it be the unambiguous wavelength of light which can be assessed very precisely but contains many subdivisions not detectable by the human eye?

All of which is a long, overly detailed way of getting to "No, we should not expect every individual to rate exactly as does every other individual."

Junior · Apr 21, 2017

Not really bias, but I find it interesting that the average rating on this site is somewhere near 3.75. Shouldn't it be closer to 2.5 or 3?

drtth · Apr 21, 2017

Junior said: ↑

Not really bias, but I find it interesting that the average rating on this site is somewhere near 3.75. Shouldn't it be closer to 2.5 or 3?

Only if there is a normal distribution of ratings centered around one of those numbers. (3 since there is no 0 score to be assigned.) But that requires assumptions and/or procedures not in place.
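To put rough numbers on that, here is a minimal sketch. It assumes a 1-to-5 scale in 0.25 steps (the step size is an assumption, not something stated in this thread) and a made-up set of ratings. It only illustrates that the midpoint of a scale with no 0 is 3, and that ratings bunched toward the high end will average well above that midpoint.

```python
from statistics import mean


def scale_midpoint(low: float, high: float) -> float:
    """Midpoint of a rating scale that runs from low to high."""
    return (low + high) / 2


def scale_points(low: float, high: float, step: float) -> list[float]:
    """Every allowed score on the scale, e.g. 1.0, 1.25, ..., 5.0."""
    count = int(round((high - low) / step))
    return [low + i * step for i in range(count + 1)]


# Assumed scale: 1 to 5 in quarter-point steps.
points = scale_points(1.0, 5.0, 0.25)

# If every allowed score were used equally often, the average would sit
# at the midpoint of the scale.
evenly_used_mean = mean(points)

# Made-up ratings that cluster between 3 and 4.5, purely for illustration.
hypothetical_ratings = [3.5, 4.0, 3.75, 4.25, 3.0, 4.5, 3.75, 4.0, 3.25, 3.5]
clustered_mean = mean(hypothetical_ratings)

if __name__ == "__main__":
    print("midpoint of 1-5 scale:", scale_midpoint(1.0, 5.0))
    print("mean if all scores used evenly:", evenly_used_mean)
    print("mean of clustered ratings:", clustered_mean)
```

Nothing forces raters to spread their scores evenly or symmetrically, so an average above the midpoint is not by itself evidence of a problem.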

Ninjakillzu · Apr 22, 2017

Bias will always exist. My ratings are almost always skewed high, because if I like a beer's aroma/flavor, the rating for those will never go below 4.5. It's not often that I dislike a beer. My ratings are usually a combination of flavor and style correctness.

SFACRKnight · Apr 22, 2017

JAStheAce said: ↑

In general, no. However, when it is a beer that is not easily obtainable aka rare, then the rating becomes more susceptible to bias. It is just human nature to subjectively rate such a beer a bit higher than you normally would because the fact that you are drinking it and others are not makes it taste better. Remember, the most delicious flavor in beer - is rare and I believe this skews ratings.

That's funny, I find rare beers never live up to the hype and I usually rate them lower. KBS was one of those beers. I had just built it up in my head and when I finally tried it I was let down. Same with Red Poppy. Same with a bunch of others too.

1ale_man · Apr 22, 2017

For what it's worth, I really "listen" to what I read in the ratings on this site! All of you have way more experience than I. These ratings may be somewhat skewed, but I still "listen". We all have our favorites, and no matter how hard we try, we will lean toward them! So please!!! Don't lead me too far astray! By the way. I don't rate porters and stouts! Not my favorite! Y'all keep on rating!!!!

Cheers

StoutElk_92 · Apr 22, 2017

andy712 said: ↑

I hope this isn't too off topic, but I'm curious given that you're from Vermont and rate based, among other things, on appearance and mouthfeel. How do you rate the appearance of the super cloudy NE style IPAs that are so popular right now? Do you rate relative to other NE IPAs or relative to the general IPA style? Same re mouthfeel. Personally, I have had a number of NE style IPAs that I have liked (although I prefer IPAs with a little malt backbone), but I find the look off-putting and would rate lower in appearance than, say, a Pliny the Elder. Taste is, of course, most important, but I am befuddled by reviews giving 5s for appearance for beers that look nothing like the traditional look of that beer style (I am presuming IPAs were initially mostly clear, but I could be wrong about that).

The American IPA is nothing like the original IPAs. Why does a beer need to look traditional to be a 5? If it looks good it looks good, that's about it. What if the modern interpretation of a style looks better than the original traditional interpretation? Do we rate it lower because it's not how it is supposed to look?

andy712 · Apr 22, 2017

StoutElk_92 said: ↑

The American IPA is nothing like the original IPAs. Why does a beer need to look traditional to be a 5? If it looks good it looks good, that's about it. What if the modern interpretation of a style looks better than the original traditional interpretation? Do we rate it lower because it's not how it is supposed to look?

Well, I guess that's what I am trying to get at. Is there any objective measure of appearance or is it entirely subjective? E.g., I like NE IPAs; this looks like a NE IPA; I therefore rate it a 5 in appearance. If we don't "rate it lower because it's not how it is supposed to look" what's the point of rating appearance? I suppose it's a matter of deciding what the criteria are. Maybe NE IPAs are an anomaly, or at least a substandard with its own criteria for appearance. In the end, it's a pretty minor component of what makes a good beer, I guess.

drtth · Apr 22, 2017

andy712 said: ↑

Well, I guess that's what I am trying to get at. Is there any objective measure of appearance or is it entirely subjective? E.g., I like NE IPAs; this looks like a NE IPA; I therefore rate it a 5 in appearance. If we don't "rate it lower because it's not how it is supposed to look" what's the point of rating appearance? I suppose it's a matter of deciding what the criteria are. Maybe NE IPAs are an anomaly, or at least a substandard with its own criteria for appearance. In the end, it's a pretty minor component of what makes a good beer, I guess.

It is, as you say, a matter of deciding what the criteria are. In the case of the NE IPA does one use the long-standing tradition of clarity or allow murky appearance as a good thing? That really won't be resolved until the larger debate over whether NE IPA is a new style or not has been resolved. The statement of using different criteria should, however, be made by the reviewer if they choose to ignore the long-standing criteria of clarity and a fluffy white head that leaves lots of lacing. These did not come about arbitrarily, if only because a fail on either or both might be an indication of a brewing flaw.

However, this appearance business is not such a minor matter that we should dismiss it out of hand, the way some folks do as being irrelevant to their enjoyment of the beer. (And which you are clearly not doing in this thread.)

Appearance can have an effect. Not only can it indicate a brewing flaw it can be shown to have an effect on people's expectations of flavors and their enjoyment, etc. This is one reason food production companies actually worry about such things as clarity in some beverages.

http://www.foodbusinessnews.net/articles/news_home/Beverage_News/2014/01/The_value_of_beverage_clarity.aspx?ID={1DB752F4-D4F8-466E-B250-4FD4AEE47479}

Until the debate is resolved about whether the NE IPA is "a new style with its own criteria or still part of an older style that violates one or more widely accepted criteria," my own approach is to be quite clear in my review about what my reasons are for assigning the numerical value that turns my subjective experience into an objective number. That way anyone who happens to read the review can make their own judgment about what I've done and why.

TongoRad · Apr 22, 2017

zid said: ↑

I mainly agree, but to a point. To take it a step further - if standards of beer styles are ranges rather than points, and one is rating to style, then any beer that falls within those ranges w/o flaws (flaws can even be subjective as well) would be a perfect "5." If the ranges are large, then there's little point in using that approach. Rebel IPA, Redhook Long Hammer, Harpoon IPA, Racer 5, and Two Hearted might all fit within the acceptable ranges for American IPAs (and if they don't, then perhaps there's an issue with the ranges rather than the beer), but should everyone rate them evenly?

I look at it more like multiple dimensions rather than larger ranges. Some beers are delicate, some are bold, some are inventive, some are classic, and on and on. All can be seen within the construct of keeping style in mind.

So, although I don't think all of the beers you listed are on the same level, I do allow for multiple beers being on the super elite (not "perfect") 5 level.

drtth · Apr 22, 2017

TongoRad said: ↑

I look at it more like multiple dimensions rather than larger ranges. Some beers are delicate, some are bold, some are inventive, some are classic, and on and on. All can be seen within the construct of keeping style in mind.

So, although I don't think all of the beers you listed are on the same level, I do allow for multiple beers being on the super elite (not "perfect") 5 level.

Hmm, but if you allow for multiple dimensions rather than ranges, where is your either/or dividing line between delicate and bold?

I think you've clarified part of what I was hoping to say as well. There is both the problem of deciding which dimensions to use and the problem of how broad the dividing line (i.e., the range) is to be.

TongoRad · Apr 22, 2017

drtth said: ↑

Hmm, but if you allow for multiple dimensions rather than ranges, where is your either/or dividing line between delicate and bold?

I think you've clarified part of what I was hoping to say as well. There is both the problem of deciding which dimensions to use and the problem of how broad the dividing line (i.e., the range) is to be.

To really simplify it, the question is "what are they going for, and how well did they do?" In reality, though, the question isn't all that simple to answer, and it can take a few times with a new beer to wrap one's head around it.

dlcarst · Apr 22, 2017

Absolutely, although I've found myself becoming more balanced lately. I used to give 4.5+ ratings to double IPAs and coffee or barrel aged stouts; now I've found myself giving (only!) a 4 for highly rated/hyped beers and 4.5 for a superb Irish stout or Vienna lager. However, I still can't seem to break out of rating almost every single beer I drink 3.5-4.5. Part of it is that I know how to avoid mediocre beer, but when I look at my ratings I often can't believe that I rated two beers so close to each other when I clearly enjoyed one much more.

Giantspace · Apr 22, 2017

I rate by how I like what I am drinking. If drinking a Hefeweizen I think of the Hefeweizens I have had before and compare. If it's a style I don't know about I will still rate to my taste, correct to style or not. I try to read reviews before using the scores to decide if I would buy.

I use a 3.0 score for PBR as my base for all beers.

Enjoy

StoutElk_92 · Apr 22, 2017

andy712 said: ↑

Well, I guess that's what I am trying to get at. Is there any objective measure of appearance or is it entirely subjective? E.g., I like NE IPAs; this looks like a NE IPA; I therefore rate it a 5 in appearance. If we don't "rate it lower because it's not how it is supposed to look" what's the point of rating appearance? I suppose it's a matter of deciding what the criteria are. Maybe NE IPAs are an anomaly, or at least a substandard with its own criteria for appearance. In the end, it's a pretty minor component of what makes a good beer, I guess.

What I'm trying to get at is I think we should just rate based on how good it is to us, in this case looks. I've had hazy unfiltered IPAs that don't look nearly as good as say most Trillium or Tree House beers. If you like it you like it, never mind what it's supposed to be like. If it's good it's good. That's how I feel about rating.

Beer_Economicus · Apr 22, 2017

I have read through all these posts, and I feel like everyone is providing solid feedback. Here's my take (with some overlap to previous posts).

There will inherently be bias for myriad reasons, most notably:
1) Not everyone is starting at the same reference point. Some people come into a review with 50 beers, some with 250. This will dramatically alter your perspective.
2) Everyone has different tastes. I do not mean that some people like IPAs and some don't, I mean literally that taste is different. Someone that says "I have rarely had a beer that I cannot drink" will likely have higher avg scores than someone who is significantly more selective.
3) Regarding #2, if you only drink "highly regarded" offerings, your avg score will also be different, as you have more high end beers to compare to (even if still comparing to high end beers).
4) People are likely not only comparing within "style", but also within class. I can see a lot of people comparing KBS largely to other "well regarded BA stouts" rather than just other "BA stouts." This will make a big difference.
5) There is no guide for people to use for rating, and as a result most of the scale does not get used. One person may use 1 as "the worst beer ever", while someone else might say that their worst beer ever was still drinkable, and therefore a 3. Same silly logic you see in graduate school grades where you see clumping at 90%. Tells you very little about where everyone falls compared to the others in that group - instead it just says "competent."

A few others, but these seem to carry most of the weight (for me).
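Point 5 can be illustrated with a small sketch. It assumes two hypothetical raters (the names and scores are made up) who rank the same three beers in the same order but use different parts of the scale. Converting each rater's scores to z-scores, relative to that rater's own mean and spread, is one common way to compare them; nothing in this thread says the site does anything like this.

```python
from statistics import mean, pstdev


def standardize(scores: list[float]) -> list[float]:
    """Convert one rater's scores to z-scores using that rater's own
    mean and population standard deviation."""
    center = mean(scores)
    spread = pstdev(scores)
    if spread == 0:
        # A rater who gives every beer the same score carries no
        # relative information, so every z-score is 0.
        return [0.0 for _ in scores]
    return [(score - center) / spread for score in scores]


# Hypothetical raters scoring the same three beers, in the same order.
generous_rater = [3.5, 4.0, 4.5]  # "worst beer was still drinkable, so a 3.5"
harsh_rater = [1.0, 2.0, 3.0]     # uses the bottom of the scale freely

generous_z = standardize(generous_rater)
harsh_z = standardize(harsh_rater)

if __name__ == "__main__":
    print("raw averages:", mean(generous_rater), mean(harsh_rater))
    print("generous rater z-scores:", [round(z, 3) for z in generous_z])
    print("harsh rater z-scores:", [round(z, 3) for z in harsh_z])
```

The raw averages differ by two full points, yet after standardizing the two raters turn out to agree completely on which beer is worst, middle, and best.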

CrimeDog Zealot (749) Dec 31, 2015 New York

drtth Initiate (0) Nov 25, 2007 Pennsylvania
In Memoriam

zid Grand Pooh-Bah (3,132) Feb 15, 2010 New York
BA4LYFE Society Pooh-Bah Trader

rgordon Pooh-Bah (2,701) Apr 26, 2012 North Carolina
Pooh-Bah

drtth Initiate (0) Nov 25, 2007 Pennsylvania
In Memoriam

Junior Pooh-Bah (1,883) May 23, 2015 Michigan
Pooh-Bah Trader

drtth Initiate (0) Nov 25, 2007 Pennsylvania
In Memoriam

Ninjakillzu Initiate (0) Oct 5, 2015 Washington

SFACRKnight Grand Pooh-Bah (3,348) Jan 20, 2012 Colorado
Pooh-Bah Trader

1ale_man Initiate (0) Apr 25, 2015 Texas

StoutElk_92 Grand Pooh-Bah (4,045) Oct 30, 2015 Massachusetts
Pooh-Bah

andy712 Initiate (0) Jul 23, 2016 Oregon

drtth Initiate (0) Nov 25, 2007 Pennsylvania
In Memoriam

TongoRad Grand Pooh-Bah (3,884) Jun 3, 2004 New Jersey
Society Pooh-Bah Trader

drtth Initiate (0) Nov 25, 2007 Pennsylvania
In Memoriam

TongoRad Grand Pooh-Bah (3,884) Jun 3, 2004 New Jersey
Society Pooh-Bah Trader

dlcarst Zealot (733) Aug 21, 2015 Missouri
Trader

Giantspace Grand Pooh-Bah (3,043) Dec 22, 2011 Pennsylvania
Pooh-Bah

StoutElk_92 Grand Pooh-Bah (4,045) Oct 30, 2015 Massachusetts
Pooh-Bah

Beer_Economicus Pooh-Bah (2,698) Apr 8, 2017 Ohio
Pooh-Bah Trader
