The Problem with Beer Scores

Discussion in 'Article Comments' started by BeerAdvocate, Nov 14, 2017.

  BeerAdvocate

    Founders (17,635) Aug 23, 1996 Massachusetts

    Many beer scores follow a dated, A-F standard borrowed from the wine and spirits world that was basically designed for marketing.

    Read the full article: The Problem with Beer Scores
  nc41

    nc41 (1,482) Sep 25, 2008 North Carolina
    There's no way to make it perfect, it's still weighted heavily to the trendy and more popular styles. You just have to look at the style in relation to the scores and interpret them.
  Harrison8

    Harrison8 (1,700) Dec 6, 2015 Missouri
    This seems more like an article to highlight Rate Beer's problems, rather than problems with beer scores in general, although it is quite interesting in light of their recent ownership change.

    As for the rating system, was it borrowed from wine and spirits, or ultimately academia? Even in those other realms, ratings contain bias. There is a big push in academia now for teachers and those performing the grading to write up rubrics for each complex task so there are tangible qualities that can be attributed to why a paper or project scored as such. If we want to make a push to siphon more bias out of our scores, we should look to do the same. Then again, providing general style guidelines (as has been done) and establishing a larger and larger pool of reviewers could help even out biases from both sides (work in progress) - a benefit not available to a single teacher grading, or beer festival judge.

    Side note: I am satisfied with the rating system on this site. The developers have been open about it, and yet have not divulged enough for users to abuse it.
  Leebo

    Leebo (148) Feb 7, 2013 Massachusetts

    Wine and whiskey drinker here as well. So you have a 100 point scale. Ever seen below say a 65? Nope. Taste is subjective, always will be. Haven't looked at ratings in years, just trust your own palate.
  NeroFiddled

    NeroFiddled (8,971) Jul 8, 2002 Pennsylvania
    Ahhh, but to taste it you usually have to buy it. I trust BA ratings to steer me in the right direction when buying.
  Squire123

    Squire123 (1,603) Jul 16, 2015 Mississippi
    I'm old enough and have seen enough to be relatively immune to the ego massaging shelf tags that proclaim 92 points with an exclamation mark. Good Lord, you can become an instant wine snob without even bothering to read the label. I cannot count (well, I suppose I could were I not so dedicated to avoiding work) the number of times I've heard that "I only drink 90 points or better" comment from a person I know can't tell Petrus from plonk.

    Beer is an incredibly diverse beverage that cannot be constrained by something so arbitrary as a 100 point system. A Pilsner is no more a Stout which isn't a Kölsch which is not a Wee Heavy and so forth till we get to those journeyman players called IPAs. A celebration of life is this marvelous brew proclaiming its brilliance with different styles to suit any occasion or time of day.

    A 1-5 point system covers the territory nicely and you won't hear someone at a tasting saying, "I normally won't touch anything under 3.75 but this 3.69 is drinking well."

  DrivinNCryin

    DrivinNCryin (138) Aug 21, 2017 South Carolina
    I have friends that fall into mainly two camps for rating on Untappd or whatever 1-5.

    1- The rating is solely subjective and private to them. "My rating is for me only" (to determine future purchases, perhaps)
    2- (the camp I like to fall under)- How good is it in the realm of its style. Surely there are some 5 star / cap pilsners out there.
  DonicBoom

    DonicBoom (132) Mar 26, 2015 Virginia

    I read the article as inadvertently implying RateBeer uses an A-F-inspired or 50-100 scale. The scale is in fact 0-100. The score for Budweiser, since ABI was invoked, is 0.

    Despite the comparison to the wine and spirits worlds, this is completely unlike the scale of major publications. Even the worst wine could not score lower than 50 (Wine Advocate and Wine Spectator), would be simply "not recommended" if below 80 (Wine & Spirits), or wouldn't even be published if below 80 (Wine Enthusiast).

    I'm sure there was no intent to mislead and love the continued transparency on BA ratings. I agree RB's system's downsides include the disconnect on how users' 0-5 ratings get translated into 0-100 scores.
  DonicBoom

    DonicBoom (132) Mar 26, 2015 Virginia

    Whether BA's or RB's scores best inform consumers depends on familiarity with the scales. Those familiar with BA will have a sense the top 100 beers in practice range from 4.5 to 4.83 currently. It works perfectly for a frequent visitor like me. However, one does not need any familiarity with RB to instantly grasp the highest ratings are 100. If, as I agree, they have too many 95-100 ratings within popular styles, that's inflation from their formula (and arguably users) that need not be inherent to a 100-point scale.
  Leebo

    Leebo (148) Feb 7, 2013 Massachusetts

    OK, that works for you. I trust the brewers that I know (a few) and what their general style/strengths are. My local stores do tastings all the time, plus flights at the brewery etc. I have so many, so good local choices that have never let me down. Just read the can? Jack's Abby, Night Shift, Exhibit A, Harpoon special releases, Cambridge Brewing, Ipswich, Newburyport, Wormtown etc. All do great stuff, no rating review needed. Just how I roll. The Mystic sour raspberry gose? Not so much, will leave that for someone else.
  Sabtos

    Sabtos (3,594) Dec 15, 2015 Ohio
    Straight, no chaser. Kudos for keeping it cold.
  Dan_K

    Dan_K (414) Nov 8, 2013 Colorado
    The biggest problem with BA ratings, to me, is that there aren't enough of them. I rate all the beers I drink on Untappd, because it's quick, easy, and connected to my beer friends. And the volume of ratings is a quality in and of itself. Meanwhile, there are beers in Colorado that get very few if any ratings on BA. Why? I think a lot of people have moved beyond lengthy, verbose ratings. For example I bought a beer that they released 800 bottles of, a BA Imperial Stout, and not one rating on BA several months later.

    This is not a knock on BA, just an observation.
    BA scores are probably the most accurate. Untappd ratings are inflated a bit, but if you subtract 0.15-0.2 points from each you get very close to BA ratings.
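
    A minimal sketch of that rule of thumb, assuming the 0.15-0.2 offset holds and that BA scores run from 1 to 5. The function name and the clamping are illustrative assumptions, not anything either site publishes.

```python
def estimate_ba_range(untappd_rating, low_offset=0.15, high_offset=0.2):
    """Return a (low, high) estimate of a BA-style score from an Untappd rating.

    The offsets are the anecdotal 0.15-0.2 range from the post above,
    not an official conversion between the two sites.
    """
    def clamp(score):
        # Assumption: keep the estimate inside a 1-5 scale.
        return round(min(5.0, max(1.0, score)), 2)

    # Subtracting the larger offset gives the lower bound of the estimate.
    return clamp(untappd_rating - high_offset), clamp(untappd_rating - low_offset)


# Hypothetical Untappd rating of 4.2
print(estimate_ba_range(4.2))
```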

    There are obvious and apparent flaws with RB's system, even more so now that the ownership is convoluted and controversial.
  Dave_S

    Dave_S (115) May 18, 2017 England

    I thought that RB users rated stuff out of five, and that the rating out of 100 was actually a percentile thing? So a 100 is in the top 1% by average score, a 90+ is in the top 10% and so on.

    Anyway, I dunno, to me there are a lot of ways that a beer can be great, so there's an inherent limitation in any system that tries to boil it all down to a single numerical score, whatever system you use. And that's fine but you have to be aware of it if you're going to cite BA or RB or Untappd scores as a source of certified objective truth.
  dennis3951

    dennis3951 (772) Mar 6, 2008 New Jersey
    The trouble with ratings is one reviewer might rate an IPA 4.5 because it's a hop bomb. Another reviewer might rate the same IPA 3.5 because it is over hopped.
  DonicBoom

    DonicBoom (132) Mar 26, 2015 Virginia

    It's a weighted average that's a bit more complicated, but you're correct that user ratings out of 5 get translated into a percentile ranking out of 100.
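
    A rough sketch of the percentile idea, assuming a plain (unweighted) average per beer. RateBeer's actual weighting is not described in this thread, so the function below is a simplified stand-in and the beer names and averages are made up.

```python
from bisect import bisect_right


def percentile_scores(averages):
    """Map each beer's average 0-5 rating to a 0-100 percentile-style score.

    A beer's score is the percentage of beers in the list whose average
    is at or below its own, so the top-rated beer always gets 100.
    """
    ordered = sorted(averages.values())
    total = len(ordered)
    scores = {}
    for beer, avg in averages.items():
        # Count how many averages are less than or equal to this one.
        at_or_below = bisect_right(ordered, avg)
        scores[beer] = round(100 * at_or_below / total)
    return scores


# Hypothetical averages out of 5
sample = {"Beer A": 4.6, "Beer B": 4.1, "Beer C": 3.7, "Beer D": 3.2, "Beer E": 2.5}
print(percentile_scores(sample))
```

    In this simplified version a small gap in average rating can become a large gap in the 0-100 score, which is one way the disconnect between the two scales can arise.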

    I agree, and often the publishers of numeric scores are among the first to acknowledge the importance of reading reviews from those with a somewhat similar palate. Even then, no two people have the same preferences. At the same time, an instantly graspable numeric score can be very helpful when one doesn't have the time or interest in examining descriptive reviews. The vast majority of beer drinkers probably fall in that category.
  SmokeBeerMinistry

    SmokeBeerMinistry (145) Sep 26, 2014 Texas

    Too many cooks spoil the broth. Just follow one palate. Like mine, for instance, at SmokeBeerMinistry ... if you're into beers that taste like smoke. And use a 1 - 7 scale. My friend is in marketing research and that's what they do, because of science I guess.
  Squire123

    Squire123 (1,603) Jul 16, 2015 Mississippi
    A numerical score really doesn't tell me much about how a beer tastes. If I didn't have the time or interest to read descriptive reviews I wouldn't be here.
  musicsherlock

    musicsherlock (323) Jan 2, 2012 New York

    I have three problems with user numerical scores:
    1) There is no 'perfect' beer (5.0)...disregard my ratings! So all "good to great" beers fall within 3.8-4.5
    2) For the most part IPAs, DIPAs, Imperial Stouts rate higher than all other varieties...i.e., high-ABV beers already have a leg up in rankings
    3) Palate shift. I've had beers that taste great with the initial beer or two and then at subsequent sittings taste worse, and vice-versa

    That said, I still use BA as a guide when shopping
  rgordon

    rgordon (733) Apr 26, 2012 North Carolina

    Ratings and scores are meaningless to me. I try new beers all of the time, knowing most often what to expect. I also know some reviewers that have tastes similar to mine and I trust their recommendations. In all honesty, I think rating beers is over-rated.
  Premo88

    Premo88 (1,439) Jun 6, 2010 Texas

    I say keep doing what you do, Bros. Alstrom! I love the database and have learned how to make the numbers work for me. And the fact that BA.com is transparent about what gets done here makes it even better.
  mikeinportc

    mikeinportc (513) Nov 4, 2015 New York

    Yep. Also, there are those that are "hard graders", that are loath to give anything a high score, whose "great beer" barely gets a "4", if that, and those that think lots of beers deserve the magical 5. Have to take that into account.

    For me the # is a starting point. Occasionally, I'll look at ones I know, then look at "Beers" of those that have a similar opinion, then look at those rated/reviewed beers of theirs that I've also had. From that, I've noted those who frequently have the same general opinion as I do. Now, when I look at an unknown, I'll look for their rating/review. The daily WBAYDN? thread has been helpful in this regard.

    That said, I'll try anything once, provided I can get a single.
  mudbug

    mudbug (601) Mar 27, 2009 Oregon

    I'd say this site does just about as well as you can with the data that is available to them. Some things to ponder would be that if every beer drinker in the US joined BA and rated their favorite beer the overwhelming number one winner would be Bud Light.
    Another thing I wonder about is why there is no "top beers" list that takes say the top two beers in every style and calls them the best of the best.
  mudbug

    mudbug (601) Mar 27, 2009 Oregon

    Another thing that needs addressing IMO is that if you are going to carve out a new "Style" like NEIPA then you should also create a listing for best NEIPAs don't you think? https://www.beeradvocate.com/lists/style/116/
  pat61

    pat61 (4,538) Dec 29, 2010 Minnesota
    The BA system works for me and I can look through the ratings to get a feel of where they are coming from. It's not perfect but I understand it. Rate Beer is harder to follow and I distrust the AB InBev involvement. Any rating system has to assume a certain level of intelligence on the part of the person making decisions on the rating system. Intelligent users will understand and know what the ratings mean and how to use them in a meaningful way. For the unintelligent - we can't do much about that other than hope they eventually learn and be kind in the interim.
  drtth

    drtth (3,223) Nov 25, 2007 Pennsylvania

    The problem isn't so much the BA scoring system itself as it is the assumptions made about what the numbers can/do mean. (E.g. lots of folks automatically assume that the mean must be 3 because it is in the middle of the numbers used in rating and/or they assume that the numbers have the same meaning across styles as they do within a particular style.)
  dbrauneis

    dbrauneis Site Editor (6,293) Dec 8, 2007 North Carolina
    If/when a new style (like NEIPA) is added to the site it will appear with its own list by style like all of the others have.
  EmperorBevis

    EmperorBevis (3,682) Sep 25, 2011 United Kingdom (England)

    What I love about Beeradvocate is that the majority of the reviews are genuinely written to fairly describe an ale for the love of good beer.
    This might seem like an unremarkable fact
    but I've seen other apps & sites where members tend to be really harsh on microbrewers and highly praise big national/multinational conglomerates.
    I don't know if this is to try and solicit favour or money from big business, hoping to get a foot in the door for a career, but as someone who favours traditional beers that are best made in smaller batches I find it horrifying.

    I always try and keep foremost in my mind the style and sometimes if it is one that isn't exactly a favourite trying to be a little less critical,
    though it makes me think
    if a beer is sold as say a black IPA
    but drinks more like a traditional stout
    would that be cause to mark it down score wise?
  Squire123

    Squire123 (1,603) Jul 16, 2015 Mississippi

    My sentiments exactly.
  TongoRad

    TongoRad (2,034) Jun 3, 2004 New Jersey
    Whenever I look up a beer the scores don't mean nearly as much to me as the written reviews. And while I really do appreciate the well-written ones that make me feel like I'm experiencing the beer, I find it most helpful to look for common threads throughout the page of reviews.
  drtth

    drtth (3,223) Nov 25, 2007 Pennsylvania

    The only thing my very similar use would bring to this description is that the numerical scores and lists can be very helpful as a rough initial filter. Within a single style list the 5th beer and the 10th beer may be effectively indistinguishable from each other except possibly in a bit of flavor profile. But I don't think much is lost by ignoring the bottom of the list (which often has a weakness in the flavor profile) when there are quite a few above them remaining untried. Similarly there are so many beers I've not tried it's hardly worth worrying about beers not in that style's list unless there's a special case involved.
  HotDogBikeRide

    HotDogBikeRide (533) Dec 26, 2015 Texas
    The only issue I take with beer ratings is the subjectivity regarding mood, time of day, other things going on around the time of the rating. Did the rater just smoke a cigarette for instance, or are they drinking the beer with food? Someone could be having a really good day, drink an above-average beer, and consider it world class just due to thought association.

    I find myself often going back, also, after years of ratings and retrying a beer. I am then forced to edit my rating as my tastes/palate have evolved to detect and describe the new things I see in it. I still love rating - something about scribbling notes and cataloging my beer adventures as I get older is super rad.

    Love going back and seeing what I was discovering a year, or two ago. Can only imagine the nostalgia five, ten years down the line. Cheers guys.
  PorterPro125

    PorterPro125 (876) Jan 19, 2013 New Brunswick (Canada)

    I would say the majority of people that rate and review beers here on BA don't rate them with the particular style in mind. One of the main examples that comes to my mind is AALs. I'm not the biggest AAL fan but there are some examples in the style that are much better than similar beers, yet they are still scored poorly because they are what they are.
  nc41

    nc41 (1,482) Sep 25, 2008 North Carolina
    Look at the Lager and Pils scores; even the popular Fest style beers suffer here. It takes the best of the best to get over 90 in a lot of cases, and a middling IPA probably hits that mark.
  PorterPro125

    PorterPro125 (876) Jan 19, 2013 New Brunswick (Canada)

    I would wager a bet and say that most malt-forward styles aren't rated as well as their hoppy counterparts.
  drtth

    drtth (3,223) Nov 25, 2007 Pennsylvania

    I'd say more the wine and spirits based on how far back the BA basic rating scale ideas go in time. But I'd also suggest that both incorporate some "inappropriate" assumptions about how numbers work and what they mean. So effectively there's a good bit of pseudo-science involved in each, since neither of them appears to be solidly based in, or derived from, the scientific aspects of rating scale development or psychometrics.
  woodchipper

    woodchipper (735) Oct 25, 2005 Connecticut

    A lot of people in this thread have said something to the effect that ratings are meaningless or that they don't pay attention to them. I don't agree with that view, but I do agree that a lot of the flaws discussed are real.
    If you are a critical thinker and can factor in (or out) the flaws with any such system, then there is value in ratings. Just look at the ratings with your personal grading curve and never take them at face value. They are imperfect, but they are data and therefore of some value when analysed critically. The new inclusion of histograms on the BA ratings is a big help in interpreting ratings.
  keithmurray

    keithmurray (1,131) Oct 7, 2009 Connecticut

    Seems like only IPAs, stouts, and imperial styles get high grades
  meefmoff

    meefmoff (312) Jul 6, 2014 Massachusetts
    Given that there are top 100 lists already generated for every style here, has the site ever considered including this information on the front page for each individual beer when applicable?

    For instance, the #1 German pilsner (Hill Farmstead Mary) has a rating of 4.1 that only earns it a #2313 ranking overall. But if the page for Mary included the information about it being ranked #1 within its style that might mitigate the issues with grade inflation/deflation for certain styles that always get referenced in threads like these.
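
    A small sketch of how a within-style rank could be computed next to the overall rank. Only Mary's 4.1 comes from the post above; the other beers, scores, and the function name are hypothetical, and this is not how the site itself builds its lists.

```python
from collections import defaultdict


def style_and_overall_ranks(beers):
    """Return {name: (rank_within_style, overall_rank)}.

    beers is a list of (name, style, score) tuples; higher scores rank first.
    """
    ordered = sorted(beers, key=lambda beer: beer[2], reverse=True)
    seen_in_style = defaultdict(int)
    ranks = {}
    for overall_rank, (name, style, _score) in enumerate(ordered, start=1):
        # Each beer's style rank is how many beers of that style came before it, plus one.
        seen_in_style[style] += 1
        ranks[name] = (seen_in_style[style], overall_rank)
    return ranks


# Hypothetical data, apart from Mary's 4.1
beers = [
    ("Hypothetical DIPA", "Imperial IPA", 4.6),
    ("Hypothetical Stout", "Imperial Stout", 4.5),
    ("Mary", "German Pilsner", 4.1),
    ("Hypothetical Pils", "German Pilsner", 3.9),
]
print(style_and_overall_ranks(beers)["Mary"])
```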
  dbl_delta

    dbl_delta (1,120) Sep 22, 2012 Pennsylvania
    Granted that any individual evaluation of a beer is subjective, and can be affected by many outside variables (time of day, mood, that handful of M&M's you just scarfed down). And granted that the quality of reviewers is highly variable, ranging from the true connoisseur to the guy who's had a few too many and just decides WTF let's write a REVIEW here. And some reviewers review to style, while others rate on their own personal enjoyment - so we're not even rating things on the same criteria. BUT...

    Ever hear of the "wisdom of the crowd"? It's been shown that the average of the guesses of a large number of people (say, in guessing the number of jelly beans in a jar) is typically more accurate than most individual guesses. Four conditions apply. There must be: (a) true diversity of opinions; (b) independence of opinion; (c) decentralisation of experience; (d) suitable mechanisms of aggregation. Any of the rating systems (whether it's 0-5, 50-100, or A-F) meets those criteria, as long as nobody (e.g., RateBeer) mucks with the numbers.

    So I'd contend that the larger the number of reviewers, the closer the aggregate rating is to the "true" answer. And I'd even go so far as to say that an excellent Pilsner's score relative to a mediocre IPA accurately reflects the current state of affairs with respect to those styles.
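
    A small sketch of the jelly-bean averaging argument, using made-up guesses. It compares the crowd's average with each individual guess. In this toy data one guesser happens to hit the exact count, so the average beats most individuals rather than all of them.

```python
from statistics import mean


def crowd_vs_individuals(guesses, true_value):
    """Compare the averaged guess with the individual guesses.

    Returns (crowd_guess, crowd_error, share_beaten), where share_beaten is
    the fraction of individuals whose error is larger than the crowd's.
    """
    crowd_guess = mean(guesses)
    crowd_error = abs(crowd_guess - true_value)
    beaten = sum(1 for guess in guesses if abs(guess - true_value) > crowd_error)
    return crowd_guess, crowd_error, beaten / len(guesses)


# Hypothetical guesses for a jar that really holds 1000 jelly beans
guesses = [700, 850, 1000, 1200, 1300]
crowd_guess, crowd_error, share_beaten = crowd_vs_individuals(guesses, 1000)
print(crowd_guess, crowd_error, share_beaten)
```

    The averaging only cancels errors that point in different directions. If most raters share the same lean (say, toward hoppy styles), the average inherits that lean.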
  TongoRad

    TongoRad (2,034) Jun 3, 2004 New Jersey
    It reflects the attitudes of the scorers, not the beers.

    All the number crunching in the world doesn't fix GIGO.
