Cornell Computer is Better at Spotting Fake Hotel Reviews Than You Are

By Eric Limer Jul 27th, 2011, 11:51 am


If you’ve ever looked for hotel reviews online, you’ve probably come across what is called opinion spam, otherwise known as fake reviews. The problem is, you probably don’t know it, or at least can’t single out the bogus ones. That’s why researchers at Cornell have developed a computer program that can call out fake reviews with 89.8% accuracy. You might be thinking, “so what? I can totally do that.” Well, the numbers beg to differ. When Cornell pitted three human judges against a slew of hotel reviews, half of which were truthful and verified, half of which were complete fiction, and asked them to single out the phonies, the puny humans fared no better than chance.

It all comes down to this concept called “truth bias.” Basically, when you read something, you generally take it as truth until you find evidence to the contrary (makes my job easier). On the flip side, if you’re told to be on the lookout for deception, you start shadowboxing like a schizophrenic and won’t believe your own mother’s story about how fluffy the pillows were. Enter the zen quietude of the robot brain.

The program that sorts through these reviews has none of the psychological problems we have, of course, and instead focuses on some odd but interesting differences between real reviews and fake ones. For instance, real reviews tend to use more concrete nouns, while fake ones lean heavily on verbs. The liars will also do more scene-setting and talk about “vacation” or “my business trip,” while the truthful among us refer to boring real things like “the bathroom” and “lobby.” Basically, liars tend to write in a more flowery, scenic way and the truthful fellows write like Hemingway. These differences are subtle, however, and given the whole truth bias thing and how closely you have to look to get this stuff right, the computers are better at it than you ever will be.
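To make the word-choice idea concrete, here is a toy Python sketch. It is not the Cornell program: the word lists, the function names, and the "more scene-setting than concrete words" rule are all assumptions made up for illustration, seeded only from the examples quoted above.

```python
import re

# Hypothetical cue lists, seeded from the article's examples (not from the Cornell study).
SCENE_WORDS = {"vacation", "business", "trip"}
CONCRETE_WORDS = {"bathroom", "lobby"}

def cue_counts(review):
    """Return (scene_hits, concrete_hits) for a review string."""
    tokens = re.findall(r"[a-z']+", review.lower())
    scene = sum(t in SCENE_WORDS for t in tokens)
    concrete = sum(t in CONCRETE_WORDS for t in tokens)
    return scene, concrete

def looks_suspicious(review):
    """Toy rule (an assumption): flag a review when scene-setting
    words outnumber concrete, on-the-premises words."""
    scene, concrete = cue_counts(review)
    return scene > concrete

print(looks_suspicious("My husband and I loved our vacation, a perfect trip!"))
print(looks_suspicious("The bathroom was small and the lobby was loud."))
```

A rule this crude would misfire constantly on real reviews. It only shows the kind of signal being described; a real system would presumably learn its cues from labeled examples rather than a hand-written list.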

The kicker? These algorithms, while awesome, have only been validated for reviews and, to narrow the scope even further, only for hotel reviews. Still, there are applications to be had in first-pass online review screening, and a new word bank and some fine-tuning could, presumably, open the program up to spotting fake online reviews for other things. Just beware: when they finally rise up, the robots will be able to tell if you are lying. So stop pitting your Roombas against each other, because I highly doubt our robot overlords will approve.