Best-Seller Foreteller?

What if a soothsayer could tell you if your manuscript would become a best-seller? If you were a publisher, you’d hire that soothsayer, right?

Throughout the history of the publishing industry, editors and publishers had to make buy-or-reject decisions based on experience and gut feel.

Welcome to the Age of Big Data.

[Image: crystal ball, from Wikipedia]

According to an article in The Telegraph, researchers at Stony Brook University used computers to analyze writing styles and could predict whether a book would be successful with up to 84% accuracy.

Following up on that, Jodie Archer and Matthew L. Jockers wrote The Bestseller Code, a book about their algorithm (the “bestseller-o-meter”) that analyzes character, plot, setting, style, and theme to make its predictions. According to an article in BBC Culture, this strangely named algorithm is also highly accurate.

More recently, I read an article in BuiltinAustin about a company in Austin, Texas, called StoryFit that has developed its own algorithm, which it markets to publishers.

These algorithms chew on massive amounts of data (thousands of novels) and perform statistical analyses. Given training data about past novels whose success or failure is already known, the algorithm “learns,” or at least develops rules, to distinguish best-sellers from flops. You then apply the algorithm to an unpublished manuscript and get a reasonable prediction. A crystal ball for novels.
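For the curious, here is a toy sketch of that learn-then-predict loop. To be clear, this is not the Stony Brook method, the bestseller-o-meter, or anything StoryFit sells. It is a bare-bones naive Bayes classifier over word counts, and the four "novels" and their labels are sentences I invented purely for illustration. The function names (train, predict) are my own too.

```python
import math
from collections import Counter

def tokenize(text):
    # Crude tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def train(labeled_texts):
    """Count words per label from (text, label) pairs with known outcomes."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training texts
    vocab = set()
    for text, label in labeled_texts:
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}

def predict(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    scores = {}
    for label, counts in model["word_counts"].items():
        total_words = sum(counts.values())
        # Start from the prior: how common this label was in training.
        score = math.log(model["doc_counts"][label] / total_docs)
        for token in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words don't zero things out.
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented toy "training data": the texts and labels are made up for this sketch.
TRAINING = [
    ("she drove home to her family and her work", "bestseller"),
    ("he needed to know what she really wanted", "bestseller"),
    ("the ancient dragon wizard cast a very very mighty spell", "flop"),
    ("the dragon lord shouted loudly and brandished his sword", "flop"),
]

model = train(TRAINING)

# Apply the trained model to an "unpublished manuscript".
prediction = predict(model, "she wanted to know about her work")
print(prediction)
```

A real system would presumably train on thousands of full novels and use far richer features than raw word counts, but the shape is the same: learn from books with known outcomes, then score a new one.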

Could this lead to a world where publishers reject your manuscript because their algorithm said it wouldn’t sell? Or a world where authors could edit their manuscript to add in the aspects such algorithms judge to be indicative of success? Could the writing and publishing of novels be reduced to a numbers game?

Not quite yet, apparently. The Stony Brook University algorithm struggled to predict the success of books in one genre: historical fiction. Also, their algorithm “predicted” Hemingway’s The Old Man and the Sea would flop. Archer and Jockers’ bestseller-o-meter rated The Help by Kathryn Stockett as meh. Further, the novel achieving their algorithm’s highest score (The Circle by Dave Eggers) was a commercial failure.

Certainly, these artificially intelligent systems will improve and get more accurate in the coming years. They’ll identify trends in how the reading public’s tastes are changing. Maybe the algorithms will never be 100% right, and some books they reject will succeed and vice versa. Every now and then, an author tries something new and it sells well despite being unlike the norm. They do call them novels, after all.

As publishers make increasing use of tools that predict a novel’s success, and as authors begin to use similar tools to tune their manuscripts for market success, could it be that overall novel writing will improve? Will that lead to an increase in readership, a renewed clamor for books by the buying public?

I hope so. In the meantime, my new big-data algorithm has just finished analyzing all my previous blog posts, and states there is a 99% probability I’ll conclude this one by signing it—

Poseidon’s Scribe