Social media is now being extended beyond its original applications into a tool for predicting the future. The exponential growth in social media has helped create a large body of content that reflects the trends, experiences, evaluations, and sentiment of the marketplace. It is becoming increasingly apparent that this content can be mined and analyzed to help predict the size of markets, the outcomes of marketing campaigns, and marketing ROI. In this post I’ll take a look at three ways in which data generated by social media has been used recently for Predictive Marketing.
Predicting Movie Box Office Results with Twitter
A group of researchers at HP Labs recently published a paper describing how they used data captured from Twitter posts to predict box-office revenue at the movies. The researchers extracted 2.89 million tweets from 1.2 million users referring to 24 different movies over a period of three months. For each tweet, the timestamp, author, and tweet text were collected and used for analysis. The researchers focused on what they termed the “critical period” – the week before and the two weeks after the release of a movie.
An initial analysis of the tweets revealed that:
- The tweets built up in volume the week before the movie release; peaked at the time of the release; and fell during the two weeks following the release.
- The average number of tweets made by individuals about a particular movie was between 1 and 1.5.
- The distribution of tweets by individuals showed that a handful of individuals made many tweets; the distribution followed the “long tail” power distribution that frequently occurs on the web.
The team then proceeded to analyze the data for predictive power. First, what didn’t prove to be very good predictors of box office success:
- Prior to the release of a movie, studios promote the film heavily via TV, print, news releases, interviews with the stars, and trailer videos. The researchers classified tweets according to whether they contained URLs, indicating that they could reference trailers, movie reviews, or other PR about the movie. It turned out that although 22% to 40% of the tweets contained URLs, such tweets were only mildly predictive of box office success.
- The percentage of retweets was in the 11%-12% range, and was even less predictive of box office success. This is surprising, given that retweets are indicative of word-of-mouth.
There were three factors that proved to be powerful predictors of box office revenue:
- The tweet rate, or the number of tweets about a given movie per hour. This is indicative of the overall attention and interest that the movie is generating. This factor is particularly important in predicting the box office revenue for the opening weekend.
- Positive sentiment about the movie. The researchers created a customized method for analyzing positive and negative sentiment about movies for the purposes of this study. Sentiment proved to be an important factor in predicting box office revenue in the weeks after the opening.
- Distribution, or the number of theaters in which the film screened. The wider the distribution, the more opportunity that existed for revenue generation.
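As a rough illustration of the first factor, the tweet rate can be computed from the timestamps collected for each tweet. This is a minimal sketch, not the researchers' own code: the function name, the one-week window, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def tweet_rate_per_hour(timestamps, start, end):
    """Average number of tweets per hour posted in the window [start, end)."""
    hours = (end - start).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("end must be after start")
    # Count only the tweets that fall inside the window.
    in_window = sum(1 for t in timestamps if start <= t < end)
    return in_window / hours


# Hypothetical data: six tweets about one movie in the week before release.
release = datetime(2010, 1, 8)               # made-up release date
window_start = release - timedelta(days=7)   # start of the pre-release week
tweets = [window_start + timedelta(hours=h) for h in (1, 5, 30, 100, 150, 167)]

rate = tweet_rate_per_hour(tweets, window_start, release)
print(f"{rate:.4f} tweets per hour")
```

A higher rate in the week before release would, per the study, point to a stronger opening weekend.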
Using these data as a predictive model, the team was able to demonstrate that they could predict opening weekend box office revenue with 97.3% accuracy. This compared favorably with a well-known prediction tool for movies, the Hollywood Stock Exchange (HSX), that had 96.5% accuracy. The results for these two techniques are shown in the graph below:
The exciting implication of this study is not this particular application of social media to predicting box office revenue, but that a model has now been developed that can use social media to predict a wide variety of outcomes, from product sales to elections. The researchers conclude:
While in this study we focused on the problem of predicting box office revenues of movies for the sake of having a clear metric of comparison with other methods, this method can be extended to a large panoply of topics, ranging from the future rating of products to agenda setting and election outcomes. At a deeper level, this work shows how social media expresses a collective wisdom which, when properly tapped, can yield an extremely powerful and accurate indicator of future outcomes.
Using Social Media to Predict Election Results
The model the HP Labs team developed to forecast outcomes from social media is one of the few that is customized, well-defined, and mathematically rigorous. However, that hasn’t stopped people from predicting outcomes from social media trends, even if they lack such a powerful tool.
One of the most stunning election outcomes in the past few years was the victory of Scott Brown over Martha Coakley in the special Massachusetts U.S. Senate election to fill the seat left vacant by the passing of Ted Kennedy. Larry Kim published a blog post five days before the election forecasting the upset for Scott Brown.
At the time, conventional polls suggested that the race was too close to call, despite the fact that the Democrat Coakley had been the early front runner and had a seeming lock on the seat, given the nature of the Massachusetts electorate. However, while Coakley coasted through the campaign, the hard work and grass roots effort employed by the Brown campaign paid huge dividends. Kim’s analysis of social media trends showed that Brown had developed a huge advantage in social media presence:
- 10:1 Advantage in YouTube video views
- 4:1 Advantage in Facebook fans
- 3:1 Advantage in Twitter mentions
- 10:1 Advantage in estimated web traffic
The trend in web traffic as measured by Alexa was particularly telling:
While the data looked overwhelming, Kim lacked a quantitative predictive model such as the one employed by the HP Labs team. To his credit, he cautioned that heavy users of social media were undoubtedly a biased sample, perhaps not representative of the electorate as a whole. However, the trends were so strong that, on the basis of this data, Kim concluded that Scott Brown was headed for victory. Five days later, Brown proved the prognostication accurate.
Using Social Media to Predict Fashion Trends
One final example of the predictive power of social media involves the notoriously fickle and unpredictable fashion world.
Luke Brynley-Jones of Our Social Times reports that Geoff Watts from Stylesignal has developed a new social media monitoring tool that helps to track new fashion trends. The tool is used to monitor the sites of opinion leaders in the fashion world. The data collected by the tool is then analyzed offline to predict fashion trends. Case studies on the site claim that StyleSignal has helped correctly predict the colors, shapes, and styles that become trend setters.
It has become increasingly clear that social media can be used to predict future events with accuracy. Now that predictive models have been developed for quantifying and measuring the accuracy of predictions, you should expect to see explosive growth in the use of social media in forecasting.
Twitter has become an increasingly popular and important tool for businesses to keep in touch with their customers. Twitter is a medium unlike any other. Each tweet has a limited life-span – if it is not read within a short time of its being posted, the chances of it ever being read drop exponentially. The constant stream of new tweets from the group of individuals each twitterer is following makes it unlikely that a tweet will be read once it is a few hours old. Few twitterers capture all of their tweets in RSS feeds, or take the time to examine all the latest tweets from more than a handful of individuals. For a business hoping to broadcast a message that is read by the most followers possible, timing is of the essence.
So then, what is the best time of day to tweet? There have been several approaches to answer this question:
- Gary McCaffrey examined the Twitter referrals to his websites and concluded that any time between 9:00AM and 3:00PM is best.
- Malcolm Coles claims that according to his survey, 4:01PM is the best time to tweet.
- Guy Kawasaki recommends posting your most important tweets 4 times, 8 to 12 hours apart.
- The Social Media Guide recommends 9:00AM Pacific Time as the single best time to tweet.
- Hubspot’s “State of the Twittersphere” report indicates that global tweets peak between 10:00 PM and 11:00 PM.
As you can see, there are a lot of different opinions about the best time to tweet. In order to develop the best answer possible to this question, I collected data over the course of several weeks for a business whose followers consist primarily of event professionals.
The data set consisted of several thousand tweets, including the username, the time and day of the tweet, and the tweet itself. For the purpose of this analysis, I assumed that the best indicator of a given twitterer’s degree of engagement was whether or not they had tweeted within a given hour. So in order to determine the best time of day to tweet, what is most important is not the number of tweets being posted at a particular time, but the number of unique users posting tweets. Here’s the data, in Eastern Time:
For this group of followers, there are actually two optimal hours to tweet – 10:00 – 11:00 AM and 12:00 – 1:00 PM. Tweets during these two hours reach 23.7% of the total number of followers, an 18% advantage over the next best time, 11:00 AM – 12:00 PM, and a 31% advantage over 1:00 PM – 2:00 PM. These increases in total available audience are highly significant to a business with thousands of followers.
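The counting approach described above (unique users per hour, rather than raw tweet volume) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and sample tweets are hypothetical, timestamps are assumed to be already converted to Eastern Time, and a user who tweets in the same hour on several days is counted once for that hour.

```python
from collections import defaultdict
from datetime import datetime


def unique_users_by_hour(tweets):
    """Map each hour of the day (0-23) to the number of distinct users who tweeted in it."""
    users = defaultdict(set)
    for username, posted_at in tweets:
        users[posted_at.hour].add(username)   # a set ignores repeat tweets by one user
    return {hour: len(names) for hour, names in sorted(users.items())}


def share_of_followers(counts, total_followers):
    """Convert unique-user counts into a fraction of the whole follower base."""
    return {hour: n / total_followers for hour, n in counts.items()}


# Hypothetical sample: (username, timestamp in Eastern Time)
sample = [
    ("alice", datetime(2010, 3, 1, 10, 5)),
    ("alice", datetime(2010, 3, 1, 10, 40)),  # second tweet in the same hour
    ("bob", datetime(2010, 3, 1, 10, 15)),
    ("carol", datetime(2010, 3, 1, 12, 30)),
]

counts = unique_users_by_hour(sample)
shares = share_of_followers(counts, total_followers=4)
for hour, share in shares.items():
    print(f"{hour:02d}:00  {share:.0%} of followers active")
```

Counting users in a set, rather than tallying tweets, keeps one very chatty follower from inflating an hour's apparent audience.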
Notice that, for this particular group, a tweet during the hour beginning at 9:00 AM, the beginning of Gary McCaffrey’s time window, would only reach an available audience that is two-thirds the size of that available during 10:00 – 11:00 AM and 12:00 – 1:00 PM. Malcolm Coles’s suggestion of 4:01 PM reaches an available audience that is less than half the size – only 41% – of that of the best time to tweet. Guy Kawasaki’s formula of four tweets spaced 8 to 12 hours apart is a hit-or-miss proposition. In this particular case, the Social Media Guide is right on the money – the hour beginning at 9:00 AM Pacific/12:00 PM Eastern is best.
But does this pattern hold for every group of followers? Or does each group of followers have a unique pattern, a sort of “time fingerprint”? To answer this question, I examined a second group of followers of a CRM company. Here’s the data, once again expressed in Eastern Time:
This group is far different! The group following the CRM company is much more likely to be active during the morning hours, and is more evenly distributed over the entire day. As a result, a tweet to this group reaches a maximum of 10.8% of the total available audience, as compared to the group of event professionals, which peaked at 23.7%. The CRM group reaches its maximum at 11:00 AM – 12:00 PM, rather than the hour before or after, as in the previous case. So while a close approximation, the Social Media Guide guideline of 12:00 PM Eastern Time would for this group reach an audience 17% smaller than the peak time period of 11:00 AM – 12:00 PM.
As these two data sets demonstrate, there is no one best time to tweet for every business. Each business has a unique set of followers with their own Twitter “time fingerprint”. You have to track the habits of your own set of followers in order to determine the best single time of day for your business to tweet.
Develop this graph for your own set of followers. How much different is your group compared to these two?
One of the most important insights from these two examples is that at any given time, you can only reach 10% – 24% of your followers with a single tweet. In a future post, I’ll examine what percentage of a group of followers can be reached with multiple tweets.
Conferences, exhibitions, and events were the original forms of social media. In recent years attendees have been more difficult to attract, due to the rise of the Internet, the increased hassle of travel, and an economic recession. But even as event producers have struggled against these forces to maintain or grow attendance levels, they have in many cases ambitiously attempted to increase revenue by increasing prices to attend their events. And as the recent experience at a number of events demonstrates, this can be a formula for failure.
One of the most forthright and savvy publishing operations on the Web, Mequoda Daily, recently discovered that price increases can backfire. In their own words:
Mequoda Summit: Rolling Back Prices to 2009
A 3-day program for the price of 2-days
After a multi-month test we have decided to reduce the price of the Seventh Mequoda Summit. We originally tested a theory explained in today’s Mequoda Daily post. It basically consisted of our desire to add more content to this year’s Mequoda Summit, to further enhance the experience for our attendees.
So we went forth with the test. This included increasing the content of the Summit by 25%. To be able to support the time and resources spent on this additional content, we decided to increase the price by 14%.
As a result we concluded that a 25% increase in content and a 14% increase in price yielded a 38% decrease in attendance.
In turn we have ended our test, and have shared our results with all of our loyal readers. We hope that you consider our findings when planning live events in the future. We are also offering admittance to the Summit for last year’s price, which is $200 cheaper than our original offer for 2010.
This decision by Mequoda Daily is at once smart and courageous. I’m sure they saw registrations and revenue increase immediately.
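To see why the test was worth ending, it helps to work through the revenue arithmetic behind Mequoda’s numbers. The sketch below assumes revenue is simply price times attendance and that every attendee pays the same price; the function name is hypothetical.

```python
def revenue_change(price_change, attendance_change):
    """Fractional change in revenue, assuming revenue = price x attendance."""
    return (1 + price_change) * (1 + attendance_change) - 1


# Mequoda's reported test: price up 14%, attendance down 38%.
change = revenue_change(0.14, -0.38)
print(f"Revenue change: {change:.1%}")
```

Under those assumptions, 1.14 x 0.62 = 0.7068, so the higher price would have brought in roughly 29% less revenue than the old price.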
Test Multiple Price Points to Determine the Best Price
The best way to determine the appropriate price for an event is by testing several different price points at the start of your marketing campaign. The graph below displays the results of a price test I recently conducted for a client at conference fees that ranged from $1,395 to $1,995. The test enabled us to determine that the best price was $1,595, which provided a projected incremental $60,000 in revenue over the next best price of $1,395, and more than $130,000 of incremental revenue over the worst outcome at $1,995.
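One way to compare price points from such a test is to project each test cell's response rate onto the full audience and rank the cells by projected revenue. The sketch below is an assumption about how that comparison might be done, not the method used for the client. The registration counts, cell sizes, audience size, and the $1,795 price point are all made up for illustration and are not the client’s data.

```python
def projected_revenue(price, offers_sent, registrations, total_audience):
    """Scale a test cell's registrations up to the full audience and price them."""
    return price * registrations * total_audience / offers_sent


def best_price(test_cells, total_audience):
    """Return (price, projected revenue) for the cell with the highest projection."""
    projections = {
        price: projected_revenue(price, sent, regs, total_audience)
        for price, sent, regs in test_cells
    }
    winner = max(projections, key=projections.get)
    return winner, projections[winner]


# Hypothetical test cells: (price, offers sent, registrations received)
cells = [
    (1395, 5000, 50),
    (1595, 5000, 48),
    (1795, 5000, 40),
    (1995, 5000, 32),
]

price, revenue = best_price(cells, total_audience=40000)
print(f"Best price: ${price:,} with projected revenue ${revenue:,.0f}")
```

With small test cells, differences of a few registrations can be noise, so it is worth checking that the winning cell's lead is larger than normal variation before committing to a price.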
Pricing is often a seat of the pants decision for an event producer. There are many methods that can be employed to determine the price for your conference – what your competitors charge, how many days it lasts, or how much content you have. Testing provides a way to find out directly from your customers what value they place on your product. The best strategy, and the one that will generate the most revenue, is to set your pricing through testing.
The Revenue Implications of Charging for Exhibit Attendance
Many conferences have an exhibit area that also provides a significant revenue stream. In order to maximize traffic on the exhibit floor, event management usually offers free passes to individuals who would like to visit the exhibits, but not attend conference sessions.
In some cases an event producer may decide to charge a nominal fee for passes to visit the exhibits to generate some additional revenue. This can be a major mistake.
Let’s take a look at some actual data and the revenue implications of charging for admission. In this case, the event producer decided to offer free admission to the exhibits if the attendee pre-registered, and charge a $50 fee if the attendee registered on site. This was a change in policy from the previous year, when admission was free regardless of when the attendee registered. The change in policy permits a year to year comparison that provides a dramatic illustration of what can happen when a fee is charged for exhibit attendance.
The first thing to note about the data is the significant drop in on site registrations, which declined from 397 to 103, a drop of 74%, in contrast to an increase of 81% in pre-registrations. One could assume that if the $50 fee had not been applied for on site registrations, they would also have grown by 81%. So attendance grew by 39% (to 2,067), when it should have grown by 81% (to 2,680). Actual attendance was 23% lower than it should have been.
This shortfall in attendance had a major negative impact on exhibit sales. Since the size of the exhibit floor grew by 50% (from $342,000 to $521,000), but attendance grew by only 39%, attendance growth trailed floor growth by 11 percentage points, and the density of attendees on the exhibit floor decreased. For an event, the size of attendance is perceived by the density of attendees on the exhibit floor – how crowded it looks. Even though actual attendance grew by 39%, because the size of the exhibit floor grew by 50%, it looked like there were actually fewer visitors in attendance.
The effect on exhibit sales and revenue was immediate. The percentage of exhibitors who signed contracts on site to exhibit at the next conference dropped from 79% to 59%, resulting in a revenue level that was $104,200 lower than it should have been. This revenue loss far exceeded the $5,150 in revenue realized from charging a $50 on site fee for exhibits passes.
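The figures in this example can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
# On site registrations before and after the $50 fee was introduced
onsite_prev, onsite_now = 397, 103
onsite_drop = (onsite_prev - onsite_now) / onsite_prev

# Actual attendance versus what 81% growth would have produced
actual_attendance, expected_attendance = 2067, 2680
shortfall = 1 - actual_attendance / expected_attendance

# Revenue gained from the fee versus exhibit revenue lost
fee_revenue = onsite_now * 50
lost_exhibit_revenue = 104_200
net_effect = fee_revenue - lost_exhibit_revenue

print(f"On site registrations fell {onsite_drop:.0%}")
print(f"Attendance was {shortfall:.0%} below expectation")
print(f"Fee revenue ${fee_revenue:,}, net effect ${net_effect:,}")
```

Netting the two, the fee cost the event about $99,000 more than it brought in.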
This whole scenario could have been avoided by a simple price test on the exhibit pass at the start of the attendee marketing campaign. Event management would then have known the effect of the price increase on overall attendance, and could have made the pricing decision accordingly.
It never pays to set prices first and react later. Always be testing!