Social media is now being extended beyond its original applications into a tool for predicting the future. The exponential growth in social media has helped create a large body of content that reflects the trends, experiences, evaluations, and sentiment of the marketplace. It is becoming increasingly apparent that this content can be mined and analyzed to help predict the size of markets, the outcomes of marketing campaigns, and marketing ROI. In this post I’ll take a look at three ways in which data generated by social media has been used recently for Predictive Marketing.
Predicting Movie Box Office Results with Twitter
A group of researchers at HP Labs recently published a paper describing how they used data captured from Twitter posts to predict box-office revenue at the movies. The researchers extracted 2.89 million tweets from 1.2 million users referring to 24 different movies over a period of three months. For each tweet, the timestamp, author, and tweet text were collected and used for analysis. The researchers focused on what they termed the “critical period” – the week before and the two weeks after the release of a movie.
An initial analysis of the tweets revealed that:
- The tweets built up in volume the week before the movie release; peaked at the time of the release; and fell during the two weeks following the release.
- The average number of tweets made by individuals about a particular movie was between 1 and 1.5.
- The distribution of tweets by individuals showed that a handful of individuals made many tweets; the distribution followed the “long tail” power distribution that frequently occurs on the web.
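Here is a minimal sketch of how the per-user figures above could be computed from raw tweets. The sample data, the function name `tweets_per_author`, and the (author, movie) layout are all assumptions for illustration; the researchers' actual pipeline is not described at this level of detail.

```python
from collections import Counter

# Hypothetical sample: one (author, movie) pair per tweet.
tweets = [
    ("alice", "Movie A"), ("bob", "Movie A"), ("alice", "Movie A"),
    ("carol", "Movie A"), ("alice", "Movie A"), ("dave", "Movie A"),
]

def tweets_per_author(tweets, movie):
    """Count how many tweets each author posted about one movie."""
    return Counter(author for author, m in tweets if m == movie)

counts = tweets_per_author(tweets, "Movie A")

# Average number of tweets per individual for this movie.
average = sum(counts.values()) / len(counts)

# Authors ranked from most to least active; a long-tail pattern shows up
# as a few heavy posters followed by many people with a single tweet.
ranked = counts.most_common()
```

In this toy sample one author accounts for half the tweets while everyone else posts once, which is the shape the long tail takes at small scale.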
The team then proceeded to analyze the data for predictive power. First, what didn’t prove to be very good predictors of box office success:
- Prior to the release of a movie, studios promote the film heavily via TV, print, news releases, interviews with the stars, and trailer videos. The researchers classified tweets according to whether they contained URLs, indicating that they could reference trailers, movie reviews, or other PR about the movie. It turned out that although 22% to 40% of the tweets contained URLs, such tweets were only mildly predictive of box office success.
- The percentage of retweets was in the 11%-12% range, and was even less predictive of box office success. This is surprising, given that retweets are indicative of word-of-mouth.
There were three factors that proved to be powerful predictors of box office revenue:
- The tweet rate, or the number of tweets about a given movie per hour. This is indicative of the overall attention and interest that the movie is generating. This factor is particularly important in predicting the box office revenue for the opening weekend.
- Positive sentiment about the movie. The researchers created a customized method for analyzing positive and negative sentiment about movies for the purposes of this study. Sentiment proved to be an important factor in predicting box office revenue in the weeks after the opening.
- Distribution, or the number of theaters in which the film screened. The wider the distribution, the more opportunity that existed for revenue generation.
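The first two factors above can be sketched in a few lines. This is only an illustration under stated assumptions: the sentiment labels are assumed to exist already (the researchers' customized classifier is not reproduced here), and summarizing sentiment as a positive-to-negative ratio is my own simplification. The function names and sample data are hypothetical.

```python
from datetime import datetime, timedelta

def tweet_rate(timestamps, start, end):
    """Average tweets per hour for tweets posted in the window [start, end)."""
    hours = (end - start).total_seconds() / 3600
    in_window = [t for t in timestamps if start <= t < end]
    return len(in_window) / hours

def positive_negative_ratio(labels):
    """Positive tweets divided by negative tweets; None if there are no negatives."""
    positives = labels.count("positive")
    negatives = labels.count("negative")
    return positives / negatives if negatives else None

# Hypothetical data: one tweet every 30 minutes, scored against a 4-hour window.
start = datetime(2010, 1, 1, 0, 0)
end = start + timedelta(hours=4)
timestamps = [start + timedelta(minutes=30 * i) for i in range(10)]
labels = ["positive", "positive", "neutral", "negative", "positive", "positive"]

rate = tweet_rate(timestamps, start, end)   # only the 8 tweets inside the window count
ratio = positive_negative_ratio(labels)     # 4 positive tweets versus 1 negative
```

Numbers like these, together with the theater count, would be the inputs to whatever model is fitted against actual box office revenue.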
Using these data as a predictive model, the team was able to demonstrate that they could predict opening weekend box office revenue with 97.3% accuracy. This compared favorably with a well-known prediction tool for movies, the Hollywood Stock Exchange (HSX), that had 96.5% accuracy. The results for these two techniques are shown in the graph below:
The exciting implication of this study is not this particular application of social media to predicting box office revenue, but that a model has now been developed that can use social media to predict a wide variety of outcomes, from product sales to elections. The researchers conclude:
While in this study we focused on the problem of predicting box office revenues of movies for the sake of having a clear metric of comparison with other methods, this method can be extended to a large panoply of topics, ranging from the future rating of products to agenda setting and election outcomes. At a deeper level, this work shows how social media expresses a collective wisdom which, when properly tapped, can yield an extremely powerful and accurate indicator of future outcomes.
Using Social Media to Predict Election Results
The HP Labs team’s model for forecasting outcomes from social media is one of the few that is customized, well-defined, and mathematically rigorous. However, that hasn’t stopped people from predicting outcomes from social media trends without such a powerful tool.
One of the most stunning election outcomes in the past few years was the victory of Scott Brown over Martha Coakley in the special Massachusetts U.S. Senate election to fill the seat left vacant by the passing of Ted Kennedy. Larry Kim published a blog post five days before the election forecasting the upset for Scott Brown.
At the time, conventional polls suggested that the race was too close to call, despite the fact that the Democrat Coakley had been the early front runner and had a seeming lock on the seat, given the nature of the Massachusetts electorate. However, while Coakley coasted through the campaign, the hard work and grassroots effort employed by the Brown campaign paid huge dividends. Kim’s analysis of social media trends showed that Brown had developed a huge advantage in social media presence:
- 10:1 Advantage in YouTube video views
- 4:1 Advantage in Facebook fans
- 3:1 Advantage in Twitter mentions
- 10:1 Advantage in estimated web traffic
The trend in web traffic as measured by Alexa was particularly telling:
While the data looked overwhelming, Kim, who lacked a quantitative predictive model such as the one employed by the HP Labs team, cautioned to his credit that heavy users of social media were undoubtedly a biased sample, perhaps not representative of the electorate as a whole. However, the trends were so strong that, on the basis of this data, Kim concluded that Scott Brown was headed for a victory. Five days later, Brown proved the prognostication accurate.
Using Social Media to Predict Fashion Trends
One final example of the predictive power of social media involves the notoriously fickle and unpredictable fashion world.
Luke Brynley-Jones of Our Social Times reports that Geoff Watts from Stylesignal has developed a new social media monitoring tool that helps to track new fashion trends. The tool is used to monitor the sites of opinion leaders in the fashion world. The data collected by the tool is then analyzed offline to predict fashion trends. Case studies on the site claim that StyleSignal has helped correctly predict the colors, shapes, and styles that become trend setters.
It has become increasingly clear that social media can be used to predict future events with accuracy. Now that predictive models have been developed for quantifying and measuring the accuracy of predictions, you should expect to see explosive growth in the use of social media in forecasting.
Twitter has become an increasingly popular and important tool for businesses to keep in touch with their customers. Twitter is a medium unlike any other. Each tweet has a limited life-span – if it is not read within a short time of its being posted, the chances of it ever being read drop exponentially. The constant stream of new tweets from the group of individuals each twitterer is following makes it unlikely that the tweet will be read if it is a few hours old. Few twitterers capture all of their tweets in RSS feeds, or take the time to examine all the latest tweets from more than a handful of individuals. For a business hoping to broadcast a message that is read by the most followers possible, timing is of the essence.
So then, what is the best time of day to tweet? There have been several approaches to answer this question:
- Gary McCaffrey examined the Twitter referrals to his websites and concluded that any time between 9:00AM and 3:00PM is best.
- Malcolm Coles claims that according to his survey, 4:01PM is the best time to tweet.
- Guy Kawasaki recommends posting your most important tweets 4 times, 8 to 12 hours apart.
- The Social Media Guide recommends 9:00AM Pacific Time as the single best time to tweet.
- Hubspot’s “State of the Twittersphere” report indicates that global tweets peak between 10:00 PM and 11:00 PM.
As you can see, there are a lot of different opinions about the best time to tweet. In order to develop the best answer possible to this question, I collected data over the course of several weeks for a business whose followers consist primarily of event professionals.
The data set consisted of several thousand tweets, including the username, the time and day of the tweet, and the tweet itself. For the purpose of this analysis, I assumed that the best indicator of a given twitterer’s degree of engagement was whether or not they had tweeted within a given hour. So in order to determine the best time of day to tweet, what is most important is not the number of tweets being posted at a particular time, but the number of unique users posting tweets. Here’s the data, in Eastern Time:
For this group of followers, there are actually two optimal hours to tweet – 10:00 – 11:00 AM and 12:00 – 1:00 PM. Tweets during these two hours reach 23.7% of the total number of followers, an 18% advantage over the next best time, 11:00 AM – 12:00 PM, and a 31% advantage over 1:00 PM – 2:00 PM. These increases in total available audience are highly significant to a business with thousands of followers.
Notice that, for this particular group, a tweet during the hour beginning at 9:00 AM, the beginning of Gary McCaffrey’s time window, would only reach an available audience that is two-thirds the size of that available during 10:00 – 11:00 AM and 12:00 – 1:00 PM. Malcolm Coles’s suggestion of 4:01 PM reaches an available audience that is less than half the size – only 41% – of that of the best time to tweet. Guy Kawasaki’s formula of four tweets varied over 8 – 12 hour intervals is a hit-or-miss proposition. In this particular case, the Social Media Guide is right on the money – the hour beginning at 9:00 AM Pacific/12:00 PM Eastern is best.
But does this pattern hold for every group of followers? Or does each group of followers have a unique pattern, a sort of “time fingerprint”? To answer this question, I examined a second group of followers of a CRM company. Here’s the data, once again expressed in Eastern Time:
This group is far different! The group following the CRM company is much more likely to be active during the morning hours, and is more evenly distributed over the entire day. As a result, a tweet to this group reaches a maximum of 10.8% of the total available audience, as compared to the group of event professionals, which peaked at 23.7%. The CRM group reaches its maximum at 11:00 AM – 12:00 PM, rather than the hour before or after, as in the previous case. So while a close approximation, the Social Media Guide guideline of 12:00 PM Eastern Time would for this group reach an audience 17% smaller than the peak time period of 11:00 AM – 12:00 PM.
As these two data sets demonstrate, there is no one best time to tweet for every business. Each business has a unique set of followers with their own Twitter “time fingerprint”. You have to track the habits of your own set of followers in order to determine the best single time of day for your business to tweet.
Develop this graph for your own set of followers. How much different is your group compared to these two?
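If you have your followers' tweets as (username, timestamp) pairs, a rough sketch of the calculation looks like this. The function name `hourly_reach`, the sample data, and the choice to pool all days together by hour of day are my assumptions; the timestamps are assumed to be already converted to the time zone you care about.

```python
from collections import defaultdict
from datetime import datetime

def hourly_reach(tweets, follower_count):
    """Share of followers who tweeted at least once during each hour of the day.

    tweets: iterable of (username, datetime) pairs.
    Counts unique users per hour, not the number of tweets.
    """
    users_by_hour = defaultdict(set)
    for username, posted_at in tweets:
        users_by_hour[posted_at.hour].add(username)
    return {hour: len(users) / follower_count
            for hour, users in sorted(users_by_hour.items())}

# Hypothetical sample for an account with 4 followers.
tweets = [
    ("ann", datetime(2010, 3, 1, 10, 5)),
    ("ann", datetime(2010, 3, 1, 10, 40)),   # second tweet, same user, same hour
    ("bob", datetime(2010, 3, 1, 10, 15)),
    ("cat", datetime(2010, 3, 1, 12, 30)),
    ("ann", datetime(2010, 3, 1, 12, 45)),
    ("bob", datetime(2010, 3, 2, 12, 10)),
    ("dan", datetime(2010, 3, 1, 16, 1)),
]

reach = hourly_reach(tweets, follower_count=4)
best_hour = max(reach, key=reach.get)
```

Plotting `reach` as a bar chart by hour gives you your own "time fingerprint". Because the sets hold usernames, a follower who tweets five times in one hour still counts once.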
One of the most important insights from these two examples is that at any given time, you can only reach 10% – 24% of your followers with a single tweet. In a future post, I’ll examine what percentage of a group of followers can be reached with multiple tweets.
Conferences, exhibitions, and events were the original forms of social media. In recent years attendees have been more difficult to attract, due to the rise of the Internet, the increased hassle of travel, and an economic recession. But even as event producers have struggled against these forces to maintain or grow attendance levels, they have in many cases ambitiously attempted to increase revenue by increasing prices to attend their events. And as the recent experience at a number of events demonstrates, this can be a formula for failure.
One of the most forthright and savvy publishing operations on the Web, Mequoda Daily, recently discovered that price increases can backfire. In their own words:
Mequoda Summit: Rolling Back Prices to 2009
A 3-day program for the price of 2-days
After a multi-month test we have decided to reduce the price of the Seventh Mequoda Summit. We originally tested a theory explained in today’s Mequoda Daily post. It basically consisted of our desire to add more content to this year’s Mequoda Summit, to further enhance the experience for our attendees.
So we went forth with the test. This included increasing the content of the Summit by 25%. To be able to support the time and resources spent on this additional content, we decided to increase the price by 14%.
As a result we concluded that a 25% increase in content and a 14% increase in price yielded a 38% decrease in attendance.
In turn we have ended our test, and have shared our results with all of our loyal readers. We hope that you consider our findings when planning live events in the future. We are also offering admittance to the Summit for last year’s price, which is $200 cheaper than our original offer for 2010.
This decision by Mequoda Daily is at once smart and courageous. I’m sure they saw registrations and revenue increase immediately.
Test Multiple Price Points to Determine the Best Price
The best way to determine the appropriate price for an event is by testing several different price points at the start of your marketing campaign. The graph below displays the results of a price test I recently conducted for a client at conference fees that ranged from $1,395 to $1,995. The test enabled us to determine that the best price was $1,595, which provided a projected incremental $60,000 in revenue over the next best price of $1,395, and more than $130,000 of incremental revenue over the worst outcome at $1,995.
Pricing is often a seat of the pants decision for an event producer. There are many methods that can be employed to determine the price for your conference – what your competitors charge, how many days it lasts, or how much content you have. Testing provides a way to find out directly from your customers what value they place on your product. The best strategy, and the one that will generate the most revenue, is to set your pricing through testing.
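To make the mechanics of a price test concrete, here is a minimal sketch. The three prices come from the test described above, but the prospect counts, registration counts, and audience size are hypothetical placeholders, not the client's actual results, so the dollar figures it produces will not match the ones quoted.

```python
# Prices are from the test described in the text; all counts are hypothetical.
test_cells = {
    1395: {"prospects": 1000, "registrations": 30},
    1595: {"prospects": 1000, "registrations": 29},
    1995: {"prospects": 1000, "registrations": 18},
}

def revenue_per_prospect(price, cell):
    """Revenue generated per prospect mailed in one test cell."""
    return price * cell["registrations"] / cell["prospects"]

def projected_revenue(price, cell, audience_size):
    """Scale the test cell's result up to the full marketing audience."""
    return revenue_per_prospect(price, cell) * audience_size

audience_size = 20_000  # hypothetical size of the full campaign audience

projections = {price: projected_revenue(price, cell, audience_size)
               for price, cell in test_cells.items()}
best_price = max(projections, key=projections.get)
```

The point of the comparison is that the winning price is the one with the highest projected revenue, not the one with the most registrations. In this made-up data the lowest price draws the most registrations and still loses.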
The Revenue Implications of Charging for Exhibit Attendance
Many conferences have an exhibit area that also provides a significant revenue stream. In order to maximize traffic on the exhibit floor, event management usually offers free passes to individuals who would like to visit the exhibits, but not attend conference sessions.
In some cases an event producer may decide to charge a nominal fee for passes to visit the exhibits to generate some additional revenue. This can be a major mistake.
Let’s take a look at some actual data and the revenue implications of charging for admission. In this case, the event producer decided to offer free admission to the exhibits if the attendee pre-registered, and charge a $50 fee if the attendee registered on site. This was a change in policy from the previous year, when admission was free regardless of when the attendee registered. The change in policy permits a year to year comparison that provides a dramatic illustration of what can happen when a fee is charged for exhibit attendance.
The first thing to note about the data is the significant drop in on site registrations, which declined from 397 to 103, a drop of 74%, in contrast to an increase of 81% in pre-registrations. One could assume that if the $50 fee had not been applied for on site registrations, they would also have grown by 81%. So attendance grew by 39% (to 2,067), when it should have grown by 81% (to 2,680). Actual attendance was 23% lower than it should have been.
This shortfall in attendance had a major negative impact on exhibit sales. Since the size of the exhibit floor grew by 50% (from $342,000 to $521,000), but attendance grew by only 39%, attendance growth trailed floor growth by 11 percentage points and the density of attendees on the exhibit floor decreased. For an event, the size of attendance is perceived by the density of attendees on the exhibit floor – how crowded it looks. Even though actual attendance grew by 39%, because the size of the exhibit floor grew by 50%, it looked like there were actually fewer visitors in attendance.
The effect on exhibit sales and revenue was immediate. The percentage of exhibitors who signed contracts on site to exhibit at the next conference dropped from 79% to 59%, resulting in a revenue level that was $104,200 lower than it should have been. This revenue loss far exceeded the $5,150 in revenue realized from charging a $50 on site fee for exhibits passes.
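The arithmetic behind this comparison can be worked through in a few lines. All of the inputs are figures quoted above; the variable names are mine, and the projection rests on the same assumption the text makes, namely that on site registrations would have grown at the pre-registration rate had no fee been charged. Small differences from the rounded figures in the text (2,680) come from rounding.

```python
# Figures quoted in the text.
onsite_prior, onsite_actual = 397, 103
total_actual = 2067
prereg_growth = 0.81            # growth in pre-registrations
onsite_fee = 50                 # dollars per on site exhibits pass
lost_exhibit_revenue = 104_200  # shortfall in on site exhibit contract revenue

# Pre-registrations this year are the total minus on site registrations.
prereg_actual = total_actual - onsite_actual

# Assumption from the text: without the fee, on site registrations
# would have grown at the same rate as pre-registrations.
onsite_projected = onsite_prior * (1 + prereg_growth)
total_projected = prereg_actual + onsite_projected

# How far actual attendance fell short of the projection.
shortfall = 1 - total_actual / total_projected

# Fee income versus the exhibit revenue it appears to have cost.
fee_revenue = onsite_actual * onsite_fee
net_effect = fee_revenue - lost_exhibit_revenue
```

Seen this way, the fee brought in a little over five thousand dollars and is associated with a loss roughly twenty times larger.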
This whole scenario could have been avoided by a simple price test on the exhibit pass at the start of the attendee marketing campaign. Event management would then have known the effect of the increase in price on overall attendance, and could have made the pricing decision accordingly.
It never pays to set prices first and react later. Always be testing!
While some social media enthusiasts struggle with the question of how to measure the ROI from social media, the free market is alive and well and functioning. Consider this: an unnamed celebrity was recently paid $20,000 for a single tweet to endorse a product. A company called Sponsored Tweets matches advertisers with celebrities to create sponsored conversations on Twitter. According to Ted Murphy of Izea, the company that runs Sponsored Tweets, “It was actually an incredible value for the advertiser, since the net cost per click came out to less than $.50 per click.”
Sound familiar? This is nothing more than Old School mass media advertising. Considering that there are 350 million Facebook users, 75 million Twitter users, and over 50 million LinkedIn users, it is not surprising that companies are figuring out ways to leverage the vast reach that these platforms can provide, sometimes in the most mercenary of ways.
But you don’t have to be a mass marketer to derive measurable ROI from social media. Take this example: a high tech conference, with a mere 350 Twitter followers, recently sent out a series of tweets promoting its conference. The links in each of the tweets were coded to enable tracking from a click on the link in the tweet through a completed registration. The result: $15,000 in registrations from new customers. The process can be analyzed as follows:
When you think about it, the progression from tweet to registration as illustrated above is similar to an email. Here are the parallels:
So in one instance, the case of the $20,000 tweet, Twitter is being used as mass media. In another instance, a series of tweets promoting a conference that generated $15,000 in registrations, Twitter is being used as a direct response vehicle.
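The conference example depends on links that are coded for tracking. One common way to do that, sketched here as an assumption since the original campaign's scheme is not specified, is to append campaign parameters to each link so that a registration can be traced back to the tweet that produced it. The `utm_*` parameter names follow a widely used analytics convention; the URL, campaign name, and function name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def tag_link(url, source, campaign, content):
    """Append tracking parameters to a URL, preserving any existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),      # where the click came from
        ("utm_medium", "social"),    # type of channel
        ("utm_campaign", campaign),  # which promotion
        ("utm_content", content),    # which individual tweet
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))

# Hypothetical registration page and campaign labels.
tagged = tag_link("https://example.com/register?ref=home",
                  source="twitter", campaign="conf2010", content="tweet-03")
```

Giving each tweet its own `content` value is what makes it possible to compare tweets against each other rather than only measuring Twitter as a whole.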
A lot of the confusion over measuring the ROI of social media is a result of its chameleon-like qualities. For businesses, it can be leveraged as mass media, one-to-one marketing, customer service, a business intelligence tool, a source of new product ideas, competitive intelligence, market research, and in a host of other ways. Methods of measuring the ROI of every one of these disciplines have already been established in other contexts. These methods can be adapted to measure ROI on social media. But only one in six companies measures the ROI from their social media investment today.
Your company can measure the ROI of social media, and continuously improve it. Here’s how:
- Establish clear objectives for the use of social media. As we have seen, social media can be used to achieve multiple objectives. You need to be clear about how you intend to achieve and measure the results of every one of them.
- Categorize each type of customer interaction according to the objective it will help achieve. On Twitter, for example, each tweet will fall into a different category, based on its objective.
- Develop a tracking system that enables you to measure the results of each customer interaction in comparison with the desired result.
- Analyze the results in light of your objectives.
- Optimize your strategy – choose the tactics that are providing the most ROI. Eliminate the ones that aren’t.
- Go back to step 1. Set new objectives, and start the cycle again.
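The measure, analyze, and optimize steps above might look something like this once results are tracked per objective. This is a sketch only: the objective names and cost figures are hypothetical, and the $15,000 revenue figure is borrowed from the conference example purely for illustration.

```python
def roi(revenue, cost):
    """Return on investment as a fraction: (revenue - cost) / cost."""
    return (revenue - cost) / cost

# Hypothetical tracked results, one entry per objective.
results = {
    "event registrations":   {"cost": 3000.0, "revenue": 15000.0},
    "product announcements": {"cost": 2000.0, "revenue": 1500.0},
}

# Analyze: compute ROI for each objective.
roi_by_objective = {name: roi(r["revenue"], r["cost"])
                    for name, r in results.items()}

# Optimize: keep the tactics that return more than they cost.
keep = [name for name, value in roi_by_objective.items() if value > 0]
drop = [name for name, value in roi_by_objective.items() if value <= 0]
```

Objectives such as customer service or market research will need a different measure of "revenue", for example costs avoided, but the loop stays the same.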
If you have clear objectives for your company’s use of social media, the ROI can be measured. Admittedly, some objectives may be harder to measure than others. But if you don’t measure it, you can’t improve it.
When approached in this way, I believe that social media will generate a high ROI for most companies. Social media advocates don’t need to struggle with the ROI question any longer. The ROI is there if companies approach their social media efforts the right way.
Many email service providers admit that there has been a gradual decline in open rates over the past few years. While the open rate doesn’t tell the whole story on email success, it is still vital to measure. After all, if your audience doesn’t open up your email, they have no chance to read it and respond to it.
One of the primary reasons cited for the decline is inbox clutter. According to Forrester Research, 60% of consumers believe they receive too much email. In another study, Customer Knowledge is Marketer Power, Forrester found that among marketers who believe email will be less effective in 2 years, the chief reason cited is “too much clutter in consumer inboxes.” A belief that “SPAM” will drive the decline was cited by only 59%.
Clearly, we are all becoming increasingly numb to the steady stream of email arriving in our inboxes. A second, related reason often given for the decline in open rates is the increasing effectiveness of spam filters that help manage this flood of email.
A third reason, and a significant one, is technological. The way that opens are measured is by including a tiny image (usually a 1 pixel by 1 pixel gif or jpeg) within the email. Once the images that are embedded in the email are served, the email is recorded as opened. The problem is that there are a lot of email readers that don’t automatically serve the images in an email. In fact, ExactTarget estimates that 50% of all email is now delivered to email readers that either don’t automatically render images or are unable to render images, such as Outlook, Gmail, AOL, and handheld devices such as Blackberries. Thus, there is an inherent bias in not detecting all of the opens.
If you’re running an email campaign, it’s important to know the true open rate, so you can gauge the true reach of your email message. There’s an easy way to do this. It’s based on the insight that click-throughs are always measured, even if opens aren’t. Even though the email reader may not be indicating an open, because it hasn’t rendered the images, the recipient of the email can still click on the links. That means that some recipients will be tracked as clicking through, but not opening an email. Let’s walk through an example.
Here’s the initial tracking information for an email:
Here’s how to estimate the true open rate:
- Download the list of the email addresses that have opened the email from your email service provider.
- Download the list of the email addresses that have clicked on a link in the email. Now match up the list of those who have clicked through, to see if they were tracked as opening the email. In the case above, it turns out that 105 recipients clicked a link in the email, but only 75 of them were tracked as having opened the email.
- Multiply the open rate above by the ratio 105/75. This gives an estimate of the true open rate, assuming the same click through to open ratio for the group that clicked on a link in the email, but was not tracked as having opened the email. The revised tracking information is as follows:
As you can see, because not all of the email readers render images, the estimated open rate in this case was actually 40% higher than reported. Here’s how you can use this information:
- In order to maximize your click through rates, make sure that the message in your emails does not rely on images. That way, if the recipient of your email doesn’t see the images, they can still respond to your message. As demonstrated above, this can help increase your open rates by 40% – or more.
- It’s vital to know what the real underlying trends are for your email campaigns, so you can make adjustments as necessary. You’re in a better position to know that if you monitor the estimated open rate, as described above, because it eliminates quirks in the tracking system. You need to make adjustments in your strategy based on real changes in customer behavior, rather than changes in the way email readers render images.
- With the estimated open rate, you now have a better estimate of the cumulative penetration of your message to your target audience. For example, if the reported rate shows a cumulative penetration of 33% after several emails, and you actually have a 40% higher open rate, a better estimate of your penetration is 1.4 x 33% or roughly 46%. You can then make better decisions about how to most effectively reach the rest of your target audience.
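The adjustment described above is easy to script once you have the two downloaded lists. Here is a minimal sketch: the function names and the tiny address lists are hypothetical, while the 105 and 75 counts and the 33% penetration figure are the ones used in the text. The cap at 100% is my own safeguard.

```python
def tracked_click_counts(opened, clicked):
    """Return (all clickers, clickers who were also tracked as opening)."""
    opened, clicked = set(opened), set(clicked)
    return len(clicked), len(clicked & opened)

def estimated_open_rate(reported_rate, clickers, clickers_tracked_as_open):
    """Scale the reported rate by the share of clickers whose open went undetected."""
    return min(1.0, reported_rate * clickers / clickers_tracked_as_open)

# Matching the two lists, shown on a tiny hypothetical sample.
opened = ["a@example.com", "b@example.com"]
clicked = ["a@example.com", "c@example.com"]   # c clicked but was never tracked as opening
sample_counts = tracked_click_counts(opened, clicked)

# The figures from the example in the text.
adjustment = 105 / 75
estimated_penetration = estimated_open_rate(0.33, 105, 75)
```

Running the same calculation on every campaign gives you a trend line that is not distorted by changes in how email readers handle images.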