Predictive Marketing

5 Free Ways to Archive and Analyze Your Tweets

Your Twitter stream is a moving target. After a couple of weeks, tweets disappear, unrecoverable via Twitter search. Fortunately, if you want to collect, save, and analyze Tweets, there are several alternatives that are freely available.

If you are interested mainly in saving your own tweets, using Google Reader is perhaps the best alternative. You simply need to locate the RSS feed for your Twitter account (or anyone else’s, if you’d like) and subscribe. Locate the “Browse for stuff” option under the “All Items” drop-down menu in the upper left-hand corner of the Google Reader screen, click on “Search”, and enter the username of the Twitter account. The feed will then appear in the search results. Click on “Subscribe”, and you’re ready to go! All of the tweets from the account will be saved from that point forward. This makes your archive of tweets searchable and pretty much ageless (if you don’t expect Google to be destroyed in the near future).

TwimeMachine is another alternative that will pull your older tweets into a single web page for you, starting with the most recent. However, it is restricted to your last 3,200 tweets, and you can only view 25 at a time.


Snap Bird is a more powerful way to search through tweet history. You can use it to view your old tweets dating back several years. You simply enter your Twitter username in the search box and leave the search term blank to get Snap Bird to pull up all of your old tweets. You’ll get a list of 100 tweets to start, and you can continue to go back by 100 tweets at a time.

Not only can you look at your old tweets using Snap Bird, but you can also search for any Twitter user’s timeline, any Twitter user’s favorites, your friends’ tweets, tweets that mention you, and your sent and received direct messages.


The Archivist is the best and most flexible tool for saving and analyzing tweets, structured around searches. The Archivist offers two different ways to save tweets, in an online or desktop version.

The Archivist - Online Version

The online version will archive tweets beginning from the point in time when you initiate a search. It will periodically and automatically update the search based on the amount of activity for the search term. At any time you can observe the most recent tweets and some key statistics about all of the tweets in the archive, including tweet volume over time, top users, the percentage of tweets vs. retweets, top words, top URLs, and top sources. It also offers the opportunity to make the archive public so that you can share it with colleagues.

Here’s how it works: First, you run a search using Twitter’s own search syntax (for example, from:yourusername). The Archivist then returns a list of matching tweets; your first search will return a maximum of 500. You can then save that search, which will continue to be updated until you delete it.

There are two problems with the online version. The first is that, since you don’t control when the search is updated, you may lose some of the tweets you want to archive. The second is that, due to Twitter’s terms of service, you cannot export the tweets to archive them in files on your own computer.

Both of these problems can be overcome if you download the desktop version.

The Archivist - Desktop Version

With the desktop version, you gain control over the frequency of the search updates. Once you have started a search, The Archivist will continue to monitor that search term while you leave it open, refreshing itself every ten minutes. You can save the results from your search and reopen it at a later time. Once you save the results to your file system, The Archivist will automatically save any new tweets that come in, so you only need to click save one time.

If you would like to have multiple searches going simultaneously, you can launch multiple instances of The Archivist. However, if you have too many instances of The Archivist open, you could get rate limited by Twitter.

If your search term has a lot of Twitter traffic, you should leave The Archivist running; otherwise, there is a chance you will miss some tweets. For example, if you do a search, save the results, close The Archivist, and then reopen that search the next day, there will be a gap in your archive if there have been more than 1500 tweets since the last time you ran the search.

Another convenient feature: if you would like to see the Twitter homepage for a user of a given tweet, you can click their avatar, which will launch a browser that takes you to the person’s Twitter homepage.

Most important of all, if you’d like to perform deeper data analysis, you can export The Archivist data to Excel. When you click Export To Excel, The Archivist will create a tab delimited text file which you can then open in Excel. If you are more tech savvy, you can save the data in an .xml file for further analysis.
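If you prefer a script to a spreadsheet, a tab-delimited export can also be read with a few lines of Python. The sketch below is only an illustration: the sample data and the column names (username, text, created_at) are assumptions, so check the header row of your own export and adjust accordingly.

```python
import csv
import io
from collections import Counter

# Hypothetical sample of an exported file. The column names are assumptions
# made for this illustration; check the header row of your own export.
SAMPLE_EXPORT = (
    "username\ttext\tcreated_at\n"
    "alice\tLoving the keynote #myevent\t2011-06-01 09:15\n"
    "bob\tRT @alice: Loving the keynote #myevent\t2011-06-01 09:17\n"
    "alice\tGreat panel on analytics #myevent\t2011-06-01 10:02\n"
)

def summarize(tsv_file):
    """Return (users ranked by tweet count, share of tweets that are retweets)."""
    rows = list(csv.DictReader(tsv_file, delimiter="\t"))
    users = Counter(row["username"] for row in rows)
    # Treat tweets that start with "RT @" as retweets (a simple heuristic).
    retweets = sum(1 for row in rows if row["text"].startswith("RT @"))
    retweet_share = retweets / len(rows) if rows else 0.0
    return users.most_common(), retweet_share

top_users, retweet_share = summarize(io.StringIO(SAMPLE_EXPORT))
print(top_users)
print(round(retweet_share, 2))
```

To run this against a real export, replace the `io.StringIO(...)` argument with a file opened via `open(path, newline="", encoding="utf-8")`.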

The Archivist is a perfect tool for saving a Twitter chat, creating a transcript of it, and analyzing it. If you have created a hashtag for an event, you can collect the tweets about the event to learn more about the attendees and their attitudes toward your event than you could from any survey.


Predict Market Success With Google Insights

We are awash in a sea of data. Thanks to web analytics tools, CRM systems, and social media, we have more data than ever about the behavior of customers and prospects. What is often lacking are the knowledge and skills necessary to turn this data into useful information.

Both are on display in a brilliant study conducted by Seth Stephens-Davidowitz of Harvard. As reported in the Wall Street Journal, using freely available data from Google Insights, skillful research, and clever thinking, he was able to determine that in the 2008 presidential election, racial attitudes reduced the number of votes garnered by President Obama by 3%-5%. His method of reaching this conclusion, which we’ll review here, represents techniques that can be used by all marketers in gaining insights into topics such as forecasting product demand, buying attitudes, geographical preferences, and buyer demographics.

Stephens-Davidowitz performed the study because surveys are notoriously unreliable at capturing the true racial attitudes of voters. Participants in surveys are highly likely to misreport their true attitudes due to embarrassment. Google-based measures of racial bias are more likely to accurately reflect voters’ attitudes, since people typically run Google searches online while alone. In addition, information about Google searches is available at finer geographic levels, uses data that is more recent, and aggregates information from larger samples as compared to typical surveys.

The method used in the study was as follows:

  1. Choose a search term that represents the underlying attitude. In this case, Stephens-Davidowitz used a certain well-known racial epithet beginning with “n” as the representative search term.
  2. Make sure that the term is a strong proxy for racial bias. He did this by:
    • Examining some of the output from Google Insights, which lists the top related search terms that include the word. From the list of related terms, it was clear that the searches were motivated by racial bias.
    • Verifying that Google search volumes correlate well with the demographics one would expect to search for a term. For example, the percent of a state’s residents who say they believe in God explains 65% of the variation in search volume for the word “God”. The table below gives further examples:
    • Confirming that the major potential bias in racial attitude survey data – misreporting due to embarrassment – is unlikely to significantly bias Google data. As mentioned previously, the conditions under which people search – online and likely alone – limit this concern. The following table documents substantial search volume for various terms that researchers suspect may be under-reported in surveys.
  3. Use Google Insights to determine the geographic variation in the use of this term in searches. Quite a wide variation was found by media market; in the accompanying map, markets with high racial bias have darker colors.
  4. Estimate how this bias translated into votes. Stephens-Davidowitz arrived at a first estimate by using linear regression to compare voting results by media market in the Obama – McCain election with results in the Kerry – Bush election.
  5. Verify that the estimate of racial bias is a strong predictor of the difference in voting patterns between the two elections. To do this, he added variables to his analysis that are known to affect voting outcomes.

Stephens-Davidowitz concludes that:

Estimating the effect of racial animus on voting is complicated by surveyed individuals’ propensity to misreport socially unacceptable attitudes. This paper sidesteps surveys using area-level Google search data and administrative voting records. I find that racial animus played a major role in the 2008 election. Relative to the attitudes of the most tolerant area, racial animus cost Obama 3 to 5 percentage points of national popular vote.

More details are offered in the full study, The Effects of Racial Animus on Voting: Evidence Using Google Search Data. The method described here can be used in a host of marketing applications, including forecasting product demand, buying attitudes, geographical preferences, and buyer demographics.
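For marketers who want to try the regression step on their own data, here is a minimal sketch of a one-variable linear regression in plain Python. The numbers are synthetic and invented purely for illustration; they are not taken from the study, and the variable names are assumptions. The function returns the slope, the intercept, and R², the share of variation explained (the same kind of figure as the 65% quoted in the “God” example above).

```python
# Synthetic, made-up numbers for illustration only; NOT data from the study.
# search_index: relative search volume for a proxy term in each market (0-100)
# vote_change: change in vote share between two elections, in percentage points
search_index = [20, 35, 50, 65, 80, 95]
vote_change = [6.1, 5.2, 4.6, 3.3, 2.9, 1.8]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

slope, intercept, r_squared = fit_line(search_index, vote_change)
print(f"slope={slope:.3f}, intercept={intercept:.2f}, R^2={r_squared:.2f}")
```

A negative slope here would mean that markets with a higher search index saw a smaller gain in vote share. Adding further control variables, as in the final step of the method, requires multiple regression, which is better handled by a statistics package.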


On Twitter, Timing is Everything

Twitter is a medium of the moment. The life-span of a tweet is exceedingly short. If a tweet is not read quickly after being posted, chances are that it won’t be read at all. The lifetime of a tweet appears to be social media’s answer to the mayfly.

In one of my previous posts, I examined the question “When is the best time of day to tweet?”. It turned out that there was no one universal answer to that question. The best time to tweet depended on what time of day your particular set of followers was active on Twitter. Recent evidence regarding Twitter usage patterns illustrates exactly how important it is to time your tweets so that you are reaching as large an audience as possible.

So what is the effective lifetime of a tweet? Sysomos, a leading provider of social media monitoring and analytics technology, analyzed 1.2 billion tweets to find out how many of them generated some sort of reaction. The key points from the Sysomos analysis:

  • 92.4% of all retweets happen within the first hour of the original tweet being published. Thus, if your Tweet is not retweeted in the first hour after it is posted, it probably won’t be.
  • 96.9% of @ replies happen within the first hour of the original tweet being published
  • 23% of tweets generate replies, while 6% generate retweets.
  • Of all tweets that generated a reply, 85% have only one reply. Another 10.7% attracted a reply to the original reply – the conversation was two levels deep. Only 1.53% of Twitter conversations are three levels deep.

The following graph summarizes these important findings:

Like many things in life, on Twitter, timing is everything. If you want your message to be read, to engage your audience, and to be retweeted, you need to know when your followers are online. Every group of followers is different in terms of the periods of peak activity during the day. Remember that:

  • A single tweet will only reach a fraction of your followers.
  • By analyzing the times during which your followers tweet, it is possible to develop a strategy to predict the percentage of your followers that you can reach with multiple tweets.
  • It is also possible to determine the best times of day for multiple tweets. Note that the multiple tweets don’t necessarily have to take place during one day; they can be spread out over several days so as not to annoy your most attentive followers.
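The idea of predicting reach from your followers’ activity can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the follower names, their active hours, and the simplification that a follower sees a tweet if it is posted during an hour in which that follower is usually active.

```python
from itertools import combinations

# Hypothetical data: for each follower, the hours of the day (0-23) in which
# they are typically active, e.g. inferred from when they themselves tweet.
active_hours = {
    "ann": {8, 9, 12},
    "ben": {12, 13},
    "cat": {20, 21},
    "dan": {9, 20},
    "eve": {22},
}

def reach(tweet_hours, followers):
    """Fraction of followers active during at least one of the tweet hours."""
    posted = set(tweet_hours)
    reached = [name for name, hours in followers.items() if hours & posted]
    return len(reached) / len(followers)

def best_hours(followers, n_tweets):
    """Try every combination of n_tweets hours and return the one with the
    highest reach. Exhaustive search is fine for a handful of tweets per day."""
    candidates = sorted(set().union(*followers.values()))
    return max(combinations(candidates, n_tweets),
               key=lambda combo: reach(combo, followers))

hours = best_hours(active_hours, 2)
print(hours, reach(hours, active_hours))
```

With this toy data, no single hour reaches more than two of the five followers, while two well-chosen hours reach four of them, which is the point of spreading a message over multiple tweets.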

A Vital New Marketing Metric: The Network Value of a Customer

New research is underscoring the influence of social networks in marketing. Researchers at Telenor, a mobile phone carrier in Scandinavia, developed a map of social connections based on calling patterns between subscribers to analyze the adoption of the iPhone since 2007. The research showed that an individual with just one iPhone-owning friend was three times more likely to own one themselves than someone whose friends had no iPhones. Individuals with two friends who had iPhones were more than five times as likely to have purchased an iPhone.

What is groundbreaking about this research is not the realization that friends and colleagues influence what you buy, but the unprecedented ability in today’s connected world to track, measure, and quantify the effects of social influence. This newfound capability calls for a dramatic overhaul of the way that businesses determine the value of their customers.

Time evolution of the iPhone adoption network. One node represents one subscriber. Node color represents iPhone model: red=2G, green=3G, yellow=3GS. Node size, link width, and node shape (attributes which are visible in Q3 2007) represent, respectively, internet volume, weighted sum of SMS and voice traffic, and subscription type. Round nodes represent business users, while square nodes represent consumers. Source: Product Adoption Networks and Their Growth in a Large Mobile Phone Network.

The Lifetime Value of a Customer

Determining the lifetime value of a customer has long been the starting point for calculating the ROI of a marketing campaign. The lifetime value of a customer is defined as the net present value of the profit a business will realize on the average new customer over a period of years from that customer’s purchases. This number is critical, because it indicates exactly how much it is worth to acquire a given customer. Armed with this information, a business can manage its marketing programs not as an expense, or for short-term profits, but as a long-term business investment.
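As a rough illustration of that definition, the net present value can be computed by discounting each year’s expected profit back to today. The profit figures and the 10% discount rate below are hypothetical, and the function name is just a label for this sketch.

```python
def lifetime_value(annual_profits, discount_rate):
    """Net present value of the profit expected from one average customer.

    annual_profits[0] is the expected profit in year 1, annual_profits[1] in
    year 2, and so on. Each figure should already reflect the chance that the
    customer is still buying in that year.
    """
    return sum(
        profit / (1 + discount_rate) ** year
        for year, profit in enumerate(annual_profits, start=1)
    )

# Hypothetical customer: $100 expected profit in year 1, then $80, then $60,
# discounted at an assumed 10% per year.
ltv = lifetime_value([100, 80, 60], discount_rate=0.10)
print(round(ltv, 2))
```

The result is the most you could spend to acquire such a customer and still break even over the period considered.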

A New Metric – The Network Value of a Customer

As the research on iPhone adoption illustrates, with the rise in the popularity of social networks, it has become increasingly clear that the true value of a customer goes beyond how much he or she might buy from you directly. Traditional measures of customer value ignore the influence a customer may have on how much others buy. For example, if a customer buys your product, and then, based on his recommendation, three of his colleagues buy your product as well, his effective value to you has quadrupled. On the other hand, if a prospect makes his decision based purely on what others tell him about your product, you will be better off spending your marketing dollars on his colleagues.

The implication for marketers is that the lifetime value of a customer can no longer be considered to capture the true value of a customer. Advances in the understanding of how social influence affects purchase decisions have led to the creation of a new metric – the network value of a customer. The network value of a customer is the expected increase in sales to others that results from marketing to that customer.

The Factors That Determine The Network Value of a Customer

Which customers have a high network value? There are few businesses that have access to the kind of data that the Telenor researchers had at their disposal – billions of call records. However, by considering the characteristics of customers that have a high network value, there is data that you can collect that will begin to help you identify and target the customers that you have with the highest network value. The customers with high network value share these common characteristics:

  1. They are highly satisfied with your product
  2. They are highly likely to recommend your product to others
  3. They are highly connected to other potential buyers
  4. They are highly influential opinion leaders

How to Target Customers With High Network Value

Even if you don’t have access to billions of records detailing the social connections and behavior of your customers, like the researchers at Telenor, there is data that you can easily collect about your customers that can help you target those with the highest network value. Ways to collect it include:

  • Collect a Net Promoter Score from each customer – The metric is simple to collect and straightforward to determine, as described on

By asking one simple question — How likely is it that you would recommend [Company X] to a friend or colleague? — you can track these groups and get a clear measure of your company’s performance through its customers’ eyes. Customers respond on a 0-to-10 point rating scale and are categorized as follows:

  • Promoters (score 9-10) are loyal enthusiasts who will keep buying and refer others, fueling growth.
  • Passives (score 7-8) are satisfied but unenthusiastic customers who are vulnerable to competitive offerings.
  • Detractors (score 0-6) are unhappy customers who can damage your brand and impede growth through negative word-of-mouth.


With this one metric you can capture the first two characteristics of a customer with high network value – they 1) have a high level of satisfaction with your product, and 2) are likely to recommend it to others.
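The overall Net Promoter Score is conventionally reported as the percentage of promoters minus the percentage of detractors. Here is a minimal sketch using the score bands quoted above; the survey responses are made up for illustration.

```python
def categorize(rating):
    """Map a 0-10 rating to the categories quoted above."""
    if rating >= 9:
        return "promoter"
    if rating >= 7:
        return "passive"
    return "detractor"

def net_promoter_score(ratings):
    """Percent promoters minus percent detractors, on a -100 to 100 scale."""
    if not ratings:
        raise ValueError("need at least one rating")
    labels = [categorize(r) for r in ratings]
    promoters = labels.count("promoter")
    detractors = labels.count("detractor")
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical survey responses from ten customers.
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
nps = net_promoter_score(ratings)
print(nps)
```

Keeping the individual ratings, rather than only the overall score, is what lets you flag each customer as a likely promoter when scoring network value.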

  • Collect social network information about your customers – many companies are starting to ask customers for their Twitter and/or Facebook usernames, in addition to other contact information such as email address. The very fact that a customer is willing to give you this information is an excellent indicator that the customer is actively involved with your product. In addition, it allows you to invite them to follow/friend you on Twitter and Facebook. Also, in the case of Twitter, it allows you to follow them and collect vital publicly available information about them, such as how many friends and followers they have, how many tweets they have made, and their bio. This will give you a measure of the third characteristic of high network value customers – how highly they are connected to other buyers.
  • Perform a social network analysis of your Twitter and Facebook followers – you can analyze your own Facebook and Twitter followers to determine which customers:
    • have the highest number of connections
    • are most likely to pass key marketing messages along to their followers
    • have the highest influence and are opinion leaders

This information allows you to fill in the final piece of information you need to get a handle on the network value of a customer – the fourth criterion, whether they are highly influential and an opinion leader. Now you’re ready to start testing and scoring groups of customers according to their network value.

Optimize Your Marketing Programs

Clearly, ignoring the network value of a customer may lead to suboptimal marketing decisions. By collecting the information you need to assess the network value of your customers, you can now model both the likelihood that a given customer will buy from you and the influence that customer has on others’ buying decisions. Then you can select a subset of your customers and determine not just how much they will buy from you, but the total amount of revenue that they might generate from their influence over others. This enables you to determine the optimal set of customers to market to that will generate the highest ROI.
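One simple way to put this into practice is to score each customer by the expected value of marketing to them and compare that with the cost of the contact. The sketch below is a simplification under stated assumptions: the customers, probabilities, and dollar figures are hypothetical, and each customer is scored independently, ignoring overlap between the people they influence.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    name: str
    buy_probability: float  # chance the customer buys if you market to them
    lifetime_value: float   # profit from the customer's own purchases
    network_value: float    # expected extra profit from others, as a result
                            # of marketing to this customer

def expected_value(c):
    """Expected profit from marketing to this customer, including influence."""
    return c.buy_probability * c.lifetime_value + c.network_value

def customers_worth_targeting(customers, cost_per_contact):
    """Names of customers whose expected value exceeds the contact cost,
    highest expected value first."""
    ranked = sorted(customers, key=expected_value, reverse=True)
    return [c.name for c in ranked if expected_value(c) > cost_per_contact]

# Hypothetical customers. B rarely buys directly but influences many others.
customers = [
    Customer("A", buy_probability=0.30, lifetime_value=200, network_value=10),
    Customer("B", buy_probability=0.05, lifetime_value=200, network_value=150),
    Customer("C", buy_probability=0.10, lifetime_value=100, network_value=0),
]
targets = customers_worth_targeting(customers, cost_per_contact=25)
print(targets)
```

Ranked on direct purchases alone, customer B would look like the weakest prospect; once network value is included, B comes out on top.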


How Analytics is Revolutionizing Audience Development

I was recently invited to speak at the IAEE meeting in Boston to shed some light on how analytics can be used to increase attendance at events. In recent years event producers have found it more difficult to attract attendees, due to the rise of the Internet, the growing inconvenience of travel, and an economic recession. As event producers have struggled against these forces, they have in many cases not yet taken advantage of analytic techniques such as data mining, CRM, web analytics, social network analysis, and test and learn strategies to grow attendance levels. My session explored how to apply analytic techniques to radically improve the results of audience development campaigns. I have used these techniques on over 100 conferences, trade shows, and special events to achieve significant increases in attendance.

The topics I covered included the following:

The Lifetime Value of a Customer – A discussion of how to determine the lifetime value of a conference attendee is followed by an examination of the much more difficult question of how to determine the lifetime value of an exhibit attendee. These attendees usually attend at no charge, and usually generate revenue only indirectly by attracting exhibitors and sponsors. In addition, I review an example of how knowledge of the lifetime value of an attendee can be crucial in decision making.

Closed Loop Marketing – A closed loop marketing system allows event managers to measure the results of all the various components of their audience development programs.  With accurate measurement of program results, they can accurately gauge the ROI of marketing programs, run controlled tests to optimize ROI, and identify key leverage points.

Email Optimization – Email is the keystone of many audience development programs. It is vital to optimize the revenue and response generated by email marketing through a comprehensive testing program. Properly done, email optimization can improve response by 50% or more, and in some cases double or even triple response. The presentation provides examples of how to identify key email test elements, implement carefully designed tests, and analyze the results.

Customer Profiling – Using the information about attendees collected during the registration process, prospects can be targeted with increased accuracy, and the results of marketing programs can be markedly improved.

Predictive Modeling – Moving beyond simple customer profiling, models can be developed that accurately predict which customers are likely to respond to promotions, and which customers are likely to defect. A case study is included on how predictive modeling helped triple conference revenue.

Segmentation Analysis – A highly effective way to identify which customers will respond to which promotions. Event managers can create custom-tailored marketing messages that address the needs of each segment to increase response, lower the cost of customer acquisition, increase retention, and increase cross-sales, up-sales, and referrals. An example of how a segmented campaign increased response by 20% is reviewed.

Web Site Optimization – Small increases in conversion rates can produce a dramatic increase in registrations. An example of how minimizing abandonment rates during the registration process helped increase registrations by 30% is discussed.

Social Media Optimization –  Analytics can help event producers amplify the results of their audience development campaign through the optimal use of social media. By mining social networks to identify influential customers and prospects, adding social media profiles to the CRM system, and using predictive modeling to target high probability prospects, an event increased attendance by 30%.
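Several of the topics above, email optimization in particular, depend on analyzing the results of a controlled test. As an illustration only, here is a sketch of one common way to check whether the difference in response between two email versions is larger than chance would explain: a two-proportion z-test. The counts are invented, and this is a standard textbook technique rather than the specific method covered in the presentation.

```python
import math

def two_proportion_z_test(responses_a, sent_a, responses_b, sent_b):
    """Two-sided z-test for a difference between two response rates.

    Returns (z, p_value). A small p_value suggests the difference between
    version A and version B is unlikely to be due to chance alone.
    """
    rate_a = responses_a / sent_a
    rate_b = responses_b / sent_b
    pooled = (responses_a + responses_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical test: version A got 200 registrations from 10,000 emails,
# version B got 260 registrations from 10,000 emails.
z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z={z:.2f}, p={p_value:.4f}")
```

The test assumes recipients were assigned to the two versions at random and that both versions were sent under the same conditions.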

As more event producers take advantage of these analytics techniques, they’ll be able to attract more and better qualified attendees to their events. Face-to-face meetings, the original channel of social media, will remain a vital method of marketing. Here are the slides from the presentation:
