Sentiment analysis: a computational method, or algorithm, for dissecting the “opinion, sentiment, and subjectivity in text” in order to form a technical examination for filtering mass content.
The latest investment tool to employ sentiment analysis draws from social media outlets. The method gained momentum when a London-based investment boutique, Derwent Capital Markets, unveiled earlier this year a $40 million hedge fund that uses the contents of Twitter as a guide for its investment analyses. In fact, the company restructured its entire business just to accommodate this practice, and re-launched in the Cayman Islands, no less. As stated by The Atlantic in May, “The world’s first social media-based hedge fund will monitor a selection of tweets in real-time to feel out market sentiment before placing its bets.”
Dubbed the “Twitter Hedge Fund,” it taps into the theory that tweets can affect stock market behavior. This phenomenon, if you will, was first studied in 2010 by Indiana University-Bloomington and University of Manchester researchers (Bollen, Mao and Zeng) in a paper titled “Twitter Mood Predicts the Stock Market.” The researchers, who received a grant from the National Science Foundation to conduct the study, claim results showing an 87.6% accuracy rate linking social climate to stock market outcomes.
In the paper, submitted for public viewing on October 14, 2010, the researchers state the following:
“Here we investigate whether measurements of collective mood states derived from large-scale Twitter feeds are correlated to the value of the Dow Jones Industrial Average (DJIA) over time. We analyze the text content of daily Twitter feeds by two mood tracking tools, namely OpinionFinder that measures positive vs. negative mood and Google-Profile of Mood States (GPOMS) that measures mood in terms of 6 dimensions (Calm, Alert, Sure, Vital, Kind, and Happy).”
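To make the positive-vs-negative measurement concrete, here is a minimal lexicon-based mood scorer in the spirit of what OpinionFinder does. The word lists and sample tweets below are invented for illustration; the actual OpinionFinder lexicon and the study’s Twitter feed are far larger.

```python
# Toy lexicon-based mood scorer. POSITIVE/NEGATIVE are illustrative
# stand-ins, not the real OpinionFinder lexicon.
POSITIVE = {"happy", "great", "calm", "hope", "win"}
NEGATIVE = {"fear", "crash", "sad", "worry", "lose"}

def mood_ratio(tweets):
    """Ratio of positive to negative word hits across one day's tweets."""
    pos = neg = 0
    for tweet in tweets:
        for word in tweet.lower().split():
            if word in POSITIVE:
                pos += 1
            elif word in NEGATIVE:
                neg += 1
    return pos / neg if neg else float("inf")

day = ["Feeling happy and calm about the market",
       "Worry and fear of a crash everywhere"]
print(round(mood_ratio(day), 2))  # 2 positive hits / 3 negative hits -> 0.67
```

A daily sequence of such ratios is what becomes the “mood time series” that is then compared against the DJIA.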
The study attempts to prove that public mood can directly affect stock market movements: prediction by causation, in other words. The explanation given by the researchers is that “recent research suggests that news may be unpredictable but that very early indicators can be extracted from online social media (blogs, Twitter feeds, etc) to predict changes in various economic and commercial indicators. This may conceivably also be the case for the stock market.” But I say: when can you call an economic indicator a stock market predictor? Is it really that easy to predict the market? If so, then why do the researchers later clarify in the report that “correlation however does not prove causation”? To be fair to the researchers, they do state afterward that “we are not testing actual causation but whether one time series has predictive information about the other or not.”
Additionally, I find that the econometric test used to examine the Twitter hypothesis does not properly fit the subject at hand. Bollen, Mao and Zeng apply Granger causality analysis, named for C.W.J. Granger, economist and winner of the 2003 Nobel Prize in Economics. As defined by Scholarpedia, “Granger causality is a statistical concept of causality that is based on prediction. According to Granger causality, if a signal X1 “Granger-causes” (or “G-causes”) a signal X2, then past values of X1 should contain information that helps predict X2 above and beyond the information contained in past values of X2 alone.”
A plainer explanation is given by John L. Daly: “Granger Causality…has been used to compare stock market prices with changes in GDP, allowing phase correlations between the two to predict future GDP based on prior stock market trends (1987 being just a mistake?). It is thus primarily a creature of econometric models, and should perhaps have remained so.” He goes on to expound that:
An example of “statistical causality” would be this. I drive to work on the highway around 8 am every morning, Monday to Friday, but not Saturday or Sunday. On exactly the same days, a torrent of traffic hits the highway about 15 minutes after I drive on it, but not on the days that I don’t drive on it. Therefore there is statistical causality here, that I cause the tide of traffic to follow me 15 minutes after my passage. That’s the kind of nonsense that Granger Causality can get you into. Even economists use it with caution as it has many hidden traps, such as the quality of input data and ignorance or oversight of other external causal variables.
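For readers who want to see the machinery itself, the Granger test boils down to an F-test: regress a series on its own past, then again with the other series’ past added, and ask whether the fit improves more than chance would allow. Below is a minimal sketch of that idea using synthetic data (the series, lag choice, and threshold are all illustrative, not the study’s actual data or code).

```python
import numpy as np

def granger_f(y, x, lag):
    """F statistic: do past values of x help predict y beyond y's own past?
    Large values are what get read as x 'Granger-causing' y."""
    n = len(y)
    Y = y[lag:]
    # Restricted model: constant plus y's own lags.
    Xr = np.column_stack([np.ones(n - lag)] +
                         [y[lag - k:n - k] for k in range(1, lag + 1)])
    # Unrestricted model: additionally include x's lags.
    Xu = np.column_stack([Xr] +
                         [x[lag - k:n - k] for k in range(1, lag + 1)])
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(Xr), rss(Xu)
    df_u = len(Y) - Xu.shape[1]
    return ((rss_r - rss_u) / lag) / (rss_u / df_u)

# Synthetic example where y truly depends on yesterday's x.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.normal()
print(granger_f(y, x, lag=1) > 10.0)  # prints True: a very large F statistic
```

Daly’s highway example fits the same template: feed in “my commute” as x and “the traffic tide” as y and the test will happily report a large F, which is precisely his point about hidden traps.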
Personally, I find the algorithmic valuations (stated previously), as well as the equation used in the research, problematic because no matter how hard one tries, one cannot predict human emotions. Granted, behaviors can be broken down into foreseeable patterns for financial analysis; nonetheless, minds change on a daily basis, and how one feels today is certainly no indicator of how one will feel tomorrow. Furthermore, I am troubled by how loosely the data source is applied to the study.
The timeline used in the analysis ranged from February 28 to December 19, 2008, plotting the number of tweets recorded each day, with the exception of weekends. However, the cross-validation points chosen to watch for significant socio-cultural events were the U.S. presidential election (November 4, 2008) and Thanksgiving Day (November 27, 2008). Well, as I recall, the 2008 U.S. presidential election was pretty significant, huh? The first African-American chosen as President of the United States; I’d say anyone would agree that that was a significant event, and one can merely use judgment in assuming that public mood was still pretty high by the time Turkey Day came around.
The researchers draw the following conclusion, accompanied by a visual table reflecting the results:
Based on the results of our Granger causality…We observe that X1 (i.e. Calm) has the highest Granger causality relation with DJIA for lags ranging from 2 to 6 days (p-values < 0.05). The other four mood dimensions of GPOMS do not have significant causal relations with changes in the stock market, and neither does the OpinionFinder time series. As can be seen in Fig. 3 both time series frequently overlap or point in the same direction. Changes in past values of Calm (t − 3) predict a similar rise or fall in DJIA values (t = 0). The Calm mood dimension thus has predictive value with regards to the DJIA. In fact the p-value for this shorter period, i.e. August 1, 2008 to October 30, 2008, is significantly lower (lag n = 3, p = 0.009)…for the period February 28, 2008 to November 3, 2008.
If, in fact, we are to go by the outcome, we would assuredly detect a correlation; but a negative, or better yet a neutral, correlation is what we will find instead. Again, if we follow the scores of the Calm mood dimension along with the DJIA, we see that if anxiety is reflected by being below zero, then naturally “zero” means no anxiety, right? Also, Bollen, Mao and Zeng go on to affirm that, “In particular we point to a significant deviation between the two graphs on October 13th where the DJIA surges by more than 3 standard deviations trough-to-peak. The Calm curve however remains relatively flat at that time after which it starts to again track changes in the DJIA.”
Did you catch that?
“…track changes in the DJIA;” by the very definition, to track means to follow movement, not to predict it.
So, now what?
With all that said, I do not disagree with time series analysis or even sentiment analysis; both have fundamental and valid uses as investment mechanisms. However, when broad implications are drawn using these tools, the findings can be hard to fathom, or even misleading, when couched in misguided terms. Case in point: the 87.6% rate is merely an indicator of the accuracy of the mechanism applied, not a result of the “Twitter” theory as implied.
Bo Pang and Lillian Lee conducted research titled “Opinion Mining and Sentiment Analysis” (2008), showing the effects of context-sensitive analysis. In one example, “‘go read the book’ most likely indicates positive sentiment for book reviews, but negative sentiment for movie reviews”; and no matter how far technology has come, applied science cannot reliably differentiate between the two.
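The Pang and Lee point can be made painfully concrete: the polarity of a phrase is only defined relative to a domain. The sketch below is an invented toy, not their method, but it shows why a single domain-blind lexicon, which is exactly what a firehose of undifferentiated tweets forces on you, gets this wrong.

```python
# Toy illustration: the same phrase flips polarity by review domain.
# The cue table is invented for illustration only.
DOMAIN_CUES = {
    "book_review":  {"go read the book": +1},   # praise for the book
    "movie_review": {"go read the book": -1},   # i.e., skip the movie
}

def score(phrase, domain):
    """Look up a phrase's polarity within a given domain; 0 if unknown."""
    return DOMAIN_CUES[domain].get(phrase.lower(), 0)

print(score("Go read the book", "book_review"))   # prints 1
print(score("Go read the book", "movie_review"))  # prints -1
```

A tweet stream, of course, arrives with no domain labels at all, which is the heart of the objection.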