Sentiment analytics and its objectives
Sentiment analytics, or emotion AI, refers to the use of technologies such as Natural Language Processing (NLP), text analysis, and biometrics to systematically identify, extract, tabulate, quantify, and analyze emotional states and subjective information. Such technologies have shown immense promise in areas where gauging the mood of customers is important. Capgemini has been working in this area, and over the last year I had the chance to be involved in one such project. The project dealt with spot market forecasting, and the pilot phase, which ran from July to December 2017, returned some extremely positive results.
The aim of the project was simple: to combine traditional methods with sentiment analysis to generate buy, hold, or sell recommendations in real time. This was done over a period of nine months in the spot market. For those who might not be aware, the spot market is a classic trading market: a financial market in which commodities/inventories such as gold, silver, and iron ore are traded for immediate delivery. To deliver its trading recommendations, our engine tracked the following:
– Underlying demand and supply factors impacting the commodity price
– Time series data related to price movements and volumes traded
– Prevailing trading sentiment
All of these factors were analyzed to generate a trading outlook (positive, negative, or neutral) for the next day. Based on this outlook and the results of the price prediction model, an expert would decide whether to buy, hold, or sell the commodity.
The overall business cycle with and without automation is illustrated in the following figure:
The objective was to build a decision support system that uses automatic market sentiment analytics and predictive modeling to forecast price movements.
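To make the idea of a next-day outlook concrete, here is a minimal Python sketch of how three factor scores could be blended into a positive, negative, or neutral label. It is an illustration only, not the engine's actual logic: the `FactorScores` type, the weights, the neutral band, and the assumption that each score lies between -1 and 1 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FactorScores:
    # Each score is assumed (for this sketch) to lie in [-1, 1].
    fundamental: float
    technical: float
    sentiment: float


# Hypothetical weights and neutral band, chosen only for illustration.
WEIGHTS = {"fundamental": 0.4, "technical": 0.3, "sentiment": 0.3}
NEUTRAL_BAND = 0.1


def combined_score(scores: FactorScores) -> float:
    """Weighted average of the three factor scores."""
    return (
        WEIGHTS["fundamental"] * scores.fundamental
        + WEIGHTS["technical"] * scores.technical
        + WEIGHTS["sentiment"] * scores.sentiment
    )


def outlook(scores: FactorScores) -> str:
    """Map the combined score to a positive, negative, or neutral outlook."""
    score = combined_score(scores)
    if score > NEUTRAL_BAND:
        return "positive"
    if score < -NEUTRAL_BAND:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(outlook(FactorScores(fundamental=0.5, technical=0.2, sentiment=0.8)))
```

In a setup like this, the label would only inform the expert's decision; the final buy, hold, or sell call stays with a person, as described above.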
How was it achieved?
The ‘fundamental analysis’ was based on the usual micro-economic price indicators as well as market variables. News about output quality and quantity, forecasted demand, commodity movement, and supply-chain bottlenecks all served as inputs to the analytics engine. In addition, the engine factored in the prices of related commodities. The inherent relationship between the prices of certain commodities, such as crude oil, gold, and the US Dollar, is well known in trading circles. The exact underlying relationship may not be clear, but over long periods of time it has been seen that price movements in one of these commodities tend to affect the prices of the others in a certain manner. The engine took such trading relationships into consideration and monitored the prices of related products.
In addition to this, many traders also rely on ‘technical analysis’, an analysis of time-series data related to either product price or volumes traded. This purely statistical analysis of price movements or volumes traded is used to evaluate the level of support for a particular price level, or the level of pressure on it, so as to enable price forecasting. Most trading software on the market today uses technical analysis to make its predictions, and our engine also analyzes this time-series data.
However, for immediate and short-term predictions, one of the most important factors is the ability to read market sentiment. This is where our engine differs from others: it combines an analysis of market sentiment with technical and fundamental analysis to make its predictions. As part of the pilot project, the engine analyzed numerous English and Mandarin news sources and went through relevant information (freely available on the internet) to decipher the prevailing market sentiment. Mandarin was chosen in addition to English because China accounts for a huge proportion of world demand for the product being traded.
Only after combining all three types of analysis described above did the engine arrive at an overall market sentiment score, which was then used as one of the predictors for the next day’s price movement. The workflow of this process is illustrated by the diagram below:
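As a small illustration of what technical analysis on a price series can look like, the Python sketch below compares a short and a long simple moving average of closing prices. This post does not say which indicators our engine used, so the moving-average comparison, the function names, and the window sizes are assumptions picked purely as a familiar example.

```python
def simple_moving_average(prices, window):
    """Average of the most recent `window` prices."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return sum(prices[-window:]) / window


def trend_signal(prices, short_window=3, long_window=5):
    """Compare a short and a long moving average of the same series.

    Returns "upward" when recent prices sit above the longer-run average,
    "downward" when they sit below it, and "flat" when the two are equal.
    """
    short_avg = simple_moving_average(prices, short_window)
    long_avg = simple_moving_average(prices, long_window)
    if short_avg > long_avg:
        return "upward"
    if short_avg < long_avg:
        return "downward"
    return "flat"


if __name__ == "__main__":
    closing_prices = [10, 10, 10, 11, 12]  # made-up values
    print(trend_signal(closing_prices))
```

A signal like this says nothing about why prices moved, which is why it needs to be read alongside fundamentals and sentiment.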
This entire solution is hosted on an Azure Cloud platform with machine learning technologies and cognitive algorithms being used for the sentiment analysis. Like any other Big Data solution, the entire process, from data capture to recommendation, is completely automated with news being captured, retrieved and analyzed from multiple sources in near real-time. There were, of course, a couple of key challenges in getting such a solution to work.
The first key challenge was the heterogeneity of the news sources, all of which published their stories at different times throughout the day and at different rates. The engine therefore had to capture this information, sync the various news items with each other, and extract knowledge from the resulting mass of information.
The second critical challenge was that the various sources would broadcast a wide variety of news, from which only the critical parts needed to be extracted. Filtering is therefore a very important aspect of this solution: the engine needed to filter in real time and to update its filters dynamically in response to changing market forces. For instance, certain keywords might have a great impact on the price of a given commodity today but not the same impact a couple of months later, so the weights given to various keywords needed to be updated dynamically over time.
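One simple way to picture the syncing problem is to group news items from different sources into fixed time windows, so that stories published at different moments can be analyzed together. The Python sketch below does this under stated assumptions: the tuple layout, the one-hour window, and the source names are hypothetical, timestamps are taken to be naive datetimes in a single time zone, and this is not a description of how our pipeline was built.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bucket_news(items, window=timedelta(hours=1)):
    """Group (published_at, source, headline) tuples into fixed time windows.

    Returns a dict mapping each window's start time to the list of
    (source, headline) pairs published inside that window, ordered by time.
    """
    buckets = defaultdict(list)
    for published_at, source, headline in items:
        # Integer number of whole windows elapsed since the reference epoch.
        index = (published_at - EPOCH) // window
        bucket_start = EPOCH + index * window
        buckets[bucket_start].append((source, headline))
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    sample = [
        (datetime(2017, 7, 3, 9, 5), "wire_a", "Port congestion eases"),
        (datetime(2017, 7, 3, 9, 50), "portal_b", "Mill output revised up"),
        (datetime(2017, 7, 3, 10, 10), "wire_a", "Freight rates climb"),
    ]
    for start, stories in bucket_news(sample).items():
        print(start, stories)
```

Once items share a window, they can be scored and aggregated as one snapshot of the market, regardless of how often each source publishes.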
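The idea of keyword weights that drift over time can be sketched with an exponential moving average: each weight moves toward the most recently observed impact of its keyword, and keywords that stop mattering fade toward zero. This is an assumed technique chosen for illustration, not a statement of how our filters were updated; the function name, the `alpha` smoothing factor, and the notion of an "observed impact" score are hypothetical.

```python
def update_keyword_weights(weights, observed_impact, alpha=0.2):
    """Blend old keyword weights with newly observed impact scores.

    weights:         dict of keyword -> current weight
    observed_impact: dict of keyword -> impact measured in the latest period
                     (keywords missing here are treated as having zero impact)
    alpha:           smoothing factor in (0, 1]; higher values react faster
    """
    updated = {}
    for keyword in set(weights) | set(observed_impact):
        old_weight = weights.get(keyword, 0.0)
        new_impact = observed_impact.get(keyword, 0.0)
        updated[keyword] = (1 - alpha) * old_weight + alpha * new_impact
    return updated


if __name__ == "__main__":
    current = {"strike": 1.0, "tariff": 0.5}   # made-up weights
    latest = {"tariff": 1.0, "flood": 1.0}     # made-up impact scores
    print(update_keyword_weights(current, latest, alpha=0.5))
```

With this kind of update, a keyword that mattered a couple of months ago loses influence gradually instead of being dropped abruptly, and a newly relevant keyword gains weight as evidence accumulates.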
System in operation
These and other challenges were effectively tackled by our engine, which uses Azure Data Factory to capture information from heterogeneous sources. We also developed customized APIs in .NET and Python for unstructured data mining. The unstructured data they capture is tabulated and indexed, after which it is analyzed by the sentiment engine. Our solution had separate sentiment engines for the two languages, English and Mandarin. Each engine had its own dictionaries to identify meanings and compute sentiment scores for individual news items.
The figure below illustrates our system’s architecture as deployed for real-time operation:
We ran this solution from January 2017 to November 2017 for one of our clients, and the results were impressive. Our predictions were more than 70% accurate, and trading based on these recommendations led to a revenue increase of 1 million, which amounted to about 0.5% of the client's annual revenue. The benefits of trading based on our sentiment analysis engine have thus already been demonstrated. Today the engine works with a one-day forecasting window, and there are plans to adapt it for longer windows of a week or even a month.
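A dictionary-based sentiment engine with one lexicon per language can be sketched in a few lines of Python. The example below is a simplified illustration, not our production engine: the `LEXICONS` entries and their polarity values are invented, and the input is assumed to be already split into tokens (Mandarin text in particular needs a word-segmentation step first, which is omitted here).

```python
# Hypothetical, tiny lexicons: token -> polarity in [-1, 1].
LEXICONS = {
    "en": {"surge": 1.0, "rally": 1.0, "slump": -1.0, "plunge": -1.0},
    "zh": {"上涨": 1.0, "大涨": 1.0, "下跌": -1.0, "暴跌": -1.0},
}


def sentiment_score(tokens, language):
    """Average polarity of the tokens found in the language's lexicon.

    Returns 0.0 when no token matches, i.e. the text is treated as neutral.
    """
    lexicon = LEXICONS[language]
    hits = [lexicon[token] for token in tokens if token in lexicon]
    if not hits:
        return 0.0
    return sum(hits) / len(hits)


if __name__ == "__main__":
    print(sentiment_score(["iron", "ore", "prices", "surge"], "en"))
    print(sentiment_score(["铁矿石", "价格", "下跌"], "zh"))
```

Keeping the lexicons separate lets each language carry its own vocabulary and polarity values while the scoring logic stays shared.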
There is a wide range of potential users for our solution, with electricity distribution companies, crude oil corporations, and metal commodity brokers likely to be particularly interested in a product like this. We also have many other potential uses for this technology in mind, and I would love to discuss this engine’s applicability and scope with you. Please leave your comments or drop me an email at firstname.lastname@example.org to learn more.