Summary
- Investment firms and consultants are using new technologies like Big Data and Artificial Intelligence (AI) to analyse massive data sets and gather business intelligence.
- The source of information or data is more important than the data itself.
- Data miners take discrete data from various sources, try to find correlations and interpret them based on some assumptions. The big question is, how reliable is the interpretation?
Most people associated with the investing world consider the stock market unpredictable and erratic. Traders and investors combine past data with current financial data to develop models for predicting the movement of a particular stock or industry.
Data is the new Oil
Any informed investor knows the value of data in this erratic market. Data collection is in itself a vast and tiring task. Analysts around the world gather and analyse the various data sets available and build models to forecast the behaviour of a market.
The revolution in information technology has enabled people to access data and information with ease. So what is it about data mining that makes it special and a highly paid career option? Once the data is available, how does one know whether it is fool’s gold or something valuable?
Let us discuss the various sources of data & information
The source of the data matters most. Investors and traders therefore first check the source of information to verify the authenticity of the data. Some of the popular sources of free data are discussed below:
- Search Engine
Most traders and investors use search engines to access information available on the internet, as this has become the most convenient way of finding information anywhere. Google Trends and the Baidu Index, which track search interest over time, are among the best-known search-based indicators in the investing world.
Analysts use these indexes to unearth relationships between investor attention, price volatility, market efficiency and stock returns. Search engines also keep track of users’ search queries, which can be used to relate the most-searched queries to market behaviour.
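As a rough illustration of how a relationship between investor attention and stock returns might be examined, the sketch below correlates a weekly search-interest series with the following week's return. The numbers are made up for illustration, and the function names `pearson` and `lagged_correlation` are hypothetical helpers, not part of any search engine's tooling.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(attention, returns, lag=1):
    """Correlate attention in week t with returns in week t + lag."""
    if lag < 0 or lag >= len(attention):
        raise ValueError("lag must be between 0 and len(attention) - 1")
    if lag == 0:
        return pearson(attention, returns)
    return pearson(attention[:-lag], returns[lag:])


# Made-up weekly figures, for illustration only (not real market data).
search_interest = [40, 55, 62, 48, 70, 85, 66, 52]
weekly_return_pct = [0.5, -0.2, 1.1, 0.9, -0.4, 1.6, 2.0, 0.3]

print(round(lagged_correlation(search_interest, weekly_return_pct, lag=1), 2))
```

A coefficient computed this way only describes the sample it was computed on. With a handful of weekly observations it says very little about whether the relationship will hold in future.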
- Social Media
Social media platforms are the fastest medium of information propagation. News or information often spreads first on popular social media platforms, where investors then exchange their ideas and views on it. Often, people who are new to the market read blogs and social media updates to get information.
Twitter, Facebook and YouTube are some well-known social media platforms where investors share information. People whose videos are the most watched on YouTube and Facebook are regarded as influential, and novice investors are often observed to invest their money based on such opinions.
Nearly 200 million tweets about 30 NASDAQ stocks have been used to develop an investor sentiment index.
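The article does not say how such a sentiment index is constructed. One simple possibility, shown here purely as a sketch, is to label each message as bullish, bearish or neutral using a small word list and then take the balance of bullish over bearish messages. The word lists, the sample messages and the formula `(bullish - bearish) / (bullish + bearish)` are all assumptions made for illustration.

```python
import re

# Hypothetical word lists; a real index would need a far richer vocabulary.
BULLISH = {"buy", "bullish", "rally", "upgrade", "beat"}
BEARISH = {"sell", "bearish", "crash", "downgrade", "miss"}


def classify(text):
    """Return 1 for a bullish message, -1 for bearish, 0 for neutral."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    score = len(words & BULLISH) - len(words & BEARISH)
    return (score > 0) - (score < 0)


def sentiment_index(messages):
    """Balance of bullish over bearish messages, in the range [-1, 1]."""
    labels = [classify(message) for message in messages]
    bullish = labels.count(1)
    bearish = labels.count(-1)
    if bullish + bearish == 0:
        return 0.0  # no opinionated messages at all
    return (bullish - bearish) / (bullish + bearish)


# Invented sample messages, not real tweets.
sample = [
    "Time to buy, this rally has legs",
    "Analyst downgrade incoming, I would sell",
    "Earnings beat expectations, bullish",
    "Heading to lunch",
]

print(round(sentiment_index(sample), 2))
```

Keyword matching of this kind misses sarcasm, negation and context, so the resulting number is only as good as the messages and word lists that feed it.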
- Forums
Various online stock forums are available to track stock news and updates, along with reports from their in-house analysts. These forums can be used to look up stock-related information and improve one’s understanding. Often, they act as a good platform for crowdsourcing information.
- News
News may appear in print media, digital media or dedicated stock market news outlets. Investors and traders try to keep up with such news, as it is believed to come directly from company sources or from experts on the topic.
Digital news publishers can use Search Engine Optimisation (SEO) to reach their target audience much more efficiently.
Now you have the Data, what to do with it?
Most data miners try to find links between discrete data sets that look good. Sometimes they succeed in finding a sound interpretation of the data, while at other times wrong data leads to investors losing money.
Data Miner figuring out links between different sets of data (Image source - © Kalkine Group 2020)
Suppose there is a hypothesis that a particular stock may rally. Data miners will go all gaga over the idea and gather all the data that supports the hypothesis instead of testing the hypothesis in the first place. It is like putting the wagons before the rail engine and expecting high-speed transit.
With data of all sorts arriving at such high volume and speed, no one has the time to evaluate each item and deduce logical and rational correlations. Most of the time, if a correlation looks good, people start building their answers around it. It is exactly the opposite that needs to be done.
Hypotheses and conclusions should be tested against the data, not fitted around a correlation that happens to look good. Various media outlets and online news forums bombard investors with a lot of information. Lack of time and understanding drives people to take shortcuts and thus make grave mistakes.
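The danger of building answers around a good-looking correlation can be shown with a small simulation. The sketch below generates a random "target" series and many unrelated random "signals", picks the signal that correlates best with the target over a first period, and then checks the same signal over a second period. Everything here is synthetic noise, and the function name `mine_best_signal` and its parameters are hypothetical choices for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mine_best_signal(seed=0, n_candidates=200, n_in=20, n_out=20):
    """Search pure noise for the signal that best 'explains' a noise target.

    Returns (index, in_sample_r, out_of_sample_r) for the candidate with the
    largest absolute in-sample correlation.
    """
    rng = random.Random(seed)
    total = n_in + n_out
    target = [rng.gauss(0, 1) for _ in range(total)]
    best = None
    for index in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(total)]
        r_in = pearson(candidate[:n_in], target[:n_in])
        if best is None or abs(r_in) > abs(best[1]):
            r_out = pearson(candidate[n_in:], target[n_in:])
            best = (index, r_in, r_out)
    return best


index, r_in, r_out = mine_best_signal()
print(f"best of 200 noise signals: candidate {index}")
print(f"in-sample correlation:     {r_in:+.2f}")
print(f"out-of-sample correlation: {r_out:+.2f}")
```

Because the winning signal is selected from many candidates, its in-sample correlation tends to look impressive even though every series is unrelated noise. The out-of-sample figure is the more honest check, which is why a hypothesis should be stated first and then tested on data that was not used to find it.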
The best way to avoid data misinterpretation is to develop the skills to mine accurate data from the original source. People should follow company reports, such as annual or quarterly reports, to understand the financial condition of the company.
Most companies are required to file operational updates with the stock exchanges on a regular basis. These are considered the primary source of information. One should also keep track of global economic and business conditions to analyse the sector in which one plans to invest hard-earned money.
A famous adage comes in handy for many investors before they plunge into the world of data analysis: ‘Garbage in, garbage out’.