A UAE-based financial firm with an investment fund that trades in the Forex currency markets approached Cognitro Analytics for a data analytics service that would ultimately provide insights from a sample of trading data. The organization had implemented an automated trading system and held mixed views about how the market reacts to global news. The firm asked Cognitro Analytics to explore extracting relationships from existing variables in order to build an appropriate forecasting model and ultimately predict the short-term direction of the Forex market.
The problem of currency trading is an old one, and many have attempted to provide insights through classical technical analysis. However, there is a vital role that advanced analytics and AI (Artificial Intelligence) can play in optimizing a trading strategy. The data provided covered USD-Euro trading transactions captured between 2013 and 2015, with each entry representing a five-minute interval reading of 1) the sale price of the Euro against the dollar and 2) long and short investment sentiment. Aside from predicting the Forex movement, the focus was also on pinpointing the main market events that triggered any upward or downward movement (see figure 1).
Figure 1. Project Phases
Amid the volatility of the streaming trading data, there are many hidden signals of varying significance. The first part of the project was to pass the trading signal through machine learning heuristics to isolate the sequences of events driving currency fluctuations, with no prior knowledge or assumptions. To achieve this, we applied several pre-processing and transformation techniques, including a transformation of the time series into discrete events at different instances of time, combined with the sentiment data, in order to generate a set of probabilistic rules. To generate the target variables, optimal binning through vector quantization was used to obtain three levels, L-M-H (low, medium, high), at any given moment, forming possible trigger signals (see figure 2). A C4.5 decision tree was then used as a classifier to determine an appropriate action (among a predetermined set of actions) for a given case, and a post-pruning method was applied to reduce model overfitting.
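C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below illustrates that criterion only, not the full tree or the pruning step; the function names and the toy data are hypothetical and are not taken from the project.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(feature_values, labels):
    """Information gain of a categorical feature divided by its split information."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy remaining after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: a binned price level and the action associated with it.
price_level = ["L", "L", "M", "M", "H", "H"]
action = ["buy", "buy", "hold", "hold", "sell", "sell"]
print(gain_ratio(price_level, action))
```

A feature that separates the actions perfectly, as in this toy case, gets the highest possible gain ratio, while a feature with a single value carries no information and scores zero.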
Figure 2. How to generate the target variable
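The binning step can be illustrated with a one-dimensional vector quantizer that has three codewords, trained with Lloyd's algorithm (the same iteration used by k-means). This is a minimal sketch under stated assumptions: the initial codebook, the iteration limit, the function name and the sample values are all hypothetical, and the project's actual quantizer settings are not given in the text.

```python
def quantize_lmh(values, iterations=50):
    """Label each value 'L', 'M' or 'H' using a 3-codeword 1-D quantizer."""
    lo, hi = min(values), max(values)
    # Assumed initial codebook: minimum, midpoint and maximum of the data.
    codebook = [lo, (lo + hi) / 2, hi]
    for _ in range(iterations):
        # Assignment step: each value joins the cell of its nearest codeword.
        cells = [[], [], []]
        for v in values:
            nearest = min(range(3), key=lambda i: abs(v - codebook[i]))
            cells[nearest].append(v)
        # Update step: each codeword moves to the mean of its cell
        # (an empty cell keeps its previous codeword).
        updated = [sum(c) / len(c) if c else codebook[i]
                   for i, c in enumerate(cells)]
        if updated == codebook:
            break
        codebook = updated
    codebook.sort()  # lowest codeword maps to 'L', highest to 'H'
    names = "LMH"
    levels = [names[min(range(3), key=lambda i: abs(v - codebook[i]))]
              for v in values]
    return levels, codebook

# Hypothetical price changes, not taken from the project data.
changes = [-0.9, -0.8, -1.0, 0.0, 0.1, -0.1, 0.9, 1.0, 0.8]
levels, codebook = quantize_lmh(changes)
print(levels)
```

Learning the codewords from the data, rather than fixing equal-width bins in advance, lets the L-M-H boundaries follow the distribution of the observed values.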
Two data sets were used: the training set (seen data) to build the model, and the test set (unseen data) to measure its performance. This procedure was repeated 10 times; each time, the whole dataset was divided in a 70:30 ratio for training and testing. After 200 rounds of simulation on the training part of the earlier trading cycles, our model was found to have accurately predicted the direction of the Euro/Dollar 5 days in advance, and acting on the model's recommendations produced a 15% return on investment over a period of one month in these simulations.
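The evaluation procedure, a 70:30 split repeated 10 times, could look roughly like the sketch below. Several parts are assumptions: the text does not say how each split was drawn, so random shuffling is used here (a chronological split is often preferred for time series, since shuffling can place future observations in the training set); the lookup classifier is a stand-in rather than the C4.5 tree; and the records are hypothetical.

```python
import random
from collections import Counter

def repeated_holdout(records, fit, predict, repeats=10, train_fraction=0.7, seed=0):
    """Mean test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        shuffled = records[:]
        rng.shuffle(shuffled)  # assumption: splits are drawn at random
        cut = round(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        model = fit(train)
        correct = sum(predict(model, features) == label
                      for features, label in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)

def fit_lookup(train):
    """Stand-in classifier (not C4.5): most common label per feature value."""
    by_feature = {}
    for features, label in train:
        by_feature.setdefault(features, Counter())[label] += 1
    default = Counter(label for _, label in train).most_common(1)[0][0]
    table = {f: c.most_common(1)[0][0] for f, c in by_feature.items()}
    return table, default

def predict_lookup(model, features):
    """Return the stored label, or the overall majority label if unseen."""
    table, default = model
    return table.get(features, default)

# Hypothetical records: (binned level, market direction) pairs.
records = [("L", "up"), ("M", "flat"), ("H", "down")] * 10
mean_accuracy = repeated_holdout(records, fit_lookup, predict_lookup)
print(mean_accuracy)
```

Averaging over several splits gives a steadier performance estimate than a single split, because the result depends less on which records happen to fall in the test set.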