Introduction
“They were silly enough to think you can look at the past to predict the future”. The Economist.
We will explore the vectorized back-testing of algorithmic trading strategies. The term algorithmic trading strategy describes any type of financial trading strategy that is based on an algorithm designed to take long, short, or neutral positions in financial instruments on its own, without human interference. A simple algorithm, such as “alternating every five minutes between a long and a neutral position in the stock of Apple, Inc.,” satisfies this definition. More technically, an algorithmic trading strategy is represented by some Python code that, given the availability of new data, decides whether to buy or sell a financial instrument in order to take long, short, or neutral positions in it.
long, short, neutral positions
A long—or a long position—refers to the purchase of an asset with the expectation it will increase in value—a bullish attitude.
A short position refers to a trading technique in which an investor sells a borrowed security with plans to buy it back later. Shorting is used when an investor anticipates that the price of a security will fall in the short term – a bearish attitude.
Neutral describes a position taken in a market that is neither bullish nor bearish. In other words, it is insensitive to the direction of the market's price.
We will focus on the technical aspects of the vectorized back-testing approach for a few select strategies. With this approach, the financial data on which the strategy is tested is in general manipulated as a whole, applying vectorized operations on the NumPy ndarray and pandas DataFrame objects that store the financial data. We will also focus on the application of machine and deep learning algorithms to formulate algorithmic trading strategies. For this purpose, classification algorithms are trained on historical data in order to predict future directional market movements. In general this requires transforming the financial data from real values to a relatively small number of categorical (nominal) values in order to harness the pattern recognition power of such algorithms.
Vectorized back-testing approach
The core concept of back-testing is to simulate a given trading strategy by going back in time and executing the trades as if you were in the past. The generated profits are often compared against a benchmark performance through some metrics (e.g. annualized returns). Depending on the nature of the strategy, different benchmarks and metrics are used. Caution: back-tested performance is not a reliable indicator of future performance, and back-testing should not be used as a strategy-discovery tool. Do not change your strategy parameters just because a change looks better in the back-test. You should consider the historical performance of the market as only one of many possible realizations of a random variable. Fitting your parameters to the back-test without sound economic logic will inevitably lead to overfitting.
There are two main methods for back-testing a strategy: (i) event-driven and (ii) vectorized. An event-driven back-test often involves the use of a loop that iterates over time, simulating an agent that sends orders depending on the market signals. This loop-based approach can be extremely flexible, allowing you to simulate potential delays in order execution, slippage costs, etc.
A vectorized back-test instead gathers strategy-related data and organizes it into vectors and matrices, which are then processed through linear algebra methods (optimized for computation) to simulate past performance.
Vectorized back-testing is much faster than event-driven back-testing and should be used for the exploratory stage of strategy research, whereas event-driven back-tests are better suited to later, more in-depth analysis.
We will explore:
Simple Moving Averages – an algorithmic trading strategy based on simple moving averages and how to back-test such a strategy.
Random Walk Hypothesis – an introduction to the random walk hypothesis.
Linear OLS Regression – applying OLS regression to derive an algorithmic trading strategy.
Clustering – using unsupervised learning algorithms to derive algorithmic trading strategies.
Frequency Approach – a simple frequentist approach to algorithmic trading.
Classification – using traditional supervised learning algorithms for algorithmic trading.
Deep Neural Networks – deep neural networks and how to use them for algorithmic trading.
Simple Moving Averages
Trading based on simple moving averages (SMAs) is a decades-old trading approach (see, for example, the paper by Brock et al. (1992)). Although many traders use SMAs for their discretionary trading, they can also be used to formulate simple algorithmic trading strategies. We will explore the use of SMAs by introducing vectorized back-testing of algorithmic trading strategies.
Simple Moving Averages – Data Import
We will import the financial time series for a single symbol, the stock of Apple, Inc. (AAPL.O). The analysis will be based on end-of-day data.
We calculate the SMA values for two different rolling window sizes. The Figure below shows the three time series visually.
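As an illustration, a minimal vectorized SMA back-test might look as follows. The price series here is synthetic (a hypothetical stand-in for the AAPL.O end-of-day data), and the window sizes of 42 and 252 days are assumptions for illustration only:

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices; in the original analysis these are
# end-of-day data for AAPL.O loaded from a data file.
rng = np.random.default_rng(42)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))),
                  index=pd.bdate_range('2020-01-01', periods=500),
                  name='close')

df = close.to_frame()
df['SMA1'] = df['close'].rolling(42).mean()   # shorter rolling window
df['SMA2'] = df['close'].rolling(252).mean()  # longer rolling window

# Go long (+1) when the short SMA is above the long SMA, short (-1) otherwise.
df['position'] = np.where(df['SMA1'] > df['SMA2'], 1, -1)

# Vectorized back-test: strategy return = yesterday's position
# times today's log return (no loop over time needed).
df['returns'] = np.log(df['close'] / df['close'].shift(1))
df['strategy'] = df['position'].shift(1) * df['returns']

print(df[['returns', 'strategy']].sum().apply(np.exp))  # gross performance
```

The `shift(1)` on the position column is the crucial step: it ensures that a position entered on day t only earns the return of day t+1, avoiding foresight bias.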
Random Walk Hypothesis
The previous section introduced vectorized back-testing as an efficient tool for back-testing algorithmic trading strategies. The single strategy back-tested, based on a single financial time series (historical end-of-day prices for the Apple stock), outperforms the benchmark investment of simply going long on the Apple stock over the same period. Although rather specific in nature, these results are in contrast to what the random walk hypothesis (RWH) predicts, namely that such predictive approaches should not yield any outperformance at all. The RWH postulates that prices in financial markets follow a random walk, or, in continuous time, an arithmetic Brownian motion without drift. The expected value of an arithmetic Brownian motion without drift at any point in the future equals its value today. As a consequence, if the RWH applies, the best predictor for tomorrow’s price, in a least-squares sense, is today’s price. (For a formal definition and deeper discussion of random walks and Brownian motion–based processes, refer to Baxter and Rennie (1996).)
“For many years, economists, statisticians, and teachers of finance have been interested in developing and testing models of stock price behaviour. One important model that has evolved from this research is the theory of random walks. This theory casts serious doubt on many other methods for describing and predicting stock price behaviour — methods that have considerable popularity outside the academic world. For example, we shall see later that, if the random-walk theory is an accurate description of reality, then the various “technical” or “chartist” procedures for predicting stock prices are completely without value”. Eugene F. Fama (1965)
The RWH is consistent with the efficient markets hypothesis (EMH), which, nontechnically speaking, states that market prices reflect “all available information.” A direct implication is that it is impossible to "beat the market" consistently on a riskadjusted basis since market prices should only react to new information. Different degrees of efficiency are generally distinguished, such as weak, semi-strong, and strong, defining more specifically what “all available information” entails. Formally, such a definition can be based on the concept of an information set in theory and on a data set for programming purposes, as the following quote illustrates.
“A market is efficient with respect to an information set 𝑆 if it is impossible to make economic profits by trading on the basis of information set 𝑆.” Michael C. Jensen (1978)
Using Python, the RWH can be tested for a specific case as follows. A financial time series of historical market prices is used, for which a number of lagged versions are created — say, five. OLS regression is then used to predict the market prices based on the lagged market prices created before. The basic idea is that the market prices from yesterday and four more days back can be used to predict today’s market price. The following Python code implements this idea and creates five lagged versions of the historical end-of-day closing levels of the S&P 500 stock index.
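A minimal sketch of the lag-creation step, using a synthetic price series as a stand-in for the S&P 500 closing levels (column name and number of lags as described in the text):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the S&P 500 end-of-day closing levels.
rng = np.random.default_rng(0)
data = pd.DataFrame({'price': 3000 + np.cumsum(rng.normal(0, 10, 1000))})

lags = 5
cols = []
for lag in range(1, lags + 1):
    col = f'lag_{lag}'
    data[col] = data['price'].shift(lag)  # price lagged by `lag` days
    cols.append(col)
data.dropna(inplace=True)  # drop the first `lags` rows, which contain NaN
```

Each row of the resulting DataFrame now pairs today’s price with the five preceding prices, which is exactly the shape needed for the OLS regression that follows.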
Using NumPy, the OLS regression is straightforward to implement. As the optimal regression parameters show, lag_1 indeed is the most important one in predicting the market price based on OLS regression. Its value is close to 1. The other four values are rather close to 0. Figure (next slide) visualizes the optimal regression parameter values.
𝑦 = 𝛽₀ + 𝛽₁𝑥₁ + ⋯ + 𝛽ₙ𝑥ₙ + 𝜀
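The regression can be sketched with NumPy's np.linalg.lstsq on a simulated random-walk series (a stand-in for the actual index levels; the seed, volatility, and series length are arbitrary assumptions):

```python
import numpy as np

# Simulate a random-walk price series as a stand-in for the index levels.
rng = np.random.default_rng(1)
price = 100 + np.cumsum(rng.normal(0, 1, 2000))

lags = 5
y = price[lags:]  # today's price
# Column k-1 holds the price lagged by k days, aligned with y.
X = np.column_stack([price[lags - k : len(price) - k]
                     for k in range(1, lags + 1)])

# Least-squares fit of y on the five lagged prices.
reg = np.linalg.lstsq(X, y, rcond=None)[0]
print(reg)  # for a random walk, the lag_1 coefficient dominates
```

For a random-walk series, the fitted coefficient on lag_1 comes out close to 1 and the remaining four close to 0, mirroring the result described in the text: yesterday’s price carries essentially all the predictive content.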
When using the optimal results to visualize the prediction values as compared to the original index values for the S&P 500, it becomes obvious that indeed lag_1 is basically what is used to come up with the prediction value.
Graphically speaking, the prediction line is the original time series shifted by one day to the right (with some minor adjustments).
All in all, this brief analysis has revealed some support for both the RWH and the EMH. The analysis was done for a single stock index only and uses a specific parameterization, but it can easily be widened to incorporate multiple financial instruments across multiple asset classes, different values for the number of lags, etc. In general, one will find that the results are qualitatively more or less the same. After all, the RWH and EMH are among the financial theories with broad empirical support. In that sense, any algorithmic trading strategy must prove its worth by showing that the RWH does not apply in general. A tough hurdle.
Linear OLS Regression
This section applies linear OLS regression to predict the direction of market movements based on historical log returns. To keep things simple, only two features are used. The first feature (lag_1) represents the log returns of the financial time series lagged by one day. The second feature (lag_2) lags the log returns by two days. Log returns — in contrast to prices — are stationary in general, which often is a necessary condition for the application of statistical and ML algorithms. The basic idea behind the usage of lagged log returns as features is that they might be informative in predicting future returns. For example, one might hypothesize that after two downward movements an upward movement is more likely (“mean reversion”), or, to the contrary, that another downward movement is more likely (“momentum” or “trend”). The application of regression techniques allows the formalization of such informal reasoning.
Stationary series
A stationary time series is one whose properties do not depend on the time at which the series is observed. Thus, time series with trends, or with seasonality, are not stationary — the trend and seasonality will affect the value of the time series at different times.
Most financial and economic time series are not stationary. Even when you adjust them for seasonal variations, they will exhibit trends, cycles, random walks, and other non-stationary behaviour. A variety of techniques can be used to make a non-stationary series stationary, depending on the kind of non-stationary behaviour present in the series. Two common techniques are differencing and logarithmic transformation.
A log transformation can be used to stabilize the variance of a series with non-constant variance (in Python, e.g., with NumPy's np.log() function). One limitation of the log transformation is that it can be applied only to positively valued time series. Taking a log shrinks the values towards 0: for values close to 1 the shrinking is small, and for higher values the shrinking is larger, thus reducing the variance. For negative data, you can add a suitable constant to make all the data positive before applying the transformation. This constant can then be subtracted back out to obtain the predicted (i.e., fitted) values and forecasts for future points. In finance, log returns are commonly used rather than prices or raw returns.
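A minimal example of computing log returns from a price series (the prices here are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices.
prices = pd.Series([100.0, 101.5, 99.8, 100.2, 102.0])

# Log return on day t: log(p_t / p_{t-1}); the first value is NaN and dropped.
log_returns = np.log(prices / prices.shift(1)).dropna()
```

Log returns are the first difference of the log prices, which combines both techniques above: the log stabilizes the variance, the differencing removes the trend.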
Figure (below) shows the frequency distribution of the daily historical log returns for the EUR/USD exchange rate. They are the basis for the features as well as the labels to be used.
Create the features by lagging the log returns and visualize them in combination with the returns data. Refer to the Figure below.
With the data set completed, linear OLS regression can be applied to learn about any potential (linear) relationships, to predict market movement based on the features, and to back-test a trading strategy based on the predictions. Two basic approaches are available: using the log returns or only the direction data as the dependent variable during the regression. In any case, predictions are real-valued and therefore transformed to either +1 or -1 to only work with the direction of the prediction.
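A sketch of the second approach, assuming synthetic log returns in place of the EUR/USD data: the direction data serves as the dependent variable, and the real-valued predictions are mapped to +1 or -1 before the vectorized back-test.

```python
import numpy as np
import pandas as pd

# Synthetic log-return series standing in for the EUR/USD returns.
rng = np.random.default_rng(7)
data = pd.DataFrame({'returns': rng.normal(0, 0.005, 1000)})
data['direction'] = np.sign(data['returns'])  # +1 up, -1 down

# The two lagged-return features described in the text.
data['lag_1'] = data['returns'].shift(1)
data['lag_2'] = data['returns'].shift(2)
data.dropna(inplace=True)

# OLS regression with the direction data as the dependent variable.
X = data[['lag_1', 'lag_2']].values
reg = np.linalg.lstsq(X, data['direction'].values, rcond=None)[0]

# Real-valued predictions are transformed to +1 or -1 (direction only).
data['prediction'] = np.sign(X @ reg)

# Vectorized back-test: position (from lagged features) times realized return.
data['strategy'] = data['prediction'] * data['returns']
hits = (data['prediction'] == data['direction']).mean()  # hit rate
```

Note that no extra shift is needed here: the features for day t are already the returns of days t-1 and t-2, so the prediction for day t can legitimately be paired with the realized return of day t.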
Clustering
We will apply k-means clustering to financial time series data to automatically come up with clusters that are used to formulate a trading strategy. The idea is that the algorithm identifies two clusters of feature values that predict either an upward movement or a downward movement.
Admittedly, this approach is quite arbitrary in this context — after all, how should the algorithm know what one is looking for? However, the resulting trading strategy shows a slight outperformance at the end compared to the benchmark passive investment (see Figure next slide), though the hit rates are not high.
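To make the idea concrete, here is a minimal sketch with a hand-rolled 2-means loop standing in for a library routine such as scikit-learn's KMeans; the return data is synthetic, and the long/short assignment of the two clusters is arbitrary, as the text notes:

```python
import numpy as np

# Synthetic daily log returns.
rng = np.random.default_rng(5)
returns = rng.normal(0, 0.005, 800)

# Feature matrix: one row per day with (lag_1, lag_2) returns.
X = np.column_stack([returns[1:-1], returns[:-2]])
r = returns[2:]  # same-day returns, aligned with the feature rows

# Minimal 2-means clustering: assign points to the nearest center,
# then move each center to the mean of its points, until convergence.
centers = X[rng.choice(len(X), 2, replace=False)]
for _ in range(100):
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = np.argmin(dists, axis=1)
    new_centers = np.array([X[labels == k].mean(axis=0) for k in range(2)])
    if np.allclose(new_centers, centers):
        break
    centers = new_centers

# Arbitrarily interpret cluster 0 as "long" (+1) and cluster 1 as "short" (-1).
position = np.where(labels == 0, 1, -1)
strategy = position * r  # vectorized strategy returns
```

Because the clusters carry no directional meaning by construction, one would in practice check both assignments (and the hit rate of each) before drawing any conclusion.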
Frequency Approach
Beyond more sophisticated algorithms and techniques, one might come up with the idea of just implementing a frequency approach to predict directional movements in financial markets. To this end, one might transform the two real-valued features to binary ones and assess the probability of an upward and a downward movement, respectively, from the historical observations of such movements, given the four possible combinations for the two binary features ((0, 0), (0, 1), (1, 0), (1, 1)). Making use of the data analysis capabilities of pandas, such an approach is relatively easy to implement.
Digitizes the feature values given the bins parameter.
Shows the digitized feature values and the label values.
Note: np.digitize(data, bins) with bins=[0] returns 0 where the data value is less than 0 and 1 otherwise.
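A sketch of the frequency approach on synthetic return data, combining np.digitize with a pandas groupby (column names and the bins parameter follow the text; the data itself is made up):

```python
import numpy as np
import pandas as pd

# Synthetic log returns and their direction labels.
rng = np.random.default_rng(3)
data = pd.DataFrame({'returns': rng.normal(0, 0.005, 1000)})
data['direction'] = np.where(data['returns'] > 0, 1, -1)
data['lag_1'] = data['returns'].shift(1)
data['lag_2'] = data['returns'].shift(2)
data.dropna(inplace=True)

# Digitize the two real-valued features into binary ones:
# 0 if the lagged return is negative, 1 otherwise.
bins = [0]
data['lag_1_bin'] = np.digitize(data['lag_1'], bins=bins)
data['lag_2_bin'] = np.digitize(data['lag_2'], bins=bins)

# Frequency of up/down moves for each of the four feature combinations
# (0,0), (0,1), (1,0), (1,1).
freq = data.groupby(['lag_1_bin', 'lag_2_bin'])['direction'].value_counts()
print(freq)
```

The prediction rule then simply picks, for each of the four combinations, whichever direction occurred more frequently in the historical data.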
How can ML algorithms be useful in Finance?
Many financial operations require making decisions based on pre-defined rules, like option pricing, algorithmic execution, or risk monitoring. This is where the bulk of automation has taken place so far, transforming the financial markets into ultra-fast, hyper-connected networks for exchanging information. In performing these tasks, machines were asked to follow the rules as fast as possible. High-frequency trading is a prime example. See Easley, López de Prado, and O’Hara [2013] for a detailed treatment of the subject. The algorithmizing of finance is unstoppable. Between June 12, 1968, and December 31, 1968, the NYSE was closed every Wednesday so that the back offices could catch up with paperwork. Can you imagine that? We live in a different world today, and in 10 years things will change even more, because the next wave of automation does not involve following rules, but making judgment calls.