Optimal High-Frequency Market Making


Fortunately, stochastic control theory helps to handle this kind of optimization problem by seeking an optimal strategy that maximizes the trader’s objective function and addresses the dyadic problem of high-frequency trading. The theory encourages the study of optimization in financial markets because it makes it possible to solve complex optimization problems whose constraints are consistent with the price dynamics while managing inventory risk. To find the optimal quotes in the market, it is therefore necessary to solve the nonlinear Hamilton-Jacobi-Bellman equation of the corresponding optimal stochastic control problem. This is generally achieved by applying root-finding algorithms that can handle the complexity and high dimensionality of the equation.

If w and k were different for Gen-AS and Alpha-AS, it would be hard to discern whether observed differences in the performance of the models are due to the action modifications learnt by the RL algorithm or simply the result of differing parameter optimisation values. Alternatively, w and k could be recalibrated periodically for the Gen-AS model and the new values introduced into the Alpha-AS models as well. However, this would require discarding the prior training of the latter every time w and k are updated, forcing the Alpha-AS models to restart their learning process. The combination of the choice of one from among four available values for γ with the choice of one from among five values for the skew results in 20 possible actions for the agent to choose from, each being a distinct (γ, skew) pair. We chose a discrete action space for our experiment in applying RL to manipulate AS-related parameters, aiming to keep the algorithm as simple and quickly trainable as possible.
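The discrete action space described above can be sketched as a Cartesian product. This is only an illustration: the passage fixes the counts (four values of γ, five skews) but not the numbers, so the values in `GAMMAS` and `SKEWS`, and the helper name `decode_action`, are hypothetical.

```python
from itertools import product

# Hypothetical values: the text states only how many options exist (4 and 5).
GAMMAS = (0.01, 0.1, 0.2, 0.9)
SKEWS = (-0.1, -0.05, 0.0, 0.05, 0.1)

# Each discrete action is one distinct (gamma, skew) pair.
ACTIONS = list(product(GAMMAS, SKEWS))


def decode_action(index: int) -> tuple:
    """Map an action index (0..19) to its (gamma, skew) pair."""
    return ACTIONS[index]


print(len(ACTIONS))  # 4 * 5 = 20 actions
```

A network with 20 outputs can then select an index, and `decode_action` turns it back into the two AS-related parameters.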

Figures in bold are the best values among the five models for the corresponding test days. Figures for Alpha-AS 1 and 2 are given in green if their value is higher than that for the Gen-AS model for the same day. Figures in parentheses are the number of days on which the Alpha-AS model in question was second best only to the other Alpha-AS model (and would therefore have scored another overall ‘win’ had it competed alone against the baseline and Gen-AS models). We performed a genetic search at the beginning of the experiment, working on the same orderbook data, to obtain the values of the AS model parameters that yield the highest Sharpe ratio. At each training step the parameters of the prediction DQN are updated using gradient descent.

Based on the market state and the agent’s private indicators (i.e., its latest inventory levels and rewards), a prediction neural network outputs an action to take. As defined above, this action consists in setting the value of the risk aversion parameter, γ, in the Avellaneda-Stoikov formula used to calculate the bid and ask prices, and the skew to be applied to these. The agent will place orders at the resulting skewed bid and ask prices, once every market tick during the next 5-second time step. The usual approach in algorithmic trading research is to use machine learning algorithms to determine the buy and sell orders directly. In contrast, we propose maintaining the Avellaneda-Stoikov procedure as the basis upon which to determine the orders to be placed.

On this performance indicator, Gen-AS was the overall best performing model, winning on 11 days. The mean Max DD for the Gen-AS model over the entire test period was visibly the lowest, and its standard deviation was also by far the lowest among all models. In comparison, both the mean and the standard deviation of the Max DD for the Alpha-AS models were very high. Nevertheless, the differences in Max DD performance between Gen-AS and either of the Alpha-AS models, over all test days, are not statistically significant, despite the large differences in means. These large differences are the result of extreme outliers for the Alpha-AS models, from days on which they obtained a very poor (i.e., high) value for Max DD. The medians, however, are very similar to the median for the Gen-AS model.
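For readers unfamiliar with the indicator, a minimal sketch of a maximum drawdown calculation follows. It assumes Max DD is measured as the largest absolute peak-to-trough drop of a cumulative P&L series. The exact definition used for the results above is not given in this passage, and the function name `max_drawdown` is illustrative.

```python
def max_drawdown(equity):
    """Largest peak-to-trough drop of a cumulative P&L series (absolute units).

    Assumption: drawdown is measured in absolute terms, not as a percentage.
    """
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)           # highest level seen so far
        worst = max(worst, peak - value)  # deepest fall from that level
    return worst


# Hypothetical cumulative P&L path for one trading day.
print(max_drawdown([0.0, 5.0, 3.0, 8.0, 2.0, 9.0]))
```

Because a single bad day can produce a very large drop, the mean of this indicator across days is sensitive to outliers, which is consistent with the similar medians reported above.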

2 Gen-AS: Avellaneda-Stoikov model with genetically tuned parameters

Then, a robust sparse-norm and graph regularization constraints are performed in the objective function to ensure the consistency of the spatial information. For the optimization of the parameters involved in the model, a distributed adaptive proximal Newton gradient descent learning strategy is proposed to accelerate the convergence. Furthermore, considering the dynamic time-series and potentially non-stationary structure of industrial data, we propose extended incremental versions to alleviate the complexity of the overall model computation. Extensive data recovery experiments are conducted on two real industrial processes to evaluate the proposed method in comparison with existing state-of-the-art restorers. The results show that the proposed methods can impute better with different missing rates and have strong competitiveness in practical application.

It is worth mentioning that the trader changes her qualitative behavior depending on variations in the liquidation and penalization constants and on her inventory position as time approaches maturity. The effect on the optimal quotes is just the opposite of that seen when k is varied, while the other parameters are kept the same as in Table 1.

Conversely, the gains may also be greater, a benefit which is indeed reflected unequivocally in the results obtained for the P&L-to-MAP performance indicator. The latter is an important feature for market maker algorithms. Indeed, this result is particularly noteworthy as the Avellaneda-Stoikov method sets as its goal precisely to minimize the inventory risk.

  • The max_order_age parameter allows you to set a specific duration when resetting your order’s age.
  • While the other parameters are fixed to those in Table 1, we see that more buy market orders arrive, so the optimal filled sell spreads are larger at all inventory levels compared to the case when the arrival of market orders is symmetric.
  • Allows your bid and ask order prices to be adjusted based on the current top bid and ask prices in the market.
  • The main contribution of this paper is a new integral deep LOB trading system that embraces model training, prediction, and optimization.
  • Rather, taking inspiration from Teleña, we mediate the order placement decisions through the AS model (our “avatar”, taking the term from ), leveraging its ability to provide quotes that maximize profit in the ideal case.

Wireless ad hoc networks are infrastructureless networks and are used in various applications such as habitat monitoring, military surveillance, and disaster relief. Data transmission is achieved through radio packet transfer, so it is prone to various attacks such as eavesdropping and spoofing. Monitoring the communication links by secure points is an essential precaution against these attacks. Also, deploying monitors provides a virtual backbone for multi-hop data transmission. However, adding secure points to a WANET can be costly in terms of price and time, so minimizing the number of secure points is of utmost importance. Graph theory provides a great foundation to tackle the emerging problems in WANETs.

Additionally, sensitivity to volatility changes will be included with a particular parameter vol_to_spread_multiplier, to modify spreads in big volatility scenarios. The original Avellaneda-Stoikov model was chosen as a starting point for our research. We plan to use such approximations in further tests with our RL approach. The performance of the Alpha-AS models in terms of the Sharpe, Sortino and P&L-to-MAP ratios was substantially superior to that of the Gen-AS model, which in turn was superior to that of the two standard baselines. On the other hand, the performance of the Alpha-AS models on maximum drawdown varied significantly on different test days, losing to Gen-AS on over half of them, a reflection of their greater aggressiveness, made possible by their relative freedom of action.

However, this need not happen, so there is no guarantee that the market maker will set prices compatible with current market prices. If users choose to set the eta parameter, order sizes will be adjusted to further optimize the strategy behavior with regard to the current and desired portfolio allocation. This value is defined by the user, and it represents how much inventory risk they are willing to take. Closing_time – Here, you set how long each “trading session” will last. The value of q in the formula measures how many units the market maker’s inventory is from the desired target.

The selected action is then taken repeatedly, once every market tick, in the following 5-second window, at the end of which the reward (the Asymmetric Dampened P&L) obtained from this repeated execution of the action is computed. Here Ψ(τi) is the open P&L for the 5-second action time step, I(τi) is the inventory held by the agent and Δm(τi) is the speculative P&L (the difference between the open P&L and the close P&L), at time τi, which is the end of the ith 5-second agent action cycle. The reservation price is strongly influenced by the choice of the parameter T. If T is large enough, then at each step in which q is not zero the reservation price may lie too far from the mid-price, so that both the bid and ask quotes end up above or below the mid-price. The trading_intensity estimator is designed to be consistent with ideas outlined in the Avellaneda-Stoikov paper.
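The effect of T on the quotes can be illustrated with the standard Avellaneda-Stoikov closed-form expressions: reservation price r = s − q·γ·σ²·(T − t), and total spread γ·σ²·(T − t) + (2/γ)·ln(1 + γ/k). The sketch below is only a minimal illustration of those formulas. The function name `as_quotes` and all numeric inputs are hypothetical, and real implementations typically add tick rounding, the skew, and other adjustments not shown here.

```python
import math


def as_quotes(mid, q, gamma, sigma, k, time_left):
    """Bid and ask from the standard Avellaneda-Stoikov formulas.

    mid: market mid-price, q: inventory relative to target,
    gamma: risk aversion, sigma: volatility, k: order book liquidity
    parameter, time_left: T - t. All inputs here are illustrative.
    """
    reservation = mid - q * gamma * sigma ** 2 * time_left
    spread = gamma * sigma ** 2 * time_left + (2.0 / gamma) * math.log(1.0 + gamma / k)
    return reservation - spread / 2.0, reservation + spread / 2.0


# With zero inventory the quotes straddle the mid-price symmetrically.
print(as_quotes(100.0, 0, 0.5, 2.0, 1.5, 1.0))

# With positive inventory and a long horizon, the ask falls below the mid-price.
print(as_quotes(100.0, 5, 0.5, 2.0, 1.5, 1.0))
```

In the second call the inventory term q·γ·σ²·(T − t) exceeds half the spread, which is the situation described above in which both quotes sit on the same side of the mid-price.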

The resulting Gen-AS model, two non-AS baselines (based on Gašperov) and the two Alpha-AS model variants were run with the rest of the dataset, from 9th December 2020 to 8th January 2021, and their performance compared. To perform the first genetic tuning of the baseline AS model parameters (Section 4.2). Again, the probability of selecting a specific individual for parenthood is proportional to the Sharpe ratio it has achieved. A weighted average of the values of the two parents’ genes is then computed. Private indicators, consisting of features describing the state of the agent.

3 Test models and performance indicators

Then, we develop another, novel approach that considers an underlying asset model with jumps in stochastic volatility. Such an extension allows one to fit the implied volatility smile better in practice. To overcome this problem, a deep Q-network approximates the Q(s, a) matrix using a deep neural network.

We were able to achieve some parallelisation by running five backtests simultaneously on different CPU cores. Upon finalization of the five parallel backtests, the five respective memory replay buffers were merged. 10 such training iterations were completed, all on data from the same full day of trading, with the memory replay buffer resulting from each iteration fed into the next. The replay buffer obtained from the final iteration was used as the initial one for the test phase.
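The training schedule above can be sketched as a simple loop. This is a hedged illustration only: `run_backtest` is a hypothetical stand-in that fabricates placeholder experiences, the backtests run sequentially here rather than on separate CPU cores, and the way the five buffers are merged (keeping the shared starting buffer once and appending each run's new experiences) is an assumption.

```python
def run_backtest(initial_buffer, worker_id, iteration, steps=3):
    """Hypothetical stand-in for one backtest.

    Returns the replay buffer it ends with: the buffer it started from
    plus the placeholder experiences gathered during the run.
    """
    new_experiences = [(iteration, worker_id, step) for step in range(steps)]
    return list(initial_buffer) + new_experiences


def train(iterations=10, workers=5):
    """Run `workers` backtests per iteration and feed the merged buffer forward."""
    buffer = []
    for iteration in range(iterations):
        results = [run_backtest(buffer, w, iteration) for w in range(workers)]
        merged = list(buffer)
        for result in results:
            # Assumption: keep the shared starting buffer once, add only new items.
            merged.extend(result[len(buffer):])
        buffer = merged
    return buffer


final_buffer = train()
print(len(final_buffer))  # 10 iterations * 5 workers * 3 placeholder steps
```

The buffer returned by the last iteration plays the role of the initial buffer for the test phase.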

Last but not least, we have substantially improved the performance of a market maker with the proposed models. Table 13, obtained from all simulations, shows that Model C, which models the stock price with stochastic volatility, has a relatively larger expected return but also a relatively larger standard deviation. Meanwhile, the other stock price models in Table 13 produce higher Sharpe ratios. Consequently, she will sell the assets at a lower price at positive inventory levels to reduce both the price risk and the liquidation risk. On the other hand, she does not face liquidation risk at negative inventory levels but wants to receive a higher amount for selling the assets.

After a theoretical presentation of the method, an application using real data will be presented to demonstrate how the method works. In this part, we run the simulations under the quadratic utility function for all the models introduced here, for comparison purposes, although they were defined with different utility criteria and solved under different settings in their original papers. In Section 2, we set the framework in continuous time and formulate the optimization problem in terms of the expected return of the trader.

Mann-Whitney tests comparing the four daily performance indicator values (Sharpe, Sortino, Max DD and P&L-to-MAP) obtained for the Gen-AS model with the corresponding values obtained for the other models, over the 30 test days. Number of days either Alpha-AS-1 or Alpha-AS-2 scored best out of all tested models, for each of the four performance indicators. The dataset used contains the L2 orderbook updates and market trades from the btc-usd (bitcoin–dollar pair), for the period from 7th December 2020 to 8th January 2021, with 12 hours of trading data recorded for each day. Most of the data, the Java source code and the results are accessible from the project’s GitHub repository.

  • The same authors have recently explored the use of a soft actor-critic RL algorithm in market making, to obtain a continuous action space of spread values.
  • Furthermore, on 9 of the 12 days for which Alpha-AS-1 had the best Sharpe ratio, Alpha-AS-2 had the second best; conversely, there are 11 instances of Alpha-AS-1 performing second best after Alpha-AS-2.
  • The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception.
  • That is, they achieve a better P&L profile with less exposure to market movements.

Besides, we find that the number of signals generated from the system can be used to rank stocks for the preference of LOB trading. We test the system with simulation experiments and real data from the Chinese A-share market. The simulation demonstrates the characteristics of the trading system in different market sentiments, while the empirical study with real data confirms significant profits after factoring in transaction costs and risk requirements. Consequently, the Alpha-AS agent adapts its bid and ask order prices dynamically, reacting closely (at 5-second steps) to the changing market. This 5-second interval allows the Alpha-AS algorithm to acquire experience trading with a certain bid and ask price repeatedly under quasi-current market conditions.

What our RL algorithm determines are, as we shall see shortly, the values of the main parameters of the AS model. It is then the latter that calculates the optimal bid and ask prices at each step. The AS algorithm is static in its reliance on analytical formulas to generate bid and ask quotes based on the real-time input values for the market mid-price of the security and the current stock inventory held by the market maker. These formulas have fixed parameters to model the market maker’s aversion to risk and the statistical properties of market orders. The AS model generates bid and ask quotes that aim to maximize the market maker’s P&L profile for a given level of inventory risk the agent is willing to take, relying on certain assumptions regarding the microstructure and stochastic dynamics of the market. Extensions to the AS model have been proposed, most notably the Guéant-Lehalle-Fernandez-Tapia approximation and a recent variation of it by Bergault et al., which are currently used by major market making agents.


We turn next to this approach, more specifically to one based on deep reinforcement learning. In this paper we present a limit order placement strategy based on a well-known reinforcement learning algorithm. We use the RL algorithm to modify the risk aversion parameter and to skew the AS quotes based on a characterization of the latest steps of market activity. Another distinctive feature of our work is the use of a genetic algorithm to determine the parameters of the AS formulas, which we use as a benchmark, to offer a fairer performance comparison to our RL algorithm. Data normalization for features and labeling for signals are required for classification. Instead of simply labeling the mid-price movement as in Kercheval and Zhang and Tsantekidis et al., we consider the direct trading actions, including long, short, and none.

These individuals are run through the orderbook data and are then ranked according to the Sharpe ratio they have attained. For each subsequent generation, 45 new individuals are run through the data and then added to the cumulative population, retaining all the individuals from previous generations. The 10 generations thus yield a total of 450 individuals, ranked by their Sharpe ratio. Note that, since we retain all individuals from generation to generation, the highest Sharpe ratio in the cumulative population never decreases in subsequent generations. The data on which the metrics for our market features were calculated correspond to one full day of trading.
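The cumulative-population scheme can be sketched as follows. This is a minimal illustration under several assumptions: `evaluate_sharpe` is a hypothetical stand-in for running an individual through the orderbook data, each individual has two made-up genes with arbitrary bounds, Sharpe ratios are shifted so that selection weights are non-negative, and the crossover weight is drawn at random. Mutation is omitted because the passage does not describe it.

```python
import random


def evaluate_sharpe(genes):
    """Hypothetical stand-in for a backtest: a smooth score of two genes."""
    a, b = genes
    return 1.0 - (a - 0.5) ** 2 - (b - 1.5) ** 2


def pick_parent(population, rng):
    """Select a parent with probability proportional to its (shifted) Sharpe."""
    lowest = min(score for score, _ in population)
    weights = [score - lowest + 1e-9 for score, _ in population]
    return rng.choices(population, weights=weights, k=1)[0][1]


def crossover(mother, father, rng):
    """Weighted average of the two parents' genes (random weight, an assumption)."""
    w = rng.random()
    return tuple(w * m + (1.0 - w) * f for m, f in zip(mother, father))


def genetic_search(generations=10, size=45, seed=0):
    rng = random.Random(seed)
    population = []    # cumulative list of (sharpe, genes), never pruned
    best_history = []  # best Sharpe after each generation
    for generation in range(generations):
        if generation == 0:
            children = [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 3.0))
                        for _ in range(size)]
        else:
            children = [crossover(pick_parent(population, rng),
                                  pick_parent(population, rng), rng)
                        for _ in range(size)]
        population.extend((evaluate_sharpe(child), child) for child in children)
        population.sort(key=lambda item: item[0], reverse=True)
        best_history.append(population[0][0])
    return population, best_history


population, best_history = genetic_search()
print(len(population))  # 10 generations * 45 individuals
```

Because no individual is ever discarded, `best_history` cannot decrease from one generation to the next, matching the observation above.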