
Introduction


The mathematical approach to many financial decision-making problems has traditionally been through modelling with stochastic processes and using techniques from stochastic control. The choice of model is often dictated by the need to balance tractability with applicability. Simple models lead to strategies that are available in closed form or can be computed by traditional numerical methods. However, such models sometimes oversimplify the mechanisms and behaviour of financial markets, which may lead to strategies that are sub-optimal in practice and can potentially cause financial losses. On the other hand, models that try to capture realistic features of financial markets are much more complex and are often mathematically and computationally intractable using the classical tools of stochastic optimal control. In recent years the availability of large amounts of financial data on transactions, quotes and order flows in electronic order-driven markets has revolutionized data processing and statistical modelling techniques in finance and brought new theoretical and computational challenges [1]. In contrast to the classical stochastic control approach, new ideas coming from reinforcement learning (RL) are being developed to make use of all this information. Reinforcement learning describes methods by which agents acting within some system might learn to make optimal decisions through repeated experience gained by interacting with the system. In the finance industry there have been a number of recent successes in applying RL algorithms to areas such as order execution, market making and portfolio optimization, and these have attracted considerable attention. This has led to rapid progress in adapting RL techniques to improve trading decisions in various financial markets when participants have limited information on the market and other competitors.

Although there are already a number of more specialized review papers concerning aspects of reinforcement learning in finance, we aim to review a broad spectrum of activity in this area. This survey is intended to provide a systematic introduction to the theory of RL, a unified framework for performance evaluation, and a comprehensive summary of cutting-edge results in RL theory. This is followed by an introductory discussion of each of the following financial problems -- optimal execution, portfolio optimization, option pricing and hedging, market making, smart order routing, and robo-advising. Moreover, we will also discuss the advantages of RL methods over classical approaches such as stochastic control, especially for problems that have been studied extensively in the mathematical finance literature. For other surveys in the literature on RL applications in finance, see [3][4][5][6][7]. The main focus of these surveys is either traditional financial applications such as portfolio optimization and optimal hedging [3][5], trading on the stock and foreign exchange markets [6], or specific RL approaches such as actor-critic-based methods [4] and deep RL methods [7]. In addition, none of these survey papers provides a unified framework for performance evaluation or a comprehensive summary of cutting-edge results in RL theory.

Our survey will begin by discussing Markov decision processes (MDPs), the framework for many reinforcement learning ideas in finance. We will then consider different approaches to learning within this framework, with the main focus on value-based and policy-based methods. In order to implement these approaches we will introduce deep reinforcement learning methods, which incorporate deep learning ideas in this context. For the financial applications we will consider a range of topics, and for each we will introduce the basic underlying models before considering the RL approach to tackling them. We will discuss a range of papers in each application area and give an indication of their contributions. We conclude with some thoughts about the direction of development of reinforcement learning in finance.
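As a concrete illustration of the value-based methods referred to above, the following minimal sketch shows tabular Q-learning on a small synthetic MDP. It is not taken from the survey; the state and action spaces, transition dynamics, rewards and parameter values are all illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np

# Tabular Q-learning on a toy MDP (illustrative assumptions throughout).
rng = np.random.default_rng(0)

n_states, n_actions = 5, 2              # toy state space S and action space A
gamma, alpha, epsilon = 0.95, 0.1, 0.1  # discount, learning rate, exploration

# Fixed synthetic dynamics: P[s, a] is a distribution over next states,
# R[s, a] is the immediate reward for taking action a in state s.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))

Q = np.zeros((n_states, n_actions))     # value-based methods learn this table

s = 0
for t in range(50_000):
    # epsilon-greedy behaviour policy
    a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
    s_next = rng.choice(n_states, p=P[s, a])
    r = R[s, a]
    # Q-learning update: move Q(s, a) towards the Bellman target
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print("Greedy policy:", Q.argmax(axis=1))
</syntaxhighlight>

The deep RL methods discussed later replace the table Q with a neural-network approximation, so that the same idea scales to the large or continuous state spaces that arise in the financial applications surveyed here.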

General references

Hambly, Ben; Xu, Renyuan; Yang, Huining (2023). "Recent Advances in Reinforcement Learning in Finance". arXiv:2112.04553 [q-fin.MF].

References

  1. Dixon, Matthew F.; Halperin, Igor; Bilokon, Paul (2020). Machine Learning in Finance: From Theory to Practice. Springer.
  2. Cartea, Álvaro; Jaimungal, Sebastian; Sánchez-Betancourt, Leandro (2021). "Deep Reinforcement Learning for Algorithmic Trading". SSRN preprint.
  3. Charpentier, Arthur; Élie, Romuald; Remlinger, Carl (2021). "Reinforcement Learning in Economics and Finance". Computational Economics.
  4. Fischer, Thomas G. (2018). "Reinforcement Learning in Financial Markets -- A Survey". FAU Discussion Papers in Economics.
  5. Kolm, Petter N.; Ritter, Gordon (2020). "Modern Perspectives on Reinforcement Learning in Finance". The Journal of Machine Learning in Finance 1 (1).
  6. Meng, Terry Lingze; Khushi, Matloob (2019). "Reinforcement Learning in Financial Markets". Data 4 (3): 110.
  7. Mosavi, Amirhosein; Faghan, Yaser; Ghamisi, Pedram; Duan, Puhong; Ardabili, Sina Faizollahzadeh; Salwana, Ely; Band, Shahab S. (2020). "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics". Mathematics 8 (10): 1640.