guide:A0002b5f37

[math] \newcommand*{\rom}[1]{\expandafter\@slowromancap\romannumeral #1@} \newcommand{\vertiii}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \DeclareMathOperator*{\dprime}{\prime \prime} \DeclareMathOperator{\Tr}{Tr} \DeclareMathOperator{\E}{\mathbb{E}} \DeclareMathOperator{\N}{\mathbb{N}} \DeclareMathOperator{\R}{\mathbb{R}} \DeclareMathOperator{\Sc}{\mathcal{S}} \DeclareMathOperator{\Ac}{\mathcal{A}} \DeclareMathOperator{\Pc}{\mathcal{P}} \DeclareMathOperator*{\argmin}{arg\,min} \DeclareMathOperator*{\argmax}{arg\,max} \DeclareMathOperator{\sx}{\underline{\sigma}_{\pmb{X}}} \DeclareMathOperator{\sqmin}{\underline{\sigma}_{\pmb{Q}}} \DeclareMathOperator{\sqmax}{\overline{\sigma}_{\pmb{Q}}} \DeclareMathOperator{\sqi}{\underline{\sigma}_{Q,\textit{i}}} \DeclareMathOperator{\sqnoti}{\underline{\sigma}_{\pmb{Q},-\textit{i}}} \DeclareMathOperator{\sqfir}{\underline{\sigma}_{\pmb{Q},1}} \DeclareMathOperator{\sqsec}{\underline{\sigma}_{\pmb{Q},2}} \DeclareMathOperator{\sru}{\underline{\sigma}_{\pmb{R}}^{u}} \DeclareMathOperator{\srv}{\underline{\sigma}_{\pmb{R}}^v} \DeclareMathOperator{\sri}{\underline{\sigma}_{R,\textit{i}}} \DeclareMathOperator{\srnoti}{\underline{\sigma}_{\pmb{R},\textit{-i}}} \DeclareMathOperator{\srfir}{\underline{\sigma}_{\pmb{R},1}} \DeclareMathOperator{\srsec}{\underline{\sigma}_{\pmb{R},2}} \DeclareMathOperator{\srmin}{\underline{\sigma}_{\pmb{R}}} \DeclareMathOperator{\srmax}{\overline{\sigma}_{\pmb{R}}} \DeclareMathOperator{\HH}{\mathcal{H}} \DeclareMathOperator{\HE}{\mathcal{H}(1/\varepsilon)} \DeclareMathOperator{\HD}{\mathcal{H}(1/\varepsilon)} \DeclareMathOperator{\HCKI}{\mathcal{H}(C(\pmb{K}^0))} \DeclareMathOperator{\HECK}{\mathcal{H}(1/\varepsilon,C(\pmb{K}))} \DeclareMathOperator{\HECKI}{\mathcal{H}(1/\varepsilon,C(\pmb{K}^0))} \DeclareMathOperator{\HC}{\mathcal{H}(1/\varepsilon,C(\pmb{K}))} \DeclareMathOperator{\HCK}{\mathcal{H}(C(\pmb{K}))} \DeclareMathOperator{\HCKR}{\mathcal{H}(1/\varepsilon,C(\pmb{K}))} \DeclareMathOperator{\HCKIR}{\mathcal{H}(1/\varepsilon,C(\pmb{K}^0))} \newcommand{\mathds}{\mathbb}[/math]

The mathematical approach to many financial decision-making problems has traditionally been through modelling with stochastic processes and using techniques from stochastic control. The choice of model is often dictated by the need to balance tractability with applicability. Simple models lead to tractable and implementable strategies, either in closed form or computable through traditional numerical methods. However, these models sometimes oversimplify the mechanisms and the behaviour of financial markets, which may result in strategies that are sub-optimal in practice and that can potentially lead to financial losses. On the other hand, models that try to capture realistic features of financial markets are much more complex and are often mathematically and computationally intractable using the classical tools of stochastic optimal control.

In recent years the availability of large amounts of financial data on transactions, quotes and order flows in electronic order-driven markets has revolutionized data processing and statistical modelling techniques in finance and brought new theoretical and computational challenges <ref name="dixon2020machine"/>. In contrast to the classical stochastic control approach, new ideas coming from reinforcement learning (RL) are being developed to make use of all this information. Reinforcement learning describes methods by which agents acting within some system might learn to make optimal decisions through repeated experience gained by interacting with the system. In the finance industry there have been a number of recent successes in applying RL algorithms to areas such as order execution, market making and portfolio optimization, and these have attracted considerable attention. This has led to rapid progress in adapting RL techniques to improve trading decisions in various financial markets when participants have limited information on the market and other competitors.

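To fix ideas, in the standard textbook formulation (sketched here for orientation; the precise definitions are given in the MDP discussion below) the agent observes the state [math]s_t[/math] of the system at each time [math]t[/math], chooses an action [math]a_t[/math] and receives a reward [math]r_{t+1}[/math]. The aim is to find a policy for choosing actions that maximizes the expected discounted cumulative reward

[math]\E\Big[\sum_{t=0}^{\infty} \gamma^t r_{t+1}\Big],[/math]

where [math]\gamma \in (0,1)[/math] is a discount factor, while the dynamics of the system are learned only through the observed states and rewards.
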
Although there are already a number of more specialized review papers concerning aspects of reinforcement learning in finance, we aim to review a broad spectrum of activity in this area. This survey is intended to provide a systematic introduction to the theory of RL, a unified framework for performance evaluation, and a comprehensive summary of cutting-edge results in RL theory. It is followed by an introductory discussion of each of the following financial problems -- optimal execution, portfolio optimization, option pricing and hedging, market making, smart order routing, and robo-advising. Moreover, we will also discuss the advantages of RL methods over classical approaches such as stochastic control, especially for problems that have been studied extensively in the mathematical finance literature.

For other surveys in the literature on RL applications in finance, see <ref name="charpentier2020"/><ref name="fischer2018"/><ref name="kolm2020modern"/><ref name="meng2019"/><ref name="mosavi2020"/>. These surveys focus mainly on traditional financial applications such as portfolio optimization and optimal hedging <ref name="charpentier2020"/><ref name="kolm2020modern"/>, on trading in the stock and foreign exchange markets <ref name="meng2019"/>, or on specific RL approaches such as actor-critic-based methods <ref name="fischer2018"/> and deep RL methods <ref name="mosavi2020"/>.

Our survey will begin by discussing Markov decision processes (MDPs), the framework for many reinforcement learning ideas in finance. We will then consider different approaches to learning within this framework, with the main focus being on value-based and policy-based methods. In order to implement these approaches we will introduce deep reinforcement learning methods, which incorporate deep learning ideas in this context. For the financial applications we will consider a range of topics and, for each, we will introduce the basic underlying models before considering the RL approach to tackling them. We will discuss a range of papers in each application area and give an indication of their contributions. We conclude with some thoughts about the direction of development of reinforcement learning in finance.

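As a concrete illustration of the value-based family (the classical tabular Q-learning update, quoted here in its standard textbook form rather than from this survey), the agent maintains an estimate [math]Q(s,a)[/math] of the value of each state-action pair and, upon observing a transition [math](s_t,a_t,r_{t+1},s_{t+1})[/math], updates

[math]Q(s_t,a_t) \leftarrow Q(s_t,a_t) + \alpha\Big(r_{t+1} + \gamma \max_{a'} Q(s_{t+1},a') - Q(s_t,a_t)\Big),[/math]

where [math]\alpha \in (0,1][/math] is a learning rate and [math]\gamma[/math] a discount factor. Policy-based methods instead parametrize the policy itself and update its parameters by (approximate) gradient ascent on the expected reward.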

General references

Hambly, Ben; Xu, Renyuan; Yang, Huining (2023). "Recent Advances in Reinforcement Learning in Finance". arXiv:2112.04553 [q-fin.MF].
