Reinforcement Learning (RL) is one of the most promising areas of Machine Learning research. It has demonstrated considerable potential for iterative decision-making under uncertainty, which could be used to address complex real-world challenges. Its application to financial settings, however, is still underused and under-researched.


Reinforcement Learning in Finance is easy! Even a caveman or cavewoman can do it:

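import gym

from stable_baselines.common.policies import MlpPolicy
from stable_baselines import PPO2

# EASY_FINANCE_ENV is a placeholder for the ID of a registered Gym environment
env = gym.make(EASY_FINANCE_ENV)

model = PPO2(MlpPolicy, env, verbose=1)
# Train the agent before asking it for actions
model.learn(total_timesteps=10000)

obs = env.reset()
for i in range(1000):
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    # Start a new episode once the current one has ended
    if dones:
        obs = env.reset()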


Of course, in reality it is not so easy. One needs to define the environment in which the RL agent observes and acts, which in our case usually means simulating a financial market. Simulation requires large volumes of data (either historical or generated). In addition, reward engineering must be taken into consideration to properly guide the behavior of the RL agent during training.
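
To make these three ingredients concrete, here is a minimal sketch of what such an environment might look like. Everything in it is an assumption made for illustration: the class name `ToyMarketEnv`, its parameters, the random-walk price generator standing in for real market data, and the reward (position return minus a proportional trading cost). It only mimics the Gym `reset`/`step` convention in plain Python; a real environment would normally subclass `gym.Env` and declare its `observation_space` and `action_space`.

```python
import random


class ToyMarketEnv:
    """Hypothetical single-asset market following the Gym reset/step convention."""

    def __init__(self, n_steps=250, start_price=100.0, volatility=0.01,
                 cost=0.001, seed=0):
        self.n_steps = n_steps
        self.start_price = start_price
        self.volatility = volatility
        self.cost = cost  # proportional cost charged per unit of position change
        self.rng = random.Random(seed)

    def reset(self):
        # Generated data: a synthetic price path with Gaussian returns.
        self.prices = [self.start_price]
        for _ in range(self.n_steps):
            ret = self.rng.gauss(0.0, self.volatility)
            self.prices.append(self.prices[-1] * (1.0 + ret))
        self.t = 0
        self.position = 0  # -1 short, 0 flat, 1 long
        return self._observation()

    def _observation(self):
        # The agent sees the latest one-step return and its current position.
        if self.t == 0:
            last_ret = 0.0
        else:
            last_ret = self.prices[self.t] / self.prices[self.t - 1] - 1.0
        return (last_ret, self.position)

    def step(self, action):
        # Actions 0, 1, 2 map to target positions -1, 0, 1.
        target = action - 1
        trade_cost = self.cost * abs(target - self.position)
        self.position = target
        self.t += 1
        asset_ret = self.prices[self.t] / self.prices[self.t - 1] - 1.0
        # Reward engineering: profit of the held position minus trading costs.
        reward = self.position * asset_ret - trade_cost
        done = self.t >= self.n_steps
        return self._observation(), reward, done, {}


# Usage: run one episode with a random policy.
env = ToyMarketEnv(n_steps=50)
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = random.choice([0, 1, 2])
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(f"steps: {env.t}, total reward: {total_reward:.4f}")
```

Even in this toy version, the reward definition shapes behavior: with a nonzero `cost`, an agent that changes position at every step is penalized, so the choice of cost term directly affects how actively the trained agent trades.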