Reinforced life

I love reinforcement learning.

It is a comprehensive framework for describing the world.

It is quite simple, and it reminds me of a quote about no-limit Texas hold ’em poker: it takes five minutes to learn and a lifetime to master.

Here is that short introduction to reinforcement learning.

In reinforcement learning you are an agent in a state S who can perform an action A that gives you a reward R.

Once you receive the reward you transition to a new state S’ (state prime) with a new set of actions A’ (actions prime) and rewards R’ (rewards prime).

The goal is to maximise the sum of all future rewards. The state can further be divided into a world model (for observing the world) and models of the self and of other actors. Actions are ways of interacting with the world, the self or other actors.

This loops forever (or technically until the game ends).
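The loop described above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the `CoinGame` environment, its 70% heads bias and the `run_episode` helper are all made up for the example and do not come from any library.

```python
import random


class CoinGame:
    """Hypothetical toy environment: guess a biased coin for a fixed number of rounds."""

    def __init__(self, rounds=10, seed=0):
        self.rng = random.Random(seed)
        self.rounds = rounds
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the state S is simply the round number

    def step(self, action):
        # Assumed dynamics: the coin lands heads (1) 70% of the time.
        coin = 1 if self.rng.random() < 0.7 else 0
        reward = 1 if action == coin else 0  # reward R for a correct guess
        self.t += 1  # move to the next state S'
        done = self.t >= self.rounds  # the game ends after the last round
        return self.t, reward, done


def run_episode(env, policy):
    """Run the state -> action -> reward -> new state loop until the game ends."""
    state = env.reset()
    total = 0
    done = False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total += reward
    return total


env = CoinGame()
total = run_episode(env, lambda state: 1)  # a policy that always guesses heads
print(total)
```

The policy here ignores the state entirely. A learning agent would use the rewards it collects to change how it picks actions.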

The only constraint is that all relevant information about the world should be present in the current state (we don’t need to look back to previous states for information).
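This constraint is commonly called the Markov property. Written with time indices (a notation I am assuming here, where S_t is the state at step t and S_{t+1} plays the role of S’), it says that the next state depends only on the current state and action, not on the full history:

```latex
P(S_{t+1} = s' \mid S_t, A_t) = P(S_{t+1} = s' \mid S_1, A_1, \dots, S_t, A_t)
```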

To maximise future rewards, two competing principles need to be balanced: exploration (gathering useful information about the world by trying actions in order to find high-value ones) and exploitation (using your current knowledge of the world to perform high-value actions and receive rewards).
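One simple and widely used way to balance the two is the epsilon-greedy rule: with a small probability epsilon you pick a random action (exploration), otherwise you pick the action you currently believe is best (exploitation). Below is a minimal sketch on a made-up three-armed slot machine. The win probabilities, step count and epsilon value are arbitrary assumptions chosen for illustration.

```python
import random


def epsilon_greedy_bandit(win_probs, steps=5000, epsilon=0.1, seed=0):
    """Play a hypothetical slot machine with one arm per entry in win_probs."""
    rng = random.Random(seed)
    n = len(win_probs)
    counts = [0] * n    # how often each action has been tried
    values = [0.0] * n  # running average reward observed per action
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)  # explore: try a random action
        else:
            action = max(range(n), key=lambda a: values[a])  # exploit: best estimate so far
        reward = 1 if rng.random() < win_probs[action] else 0
        counts[action] += 1
        # Incremental update of the average reward for the chosen action.
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward
    return values, counts, total_reward


values, counts, total_reward = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(values, counts, total_reward)
```

With epsilon set to 0 the agent never explores and can get stuck on the first action that happens to pay out. With epsilon set to 1 it never uses what it has learned.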

This is hard because the state and action spaces in most complex games (and in life) are enormous and ever-changing.

Still, I believe that reinforcement learning is a good starting point for understanding life from a mathematical point of view.

There are many ways to move forward from this short introduction.

For instance, I’m a person who has done my share of exploration. This is useful because I have knowledge of a large number of potential states and actions.
