Reinforced life

I love reinforcement learning.

It is a comprehensive framework for describing the world.

It is quite simple and reminds me of a saying about no-limit Texas hold’em poker: it takes five minutes to learn and a lifetime to master.

Here is that short introduction to reinforcement learning.

In reinforcement learning you are an agent with a state S who can perform an action A that gives you a reward R.

Once you receive the reward you transition to a new state S’ (state prime) with a new set of actions A’ (actions prime) and rewards R’ (rewards prime).

The goal is to maximise all future rewards. The state space can further be divided into a world model (for observing the world) and models of self and other actors. Actions are ways of interacting with the world, the self or other actors.

This loops forever (or technically until the game ends).
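
The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the environment `CoinFlipGame`, the helper `run_episode` and the policy `always_heads` are hypothetical names made up for this example, not part of any library.

```python
import random


class CoinFlipGame:
    """Hypothetical toy environment: guess a biased coin for a fixed number of rounds."""

    def __init__(self, rounds=10, seed=0):
        self.rounds = rounds
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        # The state S is simply the current round number.
        self.t = 0
        return self.t

    def step(self, action):
        # The coin lands heads (1) about 70% of the time.
        coin = 1 if self.rng.random() < 0.7 else 0
        reward = 1 if action == coin else 0  # reward R for a correct guess
        self.t += 1                          # move to the new state S'
        done = self.t >= self.rounds         # the game ends after the last round
        return self.t, reward, done


def run_episode(env, policy):
    """Run the state -> action -> reward -> new state loop until the game ends."""
    state = env.reset()
    total_reward = 0
    done = False
    while not done:
        action = policy(state)                  # choose A given S
        state, reward, done = env.step(action)  # receive R and S'
        total_reward += reward
    return total_reward


def always_heads(state):
    """A deliberately simple policy: guess heads in every state."""
    return 1


print(run_episode(CoinFlipGame(), always_heads))
```

The agent here never learns anything; the point is only to show the shape of the loop. A learning agent would update its policy inside the `while` loop using the rewards it receives.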

The only constraint is that all relevant information about the world should be present in the current state (we don’t need to look back to previous states for information).
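
This constraint is usually called the Markov property. One common way to write it down, using time-indexed symbols S_t and A_t that are my own notation rather than anything defined earlier in this post, is:

```latex
P\left(S_{t+1} \mid S_t, A_t\right)
  = P\left(S_{t+1} \mid S_1, A_1, S_2, A_2, \dots, S_t, A_t\right)
```

In words: knowing the full history tells you nothing more about the next state than knowing the current state and action.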

To maximise future rewards, two competing principles need to be balanced: exploration (gathering useful information about the world by trying actions in order to find high-value ones) and exploitation (using your current knowledge of the world to perform high-value actions and collect rewards).
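
One simple and widely used way to balance the two is the epsilon-greedy rule: with a small probability epsilon you explore by picking a random action, otherwise you exploit the action you currently believe is best. The sketch below applies it to a made-up slot-machine ("multi-armed bandit") problem. The function name `epsilon_greedy_bandit`, the reward means and the Gaussian noise are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Play a hypothetical bandit with epsilon-greedy action selection."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each action was tried
    estimates = [0.0] * n_arms   # running average reward per action
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action to learn more about it.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the action with the highest estimated value.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Noisy reward around the (unknown to the agent) true mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental update of the running average for this action.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


estimates, counts, total = epsilon_greedy_bandit([0.2, 0.5, 0.9])
print(counts)
print([round(e, 2) for e in estimates])
```

Setting `epsilon` to 0 gives a purely exploiting agent that can get stuck on the first action that looks good; setting it to 1 gives a purely exploring agent that never uses what it has learned.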

This is hard because the state and action spaces in most complex games (and in life) are enormous and ever-changing.

Still, I believe that reinforcement learning is a good starting point for understanding life from a mathematical point of view.

There are many ways to move forward from this short introduction.

For instance, I’m a person who has done my share of exploration. This is useful because I have knowledge of a large number of potential states and actions.

Are we going in spirals?

Here is an interesting philosophical question. Are we going in circles or are we going in spirals?
Adolf Zeising, whose main interests were mathematics and philosophy, found the golden ratio expressed in the arrangement of parts such as leaves and branches along the stems of plants and of veins in leaves. He extended his research to the skeletons of animals and the branchings of their veins and nerves, to the proportions of chemical compounds and the geometry of crystals, even to the use of proportion in artistic endeavors. In these patterns in nature he saw the golden ratio operating as a universal law.
In connection with his scheme for golden-ratio-based human body proportions, Zeising wrote in 1854 of a universal law “in which is contained the ground-principle of all formative striving for beauty and completeness in the realms of both nature and art, and which permeates, as a paramount spiritual ideal, all structures, forms and proportions, whether cosmic or individual, organic or inorganic, acoustic or optical; which finds its fullest realization, however, in the human form.”
In 2010, the journal Science reported that the golden ratio is present at the atomic scale in the magnetic resonance of spins in cobalt niobate crystals.
Since 1991, several researchers have proposed connections between the golden ratio and human genome DNA.


I listened to something that gave me hope the other day.
Chris Rock told the story of how the atrocities of George W. Bush paved the way for Obama to become president. The destruction made it possible for the world to experience something miraculous.
Jokingly, Chris Rock said that, considering the current president, it is only plausible that Jesus will soon return. Still, there is truth to the idea that progress often follows destruction.
Here is another example.
The Second World War might be one of the darkest chapters in human history. But following it, one great woman was instrumental in permanently improving the world.
The woman was Eleanor Roosevelt, and what she created as the chair of the United Nations Human Rights Commission was the Universal Declaration of Human Rights.
Words have power. Here are some words to remember.
Article 1.
All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

When the model of the world breaks

“Some people are quite capable of seeing things they don’t understand and being okay with it. I wasn’t okay that something had violated my model of the world. I really am not okay with things that do that,” says Hinton.
<3 <3 <3
This is something I can relate to extremely well.
When I experience things in my life that violate my models of the world, I can’t help but explore them. I have to find the answer.
I spend all my time thinking until I have found models that are more accurate for describing the world. This can take years. But ultimately it gives me a better understanding of the world.
This is a great article about Geoffrey Hinton, the godfather of artificial intelligence, who like me is both a psychologist and a computer scientist.
“We’re machines,” says Hinton. “We’re just produced biologically.”
For a long time I have been saying that emotions are hidden computation. It seems true to me and it helps bridge gaps in theories of how a mathematical mind would work.
You experience the world, you feel emotion as the unconscious processing inside your mind releases neurotransmitters, and you experience thought as the process of explaining these unconscious changes in neurotransmitters using your conscious model of the world.
I hope that the revolution within the field of artificial intelligence can reach back into the fields of psychology, psychiatry and neuroscience to help make sense of our own understanding of the human mind.
It saddens me that such a large proportion of people on earth suffer because of the lack of scientific rigour within these fields.
Psychology suffers from the problem that most things are self-explanatory, psychiatry suffers from the problem of co-morbidity and neuroscience suffers from the lack of overarching theories and continuous re-inventions of the wheel.
I think that every human being should have the freedom to define themselves as they see fit.

Imagining the world simulator

A cool thing happened to me on the train this morning.
I closed my eyes and envisioned the train and everyone in it.
You can do the same thing right now. Just close your eyes and notice that you are getting a mental representation of the space you are in.
Then I thought: is the simulation that runs in my head when I close my eyes the world simulator?
Meaning, is the experience of the world around me that I see when I close my eyes the representation of the world that my mind guesses will happen at the next timestep, which is the basic foundation for all my actions?
To explain further it might be good to know some things about reinforcement learning.
Reinforcement learning is one of the most advanced fields in machine learning. So I make the following assumption:
The mathematical representations of how the mind functions used in reinforcement learning are accurate, which is why they result in agents that can perform better than humans on specific tasks.
In reinforcement learning you have a state, take an action and receive a reward. Then you are in a new state, take a new action and receive a new reward. This cycle continues until the game is over, and the goal is to maximise the cumulative reward.
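
The cumulative reward can be written down explicitly. The notation below is an assumption on my part: t is the current timestep, T is the timestep where the game ends, and the total reward collected from t onwards is usually called the return, G_t. The second line adds a discount factor gamma, which is not something I have mentioned above but is a common way to make rewards in the far future count a little less than rewards now.

```latex
G_t = R_{t+1} + R_{t+2} + \dots + R_T = \sum_{k=t+1}^{T} R_k

G_t^{\gamma} = \sum_{k=0}^{T-t-1} \gamma^{k} R_{t+k+1}, \qquad 0 \le \gamma \le 1
```

With gamma equal to 1 the two expressions are the same. The agent’s goal is to choose actions that make this quantity as large as possible on average.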
To do this accurately it is important to have an accurate representation of the world you are in, in order to perform the best action given your current state.
The problem with human intelligence is that most of our brain structures are only semi-conscious, which is why I thought the experience on the train was significant.
Because the more you understand about your own mind, the more you become master of your own house. I believe we start out as fairly unconscious and over time become more conscious of our own experience of life.