Non environmentally determined decision making and chaotic CTRNNs

A good proportion of the tasks that our neuron-inspired dynamical system controllers are evolved for have a very strong environmentally determined component. In other words, what they do next is predictable far more from their current environment than from their internal dynamics. Yet the motivation for most of us (at least those interested in cognition, however minimal) using such internally rich systems is to get behaviors that are coupled with the environment but at the same time interestingly autonomous.

So I have been thinking about a smallish project (perhaps in the context of the upcoming ECAL? keeping in mind that I will be doing so while at the same time writing my thesis). The motivation would be to evolve networks to do a simple task in a simple environment in several different ways. The hypothesis is that the systems will evolve some form of chaotic internal dynamics to generate the variety of behaviors required. This is interestingly related to the random number generator in Ashby's Homeostat, but more generally chaotic behavior has been observed in the nervous system, and the contributions of chaotic dynamics to cognitive processes have been discussed by, for example, Freeman and Skarda. In fact, for Walter Freeman chaotic dynamics internal to the living organism are at the very root of autonomy, learning, memory, and even creativity.

There would be two main objectives in such work. The first would be to measure the reactivity of evolved situated agents for different tasks, reactivity being how much the behavior of a situated agent is determined by the environment as opposed to its internal dynamics. The second would be to compare CTRNNs with alpha-CTRNNs. I'm calling alpha-CTRNNs the brother/cousin of the CTRNN that I discussed from Ollie's work in a previous post; alpha is a per-neuron parameter that determines the monotonicity of its activation function. The question we would ask is: do nonmonotonic transfer functions facilitate chaotic behavior? I have already found a couple of papers (here and here) that suggest that nonmonotonic transfer functions may lead to macroscopic chaos in attractor neural networks. I still have to understand them better.

So what is the simplest version of an experiment that can shed some light on these questions?

I have been inspired by the simplicity of the models that physicists and dynamicists use. In particular, I have become interested in the forced double-well oscillator studied by Francis Moon and colleagues at Cornell. I think this is also known as Duffing's mechanical oscillator (see here, here). As Strogatz points out, as soon as nonautonomous systems (in the DS sense) are considered, strange attractors start turning up everywhere. I won't go into any detail about the model; if you want to learn more about it you can start off with Strogatz's ND&C, pp. 441-446. The important points are that the model is extremely simple and that it produces chaotic behavior.
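For reference, and up to the choice of constants, the forced double-well oscillator Strogatz discusses has the form

    \ddot{x} + \delta\,\dot{x} - x + x^3 = F\cos(\omega t)

The unforced system has two stable equilibria at x = ±1 (the two wells of the potential); with weak damping and periodic forcing, trajectories can hop irregularly between the wells, and that is where the strange attractor turns up.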

So what would an 'ALife' version of that simple model look like? An agent in a 1D world with a temperature gradient of the following form:

[Figure: the temperature gradient along the 1D environment (environment.png)]

The agent senses the local temperature and can move left and right.
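Purely as an illustration (the actual gradient is the one shown in the figure above, and all the numbers here are invented), a two-peak temperature profile over a 1D world – the analogue of the two wells of the Duffing potential – could be coded along these lines:

    /* Illustrative only: a two-peak temperature profile over a 1D world of
     * length WORLD_LEN. All constants are invented; the real gradient is
     * the one in the figure above. */
    #include <math.h>

    #define WORLD_LEN 100.0

    double temperature(double x)
    {
        double p1 = 0.25 * WORLD_LEN;   /* centre of first peak  */
        double p2 = 0.75 * WORLD_LEN;   /* centre of second peak */
        double width = 8.0;             /* peak width            */
        return exp(-(x - p1) * (x - p1) / (2.0 * width * width))
             + exp(-(x - p2) * (x - p2) / (2.0 * width * width));
    }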

The first task could be to simply go to the places of highest temperature – highly reactive systems would be expected. The second task would be to do the same, but in a variety of different ways – highly non-reactive / ‘internally-driven’ systems would be expected for this one.

I wrote some code for this yesterday. It is only about 350 lines of C code in all (including the CTRNN and the GA), and I ran a first batch of experiments overnight. In summary, the fitness is to maximize both the time spent in the hottest regions and the diversity of behavior. I am using 3-node CTRNNs with a slightly larger family of transfer functions (each node with an evolvable alpha between 0.5 and 1.0), evolved for 25000 evaluations (equivalent to 500 generations) with a population of 50 individuals. The overnight result is that 5/20 of the evolutionary runs were quite successful.
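For concreteness, here is a minimal sketch of the kind of Euler-integrated CTRNN update involved (not the actual experiment code; the struct and function names are purely illustrative). The standard logistic transfer function is used below; the alpha-CTRNN variant would swap sigma() for the evolvable, possibly nonmonotonic transfer function mentioned above.

    /* Minimal sketch of a 3-node CTRNN Euler update (illustrative only). */
    #include <math.h>

    #define N_NEURONS 3

    typedef struct {
        double y[N_NEURONS];                /* neuron states          */
        double tau[N_NEURONS];              /* time constants         */
        double bias[N_NEURONS];             /* biases                 */
        double w[N_NEURONS][N_NEURONS];     /* w[j][i]: weight j -> i */
    } Ctrnn;

    static double sigma(double x)           /* logistic transfer function */
    {
        return 1.0 / (1.0 + exp(-x));
    }

    /* One Euler step of size dt; input[] holds the external (sensor) input. */
    static void ctrnn_step(Ctrnn *net, const double input[N_NEURONS], double dt)
    {
        double dy[N_NEURONS];
        for (int i = 0; i < N_NEURONS; i++) {
            double sum = input[i];
            for (int j = 0; j < N_NEURONS; j++)
                sum += net->w[j][i] * sigma(net->y[j] + net->bias[j]);
            dy[i] = (-net->y[i] + sum) / net->tau[i];
        }
        for (int i = 0; i < N_NEURONS; i++)
            net->y[i] += dt * dy[i];
    }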

In the next figure I depict two trajectories/behaviors (red and blue) of the best evolved network. The x axis is time. The y axis is the position along the one-dimensional physical space. The underlying shades of gray represent the temperature of the regions (white = hot, black = cold). The simulation is entirely deterministic – there is no noise. The starting states of the two trajectories are nearly identical; the only difference is in the exact starting position, which is offset by a tiny distance (0.005, to be exact).

[Figure: two example trajectories, red and blue, of the best evolved agent over the temperature gradient (exampletrajectory.png)]

The red and blue lines represent the behavior of two nearly identical runs of one evolved agent. The point is that what determines which of the two peaks to head for in any given case is more up to the internal dynamics than to the environment – as the latter is nearly the same in both cases. Perhaps more interesting still is to analyse the actual internal dynamics. Is it chaotic? Quasi-periodic? A typical, boring attractor? Although this is only preliminary, I think it might still be interesting. What do you think?

Of course, there are many things still to do. First, I shall investigate measures of how chaotic a non-autonomous system can be and use them to analyse the best evolved agents. Second, compare evolving CTRNNs with different transfer functions (i.e. monotonic versus nonmonotonic). Any other ideas?
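One standard option is a Benettin-style estimate of the largest Lyapunov exponent of the coupled agent-environment system: run two nearby copies of the system, periodically renormalize their separation, and average the logarithmic growth rate (a positive exponent indicating chaos). A rough sketch, assuming a hypothetical simulate_step() that advances the full coupled state by dt:

    /* Sketch of a Benettin-style estimate of the largest Lyapunov exponent.
     * DIM and simulate_step() are hypothetical: the state would hold the
     * neuron states plus the agent's position, and simulate_step() would be
     * the existing coupled agent-environment update. */
    #include <math.h>

    #define DIM 4   /* e.g. 3 neuron states + agent position (assumed) */

    extern void simulate_step(double state[DIM], double dt);   /* assumed */

    double largest_lyapunov(const double init[DIM], double d0,
                            double dt, int steps, int renorm_every)
    {
        double a[DIM], b[DIM], sum_log = 0.0;
        int renorms = 0;

        for (int i = 0; i < DIM; i++) { a[i] = init[i]; b[i] = init[i]; }
        b[0] += d0;                                /* tiny initial perturbation */

        for (int n = 1; n <= steps; n++) {
            simulate_step(a, dt);
            simulate_step(b, dt);
            if (n % renorm_every == 0) {
                double d = 0.0;
                for (int i = 0; i < DIM; i++)
                    d += (b[i] - a[i]) * (b[i] - a[i]);
                d = sqrt(d);
                sum_log += log(d / d0);
                renorms++;
                for (int i = 0; i < DIM; i++)      /* rescale b back to distance d0 */
                    b[i] = a[i] + (b[i] - a[i]) * (d0 / d);
            }
        }
        /* average exponential divergence rate per unit of simulated time */
        return sum_log / (renorms * renorm_every * dt);
    }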


One thought on “Non environmentally determined decision making and chaotic CTRNNs”

  1. I will post here two questions that I got by mail:

    What was the fitness measure for diversity? Did the simple difference work fine?
    Yes, the simple difference seems to do the trick. I also plan to use measures of chaos to analyse the evolved dynamical systems to see whether they are chaotic or something else.

    How many different trials did an agent have to perform? You only show two …
    During evolution, first, 50 different starting points are chosen. The agent is then started from each of those plus one infinitesimally close starting point (thus 100 trials per fitness evaluation per agent). The difference between the trajectories of each infinitesimally close pair is maximized, as well as the overall time spent in the hottest regions across all trajectories – quite simple really (see the sketch after this comment for a rough outline). Each run lasts 500 units of time. I don't show here any figure of what happens during an evolutionary run.

    Regarding the trajectories shown here, yes, those are just two for an already evolved agent. As far as I can tell, any two trajectories picked (out of the 40 that I have seen), even if they start with a very small difference, end up doing very different things. The pair that I show is not special in any sense.
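For what it is worth, a rough sketch of how such a fitness evaluation could be structured (all helper names, and the way the two objectives are combined, are assumptions rather than details of the actual code):

    /* Rough sketch of the fitness evaluation described above (illustrative
     * only). run_trial(), heat_score(), trajectory_distance() and
     * random_start() are hypothetical helpers; combining the two objectives
     * as a plain sum is also an assumption. */
    #define TRIALS      50       /* starting points per evaluation          */
    #define EPSILON     0.005    /* offset of the paired starting point     */
    #define TRIAL_STEPS 500      /* samples per 500-time-unit run (assumed) */

    extern void   run_trial(double start, double trajectory[TRIAL_STEPS]);
    extern double heat_score(const double trajectory[TRIAL_STEPS]);
    extern double trajectory_distance(const double a[TRIAL_STEPS],
                                      const double b[TRIAL_STEPS]);
    extern double random_start(void);

    double evaluate_agent(void)   /* the agent's network is set up elsewhere */
    {
        double heat = 0.0, diversity = 0.0;
        static double traj_a[TRIAL_STEPS], traj_b[TRIAL_STEPS];

        for (int t = 0; t < TRIALS; t++) {
            double x0 = random_start();
            run_trial(x0,           traj_a);   /* original starting point */
            run_trial(x0 + EPSILON, traj_b);   /* nearby starting point   */
            heat      += heat_score(traj_a) + heat_score(traj_b);
            diversity += trajectory_distance(traj_a, traj_b);
        }
        return heat + diversity;   /* both terms are to be maximized */
    }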
