Iterated Function Systems / Crayon Impressions of Leaves


I am about half way through my time working here and I have reached a crucial turning point in my research. All this time the goal has been to make progress on the reverse engineering problem, but it was first necessary to familiarise myself with the forward-engineering. I am now familiar with that forward-engineering. In this post I sum up what I’ve learned and embellish the descriptions with pretty outputs from the program I wrote.

Note: if you’re not drawn in by technical details, don’t go! Skip to the interactive demonstration I’m very proud of it! Maybe the details will be more interesting after some intriguing pictures.

There’s this one detail about the creation of these fractal images that really confused me to begin with (and still to an extent now, although familiarity softens much frustration), and that’s what I’d like to talk about here. It’s this mismatch between the theory as I understood it, and the actual methods used in practice which didn’t seem at all related to that theory.

Here’s how I originally understood the process used to generate reproductions of fractal images:

  • these fractals have smaller copies of themselves embedded within themselves
  • if you specified the exact way that the copies were embedded, then you could reproduce the image by repeatedly making copies according to that rule
  • For example: if you started with the stem of the fern, then copying it around over and over again, making smaller and smaller copies of that one stem, you would end up with a skeletal model of the same fern that - after enough iterations - would look just like the original.

My reading on the topic of IFS fractals has informed me that this conception is not quite in line with the common formal definition. This is the gist of the mathematical formulation:

  • Rules for how to make copies define a ‘contraction’ on the input space. (this is the formal way of saying the copies must be smaler than the original region, which makes intuitive sense because if that was not the case the process would grow outwards endlessly rather than settling down inwardly)
  • the visual form we call the ‘fractal’ corresponds to the set of all points that are invariant under this contraction. ie. After applying all the rules, if the resulting set of points is identical to the set you started with then that set was invariant under the contraction, so you have identified the fractal set, called the attractor.
  • this set of points is called the ‘attractor’ because no matter the region you start with, the application of the contraction rule always reduces the difference between the input region and the fractal set (an observation which can formalised by defining a distance function on the set of all compact regions in the space; the Hausdorf distance).
  • It is therefore possible to generate an image of the attractor by choosing some reasonable starting set then repeatedly applying the contraction to that region until its form has mangled into one that ceases to change significantly on further contractive iterations.

Now that is all well and good, but here’s the screwy thing: the method for generating these images is nothing like either of those descriptions. Nothing like it.

I’m going to describe the actual method which is used in detail so hopefully you can understand how strange it is.

Making the rules

The idea of ‘contractions’ on the input space is a powerful one because it is not very prescriptive about the sorts of functions we have to use as transformation rules. It just enforces this intuitively reasonable notion that the repeated application of the rule has to result in a settling down not a blowing up. We will not work with this very general notion however, we are going to focus on one specific type of transformation which is particularly well suited to characterising fractal images as we are familiar with them: Affine transformations. Let’s define that term.

A transformation is a rule that tells us how to move around different points based on their position. Once we know how to move individual points around we can then ask what happens to every point within a shape or region to see how it warps space at a more macro level.

A rule that says ‘all points move to the position $(4,3)$’ is a perfectly good example of a transformation.

A rule that says ‘multiply the $x$ coordinate by two’ is also a perfectly good transformation.

They can be as simple or as complex as you like, so long as the definition is unambiguous for every input position.

Transformation that completely mangles any input region into something unrecognisable will not be much use for describing fractal self similarity. We choose affine transformations to work with then because, like the name implies, while they may warp their input regions, the resulting outputs always share an affinity with the corresponding inputs.

Each affine transformation can be uniquely specified by the way it morphs a rectangle in the input space to a parallelogram of some description in the output space, so you can think of affine transformations as encompassing all of: shifting, scaling (not necessarily by the same amount in each direction), and rotating. Mathematically speaking, they are given by the composition of a linear transformation and a translation:

Where the light green matrix is the linear part of the composition, and the orange vector is the translation vector.

The translation does nothing more than move the region from one location to another, whereas the linear transformation captures all of the squishing and rotating.

Using the Rules

That’s the bit which every conception of these IFS fractals agrees on. Affine transformations characterise the fractal. So you’ve settled on your copying rules, how do you draw a picture with them? The formal method outlined above says start with some other region then just keep morphing it with your rules over and over until it looks right. Pleasantly clean in its conception but somewhat of a pain computationally, to have to apply the function for every single point in some region. Here’s what’s actually done:

The Image generation process, presented interactively and step by step:

↑ Select one of the above predefined rule sets to apply procedure to ↑

So there you have it, for some reason if you just choose any random starting point then plot the result of (also randomly) applying each transformation rule to that point over and over again, the result is a shockingly accurate approximation of your image.

Notes / Clarifications:

  • This process is often referred to as the ‘chaos game’ method.
  • When I say that the transformations are chosen at random, that is true, and the probability of any particular transformation being chosen at a given iteration is something you have to specify before beginning the process. (Tweaking these probabilities has the effect of making regions in the output associated with the corresponding transformation either more or less dense with points ending up there)
  • About my implementation: It’s not very fast, at least at the time of writing. The methods I used to implement this IFS plotter gave no real consideration to computational efficiency. My priority was to model in the code my understanding of the problem as cleanly as possible in a familiar language. At some point in the future I may rewrite the performance critical parts, as it stands however, plotting >500,000 points at a time is quite resource intensive.

Why does it work?

I think what this really instills in me is the sense that there is something very robust about this ‘attractor’ property of the set. You really don’t have to try too hard to find it. I think this method of imaging the set is like taking a crayon imprint of a leaf through paper or something. A scattering of points over the relevant region can hardly not find it because its ridges and valleys are just so prominent. The mathematics that grantees this sort of long term probabilistic certainty happens to belong to a branch called ‘ergodic theory’. That is the mathematics of long term behaviour of dynamical systems defined on probability measures.

Probability measures? Yes one detail I did not mention above is that a more nuanced way of imaging the attractor, akin to the other formal definition given, comes from considering not input regions, but something more like input distributions. Probability measures, in fact. Then rather than just morphing one region into another, you flow one distribution into the next, where regions have a higher value according to the measure if you expect more points to map to that region after the transformations are applied. It allows you to go from a set with hard boundaries in black and white to a grey-scale reproduction with light and dark regions showing you where the transformations collect more or less of the points. It turns out that the picture you get with this dynamical approach is really a crayon imprint of the invariant probability measure, not the harsh black-and-white in-or-out set invariant.

What now?

Writing this program was frustrating at first but then I got quite into it! So it is very tempting to just keep improving on it, and I have a list of changes I’d like to make to it that could certainly keep me busy for the rest of my time here. I can’t do that though. This part is done. I haven’t reverse engineered anything! So now I’m dragging myself away from this part of the project. My next step is to try to reproduce part of the inverse procedure outlined in the 1994 paper “Inverse and Approximation Problem for Two-Dimensional Fractal Sets”. I hope to have implemented the very first part of their method (application of a wavelet transform to identify periodic oscillations around regression line corresponding to the scaling factor of transformation rules) by… Wednesday? I think that’s optimistic. It’s kind of damning that all this work likely won’t even get me up to date with the 2000’s progress on the problem but that’s how it goes, I know. I’m learning lots.