In the last article we learnt how to create modified copies of an immutable in PHP. This one tackles an issue I have so far skirted around: objects in immutable data structures.

This article is part of a series I have written on the topic of immutability in PHP code:

  1. Part one - a discussion of caveats and a simple scalar handling immutable
  2. Part two - improve the process of creating modified copies of the immutable
  3. Part three - objects in immutable data structures and a generalised immutable implementation

What’s the problem with objects?

Objects, or instances of classes, are effectively passed by reference in PHP: what gets copied is a handle to the object, not the object itself. Any change to the object will be reflected in every place that handle has been passed to. This is different to scalar values like strings, which are passed by value instead.
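
A minimal sketch of that behaviour, assuming a function called addItem() that receives a stdClass instance. The property name item is made up for illustration.

```php
<?php
// Adds a property to whatever object it is handed and returns nothing.
function addItem(stdClass $class): void
{
    $class->item = 'added by addItem()';
}

$class = new stdClass();
addItem($class);

// The caller's object has changed even though nothing was returned.
var_dump(isset($class->item)); // bool(true)
```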

Here you can see a function called addItem() that adds a property to a stdClass instance, which produces a side effect. The original $class is also updated because it refers to the same object, so if we dump the variable we can see its value has changed.

Now consider the same example with a simple scalar string where pass by value takes effect.
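
Roughly the same sketch with a string. The function body and the variable name $string are assumptions.

```php
<?php
// Receives a copy of the string, so only the local copy is changed.
function addItem(string $value): void
{
    $value .= ' plus an item';
}

$string = 'original';
addItem($string);

var_dump($string); // string(8) "original"
```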

Here the original value remains intact because, unlike an object, there is no reference to it from within the addItem() function.

These side effects make putting an object into an immutable data structure difficult. Someone with access to the reference could simply change the object after the fact - thus breaking immutability.

What about resources?

Turns out the same issues plague resources as well. They are just references to a resource ID so any change to one will affect all those that also reference it. Simply moving the pointer in a file resource would break immutability.
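
A small sketch of the pointer problem. A php://temp stream stands in for a real file here (an assumption, to keep the example self-contained).

```php
<?php
// php://temp is an in-memory stream that behaves like a file handle.
$fh = fopen('php://temp', 'w+b');
fwrite($fh, 'abcdef');
rewind($fh);

$first  = fread($fh, 3); // "abc"
$second = fread($fh, 3); // "def", the same call now returns different data
fclose($fh);

var_dump($first === $second); // bool(false)
```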

This happens because fread() advances the file pointer as it reads. Even if we rewind() the pointer, there is no guarantee of getting the same value back.

An additional issue with resources is that they are, by their nature, not a finite thing so even if you did prevent changes within your program you could still end up having mutations - someone updating a file on disk for example.
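
The effect of an outside change can be sketched in a repeatable way with an ordinary temporary file. The file, its contents and the variable names are assumptions chosen for illustration.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'imm');
file_put_contents($path, 'first version');

$fh = fopen($path, 'rb');
$before = fread($fh, 5);

// Something outside our handle rewrites the file on disk.
file_put_contents($path, 'other version');

// Same handle, same position, same length, yet the data differs.
rewind($fh);
$after = fread($fh, 5);

fclose($fh);
unlink($path);

var_dump($before, $after);
```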

In between the two calls to fread() the data in the resource has changed through outside intervention. A new random value has effectively been written to /dev/urandom meaning the dumped value changes too even though we have rewound the pointer and used the same offset/index of 3.

Note that bin2hex() converts the binary bytes that /dev/urandom produces into a hexadecimal representation, making them more legible to humans. The conversion also doubles the length of the value, because each byte becomes two hexadecimal characters: seven raw, largely unprintable bytes become dd4044f5f84ed6 in hexadecimal notation. This is why the value requested may be 3 bytes, but the string that comes back is actually 6 characters long.

However, if your data source is not binary then you do not need to use bin2hex() in your code.

What can we do to fix it?

In the case of resources, it is too hard to protect them from unauthorised changes so we won’t bother. If you need an immutable resource you’ll have to fetch it as a scalar first and then put that into your immutable data structure.
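
A sketch of fetching the value as a scalar first. A php://temp stream filled with random bytes stands in for the resource (an assumption), and $randomStr is the plain string you would then store.

```php
<?php
$fh = fopen('php://temp', 'w+b');
fwrite($fh, random_bytes(16));
rewind($fh);

// Copy three bytes out as a plain string; bin2hex() only makes them printable.
$randomStr = bin2hex(fread($fh, 3));
$copy = $randomStr;

// The resource moves on, but the string we extracted does not.
fread($fh, 3);
fclose($fh);

var_dump($randomStr === $copy); // bool(true)
```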

As you can clearly see the value does not change between prints in this example because we are accessing a scalar string instead of a resource directly. You could just as easily feed the $randomStr into the Immutable definitions that are described further on.

In the case of objects though there is something we can do to protect the immutable from their pass by reference nature. For simple objects you can simply clone the incoming object value when setting it in an immutable data structure. This will create a new copy of the object with its own reference and, therefore, break the dependency on the previous reference - the two objects are not linked by reference. Any change in one will not be reproduced in the other.
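
A minimal sketch of the cloning approach. The Immutable class below is a stripped-down, hypothetical stand-in for the one built in the earlier parts, and the property name is made up.

```php
<?php
final class Immutable
{
    private $data;

    public function __construct($data)
    {
        // Clone incoming objects so the caller's handle no longer reaches our copy.
        $this->data = is_object($data) ? clone $data : $data;
    }

    public function get()
    {
        // Clone on the way out too, so the stored copy cannot be edited via the getter.
        return is_object($this->data) ? clone $this->data : $this->data;
    }
}

$test = new stdClass();
$test->name = 'original';

$imm = new Immutable($test);
$test->name = 'changed later';

var_dump($imm->get()->name); // string(8) "original"
```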

By cloning the object we have created a duplicate instance and referenced that instead from within the Immutable. This means that when $test is later updated it does not affect the value inside $imm as it does not have the same reference as $test.

So, there it is, we are done.

Deep nesting though

Yeah, right, not so fast! The previous example can easily be broken with one small change: provide an object for storage inside $test.
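
A sketch of how that breaks, reusing the same hypothetical stand-in for Immutable. Only the top-level object is cloned, so the nested $value is still shared.

```php
<?php
final class Immutable
{
    private $data;

    public function __construct($data)
    {
        // A plain clone is shallow: nested objects are not copied.
        $this->data = is_object($data) ? clone $data : $data;
    }

    public function get()
    {
        return is_object($this->data) ? clone $this->data : $this->data;
    }
}

$value = new stdClass();
$value->name = 'original';

$test = new stdClass();
$test->value = $value;

$imm = new Immutable($test);

// We still hold a handle to the nested object, so we can reach inside $imm.
$value->name = 'changed later';

var_dump($imm->get()->value->name); // string(13) "changed later"
```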

As you might expect, cloning $test when it is set inside Immutable does not mean its contents are cloned too. Unfortunately, $value is still referenced directly, so any subsequent updates get reflected across all referring locations, including inside our Immutable.

The same would be true of any immutable containing an array too. You could just set one of the array elements to be an object and change it later just like $value in this object example.

Long story short, this immutable is in fact mutable.

Immutable deep nesting with __clone()

You could work around the lack of protection by implementing the __clone() magic method in all classes that might be put inside an immutable. You could then clone all objects stored in the class when it, itself, is cloned. A simplified demonstration of how this could work is below.
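
One way this might look. MySimpleClass here is a naive sketch that assumes every property is either a scalar or a cloneable object.

```php
<?php
class MySimpleClass
{
    public $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function __clone()
    {
        // Runs on the fresh copy: swap each nested object for its own clone.
        foreach (get_object_vars($this) as $name => $property) {
            if (is_object($property)) {
                $this->$name = clone $property;
            }
        }
    }
}

$stdClass = new stdClass();
$original = new MySimpleClass($stdClass);
$copy = clone $original;

var_dump($original->value === $stdClass); // bool(true)
var_dump($copy->value === $stdClass);     // bool(false)
```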

As you can see, MySimpleClass is deliberately naive to make the demonstration easier to grasp. You will also note that the object ID jumps to 5 when the final var_dump() is applied; this is because __clone() in MySimpleClass was triggered.

If we step through the implementation again and attempt to make a change to $stdClass then it might be clearer.
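
A sketch of that walk-through. The plain clone below stands in for what the immutable does when a value is set, and the property names are assumptions.

```php
<?php
class MySimpleClass
{
    public $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function __clone()
    {
        // Naive: assumes $value always holds an object.
        $this->value = clone $this->value;
    }
}

$stdClass = new stdClass();
$stdClass->name = 'original';
$test = new MySimpleClass($stdClass);

// Stand-in for the immutable storing the value: it keeps a clone.
$stored = clone $test;

// Attempt to change the nested object through the handle we still hold.
$stdClass->name = 'changed later';

var_dump($test->value->name);   // string(13) "changed later"
var_dump($stored->value->name); // string(8) "original"
```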

Unfortunately, this would require you to trust developers to actually implement this correctly and there would be no way of accurately verifying that a __clone() method has been specified properly.

To solve this issue we must eschew quite a bit of flexibility and only allow known immutable objects to be set inside the Immutable. This means that we have to recursively step down through any arrays looking for mutable classes and rejecting them too.

Generalised immutable deep nesting

For those of us who want a more stringently protected immutable we can generalise the problem by making an immutable class that can sanitise itself. It will only allow known immutables to be set as data inside it thereby preventing nested object state changes, which would break its immutable property.

This class can then be implemented to create immutable lists of things.
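
A minimal sketch of such a class. The method names sanitiseInput() and sanitiseObject() come from the discussion here; the constructor, getData() and the usage data are assumptions.

```php
<?php
class Immutable
{
    private $data;

    public function __construct(array $data)
    {
        $this->data = $this->sanitiseInput($data);
    }

    public function getData(): array
    {
        // Safe to hand out: arrays are copied by value and nested objects are Immutable.
        return $this->data;
    }

    // Walks the array recursively; scalars pass through, objects are vetted.
    protected function sanitiseInput(array $input): array
    {
        return array_map(function ($value) {
            if (is_array($value)) {
                return $this->sanitiseInput($value);
            }
            if (is_object($value)) {
                return $this->sanitiseObject($value);
            }
            return $value;
        }, $input);
    }

    // The type hint makes PHP throw a TypeError for any object that is not an Immutable.
    protected function sanitiseObject(Immutable $object): Immutable
    {
        return clone $object;
    }
}

$inner = new Immutable(['a' => 1]);
$list = new Immutable(['numbers' => [1, 2, 3], 'nested' => ['inner' => $inner]]);

$rejected = false;
try {
    new Immutable(['bad' => new stdClass()]);
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($rejected); // bool(true)
```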

The main new concept here is the method sanitiseInput(), which recursively steps through the data array, cloning any objects it finds. The cloning itself happens in sanitiseObject(), which, you will also note, uses a type hint to ensure only instances of Immutable can be set as values. This is how we ensure that only known immutable objects are being set inside an Immutable.

If you need to check for more than one known immutable class then you could check in a number of ways:

  • extend a base or abstract class when implementing them all,
  • use an interface that they all implement or
  • use a simple set of instanceof checks.

Something like this might do it.
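
A sketch of the instanceof approach. Money and Coordinates are hypothetical classes declared only so the example runs, and isKnownImmutable() is an assumed helper name; DateTimeImmutable is PHP's built-in class.

```php
<?php
// Hypothetical known-immutable classes.
final class Money {}
final class Coordinates {}

// True only for objects of a class we have vetted as immutable.
function isKnownImmutable($value): bool
{
    return $value instanceof Money
        || $value instanceof Coordinates
        || $value instanceof DateTimeImmutable;
}

var_dump(isKnownImmutable(new Money()));    // bool(true)
var_dump(isKnownImmutable(new stdClass())); // bool(false)
```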

Whichever way you choose or prefer is up to you of course.

So, that finally gives us a simple immutable structure that can store objects, scalars and arrays. You can use the techniques discussed in the previous article (part two) to easily create modified copies of your new immutable.

Using a generator to make generalisation easier

The same functionality can also be written using a generator class to create the immutable data structure. In this section though we are going to be extending the idea just a little further to add some convenience methods.

The data structure

Turning to the structure itself, we are going to add a few methods that will make data access more robust in the generalised class. To this end, it is useful to know if a value exists so we are going to add a has($key) method. This will also be used by a getOrElse($key, $default) function to allow a default value to be provided where a key does not already exist.

This is the complete immutable structure that our generator will populate for us.
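
A sketch of what the structure might look like. The method names has(), get(), getOrElse(), getAsArray(), __toString() and __set_state() come from this article; the factory name create(), the exception thrown by get() and the sample data are assumptions.

```php
<?php
final class ImmutableData
{
    private $data;

    // Private: instances can only come from the static methods below.
    private function __construct(array $data)
    {
        $this->data = $data;
    }

    // Hypothetical factory; the generator is expected to pass in sanitised data.
    public static function create(array $data): ImmutableData
    {
        return new self($data);
    }

    // Lets PHP rebuild an instance when var_export() output is parsed.
    public static function __set_state(array $state): ImmutableData
    {
        return new self($state['data']);
    }

    public function has($key): bool
    {
        return array_key_exists($key, $this->data);
    }

    public function get($key)
    {
        if (!$this->has($key)) {
            throw new OutOfBoundsException("No value stored for key: $key");
        }
        return $this->data[$key];
    }

    public function getOrElse($key, $default)
    {
        return $this->has($key) ? $this->data[$key] : $default;
    }

    public function getAsArray(): array
    {
        return $this->data;
    }

    // A PHP-parsable representation of the stored data.
    public function __toString(): string
    {
        return var_export($this->data, true);
    }
}

$imm = ImmutableData::create(['name' => 'Ada', 'age' => 36]);

var_dump($imm->has('name'));               // bool(true)
var_dump($imm->getOrElse('email', 'n/a')); // string(3) "n/a"
```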

Unlike the last Immutable this one makes use of static methods and prevents access to the class constructor by making it a private method. This skips the $mutable true/false dance we have been doing elsewhere. I prefer the dance, but this serves as a nice example of another method to achieve a similar result.

You will notice that there are actually a few other methods in there that we have not discussed yet. There is a get($key) that allows us to access a value by its key easily and getAsArray() has taken over the duties of returning the complete $this->data array. Finally, there is a __toString() method, which produces a PHP parsable string representation of the stored data.

A generator in detail

Now onto the generator that will produce the populated instances of the ImmutableData class.

The main aim of this generator is to make it as generalised as possible, allowing a consumer to store the widest possible selection of types and values whilst ensuring immutability is not broken. In tandem with this we will also add some methods to make modifying a copy of the immutable easier.

All the data will be stored in an array internally to easily facilitate different data shapes that may be thrown at the Immutable class.

All data will need to be stored against a key so that it can be accessed again easily.
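
A rough sketch of how such a generator could be put together. The class name ImmutableGenerator, the create() factory and every method body are assumptions, including the guess that arr() sets several keys at once; ImmutableData is cut down to the minimum needed to run the example.

```php
<?php
// Compact stand-in for the ImmutableData structure.
final class ImmutableData
{
    private $data;

    private function __construct(array $data)
    {
        $this->data = $data;
    }

    public static function create(array $data): ImmutableData
    {
        return new self($data);
    }

    public function getAsArray(): array
    {
        return $this->data;
    }
}

final class ImmutableGenerator
{
    private $data;

    private function __construct(array $data)
    {
        $this->data = $data;
    }

    // Start empty, or seed the generator from an existing ImmutableData.
    public static function with(?ImmutableData $existing = null): ImmutableGenerator
    {
        return new self($existing === null ? [] : $existing->getAsArray());
    }

    public function set(string $key, $value): ImmutableGenerator
    {
        $this->data[$key] = $this->sanitise($value);
        return $this;
    }

    // Assumed behaviour: set several keys at once from an array.
    public function arr(array $values): ImmutableGenerator
    {
        foreach ($values as $key => $value) {
            $this->data[$key] = $this->sanitise($value);
        }
        return $this;
    }

    public function unset($key): ImmutableGenerator
    {
        unset($this->data[$key]);
        return $this;
    }

    public function unsetArr(array $keys): ImmutableGenerator
    {
        foreach ($keys as $key) {
            unset($this->data[$key]);
        }
        return $this;
    }

    public function build(): ImmutableData
    {
        return ImmutableData::create($this->data);
    }

    // Scalars pass, arrays are checked recursively, unknown objects and resources are refused.
    private function sanitise($value)
    {
        if (is_array($value)) {
            return array_map(function ($item) {
                return $this->sanitise($item);
            }, $value);
        }
        if (is_resource($value) || (is_object($value) && !($value instanceof ImmutableData))) {
            throw new InvalidArgumentException('Only scalars, arrays and ImmutableData may be stored.');
        }
        return $value;
    }
}

$immX = ImmutableGenerator::with()->set('name', 'Ada')->arr(['langs' => ['php', 'js']])->build();
$immY = ImmutableGenerator::with($immX)->set('child', $immX)->unset('langs')->build();

var_dump(array_keys($immX->getAsArray())); // keys: name, langs
var_dump(array_keys($immY->getAsArray())); // keys: name, child
```

Because PHP arrays are copied by value, building $immY from $immX leaves $immX untouched.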

Again this class uses a private constructor and static method to prevent calls to the constructor. You could use the $mutable true/false setup here, very easily, if you wanted to though.

Simple usage

These two classes can now be used to generate an immutable data structure like so.

This uses the __toString() method to print a simple and parsable text representation.

You can also put a trusted object into the immutable as well - in this case we will just use the immutable we created earlier, $immX.

Again, the output is parsable by the PHP engine so you will notice the slightly weird __set_state() magic method call in there - you can safely ignore this and concentrate on the data itself. This magic method is implemented in the ImmutableData class that we defined earlier and it merely serves to populate a class with a set of data/state when a var_export() output is parsed by PHP.

So, what is the point if we cannot get our data out? Well, remember those get(), has() and getOrElse() methods? They can be used to quickly and relatively easily access the stored data by key. The methods are fairly self-explanatory so here are a few examples just to demonstrate their usage against $immY.

This should give you enough of a foundation to build additional functions like map, reduce, etc. upon, should you choose to do so. You could also write methods to fetch items by their value rather than their key.

Modifying copies of the immutable structure using the generator

The key to making immutables useful is allowing consumers to easily and quickly create modified copies of the underlying data. This has been written into the generator we defined earlier and can be best described with a few examples. Note that the with() static method can accept an ImmutableData object as its first parameter and modification is exactly what this is for. You can then use set() to add or modify values.

In the result we should see our new properties added to the stored array from $immY.

Of course, you can also use arr() or setInt() here in the same way too when setting new values or overwriting existing ones. Just set a value with a key that already exists in the structure and you will overwrite it.

This would result in a data structure like the following.

The generator can also be used to remove items from the data list quite simply. You can either remove them one at a time with unset($key) or remove many at once by supplying a list to unsetArr().

The execution of this results in the following modified output where a number of keys have been removed.

You can call unset(), unsetArr(), set(), setIntKey() and arr() as often as you like, all in the one building chain, before calling build().

Conclusion

Now you have a generalised immutable data structure that you can store anything you like in. If you have an untrusted object you will need to store it as a string using either serialize() or var_export(). The same goes for resources like file handles, where you will need to extract the value as text before storing it.

Apart from these two caveats though, you are relatively free to use the immutable as you see fit.

If you like this article then you might get a kick out of writing functional PHP code as taught in the Functional Programming in PHP book that I wrote.