Persistent Data Structures

I read recently the post “Persistence is a subset of immutability” and was then surprised to discover that Clojure shared the exact same concept of “persistent data structure”, which turned out to be a concept common to all functional language.

Still, I don’t like this terminology of “persistent data structure”.

“A persistent data structure is a data structure which always preserves the previous version of itself when it is modified; such data structures are effectively immutable, as their operations do not (visibly) update the structure in-place, but instead always yield a new updated structure.”

This definition is completely confusing to me. One the one hand it says that the structure is immutable and that invoking an operation on such a structure produces a new structure. This is ok to me.

One the other hand, it speaks about some form of versioning, as if both the old and the new version had some relationship. This implies then a notion of identity.

“An identity is indeed a logical entity which associate a series of different values over time.”

This is not ok. Both structures are maybe related implementation-wise (the newest structure being merely a diff over the previous one), but from point of view of the client, both structures are unrelated. From point of view of the client, data structure are immutable. Period.

Heraclitus, a greek philosopher, pioneered the concept of identity and state with his famous saying: “You can not step twice into the same river.” He recognizes the changing of objects with the flow of time.

This corresponds to the mutable references in Clojure. References are the only construction that can be mutated.

If the state of an entity changes over time, Einstein was the first one to recognize that the notion of simultaneity is not absolute but depend on the observer.

This corresponds to the transactional memory model in Clojure. At a given time, two different observers will observe two different states of the same entity.  We can indeed see transactional memory as a kind of special asynchronous replication: each observer (thread) has a complete snapshot of the world and synchronization between the snapshots happens at commit time. The sequence of state that can be observed is however always the same and the causality between events is preserved.

More links

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s