The Cost of Volatile

Assessing the scalability of programs and algorithms on multicore is critical. There is an important literature on locks and locking schemes, but the exact cost of volatile is less clear.

For the software composition seminar, Stefan Nüsch and I decided to look at this topic. Essentially, we devised a benchmark where multiple threads would access objects within a pool. Each thread has a set of objects to work with. To generate contention, the sets could be fully disjointed, have partial overlap, of have a full overlap. The ratio of reads and writes per thread was also configurable.

On an AMD 64 Dual Core, the graph looks as follows:

On a i7 Quad Core, the graph looks as follows:

We clearly see that different architectures have different performance profiles.

In future work, we could try to reproduce false sharing and assess the impact of other forms of data locality.

More details about the benchmark and methodology can be found in his presentation. The code in on github.

Here a some links about the semantics of volatile, and mememory management in general.

Leave a comment Cancel reply