Become a Domain Expert

It is only possible to design a feature correctly if you have enough understanding of the underlying business domain. Large software systems may have complex domain models with many special situations that aren’t immediately obvious. Taking the time to understand the domain is crucial for making the right assumptions during development.

Failing to understand the domain correctly will result in features that work “most of the time” but break in special cases. These problems will be detected late in the project, cost time to analyse and debug (see the famous estimate), and may in the worst case require a redesign of the whole feature if the domain model was used inappropriately.

A little example of domain knowledge in the project I’m working on is the following: we have train timetables modelled as sequences of stops at train stations. Using the identifier of the station to uniquely identify a stop in the timetable is not enough, since a train might pass the same train station twice in some cases. Instead, you need to work with pairs of train stations to correctly identify the stops. It’s easy to do if you know it, but if you don’t consider this early on, you will have to rewrite your code later.
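To make the pitfall concrete, here is a minimal Java sketch. The class Timetable, its methods, and the station names A to D are hypothetical, and the sketch assumes that a stop is identified by the pair (previous station, station); the real project may pair stations differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: a timetable is an ordered list of station identifiers.
final class Timetable {
    private final List<String> stations;

    Timetable(List<String> stations) {
        this.stations = new ArrayList<>(stations);
    }

    // Naive lookup by station alone: always returns the first visit,
    // so a second visit to the same station can never be addressed.
    int indexOfStation(String station) {
        return stations.indexOf(station);
    }

    // Lookup by a pair of stations (assumption: previous station + station).
    // Returns the index of the stop, or -1 if the pair does not occur.
    int indexOfStop(String previousStation, String station) {
        for (int i = 1; i < stations.size(); i++) {
            if (stations.get(i - 1).equals(previousStation)
                    && stations.get(i).equals(station)) {
                return i;
            }
        }
        return -1;
    }
}

final class TimetableDemo {
    public static void main(String[] args) {
        // The train visits station B twice.
        Timetable t = new Timetable(Arrays.asList("A", "B", "C", "B", "D"));
        System.out.println(t.indexOfStation("B"));   // 1: the second visit is unreachable
        System.out.println(t.indexOfStop("A", "B")); // 1: first visit
        System.out.println(t.indexOfStop("C", "B")); // 3: second visit
    }
}
```

The sketch deliberately ignores two cases a real model would have to handle: the first stop has no predecessor, and on a circular route a pair of stations could itself repeat.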

Without enough domain knowledge, the intent of the existing code remains obscure. Sure, you will understand what the goal was technically, but the problem it solves on the business side will escape you. This is unfortunate, because it limits your range of action. If you understand the problem on the business side as well, you can put things in context and come up with more technical solutions.

Each minute spent better understanding the domain is worth it. If you want to become a better engineer, become a domain expert.

The Ambitions of Scala

In the object paradigm, a system consists of objects with mutable state, whereas in the functional paradigm, it consists of functions and immutable values. At first, these two worlds seem incompatible.

But not so for Odersky. In 2004 he released the first version of Scala, a language that combines both.

Scala’s roots are object-oriented, sharing the same basic constructs as Java, with which it is fully compatible. Its functional flavor comes from several features borrowed or transposed from concepts in functional languages like Haskell. These include first-class and higher-order functions, currying, pattern matching with case classes, and support for monads and tail recursion.

The marriage is surprisingly elegant. Maybe the two worlds are compatible after all.

But the ambitions of Scala do not stop here. It also aims at being scalable, both in terms of modularity and in terms of expressivity. Scala should support the modularisation of small and large components, and help reduce the gap between the code and the domain concepts.

The many features of Scala’s type system enable scalability along both axes. Traits, for instance, enable a fine-grained modularisation of object behaviors. Implicit conversions, on the other hand, enable existing types in libraries to be extended to express code more clearly.

But more importantly, features of the language create synergies. Abstract type members combined with type nesting enable the cake pattern, a form of dependency injection, or family polymorphism, a way to type check constellations of multiple related classes. The support of call-by-name combined with implicits enables the definition of domain-specific languages.

You can’t help but be amazed by how features sometimes combine. It is for instance possible to map a collection and convert its type at the same time using the special breakOut object. You can even pattern match regular expressions!

Such synergies are possible because the foundations of Scala are principled.

  • First, everything is an object. There are no primitive types. Instead, the type hierarchy has two main roots, one for reference types (AnyRef, with reference semantics) and one for value types (AnyVal, with value semantics).
  • Second, you can abstract over types, values, and functions using parametrization or abstract members. The three constructs support both forms of abstractions consistently.
  • Third, any object that defines an apply() function can be used as a function. This closes the gap between functions and objects. The inverse of apply() is unapply(). Any object that defines unapply() can be used as an extractor for pattern matching.

Take the expression “val l = List(1,2,3)”. This is not native syntax for list construction, but actually the evaluation of the function “apply” on the singleton object “List” with the arguments “1,2,3”. Or take the expression “val (x,y) = (1,2)”. This is not native syntax for multiple assignments, but tuple unpacking using extractors. These principles enable nice extensions of the language.

The flexibility of Scala has a price though: it is easy to learn Scala on the surface, but mastering its intricacies is challenging.

Also, Scala comes with many additional features that seem to exist more for convenience than necessity, making it even harder to master. It is for instance questionable whether structural typing or default parameter values, to name a few, should really have made it into the language. Clearly they are useful and alleviate some pain points of Java, but they also distract from the essence of the language. Scala might at times appear to lack focus.

The richness of the language is acknowledged by the Scala community itself. To quote Odersky, “Scala is a bit of a chameleon. It makes many programming tasks refreshingly easy and at the same time contains some pretty intricate constructs that allow experts to design truly advanced typesafe libraries.”

Scala is a language with many very powerful features and with many ways to do things. It’s up to the developers to use the features well and enforce a consistent programming style. For corporations, these two aspects could be a barrier to adoption. In comparison, a language like Kotlin offers the same basic ingredients but is a lot simpler.

The long bet of Odersky seems to pay off though. Scala has found its audience and made its way to the industry, including top players like Twitter or LinkedIn. It has established itself as a viable alternative.

Scala is a source of innovation and inspiration. While functions were already in object-oriented languages like Smalltalk in the 80s, Scala showed that object-orientation doesn’t mean mutability. The resulting programming style “OO in the large, FP in the small” is gaining traction. Having shown that the combination works, other languages will certainly follow this path.

Ten years after its inception, Scala has a mature and vibrant community of users. To gain further adoption, it must now consolidate its foundation and keep it stable across releases. Fortunately, we can still count on Odersky to continue to innovate at the same time. At the recent ScalaDays 2015, he unveiled his plan to better control mutations of state, not with monads, but with implicit conversions. That is yet another ambitious challenge.

Small, replaceable, composable

If you ask ten developers what characterises good software design, chances are you will get ten completely different answers. There are many principles around, more than enough to choose from. Just have a look at the Wikipedia entry List of software development philosophies to get a glimpse.

Yet, if we synthesise the many principles, it all boils down to three fundamental rules:

  1. Make it small
  2. Make it replaceable
  3. Make it composable

Separation of concerns, single responsibility, high cohesion/low coupling, information hiding, encapsulation, dependency inversion, composition over inheritance, etc. are all subsumed by the three rules above.

Software design is about controlling dependencies between the parts of the software: the inner wiring of the software.

If you were to pick only one rule among the three, pick number two, “Make it replaceable”. Aiming at replaceability, you’re almost forced to make things small and composable.

Rule number two gives a very simple but very effective design tool: the question “is this object replaceable?”. If you can’t imagine another implementation of an object, not even a stub in a unit test, or if you can’t replace one object without replacing other objects, then you should maybe go back to the drawing board. That said, not all objects are replaceable. Domain entities for instance aren’t, and it’s fine.
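As an illustration of the replaceability question, here is a small Java sketch. The names TaxRateProvider and PriceCalculator, and the flat 25% rate, are made up for the example. The calculator depends only on an interface, so a stub can stand in for the real provider.

```java
// Hypothetical dependency: anything that can tell the tax rate of a country.
interface TaxRateProvider {
    double rateFor(String countryCode);
}

// The calculator only knows the interface, so the provider is replaceable.
final class PriceCalculator {
    private final TaxRateProvider taxRates;

    PriceCalculator(TaxRateProvider taxRates) {
        this.taxRates = taxRates;
    }

    // Gross price in cents, rounded to the nearest cent.
    long grossCents(long netCents, String countryCode) {
        return Math.round(netCents * (1.0 + taxRates.rateFor(countryCode)));
    }
}

final class ReplaceableDemo {
    public static void main(String[] args) {
        // A one-line stub replaces whatever the real provider would be
        // (a database lookup, a remote service, ...).
        TaxRateProvider stub = countryCode -> 0.25;
        PriceCalculator calculator = new PriceCalculator(stub);
        System.out.println(calculator.grossCents(1000, "XX")); // 1250
    }
}
```

If writing such a stub had required replacing several other objects as well, that would have been the hint to go back to the drawing board.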

The very same three principles apply to all levels of abstraction of the software: from classes to modules, from modules to systems, and from systems to systems of systems. Microservices and SOA are only slightly disguised forms of “Make it small, replaceable, and composable” at the system level.

What is funny about software design is that problems frequently arise because pieces of the system do too much, not too little. The natural tendency to fix the problem is to add more logic and make it even worse. It takes a lot of courage to instead follow the three principles and break something big into many small pieces. We must be brave if we want to make things small, replaceable, and composable in order to ultimately make them simpler.

Unit Testing Matters

Unit testing is a simple practice that can be explained in one sentence: each method should have an associated test that verifies its correctness. This idea is very simple. What is amazing with unit testing is how powerful this simple practice actually is. At first, unit testing seems like a simple approach to prevent coding mistakes. Its main benefit seems obvious:

Unit testing guarantees that the code does what it should.

This is actually very good, since it’s remarkably easy to make programming mistakes: typos in SQL statements, improper boundary conditions, unreachable code, etc. Unit tests will detect these flaws. Shortly after, you will realize that it’s way easier to test methods that are short and simple. This gives unit testing a second benefit:

Unit testing favors clean code.

This is also very good. Unit testing forces developers to name things and break down code with more care. This will increase the readability of the code base. Now, armed with a growing suite of tests, you will feel more secure to change business logic, at least when the change has local effects. This is a third benefit of unit testing:

Unit testing provides the safety net that enables changes.

This is excellent. Fear is one of the prime factors that lead to code rot. With unit tests, you can ensure that you don’t break existing behavior, and can cleanly refactor or extend the code base. You might object that many changes are not localized, and that unit tests don’t help in such cases. But remember: a non-local change is nothing more than a sequence of local changes. Changes at the local level represent maybe 80% of the work; the remaining 20% is about making sure that the local changes fit together. Unit tests help for the 80% of the work. Integration tests and careful thinking will do for the other 20%.

As you become enamoured with unit testing, you will try to cover every line you write with unit tests. You will make it a personal challenge to achieve full coverage every time. This isn’t always easy. You will embrace dependency inversion to decouple objects, and become proficient with mocks to abstract dependencies. You will systematically separate infrastructure code from business logic. With time, your production code will be organized so that your unit tests can always easily obtain an instance of the object to test. Along the way, you will have noticed that the classes you write are more focused and easier to understand. This is the fourth benefit of unit testing:

Unit testing improves software design.

This is amazing! Unit testing will literally highlight design smells. If writing unit tests for a class is painful, your code is waiting to be refactored. Maybe it depends on global state (yes, I’m looking at you, Singleton), maybe it depends on the environment (yes, I’m looking at you, java.lang.System), maybe it does too much (yes, I’m looking at you, Blob), maybe it relies too much on other classes (yes, I’m looking at you, Feature Envy). Unit testing is “a microscope for object interactions.” Unit testing will force you to think very carefully about your dependencies and minimize them as much as possible. It will naturally promote the SOLID principles, and lead to a better decomposition of the software.
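The dependency on the environment is easy to illustrate. Below is a sketch, with a hypothetical SessionToken class, of how injecting a java.time.Clock instead of reading the system time directly makes expiry logic testable with a fixed clock. The class name and the dates are assumptions made for the example.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical class: the clock is injected instead of calling
// System.currentTimeMillis(), so tests can control "now".
final class SessionToken {
    private final Instant expiresAt;
    private final Clock clock;

    SessionToken(Instant expiresAt, Clock clock) {
        this.expiresAt = expiresAt;
        this.clock = clock;
    }

    // Expired as soon as the current instant reaches the expiry instant.
    boolean isExpired() {
        return !clock.instant().isBefore(expiresAt);
    }
}

final class ClockDemo {
    public static void main(String[] args) {
        Instant issued = Instant.parse("2020-01-01T00:00:00Z");
        Instant expiry = issued.plus(Duration.ofHours(1));

        // Two fixed clocks: one before and one after the expiry instant.
        Clock beforeExpiry = Clock.fixed(issued, ZoneOffset.UTC);
        Clock afterExpiry = Clock.fixed(issued.plus(Duration.ofHours(2)), ZoneOffset.UTC);

        System.out.println(new SessionToken(expiry, beforeExpiry).isExpired()); // false
        System.out.println(new SessionToken(expiry, afterExpiry).isExpired());  // true
    }
}
```

Production code would pass Clock.systemUTC(); the class under test does not need to know the difference.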

Honestly, I find it amazing that such a simple practice can lead to so many benefits. There are many practices out there that improve software development in some way. What makes unit testing special is the ridiculous asymmetry between its simplicity and its outcome.

Package Visibility is Broken

In Java, classes and class members have by default package visibility. To restrict or increase the visibility of classes and class members, the access modifiers private, protected, and public must be used.

Modifier      Class   Package   Subclass   World
public        Y       Y         Y          Y
protected     Y       Y         Y          N
no modifier   Y       Y         N          N
private       Y       N         N          N

(from Controlling Access to Members)

These modifiers control encapsulation along two dimensions: one dimension is the packaging dimension, the other is the subclassing dimension. With these modifiers, it becomes possible to encapsulate code in flexible ways. Sadly, the two dimensions interfere in nasty ways.

Shadowing

A subclass might not see all methods of its superclass, and can thus redeclare a method with an existing name. This is called shadowing or name masking. For instance, a class and its subclass can both declare a private method foo() without any overriding taking place. This situation is confusing and best avoided.
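The private case can be reproduced in a single file. The sketch below uses hypothetical classes Base and Derived; both declare a private foo(), and each call site binds to the foo() of the class in which it is written.

```java
class Base {
    private int foo() { return 1; }

    // Private methods are not virtual: this call always binds to Base.foo().
    int callFoo() { return foo(); }
}

class Derived extends Base {
    // A new, unrelated method: Base.foo() is invisible here, so nothing is overridden.
    private int foo() { return 2; }

    int callOwnFoo() { return foo(); }
}

final class ShadowingDemo {
    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.callFoo());    // 1: the inherited method still uses Base.foo()
        System.out.println(d.callOwnFoo()); // 2: the method written in Derived uses Derived.foo()
    }
}
```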

With package visibility, the situation gets worse. Let us consider the snippet below:

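// File a/A.java
package a;

public class A {
    int say() { return 1; }   // package-private
}

// File b/B.java
package b;

public class B extends a.A {
    int say() { return 2; }   // package-private in another package: does not override A.say()
}

// File a/Test.java
package a;

class Test {
    public static void main(String[] args) {
        a.A a = new b.B();
        System.out.println(a.say()); // prints 1, WTF!!
    }
}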

 (from A thousand years of productivity: the JRebel Story)

The second method B.say() does not override A.say() but shadows it. Consequently, the static type at the call site defines which method will be invoked.

One could argue that everything works as intended, and that it is clear that B.say() does not override A.say() since there is no @Override annotation.

This argument makes sense when private methods are shadowed. In that case, the developer knows about the implementation of the class and can figure this out. For methods with package visibility, the argument is not acceptable since developers shouldn’t have to rely on implementation details of a class, only its visible interface.

The static types in a program should not influence the run-time semantics. The program should work the same whether the variable “a” has static type “A” or “B”.

Reflection

With reflection, programmers have the ability to inspect and invoke methods in unanticipated ways. Reflection should honor the visibility rules and authorize only legitimate actions. Unfortunately, it’s hard to define what is legitimate or not. Let us consider the snippet below:

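// Package-private class; @MyAnnotation is assumed to have runtime retention.
class Super {
    @MyAnnotation
    public void methodOfSuper() {
    }
}

public class Sub extends Super {
}

// From another package:
Method m = Sub.class.getMethod("methodOfSuper");
m.getAnnotations(); // WTF, empty array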

Clearly, the method methodOfSuper is publicly exposed by instances of the class Sub. It’s legitimate to be able to reflect upon it from another package. The class Super is however not publicly visible, and its annotations are thus ignored by the reflection machinery.

Package visibility is broken

Package visibility is a form of visibility between private and protected: some classes have access to the member, but not all (only those in the same package). This visibility sounds appealing to bundle code in small packages, exposing the package API using the public access modifier, and letting classes within the package freely access each other. Unfortunately, as the examples above have shown, this strategy breaks in certain cases.

Accessibility in Java is in a way too flexible. The combination of the four modifiers with the possibility to inherit and “widen” the visibility of classes and class members can lead to obscure behaviors.

Simpler forms of accessibility should then be preferred. Smalltalk supports for instance inheritance, but without access modifiers; methods are always public and fields are always protected. Go, on the other hand, embraces package visibility, but got rid of inheritance. Simple solutions are easier to get right.

NOTES:

  • In “Moderne Software-Architektur: Umsichtig planen, robust bauen mit Quasar” the author argues that method-level visibility makes no sense. Instead, components consist of classes, which are either exposed to the outside (the component interface) or belong to the component’s internals and are hidden (the component implementation). This goes in the direction of OSGi and the future Java module system.

Lines Spent

Studies have shown that the productivity of a developer is about 10 LOC/day. Considering that modern software consists of millions of lines of code, this number is appalling.

Measuring productivity with LOC is of course a dangerous thing to do. With programming, quality does not correlate with quantity, and it is wise to remember the words of Dijkstra:

If we wish to count lines of code, we should not regard them as “lines produced” but as “lines spent”. — Dijkstra

Indeed, programming is not a production process but a design process. Before a piece of code reaches maturity, various directions might first need to be explored, refined, and analyzed. Qualities like readability, performance, or reliability are competing dimensions in the design space. There is rarely “one way” to program something. Effective programming is about finding the best possible tradeoff in the shortest timeframe.

Writing high-quality code also requires a high level of rigor. Code must obey the established naming conventions, idioms and patterns of the system; it must be systematically refactored to prevent technical debt from accumulating; and it must be exhaustively tested. This level of discipline pays off in the long term, but requires more time in the short term.

Also, as a system grows, it becomes progressively harder to maintain an accurate mental model of the system. Considerable time is thus spent assessing the existing system to temporarily reconstruct enough knowledge for the task at hand. During this time, no code is written.

One could argue that counting total lines of code is useless; what is relevant is the number of lines of code modified per commit. Lots of modifications but no additions would suggest a modular system where features can be adapted declaratively without “writing new code”. Unfortunately, metrics tools do not follow this view and treat modifications as second-class citizens.

So, is a low number of LOC/day good or bad? For an optimist, it might be a sign of code quality; the team designs carefully and takes care to write the minimum amount of code needed. For a pessimist, it might indicate that the project is stalled; you’re not delivering enough code to make the deadline. For a realist, one thing is sure: software engineering is a very expensive activity.

More

Software As Liability

Masterminds of Programming

Masterminds of Programming features exclusive interviews with the creators of popular programming languages. Across more than 400 pages, the book collects the views of these inventors on varying topics such as language design, backward compatibility, software complexity, developer productivity, or innovation.

Interestingly, there isn’t so much about language design in the book. The creation of a language seems to happen out of necessity, and the design itself is mostly the realization of an intuitive vision based on gut feelings and bold opinions. The authors’ judgments about trade-offs (e.g. static vs dynamic typing, or security vs performance) are surprisingly unbalanced, and when they are asked to explain the rationale for some design choices, explanations tend to be rather scarce.

Instead, the authors describe with passion the influences that led them to a particular design. The book thus contains a good deal of historical information about the context in which each language was born.

  • C++ was invented to enable system programming with objects
  • Awk was invented to easily process data in a UNIX fashion
  • Basic was invented to teach students programming
  • Lua was invented to easily script components
  • Haskell was invented to unify the functional programming language community
  • SQL was invented to query relational databases with an approachable language
  • Objective-C was invented to bring objects to the C world
  • Java was invented to provide a secure language in a networked world
  • C# was invented as the strategic language for the modern Microsoft platform .NET
  • UML was invented as the unification of modeling languages
  • PostScript was invented to enable flexible typesetting and printing
  • Eiffel was invented to make objects robust with contracts

Both the interviewers and interviewees are knowledgeable and articulate. The inventors smoothly distill their experience and insights during semi-structured interviews. Throughout the book, discussions remain mostly general, which is both a plus and a minus: the material is accessible to all, but multiple sections have a low information density. The book could easily be shortened with better editing.

The discussions about software engineering in general turned out to be the ones I enjoyed most. Some of the interesting ideas touched on in the book were for instance:

  • Simulating projects helps acquire experience faster, p.254
  • Classes are units of progress in a system, p.255
  • We need an economic model of software, p.266
  • Object-oriented programming and immutability are compatible, p.315
  • What UML is good for: useful for data modelling, moderately useful for system decomposition, not so useful for dynamic things, p.342
  • Generating code from UML is a terrible idea, p.339
  • There’s no software crisis; it’s overplayed for shock value, p.354
  • How broken HTML is, and how much better it would have been if the web had started with a typesetting language like PostScript, p.405

These points come from the later interviews, but there are similarly nice bits and pieces in all chapters; it just turned out that I started taking notes only halfway through the book.

Amongst the recurring themes, the notion of simplicity pops out and is discussed multiple times, at the language level and at the software level. Several interviewees quote Einstein’s “Simple as possible, but not simpler”, and emphasize the concepts of minimalism and purity, each in their own way.

The book is also very good at instilling curiosity about unknown languages. I was initially tempted to skip chapters about languages I didn’t know, and am glad that I didn’t. Stack-based languages like Forth and PostScript appear as examples of a powerful but overlooked paradigm; the chapter about Awk almost reconciled me with bash scripting; and the discussion about UML made me reconsider its success: the fact that the whole industry agreed on a common notation for basic language constructs shouldn’t be taken for granted.

In conclusion, this book isn’t essential, but it is enjoyable if you are an all-rounder with some time on your hands who appreciates thinking aloud and good discussions over a cup of coffee.

Your Language is a Start-up

Watching the TIOBE index of programming language popularity is depressing. PHP and JavaScript rule the web, despite the consensus that they are horrible; Haskell and Smalltalk are relegated to academic prototyping, but unanimously praised for their conceptual purity. How technology adoption happens is a puzzling question.

Evidence seems to suggest that what matters is to attract a set of initial users, and then expand. The initial offer needn’t be particularly compelling. As long as it wins on one dimension (maybe because it’s embarrassingly simple, or provides a very effective solution to a very specific problem), it might attract early adopters. PHP won because of its simplicity to get things done; JavaScript won because it was the first to provide a solution to make HTML dynamic. After initial success in a niche, the technology can evolve to attract more users. PHP and JavaScript evolved later to fix their initial design flaws. They are both now mature object-oriented languages.

The price for fixing initial flaws is however extraordinarily high. Once a language feature is designed and made available, it’s cast in stone. Evolving a language while maintaining backward compatibility is extremely challenging, but breaking compatibility and dealing with multiple branches isn’t an easy solution either. Notorious examples of evolutions in Java are the Java Memory Model and generics. It took years of research to plan them, and years of availability to reeducate the community. C++ is still trying to catch up, and still lacks features that we take for granted on some platforms, e.g. a standardized serialization.

Surprisingly, when adoption happens, it might be from a different audience than the one expected. “Languages designed for limited or local use can win a broad clientele, sometimes in environments and for applications that their designers never dreamed of,” say the authors of Masterminds of Programming. This is definitely true. Java was initially designed for embedded systems, but succeeded instead in the enterprise. JavaScript was thought of as a thin veneer for web pages, but now powers the new generation of client-side web apps and is even expanding to the server side.

When a technology starts to decline after the adoption peak, don’t be too quick to claim it dead. It might enjoy an unexpected renaissance. Many have for instance claimed that Java was dead. They have failed however to recognize that the underlying JVM is a rocking beast, amazingly fast and versatile for those who know how to tame it. Nowadays, one of the best strategies to implement new languages is to leverage the JVM and provide interoperability with Java libraries. In turn, this massive adoption of the JVM by new language implementors is driving innovation in the JVM itself, which has been extended with new bytecode for languages other than Java. I doubt that James Gosling anticipated this evolution of the platform.

Clearly, the key characteristics of a language are its syntax and semantics, since together they define its expressive power. Expressivity isn’t however the only force at play for adoption. What a real-world check suggests is that expressivity is only one factor amongst many other technical and social factors. The ease of debugging or the existence of a friendly community could for instance turn out to be more important for some users than the ease of writing code. Language designers typically underestimate such factors, severely impeding their chances of success.

To foster adoption, one must also reckon that people are reluctant to change. What people are already familiar with must be taken into consideration: in 2008, mobile users wouldn’t have been ready for the minimalistic iOS 7 interface. They were however ready for the original skeuomorphic interface, and now that they have become familiar with it, they can get rid of the skeuomorphic ornaments. People don’t change for the sake of changing, they change to solve a problem, and they change only if the gain outweighs the pain. For programming languages, the problem is productivity and the pain is learning a new platform. In 2013, developers might not be familiar enough with functional programming to adopt a pure functional language like Haskell, but they definitely are ready to adopt a hybrid language like Scala.

Together, these elements might help explain the failure of some great languages, for instance Lisp. Lisp is a beautiful programming language that offers amazing flexibility. For a skilled practitioner, Lisp is a secret weapon. However, Lisp does nothing particularly well out of the box. “Lisp isn’t a language, it’s a building material,” as Alan Kay put it. Clojure, on the other hand, is a Lisp dialect with just enough direction to solve one very painful problem: writing concurrent code. Given that the problem is so painful, people won’t mind a few parentheses to solve it. This choice paid off, and in 2012 Clojure moved into the “adopt” ring of ThoughtWorks’ technology radar.

The language business is a competitive business where idealism won’t prevail. For a language to be adopted, it must solve a problem for some early adopters, who will then create an attractive ecosystem that will convince the late majority. In other words: language designers should think of their language like a start-up.

Thinking in Lifecycle

After having worked hard on a design, I once went to a colleague to ask his opinion about it. “Do you think I’ve decomposed the problem in the right way?” I said. He looked at it and answered “You are mixing elements with different lifecycles”. I came back to my desk confused and worried.

What I understood later is that every element in the system always has a lifecycle, be it objects in memory, data in the database, business logic in the code, build artifacts, design documents, or configuration properties. Too many times, the lifecycles are poorly understood or poorly managed. Elements with different lifecycles need to be able to change independently of each other. Your design should reflect these lifecycles.

Thinking in terms of lifecycle is a very powerful design tool. Surprisingly, it isn’t particularly emphasized in the literature about software design and architecture. The closest would be the concepts of coupling and cohesion, which capture how changes ripple through the system. The single responsibility principle is also close, when formulated as “A class should have only one reason to change”.

I’m happy my colleague made me aware of this particular way of thinking and that I could add this design tool to my intellectual toolbox. It has proved very valuable since then.

Debunking Object-Orientation

What is an Object? What is the essence of the object paradigm? How do objects differ from other abstractions? What are their benefits? What are their pitfalls? Can we encode objects with lower building blocks? Should we have objects all the way down?

Some people think they are clever to observe that OOP has no formal, universal definition. Democracy, love and intelligence don’t either. (Tweet from Alain de Botton)

Some thoughts:

And for the haters: