Expert Knowledge isn’t Always Better

Technology people come in two flavours: generalists and experts. I’m definitely a generalist. I have a good grasp of the core concepts behind the technologies we use, but I lack expertise in the details of using them. For that, I rely on experts.

This lack of detailed expertise may even turn out to be an advantage in my position. Technologies have core concepts that support some primary use case. With enough knowledge of the technical details, it’s possible to bend the technology to support other use cases. But that’s almost never a good idea in the long term. The implementation will be fragile to subtle changes in the technology’s details, and few people in the organization will have enough expertise to maintain it. If your use case isn’t easily supported, rethink your problem at some other level. Not knowing too much about the technology means I limit its use to what’s maintainable.

The latest example of this kind that I encountered was about container dependencies on OpenShift. In our current system (not OpenShift-based), we start the containers using “run levels”. Such a concept doesn’t exist in OpenShift. You could recreate something similar using init containers and some deep technical knowledge, but it wouldn’t be trivial. Rather than misusing the technology, we will have to solve our problem at another level: the containers should be resilient to a random startup order at the application level.
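Application-level resilience of this kind usually boils down to retrying the connection to a dependency until it becomes available, instead of assuming a fixed startup order. A minimal sketch (the function name and parameters are invented for illustration):

```python
import time

def connect_with_retry(connect, max_attempts=10, base_delay=1.0):
    """Call `connect` until it succeeds, backing off between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Exponential backoff, capped at 30 seconds.
            time.sleep(min(base_delay * 2 ** (attempt - 1), 30))
```

With a startup loop like this in every container, the order in which the platform schedules them stops mattering.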

Other examples from the past include details of object-relational mapping or distributed caching. Here too I believe that being “too smart” about using the technology isn’t good in the long term. It’s better to stick to the common use cases and change the design at the application level if necessary.

Sometimes some part of the design may leverage deep technological details. I’m not completely forbidding it. Used in strategic places and encapsulated in a proper technology component, this is a viable strategy. We have for instance engineered a library for leader election based on the Solace messaging middleware that relies on deep expertise in the semantics of Solace queues. This was a strategic decision, and the library is maintained by a team of technology experts with the proper know-how and support from Solace itself. But such situations should be exceptions rather than the norm.

It’s hard to resist the appeal of a clever technology-based solution, but as engineers we absolutely should. When we misuse technologies, we paint ourselves into a corner.

Silly Product Ideas that Win

When Twitter appeared more than a decade ago, I thought it was silly. I saw little value in a service that only allowed sharing 140-character text messages. I registered on a bunch of social media platforms and nevertheless created a Twitter account. Some years later, the only social media platform I’m actively using is… Twitter.

There’s a lesson here for me: it’s hard to predict what will succeed. A lot of products appear silly or superficial at first. They may appear so in the current time frame, but this can change. Initially, Twitter was full of people microblogging their lives. It was boring. But it morphed into a platform that is useful for following the news.

A startup like Mighty can look silly now – why would you stream your browser from a powerful computer in the cloud? But as applications are ported to the web, maybe the boundary between thin client and server will move again.

We prefer to endorse projects that appear profound and ethical, like supporting green energy or reducing poverty. Product ideas that are silly or superficial don’t match these criteria, and it’s easy to dismiss them. But innovation often happens because of such products. No matter how silly or superficial you think they are, if they gain traction, they need to solve their problem well at scale. These products are incubators for technologies that can be used in other contexts. Twitter, for instance, open sourced several components. If Mighty gains traction, it might lead to new protocols for low-latency interactive streaming interfaces. An obvious candidate for such a technology could be set-top TV boxes.

These products might appear superficial at first and might lack the “credibility” of other domains, but here too, the first impression might be misleading. A platform like Twitter can support free speech and democracy (sure, there are problems with the platform, but it at least showed there are other ways to have public discourse). A product like Mighty might in turn make owning a computer more affordable for poor people, since it minimizes hardware requirements. Just because these products don’t have a “noble” goal initially attached to them doesn’t mean they won’t serve noble causes in the long term.

There are of course silly ideas that are simply silly and will fail. But the difference between products that are superficially silly and truly silly is not obvious. In this text I took the examples of Twitter and Mighty. In retrospect, the case for Twitter is clear. For Mighty, I still don’t know. The idea puzzles me because it’s right at the boundary.


What’s My Exposure to Data Lock-out?

My computer died a few days ago. Fortunately, I had a backup and could restore my data without problems on another laptop. Still, I’ve been wondering in the meantime: what if the restore hadn’t worked? How easily could I be locked out of my data?

I have data online and data offline. My online data is mostly stored by Google. Say my account is compromised and, due to the hacker’s misbehavior, gets disabled. Would I ever be able to recover my online data? Not sure.

My offline data is stored on my hard drive, which I regularly back up with Time Machine. If ransomware encrypts all my data, the backup shouldn’t be affected. Unless the ransomware encrypts slowly over months, without me noticing, and suddenly activates the lockout. Am I sure ransomware doesn’t work like this? Not sure.

My laptop suffered a hardware failure. It hung during booting, and no safe boot mode made it through. The “target disk” mode still seemed to work, though. It would have been very bad luck not to be able to access either the data on the hard disk or the backup. Both would have to fail simultaneously. But can we rule out this possibility? Not sure.

Hard disks and backups can be encrypted with passwords. I don’t use this option because I believe it could make things harder if I have to recover the data. I could, for instance, simply have forgotten my password. Or some part could be corrupted. Without encryption, I guess the bad segment can be skipped; with encryption, I don’t know. Granted, these are speculative considerations. But are they completely irrational? Not sure.

Connecting my old backup to the new computer turned out to be more complicated than I thought. It involved two adapters: one FireWire-to-Thunderbolt 2 adapter and one Thunderbolt 2-to-Thunderbolt 4 adapter. Protocols and hardware evolve. With even older technology, could it have turned out to be impossible to connect it to the new world? Not sure.

The probability of any of these scenarios happening is small. It would be very bad luck, and in some cases it would require multiple things to go wrong at once. But the impact would be very big: 20 years of memories, not lost, but inaccessible. There’s no need to be paranoid, but it’s worth reflecting on the risks and reducing the exposure.


The Inevitability of Superintelligence

If we assume that the brain is a kind of computer, artificial intelligence is the process of reproducing its functioning. Based on this hypothesis, it’s easy to dismiss the possibility of above-human intelligence by arguing that we can only program what we understand, which would mean the intelligence in the machine is bounded by our own. But it’s also very easy to refute this limitation by arguing that we encode learning processes in the machine. These learning processes would work at a scale and speed that we can’t match. The machine will beat us.

This latter argument definitely seems to hold if we look at recent achievements in deep learning. Computers achieve some tasks that very much resemble some form of intelligence. Looking more carefully, it’s however questionable whether we should speak of intelligence or simply knowledge. Techniques like deep learning enable computers to learn facts from large amounts of data. These facts might be very sophisticated, ranging from recoloring images correctly to impersonating the artistic style of a painter. But the computer isn’t intelligent, because no reasoning really happens.

This actually leads to an interesting question about intelligence. How much of intelligence is simply about predicting things based on experience? If an object falls, you predict its future position to catch it, based on other experiences with falling objects. If someone asks you “what’s up?”, you can predict that they expect to learn about what’s going on. With GPT-3, which works according to this principle, you can almost have a conversation. I say almost, because we also see the limits of the approach. There are some classes of questions that don’t work, like basic arithmetic.

Current artificial intelligence is able to learn, either by analysing large quantities of data (deep learning) or simulating an environment and learning what works and what doesn’t (reinforcement learning). But we’re still far from sentient, thinking machines.

If we assume that our brain is some kind of computer performing a computation, there’s however nothing that prevents us from replicating it. With this line of thought, it’s only a matter of time until we “crack” the nature of intelligence and find the right way to express this computation. When this breakthrough will happen is unknown – maybe in a decade, maybe much later – but nothing makes it impossible. With sufficient perseverance, this breakthrough is inevitable.

Speaking in terms of computation and data, a system can become smarter in two ways. The first one is what we have now: systems that learn over time through the accumulation of data. The computation remains the same, though. A deep learning network is programmed once (by humans!) and then trained on large quantities of data to adjust its parameters. But maybe a second class of systems exists: systems that self-improve by changing their computation. Systems able to inspect and change themselves do exist and are called reflective systems. In such a system, data can be turned into computation and computation into data. The system can thus modify itself.
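The blurring of data and computation can be shown in a few lines of Python, a language with some reflective capabilities. This is only a toy sketch of the idea, not a self-improving system:

```python
# Data -> computation: a string is compiled into a function at runtime.
source = "def step(x):\n    return x * 2\n"
namespace = {}
exec(source, namespace)
assert namespace["step"](21) == 42

# Computation -> data: the program treats its own source as data,
# rewrites it, and recompiles -- a (very) limited form of self-modification.
new_source = source.replace("* 2", "* 3")
exec(new_source, namespace)
assert namespace["step"](10) == 30
```

A real reflective system would of course need to decide *how* to rewrite itself, which is exactly the open problem.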

Some people believe that with artificial intelligence, we risk being outsmarted by an “explosion” of intelligence. Systems of the first class learn within the bounds of the computation that defines them – however complex this computation is. The possibility of an explosion is limited. With systems of the second class, we’re free to speculate, including about the possibility of an explosion of intelligence. Such a system could outsmart us and lead to superintelligence.

If we assume that our brain is a computation: is it self-improving or not? Children acquire novel cognitive capabilities over time, which at least gives the illusion of self-improvement. But maybe these learnings are only very complex forms of data accumulation. Also, the boundary between reflective and non-reflective systems is not black and white. A fully reflective system can change any aspect of its computation, whereas a non-reflective system processes input data according to fixed rules that never change. A system that is able to infer and define some rules for itself would fall between both categories: the rules can change, but only to an extent limited to some aspects of the computation. The adaptive nature of neural networks could, in some way, be seen as a limited form of rule changing: the rules are fixed, but the “weights” given to them change over time due to feedback loops.

Learning requires data provided by an environment. We’re able to learn only because we interact with the world and other people. If we were to replicate the computation in our brain and the learning process that takes place, we would also need to simulate the environment. The computational complexity of all this is probably enormous. Maybe we can replicate the computation in our brain, but not the environment, or only limited forms of it. In which case, it’s hard to tell what kind of intelligence could be achieved.

Depending on the computation and environment that we simulate, the intelligence won’t resemble human intelligence much. The AlphaGo algorithm learns in an environment that consists only of the rules of Go. We cannot even imagine what this world would be like. Assuming that artificial intelligence is human-like misjudges the nature of human intelligence. Intelligence is not a single quantity that we can measure against clear criteria. Intelligence has many facets and is contextual.

For some facets, like arithmetic, machines are for sure already superintelligent.


Mastering Technology

Things move fast in the IT industry. Half of the technologies I use today didn’t exist ten years ago. There’s a constant push to switch to new technologies.  But when it comes to technologies, I’m kind of conservative. I like proven technologies.

What makes me conservative is that mastering a technology takes a lot longer than we think.

Most advocates of new technologies massively underestimate the learning curve. Sure, you get something working quickly. But truly understanding a new technology takes years.

Take object-oriented programming. On the surface it’s easy to grasp quickly and become productive. But it took the industry something like 20 years to figure out that inheritance isn’t such a great idea. The result is that early object-oriented systems overused inheritance hoping it would favor reuse, whereas it just made maintenance hard.
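A classic textbook illustration of the problem, sketched in Python (not taken from any specific system):

```python
# Inheritance-for-reuse, the style early object-oriented systems favored:
class Stack(list):                  # a Stack "is a" list? Not really.
    def push(self, item):
        self.append(item)

s = Stack()
s.push(1)
s.insert(0, 99)      # the whole list API leaks through; LIFO is broken

# Composition keeps the contract tight:
class SafeStack:
    def __init__(self):
        self._items = []            # "has a" list, exposes only stack operations
    def push(self, item):
        self._items.append(item)
    def pop(self):
        return self._items.pop()

t = SafeStack()
t.push(1)
t.push(2)
assert t.pop() == 2
```

The first version reuses code for free but inherits a contract it cannot honor; the second took the industry years to prefer.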

The same holds, for instance, for the idea of distributed objects. It’s easy to grasp and appealing. Yet it took the industry decades to realize that abstracting away remote boundaries is a flawed idea. We instead need to explicitly embrace and expose asynchronous APIs.

Another of my favorite easy-to-grasp-but-hard-to-master technologies is object-relational mappers (e.g. Hibernate). Ten years of experience and I’m still struggling with them as soon as the mapping isn’t entirely trivial.

Want another example? XA transactions. Updating a database row and sending a message seems to be the poster child for XA transactions. Well, it turns out that this simple scenario is already problematic. When it didn’t work, I learned that I was experiencing the classic “XA 2-PC race condition”.

There are lots of new technologies being developed right now, like distributed log systems, container schedulers, and web frameworks. I perfectly understand why they are being developed and what problems they supposedly solve.

But don’t try to convince me that they are silver bullets. Every technology choice is a trade-off, because you can never fully abstract the underlying complexity away. There’s a price somewhere. Things will break in unexpected ways and nobody will know why. Performance will be hard to figure out. Subtle misuse of the technology will only be detected later and be hard to correct. It will take time to figure these things out.

In the end it’s all about risk management. If the technology might provide a strategic advantage, we can talk. The investment might be worth it. But if it’s not even strategic, I would seriously challenge whether the risk of using new technologies is worth it.


How Technology Evolves

    We often take for granted the technology we have and forget that it’s the result of a tedious evolutionary process.

    A Railroad Track is the Width of Two Horses is one of the first stories about the evolution of technology that I remember reading, maybe ten years ago. It rings more like a colorful story than a true historical account, but it nevertheless left an impression on me.

    Later, doing research gave me a better appreciation of how ideas evolve, cross-pollinate, and morph over time. True insights are rare. It’s mostly about tweaking existing ideas until the right form that works is found.

    Here are some of the most engaging stories about technology history that I’ve read:

    Oh boy, innovation is such a messy process.

    Platforms and Innovation

    I started my career writing Flash applications. Then I moved to Java. Both are middleware technologies that abstract the underlying operating system and enable cross-platform interoperability. I’ve actually never written a professional application that relied directly on a specific operating system.

    This was fine by me. “Write once, run everywhere” was great for productivity.

    For the kind of applications I was developing, what these middleware stacks provided was enough. Maybe I occasionally wished that drag and drop between the application and its host system were better supported, but that’s more or less it. I didn’t really miss a deeper integration with the rest of the system.

    These technologies were also innovative on their own. Flash enabled developers to create rich web applications back in a time when web sites were mostly static. The same was true of Java and its applets, even if the technology never really took off.

    But middleware technologies also slow down innovation.

    An operating system provider wants developers to adopt its new functionalities as quickly as possible, to innovate and make the platform attractive. Middleware technologies make such adoption harder and slower.

    The official Apple memo “Thoughts on Flash” about not supporting Flash on iOS makes it very clear:

    We know from painful experience that letting a third party layer of software come between the platform and the developer ultimately results in sub-standard apps and hinders the enhancement and progress of the platform.

    The informal post “What really happened with Vista” gives similar arguments against middleware stacks:

    Applications built on [cross-platform] middleware tend to target “lowest common denominator” functionality and are slower to take advantage of new OS capabilities.

    For desktop applications, a good integration with the operating system was a plus, but not a killer. The drag and drop functionality I occasionally missed didn’t impact the whole user experience.

    With mobile devices, everything is different.

    Mobile applications are more focused and need to integrate seamlessly with the device–in terms of user experience, but also connectivity and power consumption. That’s what “Thoughts on Flash” was about.

    Think of notifications. Notifications for desktop applications are nice, but not a killer. For a mobile application, how it integrates with notifications makes the difference between success and failure. Notifications are becoming the heart of the smartphone experience. You don’t want them to suck.

    Or think of ARKit, Apple’s upcoming augmented reality toolkit. Augmented reality hasn’t really hit the mass market yet, and there is lots of potential there. If nothing else, it will make our good old-fashioned ruler obsolete for measuring distances. But such a toolkit relies on specific hardware (sensors, CPU, camera). You don’t want middleware there to slow down adoption.

    Platforms diverge and sometimes converge. They diverge when exclusive capabilities are added and converge when a cross-platform standard is adopted.

    With HTML5 we have a good standard for regular applications with desktop-like features. The GMail mobile web application, for instance, is so well done that I prefer it to the native iOS version. But you can only go so far with HTML5. If you want to push the envelope, you need to go native and use the full power of the platform.

    For applications in the broader context of digitalization (social media, artificial intelligence, internet of things), innovation at the platform level will be decisive.

    The platform war will intensify.


    10 Tips to Fail with Enterprise Integration

    If you want to make enterprise integration needlessly complicated, follow these tips.

    1. Model poorly

    A poor model is always a nice way to make things more complicated than they should be.

    Examples: You can name things badly. You can model everything as strings (keys, lists, etc.). Or you can reuse overly generic abstractions in multiple contexts instead of defining one abstraction per context. Or you can expose a relational model instead of an entity model.
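The everything-as-strings tip can be made concrete with a small Python sketch (a hypothetical trade order; all names invented for illustration):

```python
from dataclasses import dataclass
from typing import List

# Tip 1 in action: everything is a string, so nothing is checked.
stringly_order = {
    "id": "42",
    "quantity": "100",           # a number as a string
    "side": "BUYY",              # the typo goes unnoticed until production
    "tags": "urgent,retail",     # a list flattened into one string
}

# The alternative: a typed model that rejects bad data at the boundary.
@dataclass
class Order:
    id: int
    quantity: int
    side: str                    # ideally an enum of "BUY" / "SELL"
    tags: List[str]

    def __post_init__(self):
        if self.side not in ("BUY", "SELL"):
            raise ValueError(f"invalid side: {self.side}")

order = Order(id=42, quantity=100, side="BUY", tags=["urgent", "retail"])
```

The stringly version happily carries the typo all the way to the consumer; the typed version fails loudly at construction time.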

    2. Use immature technologies

    Whenever possible, use immature, non-standard, or inappropriate technologies to make the integration complicated.

    Example: Don’t use XML, use JSON. Its support in IDEs is still weak, its semantics for the various numeric types are poor, it prevents proper code generation (for class-based languages), and JSON Schema is still a draft.
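The numeric-semantics point can be demonstrated in a few lines of Python: JSON has a single “number” type, so the distinction between integers, floats, and decimals is erased in transit, which is painful for money-like values.

```python
import json
from decimal import Decimal

# JSON has one "number" type, so type information gets lost in transit:
assert json.loads('{"price": 10}')["price"] == 10              # parsed as int...
assert type(json.loads('{"price": 10.0}')["price"]) is float   # ...or float

# Binary floats silently lose precision for money-like values:
amount = json.loads('{"amount": 0.1}')["amount"]
assert amount + amount + amount != 0.3          # classic float surprise

# A common workaround in Python: parse numbers as Decimal instead.
exact = json.loads('{"amount": 0.1}', parse_float=Decimal)
assert exact["amount"] * 3 == Decimal("0.3")
```

Whether your consumer applies such a workaround is, of course, outside your control.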

    3. Assume the network is perfect

    Assume the network is perfect: it has infinite bandwidth and zero latency. This is a classic recipe for disaster. Completely ignore the reality of networking. If your interface is sound at the logical level, then it will be fine in production.

    Examples: Don’t distinguish between the time of the event you model and the technical time when the message was sent or received–it doesn’t matter, since latency is zero. Or send replies to individual requests on a topic and leave the burden of filtering out the irrelevant replies to the subscriber at the application level–it doesn’t matter, since bandwidth is infinite.
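To make the first example concrete, here is a message envelope that carries the business event time separately from the technical send time (a hypothetical shape, invented for illustration):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# The event time (when the thing happened) is carried separately from the
# transport time (when the message was published). Conflating the two only
# works if latency is zero -- which it never is.
@dataclass
class Envelope:
    event_time: datetime   # business time: when the state change occurred
    sent_time: datetime    # technical time: when the producer published it
    payload: dict

msg = Envelope(
    event_time=datetime(2021, 6, 1, 12, 0, 0, tzinfo=timezone.utc),
    sent_time=datetime(2021, 6, 1, 12, 0, 7, tzinfo=timezone.utc),  # 7s later
    payload={"status": "SETTLED"},
)
latency = (msg.sent_time - msg.event_time).total_seconds()
assert latency == 7.0   # the network is not perfect
```

A consumer that ordered events by arrival time instead of event time would reconstruct the wrong history.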

    4. Make loads and updates asymmetric

    It is common for an interface to publish updates on topics but also provide a means for the consumer to load data upon startup. In such cases, the system should work so that the same data is delivered to the consumer for loads and updates. To introduce subtle data inconsistencies, make it so that loads and updates don’t deliver the same data.

    Example: If an entity has multiple statuses, do not publish all status changes as updates. This way, there is a discrepancy between the data obtained via load requests and via updates.

    5. Make the system as stateful as possible

    If you find a way to complicate state management, go for it.

    Examples: Instead of publishing entities that are consistent, publish only the delta with what has changed. The consumer must carefully ensure that all deltas are applied in order. Or define requests that reference other requests, e.g. to implement paging. The provider will need to do some bookkeeping of the previous requests.
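The delta-publishing example can be sketched in a few lines of Python (invented message shapes, for illustration only): the consumer ends up with the correct state only if every delta arrives in exactly the right order.

```python
# Tip 5 in action: publishing deltas instead of consistent entities
# forces every consumer to replay them in exactly the right order.
state = {"status": "NEW", "quantity": 100}
deltas = [
    {"seq": 1, "status": "CONFIRMED"},
    {"seq": 2, "quantity": 80},
    {"seq": 3, "status": "SETTLED"},
]

def apply(state, delta):
    updated = dict(state)
    updated.update({k: v for k, v in delta.items() if k != "seq"})
    return updated

# In order, the consumer ends up with the correct state...
s = state
for d in deltas:
    s = apply(s, d)
assert s == {"status": "SETTLED", "quantity": 80}

# ...but if delta 3 overtakes delta 1 on the wire, the final state is wrong
# and nothing flags the corruption.
s = state
for d in [deltas[2], deltas[1], deltas[0]]:
    s = apply(s, d)
assert s == {"status": "CONFIRMED", "quantity": 80}   # stale status, silently
```

Publishing the full, consistent entity instead makes each message self-contained and ordering irrelevant.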

    6. Leave the protocol vague

    By defining the transport technology, the encoding, and the various messages that can go through your interface, most readers of the specification will have a good understanding of what the purpose of the interface is. So stop there. Don’t bother explaining the exact protocol, with the assumptions about the order of messages or when a given message can be sent or not. This way, you leave the door open to non-obvious misunderstandings.

    Example: Don’t specify which requests can be used anytime and which should be used only occasionally, after a restart or recovery.

    7. Don’t properly version your interface

    Your interface will need to change. Don’t provide proper versioning. This way, supporting multiple versions will be a pain.

    Example: Use XML namespaces, but don’t use them for versioning.

    8. Redefine the semantics of data between versions

    Make subtle changes to the meaning of the data, so that the semantics change in a non-obvious way.

    Example: Redefine what “null” means for a certain attribute.

    9. Don’t distinguish between endpoint and tenant

    Your interface will be accessible through an endpoint that will probably be used by multiple consumer systems (“tenants”). Define SLAs per endpoint, but not per tenant. This way, you will need to deploy multiple endpoints to really guarantee SLAs for specific consumers.

    Example: Provide a limit for the frequency of load requests at the endpoint level, independent of the consumer systems. If one consumer misbehaves, it will prevent all other consumers from loading data.
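The difference between an endpoint-level and a tenant-level limit can be sketched with a simplified counting limiter (invented for illustration, not a production rate limiter):

```python
from collections import defaultdict

# Keeping a request budget per consumer isolates tenants from each other;
# a single budget for the whole endpoint would punish everyone for one
# tenant's misbehavior.
class PerTenantLimiter:
    def __init__(self, max_requests_per_window):
        self.max = max_requests_per_window
        self.counts = defaultdict(int)

    def allow(self, tenant):
        if self.counts[tenant] >= self.max:
            return False
        self.counts[tenant] += 1
        return True

limiter = PerTenantLimiter(max_requests_per_window=3)
for _ in range(10):
    limiter.allow("noisy-consumer")       # one tenant hammers the endpoint

assert limiter.allow("noisy-consumer") is False   # the offender is throttled
assert limiter.allow("well-behaved") is True      # others are unaffected
```

A real deployment would also expire counts per time window, but the isolation principle is the same.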

    10. Ignore monitoring needs

    Do not provide any meaningful way for the consumer to check whether the provider is healthy. Either the consumer will have to guess, or it will have to use features not designed for monitoring to assess the system’s health.

    Example: Aggregate data from multiple decentralized subsystems and publish it via a centralized interface, but don’t provide any way for the consumer to figure out which subsystems are healthy.


    Living in the Future

    The world is constantly changing. From electricity to cars to television to the internet, most generations have seen at least one breakthrough.

    This will continue, and it’s certain that my generation will witness another technological shift.

    Interestingly, how we react to new technologies itself changes with time. For a lot of new technologies, my first reaction was indifference, entirely missing the new possibilities the technology offered.

    The iPhone? I thought it would be a flop. Facebook? I thought it would be a fad. Bitcoin? I thought it would crash.

    It seems I belong to the late majority rather than the early adopters. Maybe Douglas Adams also has a point:

    I’ve come up with a set of rules that describe our reactions to technologies:

    1. Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.

    2. Anything that’s invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.

    3. Anything invented after you’re thirty-five is against the natural order of things.

    Since I’m certain to witness another change, I will have to adapt, whether I like it or not.

    For instance, virtual reality might be a thing after all. It seems very much against the natural order of things to me right now, but actually it’s not much crazier than television was back then.

    The first versions of new technologies always suck. They are bulky, limited, slow, made just usable enough for a specific niche market. For virtual reality headsets, the gamers.

    With widespread adoption, the usage can completely change, though. I’m writing this post on an iPhone using a third-party app, after all. Maybe virtual reality is the future of shopping, who knows.

    The talent is to foresee the potential of a mass market, which isn’t always obvious.

    I think there is a world market for maybe five computers — Thomas Watson, 1943

    Realizing that my ability to predict successful technology changes is about as good as Thomas Watson’s, it’s interesting to try to see how innovators see the world.

    According to Paul Graham, innovators “live in the future.” They are natural early adopters, and their use of technology is such that they simply build what they find missing.

    An alternative formulation that I like is from Tim Urban: innovators have an accurate “reality box.” That is, unlike most people, whose understanding of the world and of what technology enables reflects the common wisdom of 10 years ago, the innovator has an accurate and up-to-date understanding of the possibilities offered by technology. This makes it obvious to create new products around these capabilities.

    Will virtual reality turn out to be the future of shopping, will self-driving cars become mainstream, will Bitcoin establish itself as the first digital currency? Whatever the next breakthrough will be, there’s an exciting time ahead.

    So I’ve decided to be more open to new ideas and to keep my reality box accurate enough to assess them. But changing one’s way of reacting to new ideas is hard, just as hard as predicting the future.

    Wearing a smart watch is still something that doesn’t appeal to me. And it apparently doesn’t appeal to many other people either.


    The New Digital Age

    The New Digital Age explores the impact of internet connectivity and digital media on society. The book witnesses changes that have already occurred, reviews current trends, and tries to predict some future moves.

    Written by Eric Schmidt, a tech executive, and Jared Cohen, a former foreign policy advisor, the book focuses on the impact of technology at the political and societal level, not so much at the individual level (only the first chapter “Our future selves” is about it). I applaud this ambitious agenda.

    People interested in technology and cybercrime (e.g. TED talks, Wired) might be familiar with some of the observations and speculations in the book. How much novelty it carries will depend on the background of the reader. Some of the predictions are however unique to the authors, and they do not hesitate to give their personal opinions. This gives a special edge to the matter.

    The trends and predictions are usually backed up with short anecdotal evidence that is interesting in itself. The overall discussion, however, usually remains quite abstract, which at times gives the impression that it lacks substance. This is to be expected from such a book, though. Prediction and precision don’t match up very well.

    My main criticism of the book is that while the chapters tell a consistent story of how society evolves with periods of peacetime, revolution, conflict, and reconstruction, the chapter internals do not enjoy such a coherent treatment. The predictions that they discuss appear to exist more by accident than as the outcome of a thorough analysis. For instance, I do not recall reading anything about electronic voting. This seems to me like an unavoidable topic for such a book.

    The book also gives a slight feeling of redundancy. Certain topics are discussed from a different point of view from chapter to chapter. For instance, the tension between privacy and security is discussed from the perspectives of state organization, militantism, counterterrorism, etc. An improvement for a second edition would be to provide a roadmap of recurring topics and their treatment in each chapter. That would give a high-level view of the content and avoid this unpleasant feeling of redundancy.

    While the positions in the book are relatively balanced, the overall tone is inevitably biased towards US policy, which is no surprise given Jared Cohen’s background. Also, the book emphasizes tracking and surveillance a lot and will make proponents of an anonymous internet uneasy.

    Overall, I liked the book. The themes addressed are very relevant and it sharpened my understanding of the role of technology in modern society. What the future will really bring, nobody knows.