What’s My Exposure to Data Lock-out?

My computer died a few days ago. Fortunately, I had a backup and could restore my data without problem on another laptop. Still, I’ve been wondering in the meantime: what if the restore hadn’t worked? How easily could I be locked out of my data ?

I have data online and data offline. My online data are mostly stored by google. If say, my account is compromised and due to a misbehavior from the hacker, my account is disabled. Would I ever be able to recover my online data? Not sure.

My data offline are stored on the harddrive, which I regularly backup with time machine. If a ransomware encrypts all my data, the backup shouldn’t be affected. Unless the ransomware encrypts slowly over months, without me noticing, and suddenly activates the lock out. Am I sure ransomeware don’t work like this? Not sure.

My laptop suffered a hardware failure. It hanged during booting, and no safe booting mode made it through. The “target disk” mode seemed still to work, though. It would have been a very bad luck, to not be able to access either the data on the harddisk or the backup. Both should fail simultaneously. But can we rule out this possibility? Not sure.

Harddisks and backup can be encrypted with passwords. I don’t make use of this option because I believe it could make things harder in case I have to recover the data. I could for instance have simply forgotten my password. Or some part could be corrupted. Without encryption I guess the bad segment can be skipped; with encryption I don’t know. Granted, these are speculative considerations. But are they completely irrational? Not sure.

Connecting my old backup to the new computer turned out to be more complicated than I thought. It involved two adapters: one for firewire to thunderbolt 2 adapter and one thunderbolt 2 to thunderbolt 4 adapter. Protocol and hardware evolve. With some more older technology, could it have turned out to be impossible to connect it to the new world? Not sure.

The probability of any of these scenario happening is small. It would be very bad luck and in some case would require multiple things to go wrong at once. But the impact would be very big—20 years of memory not lost, but inaccessible. There’s no need to be paranoid, but it’s worth reflecting on the risks and reduce the exposure.


The Inevitability of Superintelligence

If we assume that the brain is a kind of computer, artificial intelligence is the process of reproducing its functioning. Based on this hypothesis, it’s easy to dismiss the possibibility of above-human intelligence by arguing that we can only program what we understand, which would means the intelligence in the machine is bounded by our own. But it’s also very easy to refute this limitation by arguing that we encode learning processes in the machine. These learning processes would be working at a scale and speed that we can’t match. The machine will beat us.

This later argument definitively seems to hold if we look at recent achievements in deep learning. Computer achieve some tasks that very much ressemble some form of intelligence. Looking more carefully, it’s however questionable whether we should speak of intelligence or simply knowledge. Techniques like deep learning enable computers to learn facts based on large amounts of data. These facts might be very sophisticated, ranging from recoloring images correctly to impersonating the artistic style of a painters. But the computer isn’t intelligent because no reasoning really happen.

This leads actually to an interesting question about intelligence. How much of intelligence is simply about predicting things based on experiences? If an object fall, you predict its position in the future to catch it, based on other experiences with falling objects. If someone asks you a “what’s up?”, you can predict that they expect to learn about what’s going on.  With GPT-3, which works according to this principles, you can almost have a conversation. I say almost, because we also see the limit the approach. There are some classes of question that don’t work, like basic arithmetic.

Current artificial intelligence is able to learn, either by analysing large quantities of data (deep learning) or simulating an environment and learning what works and what doesn’t (reinforcement learning). But we’re still far from sentient, thinking machines.

If we assume that our brain is some kind of computer performing a computation, there’s however nothing that prevent us from replicating it. With this line of thought, it’s only a matter of time until we “crack” the nature the intelligence and will find the right way to express this computation. When this breakthrough will happen is unknown – maybe in a decade, maye much later – but there’s nothing that make it impossible. With sufficient perserverence, this breakthrough is inevitable.

Speaking in terms of computation and data, a system becoming smarter can happen in two ways. The first one, is what we have now. Systems that learn over time through the accumulation of data. The computation remains the same though. A deep learning network is programmed once (by human!) and than trained on large quantities of data to adjust its paramter. But maybe a second class of systems exists: system that self-improve by changing their computation. Systems able to inspect and change themselves do exist and are called reflective systems. In such a system, data can be turned in computation and computation into data. The system can thus modify itself.

Some people believe that with artificial intelligence, we risk beeing outsmarted by a “explosition” of intelligence. Systems of the first class learn within the bounds of the computation that defines them – however complex this computation is. The possibility of an explosion is limited. With systems of the second classes, we’re free to speculate, including the possibility of an explosion of intelligence. Such a system could outsmart us and lead to superintelligence.

If we assume that our brain is a computation: is it self-improving or not? Children acquire novel cognitive capabilities over time, which at least give the illusion of self-improvement. But maybe these learnings are only very complex form of data accumulation. Also, the boundary between reflective and non-reflective systems is not black and white. A fully reflective system can change any aspects of its computation, whereas a non-reflective system processes input data according to fixed rules that never changes. A system that is able to infer and defines some rules for itself would fall in between both categories: the rules can change, but to an extent that is limited to some aspect of the computation. The adaptive nature of neural networks could, in some way, be seen as a limited form of rule changing: the rules are fixed, but the “weight” given to them change over time due to feedback loops.

Learning requires data provided by an environment. We’re able to learn only because we interact with the world and other people. If we were to replicate the computation in our brain and the learning process that takes place, we would also need to simulate the environement. The computational complexity of all this is probably enourmous. Maybe we can replicate the computation in our brain, but not the environment, or only limited forms of it. In which case, it’s hard to tell what kind of intelligence could be achieved.

Depending on the computation and environment that we simulate, the intelligence won’t resemble human intelligence much. The algorithm of AlphaGo learns in an enviornment that only consists of the Go rules. We can not even image what this world would be like. Assuming that the artificial intelligence is human-like misjudges the nature of human intelligence. Intelligence is not one quantity that we can weight based on clear criterium. Intelligence has many facets and is contextual.

For some facets, like arithmetics, machines are for sure already superintelligent.



Mastering Technology

Things move fast in the IT industry. Half of the technologies I use today didn’t exist ten years ago. There’s a constant push to switch to new technologies.  But when it comes to technologies, I’m kind of conservative. I like proven technologies.

The thing that makes me conservative is that mastering a technology takes a lot longer that we think.

Most advocates of new technologies massively underestimated the learning curve. Sure, you get something working quickly. But truely understanding a new technology takes years.

Take object-oriented programming. On the surface it’s easy to grasp quickly and become productive. But it took the industry something like 20 years to figure out that inheritance isn’t such a great idea. The result is that early object-oriented systems overused inheritance hoping it will favor reuse, whereas it just led to hard mainteance.

The same holds for instance for the idea of distributed objects. It’s easy to grasp and appealing . Yet it took the industry decades to realize that abstracting remote boundaries is a flawed idea. We instead need to explicitly embrace and expose asynchronous API.

Another one of my favorite easy-to-grasp-but-hard-to-master technology is object-relational mappers (e.g. Hibernate). 10 years of experience and I’m still struggling with it as soon as the mapping isn’t entirely trivial.

Want another example? XA Transaction. Updating a database row and sending a message seems to be the poster child for XA Transactions. Well, it turns out that this simple scenario is already problematic. When it didn’t work I learned that I was experiencing the classic “XA 2-PC race-condition“.

There are lots of new technologies being developed right now, like distributed log systems, container schedulers, web framework. I perfectly understand why they are being developed and what problems they supposedly solve.

But don’t try to convince me that they are silver bullets. Every technology choice is a trade-off because you can never fully abstract the underlying complexity away. There’s a price somewhere. Things will break in unexpected way and nobody knows why. Performance will be hard to figure out. Subtle misuse of the technology will only be detected later and be hard to correct. It will take time to figure these things out.

At the end it’s all about risk management. If the technology might provide a strategic advantage, we can talk. The investment might be worth it. But if it’s not even strategic, I would seriously challenge if the risk of using new technologies is worth it.


  • Technology

    How Technology Evolves

    We often take for granted the technology we have and forget that it’s the result of a tedious evolutionary process.

    A Railroad Track is the Width of Two Horses is one of the first stories about the evolution of technology that I remember reading, maybe ten years ago. It rings more like a colorful story than a true historic account, but it nevertheless left an impression on me.

    Later, doing research gave me a bette appreciation how of ideas evolve, cross-polinate and morph over time. True hindsights are rare. It’s a lot about tweaking existing ideas until the right form that works is found.

    Here are some of the most engaging stories about technology history that I’ve read:

    Oh boy, innovation is so a messy process.


    Platforms and Innovation

    I started my career writing flash applications. Then I moved to Java. Both are middleware technologies that abstract the underlying operating system and enable cross-platform interoperability. I’ve actually never wrote a professional application that relied directly on a specific operating system.

    This was fine to me. “Write once, run everywhere” was great for productivity.

    For the kind of applications I was developing, what these middleware stacks provided was enough. Maybe I occasionally wished that drag and drop between the application and its host system was better supported, but that’s it more or less. I didn’t really miss a deeper integration with the rest of the system.

    These technologies were also innovative on their own. Flash enabled developers to create rich web applications back in a time when web sites were mostly static. The same was true of Java and its applets, even if the technology never really took off.

    But middleware technologies also slow down innovation.

    An operating system provider wants developers to adopt its new functionalities as quickly as possible, to innovate and make the platform attractive. Middleware technologies make such adoption harder and slower.

    The official Apple memo “Thoughts on Flash” about not supporting Flash on iOS makes it very clear:

    We know from painful experience that letting a third party layer of software come between the platform and the developer ultimately results in sub-standard apps and hinders the enhancement and progress of the platform.

    The informal post “What really happened with Vista” gives similar arguments against middleware stacks:

    Applications built on [cross-platform] middleware tend to target “lowest common denominator” functionality and are slower to take advantage of new OS capabilities.

    For desktop applications, a good integration with the operating system was a plus, but not a killer. The drag and drop functionality I occasionally missed didn’t impact the whole user experience.

    With mobile devices, everything is different.

    Mobile applications are more focused and need to integrate on the device seamlessly–in terms of user experience, but also connectivity and power consumption. That’s what “Thoughts on Flash” was about.

    Think of notifications. Notifications for desktop applications are nice, but not a killer. For a mobile application, how the application integrates with notifications makes the difference between success and failure. Notifications are becoming the heart of the smartphone experience. You don’t want there to suck.

    Or think of ARKit, Apple’s upcoming augmented reality toolkit. Augmented reality hasn’t yet really hit the mass market and there is lots of potential there. If only, it will make our good old fashion ruler obsolete to measure distances. But such a toolkit relies on specific hardware (sensor, CPU, camera). You don’t want middleware there to slow down adoption.

    Platforms diverge and sometimes converge. They diverge when exclusive capabilities are added and converge when a cross platform standard is adopted.

    With HTML5 we have a good standard for regular applications with desktop-like features. The GMail mobile web application is for instance so well done, that I prefer it to the native iOS version. But you can only go that far with HTML5. If you want to push the envelope, you need to go native and use the full power of the platform.

    For applications in the broader context of the digitalization (social media, artificial intelligence, internet of things) innovation at the platform level will be decisive.

    The platform war will intensify.



    10 Tips to Fail with Enterprise Integration

    If you want to make enterprise integration needlessly complicated, follow these tips.

    1. Model poorly

    A poor model is always a nice way to make things more complicated than they should.

    Examples: You can name thing badly. You can model everyting as strings (key, list, etc.). Or you can reuse overly generic abstractions in multiple contexts instead of defining one abstraction per context. Or you can expose a relational model instead of an entity model.

    2. Use immature technologies

    Whenever possible, use immature, non-standard, or inappropriate technologies to make the integration complicated.

    Example: Don’t use XML but JSON. Its support in IDE is still weak, its semantics for the various numeric types is poor, it prevents proper code generation (for class-based language), and JSON-Schema is still a draft.

    3. Assume the network is perfect

    Assume the network is perfect. It has infinite bandwidth as well as zero latency. This is a classic for disaster. Ignore completely the reality of networking. If your interface is sound at the logical level, then it will be fine in production.

    Examples: Don’t distinguish between the time of the event you model and the technical time when the message was sent or received–it doesn’t matter since latency is zero. Or send replies to individual requests on a topic and leave the burden of filtering out the irrelevant replies to the subscriber at the application level–it doesn’t matter since bandwith is infinite.

    4. Make loads and updates asymmetric

    It is common for an interface to publish updates on topics but also provide a mean for the consumer to load data upon startup. In such case, the system should work so that the same data are delivered to the consumer for loads and updates. To introduce subtle data inconsistencies, make it so that loads and updates don’t deliver the same data.

    Example: If an entity has multiple status, do not publish all status changes per updates. This way, there is a discrepance between the data you obtain per load requests and per updates.

    5. Make the system as stateful as possible

    If you find a way to complicate state management, go for it.

    Examples: Instead of publishing entities that are consistent, publish only the delta with what has changed. The consumer must carefully ensure that all deltas are applied in order. Or define requests that reference other requests, e.g. to implement paging. The provider will need to do some bookkeeping of the previous requests.

    6. Leave the protocol vague

    By defining the transport technology, the encoding, and the various messages that can go through your interface, most readers of the specification will have a good understanding of what the purpose of the interface is. So stop there. Don’t bother explaining the exact protocol with the assumptions about the order of messages or when a given message can be sent or not. This way, you leave the door open to non obvious misunderstandings.

    Example: don’t specificy which requests can be used anytime and which should be used only occasionally after a restart or recovery.

    7. Don’t properly version your interface

    Your interface will need to change. Don’t provide proper versioning. This way, supporting multiple versions will be a pain.

    Example: Use XML Namespaces, but don’t use it for versioning.

    8. Redefine the semantics of data between versions

    Do subtle changes to the meaning of the data, so that the semantics changes in a non obvious way.

    Example: Redefine what “null” means for a certain attribute.

    9. Don’t distinguish between endpoint and tenant

    Your interface will be accessible through an endpoint that will probably be used from multiple consumer systems (“tenant”). Define SLA per endpoint, but not per tenant. This way you will need to deploy multiple endpoints to really guarantee SLA for specific consumers.

    Example: provide a limit for the frequency of load requests at the endpoint-level, but independent of the consumer systems. If a consumer misbehaves, it will prevent all other consumers from loading data.

    10. Ignore monitoring needs

    Do not provide any meaningful way for the consumer to check whether the provider is healthy or not. Either the consumer will have to guess, or it will have to use feature not designed for monitoring to assess the system health.

    Example: aggregate data from multiple decentralized subsystems and publish them via a centralized interface, but don’t provide any way for the consumer to figure out which subsystem is healthy or not.



    Living in the Future

    The world is constantly changing. From electricity to cars to television to the internet, most generations have seen at least one breakthrough.

    This will continue, and it’s certain that my generation will witness another technological shift.

    Interestingly, how we react to new technologies changes itself with time.  For a lot of new technologies, my first reaction was indifference, missing entirely the new possibilities the technology offered.

    The iPhone? I thought it would be a flop. Facebook? I thought it would be a fad. Bitcoin? I thought it would crash.

    It seems like I belong to the late majority rather than the early adopters. Maybe Douglas Adams has also a point:

    I’ve come up with a set of rules that describe our reactions to technologies:

    1. Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.

    2. Anything that’s invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.

    3. Anything invented after you’re thirty-five is against the natural order of things.

    Since I’m certain to witness another change, I will have to adapt, whether I like it or not.

    For instance, virtual reality might be a thing, after all. It seems to me very against the natural order of things right now, but actually it’s not much crazy than television back then.

    First versions of new technologies always sucked. They were bulky, limited, slow, made just usable enough for a specific niche market. For virtual reality helmets, the gamers.

    With widespread adoption, the usage can completely change, though. I’m writing this post on an iPhone using a third party app, after all. Maybe virtual reality is the future of shopping, who knows.

    The talent is to foresee the potential of a mass market, which isn’t always obvious.

    I think there is a world market for maybe five computers — Thomas Watson, 1943

    Realizing that my ability to predict successful technology changes are as good as Thomas Watson, it’s interesting to try to see how innovators see the world.

    According to Paul Graham, innovators “live in the future.” They are natural early adopters and their use of technology is so that they simply build what is missing to them.

    An alternate formulation which I like is from Tim Urban: innovators have an accurate “reality box.” That is, unlike most people, whose understanding of the world and what technology enable reflects the common wisdom established 10 years ago, the innovator has an accurate and up-to-date understanding of the possibilities offered by technology. This make it obvious to create new products around these capabilities.

    Will virtual reality turn out to be the future of shopping, or self driving cars become mainstream, or bitcoin establish itself as a the first digital currency? Whatever the next breakthrough will be, there’s an exiting time ahead.

    So I’ve decided to be more open to new ideas and keep my reality box more accurate to assess them. But changing one’s way of reacting to new ideas is hard, just as well as predicting the future.

    Wearing a smart watch is still something that doesn’t appeal to me. And it apparently doesn’t appeal to many other people either.



    The New Digital Age

    The New Digital Age explores the impact of internet connectivity and digital media on society. The book witnesses changes that have already occurred, reviews current trends, and tries to predict some future moves.

    Written by Eric Schmidt, a tech executive, and Jared Cohen, a former foreign policy advisor, the book focuses on the impact of technology at the political and societal level, not so much at the individual level (only the first chapter “Our future selves” is about it). I applaud this ambitious agenda.

    People interested in technology and cyber criminality (e.g. TED talks, Wired) might be familiar with some of the observations and speculations in the book. The novelty that it carries will depend on the background of the reader. Some of the predictions are however unique to the authors, and they do not hesitate to give their personal opinions. This gives a special edge to the matter.

    The trends and predictions are usually backed up with short annectodical evidences that are interesting in themselves. The overall discussion remains however usually quite abstract, which at times gives the impression that it lacks substance. This is to be expected from such a book, though. Prediction and precision don’t match up very well.

    My main criticism of the book is that while the chapters tell a consistent story of how society evolves with periods of peacetime, revolution, conflict, and reconstruction, the chapter internals do not enjoy such a coherent treatment. The predictions that they discuss appear to exist more by accident than as the outcome of a thorough analysis. For instance, I do not recall reading anything about electronic voting. This seems to me like an unavoidable topic for such a book.

    The book gives also a slight feeling of redundancy. Certain topics are discussed from a different point of view from chapter to chapter. For instance, the tension between privacy and security is discussed under the perspective of state organization, militantism, counter terrorism, etc. An improvement for a second edition would be to provide a roadmap of recurring topics and their treatment in each chapter. That would give a high-level view of the content, and would avoid this unpleasant feeling of redundancy.

    While the positions in the book are relatively balanced, the overall tone is inevitably biased towards US policy, which is no suprise given Jared Cohen’s background. Also, the book emphasizes tracking and surveillance a lot and will make proponents of an anonymous internet uneasy.

    Overall, I liked the book. The themes addressed are very relevant and it sharpened my understanding of the role of technology in modern society. What the future will really bring, nobody knows.


    Using Multiple Google Calendars with iOS 5

    With Google calendars, you can create additional calendars linked to your account. This is convenient, say, to split your own events from events of others that don’t use any online calendar but that you want to track.

    With iOS 4, adding a google account would display only the primary calendar, not the auxiliarary ones. The solution then was to add them individually as WebCal calendars. The WebCal URL is somehow cryptic but could be obtained from the ID of the calendar found in the settings.

    After upgrading to iOS 5, all events are duplicated n times, where n is the number of auxialliary calendars. Sounds like an aweful bug, isn’t it? Actually not, things only got better. The auxiliary calendars are now correctly supported.

    Go to and select the auxiliary calendars you want to sync. The google calendars you selected will all appear under you google account on iOS 5 (maybe you need to recreate it, though). You can remove the spurious individual WebCal calendars safely.


    The Social Network

    I wasn’t much involved or interested in social media (twitter and the likes) until I joined SCG a few month ago. I had a rather defensive attitude and wanted to have the smallest fingerprint on the web. For several reasons, I nevertheless started using Google Shared Link, Twitter, CiteULike and Stackoverflow to see how they worked.

    I must admit that I kind of like them all, now that I overcame my initial resistance.  But what I liked most is the surrounding questions on the evolution of the society. Here is a bunch of points I’m questioning myself about these days.

    Ranking, reputation and suggestion system

    The heart of these systems is to identify the value that the community gives to certain person or item (value is vague, maybe relevance or credibility would be better). This value can be mined using information about the network, or number of visit, etc. or by requesting user to vote. Purpose of these systems is to be fair, objective and democratic. Such systems are however complex to create. You need to design a set of rules that fit the purpose as well as a set of counter-mechanisms to eliminate abnormal behavior that still slip in (e.g. robot visit, abnormal pattern in user vote, etc.).  Ultimately all such system have their own weakness. This wasn’t too a problem when we didn’t depend critically on such system, but this is now the case.

    The value of our second life

    How much value to give to the web presence of an individual? For instance, recruitement has already changed with the appearance of job sites first, but then of online CV. This tendency will continue and expand to all area of our life. We can expect in the future to have consolidated profile be used more and more prior to meeting people for real. You can’t just erase all that and start from sratch. This may seriously bias our opinion on people. Prejudges related to a our web presence may be hard to overcome. Our presence on the web will be a direct measure of our skills, as is the case for instance with stackoverflow QA and CV. Will this expand to other area? Will we soon see  sentences such as “10+ meme on twitter is a plus” for people working in PR?
    •    How much should we trust this information?
    •    What is the “critical mass” that these systems must reach to really work?
    •    Does it represent the real soft- and social-skills of a person?
    •    Can we really sum up people with numbers?
    •    When will the first “single consolidate metric” appear that grades an individual according to its complete web presence?

    Community vs. individual

    The web was first driven by communities. People which contributed to the web, adhered to the value of these communities. However, if the tendency to expose single individual continues, there will more and more tension between the community aspects and the individual, selfish aspects. This tension isn’t new and has probably been studied since decades in sociology and psychology, but the expansion of this tension to the web is new.  And the effect is unknown.  Everybody will be an active player the Internet and not just a passive user, as during the past decade.  We can then expect much more friction and instability in these social web site. Or maybe not.

    Nowhere to Hide: Assessing Your Work Reputation Online