Working with AI – A First Experiment

AI will inevitably become a tool we all use in the future. To get a first impression of what it is like to work with AI, I decided to build a very small project using ChatGPT as an assistant.

The small project would be a webpage that charts the performance of a portfolio of stocks. I hadn’t written a webpage in a long time (15 years!), so I would have to catch up using ChatGPT. I also decided to explore AWS Lambda at the same time.

The architecture is very simple: the webpage is a static file, and historical stock quotes are stored on AWS S3. A Lambda function fetches the stock quotes every night and stores the output in S3. The portfolio computation is done on the client side. Since only the Lambda function calls the stock API, the API key is never exposed publicly, and I also don’t need a real backend to serve data.
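To make the client-side computation more concrete, here is a minimal sketch of how a portfolio's value and performance could be derived from the stored quotes. It is written in Python for brevity (the actual page uses JavaScript), and the data shapes and function names are assumptions for illustration, not the project's real code.

```python
from typing import Dict, List


def portfolio_values(holdings: Dict[str, float],
                     quotes: Dict[str, List[float]]) -> List[float]:
    """Total portfolio value per day.

    holdings: ticker -> number of shares (hypothetical shape).
    quotes:   ticker -> daily closing prices, oldest first (hypothetical shape).
    """
    # Only use the days for which every held stock has a quote.
    days = min(len(quotes[ticker]) for ticker in holdings)
    return [
        sum(shares * quotes[ticker][day] for ticker, shares in holdings.items())
        for day in range(days)
    ]


def performance(values: List[float]) -> List[float]:
    """Relative change of each day's value compared to the first day."""
    base = values[0]
    return [value / base - 1.0 for value in values]


holdings = {"AAA": 2, "BBB": 1}
quotes = {"AAA": [10, 11, 12], "BBB": [20, 20, 26]}
values = portfolio_values(holdings, quotes)  # [40, 42, 50]
gains = performance(values)                  # last entry is 0.25, i.e. +25%
```

The series returned by `performance` is what a charting library would plot.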

For charting, ChatGPT suggested Chart.js, which was fine. For the stock API, ChatGPT’s suggestions were less useful, and I had to compare the various sites myself. In the end, I settled on marketstack, which has the best free tier I could find. Unfortunately, it doesn’t provide an API for currency rates. For hosting, ChatGPT gave me a handy hint: you can upload your static website to AWS S3 and make it publicly accessible.

With the help of ChatGPT, it took me a couple of hours to build the first version of the webpage using Chart.js and plain JavaScript.

Key learnings:

  • AI productivity boost is real. ChatGPT is quite amazing. It can give good suggestions about technological options. The quality of the code is also surprisingly good. You need to double-check the answers, but it provides a lot of good insights. Definitely a productivity boost.
  • Good onboarding experience helps win clients. There are many stock APIs, and their quality differs a lot. Onboarding is a make-or-break point for any technical product. I chose marketstack because it was the simplest option to get something working, even though I know it doesn’t have a currency API, which I will need later on.
  • Domain knowledge is always an asset. As with most business domains, things are never as simple as they seem. Computing the performance of a portfolio seems like a no-brainer. But stocks can split and pay dividends. Therefore, the nominal historical price is misleading for long-term historical analysis. Instead, APIs provide adjusted closing prices.
  • Designing framework APIs is an art. There are many charting libraries, and the way they are designed differs a lot. This reminded me of Why a Calendar App is a Great Design Exercise. Designing a chart API would be a great exercise, too.
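The point about adjusted closing prices is easier to see with numbers. The sketch below back-adjusts nominal closes for stock splits only (dividends are left out to keep it short). The function name and the way splits are represented are assumptions for illustration; real APIs deliver the adjusted series directly.

```python
from typing import Dict, List


def split_adjusted(closes: List[float], splits: Dict[int, float]) -> List[float]:
    """Back-adjust nominal closing prices for stock splits.

    closes: nominal closing prices, oldest first.
    splits: day index -> split ratio, e.g. {2: 2.0} for a 2-for-1 split
            that takes effect on day 2 (hypothetical representation).
    """
    adjusted = list(closes)
    factor = 1.0
    # Walk backwards so that each split rescales all earlier days.
    for day in range(len(closes) - 1, -1, -1):
        adjusted[day] = closes[day] / factor
        if day in splits:
            factor *= splits[day]
    return adjusted


nominal = [100.0, 102.0, 51.0, 52.0]          # 2-for-1 split on day 2
adjusted = split_adjusted(nominal, {2: 2.0})  # [50.0, 51.0, 51.0, 52.0]
```

With the nominal series, the stock appears to have lost 48% between the first and last day. With the adjusted series, it gained 4%, which is what a shareholder actually experienced.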

As for the webpage, I see lots of ways to improve it further. From the domain point of view, I could add support for comparison with various indexes. From the technical point of view, being able to edit the portfolio would be nice. Supporting several users with login would also be a nice experiment. Figuring out what a delivery pipeline for Lambda looks like would also be interesting. At the moment, it’s all manual uploads to S3.

If I have enough time, I may continue the project with ChatGPT. For the technical points, ChatGPT helps a lot and has proved to be a valuable assistant.

GDP as Proxy for Progress

The gross domestic product (GDP) measures the value of the goods and services produced in a year. It measures the economic activity of a country and is used as a proxy to track the standard of living. The higher the GDP per capita, the more goods and services are accessible to the population, and the higher the living standard.

The very good post “What is economic growth?” from ourworldindata.org makes GDP more tangible:

Have a look around yourself right now. Many of the things you see are products that were produced by someone so that you can use them: the trousers you are wearing, the device you are reading this on, the electricity that powers it, the furniture around you, the toilet that is nearby, the sewage system it is connected to, the bus or car or bicycle you took to get where you are, the food you had this morning, the medications you will receive when you get sick, every window in your home, every shirt in your wardrobe, and every book on your shelf.

Over time, the cost of goods and services decreases, thanks to improvements in production. The utility of the good or service remains constant, though. Conversely, for a given price, the utility increases. An affordable car today is way more comfortable, safe, and efficient than a car from ten years ago for the same price. GDP and progress therefore don’t correlate exactly. A constant GDP could still represent constant, modest progress. But in practice, progress manifests itself as increased GDP per capita.

Using GDP as a proxy for standard of living is subject to debate. Besides access to goods and services, standard of living also encompasses dimensions like access to education, access to nature, or access to health care. Different countries and social systems with similar GDP may fare differently on these points. As a crude measure for the whole country or per capita, GDP says little about the distribution. Inequality isn’t well captured by GDP.

The cousin of GDP to track standard of living is life expectancy. Life expectancy also aggregates and proxies several aspects like education, access to health care, or overall well-being. Interestingly, while both metrics usually correlate, there are discrepancies that remind us that the metrics are only proxies and have flaws.

Ourworldindata.org has an interactive chart to explore GDP and life expectancy:

The GDP is influenced by the household structures. Depending on how GDP is computed, it may or may not contain “services” provided by family members, such as cooking, child care, or taking care of the elderly.

On one hand, it makes sense to include personal activities in the GDP. If you grow your own vegetables, you’re working as a farmer with one consumer, yourself. You’re producing some good for yourself. On the other hand, it’s not economic activity: it’s not traded on a market. This activity will not “benefit” from market mechanisms (or rather be “driven” by market forces), such as market-driven specialisation, allocation of resources, or pricing.

In economic logic, growing your own vegetables isn’t rational, since you use time that you could have invested in some other, more rewarding economic activity, depending on your profession. The same holds for child care: working part-time to raise children isn’t economically rational if you have a high-paying profession.

The reason people are willing to sacrifice profit for such activities at the moment is leisure and fulfillment. They find an increased happiness at some other level in these activities.

Leisure and fulfillment work at the personal level. Doing more outside of the formal economy also brings benefits to the economy itself: it increases resilience and sustainability. These values are not accounted for in classical economics, but maybe they should be.

Given the flaws of GDP, it’s no surprise that other metrics have been developed to track development, for instance the human development index. But the simplicity of GDP (or life expectancy for that matter) is very attractive. GDP will remain the prevalent metric for the years to come.

TOGAF: The Good Parts

TOGAF is a framework for enterprise architecture management. Enterprise architecture aims at aligning the business and IT to achieve the strategic goals of the enterprise. Enterprise architecture supports digital transformation in large enterprises.

The core of the TOGAF framework is the architecture development method (ADM).

In a nutshell, the method works as follows:

  • In the preliminary phase, you build up the enterprise architecture capability itself (that is, you establish and tailor the TOGAF framework).
  • The architecture work is triggered by architecture changes that go through the whole cycle.
  • Phases A-D work out the candidate architecture, which is decomposed into four architecture domains: business, application, data, and technology. Application and data architecture are grouped together into “information systems architecture” in the cycle.
  • The candidate architecture is solution-neutral. Up to this point, you identify changes to business processes, applications, interfaces, data models, and platforms without defining a concrete implementation.
  • A more concrete solution is worked out in phases E and F. The selection of precise technologies, implementation architecture styles (microservices, streaming, etc.), or vendors happens in these phases.
  • Phases E and F also cover planning with key stakeholders using an architecture roadmap and migration plan that define the work packages.
  • In phase G, the project is handed over to the implementation organization (e.g. an agile release train). Expectations about the outcome to deliver and its quality are agreed in an “architecture contract” between the architecture and the implementation organization.
  • Phase H is a retrospective where improvements are formulated and kickstart a new cycle.

The cardinal sin to avoid is to jump from A to G, namely, from the business need to the implementation plan. We’ve all been there: the business has an idea and mandates a team to realize it, without looking left or right and how it would fit in the bigger context. This usually leads to specific solutions for specific problems. Over time, the architecture landscape becomes fragmented and inconsistent. The goal of enterprise architecture is precisely to avoid this. Business needs should be supported by a consistent architecture strategy.

This goal of consistency is supported in TOGAF with the concept of building blocks. The architecture consists of architecture building blocks (business, application, data, or technology) that can be used or reused. In phases B, C, and D, architects identify which building blocks have to be modified, added, or removed for a given change. This is the gap analysis between the baseline and target architecture. In phases E and F, when the concrete solution architecture is worked out, solution architects identify solution building blocks to fulfill the requirements.

Part of TOGAF is also a content metamodel that defines key entities to model the four architecture domains (business, application, data, and technology). It’s pretty generic but can be a good starting point. You will probably have to refine it though, so that it becomes really useful (e.g. refine the technology metamodel to distinguish between platforms and frameworks).

These core concepts of TOGAF define a useful methodology to tackle complex architecture change. These are the good parts.
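At its core, the gap analysis is a comparison of two inventories of building blocks. A minimal sketch, assuming each architecture is simply a mapping from building block name to some description or version (TOGAF itself doesn’t prescribe such a format, and the names below are made up):

```python
from typing import Dict, List


def gap_analysis(baseline: Dict[str, str],
                 target: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare baseline and target architecture building blocks.

    Both arguments map a building block name to a description or version
    (a hypothetical representation chosen for this sketch).
    """
    added = sorted(target.keys() - baseline.keys())
    removed = sorted(baseline.keys() - target.keys())
    modified = sorted(
        name for name in baseline.keys() & target.keys()
        if baseline[name] != target[name]
    )
    return {"added": added, "removed": removed, "modified": modified}


baseline = {"CRM": "v1", "Billing": "v1", "Fax gateway": "v1"}
target = {"CRM": "v2", "Billing": "v1", "Customer portal": "v1"}
gaps = gap_analysis(baseline, target)
# {"added": ["Customer portal"], "removed": ["Fax gateway"], "modified": ["CRM"]}
```

In practice the hard part is agreeing on the inventories, not computing the difference, but the three resulting lists are what feeds the work packages in phases E and F.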

From the perspective of agility, the framework is neither good nor bad. It will all depend on your implementation.

The framework is iterative in nature. Each architecture change goes through the cycle and is an iteration. How big the changes are, how long the cycles last, and how many iterations can run in parallel will depend on your implementation. Part of the preliminary phase is the idea to tailor the framework to your needs. You can implement the core idea in a bureaucratic manner with many deliverables and approvals. But you can also implement the core ideas in a lightweight manner with a few well chosen checkpoints. Similarly, you can partition the enterprise architecture work in “segments”. What they are and how big they are will depend on your implementation.

The bad parts of the framework are the useless ornaments around the core concepts. I can make a list:

  • The many deliverables expected to be produced along the cycle. I like the core concepts as long as they remain concepts. But TOGAF also defines a set of deliverables that probably are never implemented as such.
  • The template library documents predefined viewpoints that can be used to document the architecture. This level of detail is mostly useless.
  • The content metamodel comes with two additional reference architectures to model the technology and application domains. This complicates the discussion about modelling without bringing much benefit.
  • The framework comes with a set of techniques that can be employed to carry out the phases. What has been defined as a technique or not seems rather arbitrary. The techniques can serve as inspiration to conduct real work, but I doubt they will be followed as such.
  • The framework defines a classification scheme for building blocks, ranging from generic to enterprise-specific, called the enterprise continuum. The value of the continuum as a concept and its applicability are rather unclear.

These ornaments make the framework bigger with details, without being practical enough to be really useful. They mostly distract rather than help.

Talk: Centralized/Decentralized (@SBB DevDay’22)

Each year, I try to give a talk at the internal SBB developer conference, called “SBB DevDay”. In previous years, I gave talks about concrete technology challenges. This year I tried something new and gave a talk about development methods and practices.

I consolidated my experience and insights of the past 6 years as lead architect into a useful framework to think about software architecture governance. This perspective mixes traditional agile concepts with concepts of choice architecture.

Using Technology as Intended

Technology people come in two flavours: generalists and experts. I’m definitely a generalist. I have a good grasp of the core concepts behind the technologies we use. However, I lack the expertise in the details of using them. For this, I rely on experts.

This lack of expertise of the details may even turn out to be an advantage in my position. Technologies have core concepts that support some primary technology use case. With enough knowledge about technical details, it’s possible to use the technology to support other use cases. But it’s almost never a good idea in the long term. The implementation will be fragile to subtle changes in details of the technology, and few people in the organization have enough expertise to maintain such an implementation. If your use case is not supported easily, rethink your problem at some other level. Not knowing too much about the technology means I’m limiting its use to what’s maintainable.

The last example of this kind that I encountered was about container dependencies on openshift. In our current system (not openshift-based), we start the containers using “run levels”. Such a concept doesn’t exist in openshift. You could recreate something similar using init containers and some deep technical knowledge, but it wouldn’t be trivial. Rather than misusing the technology, we will have to solve our problem at another level. The containers should be resilient to random startup order at the application level.
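What “resilient at the application level” can mean in practice: instead of relying on the platform to start things in the right order, each service simply retries until its dependencies are reachable. A minimal sketch of that idea, with a hypothetical `connect` callable standing in for whatever client the service actually uses:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_for_dependency(connect: Callable[[], T],
                        attempts: int = 5,
                        base_delay: float = 0.5,
                        sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds, with exponential backoff.

    connect is a placeholder for the real client call; it is assumed to
    raise ConnectionError while the dependency is not ready yet.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up and let the platform restart the container
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With this in place, the startup order stops mattering: a container that comes up too early just waits a bit longer. The `sleep` parameter is only there to make the function easy to test.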

Other examples from the past include details of object-relational mapping or distributed caching. Here too I believe that being “too smart” about using the technology isn’t good in the long term. It’s better to stick to the common use cases and change the design at the application level if necessary.

Sometimes some part of the design may leverage deep technological details. I’m not completely forbidding it. Used in strategic places and encapsulated in a proper technology component, this is a viable strategy. We have for instance engineered a library for leader election based on the solace messaging middleware, which relies on deep expertise in the semantics of queues in solace. This was a strategic decision, and the library is maintained by a team of technology experts with proper know-how and support from solace itself. But such situations should be exceptions rather than the norm.

It’s hard to resist the appeal of a clever technology-based solution, but as engineers we absolutely should. When we misuse technologies, we paint ourselves into a corner.

What Is It Like to Be a Robot?

In “Metazoa”, Peter Godfrey-Smith explores the rise of consciousness in animals – from simple multicellular organisms to vertebrates like us.

Consciousness is a concept that’s not so easy to capture. It’s about a sense of self, about a perception of the environment and oneself, about a subjective experience of the world. When does an animal qualify as conscious? Godfrey-Smith postulates that consciousness is a spectrum, not something one has or doesn’t have. The analogy he uses for this is sleeping, or the state right after waking up. We are conscious, but at a different level of consciousness than when fully awake.

The nature of consciousness can be explored by taking extreme positions:

  • can you be conscious without any perception of the environment (a “pure mind”)?
  • does reacting to what happens around you without any emotion qualify as conscious?
  • do you need to have a nervous system and feel pain to be conscious, or is having a mood enough?
  • could you be conscious, but act indistinguishably from an unconscious animal?

I would have described consciousness as being aware of one’s own existence, something related to mortality, and rather binary. Godfrey-Smith equates consciousness more to having a sense of self and feelings, which makes it something less demarcated. He’s using consciousness more like “awareness”, whereas I would use it more like “self-awareness”. (That said, even self-awareness may not be so binary. Between being aware of deadly dangers and being aware of your own existence, it’s hard to say when we transition from instinct to consciousness.)

The book focuses on the relationship between senses and consciousness. Godfrey-Smith explains how various animals sense the world and which kind of consciousness they might have. Some animals have antennas (shrimps), some have tentacles (octopuses), some feel water pressure (fish). Many animals have vision, but the eye structure can differ. Some animals feel pain (mammals, fishes, molluscs), but some don’t (insects) – it’s however not so easy to define when pain is felt or not. Not feeling pain doesn’t mean the animal is unaware of body damage, just like you don’t feel pain for your car but notice very well when something is broken when driving.

The book reminded me of “What Is It Like to Be a Robot?” by Rodney Brooks. This article, unsurprisingly, references Godfrey-Smith’s previous book, “Other Minds”. Brooks draws parallels between the perception of octopuses and artificial intelligence systems. Many of the questions raised by Godfrey-Smith about the animal world can indeed be translated directly to the digital world. Computer systems have sensors, too. They have rules to react to inputs and produce outputs. They can learn and remember things, and develop an individual “subjective” perception of the world. They don’t “feel” pain, but can be aware of malfunctions in their own system. Does this qualify as a very limited form of consciousness?

The book touches at the end on the question of artificial intelligence, but very superficially. Rather than wondering whether an artificial intelligence could be conscious, Godfrey-Smith focuses on refuting the possibility of human-like artificial intelligence. His argument is basically that neural networks model only a subset of the brain’s physical and chemical processes and thus can’t match human intelligence (there are other physical and chemical processes at play in the brain besides synapse firing). He also argues that an emulation of these processes still wouldn’t cut it, since it wouldn’t be the real stuff.

Artificial intelligence will not have a human-like intelligence, though. Each system (biological or digital) has its own form of intelligence. Because he views artificial intelligence through an anthropomorphic lens, Godfrey-Smith doesn’t explore the avenue of consciousness in AI systems much deeper. This is unfortunate, because with his consciousness-as-spectrum approach, it would have been an interesting discussion.

Wording Matters: Principles vs Practices

It struck me when reading Scaling the Practice of Architecture that people often use the term “principle” in a sloppy way:

There is a great deal I could write here about bad architectural principles but I’ll stick to the key aspects. Firstly, they are not practices. Practices are how you go about something, such as following TDD, or Trunk Based Delivery, or Pair Programming. This is not to say that practices are bad […] they’re just not architectural principles.

I’ve probably used the term in the wrong way more than once. Principles don’t tell you exactly how to do something. They are just criteria to evaluate decisions. All things being equal, take the decision that fulfills the principle the most. Examples of well-known design principles are:

  • Single-responsibility principle
  • Keep it simple, stupid
  • Composition over inheritance

A practice, on the other hand, is a way of doing something. Examples of practices are:

  • Pair Programming
  • Shift left with CI/CD
  • Limit Work in Progress (WIP)

A lot of documents confuse the two. For instance, the SAFe Lean-Agile principles are actually mostly practices.

It could look like principles are for software design and practices are for software delivery. But you can have principles for software delivery, too. For instance, “maximize autonomy” could be a delivery principle. It doesn’t tell you how. It just tells you that if you have two options to design the organization, you should go for the one that maximizes autonomy. On the other hand, a software design practice could be to “model visually”.

Another confusion in this area comes with a third term similar to principles and practices: values. A value is a judgment of what we consider important. Usually values describe behaviors and are expressed as qualities (though “profit” could also be a value). “Autonomy”, for instance, could be a value. A value implicitly embodies the principle of favoring this value over others. For instance, if you value “autonomy”, you will automatically follow the principle “maximize autonomy”. If you adhere to a value, the corresponding principle comes for free.

Finally, there are “conventions” and “guidelines”. Conventions tell you how to do things exactly and are mandatory. You can check if you adhere to a convention or not. This is unlike principles or practices, which have room for interpretation. A guideline is like a convention, but optional. Examples of conventions or guidelines are:

  • Interfaces are versioned
  • Sanitize all inputs
  • Limit WIP to 3

Using a full example of value/principle/practice/guideline within one area, we could have:

  • value: resilience
  • principle: tolerate failures
  • practice: chaos testing
  • guideline: use tolerant reader
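The guideline is the only level in this example that is concrete enough to show in code. A tolerant reader extracts only the fields it needs and is lenient about everything else, so that the provider can evolve its payload without breaking the consumer. A minimal sketch, where the payload shape and the `Quote` type are invented for illustration:

```python
import json
from dataclasses import dataclass


@dataclass
class Quote:
    symbol: str
    close: float


def read_quote(payload: str) -> Quote:
    """Parse a quote while tolerating changes in the payload.

    Only the two fields we need are read. Unknown fields are ignored,
    and a number sent as a string is still accepted.
    """
    data = json.loads(payload)
    return Quote(symbol=str(data["symbol"]), close=float(data["close"]))


# The extra "volume" field and the string-typed price don't break the reader.
quote = read_quote('{"symbol": "ABC", "close": "12.5", "volume": 1000}')
```

Unlike the value, the principle, or the practice, you can check in a code review whether this guideline was followed or not.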

Granted, no matter how we try to distinguish the terms from one another, there will be some overlap in some cases. Natural language is messy. But I think it’s worth using the terms in the most appropriate way if possible. It helps create a mental model that works. If you mix practices, principles, values, and guidelines together, people might not notice immediately, but it creates a cognitive friction that makes it harder to actually apply the underlying ideas.

SAFe: The Lean Mindset

An interesting aspect of the SAFe framework is that it tries to combine two agile mindsets. The first mindset is the iterative mindset of methods like Scrum. It’s a cornerstone of agile development and SAFe “scales” it from the team-level to the program-level, for instance with the PI Planning.

Another mindset in SAFe is the lean mindset. The lean mindset is not about iteration, but about optimising the flow of value.

Lean came initially from manufacturing, where the goal is to (1) reduce the time to produce a physical good, (2) reduce the “inventory” needed in the process, and (3) reduce the “waste” produced during manufacturing. In manufacturing, managing inventory requires warehousing and logistics, which costs money. Materials that end up as waste cost money too but do not produce value. To reduce delivery time, each step in the delivery process must be optimised and wait time reduced to the minimum.

These ideas can be translated to the software world if we consider that features under development are “inventory” and the development process is a pipeline that can be optimised. Features under development are “inventory” since they don’t produce value but must be managed. Waste is a bit harder to map, but it represents all the unnecessary work that ends up not being used (think of unused design documents, analyses, etc.). The development pipeline can take many forms but is always a variation of define, build, verify, and release. The quicker a feature moves through the pipeline, the faster you produce value.

Lean in itself doesn’t require iteration. Iterations are needed to manage uncertainty and course-correct the product development in the face of new information. Lean is about optimising a delivery process. But the delivery process could be about the delivery of a similar item every time, like cars in the manufacturing world.

But Lean is also a great complement to iterative approaches like Scrum. In this case, the goal of the lean mindset is in a way to optimise the iteration speed. Rather than having several features with long delivery time, focus on few features and short delivery time.

SAFe emphasises the lean mindset with concepts like the continuous delivery pipeline and value stream mapping. Besides presiding over the process, the Release Train Engineer (RTE) is also charged with improving the flow of value in the organisation.

The lean mindset isn’t as established as the iterative mindset. I find it interesting that SAFe integrates it and promotes it. We conducted a value stream mapping session at work, and it was very enlightening. Thinking in terms of waiting time, inventory, and waste does indeed work in the software world, too.

It’s a simple way to highlight process and organisational issues. It gives clarity about what should be optimised and helps you not get lost in organisation design. Chances are, if you want to reduce waiting time, you will have to solve a bunch of other problems first. The lean mindset positions these problems not as ends in themselves, but as bottlenecks to short delivery time. It helps you prioritize these problems. It’s a bit like Test-driven Development (TDD). Making things testable requires that you figure out a good design first. But assessing testability is easier than assessing “good design”. In the case of Lean, minimising “waiting time” requires that you figure out a good organisation first, but measuring “waiting time” is easier than measuring “good organisation”.
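To illustrate why waiting time is so easy to measure, here is a small sketch of the kind of numbers a value stream mapping session produces. The step names follow the define/build/verify/release pipeline mentioned above. The durations are made up, and “flow efficiency” (working time divided by total lead time) is a common lean metric that I’m adding here for illustration.

```python
from typing import Dict, List, Tuple, Union

Step = Tuple[str, float, float]  # (name, working days, waiting days)


def flow_metrics(steps: List[Step]) -> Dict[str, Union[float, str]]:
    """Summarise a value stream given per-step working and waiting time."""
    work = sum(work_days for _, work_days, _ in steps)
    wait = sum(wait_days for _, _, wait_days in steps)
    lead_time = work + wait
    # The step with the longest wait is the first candidate to look at.
    longest_wait = max(steps, key=lambda step: step[2])[0]
    return {
        "lead_time": lead_time,
        "waiting_time": wait,
        "flow_efficiency": work / lead_time,
        "longest_wait": longest_wait,
    }


steps = [("define", 2, 8), ("build", 5, 2), ("verify", 2, 15), ("release", 1, 5)]
metrics = flow_metrics(steps)
# lead_time: 40, waiting_time: 30, flow_efficiency: 0.25, longest_wait: "verify"
```

In this made-up example, a feature is actively worked on only a quarter of the time, and the discussion naturally moves to why verification has to wait for 15 days.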

Superficially Silly Ideas Can be Game-Changers

When Twitter appeared more than a decade ago, I thought it was silly. I saw little value in a service that only allowed sharing 140-character text messages. I registered on a bunch of social media platforms and nevertheless created a Twitter account. Some years later, the only social media platform I’m actively using is… Twitter.

The lesson for me is that it’s hard to predict what will succeed. A lot of products can appear silly or superficial at first. They may appear so in the current time frame, but this can change in the future. Initially, Twitter was full of people microblogging their life. It was boring. But it morphed into a platform that is useful for following the news.

A startup like Mighty can look silly now – why would you stream your browser from a powerful computer in the cloud? But as applications are ported to the web, maybe the boundary between thin client and server will move again.

We prefer to endorse projects that appear profound and ethical, like supporting green energy or reducing poverty. Product ideas that are silly or superficial don’t match these criteria, and it’s easy to dismiss them. But innovation often happens because of such products. No matter how silly or superficial you think they are, if they gain traction, they need to solve their problem well at scale. These products are incubators for technologies that can be used in other contexts. Twitter, for instance, open sourced several components. If Mighty gains traction, it might lead to new protocols for low-latency interactive streaming interfaces. An obvious candidate for such a technology could be set-top TV boxes.

These products might appear superficial at first and might lack the “credibility” of other domains, but here too, the first impression might be misleading. A platform like Twitter can support free speech and democracy (sure, there are problems with the platform, but it at least showed there are other ways to have public discourse). A product like Mighty might in turn make owning a computer more affordable for poor people, since it minimizes hardware requirements. Just because these products don’t have a “noble” goal initially attached to them doesn’t mean they don’t serve a noble cause in the long term.

There are of course silly ideas that are simply silly and will fail. But the difference between products that are superficially silly and truly silly is not obvious. In this text, I took the examples of Twitter and Mighty. In retrospect, the case for Twitter is clear. For Mighty, I still don’t know. The idea puzzles me because it’s at the boundary.

Update 13.11.2022