11 Reasons Why I Hate XML

… at least in Java.

1 – Namespace and import

XML is only apparently simple. As soon as namespaces are used, it immediately gets complicated. What is the difference between targetNamespace=”…”, xmlns=”…” and xmlns:tns=”…”? Can I declare several prefixes for the same namespace? Can I change the default namespace from within a document? What happens if I import a schema and rebind it to another namespace? How do I reference an element unambiguously? Ever wondered how to create a QName correctly? Ever wondered what happens if you have a cycle in your dependencies?
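
To make two of these questions concrete, here is a minimal sketch using the JDK's JAXP DOM parser. The namespace URI urn:example:orders and the element name item are made up for illustration. It suggests that several prefixes (or a default namespace) bound to the same URI all yield the same QName, because QName equality ignores the prefix.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceDemo {
    // Hypothetical namespace URI, used only for this illustration.
    static final String NS = "urn:example:orders";

    // Three children in the same namespace: via prefix a, via prefix b, via a default namespace.
    static final String XML =
        "<root xmlns:a='" + NS + "' xmlns:b='" + NS + "'>"
        + "<a:item/><b:item/><item xmlns='" + NS + "'/>"
        + "</root>";

    // QName rejects a null prefix, so map "no prefix" to the empty string.
    static QName toQName(Element e) {
        String prefix = e.getPrefix() == null ? "" : e.getPrefix();
        return new QName(e.getNamespaceURI(), e.getLocalName(), prefix);
    }

    static Set<QName> distinctItemNames() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // namespace awareness is off by default
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
        NodeList items = doc.getElementsByTagNameNS(NS, "item");
        Set<QName> names = new HashSet<>();
        for (int i = 0; i < items.getLength(); i++) {
            names.add(toQName((Element) items.item(i)));
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The three elements collapse to a single qualified name.
        System.out.println(distinctItemNames().size());
    }
}
```

Forgetting setNamespaceAware(true) is a classic trap: getLocalName() and getNamespaceURI() then return null and the lookup by namespace finds nothing.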

2 – Encoding and CDATA

XML encoding and file encoding are not the same, and this is a huge source of trouble. Both encodings must match, and the XML file should be read and parsed according to the encoding specified in the XML declaration. Depending on the encoding, characters are serialized differently, which is again a huge source of confusion. If the reader or writer of an XML document behaves incorrectly, the document can be silently corrupted and information lost. Editors don’t necessarily display the characters correctly even when the document is right. Ever got a ? or ¿ in your text? Ever made a distinction between &amp; and & ? Ever wondered whether a CDATA section was necessary or if using UTF-8 would be ok? Ever realized that > can be used as-is in text and in attribute values, whereas < must be escaped in both (unless it sits in a CDATA section)?
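
A small sketch of the mismatch, assuming the JDK's default JAXP parser; the document content is invented for the example. Handing the raw bytes to the parser lets it honor the declared encoding, whereas decoding the same bytes yourself with another charset corrupts the text before the parser ever sees it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EncodingDemo {
    // Invented sample document declaring ISO-8859-1; \u00e9 is "e acute".
    static final String XML =
        "<?xml version='1.0' encoding='ISO-8859-1'?><t>caf\u00e9 &amp; bar</t>";

    // Right: give the parser bytes, so it decodes them using the declared encoding.
    static String parseFromBytes() throws Exception {
        byte[] bytes = XML.getBytes(StandardCharsets.ISO_8859_1);
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    // Wrong: decode the file with a charset that ignores the declaration.
    static String decodedWithWrongCharset() {
        byte[] bytes = XML.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseFromBytes());
        System.out.println(decodedWithWrongCharset());
    }
}
```

Note also that the parser turns &amp; back into a plain & in the text content; the escaped form only exists in the serialized document.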

3 – Entities and DOCTYPE

Somehow related to #2, but not only. XML entities are a generic way to define variables and are declared in the DOCTYPE. You can define custom entities; this is rather unusual but still needs to be supported. Entities can be internal or external to your XML document, in which case entity resolution might differ. Because entities are also used to escape special characters, you cannot consider them an advanced feature that you won’t use. XML entities need to be handled with care and are always a source of trouble. For instance, the tag <my-tag>hello&amp;world</my-tag> can trigger three separate characters(...) events with SAX.
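
Since SAX allows a parser to deliver character data in as many chunks as it likes, a handler should accumulate text rather than assume one call per element. Here is a minimal sketch with the JDK's SAX parser; the class names are mine.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTextDemo {
    // Collects all character data and counts how many times characters(...) was called.
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        int calls = 0;

        @Override
        public void characters(char[] ch, int start, int length) {
            calls++;
            text.append(ch, start, length); // append, never overwrite
        }
    }

    static TextCollector collect(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance()
            .newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        TextCollector c = collect("<my-tag>hello&amp;world</my-tag>");
        // The number of calls depends on the parser; the accumulated text does not.
        System.out.println(c.calls + " call(s): " + c.text);
    }
}
```

In a real handler the buffer would typically be reset in startElement and read in endElement.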

4 – Naming convention

Ever wondered whether it was actually better to name your tag <my-tag/>, <myTag/> or <MyTag/>? The same goes for attributes…

5 – Null, empty string and white spaces

Distinguishing null from the empty string in XML is always painful. Null would be represented by the absence of the tag or attribute, whereas the empty string would be represented by an empty tag or empty attribute. The same problem appears if you want to distinguish an empty list from no list at all. If this is not considered upfront (which is frequently the case), it can be very hard to retrofit the distinction cleanly into an application.
Whitespace is another issue of its own. The way tabs, spaces, carriage returns and line feeds are processed is always confusing. There are some options to control it, but they are way too complicated for most uses. As a consequence, these special characters will sometimes be encoded as entities, sometimes embedded in CDATA and sometimes stored as-is in the XML.
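
A minimal sketch of the absent-versus-empty convention described above, using the JDK's DOM parser. The element names person and name are hypothetical, and mapping "absent" to null is just one possible convention.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NullVsEmptyDemo {
    // Returns null when <name> is absent, and its (possibly empty) text otherwise.
    static String readName(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList names = doc.getDocumentElement().getElementsByTagName("name");
        return names.getLength() == 0 ? null : names.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readName("<person/>"));                        // absent
        System.out.println(readName("<person><name/></person>"));         // empty
        System.out.println(readName("<person><name> </name></person>"));  // whitespace only
    }
}
```

The third case shows the whitespace problem: a single space is neither null nor empty, and the code has to decide explicitly whether to trim it.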

6 – Normalization

XML encryption and signature look fine on paper. But as soon as you dig into the specs, you realize that they are not so easy because of the syntactic and semantic equivalence of XML documents. Is <my-tag></my-tag> the same as <my-tag/>? To solve this issue, XML normalization (canonicalization) was introduced, which defines the canonical representation of a document. Good luck understanding all the subtleties when considering remarks #1, #2, #3 and #5.
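
The sketch below does not implement canonicalization. It only illustrates the underlying problem with DOM's isEqualNode: two documents can differ byte for byte and still carry the same information, which is exactly why a signature over the raw bytes is fragile. The sample documents are invented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class EquivalenceDemo {
    static Element root(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
    }

    // Structural equality of the parsed trees (DOM Level 3), not byte equality.
    static boolean sameTree(String a, String b) throws Exception {
        return root(a).isEqualNode(root(b));
    }

    public static void main(String[] args) throws Exception {
        String longForm = "<my-tag></my-tag>";
        String shortForm = "<my-tag/>";
        System.out.println(longForm.equals(shortForm));    // the strings differ
        System.out.println(sameTree(longForm, shortForm)); // the trees do not
    }
}
```

Attribute order and quote style are further examples of differences that vanish after parsing, while whitespace inside an element does not.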

7 – Too many APIs and implementations

Even if things have improved in this area, there are too many APIs and implementations available. Sometimes I wish there was one unified API and one single implementation… Ever wondered how to select a specific implementation? Ever got a classloader issue due to an XML library? Ever got confused about whether StAX was actually better than SAX for reading XML documents?

8 – Implementation options

Most XML implementations have options or features to deal with the subtleties I just described. This is especially true for namespace handling. As a consequence, your code may work on one implementation but not on another. For instance, startDocument should be used to start an XML document and deal with namespaces correctly. The strictness of the implementations differs, so don’t take 100% portability for granted.

9 – Pretty printing

There are so many APIs and frameworks that pretty printing is always a mess to deal with, if the framework supports it at all.
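
One common approach with plain JAXP is an identity Transformer with indentation switched on. This is a sketch only: the indent-amount key is implementation-specific (it is understood by Xalan-derived transformers such as the JDK's built-in one), and the exact whitespace produced varies between JDK versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PrettyPrintDemo {
    static String prettyPrint(String xml) throws Exception {
        // A Transformer created without a stylesheet copies its input to its output.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific key: other transformers may silently ignore it.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(prettyPrint("<a><b>1</b></a>"));
    }
}
```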

10 – Security

XML was not designed for security. Notorious problems are: dangerous framework extensions, XML bombs, outbound connections to access remote schemas, excessive memory consumption, and many more problems documented in this excellent article from MISC. As a consequence, XML documents can easily be abused to disrupt the system.
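
A sketch of a defensively configured DOM parser follows. The disallow-doctype-decl feature is a Xerces feature that the JDK's built-in parser supports; other implementations may reject it, so treat the exact feature set as an assumption to verify against your parser.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class HardenedParserDemo {
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Standard JAXP switch that enables implementation-defined processing limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Xerces-specific: refuse any document that carries a DOCTYPE,
        // which rules out entity bombs and external entities altogether.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true when the hardened parser refuses the document.
    static boolean isRejected(String xml) throws Exception {
        try {
            hardenedFactory().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isRejected("<!DOCTYPE a [<!ENTITY x 'y'>]><a>&x;</a>"));
        System.out.println(isRejected("<a>ok</a>"));
    }
}
```

Forbidding DOCTYPE is the bluntest option; it only works if none of your legitimate documents rely on a DTD.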

11 – XPath and XSLT

XPath and XSLT belong to the XML ecosystem and suffer from the same problem as XML itself: apparent simplicity but internal complexity. I won’t speak here about everything else that surrounds XML and that forms the big picture of the XML family of specifications. I will just say that I recently got an NPE in NetBeans because “/wsa:MessageID” was not ok but using “/wsa:MessageID/.” was just fine. Got the point?
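
For reference, this is roughly what it takes to evaluate such a prefixed expression with the standard javax.xml.xpath API. The sketch does not reproduce the NetBeans problem; it only shows that the prefix in the expression is resolved through a NamespaceContext that you supply, independently of the prefix used in the document. The message content uuid:42 is invented.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathNamespaceDemo {
    static final String WSA = "http://www.w3.org/2005/08/addressing";

    // Binds the prefix "wsa" for use inside XPath expressions only.
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "wsa".equals(prefix) ? WSA : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String uri) {
            return WSA.equals(uri) ? "wsa" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String uri) {
            return WSA.equals(uri)
                ? Collections.singletonList("wsa").iterator()
                : Collections.<String>emptyIterator();
        }
    };

    static String evaluate(String expression, String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(CONTEXT);
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The document uses prefix "m"; the expression uses "wsa". Both map to the same URI.
        String xml = "<m:MessageID xmlns:m='" + WSA + "'>uuid:42</m:MessageID>";
        System.out.println(evaluate("/wsa:MessageID", xml));
        System.out.println(evaluate("/wsa:MessageID/.", xml));
    }
}
```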

OpenESB: Invoke an Asynchronous Web Service

I was contacted last week and asked whether I had actually integrated an asynchronous web service in OpenESB, as promised in a previous post. The NetBeans SOA package is sometimes a bit obscure, though there are some explanations of the examples. I took a bit of time to dig into this, and here is the promised follow-up (except that I won’t use WS-Addressing). I will use:

  • OpenESB bundled with Glassfish
  • NetBeans to author the BPEL process
  • SoapUI to test the process

What we want to get

The BPEL process that will be created is a synchronous BPEL process, which calls an asynchronous web service using a correlation set to “resume” the process when the asynchronous response is received. The scenario is not very realistic – a BPEL process that calls an asynchronous WS will itself be asynchronous most of the time. The asynchronous WS may indeed take arbitrarily long to respond; the client of the BPEL process would probably time out in this case. This example suffices, however, to show the underlying principles.

  • The BPEL process is synchronous
  • But it calls an asynchronous web service
  • We use a correlation set for request/response matching

The BPEL process that we want to obtain at the end is shown below:

Create the PartnerLinks

One or two PartnerLinks?

Communication to and from the asynchronous web service can be realized using a single partner link with two ports or using two partner links with one port each.
From the point of view of BPEL, an asynchronous request/response is indeed no more than a pair of one-way messages. The request/response matching will be done using a correlation set anyway.

As a consequence, the messages can come from “anywhere” and there is therefore no need to have one single partner link. I found it easier to have two partner links so that all ports on the left side are the ones exposed by the process, and all ports on the right side are the ones consumed by the process.

WSDL with one-way PartnerLink

PartnerLinks can be defined in the BPEL process or in the WSDL of the web service. NetBeans has a nice feature to create the PartnerLink in a WSDL, so I chose to define them there.

A one-way web service is a web service whose operations define only an <input> or only an <output>. I therefore took the WSDL of my previous post and simply removed the <output> tags so that the operations become one-way. (I also removed anything related to WS-Addressing as it’s not used here.)

The PartnerLink can then easily be created with NetBeans using the “Partner” view in the WSDL. The two WSDLs then looked like this:

 

Create the BPEL process

Add the PartnerLink

Now that the WSDL files of the asynchronous web services are ready, I created a new BPEL process. I then added the following PartnerLinks:

  • AsynchronousSampleClient from the SOA sample bundled with NetBeans
  • AsyncTestImplService created previously
  • AsyncTestResponseImplService created previously

Wire the request/response

Then I wired the request/response as follows. I relied on NetBeans variable creation for each <invoke> or <receive> activity. I therefore got the following variables:

  • ResponseIn
  • SayHelloIn
  • OperationAIn
  • OperationAOut

Assign the variables

For the purpose of this example, I assigned the variables between the messages as follows. Note that this example makes no sense from a business point of view.

 

Define the correlation set

A receive activity within the process should be assigned a correlation set. The BPEL engine is otherwise unable to match the request/response and resume the right process instance.

I defined a correlation set “correlation” which uses the property “correlationProp”. A correlation property is a value that exists in different messages and that can be used to match messages together. The property itself is a “symbolic” name for the value, and the corresponding element in each message is defined using so-called property aliases.
I then added two aliases, one in each WSDL file, and defined how “correlationProp” maps to the “sayHello” and “response” messages respectively.
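
To make the idea of correlation more tangible, here is a purely conceptual sketch in Java. It is not how OpenESB or any BPEL engine is implemented, and all names are hypothetical. It only shows the principle: a waiting process instance is registered under its correlation value, and an incoming callback carrying the same value resumes it.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class CorrelationDemo {
    // One pending entry per process instance waiting for a callback, keyed by correlation value.
    private final Map<String, CompletableFuture<String>> waiting = new ConcurrentHashMap<>();

    // Called when an instance sends its one-way request and starts waiting.
    CompletableFuture<String> invoke(String correlationValue) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        if (waiting.putIfAbsent(correlationValue, pending) != null) {
            throw new IllegalStateException("Correlation value already in use: " + correlationValue);
        }
        return pending;
    }

    // Called when a callback message arrives; resumes the matching instance, if any.
    boolean onCallback(String correlationValue, String payload) {
        CompletableFuture<String> pending = waiting.remove(correlationValue);
        return pending != null && pending.complete(payload);
    }

    public static void main(String[] args) {
        CorrelationDemo engine = new CorrelationDemo();
        CompletableFuture<String> reply = engine.invoke("123");
        System.out.println(engine.onCallback("999", "ignored")); // no instance waits for 999
        System.out.println(engine.onCallback("123", "hello"));   // resumes the instance
        System.out.println(reply.join());
    }
}
```

In BPEL the key is not passed explicitly: the property aliases tell the engine where to find it inside each message.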

The process can then be built without warnings.

Deployment

The endpoint ports

The process defines 3 ports that can be changed according to your needs. In this example the expected endpoints are:

The corresponding WSDL can be obtained by appending “?wsdl” to the URL.

Note that the address for the asynchronous callback is not passed as a parameter from the BPEL process, but should be hard-coded in the web service implementation. It would however be very easy to pass the callback address as an extra parameter so that the asynchronous web service is entirely decoupled from the BPEL process.

Build

Rebuild the process to pick up the latest changes to the port URLs.

Create the composite application (CA)

The BPEL process cannot be deployed as-is. You will need to embed the BPEL process into a composite application, which is a deployable unit. This is very easy to do:

  1. Create a new project of type composite application.
  2. Drag and drop the BPEL project onto the Service Assembly.
  3. Rebuild the composite application.

Everything that is necessary will be created automatically during the build. After the build is complete, NetBeans refreshes the Service Assembly, which then looks like this:

Deploy

Go to the Glassfish console and deploy the service assembly produced in the previous step.

Import WSDL in SoapUI

Start SoapUI and import the three WSDL files.

Mock the asynchronous web service

Now that the three WSDL files have been imported, we will create a mock for the asynchronous web service. This way we can verify that the BPEL process calls the asynchronous web service correctly, and we can send the callback response manually.

Select the WSDL “AsyncTestImplPortBinding” and right-click “Generate Mock Service”. Make sure to use the following settings, so that they match the port that the BPEL process will use:

  • path = /AsyncTestImplService/AsyncTestImpl?*
  • port = 8888

Make sure to start the Mock, in which case SoapUI displays “running on port 8888” at the top-right of the Mock window. The project looks like this:

Test

1 – Invoke BPEL process

Send the following SOAP message to the BPEL process (located at http://localhost:18182/AsynchronousSampleClient):

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:asy="http://enterprise.netbeans.org/bpel/AsynchronousSampleSchemaNamespace">
  <soapenv:Header/>
  <soapenv:Body>
    <asy:typeA>
      <paramA>dummy</paramA>
      <id>123</id>
    </asy:typeA>
  </soapenv:Body>
</soapenv:Envelope>

2 – Receive the asynchronous invocation

When the Mock service receives the asynchronous message, it displays something like “[sayHello] 4ms”. The message can be opened and should look like:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <sayHello xmlns:msgns="http://ewe.org/" xmlns="http://ewe.org/">
      <arg0 xmlns="">123</arg0>
    </sayHello>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

3 – Send the callback manually

We manually simulate the behavior of the mock service and send the following message to the callback endpoint (http://localhost:18182/AsynchronousSampleClient/response):

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:ewe="http://ewe.org/">
  <soapenv:Header/>
  <soapenv:Body>
    <ewe:response>
      <!--Optional:-->
      <arg0>123</arg0>
    </ewe:response>
  </soapenv:Body>
</soapenv:Envelope>

4 – Get the synchronous reply

So far, the SOAP request of step #1 was still waiting to receive the synchronous response.  After the callback has been sent, the BPEL engine should resume the right instance of the BPEL process (using the correlation value “123”), which should then terminate.

SoapUI will display the time taken for the request/response, which will be something like “response time: 5734 ms”. The time will of course depend on how long you took to perform steps 2 and 3. (Note that the request will time out if you take too long to do these steps.)
The SOAP response message should look like:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <typeA xmlns:msgns="http://enterprise.netbeans.org/bpel/AsynchronousSampleClient"
           xmlns="http://enterprise.netbeans.org/bpel/AsynchronousSampleSchemaNamespace">
      <id xmlns="">123</id>
    </typeA>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Conclusion

This example as-is makes little sense from a technical and business point of view; I wish I had also used more meaningful names for the various elements. It does, however, show the principle of asynchronous web service invocation using OpenESB. Adapting this example to meaningful use cases should be relatively simple: it’s a matter of changing the message types and assignment rules.

Taming Size and Complexity

The only real problem with modern software is size and complexity. If we had a bigger brain able to apprehend and reason about software as a whole without omitting details, we wouldn’t have that many issues. Unfortunately, our mental abilities are limited, and as a consequence, we need to have ways to build software whose complexity is beyond our own analytical power. The same is true for any large-scale engineering initiative. Apart from discipline, which is a prerequisite for managing size and complexity, the traditional ways to address size and complexity are abstraction, automation and intuition.

Abstraction

The traditional way to address complexity is to raise the abstraction level. Get rid of details and stay focused on the essential – complexity goes away. You can then reason about different parts at various abstraction levels independently of each other. This is the fundamental argument behind any modeling effort or methodology. The problem is that the whole is not equal to the sum of its parts. Unforeseen interactions will emerge, resulting in a myriad of potential problems. Another major problem is the traceability of the different parts.

Automation

The traditional way to address size is through automation. Many of the tasks that we perform are tedious not because of their nature, but because of the effort they demand. Our concentration is also limited, which implies we will make mistakes. Automation thus leads not only to higher productivity but also to higher quality. There are too many examples of automated tasks to list, but code formatting and refactoring, for instance, fall into this category. Even though automation is extremely effective for specific tasks, it is also affected by the complexity of the software to produce. State explosion is, for instance, one of the main problems of a technique such as symbolic execution.

Intuition

Actually, problem solving implies not only strong analytical skills but also some form of intuition. The same goes for software and program understanding. Software exploration and visualization are powerful techniques to reason about abstract information in an intuitive way. Software is intangible and consequently has no natural representation – this leaves the door open for new visualization metaphors. Examples of interactive visual development technologies are BPEL workflows, DSM, or polymetric views.

A Simple Categorization of Unit Tests

Unit testing has become incredibly popular over the past years. As a consequence I sometimes feel like we’ve lost focus on why we write unit tests and what we expect as a return on investment.

The success of unit testing probably comes from the simplicity of the approach. However, we’ve learned since then that unit testing is not so easy either. A high percentage of code coverage does not necessarily mean quality software, doesn’t mean that border cases have been covered, and can even impede software maintenance if the test suite is poorly organized.

Don’t get me wrong, I do value the benefits of unit testing. But unit testing for the sake of unit testing has no value to me. If you think another form of automated testing will be a more rewarding strategy, then do it. If you go for unit tests, the important questions to ask are:

  • Did the unit tests reveal bugs in the production code?
  • Did you organise the unit tests in a meaningful way?
  • Did you spend time to identify and test border cases?

A clear “yes” to these three questions indicates an intelligent and probably rewarding testing effort. A pertinent test suite should reveal the bugs that you introduce in the system — and don’t pretend you don’t introduce any. A well-organized test suite should give you a high-level view of the features in the module. A well-crafted test suite should cover the nasty use cases of the system before they hurt you when the system is in production.

There is abundant literature about unit testing, but nothing that I read seemed to cover the reality of the unit tests that I write. I therefore analysed the nature of my own unit tests and came up with a personal categorization which differs from existing ones. Here are the 4 categories I’ve identified as well as a short description of the pros and cons of each kind.

Basic unit tests

A basic unit test is a test suite which covers one unique class and tests each method in a more-or-less individual fashion. This is the core idea of unit testing, where each tiny functionality is tested in full isolation, possibly with the help of mock objects to break the dependencies. As a result, the test should be repeatable and also independent (of the environment and of other modules).

Problems with real unit tests are:

Unit tests with indirect assertions

For basic unit tests, the subject under test (SUT) is the very one for which we assert the behavior. We perform an action on the SUT and ensure it behaves correctly. This is however sometimes not possible once the system becomes a bit more complicated. As a consequence, the actions are performed on the SUT, but we rely on a level of indirection for the assertions; we then assert the behavior of an object other than the SUT. This is for instance the case when we mock the database and want to ensure that the rows are correctly altered.

  • Coupling with the implementation is still high
  • Fake objects might be used instead of mocks: they contain state and aren’t purely hard-coded anymore
  • Behaviour of the SUT is harder to understand due to the level of indirection
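
As an illustration of this category, here is a minimal sketch without any test framework. All types are hypothetical: TransferService plays the SUT, and an in-memory fake stands in for the database. The action is performed on the SUT, but the assertions are made on the rows held by the fake.

```java
import java.util.HashMap;
import java.util.Map;

class IndirectAssertionDemo {
    // Hypothetical collaborator of the SUT, normally backed by a database.
    interface AccountStore {
        int balanceOf(String id);
        void save(String id, int balance);
    }

    // Fake: a working in-memory implementation that holds state, unlike a scripted mock.
    static class InMemoryAccountStore implements AccountStore {
        final Map<String, Integer> rows = new HashMap<>();

        @Override
        public int balanceOf(String id) {
            return rows.getOrDefault(id, 0);
        }

        @Override
        public void save(String id, int balance) {
            rows.put(id, balance);
        }
    }

    // Subject under test.
    static class TransferService {
        private final AccountStore store;

        TransferService(AccountStore store) {
            this.store = store;
        }

        void transfer(String from, String to, int amount) {
            if (amount <= 0 || store.balanceOf(from) < amount) {
                throw new IllegalArgumentException("invalid transfer");
            }
            store.save(from, store.balanceOf(from) - amount);
            store.save(to, store.balanceOf(to) + amount);
        }
    }

    public static void main(String[] args) {
        InMemoryAccountStore store = new InMemoryAccountStore();
        store.save("alice", 100);
        new TransferService(store).transfer("alice", "bob", 30);
        // Indirect assertion: we inspect the fake, not the SUT.
        System.out.println(store.rows);
    }
}
```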

Inflection point unit tests

Inflection points (a term coined by Michael Feathers, if I’m right) are in a way the entry points to a given software module. For utility libraries or services, the inflection points correspond to the public API. Testing these specific points is the most rewarding strategy to me, and other people think the same.

  • Inflection points are less subject to changes, and are closer to a black-box form of testing
  • Tests become the first client of the interface and give you a chance to check if the API is practical
  • After having covered all the common use cases of the inflection point, the test coverage of the underlying classes should be close to 100%. If not, this indicates a potential design weakness or useless code.

Dynamic unit tests

I qualify tests as “dynamic” when their execution changes from one run to the other. The primary goal of such tests is to simulate the dynamicity and variability of the production system. Such tests are however quite far away from the concept of basic unit tests and could be considered to some extent as integration tests; they are however still repeatable and independent of other modules. Execution in the real system may indeed be affected by:

  • Threading issues
  • Contextual data, e.g. cache
  • Nature of the input

Most of these tests rely on randomization, for instance to generate input data or disrupt the scheduling of threads. Though it’s more complicated, randomization and fuzzing have proved to be effective techniques to detect issues which would never arise under fixed execution conditions. Think for instance about phase-of-the-moon bugs and date problems.
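
A minimal sketch of a randomized yet repeatable test. The function under test (a hand-written insertion sort) and all names are made up. Each run derives its input from a seed, so a failure can be replayed by reusing the reported seed.

```java
import java.util.Arrays;
import java.util.OptionalLong;
import java.util.Random;

class RandomizedTestDemo {
    // Hypothetical function under test: a hand-written insertion sort.
    static int[] sort(int[] input) {
        int[] a = input.clone();
        for (int i = 1; i < a.length; i++) {
            int value = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > value) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = value;
        }
        return a;
    }

    // Checks random inputs against the JDK sort; returns the first failing seed, if any.
    static OptionalLong firstFailingSeed(long baseSeed, int runs) {
        for (int run = 0; run < runs; run++) {
            long seed = baseSeed + run;
            Random random = new Random(seed);
            int[] input = random.ints(random.nextInt(50), -1000, 1000).toArray();
            int[] expected = input.clone();
            Arrays.sort(expected);
            if (!Arrays.equals(expected, sort(input))) {
                return OptionalLong.of(seed);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        long baseSeed = System.nanoTime(); // differs on every run
        OptionalLong failure = firstFailingSeed(baseSeed, 200);
        System.out.println(failure.isPresent()
            ? "failed, replay with seed " + failure.getAsLong()
            : "all runs passed");
    }
}
```

Logging the seed is what keeps such a test debuggable: the execution varies from run to run, but any single run can be reproduced exactly.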

Abstraction-Level vs. Meta-Level

As per Wikipedia, a model is a pattern, plan, representation (especially in miniature), or description designed to show the main object or workings of an object, system, or concept. In the case of software engineering, models are used to represent specific aspects of the software system, such as static structures, communication paths, algorithms, etc.

Models can be created along two axes:

Abstraction level

An abstraction level is a way of hiding the implementation of a particular set of functionality. Two representations at different abstraction levels are however two descriptions of the same reality. The representations can vary in their level of detail or in their expressiveness. Examples:

C code – assembler code.
Both representations describe a sequence of operations in an imperative way, but C is much more expressive than assembler.

Java code – sequence diagram.
Both representations describe the run-time behavior of the system. The sequence diagram does however frequently omit details of the call flow.

Meta-level

A meta-level (and meta-model) is a way to highlight the properties – the nature – of a particular set of elements. A meta-model then describes the constraints or the semantics that apply to any set of such elements.

Object – class.
In OO technologies, objects and classes belong to two different meta-levels. The class is a representation of the structure (constraints) that any of its instances will fulfill.
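
In Java the two levels can be observed through reflection. The sketch below uses a hypothetical Point class. The fields are read from the class, not from the object, and the class is itself an object whose own class is Class.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class MetaLevelDemo {
    // Hypothetical example class.
    static class Point {
        int x;
        int y;
    }

    // The structure of an instance is described one level up, by its class.
    static List<String> fieldNames(Object instance) {
        List<String> names = new ArrayList<>();
        for (Field field : instance.getClass().getDeclaredFields()) {
            if (!field.isSynthetic()) { // skip compiler- or tool-generated fields
                names.add(field.getName());
            }
        }
        return names;
    }

    // The class of a class is Class, and Class is an instance of itself.
    static boolean classIsItsOwnClass() {
        return Class.class.getClass() == Class.class;
    }

    public static void main(String[] args) {
        Point p = new Point();
        System.out.println(p.getClass() == Point.class);
        System.out.println(fieldNames(p));
        System.out.println(classIsItsOwnClass());
    }
}
```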

XML – DTD.
An XML file can be validated against a DTD, which is not a representation of the particular file, but of the structure (constraints) that the file must fulfill.

Tower of models

Models themselves can be refined towards either higher abstraction levels or higher meta-levels.

Component diagram – class diagram – Java code.
All three artifacts are representations of the same reality at various levels of refinement and reasoning.

XML – Schema – Meta schema.
In this case, each model is a meta description of the underlying one. XML Schemas themselves must conform to the schema http://www.w3.org/2001/XMLSchema.

As with the XML example, the tower of meta-models generally ends up in a model specified in terms of itself. That is, a recursion is introduced to stop the tower of meta-models from growing infinitely.
The term “modeling” commonly refers to “raising the abstraction level”; the term “meta-modelling” is used otherwise. Different tools/technologies don’t necessarily represent different meta-levels. Conversely, one technology may be used to address different meta-levels.
With respect to OO technologies, several UML diagrams and notations work at different meta-levels. Object diagrams and sequence diagrams describe the interaction between specific instances of objects. Class diagrams specify constraints (such as cardinality) or semantic relationships between objects (such as aggregation). Stereotypes can then be used to specify constraints (such as “singleton”) or semantics (such as “entity”) for the class itself. For instance, an “entity” stereotype could be defined to flag all classes with an attribute “primary key” of type integer.

Level \ Technology | Language | UML
L0                 | Object   | Object diagram, Sequence diagram
L1                 | Class    | Class diagram
L2                 |          | Stereotype

Examples where meta-models are useful

Having three meta-levels provides one extra level of indirection that can be very practical in some cases:

Level           | XML transformation | OO refactoring | Annotation processing
L0 – information | Data              | Object         | Class annotation (e.g. @Entity)
L1 – model       | Schema            | Class          | Annotation processor (e.g. JPA)
L2 – meta-model  | Meta schema       | Metaclass      | Classloader

Here are three common examples which actually leverage the strength of meta-models.

XML transformation

The success of XML is related to the existence of a meta-model which makes it a semi-structured format. Structured data are interpretable only by their host application, which is the only one to know the model. Unstructured data (plain text, images, etc.) have no model and are not interpretable at all. Semi-structured data have a model which is itself understandable by any other application which knows the meta-model. Thanks to this extra level, reusable parsers, validators, encryptors or even arbitrary transformers (think XSLT) could be created. The schema could also be extended without necessarily breaking backward compatibility (think XPath).

OO refactoring

Refactoring rules are a typical example of the usage of the OO meta-model: refactoring rules are proved to preserve the validity of the model against the meta-model, and are reusable for any model.

Annotation processing

Annotation processing is yet another example where a meta-model is used under the hood. Just as the class-loader knows how to load a class because of the meta-model of any class, it is also able to load the corresponding annotations. This relieves the application from storing application-specific semantics in separate files (think deployment descriptors). The application (annotation processor) can then query the annotations at run-time to refine the behavior of the system. The annotations enrich classes with additional constraints and behaviour in a reusable way: the same annotation processor can be used for arbitrary class sets.
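
A minimal sketch of such a run-time query. The @Persistent annotation is a made-up stand-in for something like JPA's @Entity, and the "processor" is a single method. Because it only relies on the meta-model (any class may carry annotations), it works unchanged for arbitrary classes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class AnnotationDemo {
    // Hypothetical annotation, kept at run-time so that it can be queried reflectively.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Persistent {
        String table();
    }

    @Persistent(table = "customers")
    static class Customer {
    }

    static class Plain {
    }

    // A tiny "annotation processor": returns the table name, or null if the class is not annotated.
    static String tableFor(Class<?> type) {
        Persistent persistent = type.getAnnotation(Persistent.class);
        return persistent == null ? null : persistent.table();
    }

    public static void main(String[] args) {
        System.out.println(tableFor(Customer.class));
        System.out.println(tableFor(Plain.class));
    }
}
```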

Pharo By Example

This is an introductory book on Smalltalk and the Pharo development environment. The book is split in two parts: the first one covers the Smalltalk language, the Pharo IDE, the Morphic package for GUI applications and the Seaside framework for web applications. The second part is relatively thin and covers more advanced topics such as reflection.

The book is easy to read, with many examples to follow, and the reader will quickly acquire a good comprehension of Smalltalk. The object-oriented concepts specific to Smalltalk are however rigorously discussed. A good example would be the paragraph about class variables in the case of class inheritance: it’s easy to follow in the book but still an OO concept that could otherwise be misunderstood. These paragraphs are the most important ones for people (like me) coming from other OO languages.

The GUI and web frameworks are covered briefly, but enough to get an idea of how they work. Both are component-based, and a concrete component is built from A to Z in both cases to give insight into the frameworks.

The second part about reflection is where the beauty of Smalltalk shines. First the concepts behind meta-objects and reflection are presented, and then a few useful patterns which use reflection are described, for instance how to build a dynamic proxy in Smalltalk. This is clearly the most interesting part, where the dynamic nature of Smalltalk can be exploited in a constructive way.

Software Evolution

Software evolution studies software engineering from the perspective of software maintenance and change management. The term was coined in the 1970s after the seminal work of Lehman and his laws of software evolution.

As such, software evolution is a topic as broad as software engineering itself – any aspect of software engineering (be it a process, methodology, tool, or technique) can indeed be studied from the perspective of software evolution.

The book is a comprehensive overview of the current body of knowledge in software evolution. It contains 10 chapters, each addressing a specific field of software engineering from the perspective of software evolution. The articles present recent work and open challenges; the book can thus be seen as a comprehensive survey as well as a research agenda. The bibliography is impressive: more than 500 articles are referenced throughout the 10 chapters.

Apart from this selection of articles, the book has an excellent introduction to the field of software evolution, and a set of useful appendices pointing to additional resources in this field. The 10 chapters are organized into 3 parts:

Part I: Program understanding and analysis

Identifying and Removing Software Clones
Analysing Software Repositories to Understand Software Evolution
Predicting Bugs from History

Part II: Re-engineering

Object-Oriented Reengineering
Migration of Legacy Information Systems
Architectural Transformations: From Legacy to Three-Tier and Services

Part III: Novel trends in software evolution

On the Interplay Between Software Testing and Evolution and its Effect on Program Comprehension
Evolution Issues in Aspect-Oriented Programming
Software Architecture Evolution
Empirical Studies of Open Source Evolution

Chapters I enjoyed the most were “Object-oriented reengineering”, “On the interplay between…” and “Evolution Issues in AOP”.

I wish there was a chapter about evolutionary issues in model-driven engineering, which is an important area of research. I therefore would recommend “Model-Driven Software Evolution: A Research Agenda” as a complement to this book.

The book is accessible to non-experts and I learned a lot while reading it. The book is definitely worth looking at for anyone interested in understanding what “software evolution” is all about.

Quotes

… mostly on software engineering, but not only.

“How does a project get to be a year late?…One day at a time.” — The Mythical Man-Month

“Complexity kills. It sucks the life out of developers, it makes products difficult to plan, build and test. […] Each of us should […] explore and embrace techniques to reduce complexity.”  — Ray Ozzie, CTO, Microsoft

“Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away.” — Saint-Exupéry

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” — D. Knuth

“Beware of bugs in the above code; I have only proved it correct, not tried it” — D. Knuth

“Research is less about discovering the fantastic, as it is about revealing the obvious. The obvious is always there. It needs no discovery. It only needs us to take a look from a different perspective to see it.” — On T. Girba’s blog

“It is impossible for any number which is a power greater than the second to be written as a sum of two like powers. I have a truly marvelous demonstration of this proposition which this margin is too narrow to contain.” — P. Fermat

“These are my principles. If you don’t like them, I have others.” — Groucho Marx

“If all you have is a hammer, everything looks like a nail…”  — B. Baruch

“Discipline is the bridge between goals and accomplishment.” — Jim Rohn.

“If I am to speak ten minutes, I need a week for preparation; if fifteen minutes, three days; if half an hour, two days; if an hour, I am ready now.”  —  Woodrow Wilson

“The devil is in the details” — German proverb

“The first idea is always wrong” — Popular wisdom

“What can go wrong will go wrong” — Murphy’s law

“The nice thing about standards is that there are so many of them to choose from.” — Andrew S. Tanenbaum

“The manager asks how and when; the leader asks what and why.” — Warren Bennis

“Experience is what you get when you don’t get what you want.” — Dan Stanford

“A mission is something that you do in a bounded amount of time with bounded resources. A purpose is something that you never achieve.” — William Cook

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” — Martin Fowler

“The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague.” — E.W. Dijkstra, The Humble Programmer

“There are wavelengths that people cannot see, there are sounds that people cannot hear, and maybe computers have thoughts that people cannot think” — Richard Hamming

“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” — Brian Kernighan

“If others would think as hard as I did, then they would get similar results.” — Newton

“Rules of thumb are sometimes what their acronym infers…ROT” — unknown

“Any intelligent fool can make things bigger and more complex. It takes a touch of genius – and a lot of courage – to move in the opposite direction.” — Albert Einstein

“When I use a word,” Humpty Dumpty said, in a rather scornful tone, “it means just what I choose it to mean — neither more nor less.” — Lewis Carroll, Through the Looking Glass

“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.” — Tony Hoare

“He who hasn’t hacked assembly language as a youth has no heart. He who does as an adult has no brain.” — John Moore

“If you can’t explain it simply, you don’t understand it well enough” — A. Einstein

“There are two groups of languages: those which everyone hates and those which nobody uses.” — Unknown

“Any problem in computer science can be solved with another layer of indirection” — David Wheeler

“Idiomatic usage often matters more than what is strictly possible.” — A comment from Stuart Halloway

“You learn lisp for the same reason you learn latin. It won’t get you a job but it will improve your mind” — Paul Graham

“The Internet is the world’s largest library. It’s just that all the books are on the floor.” — John Allen Paulos

“There are only two hard things in Computer Science: cache invalidation and naming things” — Phil Karlton

“Some abstraction is great, but eventually you need to write code that does something useful.” — Bill Karwin

“Tell me, I forget. Show me, I remember. Involve me, I understand.” — Chinese proverb.

“Don’t reinvent the wheel unless you want to learn about wheels.” — Popular wisdom

“I’m sorry this letter is so long, I didn’t have time to write a shorter one”  — Mark Twain

“To be trusted is a greater compliment than to be loved” — George MacDonald

“Most people work just hard enough not to get fired and get paid just enough money not to quit” — George Carlin

“Find your obsession, make it your profession, and you’ll never work a day in your life” — Unknown

“Plans are nothing; planning is everything.” – Dwight D. Eisenhower

“If you set your goals ridiculously high and it’s a failure, you will fail above everyone else’s success.” — James Cameron

“A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.” — Robert A. Heinlein

“Tools can enable change in behavior and eventually change culture” — Patrick Debois

“First you learn the value of abstraction, then you learn the cost of abstraction, then you’re ready to engineer.” – @KentBeck

“Simplicity does not precede complexity, but follows it.” — Alan Perlis

“The most important thing in communication is hearing what isn’t said.” – Peter F. Drucker

“Computer science is now about systems. It hasn’t been about algorithms since the 1960’s.” – Alan Kay

“[Some people] just can’t wrap their head around that someone might build something because they like building things.” – Mark Zuckerberg in an interview

“So much complexity in software comes from trying to make one thing do two things.” – Ryan Singer

S*** My Domain Name Has Expired

I had an unpleasant surprise last week when I noticed that my domain name had expired. Like many others before me, I then realized that the domain name business is aggressively money-driven and that many companies try to make a profit out of domain name registration.

In my case, the domain name had first expired, but because I was on vacation and couldn’t do much from there, it then moved after 30 days into the “redemption period” status. I knew about a few statuses, but not the complete list. You still have a chance to renew the domain name while it is in the “redemption period”, but it costs more! Normally, the domain name should become publicly available again once the redemption period is over. Unfortunately, many companies watch soon-to-expire domain names and systematically buy them. There are also affiliations between registrars and such companies, which means there is apparently little chance of buying the name back once it has expired completely. You can place a “back-order” on these companies’ websites, but again, it will cost you more.

I was a bit disgusted by the whole process, and had no choice but to renew it at a much higher price than normal. Lesson learned: make sure you enable “automatic renewal” on your registrar’s website.

Here is a portion of the chat I had with the guy at register.com:

Chat log
me: Hi, I have a question about DNS renewal.
support: Ok I can help you with that, what is the domain name please ?
me: My DNS recently expired while I was on vacation. When I came back, I tried to renew it, but unfortunately my credit card had expired as well (bad luck). Now that I have updated the information for my credit card, the DNS move into “redemption-period” and I can’t renew it. The DNS was “XXXXXXX”
support: Ok thank you. Just a heads up.. DNS is Domain Name Server. What you have is a Domain Name. Thanks. I will just be a second to bring up that account.
me: Yes, sorry, I mean DN.
support: Not a problem.
support: Your domain name’s status is currently: Redemption Period
me: Yes. That’s what I obtained with WhoIs. Is there a way to renew it?
support: This means that the domain name has gone back to the registry. I can still however purchase the name back for you, however the rates are registry rates and higher then the normal renewal cost and the rates are non negotiable due to the domain not being with our company.

1 Year is $120.00
2 Years is $145.00
3 Years is $170.00
5 Years is $179.00
6 Years is $205.00
7 Years is $240.00
10 Years is $250.00
me: Will the domain goes back to “public” if I wait longer. I could then register it again with normal price?
support: Eventually, but it may go to an auction or someone may have back-ordered the domain name. This is a very risky and touchy time with a domain name. Redeeming the domain at the registry price is your last and only real chance of getting this domain name.
support: There are hundreds of “just dropped” sites that email all there clients all the domains that expired and was released that day, so your domain name will be view by hundreds of people as soon as it drops publicly.
me: What you mean is that there are some companies that systematically buy expired domain in the hope to sell it back for a higher price?
support: Correct.
support: This is a very common practice.
me: What I don’t get is who own the “registry” and the domain name right now.
support: That would be ICANN
support: They are the owner of all domain names.
me: But they are not doing any business on their own…
support: I’m not sure what you mean.. they hold the accreditation for all registrar’s. Without an ICANN accreditation you cannot legally sell domains as a company.
me: I mean, who fixed the prices you sent me? ICANN?
support: The expiration date for the domain name “XXXXXX” has past, as well as our 30 day administrative grace period during which a renewal of a domain name may be permitted. Accordingly, we have submitted the domain name to the Registry for deletion. The Registry has placed this domain name in a ‘Redemption Grace Period’, which provides the Registrant one last opportunity to ‘redeem’ or reclaim the domain name before it is made available for public registration on a first come, first served basis.

Once the domain name is placed on redemption status by the Registry we incur additional expenses in reinstating the domain name, which are in reflected in the redemption fee.

me: Ok, so the “recovery” price is registrar-specific but higher than normal renewal because of the “extra work”. And yours starts at 120 USD.
support: We charge more because we have to pay more to the registry. We do not own domain names but purchase from the registry when you purchase from us. This is not $120.00 profitable dollars for our company.
me: If I reinstall the DN for $ 120, am I able to renew it then on a yearly-based at the regular price of $ 35?
support: Correct.
support: Did you want to reinstate the domain ?
me: Yes, I will reinstall the domain. Just leave one minute to decide whether I take 1 year or more. I don’t want to be in the same situation next year. And it will cost me more money to renew it on a yearly basis.
support: You are far better off going longer then sooner, the farther out you go the less of a chance it will happen and the better savings you will get.
me: Ok, so I would like reinstall the domain for a period of 5 years at the price of $ 179.
support: Sounds great, that’s the best choice.
support: Perfect.
support: This is going to take up to 72 hours to redeem the domain back into your account. This requires a special department to redeem. You will get an email notification once the domain has been redeemed.
support: Was there anything else I can help you with today ?
me: No that was all. Thanks a lot.
support: Did you want me to put the privacy protection back on the domain once in the account ?
me: Is it still $ 11 per year?
support: It would be $45.00 for a 5 year term.
me: In this case, please include “privacy protection” as well.
support: Ok I can do that for you. Did you want me to re-create the email account as well ?
me: No. Not for the time being. So far I remember it was relatively expensive and and I can’t afford a 5 year period for the mail.
support: Ok I will get this redeemed for you as soon as possible. Was there anything else I can do for you today ?
me: I see no other points.
support: Have a great day.

Glassfish mysteries #5: transaction recovery

Here are all posts of this series on Glassfish.

There is little information available on the web about Glassfish transaction recovery. Transaction recovery is indeed something that should be very rare.

Some background

Such a recovery is necessary only if a problem (typically a crash) occurs while the transaction manager is performing the two-phase commit (2PC) protocol. If a problem happens before the 2PC protocol starts, the transaction will time out and be rolled back automatically. If the problem appears during the 2PC protocol, the situation is a bit more complicated: one branch may be prepared and the other not, or even worse, one branch may be committed and the other not. A distributed transaction in such a state is frequently called “in-doubt” in the literature.

2PC is supposed to be a fast operation, so the probability of an in-doubt transaction is supposed to be very low as well. It can nevertheless happen, and in this case the distributed transaction must be recovered. This means that the transaction manager will attempt to complete the 2PC protocol based on its own transaction log. In some cases, the transaction manager doesn’t know exactly what was done and what was not, and it must then “heuristically” roll back or commit the pending branches. This is generally really bad, as it may leave the system in an inconsistent state, with some operations having been committed in one branch (e.g. the database) and rolled back in another one (e.g. the JMS broker).
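To make the in-doubt state more concrete, here is a minimal Java sketch of the idea. It is a toy model, not Glassfish's actual transaction manager: the class names, the in-memory "log" flag and the simulated crash are all assumptions made for illustration. It shows a coordinator that crashes in the middle of phase 2, leaving one branch committed and the other only prepared, and a recovery pass that completes the protocol from the logged decision.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of two-phase commit and recovery; all names are hypothetical. */
public class TwoPhaseCommitSketch {

    enum State { ACTIVE, PREPARED, COMMITTED, ROLLED_BACK }

    /** One branch of the distributed transaction, e.g. the database or the JMS broker. */
    static class Branch {
        final String name;
        State state = State.ACTIVE;

        Branch(String name) { this.name = name; }
    }

    /** Coordinator with an in-memory stand-in for the transaction log. */
    static class Coordinator {
        final List<Branch> branches = new ArrayList<>();
        boolean commitLogged = false; // true once the commit decision is "durable"

        Coordinator(String... names) {
            for (String n : names) branches.add(new Branch(n));
        }

        /** Runs 2PC; a crash is simulated after crashAfterCommits commits (-1 = no crash). */
        void run(int crashAfterCommits) {
            for (Branch b : branches) b.state = State.PREPARED; // phase 1: prepare
            commitLogged = true;                                // decision is logged
            int committed = 0;
            for (Branch b : branches) {                         // phase 2: commit
                if (committed == crashAfterCommits) return;     // simulated crash
                b.state = State.COMMITTED;
                committed++;
            }
        }

        /** The transaction is in-doubt while some branch is prepared but not completed. */
        boolean inDoubt() {
            for (Branch b : branches) {
                if (b.state == State.PREPARED) return true;
            }
            return false;
        }

        /** Completes pending branches from the log: commit if logged, else roll back. */
        void recover() {
            for (Branch b : branches) {
                if (b.state == State.PREPARED) {
                    b.state = commitLogged ? State.COMMITTED : State.ROLLED_BACK;
                }
            }
        }
    }

    public static void main(String[] args) {
        Coordinator tm = new Coordinator("database", "jms-broker");
        tm.run(1); // crash after the first branch has committed
        System.out.println("in doubt after crash: " + tm.inDoubt());    // true
        tm.recover();
        System.out.println("in doubt after recovery: " + tm.inDoubt()); // false
    }
}
```

The sketch deliberately leaves out the heuristic case: here the log always tells the coordinator what was decided, whereas a real transaction manager may have to guess when the log is incomplete.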

Glassfish admin console

First of all, we’ve never been able to recover any in-doubt transaction from the Admin>Transaction page. The “recover transaction” button didn’t produce any visible effect. We were however able to force the recovery at startup by enabling the appropriate option in the transaction service configuration page.

Oracle transaction recovery

If you are using Oracle, your database connection will need some advanced privileges for the recovery to work. Glassfish will indeed execute either a “commit force” or a “rollback force” on the database, which is usually performed by a DBA with system rights. The privileges we found to be necessary are the following:

GRANT FORCE ANY TRANSACTION TO <db_conn>;
GRANT SELECT ON dba_2pc_pending TO <db_conn>;
GRANT SELECT ON DBA_PENDING_TRANSACTIONS TO <db_conn>;
GRANT SELECT ON SYS.PENDING_TRANS$ TO <db_conn>;
GRANT SELECT ON SYS.DBA_2PC_NEIGHBORS TO <db_conn>;
GRANT EXECUTE ON sys.dbms_system TO <db_conn>;
CREATE PUBLIC SYNONYM dbms_system FOR dbms_system;

Before the recovery, the view dba_2pc_pending shows one pending transaction, whereas after the recovery the view is empty.

There is also little information about the property oracle-xa-recovery-workaround of the transaction service. It seems that there is a bug with Oracle and the view dba_2pc_pending: the view is sometimes not correctly refreshed by Oracle. The workaround’s purpose is apparently to force the view to be updated so that Glassfish can use it to identify the in-doubt transactions. This is unfortunately only a supposition, as we never found a clear explanation of the exact impact of this property.