Threat modeling: overview

Threat modelling is the process of assessing and documenting a system’s security risks. The threat model identifies and describes the set of possible attacks on your system, as well as mitigation strategies and countermeasures. Your threat modelling efforts also enable your team to justify security features within a system, or security practices for using the system, to protect your corporate assets.

Any threat modelling process will usually encompass the following steps:

1) A model of the system that is relevant for the threat analysis
2) A model of the potential threats
3) A categorization and rating of the threats
4) A set of countermeasures and mitigation strategies

There are, however, several approaches to performing each of these steps. We will now briefly give an overview of each step.

Step 1: Model your system

The system model is an abstraction of your system that fits the threat analysis. It differs from other traditional models in the sense that it is a mix between a deployment view, a data view and a use case view.

The system entry & exit points – The entry & exit points are the ways through which data enters and leaves the system, from and to the external environment.
The actors & external dependencies – The actors and the external dependencies are the entities that legitimately interact with a system. The actors tend to represent real user roles, whereas an external dependency usually refers to a third-party system. The distinction between the two can sometimes be blurry: an external system could be considered an actor if it is the active participant in the interaction.
The trust levels & boundaries – Trust levels define the minimal access granted to an actor within a system. For example, a system administrator actor may have a trust level that allows them to modify files in a certain directory on a file server, while another user entity may be restricted from modifying files.
The assets – An asset is an item of value, an item security efforts are designed to protect. It is usually the destruction or acquisition of assets that drives malicious intent. A collection of credit card numbers is a high-value asset, while a database that contains candy store inventory is probably a lower-value asset. An asset is sometimes called a protected resource.
Use cases – The use cases for operating on the data that the application will facilitate.
The assumptions – All assumptions that drove the modelling effort. Whether the cryptographic algorithm is considered public or private is, for instance, an assumption worth mentioning.

Step 2: Model your threats

Let’s first define some concepts:

Threat – The possibility of something bad happening
Attack – A means through which a threat is realized
Vulnerability – A flaw in the product
Countermeasure – A means to mitigate the vulnerability

« Threats are realized through attacks which can materialize through certain vulnerabilities if they have not been mitigated with appropriate countermeasures »

A concrete example would be:

Threat  – Perform arbitrary query on the system
Attack – Access internal service exposed to the end-user
Vulnerabilities – (1) Firewall misconfigured (2) Lack of access control
Countermeasure – (1) Correct firewall rules (2) Secure the EJB correctly

« Arbitrary queries can be executed through access to the back-end EJB, which is possible because of a wrong firewall configuration and a lack of access control if the infrastructure and the application server were not configured correctly »

An advanced attack is frequently composed of a series of preliminary attacks which will exploit several vulnerabilities. The attacks can be represented as a tree, called an attack tree.

The level of detail in the identification of threats, attacks and vulnerabilities is up to the analyst.

Step 3: Categorize and rate your threats

Once the threats, attacks and vulnerabilities have been identified, the threats can be categorized. Popular categorization schemes are STRIDE and CIA.

STRIDE:

Spoofing – To illegally acquire someone’s confidential information and use it.
Tampering – To maliciously modify information that is stored, in transit or otherwise.
Repudiation – A malicious user denying having committed an action that he/she is not authorised to perform or that hampers the security of an organisation, while the system has no trace of the action. The action cannot be proved.
Information Disclosure – To view information that is not meant to be disclosed.
Denial of Service – Sending or directing more network traffic to a host or network than it can handle, so that it becomes unusable to others.
Elevation of privileges – To increase the adversary’s system trust level, permitting additional attacks.

CIA:

Confidentiality – To ensure that information is accessible only to those authorized to have access
Availability – The ratio of the total time a functional unit is capable of being used during a given interval to the length of that interval
Integrity – To ensure that the data remains an accurate reflection of the universe of discourse it is modelling or representing, and that no inconsistencies exist.

Once categorized, the threats can be rated according to the risk they represent. The total risk can be evaluated according to DREAD:


Damage Potential – Defines the amount of potential damage that an attack may cause if successfully executed.
Reproducibility – Defines the ease with which the attack can be executed and repeated.
Exploitability – Defines the skill level and resources required to successfully execute an attack.
Affected Users – Defines the number of valid user entities affected if the attack is successfully executed.
Discoverability – Defines how quickly and easily an occurrence of an attack can be identified.
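
As a rough illustration, a DREAD rating can be condensed into a single number. The sketch below assumes one common convention that the text above does not prescribe: each of the five criteria is rated from 0 (low) to 10 (high) and the total risk is the plain average. The class name, the field names and the sample scores are made up for the example.

```java
// Hypothetical helper: rates a threat on the five DREAD criteria.
// Assumption: each criterion is scored 0..10 and the risk is the plain average.
final class DreadRating {
    private final int damage;
    private final int reproducibility;
    private final int exploitability;
    private final int affectedUsers;
    private final int discoverability;

    DreadRating(int damage, int reproducibility, int exploitability,
                int affectedUsers, int discoverability) {
        this.damage = check(damage);
        this.reproducibility = check(reproducibility);
        this.exploitability = check(exploitability);
        this.affectedUsers = check(affectedUsers);
        this.discoverability = check(discoverability);
    }

    // Rejects scores outside the assumed 0..10 scale.
    private static int check(int score) {
        if (score < 0 || score > 10) {
            throw new IllegalArgumentException("score must be in 0..10: " + score);
        }
        return score;
    }

    // Average of the five criteria.
    double risk() {
        return (damage + reproducibility + exploitability
                + affectedUsers + discoverability) / 5.0;
    }

    public static void main(String[] args) {
        // Made-up scores for the "arbitrary query through an exposed EJB" threat.
        DreadRating exposedEjb = new DreadRating(8, 10, 7, 10, 10);
        System.out.println("Risk: " + exposedEjb.risk());
    }
}
```

Other teams use a 1 to 3 scale or weight the criteria differently; what matters is that all threats are rated on the same scale so that they can be compared.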

If attack trees have been modelled, the risk can be estimated based on the likelihood of each step in the attack tree.
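
One possible way to combine step likelihoods is sketched below. It assumes that the steps are independent events, that an AND node needs all of its children to succeed and that an OR node needs at least one. The AttackTree class, its methods and the probabilities are hypothetical and only serve the illustration.

```java
// Hypothetical attack-tree model. Assumption: steps are independent events.
final class AttackTree {

    interface Node {
        double likelihood();
    }

    // A single attack step with an estimated probability of success.
    static Node leaf(double probability) {
        return () -> probability;
    }

    // All child steps must succeed: multiply their likelihoods.
    static Node and(Node... children) {
        return () -> {
            double all = 1.0;
            for (Node child : children) {
                all *= child.likelihood();
            }
            return all;
        };
    }

    // Any one child step suffices: 1 minus the chance that all of them fail.
    static Node or(Node... children) {
        return () -> {
            double noneSucceeds = 1.0;
            for (Node child : children) {
                noneSucceeds *= 1.0 - child.likelihood();
            }
            return 1.0 - noneSucceeds;
        };
    }

    public static void main(String[] args) {
        // Made-up numbers: the EJB attack needs a misconfigured firewall
        // AND missing access control.
        Node arbitraryQuery = and(leaf(0.5), leaf(0.2));
        System.out.println("Likelihood: " + arbitraryQuery.likelihood());
    }
}
```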

Step 4: Set up countermeasures and mitigation strategies

Once the threats, attacks and vulnerabilities have been identified and documented, a set of countermeasures can be set up. Such strategies aim at reducing the attack surface and mitigating the potential effects of an attack. If the threat cannot be removed altogether, its probability should be reduced to an acceptable threshold.

Wrap up

The following quote summarizes well the rationale behind threat modeling. « Threat modeling is not a magic process/tool where you just throw stuff in and out comes goodness. Threat modeling is a structured way of thinking about and addressing the risks to what you are about to build rather than going about it randomly. »


StAX pretty printer

Using StAX to write XML is a lot easier than using either DOM or SAX. There is however no option to indent the generated XML, unlike with SAX or DOM. When faced with this problem, I came up with a simple yet generic solution: intercept all write calls and prepend the necessary whitespace according to the current depth in the XML. To achieve this easily, an InvocationHandler can be used to decorate the XMLStreamWriter.

Here is a sample usage:

// factory is assumed to be a standard javax.xml.stream.XMLOutputFactory
XMLOutputFactory factory = XMLOutputFactory.newInstance();
ByteArrayOutputStream baos = new ByteArrayOutputStream();

XMLStreamWriter wstxWriter = factory.createXMLStreamWriter(baos, "UTF-8"); // specify encoding

// Wrap with pretty print proxy
PrettyPrintHandler handler = new PrettyPrintHandler(wstxWriter);
XMLStreamWriter prettyPrintWriter = (XMLStreamWriter) Proxy.newProxyInstance(
        XMLStreamWriter.class.getClassLoader(),
        new Class<?>[]{XMLStreamWriter.class},
        handler);


And the InvocationHandler looks like this (see this gist):

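import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;
import javax.xml.stream.XMLStreamWriter;
import org.apache.log4j.Logger;

public class PrettyPrintHandler implements InvocationHandler {

    private static final Logger LOGGER = Logger.getLogger(PrettyPrintHandler.class.getName());
    private static final String INDENT_CHAR = " ";
    private static final String LINEFEED_CHAR = "\n";

    private final XMLStreamWriter target;
    private int depth = 0;
    // Remembers, per depth, whether the open element contains child elements
    private final Map<Integer, Boolean> hasChildElement = new HashMap<Integer, Boolean>();

    public PrettyPrintHandler(XMLStreamWriter target) {
        this.target = target;
    }

    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        String m = method.getName();
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("XML event: " + m);
        }
        // Needs to be BEFORE the actual event, so that for instance the
        // sequence writeStartElem, writeAttr, writeStartElem, writeEndElem, writeEndElem
        // is correctly handled
        if ("writeStartElement".equals(m)) {
            // update state of parent node
            if (depth > 0) {
                hasChildElement.put(depth - 1, true);
            }
            // reset state of current node
            hasChildElement.put(depth, false);
            // indent for current depth
            target.writeCharacters(LINEFEED_CHAR);
            target.writeCharacters(repeat(depth, INDENT_CHAR));
            depth++;
        } else if ("writeEndElement".equals(m)) {
            depth--;
            // only an element with child elements gets its end tag on a new line
            if (Boolean.TRUE.equals(hasChildElement.get(depth))) {
                target.writeCharacters(LINEFEED_CHAR);
                target.writeCharacters(repeat(depth, INDENT_CHAR));
            }
        } else if ("writeEmptyElement".equals(m)) {
            // update state of parent node
            if (depth > 0) {
                hasChildElement.put(depth - 1, true);
            }
            // indent for current depth
            target.writeCharacters(LINEFEED_CHAR);
            target.writeCharacters(repeat(depth, INDENT_CHAR));
        }
        try {
            // forward the call and hand back its result (null for void methods)
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            // rethrow the writer's own exception instead of the reflection wrapper
            throw e.getCause();
        }
    }

    private String repeat(int d, String s) {
        String _s = "";
        while (d-- > 0) {
            _s += s;
        }
        return _s;
    }
}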

The repeat method is quite ugly. You can use StringUtils from commons-lang instead, or check one of the other repeat implementations on Stack Overflow.

JCA connector: file system adapter

I’ve been working on a sample JCA connector that can be used to access the file system. The connector is pseudo-transactional: it is transacted but not very robust. It correctly creates and deletes files if the transaction is committed or rolled back. However, the connector doesn’t keep the information in a transaction log, so if the system crashes, the file system may be left in an inconsistent state. The file system may also be left in an inconsistent state in the rare case where the transaction is rolled back after the first phase of the 2PC protocol.

The code is available on github: txfs

JCA connector: overview

The J2EE stack is composed of several containers, among which is the JCA container. Although not as popular as the EJB or Web containers, the JCA container is nevertheless a very important piece of the J2EE stack. This container contains the “glue” that provides transparent transactional connectivity to third-party systems. Remember that from the point of view of the Web and EJB containers, the database and the JMS brokers are third-party systems. So you have probably already used JCA connectors without even noticing it.
A JCA connector is an adapter between the J2EE world and the outside world. The J2EE world is indeed a managed environment that provides many services such as transaction management, connection pooling, thread management, lifecycle events, configuration injection or declarative security. If you want to leverage these features while connecting to your third-party system, JCA is the way to go.

Connectors are bi-directional. They can be used to connect to the outside, or from the outside. In the latter case, we speak about connection inflow. We will discuss here only the first case: outside connectivity.

Anatomy of a JCA connector

Managed connection – A managed connection represents a physical connection to the third-party system. Managed connections are pooled and are enlisted/delisted in the global transactions automatically by the application server.
Connection handle – The client (e.g. EJB or Servlet) does not manipulate the managed connection directly. It uses instead a connection handle which exposes only the subset of available operations that are relevant to a client; it is the client-side view of the managed connection.

Several handles can have the same underlying managed connection. This could happen for instance if the client obtains two connection handles for a given third party system while in the same transaction; the application server can optimize this case and re-use the same managed connection.

Connection request info – The information that identifies the target third-party system (e.g. a database connection string). The connector’s connection pool can contain connections to different target systems. The connection request info must then be checked to ensure the connection targets the desired external system.
Managed connection factory – The managed connection factory has two purposes:

(1) Create a brand new managed connection according to the desired connection request info

(2) Check whether an existing managed connection can be re-used for a given connection request info. This is referred to as connection matching.

The connection matching mechanism exists because the application server, which manages the connection pool, is not able to know which target system a given connection points to. This is part of the connector’s logic.

Connection factory – The client (e.g. EJB or Servlet) does not manipulate the managed connection factory directly. It uses instead a connection factory which is a façade that shields the client from the connector’s internal complexity. The connection factory typically contains a single method getConnection(…) which returns a connection handle to the client. The connection factory will interact with the application server and the managed connection factory to implement connection pooling correctly.
XAResource – From the conceptual point of view, the managed connection is the participant that is enlisted/delisted in the global transaction. However, the managed connection is not forced to implement the XAResource interface itself, but only to expose a method getXAResource(…). An auxiliary object can then be returned that will act as an adapter between the JTA transaction manager and the managed connection.

Below is the component view:

[Figure: JCA component view]

A complete sequence diagram

The various elements of a JCA connector have now been presented. Below is the outline of a call to the connection factory’s getConnection() method.


1 The client creates the connection request info of the external system.
3 The client calls getConnection(requestInfo) on the connection factory.
3.1 The connection factory calls allocateConnection(managedConnectionFactory, requestInfo) on the application server. The application server exposes its functionality through the ConnectionManager interface. Connectors can therefore be implemented in a standard way without knowing the application server implementation.
3.1.1 The application server gathers a list of available (free) managed connections in the pool. Because the application server doesn’t know which managed connection points to the desired system, it calls matchManagedConnections(listOfFreeConnections, requestInfo) on the managed connection factory.

The managed connection factory checks whether any of the provided connections indeed corresponds to the given request info. If there is none, it returns null.

3.1.3 No managed connection is available, or none of them matches the request info. The application server needs to allocate a new connection and calls createManagedConnection(requestInfo) on the managed connection factory. The managed connection factory creates a brand new managed connection and returns it.
3.1.5 The application server gets the XAResource which belongs to the managed connection and enlists the connection in the distributed transaction. The application server puts the managed connection in the pool.
3.2 The application server returns the managed connection to the connection factory.
3.3 The connection factory calls getConnection() on the managed connection and obtains a connection handle.
4 The connection factory returns the connection handle to the client.
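
The steps above can be sketched in plain Java. The classes below are simplified stand-ins that only borrow the role names; they are not the real javax.resource interfaces. The request info is reduced to a string, the transaction enlistment of step 3.1.5 is left out, and the release() method is an assumption added so that a connection can go back to the pool and be matched again.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the JCA roles; NOT the real javax.resource interfaces.
final class JcaFlowSketch {

    // Physical connection to a target system, identified here by a plain string.
    static final class ManagedConnection {
        final String requestInfo;
        ManagedConnection(String requestInfo) { this.requestInfo = requestInfo; }
        // Step 3.3: hands out the client-side view.
        Handle getConnection() { return new Handle(this); }
    }

    // Connection handle: the only object the client sees.
    static final class Handle {
        final ManagedConnection managed;
        Handle(ManagedConnection managed) { this.managed = managed; }
    }

    // Connector logic: creates connections and matches pooled ones.
    static final class ManagedConnectionFactory {
        int created = 0;

        ManagedConnection createManagedConnection(String requestInfo) {
            created++;
            return new ManagedConnection(requestInfo);
        }

        // Returns a free connection to the requested system, or null if none matches.
        ManagedConnection matchManagedConnections(List<ManagedConnection> free, String requestInfo) {
            for (ManagedConnection candidate : free) {
                if (candidate.requestInfo.equals(requestInfo)) {
                    return candidate;
                }
            }
            return null;
        }
    }

    // Application server side: owns the pool but cannot tell connections apart.
    static final class ConnectionManager {
        private final List<ManagedConnection> free = new ArrayList<>();

        ManagedConnection allocateConnection(ManagedConnectionFactory mcf, String requestInfo) {
            ManagedConnection mc = mcf.matchManagedConnections(free, requestInfo); // step 3.1.1
            if (mc == null) {
                mc = mcf.createManagedConnection(requestInfo); // step 3.1.3
            } else {
                free.remove(mc); // the matched connection is no longer free
            }
            // Step 3.1.5 (enlistment through the XAResource) is omitted in this sketch.
            return mc; // step 3.2
        }

        // Assumed hook: called when the client is done with the connection.
        void release(ManagedConnection mc) { free.add(mc); }
    }

    // Facade used by the client (steps 3 and 4).
    static final class ConnectionFactory {
        private final ConnectionManager manager;
        private final ManagedConnectionFactory mcf;

        ConnectionFactory(ConnectionManager manager, ManagedConnectionFactory mcf) {
            this.manager = manager;
            this.mcf = mcf;
        }

        Handle getConnection(String requestInfo) {
            return manager.allocateConnection(mcf, requestInfo).getConnection();
        }
    }

    public static void main(String[] args) {
        ManagedConnectionFactory mcf = new ManagedConnectionFactory();
        ConnectionManager manager = new ConnectionManager();
        ConnectionFactory factory = new ConnectionFactory(manager, mcf);

        Handle first = factory.getConnection("system-a");
        manager.release(first.managed);
        Handle second = factory.getConnection("system-a"); // matched from the pool
        System.out.println("Re-used: " + (first.managed == second.managed));
    }
}
```

The point of the sketch is the division of labour: the pool lives in the connection manager, while the decision whether a pooled connection fits a request stays in the connector’s managed connection factory.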

Web service and polymorphism

This page presents several options for defining a web service interface that uses polymorphism on a list of objects. It compares the possible web service interface definitions.

My preferred pattern is #5 (polymorphic list with xsi:type). It was used in one of our web services and works nicely.


Wrapper with two collections

Name: Wrapper with two collections
Description: Declaration of two lists, one for each kind of filter
Java definition:
    class FilterClause {
        List<TermFilter> termFilters;
        List<FullTextFilter> ftFilters;
    }
Distinction between null or empty list: Yes – if the lists are empty, the filterClause tag is still there
Expected SOAP request: <filterClause>
Pro:
  • Simple
Cons:
  • Not extensible: a new subtype requires a new list and the stub must be re-generated on the client side
  • To add a new subtype, a new list must be added in the class
  • Not very object-oriented

Wrapper with polymorphism on tag name

Name: Wrapper with polymorphism on tag name (xsd:choice)
Description: Declaration of one list with xsd:choice to distinguish the type of the elements and change the tag accordingly
Java definition:
    public class FilterClause {
        @XmlElements({
            @XmlElement(name="termFilter", type=TermFilter.class),
        })
        private List<Filter> filter;
    }
Distinction between null or empty list: Yes – if the list is empty, the filterClause tag is still there
Expected SOAP request: <filterClause>
Pro:
  • Extensible to new subtypes without the need to re-generate the stub on the client side
  • XML is easily readable
Cons:
  • To add a new subtype, a new @XmlElement entry must be added manually
Wrapper with polymorphism with xsi:type

Name: Wrapper with polymorphism with xsi:type
Description: Usage of a wrapper with the declaration of one list using the base type – polymorphism is detected automatically by JAXB & .NET
Java definition:
    public class FilterClause {
        @XmlElement(name="filter")
        private List<Filter> filters;
    }
Distinction between null or empty list: Yes – if the list is empty, the filterClause tag is still there
Expected SOAP request: <filterClause>
<filter xmlns:q1="…" xsi:type="q1:termFilter">
Pro:
  • Extensible to new subtypes without the need to re-generate the stub on the client side
  • XML is easily readable because of the wrapper tag
Cons:
  • To add a new subtype, a new @XmlElement entry must be added manually
  • Wrapper tag is useless except for readability

Polymorphic list on tag name

Name: Polymorphic list on tag name (xsd:choice)
Description: Same as pattern 2 (wrapper with polymorphism on tag name), without the wrapper
Java definition: Cannot be defined in Java, because @XmlElement cannot be applied to the parameter of a method signature
Distinction between null or empty list: No – if the list is empty or null, no tags are written
Expected SOAP request: <termFilter>
Pro: See comment in Java definition
Cons: See comment in Java definition

Polymorphic list with xsi:type

Name: Polymorphic list with xsi:type
Description: Same as pattern 3 (wrapper with polymorphism with xsi:type), without the wrapper
Java definition:
    public class XXXX {
        List<Filter> filters;
    }
Distinction between null or empty list: No – if the list is empty or null, no tags are written
Expected SOAP request: <filter xmlns:q1="…" xsi:type="q1:termFilter">
Pro:
  • Extensible to new subtypes without the need to re-generate the stub on the client side
  • All subtypes can be defined in a secondary XSD that is imported in the WSDL
Cons:
  • To add a new subtype, a new @XmlSeeAlso entry must be added manually

Fun with iTunes Shuffle and Probabilities

I recently tagged and imported all my MP3s into iTunes. I then noticed that there were lots of albums that I had only partially listened to, and I decided to use the “Party Shuffle” feature to listen to my library randomly and eventually hear all the songs.

After a couple of weeks, I observed that some songs would reappear in the playlist and were picked twice. Over the weeks the frequency of “re-entry” songs increased, with the direct consequence that new music was played less and less. Even though I had already realized that it would not be possible to hear all the songs with this approach, I was still surprised by the “re-entry” rate, which I would have intuitively expected to be much lower.

I turned to probability to better understand the situation.

Let n be the size of my library. After t songs played randomly, is the probability that a given song was played at least once simply

P( song played at least once ) = t / n ?

Absolutely not! This probability is 1 minus the probability that the song was never played. This gives:

      P( song played at least once ) = 1 – (( n-1 )/ n)  ^ t

More generally, the probability of a song having been played exactly x times is given by the binomial distribution

P( x ) = (1/n)^x * ( (n-1) / n )^(t-x) * C( t, x )

where C(t,x) is the number of ways to choose which x of the t plays landed on that song. Expanded:

P( x ) = (1/n)^x * ( (n-1) / n )^(t-x) * t! / ( (t-x)! x! )

Note that the probability that the song was never played (x=0) is still (( n-1 )/ n) ^ t.

After t songs, the sum P(0) + P(1) + … + P(t) = 1 by the binomial theorem, which is a good sanity check for the formula.

The average number of distinct songs of the library played after t songs can be computed with

Avg. played = n * P( song played at least once )
            = n * ( 1 – ((n-1)/n)^t )
            = n – (n-1)^t / n^(t-1)

The probability of hearing a new song, which is the complement of the “re-entry” rate, can be computed with (n – avg. played) / n, which is equivalent to the probability P(x=0) that a given song was never played.

The graph below shows the probability that a song was never played for a library of 500 songs, after 0, 50, 100, etc. songs. It’s interesting to notice that the probability of hearing a new song falls below 50% after about 350 songs.
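
The numbers behind such a graph can be recomputed with a few lines of Java. The sketch below is only an illustration; the class and method names are made up, and the binomial coefficient is taken over the t plays, i.e. C(t, x).

```java
// Sketch of the shuffle formulas; n = library size, t = songs played so far.
final class ShuffleOdds {

    // Probability that a given song was never played: ((n-1)/n)^t.
    static double neverPlayed(int n, int t) {
        return Math.pow((n - 1.0) / n, t);
    }

    // Probability that a given song was played exactly x times out of t plays.
    static double playedExactly(int n, int t, int x) {
        return choose(t, x) * Math.pow(1.0 / n, x) * Math.pow((n - 1.0) / n, t - x);
    }

    // Expected number of distinct songs heard after t plays.
    static double averagePlayed(int n, int t) {
        return n * (1.0 - neverPlayed(n, t));
    }

    // Binomial coefficient C(t, x), computed in floating point.
    static double choose(int t, int x) {
        double c = 1.0;
        for (int i = 1; i <= x; i++) {
            c = c * (t - x + i) / i;
        }
        return c;
    }

    public static void main(String[] args) {
        int n = 500;
        for (int t = 0; t <= 500; t += 50) {
            System.out.printf("t=%d  P(new song)=%.3f  avg. played=%.1f%n",
                    t, neverPlayed(n, t), averagePlayed(n, t));
        }
    }
}
```

Solving ((n-1)/n)^t = 0.5 for n = 500 gives t = ln(0.5) / ln(499/500), roughly 346, so the 50% mark is crossed at the 347th song.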


Database independent SQL: a checklist

Pure SQL data layers tend to disappear in favour of O/R mappers, which embed their own query language. The query languages of today’s O/R mappers (e.g. Hibernate, JPA) are still close to the underlying SQL syntax. Actually, some keywords in Hibernate are passed as-is from HQL to SQL, which means that using an O/R mapper does not necessarily provide 100% database independence. There are also frequently a few native queries here and there to perform data-intensive operations that are easily realized with pure SQL and do not match well with the object approach. Therefore it is still relevant to think about SQL-92 conventions and to keep the SQL and HQL as database-independent as possible. Below is a checklist I wrote a few years back when I was involved in the migration of a project from Informix to Oracle.

  • Outer join & inner join
    Review the syntax of your joins.
  • AS keyword
    There is no need for a special keyword to alias a column.
  • Function (SQL92 or not)
    Try to avoid using database functions. If it’s absolutely necessary, prefer the SQL92 subset and make your code easily portable.
  • Operator ==
    The SQL92 standard uses = for equality.
  • Case type: num vs char
    Complex SQL using case statements can be rewritten to be SQL92 compliant.
  • lock column name (reserved words)
    Each RDBMS has a list of reserved keywords. Try to avoid such names to make your SQL portable.
  • No usage of *
    Never use * to return rows with select. The order of your columns is part of the physical database model and your code should work with the logical database model. Chances are that the column order will change over time due to migration or extension of the database model.

Keep your desktop clean: a checklist

I found a post-it on my desk this morning that had actually been there for several months (if not years). It’s a short checklist with questions to answer before deciding whether I will trash or retain a document.

I tend to retain too many documents (electronic and paper) that consume space and are useless, since I will never read them anyway. This short post-it helps me in my cleaning activity. It helps me take some distance from the material and its importance.

  • is it hard to find the document again?
  • is the document important regarding legal aspects?
  • is the document up to date?
  • is the document currently/frequently used?

Then if all answers are “no”, trash it!

It’s amazing how many documents can actually be trashed according to this checklist. Free desktop again!

Introduction to Reliable Distributed Programming

This book discusses distributed algorithms in the context of reliable application development. The algorithms are described intuitively and presented in pseudo-code as well. Even though this is an academic book, it is not too theoretical and is easy to follow. Theoretical complexity of the algorithms was for instance omitted on purpose.

The book presents the programming abstraction incrementally. It starts first with a recap of time abstractions and then builds a stack of algorithms in the following way: network and link abstractions, broadcast abstractions, shared memory abstractions, and consensus abstractions. The last chapter is about programming models used in real systems, for instance group communication or state machine replication.

If you are eager to learn how stuff works under the hood or want to have solid foundations to address reliable distributed programming, I recommend this book. I enjoyed reading it, but would have appreciated a bit more coverage of programming models in the last chapter.

97 Things Every Software Architect Should Know

This book is a collection of 97 articles, written by various authors, about software engineering and architecture. The articles are short (no more than 2 pages) and easy to read. Each one is focused on one principle.

The book is not a definitive recipe on how to conduct a project and be successful. It’s rather a set of more or less generic pieces of advice to think about: have I had a similar experience? Should I remember this advice next time I start a project? Is this advice relevant in my position? etc.

What is nice about the book is that it covers a wide spectrum of problems faced when building a piece of software. There are not only reflections about technology, but also about communication issues, risk management, managing people’s motivation, taste and opinions, the necessity of having a vision and a leader, etc. What matters ultimately is the delivery of a solution that addresses the real customer problems.

My top 10 favorites are:

“Perfect” is the Enemy of “Good Enough”

Great software is not built, it is grown

If there is only one solution, get a second opinion

Make sure the simple stuff is simple

Shortcuts now are paid back with interest later

Prefer principles, axioms and analogies to opinion and taste

Simplify essential complexity; diminish accidental complexity

Your system is legacy, design for it.

Simplicity before generality, use before reuse

Everything will ultimately fail