We have all been confronted with critical systems that do not perform as they should. The subject is fascinating because it cuts across nearly all the fields of I.T. systems design, build, and run.
Defining performance: How should it perform?
Not performing as it should is our first clue. How should it perform? Has that been thoughtfully defined? As well defined as the colors of the entry fields, as the business rules, as that shiny button on the left?
Analysts often have difficulty defining non-functional requirements, and sometimes avoid them altogether. Yet these are critical requirements, especially performance: it is often expensive to meet when specified excessively, and expensive not to meet when the user’s experience suffers, whether in sales, audience, and company image for a public website, or in employee motivation, engagement, and process efficiency for an in-house system. Only business drivers can serve as foundations for defining NFRs; by the time of technical design, it is often too late to recover them.
In my opinion, agile approaches have taught us very interesting techniques for expressing user experience, as has the growing field of business service-level definition. Focusing on the interaction between each user and the system is key to defining the experience and the main points of attention. Setting a service-level value on each of these main points completes the goal definition:
- Key interaction scenario
- Value of performance
- Cost of non-performance
If a system specification can provide such inputs, software and technical designers/architects will have a very good starting point from which to work and to communicate the choices they make.
Using a model
An I.T. system is known to be complex because it is made of many moving parts, each living a predictable but non-linear life.
As engineers, we try to tackle that by modeling the system, grouping parts into subsystems and subsystems into larger subsystems (engineers love nesting dolls). The system is now simpler to describe: fewer moving parts, a hierarchical decomposition of understandable parts. But each part is now even less linear and less predictable. Or is it?
In my experience, at a certain level, subsystems can be bound to a certain behavior (not purely linear, but following simple, well-defined models), and when a subsystem is not completely bound, design decisions can be made at the few remaining “unstable” points to ensure the correct behavior of the system. At this level, indicators can be expressed in conjunction with the interaction scenario.
The issue here is usually building this scenario (actually implementing the user side of it) and having enough experience to target the right level. This calls for an experienced architect with some background in testing. Here again, agile approaches have taught and given us a lot about defining scenarios, implementing the user side, and setting them up as continuous-integration test cases.
By modeling the various subsystems and their interactions, we can estimate the flow of events and place the resource constraints that shape it, from linear to non-linear to pure overflow. The model alone can already demonstrate limits, by applying the performance goals under the load defined by the composition of the scenarios (occurring concurrently), the common behavior of technical subsystems, and the SLAs of contributing subsystems. It can be very useful for pinpointing structural bottlenecks and for deciding what to test.
Testing, modeling, and a new perspective
In this first part, we explored how defining performance goals and their context early in the project is key, and how modeling during the design phases can help tackle issues before they arise.
In the next part of this article, we will dig a bit deeper into testing, and into what to do when performance is such a major requirement and feature of the system that it is built into it.
In the meantime, don’t forget to read Patrick’s posts on defining complexity, especially the third chapter on reflexivity, which will be very useful (different subjects, same approach; that is the versatility of CS).