You Can't Argue With That
I’m pretty confident that most of the people reading this will be able to relate to a scenario in which you are tasked with resolving a vaguely defined, intermittent problem with a system. These problems are a sure-fire way to lose user confidence and interest, and the phrase “it works fine for me” is equally likely to alienate.
Things like...
- sometimes the screen doesn’t load and
- the lookup isn’t working
…are no good to anyone. What you need is...
- 0.6% of requests (for information from an external system) do not receive a response
- 7.2% of requests take more than 2000ms
- more than 10 requests yesterday took more than 30000ms
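Figures like these are cheap to produce once each call has been logged with its outcome. Here is a minimal Python sketch, assuming every call is recorded with an id and a duration (or no duration if the response never arrived). The `CallRecord` type, the `summarise` function and the threshold defaults are all made up for illustration; they are not part of IBM BPM or any other product.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class CallRecord:
    """One call to the external system (hypothetical record shape)."""
    request_id: str
    duration_ms: Optional[float]  # None means no response was ever recorded


def summarise(records: Sequence[CallRecord],
              slow_ms: float = 2000,
              very_slow_ms: float = 30000) -> dict:
    """Turn raw call records into the kind of figures listed above."""
    total = len(records)
    if total == 0:
        return {"total": 0, "pct_lost": 0.0, "pct_slow": 0.0, "very_slow": 0}

    answered = [r.duration_ms for r in records if r.duration_ms is not None]
    lost = total - len(answered)
    slow = sum(1 for d in answered if d > slow_ms)
    very_slow = sum(1 for d in answered if d > very_slow_ms)

    return {
        "total": total,
        "pct_lost": 100.0 * lost / total,   # share of requests with no response
        "pct_slow": 100.0 * slow / total,   # share slower than slow_ms
        "very_slow": very_slow,             # count slower than very_slow_ms
    }
```

Calling `summarise` on a day's worth of records gives you numbers you can put in front of the owner of the external system, rather than an anecdote.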
Charts are even better...
Chart 1.
Chart 1 shows the number of requests to, and responses from, my fictional external system. The two numbers should be the same. The chart is a little difficult to read because a response sometimes arrives a few seconds after its request and so lands in the next bar. An example of lost requests is at 10:00: 12 requests are made but only 10 responses are received. The subsequent bars are equal, so the missing responses did not come in later.
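One way around the "response lands in the next bar" problem is to match each response to its request by id, then count unanswered requests against the bar in which the request was sent. The sketch below assumes each request carries a correlation id; the function name, the input shapes and the five-minute bucket are illustrative assumptions, not something taken from the charts above.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Set


def lost_per_bucket(requests: Dict[str, datetime],
                    responded_ids: Set[str],
                    bucket_minutes: int = 5) -> Dict[datetime, int]:
    """Count requests that never got a response, grouped by the time
    bucket in which the request was sent (hypothetical helper)."""
    lost: Counter = Counter()
    for request_id, sent_at in requests.items():
        if request_id in responded_ids:
            continue  # answered, however late the response arrived
        # Round the send time down to the start of its bucket.
        bucket_start = sent_at.replace(
            minute=sent_at.minute - sent_at.minute % bucket_minutes,
            second=0,
            microsecond=0,
        )
        lost[bucket_start] += 1
    return dict(lost)
```

Because a late response still counts against the request that caused it, a bar only shows a loss when a response really never arrived.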
Chart 2.
Chart 2 shows the response times for my fictional requests. I’d challenge anyone to look at this chart and tell me that they don’t need to take a closer look at what’s going on.
I’ve talked about this topic many times before, but it really is very easy to make your life easier. In IBM BPM (and probably any other technology stack) it is best practice (and almost trivial) to instrument your integrations. Don’t wait until there is a problem; make it part of your implementation work, *every* time.
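To give a feel for how little is involved, here is a generic Python sketch of instrumenting an integration call. It is not IBM BPM code: the `instrumented` decorator, the log format and the `lookup_customer` function are all hypothetical, and in BPM you would apply the same idea inside the integration service itself. The point is simply to log one event when the request goes out and another when the response (or an error) comes back, both tagged with the same id and the elapsed time.

```python
import logging
import time
import uuid
from functools import wraps

logger = logging.getLogger("integration")


def instrumented(system_name):
    """Wrap a call to an external system with request/response logging."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            request_id = uuid.uuid4().hex  # ties the two log events together
            start = time.perf_counter()
            logger.info("event=request system=%s id=%s", system_name, request_id)
            try:
                result = func(*args, **kwargs)
            except Exception:
                elapsed_ms = (time.perf_counter() - start) * 1000
                logger.exception("event=error system=%s id=%s elapsed_ms=%.0f",
                                 system_name, request_id, elapsed_ms)
                raise
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("event=response system=%s id=%s elapsed_ms=%.0f",
                        system_name, request_id, elapsed_ms)
            return result
        return wrapper
    return decorator


@instrumented("customer-lookup")
def lookup_customer(customer_id):
    # Stand-in for the real call to the external system.
    return {"id": customer_id}
```

A request event with no matching response or error event is a lost request, and the `elapsed_ms` values are the raw material for a response-time chart.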
I guess the other requirement is having a capability to interrogate the resultant data in a meaningful way. Thankfully, tools for this are increasingly available in the technology stacks of most enterprises.