Alan's (new and improved) blog

the customer issue tracking problem

Having spent a good deal of the past 7 years of my life bridging the gap between customers and software products, in many cases at the cost of a large amount of tedious, repetitive effort on my part, I am particularly sensitive to the problems involved in tracking and managing customer issues.

This came up again in the context of Jobster's imminent pilot launch: the company is undertaking what may well be its first internal systems integration project, between an externally hosted CRM system and the engineering bug tracking system.

First, let me describe just exactly how bad this can get. The project to ship the browser and messaging software for the Vodafone 3G phones last year at Openwave was an amazing international effort. The software was designed to ship simultaneously for the Japan market and numerous European markets, with two different handset suppliers incorporating our technology.

So, let's say that someone doing handset acceptance testing in Spain had an issue with our software. That issue was filed in the Vodafone Europe acceptance issue database (thank God that at least all European Vodafone operators shared a database), which was then manually copied by the handset manufacturer into their database, which was then manually copied into Openwave's internal bug tracking database. Let's say we fixed that bug one happy day: then the fix notification would have to be propagated (again manually) back through those same three databases.

Thus it was no surprise to me when one day I happened to be on a phone conference directly with the Vodafone acceptance person in Spain, and she was extremely pissed off about the fact that her issues basically disappeared into a black hole and she didn't hear about them for weeks at a time.

The Critical Issue List problem

It gets worse. See, at certain levels of management, it's not fashionable to muck about with actually updating and normalizing data in a database, so issue lists get extracted into Excel and then passed around for further triage and prioritization. Ostensibly the purpose of this is to identify a "critical issue list" that defines when the project is ready to ship. In fact, as anyone who has been through this knows, the "critical issue list" is no more static than the database itself, and a live query on properly-updated issues in the database would serve this purpose much better.

So, back to the example of the Vodafone project: in addition to a VF Europe database, VF Japan database, handset manufacturer database and Openwave database, there was a "master" Excel sheet that was updated daily and sent around with the latest prioritization information. In order to assist in matching up the issues in the "master" Excel sheet with the issues in the three databases, we hit upon the idea of - you guessed it - another Excel sheet that sucked data out of the "master" sheet, then matched that up against a screen-scraped web page report out of the customer database, and further linked it to internal issues so that we would know what to work on. The tape and baling wire that held this mess together was an elaborate set of conventions for notation in the issue subjects, such as (issue id) for internal issue id, and /H/ for really friggin high priority (the priority inflation was hilarious), and probably a half-dozen other such hacks that I can no longer recall. Fortunately Openwave has a world-class VBA guru in the form of Pierre Raynaud-Richard, so this other sheet automated much of this task with an unbelievable hairball of VBA code.
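To give a flavor of what those subject-line conventions meant in practice, here is a minimal Python sketch of the kind of parsing the matching sheet had to do. This is not the original VBA, and the exact formats are assumptions for illustration: I'm assuming the internal issue id is a number in parentheses and that /H/ appears literally somewhere in the subject. The function name parse_subject is made up.

```python
import re

# Assumed conventions (hypothetical, for illustration only):
#   "(12345)" anywhere in the subject -> internal issue id
#   "/H/"     anywhere in the subject -> high priority
INTERNAL_ID = re.compile(r"\((\d+)\)")
HIGH_PRIORITY = re.compile(r"/H/")


def parse_subject(subject):
    """Pull the embedded markers out of an issue subject line."""
    match = INTERNAL_ID.search(subject)
    # Strip both markers, then collapse leftover whitespace.
    stripped = HIGH_PRIORITY.sub("", INTERNAL_ID.sub("", subject))
    return {
        "internal_id": int(match.group(1)) if match else None,
        "high_priority": bool(HIGH_PRIORITY.search(subject)),
        "title": " ".join(stripped.split()),
    }
```

Every one of these markers is data that belongs in a proper database column. Encoding it in free text means one mistyped bracket and the link between the customer issue and the internal issue is silently lost.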

In short, 5 different databases (including Excel sheets) containing versions of the same data. The synchronization problem was so bad that, had I waited for the "official" process to run its course on the resolution of all these issues, nothing would ever have been accomplished. So, the problem was solved, as it often is, by breaking down these barriers of abstraction, gaining direct access to all these systems, and promptly updating and resolving issues at the source.

(Aside: one of the initiatives at Openwave was to drive our customers into yet another database called RT that was supposed to link in a nicer way to internal issues, etc., itself replacing a system called STS that I had used in previous projects. In the short term, all this created for me was yet another place to go look. I literally kept a small "cheat sheet" file on my hard drive with the web addresses, login information, etc. for the 8+ issue databases that I had to sift through.)

The Ideal Experience

It need not be this way. As any software engineer will tell you, the way you increase performance on a system is to make frequently-repeated operations as efficient and fast as possible. What could be more frequent than a customer reporting an issue with software? Right. So why do we create such complexity in this data path?

The ideal experience would be like that of my first paid programming project, back when I was writing language education software for the TRS-80. My customer was a high school teacher sitting in the chair next to me. He told me there was a problem. I fixed it. We moved on. Total issue resolution time: about 5 minutes.

The guys at CNet's Newsburst also came pretty close to the ideal the other day: I reported a problem with importing my OPML file into Newsburst via their feedback link. That same day, I got an email response back - they asked me for a copy of my OPML file - and the following day I got another email informing me that a slightly modified file (which they provided to me) could work around the problem immediately, and that they would incorporate a code fix for the problem shortly. Wow. That's very cool.

Dispatching and dealing with customer issues quickly doesn't mean fixing them all, though. It's ok to say "we're not going to fix this". Just do it quickly and efficiently. If the customer doesn't like it, they can escalate. But at least they won't wait for 3 weeks or 3 months or forever wondering what happened to their issue. Over the course of filing lots of issues, they will get a sense for which are likely to get fixed, and a sense of the criteria.

Towards a Solution

It's pretty clear to me that there needs to be one database. One schema. And, because it's going to deal with direct customer issues as well as internally-generated issues, it needs to be accessible outside the firewall. (Aaaag! Security freak-out! What if our competitors get access to all our secret bugs?) Second, because it's going to contain customer information, the business side of the house probably needs a substantial say in how it works. (Aaaag! Engineering freak-out! You mean we don't have total administrative control over our bug database?) And ideally it's got a great way to create "critical issue views" that mirror what people do in Excel sheets today. Ok, so maybe I'm dreaming. But the way this stuff is done today is clearly broken...
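To make the "one database, one schema, critical issue views" idea a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the table and column names, the 1-to-4 priority scale, the status values, and the helper functions are all made up, and a real system would need the access control that both freak-outs above are about. The point is only that the "critical issue list" becomes a saved query over live data instead of an Excel sheet mailed around daily.

```python
import sqlite3

# Hypothetical schema: one table for customer and internal issues alike.
SCHEMA = """
CREATE TABLE issues (
    id       INTEGER PRIMARY KEY,
    source   TEXT NOT NULL CHECK (source IN ('customer', 'internal')),
    reporter TEXT NOT NULL,
    title    TEXT NOT NULL,
    priority INTEGER NOT NULL CHECK (priority BETWEEN 1 AND 4),  -- 1 = highest
    status   TEXT NOT NULL DEFAULT 'open'
             CHECK (status IN ('open', 'fixed', 'wont_fix'))
);

-- The "critical issue list" is a view, so it always reflects current data.
CREATE VIEW critical_issues AS
    SELECT id, source, reporter, title
    FROM issues
    WHERE status = 'open' AND priority = 1;
"""


def open_tracker():
    """Create an in-memory tracker; a real one would live on a server."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def file_issue(conn, source, reporter, title, priority):
    """Record a new issue and return its id."""
    cur = conn.execute(
        "INSERT INTO issues (source, reporter, title, priority) "
        "VALUES (?, ?, ?, ?)",
        (source, reporter, title, priority),
    )
    return cur.lastrowid


def close_issue(conn, issue_id, status):
    """Resolve an issue at the source, e.g. as 'fixed' or 'wont_fix'."""
    conn.execute("UPDATE issues SET status = ? WHERE id = ?", (status, issue_id))


def critical_list(conn):
    """Return the live critical issue list as (id, title) pairs."""
    return conn.execute(
        "SELECT id, title FROM critical_issues ORDER BY id"
    ).fetchall()
```

Note that saying "we're not going to fix this" is a first-class status here: closing an issue as wont_fix drops it off the critical list immediately, and the customer who filed it can see that decision in the same place they filed the issue.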

(Note: I promise I'm gonna work on getting comments working again. Soon. Too busy blogging to fix right now.)