Too much aggregation is not good for analysis …

Have you heard of Simpson’s Paradox? It affects many of the analyses we perform on a day-to-day basis. Unless we are aware of what it means and what it implies for our analysis, all that analysis may be in vain.

Let me explain it in our context. Consider two delivery teams working on multiple projects over a period of time. You have accumulated data in your internal project management system (PMS) that quantifies the number of function points each team handled in that period and the number of bugs found in the product from system testing through customer acceptance.

Let us assume the sample data below has been extracted from the PMS and you would like to use it as the basis for your performance evaluation.

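| Team | Function Points Developed | No. of Bugs | No. of Bugs / Function Points |
|--------|---------------------------|-------------|-------------------------------|
| Team A | 130 | 56 | 0.43 |
| Team B | 70 | 32 | 0.46 |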

From the above information, it seems clear that Team A has done a better job than Team B, since it has fewer bugs per function point, and you would be inclined to give Team A a better rating than Team B.

Now let us add one more dimension to the analysis, say the “Complexity” of the function points. For simplicity, let us say there are only two categories, “Complex” and “Easy”. The revised data, broken down by complexity, is below.

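| Team | No. of Bugs / Function Points (Complex) | No. of Bugs / Function Points (Easy) | Simple (Unweighted) Average of the Two Rates |
|--------|------------------------------------------|---------------------------------------|----------------------------------------------|
| Team A | 40/50 = 0.80 | 16/80 = 0.20 | 0.50 |
| Team B | 30/50 = 0.60 | 2/20 = 0.10 | 0.35 |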

Now the analysis says that Team B has done a better job than Team A, in both categories, which completely contradicts the interpretation you drew from the first table. Note that if you sum the function points and the bugs across the two categories, the totals are the same as in the first table (130 and 56 for Team A, 70 and 32 for Team B).
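The reversal can be checked with a few lines of Python. This is only a minimal sketch using the sample numbers from the tables above; the dictionary layout and the function names `aggregated_rate` and `category_rates` are illustrative choices, not something taken from a PMS.

```python
# (bugs, function points) per team and complexity category,
# copied from the sample tables in this post.
data = {
    "Team A": {"Complex": (40, 50), "Easy": (16, 80)},
    "Team B": {"Complex": (30, 50), "Easy": (2, 20)},
}


def aggregated_rate(team):
    """Bugs per function point with both categories lumped together."""
    total_bugs = sum(bugs for bugs, _ in data[team].values())
    total_points = sum(points for _, points in data[team].values())
    return total_bugs / total_points


def category_rates(team):
    """Bugs per function point, computed separately for each category."""
    return {cat: bugs / points for cat, (bugs, points) in data[team].items()}


for team in data:
    print(team, "aggregated:", round(aggregated_rate(team), 2),
          "by category:", category_rates(team))
```

On the aggregated figure Team A looks better, while on every per-category figure Team B looks better. That combination is exactly what Simpson’s Paradox describes.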

The underlying difference is that the second calculation compares the teams like for like within each complexity category, whereas the first one mixes the categories together. The two teams had very different workloads: about 71% of Team B’s function points (50 of 70) were Complex, against about 38% of Team A’s (50 of 130), and Complex function points carry a much higher bug rate for both teams. The interpretation may change again if you introduce one more dimension, such as the “Severity” of the bugs. But given that Team B has done a good job on Complex function points, it seems reasonable to assume they have not floundered by creating too many “High” severity bugs.
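To see where the two sets of numbers come from, it may help to treat each team’s overall ratio as a weighted average of its per-category rates, with the weights being the function points in each category. The sketch below assumes the sample data from this post; `weighted_rate` is a hypothetical helper written for illustration. With each team’s own workload as weights it reproduces the first table; with equal weights for Complex and Easy it reproduces the averaged column of the second table.

```python
# Per-category bug rates and function points, from the sample tables in this post.
rates = {
    "Team A": {"Complex": 40 / 50, "Easy": 16 / 80},
    "Team B": {"Complex": 30 / 50, "Easy": 2 / 20},
}
points = {
    "Team A": {"Complex": 50, "Easy": 80},
    "Team B": {"Complex": 50, "Easy": 20},
}


def weighted_rate(team_rates, weights):
    """Average the per-category rates using the given category weights."""
    total_weight = sum(weights.values())
    return sum(team_rates[cat] * weights[cat] / total_weight for cat in team_rates)


# Weights = each team's own workload mix (this is the first table).
own_mix = {team: weighted_rate(rates[team], points[team]) for team in rates}

# Weights = the same mix for both teams, here an assumed 50/50 split
# (this is the averaged column of the second table).
equal_mix = {team: weighted_rate(rates[team], {"Complex": 1, "Easy": 1}) for team in rates}

for team in rates:
    print(team, "own mix:", round(own_mix[team], 2), "equal mix:", round(equal_mix[team], 2))
```

The 50/50 split is just one possible common scale. Any fixed mix applied to both teams would keep Team B ahead here, because Team B has the lower rate in each category.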

So don’t get carried away by high-level aggregated numbers and ratios; dig deeper to get the real picture. The validity of your assumptions decides how deep you should dig. Interesting?

