Imagine living in a world where every worker contributed the same amount, every problem was equally important, and every product was equally popular with users. Planning would be simple. In reality, however, such distributions are highly skewed. In most people’s everyday routines, low-impact chores take up much of the day. These activities can consume up to 80% of our time without having any meaningful impact on productivity. If this sounds familiar, the Pareto principle is worth a look. Also known as the 80/20 rule, it helps prioritize the activities that really matter.
What is the Pareto principle?
The Pareto principle states that roughly 80% of the consequences come from 20% of the causes. In other words, a small minority of causes have a disproportionate effect on the outcome. It is named after the Italian economist Vilfredo Pareto, who described the pattern in 1896. Pareto is said to have noticed that 20% of the pea pods in his garden yielded 80% of the peas. Extending his observations beyond his garden, he observed that about 20% of the population owned 80% of the country’s wealth. The same held for land ownership: a handful of wealthy citizens who made up 20% of the population owned 80% of the land.
Decades later, management consultant Joseph Juran observed that the phenomenon identified by Pareto applied consistently to a wide variety of fields, including economics, commerce, management, and quality control. Juran found that roughly 20% of the manufacturing processes were responsible for 80% of the product defects, so fine-tuning those few processes could improve overall product quality. His observations led him to coin the phrases “vital few” and “trivial many,” meaning that prioritizing the “vital few” over the “trivial many” leads to greater success. This was widely adopted as a strategy to identify the most significant factors across different departments, organizations, and business sectors.
What is the Pareto distribution?
Let’s now look at what a Pareto distribution looks like. Consider an example of income distribution in a city. Below are three graphs that attempt to describe the distribution.
The uniform distribution in Figure 1(a) illustrates a situation where the population is evenly distributed across all income levels. The normal distribution in Figure 1(b) depicts a situation where the majority of the population’s income falls between income levels 4 and 6. However, studies indicate that income follows a Pareto distribution, as shown in Figure 1(c), with the richest 20% of the population receiving 80% of the income and the rest earning between income levels 0 and 2.
The Pareto distribution is described by the following equation:
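In its standard (type I) form, which is assumed here because it matches the two parameters named in this article, the probability density function and the corresponding tail probability are usually written as:

```latex
f(x) = \frac{\alpha \, x_m^{\alpha}}{x^{\alpha + 1}}, \qquad x \ge x_m,\ \alpha > 0,\ x_m > 0

P(X > x) = \left( \frac{x_m}{x} \right)^{\alpha}, \qquad x \ge x_m
```

The density is zero below x_m, and the tail probability falls off as a power of x rather than exponentially, which is what gives the distribution its long tail.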
The equation represents a skewed distribution with a gradually decreasing tail, where the shape parameter α describes how quickly the tail decays and the scale parameter x_m is the minimum possible value of x.
Let’s take an example of wealth distribution to better understand the shape parameter α.
The graph depicts the distribution of wealth in the population. A lower value of α indicates that a smaller percentage of the population holds a larger share of the wealth (that is, higher wealth disparity). For example, in Figure 2, the orange line illustrates the 90/10 distribution for an α value of 1.05, showing that the wealthiest 10% of the population own 90% of the wealth. Similarly, the green line represents the classic 80/20 Pareto distribution for an α value of 1.16. The shape parameter α is also known as the Pareto index, which in this case measures how broadly wealth is spread.
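The link between α and a split such as 80/20 can be checked numerically. The sketch below assumes a type I Pareto distribution with α > 1, for which the richest fraction p of the population holds a share p^(1 - 1/α) of the total. The function names `top_share` and `alpha_for_split` are illustrative, not part of any library.

```python
import math


def top_share(alpha: float, p: float) -> float:
    """Share of the total held by the richest fraction p (type I Pareto, alpha > 1)."""
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 so that the mean is finite")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return p ** (1 - 1 / alpha)


def alpha_for_split(share: float, p: float) -> float:
    """Pareto index at which the richest fraction p holds the given share."""
    if not 0 < p < share < 1:
        raise ValueError("expected 0 < p < share < 1")
    # Solve share = p ** (1 - 1 / alpha) for alpha.
    return 1 / (1 - math.log(share) / math.log(p))


print(round(alpha_for_split(0.8, 0.2), 2))  # 80/20 split, about 1.16
print(round(alpha_for_split(0.9, 0.1), 2))  # 90/10 split, about 1.05
print(round(top_share(2.0, 0.2), 2))        # a larger alpha gives a more even split
```

Under this assumption the two values of α quoted for Figure 2 are recovered, and raising α lowers the share held by the top 20%.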
The distribution has a wide range of applications: it describes social, scientific, economic, and geophysical phenomena, and is used in quality control and actuarial work in enterprises, among other things.
What is a Pareto chart?
While the Pareto distribution is an excellent statistical tool for understanding the Pareto principle, we will now look at a presentation style that aids in data visualization and interpretation by displaying the relative impact of all contributing components on the main problem.
A Pareto chart combines a bar graph and a line graph: the heights of the bars represent the frequency or impact of each category, and the line represents the cumulative total. The bars are arranged in decreasing order, with the tallest bars on the left and the shortest on the right. In this way, the chart shows at a glance which categories are the most significant sources of the issue.
Consider the following simple Pareto chart, which depicts the distribution of some prevalent software defects. The left vertical axis represents the frequency of occurrence while the right vertical axis depicts the cumulative percentage of the total number of occurrences.
Functional defects occur most frequently, with 700 defects covering approximately 50% of the issues. Performance problems are the next most frequent, with 500 reported problems. Together, functional and performance issues make a cumulative contribution of 80% (see the red line, read against the right vertical axis).
As Figure 3 shows, the first two bars are the tallest and account for a considerable proportion of all the defects. The cumulative chart (red line) also increases rapidly at first and then levels out, demonstrating the Pareto principle in action. The top two categories account for roughly 80% of all defects, and hence fixing those should be prioritized over others.
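The numbers behind such a chart are simple to compute: sort the categories by frequency and keep a running cumulative percentage. The sketch below uses the two counts given above (700 functional and 500 performance defects). The remaining categories and their counts are hypothetical, chosen only so that the top two reach 80% as described.

```python
def pareto_table(counts: dict[str, int]) -> list[tuple[str, int, float]]:
    """Return (category, count, cumulative percent) rows, largest count first."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    rows = []
    running = 0
    for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((name, count, round(100 * running / total, 1)))
    return rows


defects = {
    "Usability": 150,      # hypothetical count
    "Functional": 700,     # from the text
    "Security": 50,        # hypothetical count
    "Performance": 500,    # from the text
    "Compatibility": 100,  # hypothetical count
}

table = pareto_table(defects)
for name, count, cumulative in table:
    print(f"{name:<14}{count:>5}{cumulative:>8.1f}%")
```

The count column corresponds to the bars (left axis) and the cumulative column to the line (right axis). With these assumed counts, the cumulative percentage reaches 80.0 after the second row.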
Some common misconceptions
Because of its simplicity, the Pareto principle is frequently misunderstood. Below are some common myths associated with it:
- Myth 1: It is a mathematical law:
The 80/20 rule is a general observation rather than a formal mathematical law. It is also sometimes misread to mean that if 20% of inputs matter most, the remaining 80% can be ignored. This is a logical fallacy. In the chart shown in Figure 3, while we may prioritize fixing the most frequently occurring 30% of bugs, the remaining 70% should not be neglected.
- Myth 2: The numbers 20 and 80 must add up to 100:
It is a common misconception that the numbers 20 and 80 must always add up to 100. That is not the case. The two numbers measure different things: one is a share of the inputs (causes) and the other is a share of the outputs (effects). The split could just as well be 80/30 or 90/20.
- Myth 3: It can be used to accurately predict the future:
The Pareto principle describes historical data, which makes it a useful tool for planning, but it makes no predictions about the future. While past performance can be a good indicator of future performance, there is no guarantee that the same pattern will hold. When formulating new strategies, the Pareto principle may not always be applicable, as circumstances change over time.
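To make Myth 2 concrete, the sketch below measures the split in a data set directly: it finds the smallest group of top causes that covers a target share of the total effect. The function name `causes_needed` and both data sets are made up for illustration.

```python
def causes_needed(contributions: list[int], target_percent: int = 80) -> tuple[float, float]:
    """Return (percent of causes, percent of effect) for the smallest top group
    whose combined contribution reaches target_percent of the total."""
    if not 1 <= target_percent <= 100:
        raise ValueError("target_percent must be between 1 and 100")
    total = sum(contributions)
    if total <= 0 or min(contributions) < 0:
        raise ValueError("contributions must be non-negative with a positive sum")
    running = 0
    for used, value in enumerate(sorted(contributions, reverse=True), start=1):
        running += value
        # Integer comparison avoids floating-point trouble at the boundary.
        if running * 100 >= target_percent * total:
            return 100 * used / len(contributions), 100 * running / total
    raise AssertionError("unreachable: the full list always reaches the target")


# Hypothetical data: three of ten causes are needed, giving an 80/30 split.
print(causes_needed([40, 25, 15, 5, 4, 3, 3, 2, 2, 1]))
# Hypothetical data: two of ten causes overshoot the target, giving a 90/20 split.
print(causes_needed([600, 300, 30, 20, 15, 10, 10, 5, 5, 5]))
```

Neither pair of percentages adds up to 100, because each is taken against a different total: the number of causes on one side and the size of the effect on the other.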
The Pareto principle is widely used in various domains. Below are some examples:
- In marketing and sales, knowing that 20% of advertisements generate 80% of sales traffic can be used to improve future marketing efforts.
- In software development, the Pareto principle can be used to identify a small subset of bugs that contribute to the maximum number of crashes. This can help prioritize fixes for the most prevalent cases while safeguarding the system from the vast majority of outages.
- Sports also tend to follow the Pareto principle. Regardless of the sport, the leading players are responsible for the majority of the wins. In baseball, for example, the top 15 percent of players have been credited with 85 percent of total victories. Similarly, in cricket, the leading 20% of players are said to account for 80% of the success and compensation.
The ignio analytics framework uses the Pareto principle in many ways, including data quality assessment, characterizing normal behavior, feature engineering, generating data summaries, and assessing top contributors. The Pareto principle often offers a powerful lever for identifying areas that need attention, and thus helps narrow down the focus for a deep-dive analysis.
The Pareto principle, developed in 1896, is still relevant as it does not regard the world as even and uniform. On the contrary, it acknowledges that most things are not distributed evenly. Some things contribute more than others, and hence priorities should be established to maximize the impact. As a result, it has become a powerful tool that is used across industries. However, keep in mind that the Pareto principle is not a one-size-fits-all solution. It is merely an observation and not a mathematical law.