
Measures of Central Tendency: A Practical Survival Guide for Qlik Developers

  • Writer: Igor Alcantara
  • Nov 26
  • 16 min read


Picture yourself surveying a city from above. Each bustling building, large avenue, and hidden courtyard tells a story, but what if you had to sum up the entire city in a single snapshot? This is the power, and the profound puzzle, of Measures of Central Tendency. These are the tools that shrink data chaos into a single, meaningful beacon. They are critical, sometimes cunning methods for finding patterns in the numbers that shape your business, your strategy, and your dashboards.


A Measure of Central Tendency is a single number that summarizes an entire variable by capturing its “typical” or central value, showing where most observations tend to cluster in the distribution. It compresses complex, messy data into a concise reference point so analysts can quickly grasp the overall level of a metric like sales, costs, wait times, or customer ages without scanning every row. Common Measures of Central Tendency include the arithmetic average, the median, and more specialized options such as geometric and harmonic means, each reflecting a different way of defining what “center” means for a particular business question and data shape.


You might be thinking that you already know this. You might assume this is just about calculating an average. But hold your horses. If you rely solely on the simple average to make decisions, you might be steering your ship into dangerous waters without even knowing it. The average is a charismatic and easy solution, but it is also a liar. It smiles at you while hiding the truth of your data.


To truly master your analytics and provide your users with insights that actually reflect reality, you need to expand your toolkit. You need to understand the philosophy of the center. We are going to explore why the arithmetic mean is often misleading, what the other measures are, and exactly how you can implement them within your Qlik applications to become a true data hero. When you write Avg(Sales), what exactly are you measuring? What business value are you extracting? This is our journey today.


Let’s begin.


The Philosophy of the Center


What exactly are we looking for when we ask for a measure of central tendency? We are trying to find a single number that describes the typical experience within a group. We want to know where the gravity of the data lies.


Imagine you are trying to describe the typical weather in a city to a friend. If you just give them one number, you are attempting to compress the complexity of hundreds of days into a single digit. That is a heavy responsibility. It might work in a tropical city like Rio, where there are basically two seasons: very hot and hell, but try that in Boston or Lund and you are out of luck. The goal is to find the value that best represents the entire set, and there are many ways to accomplish that.



The problem arises when we assume that the "center" is always the same thing. It is not. The center changes depending on the shape of your data and the nature of the numbers you are crunching. Sometimes the center is the middle value. Sometimes it is the most frequent value. Sometimes it is a value determined by multiplication rather than addition.


When you drag and drop a field in Qlik and let it default to Avg(), you are making a massive assumption. You are assuming your data follows a beautiful and symmetrical bell curve. But in the world of business, data is rarely symmetrical. It is messy. It is skewed. It is full of outliers. Relying on the wrong measure of center is like trying to measure the temperature of a room by putting the thermometer directly inside the oven. You get a reading, but it does not describe the room.


Arithmetic Mean


The Popular Deceiver

Let us start with the one everyone knows. The Arithmetic Mean. This is what people usually mean when they say "average." You calculate it by summing up all the values and dividing by the count of values.


In Qlik, this is the default behavior. You write:


Avg(Sales)

It is easy. It is fast. It is everywhere.


The Arithmetic Mean works wonderfully when your data is distributed evenly like a perfect bell curve. If you measure the height of everyone in a stadium, the arithmetic mean is a perfect summary. Most people are near the average height, and very few are giants or extremely short.


However, the Arithmetic Mean has a fatal flaw. It is extremely sensitive to outliers.


Imagine you are analyzing the annual salary of ten people in a bar. They all earn about fifty thousand dollars a year. The average salary in that bar is fifty thousand dollars. Suddenly, a billionaire walks in. Now you have eleven people. If the billionaire earns a billion dollars a year, the arithmetic mean jumps to roughly ninety-one million dollars.


Does that number represent the experience of the people in the bar? Absolutely not. If you report that the "typical" person in that bar makes millions, you are lying with data. The Arithmetic Mean has been pulled away from the true center by one massive value.


In another example, consider the average salary of an NBA player, often quoted at around ten million dollars per year, which looks spectacular on paper and seems to suggest that a “typical” player is a multimillionaire superstar. In reality, only a relatively small group of elite players earn thirty, forty, or even fifty million per year, while a large share of the league sits much closer to the minimum or mid-level contracts, so most players make roughly half of that headline figure or less. In that case, ten million dollars a year is not a typical salary, and using it to understand a typical player in the league is wrong.


It is like having a room with six 5th graders and one eighty-year-old and saying the typical age in the group is about 20 years old.


In a business context, this happens constantly. One massive sales deal can skew your average order value. One incredibly long support ticket can skew your average resolution time. If you only show your users the Avg(), you might lead them to believe that performance is better or worse than it actually is.


Median


The Honest Broker

Enter the Median. The Median is the value that sits exactly in the middle of your dataset when it is sorted from smallest to largest. It separates the higher half from the lower half. The median tells you that half of the data is lower than this value and half is higher.


If you have five sales amounts, the median is the third one. If you have the same bar scenario with the ten workers and the billionaire, the median remains fifty thousand dollars. The billionaire counts as just one data point. He does not pull the center towards him.


Notice how the Median splits the dataset exactly in half
In this example, the high max outlier shifts the Average

The Median is robust. It resists the influence of outliers. It tells you what a "typical" experience looks like much better than the mean does when your data is skewed.


When to use it: Use the Median whenever you are dealing with financial data like income, house prices, or asset values. Use it for time durations like "Time to Resolve Ticket" or "Days to Ship" because one lost package that takes a year to arrive will destroy your arithmetic mean, but the median will remain accurate.


How to do it in Qlik: Qlik makes this incredibly simple for you. The syntax is just as easy as the average.


Median(Sales)

If you are building a dashboard for sales performance, consider putting the Mean and the Median next to each other. If they are very different, that difference itself is an insight! It tells the user that the data is skewed. If the Mean is much higher than the Median, you know you have some massive outliers pulling the numbers up. That is a story your users need to hear.


It is great, but it is not Gandalf, nor Yoda

It is vital to remember that the Median is not some magical potion that cures all your analytical ailments. Furthermore, the Arithmetic Mean is not a villain actively trying to deceive you. They are simply different tools designed for different jobs.


While the Median is excellent at ignoring extreme values to show you the typical experience, sometimes you absolutely need to know about those extremes. Imagine you are an insurance company analyzing claim amounts for catastrophic events. Most claims might be small, but a few could be in the billions of dollars.


If you use the median, you will completely ignore those massive outliers. That would be disastrous for calculating the total risk exposure and setting premiums. In such a case you want the arithmetic mean precisely because it is sensitive to those massive numbers. The total financial impact matters more than the typical claim. But the world of data is stranger than just symmetrical or skewed bell curves. Sometimes, neither the Mean nor the Median is the right answer.


Geometric Mean


The Growth Tracker

Now we enter the territory of multiplication. The Arithmetic Mean is based on addition. But what if your data is about growth, percentages, or rates?


Imagine you are analyzing the performance of an investment portfolio. In year one, your investment grows by ten percent. In year two, it grows by fifty percent. In year three, it drops by thirty percent. If you try to use a simple arithmetic average on these percentages, you will get a misleading number. The Median would make it even worse.


The problem is that the Arithmetic Mean treats each year as an isolated event on a static starting amount. It assumes the fifty percent gain in the second year is based on the same money you had at the start of year one. But that is not how reality works. The starting value of the second year is entirely dependent on the result of the first year. Your money compounds. The fifty percent growth happens on top of the ten percent growth you already achieved. Because the value of the second term is influenced by the first, and the third is influenced by the second, simple addition fails to capture the true path of your investment. The math of compounding does not work with addition. You need the Geometric Mean.


The Geometric Mean is calculated by multiplying all the numbers together and then taking the nth root (where n is the count of numbers). It is the only correct way to average ratios and percentages that build upon each other.


Formula for Geometric Mean: GM = (x₁ · x₂ · … · xₙ)^(1/n)

Here is a fun example to prove why you need it. You have one hundred dollars. You lose fifty percent. You now have fifty dollars. You gain fifty percent. You now have seventy five dollars.


The arithmetic average of negative fifty percent and positive fifty percent is zero. So the arithmetic mean tells you that your average return is zero percent. But you started with one hundred and ended with seventy five. You lost money! The arithmetic mean lied to you. The Geometric Mean would correctly show a negative value, reflecting the reality of your loss.


When to use it: Use this for financial portfolio returns, scientific growth rates, or any metric that is a percentage of a previous value.


How to do it in Qlik: In a future version of Qlik, you might find native support (hey Qlik, if you're reading this: please do it and I will buy you lunch). Until that happens, it is good to know how to construct it manually using the mathematical relationship between logarithms and exponentials. This method is bulletproof and works on every version of Qlik.


The formula logic is: e (Euler's number, which is one of my favorite tattoos) raised to the power of the average of the natural logarithms of the values. Conveniently, Qlik's Log() function is the natural logarithm, so no extra conversion is needed.


Exp(Avg(Log(ReturnRate)))

The Geometric Mean has one very strict rule. It is allergic to negative numbers. This is not just a preference but a mathematical necessity. The formula relies on roots and logarithms. If you try to take the square root of a negative number, you leave the real world and enter the world of imaginary numbers. Qlik does not want to display imaginary numbers on your executive dashboard. If you try to feed a negative growth rate into a Geometric Mean function, the calculation will break and return a null value.


This creates a conflict because business data is full of negative numbers. Sales go down. Margins shrink. We usually express these as negative percentages, like minus ten percent. However, reason to worry you have not, my young padawan. There is a way out.


To solve this, we have to translate our data from a "rate of change" into a "multiplier."

Think of it in terms of what remains rather than what was lost. If you have a dollar and you lose ten percent of it, you are not holding "minus ten cents." You are holding ninety cents. The multiplier for that period is zero point nine. If you gain ten percent the next year, you are holding one dollar and ten cents. The multiplier is one point one.


Notice that zero point nine and one point one are both positive numbers. We have successfully removed the negative sign while keeping the reality of the performance intact. You calculate the Geometric Mean using these positive multipliers. This gives you the average compound growth factor. Once you have that final result, you simply subtract the number one from it. That converts the multiplier back into a percentage that your users will understand.


Harmonic Mean


The Speed Balancer

This is the most neglected member of the family, like the middle kid, but it is a superhero in logistics and manufacturing. The Harmonic Mean is the reciprocal of the arithmetic mean of the reciprocals.


Wait, what?

That sounds complicated. Let me make it simple. The Harmonic Mean is required when you are averaging rates or ratios.


The classic example is the "Drive to Work" problem. You drive to work at sixty miles per hour. You drive home at forty miles per hour. What is your average speed?


Your brain wants to shout "Fifty!" because fifty is the arithmetic average of sixty and forty. But that is wrong. You spent more time driving at forty miles per hour than you did at sixty. Because you were slower, you were on the road longer. Therefore, the slower speed should have more weight in the average.


The true average speed is forty-eight miles per hour, not fifty. Over a distance of d miles each way, the trip takes d/60 plus d/40 hours, and total distance divided by total time gives 2 / (1/60 + 1/40) = 48. That is the Harmonic Mean.


Formula for Harmonic Mean: HM = n / (1/x₁ + 1/x₂ + … + 1/xₙ)

If you are a Qlik developer working in supply chain, manufacturing, or logistics, you deal with rates all the time. Speed of production lines. Items processed per hour. Kilometers per liter. If you use Avg() on these rates, you are reporting incorrect data. You are overestimating efficiency because you are not accounting for the fact that slower rates consume more time. The same applies to project management: how fast do we usually deliver work in a Qlik development project? Average that rate with the Mean or the Median and you're wrong.


When to use it: Use the Harmonic Mean whenever you are averaging units that are defined as "something per something" (speed, density, rate) and you want to find the average rate over the total output.


How to do it in Qlik: While some newer builds have a function, the manual construction is very powerful and educational. You calculate the count of observations divided by the sum of the reciprocals.


Count(Value) / Sum(1/Value)

For a more robust expression that handles multiple rows per entity correctly, you might use the Aggr() function to compute one reciprocal rate per ID.


Count(Total Distinct ID) / Sum(Total Aggr(1 / Avg(Rate), ID))

This ensures you are calculating the harmonic mean of the rates correctly across your dimensions.


The Mount Rushmore of Measures Checkpoint


You have now assembled the Mount Rushmore of central tendency. These four giants stand tall as the bedrock of statistical analysis. The Arithmetic Mean, Median, Geometric Mean, and Harmonic Mean cover the vast majority of landscapes you will encounter in your daily data voyages.



If you master just these four, you are already light-years ahead of the average (pun intended) dashboard designer who thinks the simple average is the answer to everything. It is perfectly acceptable to drop anchor right here. You have a solid toolkit that can handle skewed distributions, compound growth, and complex rates.


If your brain feels full, or if you just need to go fix a synthetic key in your data model, feel free to stop reading now. No judgment here, because you have already graduated from the basic academy. Go for a walk, play a game, have a coffee, or just tune in to your favorite 1990s rock bands.


However, there is always another horizon for the true explorers. For those of you who want to prepare for every possible anomaly and specific business logic quirk there are hidden levels to unlock. For those not happy to reach the moon, how about we go to Mars? We are about to venture into the territory of specific use cases where even the Big 4 might stumble. If you want to elevate your analytics from great to legendary then keep scrolling. We have three more instruments to show you.


Mode


The Crowd Favorite

Sometimes you do not care about the mathematical center or the balance of gravity. Sometimes you simply want to know what is popular. Who is the Beyoncé of the data? The Mode is the value that appears most frequently in your dataset.


Think of a shoe store manager. The arithmetic mean of shoe sizes might be size nine point three. But you cannot manufacture a shoe in size nine point three. The median might be size nine. But if the specific shoe is trending among teenagers who mostly wear size seven, then stocking size nines will leave you with empty shelves and angry customers. You need the Mode.


The Mode is the only measure of central tendency that works with data that is not a number. You cannot calculate the average color of cars sold. You cannot calculate the median flavor of ice cream. But you can certainly calculate the Mode. It tells you that "Silver" is the most common car color or that "Qlik Tok" is the crowd favorite social media.


When to use it: Use this for inventory management to see which SKUs move the fastest. Use it for categorical analysis like identifying the most common support issue reason. It connects you to the most common experience rather than the mathematical average.


How to do it in Qlik: Qlik has a native function for this that is just as easy as the others.


Mode(ShoeSize)

Be careful though. A dataset can have multiple modes (if two values tie for the win) or no mode at all (if every value is unique). In Qlik, Mode() returns null in both of those situations, so do not be surprised by an empty KPI.


Weighted Mean


The Fair Judge

This is perhaps the most critical addition for any business analyst. The simple Arithmetic Mean treats every data point as equals. But in business, data points are rarely equal.


Imagine you are analyzing the profit margin of two stores. Store A sells ten items and has a margin of ten percent. Store B sells one thousand items and has a margin of two percent.


If you simply average ten percent and two percent, you get six percent. You might report to your CEO that the company margin is six percent. That is catastrophically wrong. Store B is massive and Store A is tiny. The massive volume of Store B drags the reality down. The true margin is much closer to two percent.


The Weighted Mean forces the calculation to respect the weight or volume of each contributor. It stops small samples from bullying the big picture.


When to use it: Use this whenever you are averaging ratios, percentages, or prices that are associated with varying volumes. If you calculate "Average Price" without weighting it by "Quantity Sold", you are calculating a theoretical number that does not match your revenue.


How to do it in Qlik: There is no simple WeightedAvg() function in the standard aggregation list, but you build it using the logic of "Sum of Product divided by Sum of Weights".


Sum(Margin * Sales) / Sum(Sales)

This formula ensures that every dollar of sales gets a vote in the final calculation, rather than just every store getting a vote.


Trimmed Mean


The Skeptic

Sometimes you want the sensitivity of the mean, but you do not trust the extremes. You like the stability of the median, but you feel it ignores too much information. You want a compromise.


The Trimmed Mean involves cutting off a specific percentage of the smallest and largest values before calculating the average. For example, you might remove the top five percent and the bottom five percent of your data. Then you average what is left.


This is famously used in Olympic judging for sports like diving or gymnastics. They throw out the highest score and the lowest score to eliminate bias, then average the rest. In business, this removes the influence of data entry errors (like a zero price) or one-off anomalies without completely ignoring the distribution of the central data.


When to use it: Use this for sensor data that might have glitches. Use it for "Time to Deliver" metrics where you want to exclude the packages that got lost and the ones that were delivered instantly due to a system error, so you can focus on the performance of the actual logistics network. Use this in cases like Hospital Length of Stay where a few rare very long stays can mess up the center of gravity, but you do not want to totally ignore outliers.


How to do it in Qlik: This requires a bit more advanced scripting, often using Set Analysis or the Fractile() function to define your cut off points. A dynamic way to do this in a chart is to only sum values that exist between the 5th and 95th percentile.


Avg({$<Value = {">=$(=Fractile(Value, 0.05)) <=$(=Fractile(Value, 0.95))"}>} Value)

This expression acts as a bouncer, refusing to let the outliers into your calculation party.


Conclusion: Captaining Your Analysis


We began this journey with a simple warning about the charisma of the average. We talked about how a single number, when misapplied, can hide the very reality it claims to represent. Hopefully you now see that relying solely on the arithmetic mean is like trying to navigate the vast ocean with a map that only shows straight lines in a world full of curves, currents, and unexpected storms. The center is not a fixed point on a map but a dynamic concept that shifts depending on the landscape of your data.


Most importantly: The math is useless if the logic is flawed. Before you calculate anything, you must define what success looks like for that metric. You need to bridge the gap between the raw data and the actual business decision. Do not pick a measure at random or out of habit. Make sure your choice of central tendency is a deliberate decision that reveals the specific insight you need. Don't blindly use one or another; reflect before you act.


The most critical step happens before you write a single line of script. You must stop and interrogate the intent behind the analysis. Ask yourself exactly what real world phenomenon you are trying to capture and what specific business value this number must deliver. Do not simply reach for a tool because it is familiar or convenient. True insight comes from aligning your statistical method with your business reality. You must reflect deeply on the destination before you choose the vehicle.


The choice to use the correct measure of central tendency is the choice between building a dashboard that merely reports numbers and building one that reveals the truth. It is the difference between being a passive passenger on the sea of data and being an active, knowledgeable captain of your analysis. Your users rely on you to provide them with a compass that points to the true north, not just the easiest north to calculate. So go back to your applications with fresh eyes. Challenge your old expressions. Look at your distributions with skepticism and curiosity. Apply the right tool for the job. The true center is out there waiting for you to find it. Safe travels on your next data voyage.


 
 
 
