Understanding Mean

24/06/2020

I Wayan Sumarjaya

The word "mean" is so familiar that we use it often in daily life. For example, we might calculate the mean daily mileage from, say, Tabanan to Denpasar, or the mean monthly family expenditure. Did you know that there is a lot we can learn from the mean?

Before we discuss what we can learn from the mean, consider the following illustration. Suppose that we wish to calculate the mean family expenditure for the last six months (in million rupiah). The data are as follows: 2, 2.5, 1.5, 3, 2.6, and 20. The calculation is straightforward: sum the numbers and divide by 6, giving 31.6/6 ≈ 5.27 million rupiah. Does this mean make sense? It does not seem to, since we would expect a number of around two million rupiah. Now let us see what happens if we remove the 20 million rupiah from the data. In other words, we calculate the mean for the first five months only and ignore the sixth month. The mean is now 11.6/5 = 2.32 million rupiah, which is more reasonable.
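The effect of the single large value can be checked with a few lines of Python. This is only a small sketch that recomputes both means directly from the six values listed above; the variable names are chosen here for illustration.

```python
from statistics import mean

# Monthly family expenditure for six months, in million rupiah
expenditure = [2, 2.5, 1.5, 3, 2.6, 20]

# Mean of all six months, including the 20 million rupiah month
mean_all = mean(expenditure)

# Mean of the first five months only, ignoring the sixth month
mean_first_five = mean(expenditure[:5])

print(round(mean_all, 2))         # 5.27
print(round(mean_first_five, 2))  # 2.32
```

One extreme month more than doubles the mean, even though the other five months are all close to each other.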

Now consider a second illustration. Suppose that the monthly family expenditure for the last six months is as follows: 2, 15, 16, 18, 15, and 17. The mean is 83/6 ≈ 13.8 million rupiah, which is lower than the roughly 16 million rupiah spent in a typical month. As in the preceding illustration, let us calculate the mean while ignoring the first observation. A straightforward calculation yields 81/5 = 16.2 million rupiah, which is more reasonable.

So, what can we learn from these examples? We see that the mean is sensitive to extreme values, whether unusually high or unusually low. This raises a question: how can we deal with this sensitivity? Several approaches are available.

One approach is to truncate, or trim, the ordered data so that the extreme values are dropped; hence the term truncated or trimmed mean. The basic idea is as follows. First, sort the data from the lowest to the highest value. Then decide what proportion of the data to remove, for instance 25%, split equally between the two ends. Consider again the expenditure data above, which in sorted order is 2, 15, 15, 16, 17, and 18. Removing the first and the last observation and then calculating the mean yields 63/4 = 15.75 million rupiah. In this case we have removed two of the six observations, so this is a 33% trimmed mean.
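The trimming steps can be sketched in Python as follows. The function name `trimmed_mean` and its parameter `k` (the number of observations dropped from each end) are assumptions made for this illustration, not a standard library function.

```python
def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest observations.

    Hypothetical helper written for this illustration.
    """
    if 2 * k >= len(values):
        raise ValueError("k is too large: no observations would remain")
    ordered = sorted(values)            # step 1: sort from lowest to highest
    kept = ordered[k:len(ordered) - k]  # step 2: drop k values from each end
    return sum(kept) / len(kept)        # step 3: mean of what remains


# Monthly family expenditure, in million rupiah
expenditure = [2, 15, 16, 18, 15, 17]

result = trimmed_mean(expenditure, 1)
print(result)  # 15.75
```

With `k = 1` the smallest value (2) and the largest value (18) are dropped, which matches the calculation 63/4 above.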

Another approach is to use the median, the middle value of the ordered data. Let us calculate the median for the ordered expenditure data 2, 15, 15, 16, 17, and 18. With six observations, the median lies between the third and the fourth observation, so the median is (15 + 16)/2 = 15.5 million rupiah. The median is called robust because its value is hardly affected by outliers.
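To see this robustness in numbers, the short sketch below compares how far the median and the mean move when the unusually low first month is left out. It uses Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

# Monthly family expenditure, in million rupiah
expenditure = [2, 15, 16, 18, 15, 17]
without_low = expenditure[1:]  # same data without the unusually low first month

median_all = median(expenditure)   # middle of 2, 15, 15, 16, 17, 18
median_rest = median(without_low)  # middle of 15, 15, 16, 17, 18
mean_all = mean(expenditure)
mean_rest = mean(without_low)

print(median_all, median_rest)                  # 15.5 16
print(round(mean_all, 2), round(mean_rest, 2))  # 13.83 16.2
```

Leaving out the extreme month shifts the median by only 0.5 million rupiah, while the mean shifts by more than 2 million rupiah.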

The other thing we need to know is that the mean should only be calculated for numbers that carry a meaningful numerical value. For example, suppose that a poll is carried out to ask people's opinion about the use of face masks during the pandemic. We might code the answers as follows: 0 for disagree, 1 for mild, and 2 for agree. Now suppose that we have the following data: 0, 0, 1, 1, 2, 2, 2. The mean of these data is 8/7 ≈ 1.14. Is this a meaningful result? Of course not. The numbers are merely codes for the poll answers; they are not real quantities.
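For coded answers like these, one simple alternative is to count how often each answer occurs instead of averaging the codes. The sketch below is only an illustration of that idea and is not a method prescribed in this article; the `labels` dictionary is an assumption that mirrors the coding described above.

```python
from collections import Counter

# Coding used in the poll: the numbers are labels, not quantities
labels = {0: "disagree", 1: "mild", 2: "agree"}
answers = [0, 0, 1, 1, 2, 2, 2]

# Count how many respondents gave each answer
counts = Counter(labels[a] for a in answers)

# The most frequent answer and its count
most_common_answer, most_common_count = counts.most_common(1)[0]

print(dict(counts))                           # {'disagree': 2, 'mild': 2, 'agree': 3}
print(most_common_answer, most_common_count)  # agree 3
```

The counts say that three of the seven respondents agree, which is easier to interpret than a mean code of 1.14.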

We conclude that the mean possesses the following interesting properties. First, the mean measures the centre of the data. Next, the mean is sensitive to outliers. Finally, the mean is meaningful only if it is applied to meaningful numbers.