
Trimmed Mean With Percentage Limit In Python?

I am trying to calculate the trimmed mean, which excludes the outliers, of an array. I found there is a module called scipy.stats.tmean, but it requires the user specifies the ran

Solution 1:

At least for scipy v0.14.0, there is a dedicated function for this (scipy.stats.trim_mean):

from scipy import stats
m = stats.trim_mean(X, 0.1) # Trim 10% at both ends

In that version, trim_mean calls stats.trimboth internally.

Reading the source code, proportiontocut=0.1 cuts 10% from each end, so the mean is calculated using the remaining 80% of the data. Note that scipy.stats.trim_mean cannot handle np.nan.
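
As a rough illustration of that arithmetic (not SciPy's actual code), here is a sketch that uses only the standard library. The helper name trimmed_mean_sketch is made up for this example, and dropping NaNs before trimming is an assumption about the desired behaviour, not something trim_mean does for you.

```python
import math

def trimmed_mean_sketch(values, proportiontocut):
    """Hypothetical helper: cut the same proportion from each end, then average."""
    # Assumption: NaNs are simply discarded before trimming.
    clean = sorted(v for v in values if not math.isnan(v))
    n = len(clean)
    lowercut = int(proportiontocut * n)  # number of values dropped at the low end
    uppercut = n - lowercut              # exclusive index where the high-end cut starts
    if lowercut >= uppercut:
        raise ValueError("proportiontocut is too large for this sample")
    kept = clean[lowercut:uppercut]
    return sum(kept) / len(kept), len(kept)

# Ten real values plus one NaN; 100 is the outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100, float("nan")]
mean, kept_count = trimmed_mean_sketch(data, 0.1)
print(mean, kept_count)
```

With ten valid values and proportiontocut=0.1, one value is dropped from each end, so eight values (80%) go into the mean.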

Solution 2:

(Edit: the context for this answer was that scipy.stats.trim_mean wasn't documented yet. Now that it's publicly available, use that function instead of rolling your own. My answer below is kept for historical purpose.)


You can also implement the whole thing yourself, following the instructions in the MATLAB documentation.

Here's the code in Python 2:

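from numpy import mean
def trimmean(arr, percent):
    # percent is the total percentage to trim (0-100), split evenly between both ends
    arr = sorted(arr)  # trimming only makes sense on sorted data
    n = len(arr)
    k = int(round(n*(float(percent)/100)/2))  # number of values to drop at each end
    return mean(arr[k:n-k])  # drop exactly k values from each end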

Solution 3:

Here's a manual implementation using floor from the math library...

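import math
from functools import reduce  # builtin in Python 2; this import works in 2.6+ and 3
def trimMean(tlist, tperc):
    # tperc is the total fraction to trim (0-1), split evenly between both ends
    removeN = int(math.floor(len(tlist) * tperc / 2))
    tlist = sorted(tlist)  # sort a copy so the caller's list is not reordered
    if removeN > 0: tlist = tlist[removeN:-removeN]
    return reduce(lambda a,b : a+b, tlist) / float(len(tlist))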
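
Note that the two hand-rolled answers turn the requested amount into a per-end count differently: Solution 2 takes a percentage (0-100) and rounds, while Solution 3 takes a fraction (0-1) and floors. The sketch below isolates just that step so the difference is visible. The function names count_round and count_floor are hypothetical and exist only for this comparison.

```python
import math

def count_round(n, percent):
    # Solution 2's rule: percent is on a 0-100 scale and covers both ends combined.
    return int(round(n * (float(percent) / 100) / 2))

def count_floor(n, fraction):
    # Solution 3's rule: fraction is on a 0-1 scale and covers both ends combined.
    return int(math.floor(n * fraction / 2))

n = 6                          # sample size
k_round = count_round(n, 50)   # 6 * 0.5 / 2 = 1.5, rounded
k_floor = count_floor(n, 0.5)  # 6 * 0.5 / 2 = 1.5, floored
print(k_round, k_floor)
```

For six values and a 50% total trim, rounding removes two values per end and flooring removes one, so the two functions can return different means for the same request. By contrast, the proportiontocut argument of stats.trim_mean is applied to each end separately.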
