fmean is a generic function that computes the (column-wise) mean of x, (optionally) grouped by g and/or weighted by w. The TRA argument can further be used to transform x using its (grouped, weighted) mean.

fmean(x, ...)

# S3 method for default
fmean(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, ...)

# S3 method for matrix
fmean(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, drop = TRUE, ...)

# S3 method for data.frame
fmean(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, drop = TRUE, ...)

# S3 method for grouped_df
fmean(x, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = FALSE, keep.group_vars = TRUE, keep.w = TRUE, ...)

Arguments

x

a numeric vector, matrix, data frame or grouped data frame (class 'grouped_df').

g

a factor, GRP object, atomic vector (internally converted to factor) or a list of vectors / factors (internally converted to a GRP object) used to group x.

w

a numeric vector of (non-negative) weights, may contain missing values.

TRA

an integer or quoted operator indicating the transformation to perform: 1 - "replace_fill" | 2 - "replace" | 3 - "-" | 4 - "-+" | 5 - "/" | 6 - "%" | 7 - "+" | 8 - "*" | 9 - "%%" | 10 - "-%%". See TRA.

na.rm

logical. Skip missing values in x. Defaults to TRUE and implemented at very little computational cost. If na.rm = FALSE, NA is returned as soon as a missing value is encountered.

use.g.names

logical. Make group names and add them to the result as names (default method) or row names (matrix and data frame methods). No row names are generated for data.tables.

drop

matrix and data.frame method: Logical. TRUE drops dimensions and returns an atomic vector if g = NULL and TRA = NULL.

keep.group_vars

grouped_df method: Logical. FALSE removes grouping variables after computation.

keep.w

grouped_df method: Logical. Retain summed weighting variable after computation (if contained in grouped_df).

...

arguments to be passed to or from other methods.

Details

Missing values, as controlled by the na.rm argument, are removed very efficiently by simply skipping them in the computation (so setting na.rm = FALSE on data with no missing values gives no extra speed). Large performance gains can nevertheless be achieved in the presence of missing values if na.rm = FALSE, since the computation then terminates as soon as an NA is encountered and NA is returned (unlike mean, which runs through the data without any such check).
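The two na.rm strategies described above can be sketched in plain base R. This is only an illustration of the idea, not the package's compiled implementation; mean_sketch is a hypothetical helper, and returning NA for an all-missing input is an assumption of the sketch.

```r
# Base R sketch of the two na.rm strategies (illustrative only).
# mean_sketch is a hypothetical helper, not a function of the package.
mean_sketch <- function(x, na.rm = TRUE) {
  total <- 0
  n <- 0L
  for (xi in x) {
    if (is.na(xi)) {
      if (na.rm) next        # na.rm = TRUE: skip the missing value and carry on
      return(NA_real_)       # na.rm = FALSE: stop at the first NA
    }
    total <- total + xi
    n <- n + 1L
  }
  if (n == 0L) NA_real_ else total / n   # assumption: no observed values gives NA
}

x <- c(1, NA, 3)
mean_sketch(x)                 # 2
mean_sketch(x, na.rm = FALSE)  # NA
```

With na.rm = FALSE the loop exits at the second element, which is where the speed gain on data with missing values comes from.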

The weighted mean is computed as sum(x * w) / sum(w). If na.rm = TRUE, missing values are removed from both x and w, i.e. only x[complete.cases(x,w)] and w[complete.cases(x,w)] are used.
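As a small worked example of this formula, the following base R sketch applies the same pairwise-complete rule to a toy vector. wmean_sketch is a hypothetical helper written for illustration, not part of the package.

```r
# Base R sketch of the weighted mean with removal of incomplete (x, w) pairs.
# wmean_sketch is a hypothetical helper, not a function of the package.
wmean_sketch <- function(x, w, na.rm = TRUE) {
  if (na.rm) {
    ok <- complete.cases(x, w)   # TRUE where both x and w are observed
    x <- x[ok]
    w <- w[ok]
  }
  sum(x * w) / sum(w)
}

x <- c(10, 20, NA, 40)
w <- c(1, NA, 2, 3)
wmean_sketch(x, w)   # pairs (10, 1) and (40, 3) remain: (10 + 120) / 4 = 32.5
```

Note that an observation is dropped when either its value or its weight is missing, so the second and third elements contribute nothing here.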

All of this generalizes seamlessly to grouped computations, which are performed in a single pass (without splitting the data) and are therefore extremely fast.
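One way to picture a single-pass grouped mean is a loop that keeps one running sum and one count per group and never splits the data. The base R sketch below shows that idea only; gmean_sketch is a hypothetical helper, and the actual implementation is compiled code that may differ in its details.

```r
# Base R sketch of a single-pass grouped mean: one accumulator slot per group.
# gmean_sketch is a hypothetical helper, not a function of the package.
gmean_sketch <- function(x, g) {
  f <- as.factor(g)
  id <- as.integer(f)            # integer group id for every element of x
  ng <- nlevels(f)
  sums <- numeric(ng)
  counts <- integer(ng)
  for (i in seq_along(x)) {
    if (is.na(x[i]) || is.na(id[i])) next   # skip missing values, as with na.rm = TRUE
    sums[id[i]] <- sums[id[i]] + x[i]
    counts[id[i]] <- counts[id[i]] + 1L
  }
  out <- sums / counts
  names(out) <- levels(f)
  out
}

# One mean per level of cyl; compare with fmean(mpg, mtcars$cyl) in the Examples.
gmean_sketch(mtcars$mpg, mtcars$cyl)
```

The data are visited exactly once, and the only extra memory is two vectors with one entry per group.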

When applied to data frames with groups or drop = FALSE, fmean preserves all column attributes (such as variable labels) but does not distinguish between classed and unclassed objects (thus applying fmean to a factor column will give a 'malformed factor' error). The attributes of the data frame itself are also preserved.

Value

The (w weighted) mean of x, grouped by g, or (if TRA is used) x transformed by its mean, grouped by g.
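To make the difference between the two kinds of return value concrete, the base R sketch below reproduces what a grouped "-" transformation computes: every element minus the mean of its own group, so the result has the same length as x rather than one value per group. It uses stats::ave purely for illustration and is not how the package performs the transformation.

```r
# Base R sketch of a grouped "-" transformation (group-wise demeaning).
mpg <- mtcars$mpg
cyl <- mtcars$cyl

group_means <- ave(mpg, cyl)    # ave() uses FUN = mean by default, expanded to length(mpg)
demeaned <- mpg - group_means   # analogous in spirit to fmean(mpg, cyl, TRA = "-")

length(demeaned) == length(mpg) # TRUE: one value per observation, not per group
tapply(demeaned, cyl, mean)     # each group mean is now zero up to rounding error
```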

Examples

## default vector method
mpg <- mtcars$mpg
fmean(mpg) # Simple mean
#> [1] 20.09062
fmean(mpg, w = mtcars$hp) # Weighted mean: Weighted by hp
#> [1] 17.97245
fmean(mpg, TRA = "-") # Simple transformation: demeaning (See also ?W)
#>  [1]  0.909375  0.909375  2.709375  1.309375 -1.390625 -1.990625 -5.790625
#>  [8]  4.309375  2.709375 -0.890625 -2.290625 -3.690625 -2.790625 -4.890625
#> [15] -9.690625 -9.690625 -5.390625 12.309375 10.309375 13.809375  1.409375
#> [22] -4.590625 -4.890625 -6.790625 -0.890625  7.209375  5.909375 10.309375
#> [29] -4.290625 -0.390625 -5.090625  1.309375
fmean(mpg, mtcars$cyl) # Grouped mean
#>        4        6        8
#> 26.66364 19.74286 15.10000
fmean(mpg, mtcars[8:9]) # another grouped mean.
#>      0.0      0.1      1.0      1.1
#> 15.05000 19.75000 20.74286 28.37143
g <- GRP(mtcars[c(2,8:9)])
fmean(mpg, g) # Pre-computing groups speeds up the computation
#>    4.0.1    4.1.0    4.1.1    6.0.1    6.1.0    8.0.0    8.0.1
#> 26.00000 22.90000 28.37143 20.56667 19.12500 15.05000 15.40000
fmean(mpg, g, mtcars$hp) # Grouped weighted mean
#>    4.0.1    4.1.0    4.1.1    6.0.1    6.1.0    8.0.0    8.0.1
#> 26.00000 22.69409 27.68209 20.42405 19.10087 14.82854 15.35259
fmean(mpg, g, TRA = "-") # Demeaning by group
#>  [1]  0.4333333  0.4333333 -5.5714286  2.2750000  3.6500000 -1.0250000
#>  [7] -0.7500000  1.5000000 -0.1000000  0.0750000 -1.3250000  1.3500000
#> [13]  2.2500000  0.1500000 -4.6500000 -4.6500000 -0.3500000  4.0285714
#> [19]  2.0285714  5.5285714 -1.4000000  0.4500000  0.1500000 -1.7500000
#> [25]  4.1500000 -1.0714286  0.0000000  2.0285714  0.4000000 -0.8666667
#> [31] -0.4000000 -6.9714286
fmean(mpg, g, mtcars$hp, "-") # Group-demeaning using weighted group means
#>  [1]  0.57594937  0.57594937 -4.88209220  2.29913232  3.87145923 -1.00086768
#>  [7] -0.52854077  1.70590551  0.10590551  0.09913232 -1.30086768  1.57145923
#> [13]  2.47145923  0.37145923 -4.42854077 -4.42854077 -0.12854077  4.71790780
#> [19]  2.71790780  6.21790780 -1.19409449  0.67145923  0.37145923 -1.52854077
#> [25]  4.37145923 -0.38209220  0.00000000  2.71790780  0.44741235 -0.72405063
#> [31] -0.35258765 -6.28209220
## data.frame method
fmean(mtcars)
#>        mpg        cyl       disp         hp       drat         wt       qsec
#>  20.090625   6.187500 230.721875 146.687500   3.596562   3.217250  17.848750
#>         vs         am       gear       carb
#>   0.437500   0.406250   3.687500   2.812500
fmean(mtcars, g)
#>            mpg cyl     disp        hp     drat       wt     qsec vs am     gear
#> 4.0.1 26.00000   4 120.3000  91.00000 4.430000 2.140000 16.70000  0  1 5.000000
#> 4.1.0 22.90000   4 135.8667  84.66667 3.770000 2.935000 20.97000  1  0 3.666667
#> 4.1.1 28.37143   4  89.8000  80.57143 4.148571 2.028286 18.70000  1  1 4.142857
#> 6.0.1 20.56667   6 155.0000 131.66667 3.806667 2.755000 16.32667  0  1 4.333333
#> 6.1.0 19.12500   6 204.5500 115.25000 3.420000 3.388750 19.21500  1  0 3.500000
#> 8.0.0 15.05000   8 357.6167 194.16667 3.120833 4.104083 17.14250  0  0 3.000000
#> 8.0.1 15.40000   8 326.0000 299.50000 3.880000 3.370000 14.55000  0  1 5.000000
#>           carb
#> 4.0.1 2.000000
#> 4.1.0 1.666667
#> 4.1.1 1.428571
#> 6.0.1 4.666667
#> 6.1.0 2.500000
#> 8.0.0 3.083333
#> 8.0.1 6.000000
fmean(fgroup_by(mtcars, cyl, vs, am)) # Another way of doing it..
#>   cyl vs am      mpg     disp        hp     drat       wt     qsec     gear
#> 1   4  0  1 26.00000 120.3000  91.00000 4.430000 2.140000 16.70000 5.000000
#> 2   4  1  0 22.90000 135.8667  84.66667 3.770000 2.935000 20.97000 3.666667
#> 3   4  1  1 28.37143  89.8000  80.57143 4.148571 2.028286 18.70000 4.142857
#> 4   6  0  1 20.56667 155.0000 131.66667 3.806667 2.755000 16.32667 4.333333
#> 5   6  1  0 19.12500 204.5500 115.25000 3.420000 3.388750 19.21500 3.500000
#> 6   8  0  0 15.05000 357.6167 194.16667 3.120833 4.104083 17.14250 3.000000
#> 7   8  0  1 15.40000 326.0000 299.50000 3.880000 3.370000 14.55000 5.000000
#>       carb
#> 1 2.000000
#> 2 1.666667
#> 3 1.428571
#> 4 4.666667
#> 5 2.500000
#> 6 3.083333
#> 7 6.000000
head(fmean(mtcars, g, TRA = "-")) # etc..
#>                          mpg cyl      disp        hp        drat         wt
#> Mazda RX4          0.4333333   0  5.000000 -21.66667  0.09333333 -0.1350000
#> Mazda RX4 Wag      0.4333333   0  5.000000 -21.66667  0.09333333  0.1200000
#> Datsun 710        -5.5714286   0 18.200000  12.42857 -0.29857143  0.2917143
#> Hornet 4 Drive     2.2750000   0 53.450000  -5.25000 -0.34000000 -0.1737500
#> Hornet Sportabout  3.6500000   0  2.383333 -19.16667  0.02916667 -0.6640833
#> Valiant           -1.0250000   0 20.450000 -10.25000 -0.66000000  0.0712500
#>                         qsec vs am       gear       carb
#> Mazda RX4          0.1333333  0  0 -0.3333333 -0.6666667
#> Mazda RX4 Wag      0.6933333  0  0 -0.3333333 -0.6666667
#> Datsun 710        -0.0900000  0  0 -0.1428571 -0.4285714
#> Hornet 4 Drive     0.2250000  0  0 -0.5000000 -1.5000000
#> Hornet Sportabout -0.1225000  0  0  0.0000000 -1.0833333
#> Valiant            1.0050000  0  0 -0.5000000 -1.5000000
## matrix method
m <- qM(mtcars)
fmean(m)
#>        mpg        cyl       disp         hp       drat         wt       qsec
#>  20.090625   6.187500 230.721875 146.687500   3.596562   3.217250  17.848750
#>         vs         am       gear       carb
#>   0.437500   0.406250   3.687500   2.812500
fmean(m, g)
#>            mpg cyl     disp        hp     drat       wt     qsec vs am     gear
#> 4.0.1 26.00000   4 120.3000  91.00000 4.430000 2.140000 16.70000  0  1 5.000000
#> 4.1.0 22.90000   4 135.8667  84.66667 3.770000 2.935000 20.97000  1  0 3.666667
#> 4.1.1 28.37143   4  89.8000  80.57143 4.148571 2.028286 18.70000  1  1 4.142857
#> 6.0.1 20.56667   6 155.0000 131.66667 3.806667 2.755000 16.32667  0  1 4.333333
#> 6.1.0 19.12500   6 204.5500 115.25000 3.420000 3.388750 19.21500  1  0 3.500000
#> 8.0.0 15.05000   8 357.6167 194.16667 3.120833 4.104083 17.14250  0  0 3.000000
#> 8.0.1 15.40000   8 326.0000 299.50000 3.880000 3.370000 14.55000  0  1 5.000000
#>           carb
#> 4.0.1 2.000000
#> 4.1.0 1.666667
#> 4.1.1 1.428571
#> 6.0.1 4.666667
#> 6.1.0 2.500000
#> 8.0.0 3.083333
#> 8.0.1 6.000000
head(fmean(m, g, TRA = "-")) # etc..
#>                          mpg cyl      disp        hp        drat         wt
#> Mazda RX4          0.4333333   0  5.000000 -21.66667  0.09333333 -0.1350000
#> Mazda RX4 Wag      0.4333333   0  5.000000 -21.66667  0.09333333  0.1200000
#> Datsun 710        -5.5714286   0 18.200000  12.42857 -0.29857143  0.2917143
#> Hornet 4 Drive     2.2750000   0 53.450000  -5.25000 -0.34000000 -0.1737500
#> Hornet Sportabout  3.6500000   0  2.383333 -19.16667  0.02916667 -0.6640833
#> Valiant           -1.0250000   0 20.450000 -10.25000 -0.66000000  0.0712500
#>                         qsec vs am       gear       carb
#> Mazda RX4          0.1333333  0  0 -0.3333333 -0.6666667
#> Mazda RX4 Wag      0.6933333  0  0 -0.3333333 -0.6666667
#> Datsun 710        -0.0900000  0  0 -0.1428571 -0.4285714
#> Hornet 4 Drive     0.2250000  0  0 -0.5000000 -1.5000000
#> Hornet Sportabout -0.1225000  0  0  0.0000000 -1.0833333
#> Valiant            1.0050000  0  0 -0.5000000 -1.5000000
## method for grouped data frames - created with dplyr::group_by or fgroup_by
library(dplyr)
mtcars %>% group_by(cyl,vs,am) %>% fmean # Ordinary
#> # A tibble: 7 x 11
#>     cyl    vs    am   mpg  disp    hp  drat    wt  qsec  gear  carb
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     4     0     1    26  120.    91  4.43  2.14  16.7     5     2
#> 2     4     1     0  22.9  136.  84.7  3.77  2.94  21.0  3.67  1.67
#> 3     4     1     1  28.4  89.8  80.6  4.15  2.03  18.7  4.14  1.43
#> 4     6     0     1  20.6   155  132.  3.81  2.76  16.3  4.33  4.67
#> 5     6     1     0  19.1  205.  115.  3.42  3.39  19.2   3.5   2.5
#> 6     8     0     0  15.1  358.  194.  3.12  4.10  17.1     3  3.08
#> 7     8     0     1  15.4   326  300.  3.88  3.37  14.6     5     6
mtcars %>% group_by(cyl,vs,am) %>% fmean(hp) # Weighted
#> # A tibble: 7 x 11
#>     cyl    vs    am sum.hp   mpg  disp  drat    wt  qsec  gear  carb
#>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     4     0     1     91    26  120.  4.43  2.14  16.7     5     2
#> 2     4     1     0    254  22.7  134.  3.78  2.90  21.1  3.62  1.62
#> 3     4     1     1    564  27.7  93.9  4.08  2.07  18.5  4.20  1.49
#> 4     6     0     1    395  20.4  153.  3.78  2.76  16.2  4.44  4.89
#> 5     6     1     0    461  19.1  202.  3.46  3.39  19.2  3.53  2.60
#> 6     8     0     0   2330  14.8  363.  3.14  4.16  17.1     3  3.21
#> 7     8     0     1    599  15.4  323.  3.84  3.39  14.6     5  6.24
mtcars %>% group_by(cyl,vs,am) %>% fmean(hp, "-") # Weighted Transform
#> # A tibble: 32 x 11
#> # Groups:   cyl, vs, am [7]
#>      cyl    vs    am    hp    mpg  disp    drat     wt    qsec   gear   carb
#>  * <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>   <dbl>  <dbl>   <dbl>  <dbl>  <dbl>
#>  1     6     0     1   110  0.576  6.65   0.124 -0.137   0.269 -0.443 -0.886
#>  2     6     0     1   110  0.576  6.65   0.124  0.118   0.829 -0.443 -0.886
#>  3     4     1     1    93  -4.88  14.1  -0.230  0.253  0.0696 -0.200 -0.486
#>  4     6     1     0   110   2.30  55.8  -0.375 -0.176   0.271 -0.534  -1.60
#>  5     8     0     0   175   3.87 -3.11 0.00691 -0.719 -0.0649      0  -1.21
#>  6     6     1     0   105  -1.00  22.8  -0.695 0.0691    1.05 -0.534  -1.60
#>  7     8     0     0   245 -0.529 -3.11  0.0669 -0.589   -1.24      0  0.790
#>  8     4     1     0    62   1.71  12.4 -0.0898  0.292   -1.09  0.382  0.382
#>  9     4     1     0    95  0.106  6.46   0.140  0.252    1.81  0.382  0.382
#> 10     6     1     0   123 0.0991 -34.6   0.465 0.0491  -0.869  0.466   1.40
#> # ... with 22 more rows
mtcars %>% group_by(cyl,vs,am) %>% select(mpg,hp) %>% fmean(hp, "-") # Only mpg
#> Adding missing grouping variables: `cyl`, `vs`, `am`
#> # A tibble: 32 x 5
#> # Groups:   cyl, vs, am [7]
#>      cyl    vs    am    hp    mpg
#>  * <dbl> <dbl> <dbl> <dbl>  <dbl>
#>  1     6     0     1   110  0.576
#>  2     6     0     1   110  0.576
#>  3     4     1     1    93  -4.88
#>  4     6     1     0   110   2.30
#>  5     8     0     0   175   3.87
#>  6     6     1     0   105  -1.00
#>  7     8     0     0   245 -0.529
#>  8     4     1     0    62   1.71
#>  9     4     1     0    95  0.106
#> 10     6     1     0   123 0.0991
#> # ... with 22 more rows
mtcars %>% fgroup_by(cyl,vs,am) %>% # Equivalent and faster !
  fselect(mpg,hp) %>% fmean(hp, "-")
#>                      hp         mpg
#> Mazda RX4           110  0.57594937
#> Mazda RX4 Wag       110  0.57594937
#> Datsun 710           93 -4.88209220
#> Hornet 4 Drive      110  2.29913232
#> Hornet Sportabout   175  3.87145923
#> Valiant             105 -1.00086768
#> Duster 360          245 -0.52854077
#> Merc 240D            62  1.70590551
#> Merc 230             95  0.10590551
#> Merc 280            123  0.09913232
#> Merc 280C           123 -1.30086768
#> Merc 450SE          180  1.57145923
#> Merc 450SL          180  2.47145923
#> Merc 450SLC         180  0.37145923
#> Cadillac Fleetwood  205 -4.42854077
#> Lincoln Continental 215 -4.42854077
#> Chrysler Imperial   230 -0.12854077
#> Fiat 128             66  4.71790780
#> Honda Civic          52  2.71790780
#> Toyota Corolla       65  6.21790780
#> Toyota Corona        97 -1.19409449
#> Dodge Challenger    150  0.67145923
#> AMC Javelin         150  0.37145923
#> Camaro Z28          245 -1.52854077
#> Pontiac Firebird    175  4.37145923
#> Fiat X1-9            66 -0.38209220
#> Porsche 914-2        91  0.00000000
#> Lotus Europa        113  2.71790780
#> Ford Pantera L      264  0.44741235
#> Ferrari Dino        175 -0.72405063
#> Maserati Bora       335 -0.35258765
#> Volvo 142E          109 -6.28209220
#>
#> Grouped by: cyl, vs, am [7 | 5 (3.8)]