The function Means() creates a table of group means, optionally with standard errors, confidence intervals, and numbers of valid observations.

Usage

Means(data, ...)
# S3 method for data.frame
Means(data,
    by, weights=NULL, subset=NULL,
    default=NA,
    se=FALSE, ci=FALSE, ci.level=.95,
    counts=FALSE, ...)
# S3 method for formula
Means(data, subset, weights, ...)
# S3 method for numeric
Means(data, ...)
# S3 method for means.table
as.data.frame(x, row.names=NULL, optional=TRUE, drop=TRUE, ...)
# S3 method for xmeans.table
as.data.frame(x, row.names=NULL, optional=TRUE, drop=TRUE, ...)

Arguments

data

an object usually containing data, or a formula.

If data is a numeric vector or an object that can be coerced into a data frame, it is converted into a data frame and the data frame method of Means() is applied to it.

If data is a formula, then a data frame is constructed from the variables in the formula and Means() is applied to this data frame, while the formula is passed on as the by= argument.

by

a formula, a vector of variable names, a data frame, or a list of factors.

If by is a vector of variable names, they are extracted from data to define the groups for which means are computed, while the variables for which the means are computed are those not named in by.

If by is a data frame or a list of factors, these are used to define the groups for which means are computed, while the variables for which the means are computed are those not in by.

If by is a formula, its left-hand side determines the variables for which means are computed, while its right-hand side determines the factors that define the groups.

weights

an optional vector of weights, usually a variable in data.

subset

an optional logical vector to select observations, usually the result of an expression in variables from data.

default

a default value used for empty cells, that is, for combinations of groups without observations.

se

a logical value indicating whether standard errors should be computed.

ci

a logical value indicating whether the limits of confidence intervals should be computed.

ci.level

a number, the confidence level of the confidence intervals.

counts

a logical value indicating whether the numbers of valid observations should be reported.

x

for as.data.frame(), a result of Means().

row.names

an optional character vector. This argument currently has no effect and is included only for compatibility with the standard methods of as.data.frame.

optional

an optional logical value. This argument currently has no effect and is included only for compatibility with the standard methods of as.data.frame.

drop

a logical value that determines whether "empty cells" should be dropped from the resulting data frame.

...

other arguments, either ignored or passed on to other methods where applicable.
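
As a rough illustration of what se, ci, ci.level and counts report, the following base-R sketch recomputes the statistics for a single group, the New England division of the state.x77 data used in the examples. It assumes the usual sample standard error, sd/sqrt(n), and a t-based interval with n-1 degrees of freedom. These formulas reproduce the Mean, SE, N and Lower values printed in the examples, but they are an assumption about the computation, not a quotation of the package source, and all variable names are illustrative.

```r
# Base R only: state.x77 and state.division ship with R's datasets package.
murder <- state.x77[, "Murder"]
x <- murder[state.division == "New England"]

n  <- sum(!is.na(x))                  # number of valid observations (counts=TRUE)
m  <- mean(x, na.rm = TRUE)           # group mean
se <- sd(x, na.rm = TRUE) / sqrt(n)   # assumed standard error of the mean (se=TRUE)

ci.level <- 0.95                      # same default as in the usage above
tq <- qt(1 - (1 - ci.level) / 2, df = n - 1)  # assumed t quantile, n-1 df
lower <- m - tq * se                  # lower confidence limit (ci=TRUE)
upper <- m + tq * se                  # upper confidence limit (ci=TRUE)

round(c(Mean = m, SE = se, N = n, Lower = lower, Upper = upper), 4)
```

Changing ci.level in this sketch widens or narrows the interval around the same mean and standard error.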

Value

An array that inherits classes "means.table" and "table". If Means() was called with se=TRUE or ci=TRUE, then the result additionally inherits class "xmeans.table".
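
To make the relation between such an array of means and its data frame form more concrete, here is a base-R analogue built with tapply() and as.table(). It is only a sketch under the assumption that "empty cells" are group combinations without observations. It does not use or reproduce the package's own classes or methods, and the object names tab, df and df_nonempty are hypothetical.

```r
# Base R only: build a division-by-region array of mean murder rates.
murder <- state.x77[, "Murder"]
tab <- tapply(murder,
              list(division = state.division, region = state.region),
              mean)                   # combinations without observations become NA

# Convert the array to long format: one row per cell of the array.
df <- as.data.frame(as.table(tab), responseName = "Murder")

# Dropping the empty cells keeps only the combinations that actually occur.
df_nonempty <- df[!is.na(df$Murder), ]
df_nonempty
```

The rows that remain keep their original row numbers from the full 9-by-4 layout, which is why the row names are not consecutive.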

Examples

# Preparing example data
USstates <- as.data.frame(state.x77)
USstates <- within(USstates,{
    region <- state.region
    name <- state.name
    abb <- state.abb
    division <- state.division
})
USstates$w <- sample(runif(n=6),size=nrow(USstates),replace=TRUE)

# Using the data frame method
Means(USstates[c("Murder","division","region")],by=c("division","region"))
#> , ,  = Murder
#> 
#>                     region
#> division             Northeast     South North Central      West
#>   New England         3.383333                                  
#>   Middle Atlantic     7.400000                                  
#>   South Atlantic                9.775000                        
#>   East South Central           12.300000                        
#>   West South Central           10.475000                        
#>   East North Central                          7.780000          
#>   West North Central                          3.485714          
#>   Mountain                                              7.187500
#>   Pacific                                               7.260000
#> 
Means(USstates[c("Murder","division","region")],by=USstates[c("division","region")])
#> , ,  = Murder
#> 
#>                     region
#> division             Northeast     South North Central      West
#>   New England         3.383333                                  
#>   Middle Atlantic     7.400000                                  
#>   South Atlantic                9.775000                        
#>   East South Central           12.300000                        
#>   West South Central           10.475000                        
#>   East North Central                          7.780000          
#>   West North Central                          3.485714          
#>   Mountain                                              7.187500
#>   Pacific                                               7.260000
#> 
Means(USstates[c("Murder")],1)
#>         Mean
#> Murder 7.378
Means(USstates[c("Murder","region")],by=c("region"))
#>                
#> region             Murder
#>   Northeast      4.722222
#>   South         10.581250
#>   North Central  5.275000
#>   West           7.215385

# Using the formula method
# One 'dependent' variable
Means(Murder~1, data=USstates)
#>         Mean
#> Murder 7.378
Means(Murder~division, data=USstates)
#>                     
#> division                Murder
#>   New England         3.383333
#>   Middle Atlantic     7.400000
#>   South Atlantic      9.775000
#>   East South Central 12.300000
#>   West South Central 10.475000
#>   East North Central  7.780000
#>   West North Central  3.485714
#>   Mountain            7.187500
#>   Pacific             7.260000
Means(Murder~division, data=USstates,weights=w)
#>                     
#> division                Murder
#>   New England         3.320144
#>   Middle Atlantic     7.290699
#>   South Atlantic      9.435370
#>   East South Central 12.209477
#>   West South Central 10.431353
#>   East North Central  8.150741
#>   West North Central  3.693592
#>   Mountain            7.049596
#>   Pacific             6.900254
Means(Murder~division+region, data=USstates)
#> , ,  = Murder
#> 
#>                     region
#> division             Northeast     South North Central      West
#>   New England         3.383333                                  
#>   Middle Atlantic     7.400000                                  
#>   South Atlantic                9.775000                        
#>   East South Central           12.300000                        
#>   West South Central           10.475000                        
#>   East North Central                          7.780000          
#>   West North Central                          3.485714          
#>   Mountain                                              7.187500
#>   Pacific                                               7.260000
#> 
as.data.frame(Means(Murder~division+region, data=USstates))
#>              division        region    Murder
#> 1         New England     Northeast  3.383333
#> 2     Middle Atlantic     Northeast  7.400000
#> 12     South Atlantic         South  9.775000
#> 13 East South Central         South 12.300000
#> 14 West South Central         South 10.475000
#> 24 East North Central North Central  7.780000
#> 25 West North Central North Central  3.485714
#> 35           Mountain          West  7.187500
#> 36            Pacific          West  7.260000

# Standard errors and counts
Means(Murder~division, data=USstates, se=TRUE, counts=TRUE)
#> , , Statistic = Mean
#> 
#>                     Variable
#> division                 Murder
#>   New England         3.3833333
#>   Middle Atlantic     7.4000000
#>   South Atlantic      9.7750000
#>   East South Central 12.3000000
#>   West South Central 10.4750000
#>   East North Central  7.7800000
#>   West North Central  3.4857143
#>   Mountain            7.1875000
#>   Pacific             7.2600000
#> 
#> , , Statistic = SE
#> 
#>                     Variable
#> division                 Murder
#>   New England         0.4475241
#>   Middle Atlantic     1.7691806
#>   South Atlantic      0.9151015
#>   East South Central  1.0189864
#>   West South Central  1.5040916
#>   East North Central  1.4287757
#>   West North Central  1.0411597
#>   Mountain            0.8565791
#>   Pacific             1.4968634
#> 
#> , , Statistic = N
#> 
#>                     Variable
#> division                 Murder
#>   New England         6.0000000
#>   Middle Atlantic     3.0000000
#>   South Atlantic      8.0000000
#>   East South Central  4.0000000
#>   West South Central  4.0000000
#>   East North Central  5.0000000
#>   West North Central  7.0000000
#>   Mountain            8.0000000
#>   Pacific             5.0000000
#> 
drop(Means(Murder~division, data=USstates, se=TRUE, counts=TRUE))
#>                     Statistic
#> division                   Mean         SE          N
#>   New England         3.3833333  0.4475241  6.0000000
#>   Middle Atlantic     7.4000000  1.7691806  3.0000000
#>   South Atlantic      9.7750000  0.9151015  8.0000000
#>   East South Central 12.3000000  1.0189864  4.0000000
#>   West South Central 10.4750000  1.5040916  4.0000000
#>   East North Central  7.7800000  1.4287757  5.0000000
#>   West North Central  3.4857143  1.0411597  7.0000000
#>   Mountain            7.1875000  0.8565791  8.0000000
#>   Pacific             7.2600000  1.4968634  5.0000000
as.data.frame(Means(Murder~division, data=USstates, se=TRUE, counts=TRUE))
#>             division Variable      Mean        SE N
#> 1        New England   Murder  3.383333 0.4475241 6
#> 2    Middle Atlantic   Murder  7.400000 1.7691806 3
#> 3     South Atlantic   Murder  9.775000 0.9151015 8
#> 4 East South Central   Murder 12.300000 1.0189864 4
#> 5 West South Central   Murder 10.475000 1.5040916 4
#> 6 East North Central   Murder  7.780000 1.4287757 5
#> 7 West North Central   Murder  3.485714 1.0411597 7
#> 8           Mountain   Murder  7.187500 0.8565791 8
#> 9            Pacific   Murder  7.260000 1.4968634 5

# Confidence intervals
Means(Murder~division, data=USstates, ci=TRUE)
#> , , Statistic = Mean
#> 
#>                     Variable
#> division                 Murder
#>   New England         3.3833333
#>   Middle Atlantic     7.4000000
#>   South Atlantic      9.7750000
#>   East South Central 12.3000000
#>   West South Central 10.4750000
#>   East North Central  7.7800000
#>   West North Central  3.4857143
#>   Mountain            7.1875000
#>   Pacific             7.2600000
#> 
#> , , Statistic = Lower
#> 
#>                     Variable
#> division                 Murder
#>   New England         2.2329361
#>   Middle Atlantic    -0.2121697
#>   South Atlantic      7.6111289
#>   East South Central  9.0571304
#>   West South Central  5.6883091
#>   East North Central  3.8130827
#>   West North Central  0.9380882
#>   Mountain            5.1620124
#>   Pacific             3.1040410
#> 
#> , , Statistic = Upper
#> 
#>                     Variable
#> division                 Murder
#>   New England         2.2329361
#>   Middle Atlantic    -0.2121697
#>   South Atlantic      7.6111289
#>   East South Central  9.0571304
#>   West South Central  5.6883091
#>   East North Central  3.8130827
#>   West North Central  0.9380882
#>   Mountain            5.1620124
#>   Pacific             3.1040410
#> 
drop(Means(Murder~division, data=USstates, ci=TRUE))
#>                     Statistic
#> division                   Mean      Lower      Upper
#>   New England         3.3833333  2.2329361  2.2329361
#>   Middle Atlantic     7.4000000 -0.2121697 -0.2121697
#>   South Atlantic      9.7750000  7.6111289  7.6111289
#>   East South Central 12.3000000  9.0571304  9.0571304
#>   West South Central 10.4750000  5.6883091  5.6883091
#>   East North Central  7.7800000  3.8130827  3.8130827
#>   West North Central  3.4857143  0.9380882  0.9380882
#>   Mountain            7.1875000  5.1620124  5.1620124
#>   Pacific             7.2600000  3.1040410  3.1040410
as.data.frame(Means(Murder~division, data=USstates, ci=TRUE))
#>             division Variable      Mean      Lower      Upper
#> 1        New England   Murder  3.383333  2.2329361  2.2329361
#> 2    Middle Atlantic   Murder  7.400000 -0.2121697 -0.2121697
#> 3     South Atlantic   Murder  9.775000  7.6111289  7.6111289
#> 4 East South Central   Murder 12.300000  9.0571304  9.0571304
#> 5 West South Central   Murder 10.475000  5.6883091  5.6883091
#> 6 East North Central   Murder  7.780000  3.8130827  3.8130827
#> 7 West North Central   Murder  3.485714  0.9380882  0.9380882
#> 8           Mountain   Murder  7.187500  5.1620124  5.1620124
#> 9            Pacific   Murder  7.260000  3.1040410  3.1040410

# More than one dependent variable
Means(Murder+Illiteracy~division, data=USstates)
#>                     
#> division                 Murder Illiteracy
#>   New England         3.3833333  0.9166667
#>   Middle Atlantic     7.4000000  1.1666667
#>   South Atlantic      9.7750000  1.5000000
#>   East South Central 12.3000000  1.9500000
#>   West South Central 10.4750000  2.0000000
#>   East North Central  7.7800000  0.8000000
#>   West North Central  3.4857143  0.6285714
#>   Mountain            7.1875000  0.9500000
#>   Pacific             7.2600000  1.1400000
as.data.frame(Means(Murder+Illiteracy~division, data=USstates))
#>             division    Murder Illiteracy
#> 1        New England  3.383333  0.9166667
#> 2    Middle Atlantic  7.400000  1.1666667
#> 3     South Atlantic  9.775000  1.5000000
#> 4 East South Central 12.300000  1.9500000
#> 5 West South Central 10.475000  2.0000000
#> 6 East North Central  7.780000  0.8000000
#> 7 West North Central  3.485714  0.6285714
#> 8           Mountain  7.187500  0.9500000
#> 9            Pacific  7.260000  1.1400000

# Confidence intervals
Means(Murder+Illiteracy~division, data=USstates, ci=TRUE)
#> , , Statistic = Mean
#> 
#>                     Variable
#> division                 Murder Illiteracy
#>   New England         3.3833333  0.9166667
#>   Middle Atlantic     7.4000000  1.1666667
#>   South Atlantic      9.7750000  1.5000000
#>   East South Central 12.3000000  1.9500000
#>   West South Central 10.4750000  2.0000000
#>   East North Central  7.7800000  0.8000000
#>   West North Central  3.4857143  0.6285714
#>   Mountain            7.1875000  0.9500000
#>   Pacific             7.2600000  1.1400000
#> 
#> , , Statistic = Lower
#> 
#>                     Variable
#> division                 Murder Illiteracy
#>   New England         2.2329361  0.6167655
#>   Middle Atlantic    -0.2121697  0.6495522
#>   South Atlantic      7.6111289  1.0807969
#>   East South Central  9.0571304  1.3617494
#>   West South Central  5.6883091  0.8748353
#>   East North Central  3.8130827  0.6758336
#>   West North Central  0.9380882  0.5126359
#>   Mountain            5.1620124  0.3990592
#>   Pacific             3.1040410  0.4343240
#> 
#> , , Statistic = Upper
#> 
#>                     Variable
#> division                 Murder Illiteracy
#>   New England         2.2329361  0.6167655
#>   Middle Atlantic    -0.2121697  0.6495522
#>   South Atlantic      7.6111289  1.0807969
#>   East South Central  9.0571304  1.3617494
#>   West South Central  5.6883091  0.8748353
#>   East North Central  3.8130827  0.6758336
#>   West North Central  0.9380882  0.5126359
#>   Mountain            5.1620124  0.3990592
#>   Pacific             3.1040410  0.4343240
#> 
as.data.frame(Means(Murder+Illiteracy~division, data=USstates, ci=TRUE))
#>              division   Variable       Mean      Lower      Upper
#> 1         New England     Murder  3.3833333  2.2329361  2.2329361
#> 2     Middle Atlantic     Murder  7.4000000 -0.2121697 -0.2121697
#> 3      South Atlantic     Murder  9.7750000  7.6111289  7.6111289
#> 4  East South Central     Murder 12.3000000  9.0571304  9.0571304
#> 5  West South Central     Murder 10.4750000  5.6883091  5.6883091
#> 6  East North Central     Murder  7.7800000  3.8130827  3.8130827
#> 7  West North Central     Murder  3.4857143  0.9380882  0.9380882
#> 8            Mountain     Murder  7.1875000  5.1620124  5.1620124
#> 9             Pacific     Murder  7.2600000  3.1040410  3.1040410
#> 10        New England Illiteracy  0.9166667  0.6167655  0.6167655
#> 11    Middle Atlantic Illiteracy  1.1666667  0.6495522  0.6495522
#> 12     South Atlantic Illiteracy  1.5000000  1.0807969  1.0807969
#> 13 East South Central Illiteracy  1.9500000  1.3617494  1.3617494
#> 14 West South Central Illiteracy  2.0000000  0.8748353  0.8748353
#> 15 East North Central Illiteracy  0.8000000  0.6758336  0.6758336
#> 16 West North Central Illiteracy  0.6285714  0.5126359  0.5126359
#> 17           Mountain Illiteracy  0.9500000  0.3990592  0.3990592
#> 18            Pacific Illiteracy  1.1400000  0.4343240  0.4343240

# Some 'non-standard' but still valid usages:
with(USstates,
     Means(Murder~division+region,subset=region!="Northeast"))
#> , ,  = Murder
#> 
#>                     region
#> division                 South North Central      West
#>   South Atlantic      9.775000                        
#>   East South Central 12.300000                        
#>   West South Central 10.475000                        
#>   East North Central                7.780000          
#>   West North Central                3.485714          
#>   Mountain                                    7.187500
#>   Pacific                                     7.260000
#> 

with(USstates,
     Means(Murder,by=list(division,region)))
#> , , Murder
#> 
#>                    Northeast     South North Central      West
#> New England         3.383333                                  
#> Middle Atlantic     7.400000                                  
#> South Atlantic                9.775000                        
#> East South Central           12.300000                        
#> West South Central           10.475000                        
#> East North Central                          7.780000          
#> West North Central                          3.485714          
#> Mountain                                              7.187500
#> Pacific                                               7.260000
#>