
The function dispersion() extracts the dispersion parameter from a multinomial logit model or computes a dispersion parameter estimate based on a given method. This dispersion parameter can be attached to a model using update(). It can also be given as an argument to summary().

Usage

dispersion(object,method, ...)
# S3 method for class 'mclogit'
dispersion(object,method=NULL,groups=NULL, ...)

Arguments

object

an object that inherits from class "mclogit". When passed to dispersion(), it should be the result of a call to mclogit() or mblogit(), without random effects.

method

a character string, either "Afroz", "Fletcher", "Pearson", or "Deviance", that specifies the estimator of the dispersion; or NULL, in which case the default estimator, "Afroz", is used. The estimators are discussed in Afroz et al. (2020).

groups

an optional formula that specifies groups of observations relevant for the estimation of overdispersion. Predicted probabilities should be constant within groups; otherwise a warning is generated, since the overdispersion estimate may be imprecise.

...

other arguments, ignored or passed to other methods.
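
As a rough illustration of the method argument, the following sketch computes all four documented estimators for one model so that they can be compared side by side. It assumes the packages MASS (for the 'housing' data) and mclogit are installed; the object names fit, methods and phi are illustrative only and are not part of the package.

```r
library(MASS)     # provides the 'housing' data
library(mclogit)  # provides mblogit() and dispersion()

# Baseline-category logit model fitted to frequency-weighted data
fit <- mblogit(Sat ~ Infl + Type + Cont, weights = Freq,
               data = housing)

# The four estimator names accepted by the 'method' argument
methods <- c("Afroz", "Fletcher", "Pearson", "Deviance")

# One dispersion estimate per method, returned as a named numeric vector
phi <- sapply(methods, function(m) dispersion(fit, method = m))
print(phi)
```

Large differences between the estimates can be a hint that the data are sparse or that the grouping of observations needs attention, which is the situation discussed in the reference below.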

References

Afroz, Farzana, Matt Parry, and David Fletcher (2020). "Estimating Overdispersion in Sparse Multinomial Data." Biometrics 76(3): 834-842. doi:10.1111/biom.13194.

Examples

library(MASS)    # For 'housing' data
library(mclogit) # For mblogit() and dispersion()

# Note that with a factor response and frequency-weighted data,
# overdispersion will be overestimated:
house.mblogit <- mblogit(Sat ~ Infl + Type + Cont, weights = Freq,
                         data = housing)
#> 
#> Iteration 1 - deviance = 3493.764 - criterion = 0.9614469
#> Iteration 2 - deviance = 3470.111 - criterion = 0.00681597
#> Iteration 3 - deviance = 3470.084 - criterion = 7.82437e-06
#> Iteration 4 - deviance = 3470.084 - criterion = 7.469596e-11
#> converged

dispersion(house.mblogit,method="Afroz")
#> [1] 2.062653
dispersion(house.mblogit,method="Deviance")
#> [1] 2.175601

summary(house.mblogit)
#> 
#> Call:
#> mblogit(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
#> 
#> Equation for Medium vs Low:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)    -0.4192     0.1729  -2.424 0.015342 *  
#> InflMedium      0.4464     0.1416   3.153 0.001613 ** 
#> InflHigh        0.6649     0.1863   3.568 0.000359 ***
#> TypeApartment  -0.4357     0.1725  -2.525 0.011562 *  
#> TypeAtrium      0.1314     0.2231   0.589 0.555980    
#> TypeTerrace    -0.6666     0.2063  -3.232 0.001230 ** 
#> ContHigh        0.3609     0.1324   2.726 0.006420 ** 
#> 
#> Equation for High vs Low:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)    -0.1387     0.1592  -0.871 0.383570    
#> InflMedium      0.7349     0.1369   5.366 8.03e-08 ***
#> InflHigh        1.6126     0.1671   9.649  < 2e-16 ***
#> TypeApartment  -0.7356     0.1553  -4.738 2.16e-06 ***
#> TypeAtrium     -0.4080     0.2115  -1.929 0.053730 .  
#> TypeTerrace    -1.4123     0.2001  -7.056 1.71e-12 ***
#> ContHigh        0.4818     0.1241   3.881 0.000104 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Approximate residual Deviance: 3470 
#> Number of Fisher scoring iterations:  4 
#> Number of observations:  1681 
#> 

phi.Afroz <- dispersion(house.mblogit,method="Afroz")
summary(house.mblogit, dispersion=phi.Afroz)
#> 
#> Call:
#> mblogit(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
#> 
#> Equation for Medium vs Low:
#>               Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)    -0.4192     0.1729  -2.424 0.016717 *  
#> InflMedium      0.4464     0.1416   3.153 0.002004 ** 
#> InflHigh        0.6649     0.1863   3.568 0.000504 ***
#> TypeApartment  -0.4357     0.1725  -2.525 0.012763 *  
#> TypeAtrium      0.1314     0.2231   0.589 0.557002    
#> TypeTerrace    -0.6666     0.2063  -3.232 0.001559 ** 
#> ContHigh        0.3609     0.1324   2.726 0.007305 ** 
#> 
#> Equation for High vs Low:
#>               Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)    -0.1387     0.1592  -0.871 0.385176    
#> InflMedium      0.7349     0.1369   5.366 3.57e-07 ***
#> InflHigh        1.6126     0.1671   9.649  < 2e-16 ***
#> TypeApartment  -0.7356     0.1553  -4.738 5.58e-06 ***
#> TypeAtrium     -0.4080     0.2115  -1.929 0.055910 .  
#> TypeTerrace    -1.4123     0.2001  -7.056 9.14e-11 ***
#> ContHigh        0.4818     0.1241   3.881 0.000164 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Dispersion:  2.062653  on  130  degrees of freedom
#> Approximate residual Deviance: 3470 
#> Number of Fisher scoring iterations:  4 
#> Number of observations:  1681 
#> 

summary(update(house.mblogit, dispersion="Afroz"))
#> 
#> Call:
#> mblogit(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
#> 
#> Equation for Medium vs Low:
#>               Estimate Std. Error t value Pr(>|t|)  
#> (Intercept)    -0.4192     0.2484  -1.688   0.0938 .
#> InflMedium      0.4464     0.2033   2.196   0.0299 *
#> InflHigh        0.6649     0.2676   2.485   0.0142 *
#> TypeApartment  -0.4357     0.2478  -1.758   0.0811 .
#> TypeAtrium      0.1314     0.3204   0.410   0.6825  
#> TypeTerrace    -0.6666     0.2962  -2.250   0.0261 *
#> ContHigh        0.3609     0.1901   1.898   0.0599 .
#> 
#> Equation for High vs Low:
#>               Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)    -0.1387     0.2287  -0.607 0.545109    
#> InflMedium      0.7349     0.1967   3.737 0.000278 ***
#> InflHigh        1.6126     0.2400   6.718 5.20e-10 ***
#> TypeApartment  -0.7356     0.2230  -3.299 0.001253 ** 
#> TypeAtrium     -0.4080     0.3038  -1.343 0.181568    
#> TypeTerrace    -1.4123     0.2875  -4.913 2.64e-06 ***
#> ContHigh        0.4818     0.1783   2.703 0.007800 ** 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Dispersion:  2.062653  on  130  degrees of freedom
#> Approximate residual Deviance: 3470 
#> Number of Fisher scoring iterations:  4 
#> Number of observations:  1681 
#> 

# To estimate overdispersion accurately with data like the above
# (which usually come from applying 'as.data.frame' to a
# contingency table), the model has to be fitted with the
# optional argument 'from.table=TRUE':
house.mblogit.corrected <- mblogit(Sat ~ Infl + Type + Cont, weights = Freq,
                                   data = housing, from.table=TRUE,
                                   dispersion="Afroz")
#> 
#> Iteration 1 - deviance = 38.84842 - criterion = 0.992521
#> Iteration 2 - deviance = 38.66222 - criterion = 0.004803721
#> Iteration 3 - deviance = 38.6622 - criterion = 3.782555e-07
#> Iteration 4 - deviance = 38.6622 - criterion = 3.666163e-15
#> converged
# Now the estimated dispersion parameter is much smaller than the
# estimate of about 2.06 obtained above.
summary(house.mblogit.corrected)
#> 
#> Call:
#> mblogit(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq, 
#>     dispersion = "Afroz", from.table = TRUE)
#> 
#> Equation for Medium vs Low:
#>               Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   -0.41923    0.02661 -15.757  < 2e-16 ***
#> InflMedium     0.44640    0.02178  20.497  < 2e-16 ***
#> InflHigh       0.66494    0.02867  23.195  < 2e-16 ***
#> TypeApartment -0.43569    0.02654 -16.414  < 2e-16 ***
#> TypeAtrium     0.13137    0.03432   3.827  0.00053 ***
#> TypeTerrace   -0.66657    0.03173 -21.007  < 2e-16 ***
#> ContHigh       0.36085    0.02037  17.716  < 2e-16 ***
#> 
#> Equation for High vs Low:
#>               Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   -0.13874    0.02450  -5.664 2.36e-06 ***
#> InflMedium     0.73486    0.02107  34.882  < 2e-16 ***
#> InflHigh       1.61263    0.02571  62.718  < 2e-16 ***
#> TypeApartment -0.73563    0.02389 -30.795  < 2e-16 ***
#> TypeAtrium    -0.40798    0.03254 -12.539 2.64e-14 ***
#> TypeTerrace   -1.41233    0.03079 -45.866  < 2e-16 ***
#> ContHigh       0.48183    0.01910  25.229  < 2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Dispersion:  0.0236687  on  34  degrees of freedom
#> Approximate residual Deviance: 38.66 
#> Number of Fisher scoring iterations:  4 
#> Number of observations:  1681 
#>
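
The two summaries of house.mblogit above (without a dispersion parameter and after update(..., dispersion="Afroz")) suggest that attaching a dispersion parameter rescales the standard errors by the square root of the estimate. The following sketch checks this arithmetic with base R only, using the values printed above for the "Medium vs Low" equation. The variable names are illustrative, and the tolerance allows for the four-decimal rounding of the printed output; this is a consistency check on the printed numbers, not a description of the package internals.

```r
# Dispersion estimate and standard errors copied from the output above
phi <- 2.062653
se_unscaled <- c(0.1729, 0.1416, 0.1863, 0.1725, 0.2231, 0.2063, 0.1324)
se_printed  <- c(0.2484, 0.2033, 0.2676, 0.2478, 0.3204, 0.2962, 0.1901)

# Rescale the unscaled standard errors by sqrt(phi)
se_scaled <- se_unscaled * sqrt(phi)

# Largest discrepancy from the standard errors printed after update()
max_diff <- max(abs(se_scaled - se_printed))
print(round(se_scaled, 4))
print(max_diff < 2e-4)
```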