coulminer - 23 days ago 10
R Question

# Printing all glm coefficients in R

How do I print glm coefficients for all factor levels, including reference level?
summary(glm_obj) prints only the values that deviate from reference values.

I know that those are 0's, but I need this for integration, i.e. telling other systems what factor levels can happen at all.

Sorry if it's too simple, could not find anywhere.

Thanks

To illustrate the problem I am facing:

``````> glm(Petal.Width~Species,data=iris)

Call:  glm(formula = Petal.Width ~ Species, data = iris)

Coefficients:
(Intercept)  Speciesversicolor   Speciesvirginica
0.246              1.080              1.780

Degrees of Freedom: 149 Total (i.e. Null);  147 Residual
Null Deviance:      86.57
Residual Deviance: 6.157    AIC: -45.29`
``````

The model description above yields only coefficients for versicolor and virginica, which is, as Dason has noted, absolutely fine from the point of view of the model itself.

However, I needed to share the model with another application, which must know what levels of Species to expect (and e.g. issue a warning in once a new, unstudied level appears).

Summary() gives the same results:

``````> summary(glm(Petal.Width~Species,data=iris))

Call:
glm(formula = Petal.Width ~ Species, data = iris)

Deviance Residuals:
Min      1Q  Median      3Q     Max
-0.626  -0.126  -0.026   0.154   0.474

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)        0.24600    0.02894    8.50 1.96e-14 ***
Speciesversicolor  1.08000    0.04093   26.39  < 2e-16 ***
Speciesvirginica   1.78000    0.04093   43.49  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 0.04188163)

Null deviance: 86.5699  on 149  degrees of freedom
Residual deviance:  6.1566  on 147  degrees of freedom
AIC: -45.285

Number of Fisher Scoring iterations: 2
``````

Answered to myself as I think this method fits the purpose better than what has been proposed.

Sharing predictive models is not what the summary.glm method supposed to do, therefore summary(model) does not say much about the data the model is applied to.

There is a solution though -- use PMML, it allows describing both, the model, and the data it should be applied to.

Example:

``````> library(pmml)
> pmml(glm(Petal.Width~Species,data=iris))
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_2 http://www.dmg.org/v4-2/pmml-4-2.xsd">
<Extension name="user" value="dmitrijsl" extender="Rattle/PMML"/>
<Application name="Rattle/PMML" version="1.4"/>
<Timestamp>2014-07-15 15:07:51</Timestamp>
<DataField name="Petal.Width" optype="continuous" dataType="double"/>
<DataField name="Species" optype="categorical" dataType="string">
<Value value="setosa"/>
<Value value="versicolor"/>
<Value value="virginica"/>
</DataField>
...
``````

Now also Setosa is in the list for the receiver system to know what to expect, and the model description is there:

``````...
<ParameterList>
<Parameter name="p0" label="(Intercept)"/>
<Parameter name="p1" label="Speciesversicolor"/>
<Parameter name="p2" label="Speciesvirginica"/>
</ParameterList>
<FactorList>
<Predictor name="Species"/>
</FactorList>
<CovariateList/>
<PPMatrix>
<PPCell value="versicolor" predictorName="Species" parameterName="p1"/>
<PPCell value="virginica" predictorName="Species" parameterName="p2"/>
</PPMatrix>
<ParamMatrix>
<PCell parameterName="p0" df="1" beta="0.245999999999997"/>
<PCell parameterName="p1" df="1" beta="1.08"/>
<PCell parameterName="p2" df="1" beta="1.78"/>
</ParamMatrix>
``````

Source (Stackoverflow)