Noobie Noobie - 1 month ago 8
LaTeX Question

R: How to get a proper LaTeX regression table from a data frame?

Consider the following example

inds <- c('var1','','var2','')
model1 <- c(10.2,0.00,0.02,0.3)
model2 <- c(11.2,0.01,0.02,0.023)

df <- data.frame(inds, model1, model2)
df
inds model1 model2
var1  10.20 11.200
       0.00  0.010
var2   0.02  0.020
       0.30  0.023


Here you have the output of a custom regression model with coefficients and P-values (I could show other statistics instead if needed, say, the standard errors of the coefficients).

There are two variables, var1 and var2.

For instance, in model1, var1 comes with a coefficient of 10.2 and a P-value of 0.00, while var2 has a coefficient of 0.02 and a P-value of 0.30.

Is there a package that handles these (custom) tables automatically and can create a neat LaTeX table with stars for significance?

Thanks!

Answer

Here is a solution using texreg.

Note that texreg >= 1.36.18 is required (as available from its R-Forge project homepage).

The information you are providing in the data frame (coefs and p-values) could be arranged in arbitrary ways in a data frame. Therefore we need to write code that selects these data from the appropriate places in the data frame and uses them to create a texreg object. As you are requesting a generic (and presumably re-usable) solution, we should wrap the code in a re-usable function. I'll call this function extractFromDataFrame. So here is the function, which extracts the information from the data frame and creates a list of texreg objects for the different models:

library("texreg")

extractFromDataFrame <- function (dataFrame) {
  # odd rows hold the coefficients, even rows the matching p-values
  coef.row.indices <- seq(1, nrow(dataFrame) - 1, 2)
  pval.row.indices <- seq(2, nrow(dataFrame), 2)
  texregObjects <- list()
  # column 1 holds the labels; every further column is one model
  for (i in 2:ncol(dataFrame)) {
    coefs <- dataFrame[coef.row.indices, i]
    coefnames <- as.character(dataFrame[coef.row.indices, 1])
    pvalues <- dataFrame[pval.row.indices, i]
    # build a texreg object from the extracted pieces
    tr <- createTexreg(coef = coefs, coef.names = coefnames, pvalues = pvalues)
    texregObjects[[i - 1]] <- tr
  }
  return(texregObjects)
}

In this function, we first define in which rows of the data frame the coefficients are stored and in which rows the p-values are stored. Then we create an empty list in which we store the texreg objects. We iterate through all columns but the first, as the first one contains only the labels. For each of these model columns, we save the coefficients, their names, and the p-values, and then we hand them over to the createTexreg constructor, which is a function that creates a texreg object for us based on the data. We add the texreg object to the list. In the end, we return the list of texreg objects.
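To make the row selection concrete, here is a minimal base-R sketch (no texreg needed) that shows which values end up in which vector for the first model column. The name regTable is just an illustrative choice for the question's data frame, not something texreg requires.

```r
# Rebuild the example table from the question (regTable is a hypothetical name).
regTable <- data.frame(
  inds   = c("var1", "", "var2", ""),
  model1 = c(10.2, 0.00, 0.02, 0.3),
  model2 = c(11.2, 0.01, 0.02, 0.023),
  stringsAsFactors = FALSE
)

# Odd rows hold coefficients, even rows hold the matching p-values.
coef.row.indices <- seq(1, nrow(regTable) - 1, 2)  # rows 1 and 3
pval.row.indices <- seq(2, nrow(regTable), 2)      # rows 2 and 4

# The pieces that would be handed to createTexreg for the first model column.
coefnames <- as.character(regTable[coef.row.indices, 1])
coefs     <- regTable[coef.row.indices, "model1"]
pvalues   <- regTable[pval.row.indices, "model1"]

print(data.frame(coefnames, coefs, pvalues))
```

If your table interleaves the rows differently, only the two index vectors should need to change.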

We can now apply the function to any data frame that looks like the one provided in the question, with an arbitrary number of model columns (at least one in addition to the label column). After applying the function to the df object, we can print the contents of the list to make sure that we did everything right:

tr <- extractFromDataFrame(df)
tr

And indeed, the results contain the relevant data:

[[1]]

No standard errors were defined for this texreg object.
No decimal places were defined for the GOF statistics.

     coef.   p
var1 10.20 0.0
var2  0.02 0.3

No GOF block defined.

[[2]]

No standard errors were defined for this texreg object.
No decimal places were defined for the GOF statistics.

     coef.     p
var1 11.20 0.010
var2  0.02 0.023

No GOF block defined.

Now we can simply hand the list of texreg objects over to screenreg, e.g., screenreg(tr), with the following result:

========================
      Model 1    Model 2
------------------------
var1  10.20 ***  11.20 *
var2   0.02       0.02 *
========================
*** p < 0.001, ** p < 0.01, * p < 0.05

We could also hand the list to htmlreg to create an HTML table or, as requested in the original question, to texreg to create a LaTeX table. The output of texreg(tr, single.row = TRUE) looks like this:

\begin{table}
\begin{center}
\begin{tabular}{l c c }
\hline
 & Model 1 & Model 2 \\
\hline
var1 & $10.20^{***}$ & $11.20^{*}$ \\
var2 & $0.02$        & $0.02^{*}$  \\
\hline
\multicolumn{3}{l}{\scriptsize{$^{***}p<0.001$, $^{**}p<0.01$, $^*p<0.05$}}
\end{tabular}
\caption{Statistical models}
\label{table:coefficients}
\end{center}
\end{table}

This solution can be modified to accommodate standard errors, confidence intervals, or goodness-of-fit statistics.
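As a rough sketch of such a modification, suppose the data frame stored three rows per variable (coefficient, standard error, p-value) instead of two. That layout, the function name extractWithSE, and the standard errors below are all made up for illustration; only the se argument of createTexreg is taken from texreg itself.

```r
library("texreg")

# Hypothetical layout: three rows per variable (coefficient, standard error,
# p-value). The standard errors are invented purely for illustration.
regTable <- data.frame(
  inds   = c("var1", "", "", "var2", "", ""),
  model1 = c(10.2, 1.50, 0.00, 0.02, 0.019, 0.3),
  model2 = c(11.2, 4.30, 0.01, 0.02, 0.009, 0.023),
  stringsAsFactors = FALSE
)

extractWithSE <- function(dataFrame) {
  coef.rows <- seq(1, nrow(dataFrame), 3)  # first row of each block of three
  se.rows   <- coef.rows + 1               # second row: standard errors
  pval.rows <- coef.rows + 2               # third row: p-values
  coefnames <- as.character(dataFrame[coef.rows, 1])
  # one texreg object per model column (all columns except the label column)
  lapply(2:ncol(dataFrame), function(i) {
    createTexreg(
      coef.names = coefnames,
      coef       = dataFrame[coef.rows, i],
      se         = dataFrame[se.rows, i],
      pvalues    = dataFrame[pval.rows, i]
    )
  })
}

trSE <- extractWithSE(regTable)
screenreg(trSE)
```

Confidence intervals and goodness-of-fit statistics should work along the same lines via the ci.low, ci.up, gof.names, gof, and gof.decimal arguments of createTexreg.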

Various texreg arguments can be used to customize the output, including the use of the booktabs package or decimal alignment via dcolumn, for example.
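For example, a call along the following lines should produce a booktabs-style table with decimal-aligned columns. The model names, caption, and label are placeholders chosen for this sketch, and the two texreg objects are rebuilt directly from the numbers in the question so that the snippet runs on its own.

```r
library("texreg")

# Two texreg objects built directly from the numbers in the question.
tr <- list(
  createTexreg(coef.names = c("var1", "var2"),
               coef = c(10.2, 0.02), pvalues = c(0.00, 0.3)),
  createTexreg(coef.names = c("var1", "var2"),
               coef = c(11.2, 0.02), pvalues = c(0.01, 0.023))
)

# booktabs = TRUE uses \toprule, \midrule, and \bottomrule instead of \hline;
# dcolumn = TRUE aligns the numbers at the decimal point.
# Both need the corresponding LaTeX packages in the document preamble.
texreg(
  tr,
  single.row = TRUE,
  booktabs = TRUE,
  dcolumn = TRUE,
  use.packages = FALSE,  # do not print \usepackage lines before the table
  custom.model.names = c("Baseline", "Extended"),  # placeholder names
  caption = "Regression results",                  # placeholder caption
  label = "tab:regression"                         # placeholder label
)
```

With use.packages = FALSE, remember to add \usepackage{booktabs} and \usepackage{dcolumn} to the preamble yourself.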

Please note that it is better not to call your data frame df, because df is already the name of a function in the stats package (the density of the F distribution), and reusing the name can cause confusion.
