Martin Alexander - 3 months ago
R Question

Standardized regression coefficients using lm() and scale() differ from those using lm.beta() or cor()

I have two variables and I want to find the correlation between them. The issue is that I seem to be getting different results depending on which method I use.

One method I know of is to call lm() with both the independent and dependent variables wrapped in scale().

So with the variables below that would look like:

lm(scale(mainDataframe$relativeFemHappy) ~ scale(mainDataframe$allRights))


Other methods I know of are to simply use the cor() function or the lm.beta() function.

So that would look like:

cor(mainDataframe$relativeFemHappy, mainDataframe$allRights, use="pairwise.complete.obs")


and

library(lm.beta)
lm.beta(lm(mainDataframe$relativeFemHappy ~ mainDataframe$allRights))


The issue is that the results I'm getting are different:

> lm(scale(mainDataframe$relativeFemHappy) ~ scale(mainDataframe$allRights))

Call:
lm(formula = scale(mainDataframe$relativeFemHappy) ~ scale(mainDataframe$allRights))

Coefficients:
                   (Intercept)  scale(mainDataframe$allRights)  
                     -0.002478                       -0.272812  

> lm.beta(lm(mainDataframe$relativeFemHappy ~ mainDataframe$allRights))
mainDataframe$allRights
-0.2550056
> cor(mainDataframe$relativeFemHappy, mainDataframe$allRights, use="pairwise.complete.obs")
[1] -0.2550056


So with the first method, using lm() and scale(), I'm getting a coefficient of -0.2728, while the lm.beta() and cor() methods give me a coefficient of -0.2550.

I would love to know what's causing this. Thanks.

mainDataframe.allRights mainDataframe.relativeFemHappy
1 1.3333333 0.0653854461
2 NA -0.0943358596
3 1.0000000 -0.3559994842
4 2.0000000 0.0542260426
5 1.3333333 -0.1125838731
6 NA 0.0647522523
7 1.6666667 -0.1119041715
8 1.0000000 0.0564865005
9 1.3333333 0.2199685735
10 1.3333333 0.3016471599
11 0.6666667 0.6291666667
12 NA -0.1322754782
13 NA -0.7031950673
14 1.6666667 0.5382193869
15 0.6666667 0.0515831008
16 1.3333333 -0.2406053407
17 NA -0.3188695664
18 1.3333333 -0.2132530855
19 1.3333333 -0.1051805386
20 1.3333333 0.5137880544
21 1.3333333 -0.1591651057
22 NA 0.3518542315
23 1.6666667 -0.3134255036
24 2.3333333 -0.0353351079
25 1.3333333 -0.3069227981
26 1.3333333 0.4518921825
27 1.3333333 -0.0106520766
28 2.0000000 -0.1744353706
29 1.3333333 -0.5486947791
30 2.0000000 -0.1683776581
31 2.0000000 -0.1141202547
32 2.6666667 0.1352620331
33 2.3333333 NaN
34 1.3333333 -0.4105513765
35 1.3333333 -0.3623256900
36 1.3333333 -0.1843162243
37 2.0000000 -0.2813061511
38 1.3333333 -0.2735289841
39 1.0000000 -0.3703465553
40 1.3333333 -0.0399500250
41 1.3333333 -0.0798679868
42 NA -0.1494736842
43 0.6666667 0.2510419233
44 2.3333333 -0.1636337231
45 3.0000000 -0.2588880820
46 0.3333333 0.5142450779
47 1.6666667 -0.0927171343
48 1.3333333 0.2302559822
49 1.3333333 -0.1605876144
50 1.3333333 0.0224237663
51 1.3333333 -0.3474095401
52 1.3333333 0.0879899428
53 NA -0.2959860780
54 2.0000000 -0.0678765880
55 2.3333333 -0.2593966749
56 2.6666667 -0.3066565041
57 1.6666667 0.0659408848
58 1.6666667 0.3153641680
59 1.3333333 -0.4080779390
60 1.3333333 0.1695402299
61 2.0000000 -0.1246312234
62 1.6666667 -0.4569675001
63 2.0000000 0.1021491160
64 1.3333333 -0.1375955915
65 NA 0.0007769658
66 1.3333333 -0.0427901329
67 2.3333333 0.0918414523
68 1.3333333 0.1675599213
69 1.3333333 0.0667226151
70 1.0000000 0.6140938930
71 1.3333333 0.0139284251
72 2.0000000 -0.0253022876
73 1.3333333 0.0767676768
74 1.3333333 -0.3298592768
75 0.3333333 0.4164929718
76 NA 0.2050189429
77 1.6666667 0.1017706560
78 0.6666667 0.6626247039
79 1.3333333 0.1182371519
80 0.0000000 -0.1336948622
81 0.6666667 0.2007353845
82 2.0000000 -0.0111828561
83 1.3333333 0.0728503690
84 1.3333333 0.3259760711
85 NA 0.1190302497
86 1.0000000 0.1194620625
87 0.6666667 0.0453267607
88 2.0000000 0.0911983186
89 1.3333333 0.1566666667
90 0.0000000 0.0907911338
91 1.6666667 0.0898769242
92 NA -0.1525686518
93 3.0000000 -0.0293211263
94 1.6666667 0.6627064577
95 1.3333333 0.5176272062
96 NA 0.4856334661
97 2.0000000 -0.0205725729
98 1.6666667 -0.2117421455
99 1.3333333 -0.0930969019
100 2.0000000 -0.0367682733
101 1.3333333 0.3817815271
102 NA -0.2265089463
103 NA 0.1038953135
104 NA -0.0329032045
105 1.0000000 -0.0223175342
106 NA 0.0393768703
107 NA -0.1385969952
108 NA 0.1356859273
109 2.0000000 0.0107975036
110 NA 0.0979167949
111 0.6666667 -0.0342344955
112 NA -0.0050468143
113 NA -0.0895239553
114 NA -0.0465631929
115 NA 0.3002016217
116 2.6666667 -0.1137102105
117 0.6666667 0.0882938923
118 NA 0.4241776220
119 NA 0.1236421047
120 NA 0.2142170169
121 NA 0.0387629732
122 1.0000000 -0.0567106487
123 NA 0.0336110922
124 NA 0.1359546531
125 NA -0.0764485186
126 NA 0.3689020044
127 NA 0.4295649361
128 NA -0.1044761961
129 1.0000000 -0.2089427217
130 NA 0.2015707900
131 1.6666667 -0.0740150225
132 NA 0.0851963992
133 NA 0.1023532212
134 1.3333333 -0.0808608360
135 NA 0.2427526973
136 NA -0.0551786818
137 3.0000000 0.0660331924
138 NA -0.3727922200
139 NA 0.1102447610
140 NA -0.2057888977
141 NA -0.1719448695
142 2.3333333 -0.2175613073
143 NA -0.2613899294
144 NA 0.0756224178
145 1.3333333 -0.1586860559
146 NA -0.1028082059
147 1.6666667 -0.0093129029
148 NA 0.2982334465
149 NA -0.2291732892
150 NA -0.3709208321
151 NA 0.0254403690
152 NA -0.2755686789
153 NA 0.1773620638
154 0.6666667 0.1088370006
155 NA 0.0951056627
156 NA -0.3433133733
157 NA -0.0837993745
158 NA -0.3437314283
159 NA -0.2230338635
160 NA 0.0075808250
161 NA 0.0706623401
162 NA 0.0185266374
163 NA 0.0063326421
164 NA 0.0671828617
165 NA -0.1791227448
166 NA -0.0233741378
167 NA -0.0233616222
168 NA 0.5177982205
169 NA -0.0210875370
170 NA -0.0955256618
171 NA 0.2049268262
172 NA -0.0165755643
173 NA 0.3305190592
174 NA 0.1140276893
175 NA -0.1494444444
176 NA 0.0485406351
177 NA 0.1383207807
178 NA -0.0726862507
179 NA 0.0389694042

Answer

Take a look at this:

## normalization before joint removal of `NA`
attributes(scale(mainDataframe))[3:4]
#$`scaled:center`
#       allRights relativeFemHappy 
#     1.483660123      0.005227296 

#$`scaled:scale`
#       allRights relativeFemHappy 
#       0.5926344        0.2411674 

## normalization after joint removal of `NA`
x <- na.omit(mainDataframe)
attributes(scale(x))[3:4]
#$`scaled:center`
#       allRights relativeFemHappy 
#      1.47524752       0.00462978 

#$`scaled:scale`
#       allRights relativeFemHappy 
#       0.5894377        0.2580075 

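As you can see, the mean and standard deviation are different. In your first call, scale() standardizes each variable using all of its own non-missing values, and only afterwards does lm() drop the rows where either variable is missing. On the rows that actually enter the fit, the scaled variables no longer have mean 0 and standard deviation 1, so the slope is not the correlation.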
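
To make the size of the discrepancy concrete, here is a short derivation sketch. The notation is mine, not from the question: r is the correlation on the complete cases, s with subscript "cc" is a standard deviation computed on the complete cases, and s with subscript "all" is the one scale() computes from every non-missing value of that variable. It assumes lm() uses its default behaviour of dropping incomplete rows.

```latex
% Scaled variables, using the full-sample means and standard deviations:
%   x' = (x - \bar{x}_{all}) / s_{x,all},   y' = (y - \bar{y}_{all}) / s_{y,all}
% OLS slope of y' on x', fitted on the complete cases only:
\hat{\beta}
  = \frac{\operatorname{cov}_{cc}(x', y')}{\operatorname{var}_{cc}(x')}
  = \frac{\operatorname{cov}_{cc}(x, y) / (s_{x,all}\, s_{y,all})}
         {s_{x,cc}^{2} / s_{x,all}^{2}}
  = r \cdot \frac{s_{y,cc}}{s_{y,all}} \cdot \frac{s_{x,all}}{s_{x,cc}}
```

The slope equals r only when both ratios are 1, that is, when the variables are standardized on the same rows that lm() fits.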

Now, if you fit lm() on the complete cases x, you get what you expected:

lm(scale(relativeFemHappy) ~ scale(allRights) - 1, data = x)
#Coefficients:
#scale(allRights)  
#          -0.255  

Note that I used -1 in the formula to drop the intercept, which is zero anyway once both variables are centered on the same rows.
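
If you want to reproduce the effect without the original data, here is a minimal sketch on simulated data. Everything in it (the data frame `toy`, the seed, the missingness pattern) is made up for illustration and is not taken from the question; it uses base R only.

```r
## Hypothetical toy data: two correlated variables with unequal missingness
set.seed(42)                       # arbitrary seed, only for repeatability
n <- 200
x <- rnorm(n)
y <- -0.3 * x + rnorm(n)
x[sample(n, 60)] <- NA             # many missing predictor values
y[sample(n, 10)] <- NA             # a few missing response values
toy <- data.frame(x = x, y = y)

## (a) scale() each variable on all its non-missing values; lm() drops rows afterwards
slope_before <- unname(coef(lm(scale(y) ~ scale(x), data = toy))[2])

## (b) drop incomplete rows first, then scale and fit
cc <- na.omit(toy)
slope_after <- unname(coef(lm(scale(y) ~ scale(x), data = cc))[2])

## correlation on the complete pairs
r <- cor(toy$x, toy$y, use = "pairwise.complete.obs")

## factor by which (a) is off: ratio of complete-case to full-sample
## standard deviations for y, times the inverse ratio for x
ratio <- (sd(cc$y) / sd(toy$y, na.rm = TRUE)) *
         (sd(toy$x, na.rm = TRUE) / sd(cc$x))

print(c(before = slope_before, after = slope_after, cor = r, cor_times_ratio = r * ratio))
```

In this sketch the slope from (b) should agree with cor(), while the slope from (a) should agree with the correlation multiplied by the ratio of standard deviations.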