Back-of-the-envelope low-quality musings: #takes

Longform stuff I've put time and thought into is at #effortposts.

To hear from academics and policymakers, #interviews.

Compiled scribbles on what I read can be found in #notes.

Topic-related tags include #economics, #coronavirus and #miscellaneous.

# Reg Y X

04 Feb 2021

In a similar vein to my previous set of posts on growth, I’m planning to write something about money and monetary policy. The reason monetary policy matters is because we think it might affect real variables, but how can we verify this to be the case? This post is mostly just a quick-and-dirty attempt at messing around in Stata to demonstrate this, with a bonus bit of content at the end involving more Stata regressions.

Vector Autoregressions

Suppose we are trying to see how some variables affect each other - we’d like to produce some sort of statistical model that describes their interaction, explaining how an exogenous shock to one might affect the others. We would want to fit this to historical data. But we may not want to specify a particular structure for our model, due to not knowing the exact causal relationships - luckily, a vector autoregression offers an atheoretical way of doing so, meaning that the epistemic requirements are reduced.

A VAR involves regressing the $k$ endogenous variables against the variables themselves and their lags. The order of a VAR describes its lag. The general form of a $k$ variable $p$-th order VAR of VAR($p$) is given as the following.

$\mathbf{y}_t = \mathbf{c} + \mathbf{A}_1 \mathbf{y}_{t-1} + ... + \mathbf{A}_p \mathbf{y}_{t-p} + \mathbf{u}_t$

This is has several components.

• There is a $k \times 1$ vector $\mathbf{y}$ of endogenous variables.
• There is a $k \times k$ matrix $\mathbf{A}$ of parameters.
• There is a $k \times 1$ vector $\mathbf{c}$ of constants.
• There is a $k \times 1$ vector $\mathbf{u}$ of the white noise error term.

By plugging in historical data of the Great Moderation, we can estimate these equations. Instead of going through the statistical theory of VARs, let’s just look at how we can use this in Stata. Suppose we want to look at the impact of a change in the policy rate on real output, the unemployment rate and inflation. We can take the federal funds rate, real GDP, U3 unemployment and chain-weighted PCE inflation from FRED. We run a battery of tests to determine the optimal lag, and then run the VAR.

clear all

* Import data from FRED and set time as quarterly.
import fred FEDFUNDS GDPC1 UNRATE PCEPI, daterange(01Jan1985 01Jan2008) aggregate(quarterly, eop)
generate dateq = qofd(daten)
tsset dateq, quarterly

* Make the variables for VAR.
rename FEDFUNDS ffr
gen ln_rgdp = log(GDPC1)
gen ln_u3 = log(UNRATE)
gen ln_pi = log(PCEPI)

* Find the best lag.
varsoc ln_pi ln_u3 ln_rgdp ffr, maxlag(10)

Selection-order criteria
Sample:  1987q3 - 2008q1                     Number of obs      =        83
+---------------------------------------------------------------------------+
|lag |    LL      LR      df    p      FPE       AIC      HQIC      SBIC    |
|----+----------------------------------------------------------------------|
|  0 |  142.962                      4.1e-07  -3.34849  -3.30166  -3.23192  |
|  1 |  807.707  1329.5   16  0.000  6.7e-14  -18.9809  -18.7467   -18.398* |
|  2 |  842.698  69.981   16  0.000  4.3e-14* -19.4385*  -19.017* -18.3894  |
|  3 |  856.645  27.895   16  0.033  4.5e-14   -19.389  -18.7802  -17.8736  |
|  4 |  871.333  29.376   16  0.022  4.7e-14  -19.3574  -18.5613  -17.3757  |
|  5 |  879.566  16.465   16  0.421  5.8e-14  -19.1703  -18.1868  -16.7223  |
|  6 |  886.949  14.767   16  0.542  7.4e-14  -18.9626  -17.7918  -16.0484  |
|  7 |  903.747  33.596   16  0.006  7.6e-14  -18.9819  -17.6237  -15.6013  |
|  8 |  927.558  47.621   16  0.000  6.7e-14  -19.1701  -17.6246  -15.3232  |
|  9 |  934.709  14.302   16  0.576  9.0e-14  -18.9568  -17.2241  -14.6437  |
| 10 |  958.197  46.976*  16  0.000  8.4e-14  -19.1373  -17.2172  -14.3579  |
+---------------------------------------------------------------------------+
Endogenous:  ln_pi ln_u3 ln_rgdp ffr
Exogenous:  _cons

* Run the VAR with lags of 1 and 2 periods.
quietly var ln_pi ln_u3 ln_rgdp ffr, lags (1/2)


This has produced a whole host of coefficients for estimating all of our equations, which makes it difficult to know what to pay attention to. What we can do is focus on two things in particular. Firstly, we can look for Granger causality - if $x$ Granger causes $y$, that means knowing the lagged values of $x$ in conjunction with the lagged values of $y$ will provide statistically important information about the future values of $y$, compared to just knowing the lagged values of $y$. Notice this is actually a very loose use of the word “causality” - indeed, it is probably better described as precedence or forecasting ability. Secondly, we can look at impulse response functions, which tell us how an exogenous change in one variable might affect the rest.

* Run a Granger test.
vargranger

Granger causality Wald tests
+------------------------------------------------------------------+
|          Equation           Excluded |   chi2     df Prob > chi2 |
|--------------------------------------+---------------------------|
|             ln_pi              ln_u3 |   2.129     2    0.345    |
|             ln_pi            ln_rgdp |  4.2549     2    0.119    |
|             ln_pi                ffr |   5.602     2    0.061    |
|             ln_pi                ALL |  9.0466     6    0.171    |
|--------------------------------------+---------------------------|
|             ln_u3              ln_pi |  .38453     2    0.825    |
|             ln_u3            ln_rgdp |   10.49     2    0.005    |
|             ln_u3                ffr |  1.7722     2    0.412    |
|             ln_u3                ALL |  14.438     6    0.025    |
|--------------------------------------+---------------------------|
|           ln_rgdp              ln_pi |  2.6259     2    0.269    |
|           ln_rgdp              ln_u3 |  2.1724     2    0.338    |
|           ln_rgdp                ffr |  7.5135     2    0.023    |
|           ln_rgdp                ALL |  12.889     6    0.045    |
|--------------------------------------+---------------------------|
|               ffr              ln_pi |  3.4996     2    0.174    |
|               ffr              ln_u3 |   6.836     2    0.033    |
|               ffr            ln_rgdp |  1.9432     2    0.378    |
|               ffr                ALL |  21.757     6    0.001    |
+------------------------------------------------------------------+

* Look at impulse response functions.
irf set monetary.irf, replace
irf create model1, step(40)
irf graph oirf, impulse(ffr) response(ln_pi ln_u3 ln_rgdp ffr) yline(0,lcolor(black)) xlabel(0(4)40) byopts(yrescale)

exit


When looking at the Granger causation tables, we can see that each row is testing a null hypothesis that the coefficients of the lags are 0. The Granger test where all the variables are excluded is simply testing a null hypothesis involving a simple autoregressive model. It is clear that the federal funds rate is responsive to the other variables in the VAR, even if it is only unemployment that has a $p$-value less than the usual significance level of 0.05. Real GDP is Granger caused by the federal funds rate and itself Granger causes changes in unemployment. However, the results for inflation suggest that we are unable to reject the null hypothesis of a simple autoregressive model.

As for the impulse response functions, we see reasonable results - a rise in the federal funds rate seems to lead to a fall in real GDP and a rise in unemployment. However, inflation seems to be going the wrong way initially, raising what is called the price puzzle. I will leave the explanations of all of this to another day, but for now it is clear that we should suspect monetary policy has real effects.

Bonus Content

Since I was already messing around in Stata and since people have asked for it, here’s a repeat of my old MRW replication - except this time, I’m using Stata in lieu of R and I’m using 1985-2010 data instead of MRW’s original 1960-1985 data.

For this replication, we need to know the real GDP per working-age population (15-64 years old), the average annual rate of the working-age population, the savings rate and the level of human capital. We will use the Penn World Tables as our source of data. There are a few differences between this and the original MRW. These changes mean that we can see whether the findings are robust outside of their original sample.

• We are considering 1985-2010 instead of the original’s 1960-1985.
• We are looking at all 183 countries in the PWT instead of the original 98 which MRW facet into several groups.
• We are using PWT’s measure of human capital instead of the original measure that only looked at secondary school enrollment.
• Because PWT no longer offers the working-age population, we are using the number of persons engaged as its proxy, which PWT recommends.
clear all

* Clean up PWT data.
use https://www.rug.nl/ggdc/docs/pwt100.dta
keep if year == 1985 | year == 2010
drop countrycode currency_unit rgdpe rgdpo pop avh ccon cda cgdpe cgdpo cn ck ctfp cwtfp rconna rdana rnna rkna rtfpna rwtfpna labsh irr delta xr pl_con pl_da pl_gdpo i_cig i_xm i_xr i_outlier i_irr cor_exp statcap csh_c csh_g csh_x csh_m csh_r pl_c pl_i pl_g pl_x pl_m pl_n pl_k

* Take real GDP at constant national prices (as per https://www.rug.nl/ggdc/docs/the_next_generation_of_the_penn_world_table.pdf).
gen y_l = rgdpna/emp

* Make data wide.
reshape wide emp hc rgdpna csh_i y_l, i(country) j(year)

* Find population growth rate given ln(1+g) approximates g for a small g.
gen n = log(emp2010/emp1985)/25

* Take the savings rate as the average share of GDP in investment.
gen s = (csh_i1985+csh_i2010)/2

* As with MRW, assume that g + delta = 0.05.
gen n_g_d = n + 0.05

* Set up the log variables for regression.
foreach var in y_l1985 y_l2010 s n_g_d {
gen ln_var' = log(var')
}

* Unrestricted regression of textbook Solow model.
quietly reg ln_y_l2010 ln_s ln_n_g_d
eststo reg1

* Restricted regression of textbook Solow model with estimates of alpha.
constraint 1 _b[ln_s] + _b[ln_n_g_d] = 0
quietly cnsreg ln_y_l2010  ln_s ln_n_g_d, constraint(1)
quietly nlcom _b[ln_s] / (1 + _b[ln_s])
matrix b = r(b)
matrix V = r(V)
quietly estadd scalar alpha = b[1,1]
quietly estadd scalar alpha_se = sqrt(V[1,1])
eststo reg2

esttab reg1 reg2, se stats(N r2 alpha alpha_se) title("Unrestricted and Restricted Regressions: Basic Solow Model")

Unrestricted and Restricted Regressions: Basic Solow Model
--------------------------------------------
(1)             (2)
ln_y_l2010      ln_y_l2010
--------------------------------------------
ln_s                1.573***        1.351***
(0.228)         (0.192)

ln_n_g_d           -0.732          -1.351***
(0.399)         (0.192)

_cons               10.68***        8.688***
(1.146)         (0.221)
--------------------------------------------
N                     145             145
r2                  0.273
alpha                               0.575
alpha_se                           0.0348
--------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001


Notice that the $R^2$ value of 0.273 is a lot worse than MRW’s, though the predicted signs for $s$ and $n+g+\delta$ hold. For the restricted regression, it tells us that $\alpha=0.575$. This is similar to MRW’s 0.60, and like MRW, would imply that capital’s share of income is double what it is in reality. So we need to consider the augmented model.

* Unrestricted regression with human capital.
gen ln_sh = log((hc1985+hc2010)/2)
quietly reg ln_y_l2010 ln_s ln_n_g_d ln_sh
eststo reg3

* Restricted regression with human capital.
constraint 1 _b[ln_s] + _b[ln_n_g_d] = 0
quietly cnsreg ln_y_l2010 ln_s ln_n_g_d ln_sh, constraint(1)
quietly nlcom _b[ln_s] / (1 + _b[ln_s])
matrix b = r(b)
matrix V = r(V)
quietly estadd scalar alpha = b[1,1]
quietly estadd scalar alpha_se = sqrt(V[1,1])
eststo reg4

esttab reg3 reg4, se stats(N r2 alpha alpha_se) title("Unrestricted and Restricted Regressions: Augmented Solow Model")

Unrestricted and Restricted Regressions: Augmented Solow Model
--------------------------------------------
(1)             (2)
ln_y_l2010      ln_y_l2010
--------------------------------------------
ln_s                0.705***        0.424*
(0.204)         (0.196)

ln_n_g_d            0.661          -0.424*
(0.358)         (0.196)

ln_sh               2.907***        2.766***
(0.262)         (0.271)

_cons               10.87***        7.659***
(0.922)         (0.194)
--------------------------------------------
N                     129             129
r2                  0.661
alpha                               0.298
alpha_se                           0.0968
--------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001


Notice this produces an $R^2$ value of 0.661, which is slightly less than the 0.77 found by MRW. In any case, this is a significant improvement from the basic Solow model. This also suggests that $\alpha=0.298$, which is much closer to capital’s share of around one-third in reality.

Another aspect of the Solow model that differs from endogenous growth models is the idea of convergence. We can test for that too.

* Define growth rate as difference of natural logarithms.
gen ln_y_ldiff = ln_y_l2010 - ln_y_l1985

* Test for unconditional convergence and derive implied lambda, since we the coefficient of ln_y_l1985 is e^(lambda t) - 1.
quietly reg ln_y_ldiff ln_y_l1985
quietly nlcom ln(1+_b[ln_y_l1985])/25
matrix b = r(b)
matrix V = r(V)
quietly estadd scalar lambda = b[1,1]
quietly estadd scalar lambda_se = sqrt(V[1,1])
eststo reg5

* Test for conditional convergence, controlling for the savings rate and n+g+delta.
quietly reg ln_y_ldiff ln_y_l1985 ln_s ln_n_g_d
quietly nlcom ln(1+_b[ln_y_l1985])/25
matrix b = r(b)
matrix V = r(V)
quietly estadd scalar lambda = b[1,1]
quietly estadd scalar lambda_se = sqrt(V[1,1])
eststo reg6

* Test for conditional convergence, controlling for the savings rate, n+g+delta and human capital.
quietly reg ln_y_ldiff ln_y_l1985 ln_s ln_n_g_d ln_sh
quietly nlcom ln(1+_b[ln_y_l1985])/25
matrix b = r(b)
matrix V = r(V)
quietly estadd scalar lambda = b[1,1]
quietly estadd scalar lambda_se = sqrt(V[1,1])
eststo reg7

esttab reg5 reg6 reg7, se stats(N r2 lambda lambda_se) title("Unconditional and Conditional Convergences")

Unconditional and Conditional Convergences
------------------------------------------------------------
(1)             (2)             (3)
ln_y_ldiff      ln_y_ldiff      ln_y_ldiff
------------------------------------------------------------
ln_y_l1985         -0.106***       -0.147***       -0.214***
(0.0315)        (0.0340)        (0.0440)

ln_s                                0.232*          0.184
(0.112)         (0.112)

ln_n_g_d                           -0.629***       -0.571**
(0.171)         (0.202)

ln_sh                                               0.530**
(0.193)

_cons               1.390***        0.490           0.835
(0.311)         (0.639)         (0.746)
------------------------------------------------------------
N                     145             145             129
r2                 0.0739           0.183           0.281
lambda           -0.00450        -0.00637        -0.00962
lambda_se         0.00141         0.00160         0.00224
------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001


It is clear from that with $R^2 = 0.0739$ and an implied $\lambda$ of near 0, there is no tendency for poorer countries to grow faster. As for both tests of conditional convergence, we see some effect compared to the unconditional case, but nothing terribly significant and disappointing compared to the original MRW numbers.

Finally, we can make pretty pictures of the three tests of convergence!

scatter ln_y_ldiff ln_y_l1985, title("Unconditional Convergence") ytitle("Growth Rate 1985-2010") xtitle("Log Output per Employed Person 1985") legend(off) msymbol(Oh) || lfit ln_y_ldiff ln_y_l1985

quietly regress ln_y_ldiff ln_s ln_n_g_d
predict growth2, resid
quietly regress ln_y_l1985 ln_s ln_n_g_d
predict ln_y_l2, resid
scatter growth2 ln_y_l2, title("Conditional Convergence I") ytitle("Residuals of Growth Rate 1985-2010") xtitle("Residuals of Log Output per Employed Person 1985") legend(off) msymbol(Oh) || lfit growth2 ln_y_l2

quietly regress ln_y_ldiff ln_s ln_n_g_d ln_sh
predict growth3, resid
quietly regress ln_y_l1985 ln_s ln_n_g_d ln_sh
predict ln_y_l3, resid
scatter growth3 ln_y_l3, title("Conditional Convergence II") ytitle("Residuals of Growth Rate 1985-2010") xtitle("Residuals of Log Output per Employed Person 1985") legend(off) msymbol(Oh) || lfit growth3 ln_y_l3

exit


Overall, it seems reasonable to conclude that adding human capital does get us a lot closer to a Solow model describing reality. However, it performs noticeably worse than with the original data both in accounting for cross-country disparities and in predicting convergence, suggesting it isn’t the be all and end all. Indeed, subsequent papers like “Neoclassical Revival in Growth Economics: Has it Gone Too Far?” (Klenow and Rodríguez-Clare 1997), “It’s Not Factor Accumulation” (Easterly and Levine 2001), “Is Growth Exogenous?” (Bernanke and Gürkaynak 2001) as well as “Why are Some Countries Richer than Others?” (Felipe and McCombie 2002) have all pushed back against the MRW explanation.