### main packages
library(tidyverse)
library(tidyquant)
library(timetk)
library(readxl)
library(plotly)
library(scales)
library(gt)
library(janitor)
library(config)
### data packages
library(fredr)
library(Quandl)
library(riingo)
### used in a few places
library(httr)
library(jsonlite)
library(formattable)
library(ggrepel)
library(ggtext)
library(geomtextpath)
library(XML)
library(corrr)
Introduction
Welcome to A Practical Guide to Exploring Macroeconomic Data with R! As the name implies, the pages that follow aim to equip the reader with practical tools, code and data to analyze some aspects of the macroeconomy. Macroeconomics deals with the economy as a whole. As such, macroeconomic data measure a country’s income, consumption, employment, imports and exports, monetary policy, interest rates and inflation, among other related data. These data are important for understanding the health of the aggregate economy. News or updates about macroeconomic indicators can also have a significant impact on stock returns, bond returns and other asset markets.
By book’s end, we hope you will have the tools that you need to hit the ground running as a macro data scientist. This book is focused on modern data science tooling and practical tools for everyday use. We include many data visualizations in this book, along with the details of how to build them. These we hope will serve as a launch point for your own creativity. We believe strongly in the power of visualization and after reading this book, you should have the tools you need to imagine, build and deliver very effective messages using data.
Why R?
We love the R programming language and find it very well suited to this material. You will become familiar with the important fundamentals for data wrangling and data visualization, in particular around time series. Part of that will entail creating new time-based features and columns. We will also be publishing a Python version of this book that covers the same material. If you’re new to R, this book is a good introduction. If you also want to get started with Python, having them side-by-side is a great way to build Python skills (or at least we can say that it helped us in that regard as we reviewed how to translate one to the other).
Outline
This book is structured in the following way:
The first section of the book covers GDP and the stock market generally, but it also serves as an introduction to R. It is not a comprehensive introduction because it focuses just on the functions and code we need to get our job done, but we do explain each step. If you are an experienced R coder, you might want to use that first chapter on GDP as a code reference. We do walk through how to import an ugly Excel spreadsheet and make it tidy, a task that will almost surely confront anyone who wants to work in the financial industry, though it’s not exactly macroeconomics. The next chapter is on market data and on importing that data from a source that requires an API key, a common occurrence in data science. We also cover an intermediate-difficulty use case around programmatically identifying bear markets (and then visualizing them). Even experienced programmers should find this of value, as it can be applied across other financial use cases (our main goal in this book is to supply flexible code that can be used elsewhere).
The second section focuses on interest rates, inflation, the Federal Reserve, and the job and housing markets. The first two may be the most important macroeconomic forces of recent times (and of all times, some would argue), while the Federal Reserve has emerged as not just a regulator but also a crucial player in financial markets.
In the third section, we end with a dive into the Senior Loan Officer Survey, which gauges credit conditions, and its relationship to market returns. This serves as a capstone case where we tie together different concepts shown in the book to create a market signal.
To summarize, we cover:
- GDP
- Market returns
- Interest Rates
- Inflation
- The Federal Reserve
- Employment
- Housing
- The U.S. Dollar
- Credit conditions from the Senior Loan Officer Survey
In every topic we will take little side trips that show how to shade periods in a time series chart (very useful for recessions), how to identify a bear market, how to build dual Y-axis charts, how to index to any date, and how to reconstruct the dollar index. Those are our specific tasks, but all of them provide you with a paradigm for building the same tools for different economic series, or for constructing your own tools. It’s up to your creativity, and that’s our goal. We want to unleash the creativity in all of us; that’s how we view data science.
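As a small taste of one of these side trips, indexing a series to a chosen date can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data are made up, the helper name index_to_date is hypothetical, and the later chapters build their own versions with real data.

```r
# Made-up monthly series, for illustration only (not real economic data)
prices <- data.frame(
  date  = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01")),
  value = c(50, 55, 45, 60)
)

# Hypothetical helper: rescale `value` so that it equals 100 on `base_date`
index_to_date <- function(df, base_date) {
  base_value <- df$value[df$date == as.Date(base_date)]
  stopifnot(length(base_value) == 1)  # the base date must appear exactly once
  df$indexed <- 100 * df$value / base_value
  df
}

# Index the series to February 2020
indexed <- index_to_date(prices, "2020-02-01")
```

Changing the second argument re-bases the whole series, which makes it easy to compare several series from a common starting point.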
A common thread in this book is data wrangling. We will slowly create a dataframe that holds variables we wish to track and analyze regularly. You, reader, will almost certainly wish to include different variables than what’s covered in this book. Our goal is to provide the tools to do so!
Who should read this book
This book is intended for people who use macroeconomic data in their work, and want to use R and/or Python to do so. We have two versions of the book, one with R code and one with Python code. They cover substantially the same material.
What this book will not cover
This book covers a lot of data, code and charts. We build things from scratch and show you all of our code. We do not spend a lot of time on theory or delving deep into concepts. There is an abundance of macroeconomic theory books out there, and any of them can be used to complement this book.
We do not cover Machine Learning or Artificial Intelligence in this book, though we love those tools and use them frequently. This book is aimed at very practical code and tools that you can use on day 1 of a job in industry to find insights around the macroeconomy. It’s possible that on day 1 you’ll be asked to start running machine learning models but, in our experience, very unlikely. And even in the event that this occurs, it would be nice to be able to communicate what’s happening in the broader economy based on solid data, even if a machine learning model is telling us the sky is green.
Learn More about R
If you are brand new to R, the best place to start is the book “R for Data Science” by Wickham and Grolemund. It’s available for free online here: https://r4ds.hadley.nz/ or on Amazon.
If you are more of an online course person, we recommend the Business Science jumpstart by Matt Dancho available here: https://www.business-science.io/. Some of those are free and some are paid. Matt is also the author of the timetk package that we use frequently in this book.
R Preliminaries
If you are reading this book, it is likely that you already have R installed on your computer and are familiar with it. However, for those readers just starting with R, we give brief instructions on installation here.
What is R and RStudio?
R is an open-source statistical programming language that is growing very fast in the world of data science.
To download R, go to the Comprehensive R Archive Network (CRAN) and then click on the link for Mac, Windows or Linux, depending on your computer.
RStudio is an integrated development environment (or IDE) for R programming. It makes writing and running R code more fun.
To install RStudio, go to:
http://www.rstudio.com/download
A major strength of R is that it has a fast-growing universe of packages to help us accomplish our tasks. That’s the power of open source software: as coders create new functions and tools, they can share them with the rest of the universe. How do we know a package “works”? Well, if a package doesn’t work, the entire universe of R coders will start shouting about it.
The packages that we use in this book are listed in the library() calls at the start of this chapter.
If you wish to follow along with our code, you will need to install these packages by running install.packages("name of package") in RStudio.
For example, to install the tidyverse collection of packages, run:
install.packages("tidyverse")
That will install the tidyverse package on your machine.
When you wish to use that package during a coding session, run:
library(tidyverse)
And so on for all of the packages listed above.
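Rather than installing the packages one at a time, you may prefer to install whatever is missing in one go. The following is a minimal sketch, assuming every package is available from your configured CRAN mirror; the helper name find_missing is hypothetical and not part of any package.

```r
# Packages loaded with library() at the start of this chapter
book_packages <- c(
  "tidyverse", "tidyquant", "timetk", "readxl", "plotly", "scales", "gt",
  "janitor", "config", "fredr", "Quandl", "riingo", "httr", "jsonlite",
  "formattable", "ggrepel", "ggtext", "geomtextpath", "XML", "corrr"
)

# Hypothetical helper: return the packages in `pkgs` that are not installed yet
find_missing <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

missing_packages <- find_missing(book_packages)

# Install only when running interactively (for example, in the RStudio console)
# and only if something is actually missing
if (interactive() && length(missing_packages) > 0) {
  install.packages(missing_packages)
}
```

Running this a second time should do nothing, because packages that are already installed are filtered out before install.packages() is called.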