Data Science for Economists

UC Berkeley, Spring 2023

Instructor: Eric Van Dusen (ericvd@berkeley.edu)

Lecture: MWF 1PM-2PM, Office Hours: See Calendar and Ed

Eric Van Dusen

he/him

ericvd@berkeley.edu

Hi - I am a lecturer in Data Science as well as a staff member helping to build out Data Science Undergraduate Studies. I am passionate about how Data Science approaches can bring innovation to teaching and learning in all disciplines. I love trail running, kayaking, camping and exploring.

Lecture recordings • Lecture PollEV

  • The schedule and dates listed below are tentative and may be subject to change.
  • All announcements are on Ed. Make sure you are enrolled and active there.
  • The Syllabus contains a detailed explanation of how each course component will work this semester.
  • If you plan to add late, make sure you contact the staff first to see if you can make up the missed assignments before officially adding the class.

Schedule

1. Introduction and Basic Tools

Jan 15

Lab 0 Intro to Notebook (Optional)

Survey 1 Pre-Semester Survey (due Jan. 20)

Disc 0 Welcome (slides) (just read the slides; no sections this week)

Jan 18

1 Introduction and Course Overview

1 slides • video

Jan 20

2 Overview of Technology

2 slides • video

Lab 1 Pandas (due Jan. 30 before class)

2. Introduction to Pandas

Jan 23

3 Datahub to Pandas

3 slides • video

Disc 1 Pandas (slides) (video) (supp. reading) (demo)

Jan 25

4 Pandas and Phillips Curve

4 slides • video • code: Datascience to Pandas, Phillips Curve

Jan 27

5 Pandas and Data Sources

5 slides • video • code: EIA API (full version)

Lab 2 API and Data Cleaning (due Feb. 7)

3. Sources of Data - Online Datasets and API

Jan 30

6 Application Programming Interface (API)

6 slides • video • code: Using APIs

Disc 2 Conditional Filtering, Functions & APIs (slides) (video) (demo)

Feb 1

7 FOMC and Macro Indicators

7 slides • video • code: Pandas II, Stock Ticker

Feb 3

8 Guest Lecture: Data Sources in Economics (Jim Church)

8 slides • video • code: WRDS

Lab 3 Wrangling Survey Data (due Feb. 15)

4. Survey Data and Exploratory Data Analysis

Feb 6

9 Survey and RCT Data

9 slides • video

Disc 3 Groupby, Plotly, & Survey Design (slides) (video) (supp. reading)

Feb 8

10 Hawthorne Effect

10 slides • video • code: Survey Data

Feb 10

11 Guest Lecture: Identifying Customer Needs in Grocery (Alan Liang)

11 intro slides • slides • video

Project 1 Regional GDP (due Feb. 28)

5. Wrangling Data - Data Cleaning

Feb 13

12 EDA and Data Cleaning

12 slides • video • code: Navigating Files

Disc 4 Regex & Data Cleaning Process (slides) (video) (regex101) (exercise)

Feb 15

13 Strings and Regex

13 slides • video • code: Cal College Network

Feb 17

14 Guest Lecture: UC Investments Office (Martin Scott and Brad Lyons)

14 slides • video • code: Demo

Lab 4 SQL (due Feb. 28)

6. Sources of Data - SQL

Feb 20

Holiday: Presidents' Day

Disc 5 SQL (slides) (video)

Feb 22

15 SQL

15 slides • video • code: Basics, Soda

Feb 24

16 SQL & Scanner Data

16 slides • video • code: Soda II

Lab 5 Intro to Visualization (due Mar. 7)

7. Data Visualization I

Feb 27

17 Visualization I

17 slides • video • code: Seaborn

Disc 6 Plotting (slides) (video)

Mar 1

18 Visualization II

18 slides • video • code: Avocado, WaterGuard

Mar 3

19 Guest Lecture: Why Nothing Makes Sense (Kyla Scanlon)

19 intro slides • slides • video

Lab 6 Geospatial Visualization (due Mar. 14)

8. Data Visualization II

Mar 6

20 Visualization III

20 slides • video

Disc 7 Midterm Review (slides) (video)

Mar 8

21 Mapping

21 slides • video • code: County-level Unemployment Rate

Mar 10

Exam Midterm (logistics)

Project 2 Mariel Boatlift (due Apr. 6)

9. Linkage to Econometrics

Mar 13

22 Towards Econometrics

22 slides • video • code: Wages

Disc 8 Applications to Econometrics (slides) (video) (supp. reading: DiD, RDD, Card (1990))

Mar 15

23 Justice 40

23 slides • video • code: Justice 40

Mar 17

24 Guest Lecture: Michael Schwarz

24 slides • video

Lab 7 Justice 40 (due Apr. 6)

10. Reproducibility and Good Coding Practices

Mar 20

25 Reproducibility I

25 slides • video

Disc 9 Reproducibility in Code (slides) (video)

Mar 22

26 Reproducibility II

26 slides • video

Mar 24

27 Guest Lecture: Modeling Data in Quantitative Trading (Rodrigo Palmaka)

27 slides • video • code: Illiquid Stock Price

Lab 8 Survival Analysis (due Apr. 13)

Survey 2 Mid-Semester Survey (due Apr. 7)

11. Spring Break

Mar 27

No lectures this week

12. Modeling Economic Data

Apr 3

28 Modeling

28 slides • video • code: Classification

Disc 10 Bias-Variance Tradeoff & Tuning (slides) (video) (demo)

Apr 5

29 Survival Analysis

29 slides • video • code: Regimes, Employee Turnover

Apr 7

30 Guest Lecture: Emi Nakamura

30 intro slides • slides • video

Lab 9 Time Series (due Apr. 18)

13. Time Series

Apr 10

31 Time Series I

31 slides • video • code: Time Series

Disc 11 Ridge & Lasso Regressions (& Time Series) (slides) (video) (demo)

Apr 12

32 Time Series II

32 slides • video • code: ARIMA, VAR

Apr 14

33 Guest Lecture: Shashank Dalmia

33 intro slides • slides • video • code: Forecasting with sklearn

Project 3 Group Project

14. Experimental Designs and Machine Learning

Apr 17

34 Machine Learning I

34 slides • video • code: Classifiers, Penguins

Apr 19

35 Machine Learning II

35 slides • video • code: Cal College Network, Penguins II, Compare Classifiers

Apr 21

36 Guest Lecture: Caleb Kruse

36 slides • video

15. Economics and Data Science and Beyond

Apr 24

37 The Future of ML and Economics

37 slides • video

Disc 13 Project Workshop (demo: geospatial embeddings) (demo: loan prediction)

Apr 26

38 Interactivity

38 slides • video • code: Macro Policy, Prisoner’s Dilemma

Apr 28

39 Guest Lecture: Melissa Dell

39 slides • video

16. RRR Week

May 1

No lectures this week

17. Final Week

May 9

Exam Final

logistics • concept quiz • data task (Links will be active at 8AM on May 9)