Group Project

Table of contents

  1. Introduction
  2. Timeline
  3. Deliverables and Rubrics
    1. Deliverables
    2. Grading Rubrics
  4. Datasets and Papers
  5. Submission Checklists
    1. Project
    2. Peer Review and Assessment
  6. Additional Resources

Access the template ipynb here:

Download | View in Datahub | View in Colab

Group assignments are released here.

See clarifications and submission instructions on Ed.


Introduction

In this project, you will have the opportunity to apply the data science techniques that you have learned throughout the semester to explore and replicate papers from the Equality of Opportunity Project, led by Harvard Professor Raj Chetty.

The Equality of Opportunity Project aims to provide insights into the sources of economic mobility and to identify effective policy interventions to promote equality of opportunity in the United States. The project has collected an extensive dataset that includes anonymized tax records for millions of individuals, which has been used in numerous research papers to study topics such as intergenerational mobility, the effects of education on earnings, and the impacts of neighborhoods on children’s outcomes.

Through this project, you will have the chance to work with this rich dataset and replicate some of the papers published by Professor Chetty and his colleagues. You will use Python and Pandas to explore the data, clean it, and analyze it, applying the techniques you have learned in class to answer research questions and test hypotheses. You will also have the chance to explore ideas that interest you using the datasets.
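As a rough illustration of the explore, clean, and analyze workflow described above, here is a minimal Pandas sketch. The column names (region, parent_income_rank, child_income_rank) and all values are invented toy data, not taken from any of the project datasets; substitute the real file and columns from the dataset you choose.

```python
import io

import pandas as pd

# Toy stand-in for a downloaded CSV. The columns and values are
# hypothetical and exist only to make this sketch self-contained.
raw_csv = io.StringIO(
    "region,parent_income_rank,child_income_rank\n"
    "A,25,38\n"
    "A,75,\n"
    "B,25,45\n"
    "B,75,62\n"
)

# Explore: load the data and check its size and missing values.
df = pd.read_csv(raw_csv)
print(df.shape)
print(df.isna().sum())

# Clean: drop rows where the outcome of interest is missing.
clean = df.dropna(subset=["child_income_rank"])

# Analyze: average child income rank within each region.
summary = clean.groupby("region")["child_income_rank"].mean()
print(summary)
```

With a real dataset, the same three steps apply, but the cleaning choices (dropping versus imputing missing values, for instance) should be justified in the Data Cleaning section of your notebook.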


Timeline

The project is due on May 1st.


Deliverables and Rubrics

The template notebook is the starting point for the group project. Read it carefully, as it contains all the instructions for this project: each markdown cell explains what to do in order to complete a successful project.

Here is the list of deliverables and the grading rubrics. See the project template for more details.

Deliverables

Deliverable                          Points
Abstract                             5%
Project Background                   5%
Project Objective                    5%
Data Description                     5%
Data Cleaning                        5%
EDA                                  20%
Modeling                             20%
Interpretation and Conclusions       20%
Post-Analysis Reproducibility        10%
Clarity, Style and Presentation      5%

Grading Rubrics

For each deliverable, we will award points according to the following percentage scale:

Grade Description
Excellent (above 90%) Work that is free of all but the most minor errors and demonstrates creativity and/or a very deep understanding of what you are doing.
Good (80-90%) Work that is free of fundamental errors and demonstrates a basic understanding of what you’re doing.
Fair (60-80%) Work that contains fundamental errors in analysis and/or conveys a lack of understanding of the basics of the work you are attempting to do.
Lacking (below 60%) Work that is severely lacking or incomplete.
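To get a feel for how the deliverable weights and the percentage scale might combine, here is a small sketch. It assumes the overall score is the weighted sum of the per-deliverable percentages; that aggregation rule is an assumption made for illustration and is not stated in the rubric. The example scores are hypothetical.

```python
# Deliverable weights (in percent) copied from the Deliverables table.
weights = {
    "Abstract": 5,
    "Project Background": 5,
    "Project Objective": 5,
    "Data Description": 5,
    "Data Cleaning": 5,
    "EDA": 20,
    "Modeling": 20,
    "Interpretation and Conclusions": 20,
    "Post-Analysis Reproducibility": 10,
    "Clarity, Style and Presentation": 5,
}

# Sanity check: the weights should cover the whole project.
assert sum(weights.values()) == 100


def overall_score(scores):
    """Weighted sum of per-deliverable percentages (assumed rule)."""
    return sum(weights[name] * scores[name] for name in weights) / 100


# Hypothetical scores: 90% on everything except 80% on EDA.
example_scores = {name: 90 for name in weights}
example_scores["EDA"] = 80

print(overall_score(example_scores))  # 88.0
```

Because EDA, Modeling, and Interpretation and Conclusions each carry 20%, a ten-point drop in any one of them moves the overall score by two points under this assumed rule.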

Datasets and Papers

In order to make the project consistent across teams, we are limiting the project to a specific set of datasets that are linked to economics journal articles.

Browse and choose one of the datasets from Harvard Professor Raj Chetty’s Equality of Opportunity Project. You will find them neatly presented (and mostly cleaned) together with the title of the corresponding research paper.

Paper Link
Race and Economic Opportunity in the United States: An Intergenerational Perspective summary, PDF
Who Becomes an Inventor in America? The Importance of Exposure to Innovation summary, PDF
Mobility Report Cards: The Role of Colleges in Intergenerational Mobility summary, PDF
The Fading American Dream: Trends in Absolute Income Mobility Since 1940 summary, PDF
The Effects of Neighborhoods on Intergenerational Mobility summary, PDF
Childhood Environment and Gender Gaps in Adulthood summary, PDF
The Association Between Income and Life Expectancy in the United States, 2001-2014 summary, PDF
Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates, and Measuring the Impacts of Teachers II: Teacher Value-Added and Student Outcomes in Adulthood I: summary, PDF; II: summary, PDF
Is the United States Still a Land of Opportunity? Recent Trends in Intergenerational Mobility summary, PDF
Where is the Land of Opportunity? The Geography of Intergenerational Mobility in the United States summary, PDF

Submission Checklists

Project

  • A Jupyter Notebook
  • A PDF file of the Jupyter Notebook
  • Links to all datasets that you downloaded and used in the notebook
  • Figures and Plots (if not already included in the notebook)

Peer Review and Assessment

  • Peer review (for reproducibility) on another group’s project
  • Peer review for groupmates’ contributions

Additional Resources

How to Import Files in Colab: External data: Local Files, Drive, Sheets, and Cloud Storage
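If your group splits work between Colab and a local or Datahub environment, a small helper like the following can keep file paths consistent. This is a sketch under stated assumptions: the folder names ("data", "group_project") and the file name "my_dataset.csv" are hypothetical, and checking sys.modules for "google.colab" is a common convention for detecting Colab rather than an official guarantee.

```python
import sys
from pathlib import Path


def in_colab() -> bool:
    """Heuristic: the Colab runtime preloads the google.colab package."""
    return "google.colab" in sys.modules


def data_dir(local_dir: str = "data", drive_subdir: str = "group_project") -> Path:
    """Return the folder holding the datasets.

    In Colab, mount Google Drive and use a (hypothetical) subfolder of
    MyDrive; elsewhere, use a local folder relative to the notebook.
    """
    if in_colab():
        # Only importable inside Colab; prompts for Drive authorization.
        from google.colab import drive

        drive.mount("/content/drive")
        return Path("/content/drive/MyDrive") / drive_subdir
    return Path(local_dir)


# Hypothetical file name; replace with the dataset you downloaded.
csv_path = data_dir() / "my_dataset.csv"
print(csv_path)
```

Building every path from data_dir() means a peer reviewer only needs to change one folder name to rerun the notebook, which helps with the reproducibility deliverable.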