Delivering Data Science Products via Packages

Misk Skills

Overview

This module teaches students how to create R and Python packages to scale coding procedures and make themselves and others more productive. Students will be introduced to how packages are structured, common workflows, the importance of testing, and how to set up basic continuous integration and deployment procedures.

Learning Objectives

This module will step through the process of building and contributing to R and Python packages. The goal is to provide you with a comprehensive picture of how to develop a high-quality package. By the end of this module, you should:

  1. Understand why and when we create packages.
  2. Be comfortable with the general structure and workflow of package development.
  3. Know how to fully document components of your package.
  4. Be able to implement a sound test structure for your source code.
  5. Be able to incorporate continuous integration.
  6. Be exposed to many high-quality R and Python packages.

Prework

This module makes a few assumptions about your existing programming knowledge. Below are my assumptions and some resources to read through to make sure you are properly prepared.

Assumptions

You should be familiar with the basics of:

  • R and Python programming for data science
  • Writing functions
  • Using Git and GitHub

Schedule

Session  Description                                                        Notebook  Slides
1        Introduction to packages                                           Notebook  Slides
2        Package structure                                                  Notebook  Slides
3        Development workflow                                               Notebook  Slides
4        Portfolio builder: Create your first package                       Notebook
5        Package metadata                                                   Notebook  Slides
6        Source code                                                        Notebook  Slides
7        Portfolio builder: Make your first open source contribution        Notebook
8        Tests                                                              Notebook  Slides
9        Object documentation                                               Notebook  Slides
10       Changelog                                                          Notebook  Slides
11       README                                                             Notebook  Slides
12       Long-form documentation                                            Notebook  Slides
13       Package website                                                    Notebook  Slides
14       Other components                                                   Notebook  Slides
15       Portfolio builder: Improve an open source package’s documentation  Notebook
16       Continuous integration                                             Notebook  Slides
17       Portfolio builder: TBD                                             TBD