9 Exercises I

  1. Write a small shell script that extracts the 10th line of a file and stores it in a new file.
  2. Create a pipeline of four commands.
  3. Set two environment variables in your .zshrc or .bashrc and use them in a Python program. For example, set them to your names and print the values in lowercase from Python.
  4. Automate the download and processing of a Kaggle dataset with a shell script containing several commands. Kaggle provides a command-line interface tool that you can download and install. The first command should create the folder structure for the project. The data should then be downloaded, extracted, and finally moved to the right folders.
  5. Many data science projects share the same structure, and as a data scientist you will need to recreate it every time you start a project. Automate this work by writing a shell script that creates the folders and files of a typical data science project. You can look at the Cookiecutter Data Science project for inspiration.
  6. How can we use the > operator to store the dependencies of a Python project that uses a virtual environment?
  7. Search Google for data science command-line tools that might be useful. Install and try one such tool, and prepare a short presentation on how it works and why we would want to use it.
  8. Use a command-line text editor of your choice to modify your .zshrc file and change the theme. An overview of the available themes is on the official website.
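
As a possible starting point for exercise 1, the sketch below shows one way to combine a line filter with the > redirection operator. The function name extract_line and the file names sample.txt and tenth_row.txt are made up for illustration, and sed is only one of several tools that could do this job (head and tail, or awk, would work as well).

```shell
#!/bin/sh
# Hypothetical helper (not from the text): copy line N of a file into a new file.
#   $1 = line number, $2 = input file, $3 = output file
extract_line() {
    # sed -n suppresses the default output; "Np" prints only line N.
    # The > operator writes that line to the output file, overwriting it.
    sed -n "${1}p" "$2" > "$3"
}

# Build a 12-line sample file so the script can be run on its own.
awk 'BEGIN { for (i = 1; i <= 12; i++) print "row " i }' > sample.txt

# Keep only the 10th row and show the result.
extract_line 10 sample.txt tenth_row.txt
cat tenth_row.txt
```

Taking the line number as an argument instead of hard-coding 10 keeps the script reusable. Note that if the input has fewer than 10 lines, the output file is still created, but it is empty.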