Chapter 1 Up and running with Python

Our learning objective in this session is to get you up and running with Python 3.0+.

1.1 Which build to use?

Python is a general purpose and intuitive programming language, which explains much of its wide appeal.

1.1.1 Python.org

For this course we assume that you have installed Python 3 from python.org. This book was compiled using Python 3.6. At the time of writing, the latest version of Python was v3.9, released on October 5, 2020. For our purposes, the differences are negligible. Indeed, you may want to install v3.8 until it’s clear that there are no incompatibilities with the current version.

You can download and install Python from python.org or follow these instructions:

Table 1: System-specific installations

Platform    Notes
Linux       Python 3 is already installed. Install pip with get-pip.py to install other packages.
macOS       Use brew install python3 in the terminal.
Windows     Install Python from the Windows Store.

Remember, for all OSs, you can also install Python directly from python.org.
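Once Python is installed, it can be useful to confirm which version you are actually running. The following is a minimal sketch using only the standard library; the name version_ok and the minimum of (3, 6) are assumptions made for this illustration (3.6 is simply the version this book was compiled with).

```python
import sys

# Hypothetical minimum for this illustration: the version the book was compiled with.
MINIMUM_VERSION = (3, 6)


def version_ok(minimum=MINIMUM_VERSION):
    """Return True if the running interpreter is at least `minimum` (major, minor)."""
    return sys.version_info[:2] >= tuple(minimum)


# sys.version starts with the version number, e.g. "3.9.0 (default, ...)"
print("Running Python", sys.version.split()[0])
print("Meets minimum version:", version_ok())
```

If the second line reports False, you are probably running an older interpreter than the one you just installed.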

1.1.2 Anaconda

Anaconda is a very popular Python build for data science. It comes pre-loaded with lots of useful apps and packages. However, for this course we’ll stick with the basic Python installation and install packages as needed. We will not explicitly cover Anaconda.

1.2 IDEs

There are many Integrated Development Environments (IDEs) and text editors with Python-specific plug-ins available.

If you’re using Anaconda you’ll have Notebooks and Jupyter Lab available. These are great tools for reporting results, but for writing and experimenting with raw code, it’s more useful and convenient to have an IDE that is based on scripts, as opposed to notebooks.

In this workshop we’ll use Visual Studio (VS) Code.

To launch VS Code from the command line, it needs to be in your PATH. Open VS Code like any other application, access the Command Palette (⇧⌘P or shift+ctrl+P), and type ‘shell command’ to find the Shell Command: Install ‘code’ command in PATH command, then execute it.

1.2.1 Setting up Python in VS Code

You may have to install the Python extension for VS Code.

Open the Command Palette (⇧⌘P or shift+ctrl+P) and start typing Py. Select the Python: Select Interpreter command once it appears to select the Python environment you want to use. If the status bar is present at the lower left, you can also select a Python environment option there, but this will only appear once you’ve created a .py script.

If you receive an error about a linter, you can install one directly from within VS Code by either i) following the on-screen instructions or ii) following the steps listed below. Following the on-screen instructions will open a terminal window inside VS Code, where you can execute the installation for the specific interpreter you have chosen. Alternatively, you can use the Command Palette to run Terminal: Create New Integrated Terminal to open a new terminal window within VS Code and execute:

pip install pylint

or more explicitly

/usr/local/bin/python3 -m pip install -U pylint --user

Check that pip is installed and up to date. Use the Command Palette to run Terminal: Create New Integrated Terminal to open a new terminal window and execute:

pip install --upgrade pip

VS Code uses auto-completion to complete commands. It also uses IntelliSense which can provide methods depending on the type of an object. This is really convenient and we’ll see it in action throughout the workshop.

If you right click anywhere on the file you can execute the entire file in the terminal.

You may be prompted to install additional packages when you execute specific commands from the Command Palette. Usually this can be done by clicking on the provided “install” button. For example, try to execute the “Python: Create New Blank Jupyter Notebook” command. This will prompt you to install the ipykernel package.

1.3 Virtual environment in VS Code

If you have installed Anaconda, you’ll have a suite of packages typical for data science already prepared. It’s nice, but not necessary, and at some point you’ll want to install your own packages anyway.

A best practice in Python is to not install packages into a global interpreter environment. This differs from R, where package and environment management has traditionally received less attention, although that is changing.

This is where virtual environments come into play. A virtual environment contains a copy of a global interpreter. Once you activate that environment, any packages you then install are isolated from other environments.

This reduces many complications that can arise from conflicting package versions. To create a virtual environment and install the required packages, first create a directory/folder where you’ll store your various Python projects (i.e. not some random folder on the desktop!), then enter the following commands in the terminal as appropriate for your operating system:

1.3.1 For macOS/Linux

python3 -m venv .venv

The -m option searches for a Python module (i.e. a file with a .py extension) to execute. Here, it executes the venv module and creates a new hidden directory called .venv, which contains all the information for your virtual environment. Once you start working in VS Code, you’ll notice another hidden directory, .vscode, which contains the preferences for your VS Code project.

Once VS Code sees that you have created a virtual environment, it will ask if you want to automatically activate it. Just click Yes. You can manually activate it using:

source .venv/bin/activate

1.3.2 For Windows

py -3 -m venv .venv

If VS Code asks to activate the virtual environment, just click Yes. You can manually activate it using:

.venv\scripts\activate

If the activate command generates the message Activate.ps1 is not digitally signed. You cannot run this script on the current system., then you need to temporarily change the PowerShell execution policy to allow scripts to run (see About Execution Policies in the PowerShell documentation):

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process

Remember the command to activate your virtual environment. You should execute it at the beginning of every session in the terminal, or select your virtual environment in VS Code if it is not automatically activated.

You may also choose to use the virtualenv package. There are some slight differences that we are not concerned about. For our purposes venv will be fine.
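If you are unsure whether your virtual environment is active, you can ask Python itself. The sketch below assumes an environment created with venv as described above; the function name in_virtual_environment is our own choice for illustration. Inside a venv, sys.prefix points at the environment directory, while sys.base_prefix still points at the base installation.

```python
import sys


def in_virtual_environment():
    """Return True if the interpreter is running inside a venv-style environment."""
    # The two prefixes are identical when running the global interpreter
    # and differ once a virtual environment is active.
    return sys.prefix != sys.base_prefix


print("Interpreter prefix:", sys.prefix)
print("Inside a virtual environment:", in_virtual_environment())
```

Run it once before and once after activating .venv to see the difference.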

1.4 Package installation

Now that we have a virtual environment established, let’s install some packages. Remember, this is not necessary with Anaconda distributions since they already come with matplotlib installed.

1.4.1 macOS

python3 -m pip install matplotlib

1.4.2 Windows (may require elevation)

python -m pip install matplotlib

1.4.3 Linux (Debian)

apt-get install python3-tk
python3 -m pip install matplotlib
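To confirm that the installation worked in the environment you are currently using, you can check whether Python can locate the package. This is a minimal sketch using the standard library's importlib.util.find_spec; the helper name is_installed is an assumption made for this example, and it is only meant for top-level package names such as matplotlib.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module or package can be found by the interpreter."""
    # find_spec returns None when no module with that name is available.
    return importlib.util.find_spec(module_name) is not None


print("matplotlib installed:", is_installed("matplotlib"))
```

If this prints False even though pip reported success, the package was probably installed for a different interpreter than the one running the script.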

1.5 Choose your VS Code interpreter

You should be able to execute functions in your saved script now. If not, you may have to select the interpreter from the Command Palette using Python: Select Interpreter.
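A quick way to see which interpreter VS Code actually picked is to print sys.executable from your script. The sketch below also checks whether that path lies inside a folder named .venv; the helper uses_project_venv and the example paths are hypothetical and only serve as illustration.

```python
import sys
from pathlib import Path


def uses_project_venv(executable, venv_name=".venv"):
    """Return True if the interpreter path passes through a folder called `venv_name`."""
    return venv_name in Path(executable).parts


print("This script is running under:", sys.executable)
print("Interpreter lives in .venv:", uses_project_venv(sys.executable))

# Hypothetical paths, shown only to illustrate the check.
print(uses_project_venv("/home/me/projects/.venv/bin/python"))  # True
print(uses_project_venv("/usr/local/bin/python3"))              # False
```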

Once you are finished you can type

deactivate

to deactivate the environment.

1.6 The PEP 8 style guide

Before we begin with our first exercises, a brief word on style. Python follows the PEP 8 style guide (co-authored by Guido van Rossum, the creator of Python). It’s a long document, but the most important style guidelines are:

  • Use 4 spaces for indentation (no tabs)
  • A max line length of 79 characters
  • Use \ to continue a command on the next line.
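As a small illustration of the last guideline, the sketch below splits one expression over two lines in two ways. Note that PEP 8 itself prefers implicit continuation inside parentheses, brackets, or braces where that is possible, and keeps the backslash for the remaining cases. The variable names are made up for this example.

```python
# Backslash continuation: the backslash must be the very last character on the line.
total_backslash = 1 + 2 + 3 + \
    4 + 5

# Implicit continuation inside parentheses, the form PEP 8 prefers where possible.
total_parentheses = (
    1 + 2 + 3
    + 4 + 5
)

print(total_backslash, total_parentheses)
## 15 15
```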

We’ll use the following naming conventions. Remember that snake_case is all lowercase, with each word separated using _. CamelCase is capitalized with no word separation.

Type                    Naming convention                        Examples
Function                snake_case                               function, my_function
Variable                snake_case                               x, var, my_variable
Constant                SNAKE_CASE                               CONSTANT, MY_CONSTANT
Package                 Short, lowercase, no word separation     package, mypackage
Module (i.e. a file)    snake_case, short                        module.py, my_module.py
Class                   CamelCase                                Model, MyClass
Method                  snake_case                               class_method, method
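To see these conventions side by side, here is a short sketch. All names in it (MAX_RETRIES, DataModel, compute_mean, describe_model) are hypothetical and exist only to illustrate the naming styles in the table.

```python
# Constant: upper-case SNAKE_CASE
MAX_RETRIES = 3


class DataModel:  # Class: CamelCase
    """Hypothetical class used only to illustrate the naming conventions."""

    def __init__(self, sample_values):  # Argument and attribute: snake_case
        self.sample_values = sample_values

    def compute_mean(self):  # Method: snake_case
        return sum(self.sample_values) / len(self.sample_values)


def describe_model(model):  # Function: snake_case
    return f"mean={model.compute_mean():.1f}, retries={MAX_RETRIES}"


my_model = DataModel([1, 2, 3, 6])  # Variable: snake_case
print(describe_model(my_model))
## mean=3.0, retries=3
```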

Real Python has an easy-to-read summary guide with lots of examples.

1.7 Set up your environment

By now you should have a new project environment and a blank script ready to go. Save your script using the .py extension, e.g. DAwPy.py. We’ll start working with packages for data science right away, so make sure to set up your environment appropriately.

Begin a new script and let the first lines be as follows:

import matplotlib.pyplot as plt

print("hello, world!")
## hello, world!

The first line imports the pyplot module, under the short name plt, from the matplotlib package that you just installed using pip. We’ll discuss modules in further detail later on.

You may see some Python scripts beginning with the first line as such:

#!/usr/bin/env python

This line, known as a shebang, tells the shell to run the script with Python when it is executed from the command line. For our purposes, we’ll be running Python in interactive mode, so we won’t worry about it.

To execute a command use shift + enter. You’ll be asked if you want to use the Python interpreter. Choose yes and a new console will open. This is the same as executing

/usr/local/bin/python3

in the console, but within VS Code you can send a single line or multiple lines from any open script directly to the interpreter.

1.8 Hello world

Now that you have your first command executed, let’s try a hello world for plots. You should have already installed the matplotlib package as described earlier. Place the following code in your script and execute it with shift + enter:

# pyplot import (harmless to repeat if it is already at the top of your script)
import matplotlib.pyplot as plt

plt.plot([1, 23, 2, 4])
plt.ylabel('some numbers')

# plt.show()

You’ll need to execute the plt.show() command if you want to run the script from the terminal. VS Code allows us to see the output directly in the built-in viewer in interactive mode.
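When a script runs from the terminal, an alternative to opening a window is to write the plot to a file. The following is a minimal sketch, assuming matplotlib is installed as described earlier; the file name hello_plot.png is a hypothetical choice, and the non-interactive Agg backend is selected so that no display is needed.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: renders to files, opens no window
import matplotlib.pyplot as plt

# Hypothetical output file name, written to the current working directory.
output_path = Path("hello_plot.png")

fig, ax = plt.subplots()
ax.plot([1, 23, 2, 4])
ax.set_ylabel("some numbers")

fig.savefig(output_path)  # write the figure to disk as a PNG
plt.close(fig)            # release the figure's memory

print("Saved plot to", output_path.resolve())
```

This version uses the figure and axes objects returned by plt.subplots() rather than the plt.plot() shortcut; both produce the same line plot.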

1.9 Wrap-up

In this section we have installed Python and set up a new project as a virtual environment.