Python for data analysis: setup

Python has become a leading technology in the data analysis space. It’s one of the most sought-after skills & something that can help you advance your career in big data. The following series of articles will cover the language and its libraries in detail, enabling you to get hands-on with Python and tackle your own use cases.

In this article, we’ll get your laptop set up & ready to take on the Python challenges.

Step 1: To start, you’ll need to verify that Python 3 is installed on your machine. On Mac (which is the focus of this tutorial), you can do this by typing ‘python3’ into the terminal. If the command is not found, you don’t have it.

If you don’t have Python 3 already, you’ll need to install it. If you have Homebrew on your Mac, you can do that easily by typing ‘brew install python’ into the terminal. If you don’t have Homebrew, you’ll need to install that first (details can be found here).

Now that it is installed, type ‘python3’ into the terminal again. You should see the Python version banner followed by the ‘>>>’ prompt.
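Once the interpreter is running, you can also confirm the version from inside Python itself. The snippet below is a minimal sketch of that check; the helper name `is_python3` is just an illustration, not something this series depends on.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given interpreter version is Python 3 or newer."""
    return version_info[0] >= 3


# Show the version of the interpreter that is running this code.
print("Running Python", sys.version.split()[0])
print("Python 3 or newer:", is_python3())
```

If the second line reports False, the terminal is picking up an older Python 2 interpreter rather than the one you just installed.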

Step 2: You will now need to download and install the Python 3.7 version of Anaconda. To do so, download the .pkg file, double-click it & walk through the install wizard.

Step 3: Next, let’s check if you have IPython installed on your machine. To do that, simply type ‘ipython’ into the terminal. If you see the error ‘command not found’, then it’s not installed yet & you’ll need to install it. You can do so by typing ‘brew install ipython’ into the terminal.

Once again, to check the install, type ‘ipython’ into the terminal. If installed successfully, you should see the IPython banner followed by an ‘In [1]:’ prompt.

You can test IPython by typing a simple expression, such as ‘1 + 1’, at the prompt and pressing Enter. The result appears on an ‘Out’ line.
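For instance, a few lines like the ones here could be typed at the prompt, one at a time. The variable names and values are made up purely for this test; any simple Python will do.

```python
# Hypothetical test lines: type them one at a time at the IPython prompt.
greeting = "Hello, Python"
print(greeting)

numbers = [1, 2, 3, 4]
total = sum(numbers)
total  # in IPython, a bare expression like this is echoed on an Out[...] line
```

The last line is the interesting one: in a regular script it would do nothing visible, but at the IPython prompt the value of `total` is displayed straight away.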

Step 4: Now, let’s install a few of the packages that we will need:

  • ‘brew install numpy’
  • ‘brew install pandas’
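To confirm that Python can actually find those packages, something like the following sketch can help. It only uses the standard library, so it runs whether or not the installs succeeded; the helper name `check_packages` is an assumption made for this example.

```python
import importlib.util


def check_packages(names):
    """Map each package name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# numpy and pandas are the two packages named in this step.
for name, found in check_packages(["numpy", "pandas"]).items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

A package reported as missing here is one that the ‘python3’ you are running cannot import, even if an install command appeared to succeed elsewhere on the machine.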

And that’s it! Your machine is ready to go. Catch us in Article 2, where we will start to discuss Jupyter notebooks & more!

A little overview of IPython & Jupyter

Traditionally, if you were writing a Python script, you’d first create a file for it (with a .py extension).

You’d then populate that file with the script you wanted to run.

And finally you’d execute that script through the terminal, by typing ‘python3’ followed by the file name.
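As a rough illustration of that workflow, the contents of such a file might look like the sketch below. The file name `hello.py`, the function name and the message are all hypothetical.

```python
# hello.py (hypothetical file name)


def build_greeting(name):
    """Return the message this script prints."""
    return f"Hello, {name}"


def main():
    print(build_greeting("data analysis"))


# Run main() only when the file is executed as a script, not when it is imported.
if __name__ == "__main__":
    main()
```

Saved as `hello.py`, it would be run from the terminal with ‘python3 hello.py’. Notice that you only see output once the whole file has run.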

That’s a great way to execute your scripts, but it limits the exploratory element of data analysis. IPython & Jupyter let us see the response to each piece of code as soon as we run it. In IPython, the ‘In’ elements are the code I type & the ‘Out’ elements are the responses to that code.

Jupyter achieves much of the same, but through a user interface. In the steps below, we create a new Python 3 notebook, run a line of Python (seeing the response right away) and then inspect a variable to see its metadata, which can help us to understand the best way to interact with that variable.

Create A Notebook

Run a simple script & see the output

Element introspection
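To give a feel for what that introspection reports, here is a plain-Python approximation. It is only a sketch: the `describe` helper and the `prices` variable are invented for this example, and the real notebook summary contains more detail.

```python
def describe(obj):
    """Collect a few facts about obj, loosely like the summary shown by `obj?`."""
    doc = (type(obj).__doc__ or "").strip()
    return {
        "type": type(obj).__name__,
        "string_form": repr(obj),
        # Not every object has a length (a number does not), so fall back to None.
        "length": len(obj) if hasattr(obj, "__len__") else None,
        # Keep only the first line of the type's docstring.
        "docstring": doc.splitlines()[0] if doc else "",
    }


prices = [9.99, 14.5, 3.25]  # hypothetical variable to inspect
for field, value in describe(prices).items():
    print(f"{field}: {value}")
```

In IPython or a Jupyter notebook you don’t need a helper like this: typing the variable name followed by a question mark (for example `prices?`) shows a similar summary.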