# Python for data analysis: Getting to grips with NumPy multidimensional arrays

NumPy stands for Numerical Python. It’s a library that contains a collection of routines for processing arrays.

An array is an ordered collection of values, similar to a list. A multidimensional array is essentially a grid or table (an array whose elements are themselves arrays of equal length).

All elements within a NumPy array must be of the same datatype.
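To see what the single-datatype rule means in practice, here is a small sketch (the variable names are my own, purely for illustration). When the input mixes integers and floats, NumPy picks one type that can hold every value:

```python
import numpy as np

# Mixed ints and floats are upcast to a single common type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)        # float64

# All-integer input stays an integer type (the exact width depends on the platform)
whole = np.array([1, 2, 3])
print(whole.dtype.kind)   # i (signed integer)
```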

## Import NumPy and generate some random data

Below, we generate our N-dimensional arrays and populate them with random data (randn draws its values from a standard normal distribution). By calling ‘np.random.randn(2,3)’, we’re creating a two-dimensional array with 2 rows, each of which has 3 elements. I’ve also added a second example, ‘np.random.randn(5,3)’, which is also two-dimensional: it has 5 rows of 3 elements each.

```
In [1]: import numpy as np
In [2]: random_data = np.random.randn(2,3)
In [3]: print(random_data)
[[ 0.88302552 -1.42080937 -1.6986092 ]
 [-1.0288492  -0.54502937 -0.33842288]]
In [4]: random_data2 = np.random.randn(5,3)
In [5]: print(random_data2)
[[ 0.78625291 -0.48370229 -0.12836087]
 [ 0.5822994  -0.80951002 -0.06102739]
 [ 0.31494314  0.77771639 -1.01134465]
 [ 0.52950739 -1.48921329 -1.24892668]
 [-1.73892042 -0.39769143  0.21128377]]
```
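If you want to check what the arguments to randn actually control, the sketch below may help. Each argument sets the length of one dimension, so the number of arguments is the number of dimensions. The variable names and the three-argument call are my own additions for illustration:

```python
import numpy as np

# Each argument to randn sets the length of one dimension (axis)
two_by_three = np.random.randn(2, 3)
five_by_three = np.random.randn(5, 3)
three_d = np.random.randn(2, 3, 4)   # three arguments, so three dimensions

print(two_by_three.ndim, two_by_three.shape)    # 2 (2, 3)
print(five_by_three.ndim, five_by_three.shape)  # 2 (5, 3)
print(three_d.ndim, three_d.shape)              # 3 (2, 3, 4)
```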

## Basic Array Interaction

We can do arithmetic directly on our arrays. In the example below, I’ve multiplied our random_data2 array by 4. Every element in the array has been multiplied.

```
In [7]: multiply = random_data2 * 4
In [8]: print(multiply)
[[ 3.14501163 -1.93480914 -0.51344348]
 [ 2.32919758 -3.2380401  -0.24410956]
 [ 1.25977256  3.11086556 -4.04537862]
 [ 2.11802955 -5.95685316 -4.99570671]
 [-6.95568168 -1.59076574  0.84513507]]
```

We can also add the array to itself. You can see below that I’ve added random_data to itself, doubling each of the element values.

```
In [10]: another_example = random_data + random_data
In [11]: print(another_example)
[[ 1.76605103 -2.84161873 -3.3972184 ]
 [-2.05769841 -1.09005875 -0.67684576]]
```
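This element-by-element behaviour is one of the main differences between an ndarray and a plain Python list. The short comparison below uses a made-up list of my own to show the same two operators on both types:

```python
import numpy as np

values = [1, 2, 3]

# On a plain list, * repeats the list and + concatenates
print(values * 2)        # [1, 2, 3, 1, 2, 3]
print(values + values)   # [1, 2, 3, 1, 2, 3]

# On an ndarray, the operation is applied to every element
arr = np.array(values)
print(arr * 2)           # [2 4 6]
print(arr + arr)         # [2 4 6]
```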

## Inspect our arrays

The ndarray has two main attributes for inspection: shape and dtype. Shape tells you how many dimensions the array has and how many elements lie along each one. The dtype attribute shows you the datatype of the array elements (as above, all elements must be the same datatype). Both are attributes rather than functions, so they are read without parentheses.

```
In [12]: another_example.shape
Out[12]: (2, 3)
In [13]: another_example.dtype
Out[13]: dtype('float64')
```
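Two related attributes, ndim and size, are also handy when inspecting an array. The sketch below uses an array of zeros that I’ve made up for the purpose, so the values are predictable:

```python
import numpy as np

example = np.zeros((2, 3))   # a 2 x 3 array filled with 0.0

print(example.shape)  # (2, 3)
print(example.ndim)   # 2, the number of dimensions
print(example.size)   # 6, the total number of elements
print(example.dtype)  # float64, the default for np.zeros
```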

## Converting lists to ndarrays

Below, I convert the list called ‘ages’ to an ndarray and then convert a list of two lists (‘age_groups’) to a two-dimensional array.

```
In [14]: ages = [1, 5, 7, 8, 10, 21]
In [15]: ndarray = np.array(ages)
In [16]: print(ndarray)
[ 1  5  7  8 10 21]
In [17]: age_groups = [[1, 5, 8, 8, 10, 21], [7, 2, 1, 11, 16, 22]]
In [18]: ndarray2 = np.array(age_groups)
In [19]: ndarray2
Out[19]:
array([[ 1,  5,  8,  8, 10, 21],
       [ 7,  2,  1, 11, 16, 22]])
```
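Note that np.array only produces a regular two-dimensional grid when the inner lists are all the same length. The sketch below uses made-up lists (not the ages data) to show the difference. Passing dtype=object for the uneven case is my own choice to make the intent explicit, since the default handling of uneven lists has changed between NumPy versions:

```python
import numpy as np

# Inner lists of equal length form a proper 2D grid
even = np.array([[1, 2, 3], [4, 5, 6]])
print(even.shape)    # (2, 3)

# Inner lists of unequal length cannot form a grid; with dtype=object
# we get a 1D array whose two elements are ordinary Python lists
uneven = np.array([[1, 2, 3], [4, 5]], dtype=object)
print(uneven.shape)  # (2,)
print(uneven.dtype)  # object
```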

## Explicitly define array data type

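Below, I define our first array as being of type float64. I then use astype to convert it to int64; astype returns a new array and leaves the original unchanged. Here, newlist is assumed to be a list of numbers defined earlier in the session (its definition is not shown).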

```
In [31]: array_defined_type = np.array(newlist, dtype=np.float64)
In [32]: array_defined_type.dtype
Out[32]: dtype('float64')
In [33]: changetype = array_defined_type.astype(np.int64)
In [34]: changetype.dtype
Out[34]: dtype('int64')
```
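One thing to watch when converting floats to integers: astype drops the fractional part rather than rounding. The values below are hypothetical, chosen by me to make this visible:

```python
import numpy as np

prices = np.array([1.9, 2.5, -1.9])   # hypothetical float values
as_ints = prices.astype(np.int64)     # fractional part is truncated toward zero

print(as_ints.tolist())   # [1, 2, -1]
print(prices.dtype)       # float64, the original array is unchanged
```

If you want rounding instead, one option is to call np.round on the array before converting.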

## Extracting data from ndarrays

Below, I extract each of the two rows (the original lists) stored within the array:

```
In [36]: age_groups = [[1, 5, 8, 8, 10, 21], [7, 2, 1, 11, 16, 22]]
In [37]: ndarray2 = np.array(age_groups)
In [38]: ndarray2[0]
Out[38]: array([ 1,  5,  8,  8, 10, 21])
In [40]: ndarray2[1]
Out[40]: array([ 7,  2,  1, 11, 16, 22])
```

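Now, I select a single element from within one of the inner arrays. The first index picks the row and the second picks the position within that row, so [1,1] is the second element of the second row.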

```
In [46]: ndarray2[1,1]
Out[46]: 2
```
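The same bracket notation also accepts slices, which lets you pull out part of a row or a whole column. A short sketch using the age_groups data from above (the particular slices are just my own examples):

```python
import numpy as np

age_groups = [[1, 5, 8, 8, 10, 21], [7, 2, 1, 11, 16, 22]]
ndarray2 = np.array(age_groups)

print(ndarray2[0, 1:4])   # [5 8 8], positions 1 to 3 of the first row
print(ndarray2[:, 0])     # [1 7], the first element of every row
print(ndarray2[1, -1])    # 22, the last element of the second row

# Chained indexing reaches the same element as the comma form
print(ndarray2[1][1] == ndarray2[1, 1])   # True
```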

## Multidimensional Arrays

An array whose elements are themselves arrays is called a multidimensional array. Let’s look at an example of a two-dimensional array.

```
In [1]: import numpy as np
In [2]: bmw = [60000, 300, 2000]
In [3]: renault = [50000, 445, 1500]
In [4]: nissan = [25000, 321, 3320]
In [5]: two_dimensions = [nissan, renault, bmw]
In [6]: np.array(two_dimensions)
Out[6]:
array([[25000,   321,  3320],
       [50000,   445,  1500],
       [60000,   300,  2000]])
```

This is a two-dimensional array because the nesting only goes two levels deep: the main array and the manufacturer-specific arrays inside it. 2D arrays are important, as tables are presented in two dimensions:

| Selling Price | Sales | Profit |
|---|---|---|
| 60000 | 300 | 2000 |
| 50000 | 445 | 1500 |
| 25000 | 321 | 3320 |

## Math with arrays

Below, I show how we can add, subtract, multiply and divide two arrays. In every case, the element at each index in array 1 is matched with the element at the same index in array 2. For example, bmw[1] = 300 and nissan[1] = 321, so the result of the addition for position 1 of the output array is 621.

```
In [20]: bmw = np.array(bmw)
In [21]: nissan = np.array(nissan)
In [23]: bmw_plus_nissan = np.add(bmw, nissan)
In [24]: print(bmw_plus_nissan)
[85000   621  5320]
In [25]: bmw_minus_nissan = np.subtract(bmw,nissan)
In [26]: print(bmw_minus_nissan)
[35000   -21 -1320]
In [27]: bmw_multiply_nissan = np.multiply(bmw,nissan)
In [28]: print(bmw_multiply_nissan)
[1500000000      96300    6640000]
In [29]: bmw_divide_nissan = np.divide(bmw, nissan)
In [30]: print(bmw_divide_nissan)
[2.4        0.93457944 0.60240964]
```
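The named functions above have operator equivalents, which is what we used earlier when multiplying by 4 and adding an array to itself. The sketch below rebuilds the two arrays so it runs on its own and checks that both spellings give the same result:

```python
import numpy as np

bmw = np.array([60000, 300, 2000])
nissan = np.array([25000, 321, 3320])

# Each operator performs the same element-wise operation as the named function
print(np.array_equal(bmw + nissan, np.add(bmw, nissan)))        # True
print(np.array_equal(bmw - nissan, np.subtract(bmw, nissan)))   # True
print(np.array_equal(bmw * nissan, np.multiply(bmw, nissan)))   # True
print(np.array_equal(bmw / nissan, np.divide(bmw, nissan)))     # True

print((bmw + nissan).tolist())   # [85000, 621, 5320]
```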

We can sum all elements in an array using the below:

```
In [34]: print(np.sum(bmw))
62300
```
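On a two-dimensional array, np.sum also accepts an axis argument, so you can total each column or each row instead of the whole table. A sketch using the three manufacturers’ figures (the name cars is my own):

```python
import numpy as np

cars = np.array([
    [25000, 321, 3320],   # nissan
    [50000, 445, 1500],   # renault
    [60000, 300, 2000],   # bmw
])

print(np.sum(cars))                    # 142886, every element added together
print(np.sum(cars, axis=0).tolist())   # [135000, 1066, 6820], one total per column
print(np.sum(cars, axis=1).tolist())   # [28641, 51945, 62300], one total per row
```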