This is an archived version of the course. Please see the latest version of the course.

np.ndarray

The main ‘pillar’ of NumPy is the np.ndarray object, which stands for N-dimensional array. You normally create one with the np.array() function rather than by calling np.ndarray directly.

Unlike a Python list, elements in an np.ndarray must be of the same type.
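
As a minimal sketch of what "same type" means in practice, the snippet below mixes integers and a float in one call. The variable names are illustrative only. NumPy converts every element to a common type, while the equivalent Python list keeps each element's own type.

```python
import numpy as np

# Mixing ints and a float: NumPy upcasts every element to a common type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  ## float64

# A Python list keeps each element's own type.
items = [1, 2.5, 3]
print([type(v).__name__ for v in items])  ## ['int', 'float', 'int']
```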

np.ndarrays are also more efficient than Python lists: they use less memory and run faster. Prefer np.ndarray over list for large-scale array or matrix operations.
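
The memory side of this claim can be checked roughly as follows. This is a sketch that assumes CPython, where sys.getsizeof reports the size of the list object and of each int object separately; exact byte counts for the list vary by Python version, so only the array size is shown. It does not measure runtime.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.array(py_list, dtype=np.int64)

# A list stores pointers to separate int objects, so count both.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)
# An ndarray stores the raw values in one contiguous buffer.
array_bytes = arr.nbytes

print(array_bytes)               ## 8000 (1000 elements * 8 bytes each)
print(array_bytes < list_bytes)  ## True
```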

Creating a NumPy array

For one dimension, you can pass a Python list (or in fact any sequence type) to np.array().

import numpy as np

x = np.array([1, 2, 3])
print(x)         ## [1 2 3]
print(type(x))   ## <class 'numpy.ndarray'>

For higher dimensions, use a nested list. The following code gives you a 2D matrix.

y = np.array([[1, 2, 3], [-1, 4, 7]]) 
print(y)         
## [[ 1  2  3]
##  [-1  4  7]]

In NumPy, a dimension is called an axis (plural: axes).

np.array([1, 2, 3]) has one axis (axis 0) with 3 elements.

np.array([[1, 2, 3], [-1, 4, 7]]) has two axes. Axis 0 has 2 elements (rows), axis 1 has 3 elements (columns).
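
To make the axis numbering concrete, here is a small sketch that sums the same 2D array along each axis. It uses the sum method and the shape attribute, which are standard NumPy but not otherwise introduced at this point in the text.

```python
import numpy as np

y = np.array([[1, 2, 3], [-1, 4, 7]])

print(y.shape[0])     ## 2  (length of axis 0: rows)
print(y.shape[1])     ## 3  (length of axis 1: columns)

# Summing along axis 0 collapses the rows, leaving one value per column.
print(y.sum(axis=0))  ## [ 0  6 10]
# Summing along axis 1 collapses the columns, leaving one value per row.
print(y.sum(axis=1))  ## [ 6 10]
```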

[Figure: 2D np.ndarray]

In higher dimensions, it is easier to think of the arrays as arrays inside arrays.

[Figure: 3D np.ndarray]
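
The "arrays inside arrays" view can be sketched with plain indexing. The array z below is a made-up example: indexing once into the 3D array returns a 2D array, indexing again returns a 1D array, and a third index returns a single element.

```python
import numpy as np

z = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]]])

print(z[0])        # the first 2D array inside the 3D array
## [[1 2]
##  [3 4]]
print(z[0][1])     ## [3 4]
print(z[0][1][0])  ## 3
```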

np.ndarray attributes

Here are some of the most important attributes of the np.ndarray object:

  • arr_name.ndim – the number of axes (dimensions) of the array
  • arr_name.shape – a tuple of integers representing the dimensions of the array (the number of elements in each axis)
  • arr_name.size – the total number of elements of the array (i.e. the product of the elements of arr_name.shape)
  • arr_name.dtype – a data type object describing the type of the elements in the array

x = np.array([[[0, 1, 2, 3],
               [4, 5, 6, 7]],
              [[0, 1, 2, 3],
               [4, 5, 6, 7]],
              [[0, 1, 2, 3],
               [4, 5, 6, 7]]])
print(x.ndim)    ## 3
print(x.shape)   ## (3, 2, 4)
print(x.size)    ## 24
print(x.dtype)   ## int64 on most platforms (int32 on some, e.g. Windows with NumPy < 2.0)
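
The relationships between these attributes can be checked directly. The sketch below rebuilds the same 3D array and uses math.prod from the standard library (Python 3.8+) to multiply the entries of shape.

```python
import math
import numpy as np

x = np.array([[[0, 1, 2, 3],
               [4, 5, 6, 7]],
              [[0, 1, 2, 3],
               [4, 5, 6, 7]],
              [[0, 1, 2, 3],
               [4, 5, 6, 7]]])

# ndim is the length of the shape tuple.
print(x.ndim == len(x.shape))        ## True
# size is the product of the shape entries: 3 * 2 * 4 = 24.
print(x.size == math.prod(x.shape))  ## True
```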

The dtype of an ndarray is usually inferred from the type of elements in the sequence that you passed to the constructor.

You can also specify the dtype explicitly when creating the array. It can be either a standard Python type (e.g. int or float) or a NumPy type (e.g. np.int32, np.float64).

x = np.array([0, 1, 2, 3])  
print(x.dtype)   ## int64 on most platforms (int32 on some, e.g. Windows with NumPy < 2.0)
x = np.array([0.0, 1.1, 2, 3])
print(x.dtype)   ## float64
x = np.array(["a", "b", "c", "d"])
print(x.dtype)   ## <U1 (unicode string)
x = np.array([0, 1, 2, 3], dtype=float)
print(x.dtype)   ## float64
x = np.array([0, 1, 2, 3], dtype=np.int32)
print(x.dtype)   ## int32
x = np.array([0, 1, 2, 3], dtype=np.uint32)
print(x.dtype)   ## uint32
x = np.array([0, 1, 2, 3], dtype=np.float64)
print(x.dtype)   ## float64

int32 refers to a 32-bit integer, while int64 refers to a 64-bit integer. Simply put, you can represent larger numbers with a larger number of bits.

uint32 is an unsigned (non-negative) 32-bit integer. It is useful when a value can never be negative (e.g. counts).
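
The ranges of the integer types described above can be inspected with np.iinfo, a standard NumPy helper that is not covered elsewhere in this section. A short sketch:

```python
import numpy as np

# Print the smallest and largest value each integer type can hold.
for dt in (np.int32, np.int64, np.uint32):
    info = np.iinfo(dt)
    print(dt.__name__, info.min, info.max)
## int32 -2147483648 2147483647
## int64 -9223372036854775808 9223372036854775807
## uint32 0 4294967295
```

Note that uint32 uses the same 32 bits as int32, but because it does not need to represent negative numbers, its maximum is roughly twice as large.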

The official documentation provides a complete list of attributes for np.ndarray.