np.ndarray
The main ‘pillar’ of NumPy is the np.ndarray object, whose name stands for N-dimensional array. Arrays are normally created with the np.array() function rather than by calling np.ndarray directly.
Unlike a Python list, the elements of an np.ndarray must all be of the same type.
np.ndarrays are also more efficient: for numerical work they use less memory and run faster than Python lists. Prefer np.ndarray over list for large-scale array and matrix operations.
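As a rough illustration of both points, the sketch below shows NumPy converting mixed inputs to a single common type, and compares the size of an array's data buffer with the memory a list of the same integers needs. The variable names and the array length are made up for illustration, and the dtype is set explicitly so the array's byte count is predictable. The list's exact byte count varies by Python version and platform, so only the comparison is printed.

```python
import sys
import numpy as np

# Mixed int/float input is converted to one common type (float64 here).
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  ## float64

# Compare memory for one million integers (illustrative size).
n = 1_000_000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# The array stores raw 8-byte values in one contiguous buffer.
array_bytes = arr.nbytes
print(array_bytes)  ## 8000000

# The list stores pointers to separate int objects, so count both.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)
print(list_bytes > array_bytes)  ## True
```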
Creating a NumPy array
For one dimension, you can pass a Python list (or in fact any sequence type) to np.array().
import numpy as np

x = np.array([1, 2, 3])
print(x) ## [1 2 3]
print(type(x)) ## <class 'numpy.ndarray'>
For higher dimensions, use a nested list. The following code gives you a 2D matrix.
y = np.array([[1, 2, 3], [-1, 4, 7]])
print(y)
## [[ 1  2  3]
##  [-1  4  7]]
In NumPy, a dimension is called an axis (plural: axes).
np.array([1, 2, 3]) has one axis (axis 0) with 3 elements.
np.array([[1, 2, 3], [-1, 4, 7]]) has two axes. Axis 0 has 2 elements (rows), and axis 1 has 3 elements (columns).
In higher dimensions, it is easier to think of the arrays as arrays inside arrays.
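One way to see the "arrays inside arrays" view is to index into a 3D array one level at a time. This is a small sketch using a made-up array z. The built-in len() reports the number of elements along the outermost axis of whatever it is given.

```python
import numpy as np

# A made-up 3D array: 2 blocks, each with 3 rows of 4 elements.
z = np.array([[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
              [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]])

print(len(z))        ## 2 -> axis 0 holds 2 two-dimensional arrays
print(len(z[0]))     ## 3 -> axis 1: each of those holds 3 rows
print(len(z[0][0]))  ## 4 -> axis 2: each row holds 4 elements

# Indexing once peels off the outermost axis.
print(z[1])
## [[12 13 14 15]
##  [16 17 18 19]
##  [20 21 22 23]]
```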
np.ndarray attributes
Here are some of the most important attributes of the np.ndarray object:
arr_name.ndim – the number of axes (dimensions) of the array
arr_name.shape – a tuple of integers representing the dimensions of the array (the number of elements in each axis)
arr_name.size – the total number of elements of the array (i.e. the product of the elements of arr_name.shape)
arr_name.dtype – a data type object describing the type of the elements in the array
x = np.array([[[0, 1, 2, 3],
               [4, 5, 6, 7]],
              [[0, 1, 2, 3],
               [4, 5, 6, 7]],
              [[0, 1, 2, 3],
               [4, 5, 6, 7]]])
print(x.ndim) ## 3
print(x.shape) ## (3, 2, 4)
print(x.size) ## 24
print(x.dtype) ## int64 on most platforms (int32 on Windows with NumPy < 2.0)
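The relationship between these attributes can be checked directly. In this short sketch, the array a is made up for illustration and built with np.zeros, which is not used elsewhere in this section. It checks that ndim is the length of shape and that size is the product of the entries of shape.

```python
import math
import numpy as np

a = np.zeros((3, 2, 4))  # made-up array with the same shape as x above

print(a.ndim == len(a.shape))        ## True
print(a.size == math.prod(a.shape))  ## True
print(a.dtype)                       ## float64 (the default for np.zeros)
```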
The dtype of an ndarray is usually inferred from the type of the elements in the sequence that you pass to np.array().
You can also specify the dtype explicitly through the dtype argument. It can be either a standard Python type (e.g. int or float) or a NumPy data type (e.g. np.int32, np.float64).
x = np.array([0, 1, 2, 3])
print(x.dtype) ## int64 on most platforms (int32 on Windows with NumPy < 2.0)
x = np.array([0.0, 1.1, 2, 3])
print(x.dtype) ## float64
x = np.array(["a", "b", "c", "d"])
print(x.dtype) ## <U1 (unicode string)
x = np.array([0, 1, 2, 3], dtype=float)
print(x.dtype) ## float64
x = np.array([0, 1, 2, 3], dtype=np.int32)
print(x.dtype) ## int32
x = np.array([0, 1, 2, 3], dtype=np.uint32)
print(x.dtype) ## uint32
x = np.array([0, 1, 2, 3], dtype=np.float64)
print(x.dtype) ## float64
int32 refers to a 32-bit integer, while int64 refers to a 64-bit integer. Simply put, more bits let you represent larger numbers.
uint32 is an unsigned (non-negative) 32-bit integer. It is useful when you never expect a value to be negative (e.g. counts).
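The range each integer type can hold can be looked up with np.iinfo. The snippet below is a small sketch of that lookup for the three types mentioned above.

```python
import numpy as np

# Print the smallest and largest value each integer type can represent.
for t in (np.int32, np.int64, np.uint32):
    info = np.iinfo(t)
    print(t.__name__, info.min, info.max)
## int32 -2147483648 2147483647
## int64 -9223372036854775808 9223372036854775807
## uint32 0 4294967295
```

Because uint32 does not spend any of its range on negative numbers, its maximum is about twice that of int32 for the same number of bits.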
The official documentation provides a complete list of attributes for np.ndarray.