NumPy (short for 'Numerical Python' or 'Numeric Python') is one of the most essential packages for fast mathematical computation on arrays and matrices in Python. It is also quite useful when dealing with multi-dimensional data. It provides tools for integrating C, C++ and Fortran code, as well as numerous functions for Fourier transforms (FT) and linear algebra.
Python : Numpy Tutorial
Why NumPy instead of lists?
One might wonder why NumPy arrays should be preferred when we can create lists holding the same type of data. If this question has crossed your mind, the following reasons may convince you:
- NumPy arrays are stored in contiguous memory with a single data type. The same data stored as a list therefore requires more space than an array.
- They are faster to work with and hence more efficient than lists.
- They are more convenient to deal with.
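The points above can be illustrated with a small sketch. The variable names and the size of 1000 elements are arbitrary choices for illustration, and the byte count for the list is only an approximation based on sys.getsizeof in CPython.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.arange(1000, dtype=np.int64)

# Element-wise arithmetic: a list needs an explicit loop, an array does not.
doubled_list = [v * 2 for v in values]
doubled_arr = arr * 2

# Data buffer of the array: 8 bytes per int64 element.
array_bytes = arr.nbytes

# A list stores references plus one separate Python int object per element.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

print(array_bytes, list_bytes)
```

Both computations give the same numbers, but the array version avoids the Python-level loop and needs less memory for the same data.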
NumPy vs. Pandas
Pandas is built on top of NumPy. In other words, NumPy is required by pandas to make it work. So pandas is not an alternative to NumPy. Instead, pandas offers additional methods and a more streamlined way of working with numerical and tabular data in Python.
Importing numpy
First you need to import the numpy library. Importing numpy can be done by running the following command:
import numpy as np
It is common practice to import numpy with the alias 'np'. If an alias is not provided, then to access the functions from numpy we have to write numpy.function. With the alias 'np' we can write np.function instead. Some of the common functions of numpy are listed below -
Functions | Tasks |
---|---|
array | Create numpy array |
ndim | Dimension of the array |
shape | Size of the array (Number of rows and Columns) |
size | Total number of elements in the array |
dtype | Type of elements in the array, e.g., int64, character |
reshape | Returns the array in a new shape without modifying the original array |
resize | Changes the shape of the original array in place |
arange | Create sequence of numbers in array |
itemsize | Size in bytes of each item |
diag | Create a diagonal matrix |
vstack | Stacking vertically |
hstack | Stacking horizontally |
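As a quick illustration of the attributes listed in the table, here is a minimal sketch on a small array. The array m is made up for this example, and the exact integer dtype (and therefore itemsize) depends on the platform.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns

print(m.ndim)       # number of dimensions: 2
print(m.shape)      # (rows, columns): (3, 4)
print(m.size)       # total number of elements: 12
print(m.dtype)      # an integer dtype; the exact width is platform dependent
print(m.itemsize)   # bytes per element, determined by the dtype
```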
1D array
Using numpy, an array is created with np.array:
a = np.array([15,25,14,78,96])
a
print(a)
a
Output: array([15, 25, 14, 78, 96])
print(a)
Output: [15 25 14 78 96]
Notice that the values are passed to np.array inside square brackets. Omitting the square brackets raises an error. To print the array we can use print(a).
Changing the datatype
np.array( ) has an additional parameter, dtype, through which one can define whether the elements are integers, floating-point numbers or complex numbers.
a.dtype
a = np.array([15,25,14,78,96],dtype = "float")
a
a.dtype
Initially the datatype of 'a' was 'int32' (the default integer type is platform dependent, so on many systems it is 'int64'), which on modifying becomes 'float64'.
- int32 refers to an integer, i.e. a number without a decimal point, stored in 32 bits. It can lie between -2147483648 and 2147483647. Similarly, int16 implies the number can be in the range -32768 to 32767.
- float64 refers to a floating-point number, i.e. a number with a decimal part, stored in 64 bits.
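The ranges quoted above can be checked with np.iinfo. The sketch below also converts an existing array with astype instead of rebuilding it; the name a_float is an assumption made for this example.

```python
import numpy as np

# np.iinfo reports the representable range of an integer dtype.
int16_info = np.iinfo(np.int16)
int32_info = np.iinfo(np.int32)
print(int16_info.min, int16_info.max)   # -32768 32767
print(int32_info.min, int32_info.max)   # -2147483648 2147483647

a = np.array([15, 25, 14, 78, 96])
a_float = a.astype("float64")   # returns a converted copy; 'a' keeps its dtype
print(a.dtype, a_float.dtype)
```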
Creating the sequence of numbers
To create a sequence of numbers we can use np.arange. To get the sequence of numbers from 20 to 29 we run the following command.
b = np.arange(start = 20,stop = 30, step = 1)
b
array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29])
In np.arange the end point is always excluded.
np.arange provides an option, step, which defines the difference between two consecutive numbers. If step is not provided, it takes the value 1 by default.
Suppose we want to create an arithmetic progression with initial term 20 and common difference 2, up to 30; 30 being excluded.
c = np.arange(20,30,2) #30 is excluded.
c
array([20, 22, 24, 26, 28])
Note again that in np.arange( ) the stop argument is always excluded.
Indexing in arrays
It is important to note that Python indexing starts from 0. The syntax of indexing is as follows -
- x[start:end:step]: Elements in array x start through the end (but the end is excluded), default step value is 1.
- x[start:end] : Elements in array x start through the end (but the end is excluded)
- x[start:] : Elements start through the end
- x[:end] : Elements from the beginning through the end (but the end is excluded)
To extract the 3rd element we write the index as 2, since indexing starts from 0.
x = np.arange(10)
x[2]
x[2:5]
x[::2]
x[1::2]
x
Output: [0 1 2 3 4 5 6 7 8 9]
x[2]
Output: 2
x[2:5]
Output: array([2, 3, 4])
x[::2]
Output: array([0, 2, 4, 6, 8])
x[1::2]
Output: array([1, 3, 5, 7, 9])
Note that in x[2:5] the elements from index 2 up to index 5 (exclusive) are selected.
To set every third element to 123, starting from the beginning and stopping before index 7, we write:
x[:7:3] = 123
x
array([123, 1, 2, 123, 4, 5, 123, 7, 8, 9])
To reverse a given array we write:
x = np.arange(10)
x[ : :-1] # reversed x
array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
Note that the above command does not modify the original array.
Reshaping the arrays
To reshape the array we can use reshape( ).
f = np.arange(101,113)
f.reshape(3,4)
f
array([101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112])
Note that reshape( ) does not alter the shape of the original array; it returns a reshaped array. To modify the original array itself we can use resize( ).
f.resize(3,4)
f
array([[101, 102, 103, 104], [105, 106, 107, 108], [109, 110, 111, 112]])
If a dimension is given as -1 in a reshaping, that dimension is calculated automatically, provided that the total number of elements in the array is a multiple of the dimensions that were given.
f.reshape(3,-1)
array([[101, 102, 103, 104], [105, 106, 107, 108], [109, 110, 111, 112]])
In the above code we only specified that we want 3 rows. NumPy automatically calculates the number of elements in the other dimension, i.e. 4 columns.
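The rule for -1 can be sketched as follows. The variable names are chosen for this example only; the last call is expected to fail because 12 elements cannot be arranged in 5 rows.

```python
import numpy as np

f = np.arange(101, 113)          # 12 elements

rows3 = f.reshape(3, -1)         # -1 is inferred as 12 / 3 = 4
cols2 = f.reshape(-1, 2)         # -1 is inferred as 12 / 2 = 6
print(rows3.shape, cols2.shape)  # (3, 4) (6, 2)

# 12 is not a multiple of 5, so no valid size exists for the -1 dimension.
reshape_failed = False
try:
    f.reshape(5, -1)
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```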
Missing Data
Missing data is represented by NaN (short for Not a Number). You can create it with np.nan.
val = np.array([15,10, np.nan, 3, 2, 5, 6, 4])
val.sum()
Out: nan
To ignore missing values, you can use np.nansum(val), which returns 45.0.
To check whether an array contains missing values, you can use the function isnan( ).
np.isnan(val)
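Since np.isnan returns a Boolean array, it can be combined with other operations. The following is a minimal sketch, with variable names chosen for illustration, that counts the missing values and drops them.

```python
import numpy as np

val = np.array([15, 10, np.nan, 3, 2, 5, 6, 4])

mask = np.isnan(val)             # True where the value is missing
has_missing = bool(mask.any())   # does the array contain any NaN?
n_missing = int(mask.sum())      # each True counts as 1
clean = val[~mask]               # keep only the non-missing values
total = np.nansum(val)           # sum that skips NaN

print(has_missing, n_missing, clean, total)
```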
2D arrays
A 2D array in numpy can be created in the following manner:
g = np.array([(10,20,30),(40,50,60)])
#Alternatively
g = np.array([[10,20,30],[40,50,60]])
g
The dimension, total number of elements and shape can be ascertained by ndim, size and shape respectively:
g.ndim
g.size
g.shape
g.ndim
Output: 2
g.size
Output: 6
g.shape
Output: (2, 3)
Creating some usual matrices
numpy provides utilities to create some common matrices which are frequently used in linear algebra.
To create a matrix of all zeros with 2 rows and 4 columns we can use np.zeros( ):
np.zeros( (2,4) )
array([[ 0., 0., 0., 0.], [ 0., 0., 0., 0.]])
Here the dtype can also be specified. For a zero matrix the default dtype is 'float'. To change it to integer we write 'dtype = np.int16'
np.zeros([2,4],dtype=np.int16)
array([[0, 0, 0, 0], [0, 0, 0, 0]], dtype=int16)
To allocate a matrix without initialising its entries we write np.empty. The values it contains are whatever happened to be in memory, so they are arbitrary rather than random numbers from 0 to 1.
np.empty( (2,3) )
array([[ 2.16443571e-312, 2.20687562e-312, 2.24931554e-312], [ 2.29175545e-312, 2.33419537e-312, 2.37663529e-312]])
Note: The results may vary every time you run np.empty.
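If genuinely random numbers between 0 and 1 are wanted, one option is NumPy's random generator. The sketch below assumes a NumPy version that provides np.random.default_rng; the seed value 0 and the fill value 7.0 are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # the seed value is an arbitrary choice
r = rng.random((2, 3))                # uniform floats in the interval [0, 1)
print(r)

# np.empty only allocates memory, so fill the array before reading from it.
e = np.empty((2, 3))
e[:] = 7.0
print(e)
```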
To create a matrix of all ones we write np.ones( ). We can create a 3 * 3 matrix of all ones by:
np.ones([3,3])
array([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])
To create a diagonal matrix we can write np.diag( ). To create a diagonal matrix where the diagonal elements are 14,15,16 and 17 we write:
np.diag([14,15,16,17])
array([[14, 0, 0, 0], [ 0, 15, 0, 0], [ 0, 0, 16, 0], [ 0, 0, 0, 17]])
To create an identity matrix we can use np.eye( ) .
np.eye(5,dtype = "int")
array([[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 1, 0], [0, 0, 0, 0, 1]])
By default the datatype in np.eye( ) is 'float', thus we write dtype = "int" to get integers.
Reshaping 2D arrays
To get a flattened 1D array we can use ravel( )
g = np.array([(10,20,30),(40,50,60)])
g.ravel()
array([10, 20, 30, 40, 50, 60])
To change the shape of a 2D array we can use reshape. Writing -1 calculates the other dimension automatically; reshape does not modify the original array.
g.reshape(3,-1) # returns the array with a modified shape
#It does not modify the original array
g.shape
(2, 3)
Similar to 1D arrays, using resize( ) will modify the shape of the original array.
g.resize((3,2))
g #resize modifies the original array
array([[10, 20], [30, 40], [50, 60]])
Time for some matrix algebra
Let us create some arrays A, b and B, which will be used in this section:
A = np.array([[2,0,1],[4,3,8],[7,6,9]])
b = np.array([1,101,14])
B = np.array([[10,20,30],[40,50,60],[70,80,90]])
In order to get the transpose, trace and inverse we use A.transpose( ), np.trace( ) and np.linalg.inv( ) respectively.
A.T #transpose
A.transpose() #transpose
np.trace(A) # trace
np.linalg.inv(A) #Inverse
A.transpose() #transpose
Output: array([[2, 4, 7], [0, 3, 6], [1, 8, 9]])
np.trace(A) # trace
Output: 14
np.linalg.inv(A) #Inverse
Output: array([[ 0.53846154, -0.15384615, 0.07692308], [-0.51282051, -0.28205128, 0.30769231], [-0.07692308, 0.30769231, -0.15384615]])
Note that transpose does not modify the original array.
Matrix addition and subtraction can be done in the usual way:
A+B
A-B
A+B
Output: array([[12, 20, 31], [44, 53, 68], [77, 86, 99]])
A-B
Output: array([[ -8, -20, -29], [-36, -47, -52], [-63, -74, -81]])
Matrix multiplication of A and B can be accomplished by A.dot(B), where A is the first matrix (on the left-hand side) and B is the second matrix (on the right-hand side).
A.dot(B)
array([[ 90, 120, 150], [ 720, 870, 1020], [ 940, 1160, 1380]])
To solve the system of linear equations Ax = b we use np.linalg.solve( )
np.linalg.solve(A,b)
array([-13.92307692, -24.69230769, 28.84615385])
The eigenvalues and eigenvectors can be calculated using np.linalg.eig( )
np.linalg.eig(A)
(array([ 14.0874236 , 1.62072127, -1.70814487]), array([[-0.06599631, -0.78226966, -0.14996331], [-0.59939873, 0.54774477, -0.81748379], [-0.7977253 , 0.29669824, 0.55608566]]))
The first array contains the eigenvalues and the second is the matrix of eigenvectors, where each column is the eigenvector for the corresponding eigenvalue.
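One way to check these results is to substitute them back into the defining equations. The sketch below uses the @ operator, which for 2D arrays performs the same matrix product as dot; the names solve_ok and eig_ok are introduced here for illustration only.

```python
import numpy as np

A = np.array([[2, 0, 1], [4, 3, 8], [7, 6, 9]])
b = np.array([1, 101, 14])

x = np.linalg.solve(A, b)
# A @ x should reproduce b up to floating-point rounding.
solve_ok = np.allclose(A @ x, b)

eigenvalues, eigenvectors = np.linalg.eig(A)
# Column k of 'eigenvectors' pairs with eigenvalues[k]: A v = lambda v.
# Multiplying by the 1D array of eigenvalues scales column k by eigenvalues[k].
eig_ok = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)

print(solve_ok, eig_ok)
```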
Some Mathematics functions
We can apply various trigonometric functions such as sin, cos etc. using numpy:
B = np.array([[0,-20,36],[40,50,1]])
np.sin(B)
array([[ 0. , -0.91294525, -0.99177885], [ 0.74511316, -0.26237485, 0.84147098]])
The result is the matrix of the sines of all elements (the values are interpreted as radians).
In order to raise the elements to a power we use **
B**2
array([[ 0, 400, 1296], [1600, 2500, 1]], dtype=int32)
We get the matrix of the squares of all elements of B.
To check whether a condition is satisfied by the elements of a matrix we write the condition directly. For instance, to check if the elements of B are more than 25 we write:
B>25
array([[False, False, True], [ True, True, False]], dtype=bool)
We get a matrix of Booleans where True indicates that the corresponding element is greater than 25 and False indicates that the condition is not satisfied.
In a similar manner np.absolute, np.sqrt and np.exp return the matrices of absolute values, square roots and exponentials respectively. Note that np.sqrt returns nan for the negative entry of B and emits a RuntimeWarning.
np.absolute(B)
np.sqrt(B)
np.exp(B)
Now we consider a matrix A of shape 3*3:
A = np.arange(1,10).reshape(3,3)
A
array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
To find the sum, minimum, maximum, mean, standard deviation and variance respectively we use the following commands:
A.sum()
A.min()
A.max()
A.mean()
A.std() #Standard deviation
A.var() #Variance
A.sum()
Output: 45
A.min()
Output: 1
A.max()
Output: 9
A.mean()
Output: 5.0
A.std() #Standard deviation
Output: 2.5819888974716112
A.var()
Output: 6.666666666666667
In order to obtain the index of the minimum and maximum elements we use argmin( ) and argmax( ) respectively.
A.argmin()
A.argmax()
A.argmin()
Output: 0
A.argmax()
Output: 8
Without an axis argument, these indices refer to positions in the flattened array. If we wish to find the above statistics for each row or column then we need to specify the axis:
A.sum(axis=0)
A.mean(axis = 0)
A.std(axis = 0)
A.argmin(axis = 0)
A.sum(axis=0) # sum of each column, it will move in downward direction
Output: array([12, 15, 18])
A.mean(axis = 0)
Output: array([ 4., 5., 6.])
A.std(axis = 0)
Output: array([ 2.44948974, 2.44948974, 2.44948974])
A.argmin(axis = 0)
Output: array([0, 0, 0], dtype=int64)
By defining axis = 0, calculations move in the downward direction, i.e. it gives the statistics for each column. To find the minimum and the index of the maximum element for each row, we need to move in the rightward direction, so we write axis = 1:
A.min(axis=1)
A.argmax(axis = 1)
A.min(axis=1) # min of each row, it will move in rightwise direction
Output: array([1, 4, 7])
A.argmax(axis = 1)
Output: array([2, 2, 2], dtype=int64)
To find the cumulative sum along each row we use cumsum( )
A.cumsum(axis=1)
array([[ 1, 3, 6], [ 4, 9, 15], [ 7, 15, 24]], dtype=int32)
Creating 3D arrays
Numpy also provides the facility to create 3D arrays. A 3D array can be created as:
X = np.array( [[[ 1, 2,3],
[ 4, 5, 6]],
[[7,8,9],
[10,11,12]]])
X.shape
X.ndim
X.size
X contains two 2D arrays, thus the shape is (2, 2, 3). The total number of elements is 12.
To calculate the sum along a particular axis we use the axis parameter as follows:
X.sum(axis = 0)
X.sum(axis = 1)
X.sum(axis = 2)
X.sum(axis = 0)
Output: array([[ 8, 10, 12], [14, 16, 18]])
X.sum(axis = 1)
Output: array([[ 5, 7, 9], [17, 19, 21]])
X.sum(axis = 2)
Output: array([[ 6, 15], [24, 33]])
axis = 0 returns the sum of the corresponding elements of each 2D array. axis = 1 returns the sum of elements in each column in each matrix while axis = 2 returns the sum of each row in each matrix.
X.ravel()
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
ravel( ) writes all the elements in a single 1D array.
Consider a 3D array:
X = np.array( [[[ 1, 2,3],
[ 4, 5, 6]],
[[7,8,9],
[10,11,12]]])
To extract the 2nd matrix we write:
X[1,...] # same as X[1,:,:] or X[1]
array([[ 7, 8, 9], [10, 11, 12]])
Remember that Python indexing starts from 0, which is why we wrote 1 to extract the 2nd 2D array.
To extract the first element from all the rows we write:
X[...,0] # same as X[:,:,0]
array([[ 1, 4], [ 7, 10]])
Find out position of elements that satisfy a given condition
a = np.array([8, 3, 7, 0, 4, 2, 5, 2])
np.where(a > 4)
(array([0, 2, 6]),)
np.where locates the positions in the array where the element is greater than 4. The result is a tuple containing one array of indices per dimension.
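Because np.where with a single argument returns a tuple, the index array usually has to be unpacked before use. A minimal sketch follows, with names chosen for this example; it also shows the three-argument form of np.where, which selects between two alternatives element-wise.

```python
import numpy as np

a = np.array([8, 3, 7, 0, 4, 2, 5, 2])

result = np.where(a > 4)     # a tuple with one index array per dimension
positions = result[0]        # for a 1D input, take the first (only) entry
values = a[positions]        # the elements that satisfied the condition

# With three arguments np.where picks element-wise between two alternatives.
capped = np.where(a > 4, 4, a)   # replace values above 4 by 4

print(positions, values, capped)
```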
Indexing with Arrays of Indices
Consider a 1D array.
x = np.arange(11,35,2)
x
array([11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33])
We form a 1D array i which subsets the elements of x as follows:
i = np.array( [0,1,5,3,7,9 ] )
x[i]
array([11, 13, 21, 17, 25, 29])
In a similar manner we create a 2D array j of indices to subset x.
j = np.array( [ [ 0, 1], [ 6, 2 ] ] )
x[j]
array([[11, 13], [23, 15]])
Similarly, for a 2D array x we can use both i and j as 2D arrays of indices:
x = np.arange(15).reshape(3,5)
x
i = np.array( [ [0,1], # indices for the first dim
[2,0] ] )
j = np.array( [ [1,1], # indices for the second dim
[2,0] ] )
To pick the elements whose row indices come from i and whose column indices come from j we write:
x[i,j] # i and j must have equal shape
array([[ 1, 6], [12, 0]])
To extract the rows given by i from the 3rd column we write:
x[i,2]
array([[ 2, 7], [12, 2]])
To pick the columns given by j from every row we write:
x[:,j]
array([[[ 1, 1], [ 2, 0]], [[ 6, 6], [ 7, 5]], [[11, 11], [12, 10]]])
The result stacks the selections for the 1st, 2nd and 3rd rows, each taken at the column indices in j.
You can also use indexing with arrays to assign values:
x = np.arange(10)
x
x[[4,5,8,1,2]] = 0
x
array([0, 0, 0, 3, 0, 0, 6, 7, 0, 9])
0 is assigned to indices 4, 5, 8, 1 and 2 of x.
When the list of indices contains repetitions, the last value is assigned to that index:
x = np.arange(10)
x
x[[4,4,2,3]] = [100,200,300,400]
x
array([ 0, 1, 300, 400, 200, 5, 6, 7, 8, 9])
Notice that for the 5th element (i.e. index 4) the value assigned is 200, not 100.
Caution: If one uses the += operator with repeated indices, the operation is carried out only once for each repeated index.
x = np.arange(10)
x[[1,1,1,7,7]]+=1
x
array([0, 2, 2, 3, 4, 5, 6, 8, 8, 9])
Although indices 1 and 7 are repeated, they are incremented only once.
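If the intention is to increment once per occurrence of an index, one option is the unbuffered ufunc method np.add.at. The sketch below contrasts it with +=; the names buffered and y are chosen for this example.

```python
import numpy as np

x = np.arange(10)
x[[1, 1, 1, 7, 7]] += 1            # each listed index is incremented once
buffered = x.copy()

y = np.arange(10)
np.add.at(y, [1, 1, 1, 7, 7], 1)   # every occurrence of an index is applied

print(buffered)   # index 1 became 2, index 7 became 8
print(y)          # index 1 became 4, index 7 became 9
```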
Indexing with Boolean Arrays
We create a 2D array and store our condition in b. Where the condition is true the result is True, otherwise False.
a = np.arange(12).reshape(3,4)
b = a > 4
b
array([[False, False, False, False], [False, True, True, True], [ True, True, True, True]], dtype=bool)
Note that 'b' is a Boolean array with the same shape as 'a'.
To select the elements from 'a' which satisfy condition 'b' we write:
a[b]
array([ 5, 6, 7, 8, 9, 10, 11])
The result is a 1D array with the selected elements; 'a' itself is unchanged.
This property can be very useful in assignments:
a[b] = 0
a
array([[0, 1, 2, 3], [4, 0, 0, 0], [0, 0, 0, 0]])
All elements of 'a' higher than 4 become 0.
As with integer indexing, we can index each dimension with Booleans:
Let x be the original matrix and 'y' and 'z' be the arrays of Booleans to select the rows and columns.
x = np.arange(15).reshape(3,5)
y = np.array([True,True,False]) # first dim selection
z = np.array([True,True,False,True,False]) # second dim selection
We write x[y,:], which selects only those rows where y is True.
x[y,:] # selecting rows
x[y] # same thing
Writing x[:,z] selects only those columns where z is True.
x[:,z] # selecting columns
x[y,:] # selecting rows
Output: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
x[y] # same thing
Output: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
x[:,z] # selecting columns
Output: array([[ 0, 1, 3], [ 5, 6, 8], [10, 11, 13]])
Statistics on Pandas DataFrame
Let's create a dummy data frame for illustration:
import pandas as pd
np.random.seed(234)
mydata = pd.DataFrame({"x1" : np.random.randint(low=1, high=100, size=10), "x2" : range(10) })
1. Calculate mean of each column of data frame
np.mean(mydata, axis=0)
2. Calculate median of each column of data frame
np.median(mydata, axis=0)
axis = 0 means the function is run on each column. axis = 1 implies the function is run on each row. Passing axis = 0 explicitly to np.mean keeps the result per column regardless of the pandas version.
Stacking various arrays
Let us consider 2 arrays A and B:
A = np.array([[10,20,30],[40,50,60]])
B = np.array([[100,200,300],[400,500,600]])
To join them vertically we use np.vstack( ).
np.vstack((A,B)) #Stacking vertically
array([[ 10, 20, 30], [ 40, 50, 60], [100, 200, 300], [400, 500, 600]])
To join them horizontally we use np.hstack( ).
np.hstack((A,B)) #Stacking horizontally
array([[ 10, 20, 30, 100, 200, 300], [ 40, 50, 60, 400, 500, 600]])
newaxis helps in transforming a 1D array into a 2D column vector.
from numpy import newaxis
a = np.array([4.,1.])
b = np.array([2.,8.])
a[:,newaxis]
array([[ 4.], [ 1.]])
The function np.column_stack( ) stacks 1D arrays as columns into a 2D array. It is equivalent to hstack only for 2D arrays:
np.column_stack((a[:,newaxis],b[:,newaxis]))
np.hstack((a[:,newaxis],b[:,newaxis])) # same as column_stack
np.column_stack((a[:,newaxis],b[:,newaxis]))
Output: array([[ 4., 2.], [ 1., 8.]])
np.hstack((a[:,newaxis],b[:,newaxis]))
Output: array([[ 4., 2.], [ 1., 8.]])
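The difference between the two functions shows up when they are given 1D arrays directly, without newaxis. A minimal sketch, with result names chosen for illustration:

```python
import numpy as np

a = np.array([4., 1.])
b = np.array([2., 8.])

as_columns = np.column_stack((a, b))   # 1D inputs become the columns of a 2D array
joined = np.hstack((a, b))             # 1D inputs are simply concatenated

print(as_columns.shape)   # (2, 2)
print(joined.shape)       # (4,)
```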
Splitting the arrays
Consider an array 'z' of 15 elements:
z = np.arange(1,16)
Using np.hsplit( ) one can split the arrays
np.hsplit(z,5) # Split z into 5 arrays
[array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9]), array([10, 11, 12]), array([13, 14, 15])]
It splits 'z' into 5 arrays of equal length.
On passing a tuple of 2 indices we get:
np.hsplit(z,(3,5))
[array([1, 2, 3]), array([4, 5]), array([ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])]
It splits 'z' after the third and the fifth element.
For 2D arrays np.hsplit( ) works as follows:
A = np.arange(1,31).reshape(3,10)
A
np.hsplit(A,5) # Split A into 5 arrays
[array([[ 1, 2], [11, 12], [21, 22]]), array([[ 3, 4], [13, 14], [23, 24]]), array([[ 5, 6], [15, 16], [25, 26]]), array([[ 7, 8], [17, 18], [27, 28]]), array([[ 9, 10], [19, 20], [29, 30]])]
In the above command A gets split into 5 arrays of the same shape.
To split after the third and the fifth column we write:
np.hsplit(A,(3,5))
[array([[ 1, 2, 3], [11, 12, 13], [21, 22, 23]]), array([[ 4, 5], [14, 15], [24, 25]]), array([[ 6, 7, 8, 9, 10], [16, 17, 18, 19, 20], [26, 27, 28, 29, 30]])]
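When an integer is passed, np.hsplit requires an equal division. The sketch below, with names chosen for this example, shows what happens when that is not possible and uses np.array_split as one alternative that accepts sections of unequal length.

```python
import numpy as np

z = np.arange(1, 16)             # 15 elements

# 15 is not divisible by 4, so an equal split is impossible.
try:
    np.hsplit(z, 4)
    equal_split_failed = False
except ValueError:
    equal_split_failed = True

# np.array_split allows sections of unequal length.
parts = np.array_split(z, 4)
lengths = [len(p) for p in parts]

print(equal_split_failed, lengths)   # True [4, 4, 4, 3]
```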
Copying
Consider an array x
x = np.arange(1,16)
We assign x to y and then check 'y is x', which returns True because both names refer to the same object.
y = x
y is x
Let us change the shape of y
y.shape = 3,5
Note that it alters the shape of x
x.shape
(3, 5)
Creating a view of the data
Let us store z as a view of x by:
z = x.view()
z is x
False
Thus z is not x.
Changing the shape of z
z.shape = 5,3
Creating a view does not alter the shape of x
x.shape
(3, 5)
Changing an element in z
z[0,0] = 1234
Note that the value in x also gets altered:
x
array([[1234, 2, 3, 4, 5], [ 6, 7, 8, 9, 10], [ 11, 12, 13, 14, 15]])
Thus changing the shape of a view does not affect the original array, but changing values through the view does affect the original data.
Creating a copy of the data:
Now let us create z as a copy of x:
z = x.copy()
Note that z is not x
z is x
Changing the value in z
z[0,0] = 9999
No alterations are made in x.
x
array([[1234, 2, 3, 4, 5], [ 6, 7, 8, 9, 10], [ 11, 12, 13, 14, 15]])
pandas may sometimes give a 'setting with copy' warning because it is unable to recognize whether a new data frame (created as a subset of another data frame) is a view or a copy. In such situations the user should state explicitly whether a copy is wanted, for example with .copy( ), otherwise the results may not be what was intended.
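Whether two arrays share data can be checked with np.shares_memory. A minimal sketch follows, with names v, c and s chosen for this example; it also shows that basic slicing returns a view.

```python
import numpy as np

x = np.arange(1, 16).reshape(3, 5)

v = x.view()        # new array object, same underlying data
c = x.copy()        # new array object, independent data
s = x[:, 1:3]       # basic slicing also returns a view

print(np.shares_memory(x, v))   # True
print(np.shares_memory(x, c))   # False
print(np.shares_memory(x, s))   # True

s[0, 0] = -1        # writing through the slice changes x as well
print(x[0, 1])      # -1
print(c[0, 1])      # 2, the copy is unaffected
```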
Exercises : Numpy
1. How to extract even numbers from array?
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Desired Output :array([0, 2, 4, 6, 8])
arr[arr % 2 == 0]
2. How to find out the positions where elements of x and y are the same
x = np.array([5,6,7,8,3,4])
y = np.array([5,3,4,5,2,4])
Desired Output :array([0, 5])
np.where(x == y)
3. How to standardize values so that they lie between 0 and 1
k = np.array([5,3,4,5,2,4])
Hint :(k-min(k))/(max(k)-min(k))
kmax, kmin = k.max(), k.min()
k_new = (k - kmin)/(kmax - kmin)
4. How to calculate the percentile scores of an array
p = np.array([15,10, 3,2,5,6,4])
np.percentile(p, q=[5, 95])
5. Print the number of missing values in an array
p = np.array([5,10, np.nan, 3, 2, 5, 6, np.nan])
print("Number of missing values =", np.isnan(p).sum())