I am using NumPy to store data in matrices. Coming from an R background, I am used to having an extremely simple way to apply a function over the rows, the columns, or both of a matrix.
Is there something similar for the Python/NumPy combination? It's not a problem to write my own little implementation, but it seems to me that most of the versions I come up with will be significantly less efficient and more memory-intensive than any of the existing implementations.
I would like to avoid copying from the numpy matrix to a local variable etc., is that possible?
The functions I am trying to implement are mainly simple comparisons (e.g. how many elements of a certain column are smaller than number x or how many of them have absolute value larger than y).
Almost all numpy functions operate on whole arrays, and/or can be told to operate on a particular axis (row or column).
As long as you can define your function in terms of numpy functions acting on numpy arrays or array slices, your function will automatically operate on whole arrays, rows or columns.
It may be more helpful to ask about how to implement a particular function to get more concrete advice.
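As a minimal sketch of what "operating on a particular axis" looks like, the snippet below uses a small made-up array `a` (the values are purely illustrative). With `axis=0` the reduction runs down each column, with `axis=1` across each row, and with no axis over the whole array:

```python
import numpy as np

# Small illustrative matrix: 2 rows, 3 columns.
a = np.array([[1, -5, 3],
              [4, 2, -6]])

col_sums = a.sum(axis=0)  # one value per column
row_max = a.max(axis=1)   # one value per row
total = a.sum()           # reduction over every element

print(col_sums)  # [ 5 -3 -3]
print(row_max)   # [3 4]
print(total)     # -1
```

This is roughly the role that `apply(m, 2, f)` and `apply(m, 1, f)` play in R, as long as the function is one NumPy already provides.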
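NumPy also provides np.vectorize and np.frompyfunc to turn Python functions that operate on single numbers into functions that operate on NumPy arrays. For example:

    import numpy as np

    def myfunc(a, b):
        # Return the larger of two scalars.
        if a > b:
            return a
        else:
            return b

    vecfunc = np.vectorize(myfunc)
    result = vecfunc([[1, 2, 3], [5, 6, 9]], [7, 4, 5])
    print(result)
    # [[7 4 5]
    #  [7 6 9]]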
(The elements of the first array get replaced by the corresponding element of the second array when the second is bigger.)
But don't get too excited; np.vectorize and np.frompyfunc are just syntactic sugar. They don't actually make your code any faster. If your underlying Python function operates on one value at a time, then np.vectorize will feed it one item at a time, and the whole operation is going to be pretty slow (compared to using a NumPy function that calls some underlying C or Fortran implementation).
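For the element-wise maximum in the example above, NumPy already ships a compiled equivalent, `np.maximum`, which broadcasts the same way. The sketch below (array names `a` and `b` are just illustrative) checks that both approaches give the same result; only the second avoids calling a Python function once per element:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [5, 6, 9]])
b = np.array([7, 4, 5])

# Python-level loop hidden behind np.vectorize: convenient, but slow on large arrays.
slow = np.vectorize(lambda p, q: p if p > q else q)(a, b)

# Built-in ufunc: the loop runs in compiled code.
fast = np.maximum(a, b)

print(fast)
# [[7 4 5]
#  [7 6 9]]
print(np.array_equal(slow, fast))  # True
```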
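To count how many elements of column x are smaller than a number y, you could use an expression such as (array['x'] < y).sum(). For example:

    import numpy as np

    # View six integers as three records with fields 'x' and 'y'.
    # Plain int is used because the np.int alias was removed in NumPy 1.24.
    array = np.arange(6).view([('x', int), ('y', int)])
    print(array)
    # [(0, 1) (2, 3) (4, 5)]

    print(array['x'])
    # [0 2 4]

    print(array['x'] < 3)
    # [ True  True False]

    print((array['x'] < 3).sum())
    # 2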
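The same idea carries over to an ordinary 2-D array without named fields. The following is a sketch under assumed names: the matrix `m` and the thresholds `x` and `y` are made up for illustration. Summing the boolean mask along `axis=0` gives one count per column, and `m[:, 1]` selects a single column as a view rather than a copy (the comparison itself still allocates a temporary boolean array):

```python
import numpy as np

m = np.array([[ 1, -7,  3],
              [ 4,  2, -9],
              [-5,  8,  0]])
x = 2  # threshold for "smaller than x"
y = 4  # threshold for "absolute value larger than y"

# One count per column: True counts as 1 when summed.
smaller_per_column = (m < x).sum(axis=0)
large_abs_per_column = (np.abs(m) > y).sum(axis=0)

# Count for a single column only.
smaller_in_col_1 = (m[:, 1] < x).sum()

print(smaller_per_column)    # [2 1 2]
print(large_abs_per_column)  # [1 2 1]
print(smaller_in_col_1)      # 1
```

Using `axis=1` instead would give the corresponding counts per row.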