Filtering data with a boolean array

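When only a single argument is supplied to numpy's where function, it returns the indices of the input array (the condition) that evaluate to true, which is the same behaviour as numpy.nonzero. The result is a tuple containing one index array per dimension. This can be used to extract the indices of an array that satisfy a given condition.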

import numpy as np

a = np.arange(20).reshape(2,10)
# a = array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
#           [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])

# Generate boolean array indicating which values in a are both greater than 7 and less than 13
condition = np.bitwise_and(a>7, a<13)
# condition = array([[False, False, False, False, False, False, False, False,  True, True],
#                    [True,  True,  True, False, False, False, False, False, False, False]], dtype=bool)

# Get the indices of a where the condition is True
ind = np.where(condition)
# ind = (array([0, 0, 1, 1, 1]), array([8, 9, 0, 1, 2]))

keep = a[ind]
# keep = [ 8  9 10 11 12]
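
The equivalence with numpy.nonzero mentioned above can be checked directly. The following is a minimal sketch that rebuilds the same array; the names idx_where, idx_nonzero and pairs are only illustrative, and np.argwhere is shown as one possible way to view the same positions as (row, column) pairs.

```python
import numpy as np

a = np.arange(20).reshape(2, 10)
condition = (a > 7) & (a < 13)

idx_where = np.where(condition)
idx_nonzero = np.nonzero(condition)

# Both calls return a tuple with one index array per dimension
same = all(np.array_equal(w, n) for w, n in zip(idx_where, idx_nonzero))
print(same)  # True

# np.argwhere returns the same positions as rows of (row, column) pairs
pairs = np.argwhere(condition)
print(pairs.tolist())  # [[0, 8], [0, 9], [1, 0], [1, 1], [1, 2]]
```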

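If you do not need the indices, you can do this in one step with extract. It takes the condition as its first argument and, as its second, the array from which to return the values where the condition is true.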

# np.extract(condition, array)
keep = np.extract(condition, a)
# keep = [ 8  9 10 11 12]
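
For comparison, indexing the array with the boolean array itself should give the same values as extract. This is a small sketch under the same setup as above; by_extract and by_mask are illustrative names.

```python
import numpy as np

a = np.arange(20).reshape(2, 10)
condition = (a > 7) & (a < 13)

by_extract = np.extract(condition, a)
by_mask = a[condition]  # boolean (mask) indexing returns a flat 1-D array

print(by_mask)  # [ 8  9 10 11 12]
print(np.array_equal(by_extract, by_mask))  # True
```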

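Two further arguments, x and y, can be supplied to where. In that case the output contains the values of x where the condition is True and the values of y where the condition is False.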

# np.where(condition, x, y): keep the elements of a that satisfy the condition
# (greater than 7 and less than 13) and set all other elements to zero
a = np.where(condition, a, a*0)
print(a)
# Out: [[ 0  0  0  0  0  0  0  0  8  9]
#       [10 11 12  0  0  0  0  0  0  0]]
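
The x and y arguments do not have to be full arrays; they are broadcast against the condition, so a scalar works as well. The sketch below assumes a fill value of -1 purely for illustration, and masked is an illustrative name.

```python
import numpy as np

a = np.arange(20).reshape(2, 10)
condition = (a > 7) & (a < 13)

# x is the array itself; y is the scalar -1, broadcast to the shape of condition
masked = np.where(condition, a, -1)
print(masked.tolist())
# [[-1, -1, -1, -1, -1, -1, -1, -1, 8, 9], [10, 11, 12, -1, -1, -1, -1, -1, -1, -1]]
```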