Filtering data with a boolean array
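When numpy's where function is given only a single argument, it returns the indices at which the input array (the condition) evaluates to true, which is the same behaviour as numpy.nonzero. This can be used to find the indices of the array elements that satisfy a given condition.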
import numpy as np
a = np.arange(20).reshape(2,10)
# a = array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
# Generate boolean array indicating which values in a are both greater than 7 and less than 13
condition = np.bitwise_and(a>7, a<13)
# condition = array([[False, False, False, False, False, False, False, False, True, True],
# [True, True, True, False, False, False, False, False, False, False]], dtype=bool)
# Get the indices of a where the condition is True
ind = np.where(condition)
# ind = (array([0, 0, 1, 1, 1]), array([8, 9, 0, 1, 2]))
keep = a[ind]
# keep = [ 8 9 10 11 12]
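To check the equivalence with numpy.nonzero described above, the following minimal sketch compares the two calls on the same array. It also shows that indexing directly with the boolean array gives the same values when the indices themselves are not needed. The names ind_where, ind_nonzero, same_indices and direct are illustrative only.

```python
import numpy as np

a = np.arange(20).reshape(2, 10)
condition = (a > 7) & (a < 13)

# Single-argument where and nonzero both return a tuple
# holding one index array per dimension of the input
ind_where = np.where(condition)
ind_nonzero = np.nonzero(condition)

# Compare the index arrays dimension by dimension
same_indices = all(
    np.array_equal(w, n) for w, n in zip(ind_where, ind_nonzero)
)

# Indexing with the boolean array selects the same elements
direct = a[condition]

print(same_indices)
print(direct)
```

The index tuple is the better choice when the positions are needed later, for example to update the same elements in another array of identical shape.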
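If you do not need the indices, you can do this in one step with extract. Pass the condition as the first argument and, as the second argument, the array from which to return the values where the condition is true.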
# np.extract(condition, array)
keep = np.extract(condition, a)
# keep = [ 8 9 10 11 12]
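As a rough illustration of what extract does, the sketch below compares it with two other ways of getting the same result: np.compress applied to the flattened condition and array, and plain boolean indexing. The variable names are illustrative, and the comparison assumes condition is a boolean array with the same shape as a.

```python
import numpy as np

a = np.arange(20).reshape(2, 10)
condition = (a > 7) & (a < 13)

# extract works on the flattened condition and array
via_extract = np.extract(condition, a)

# The same selection written with compress on the raveled inputs
via_compress = np.compress(np.ravel(condition), np.ravel(a))

# Boolean indexing gives the same values for a boolean condition
via_mask = a[condition]

print(via_extract)
print(via_compress)
print(via_mask)
```

In all three cases the result is one-dimensional: the 2-D layout of a is not preserved, only the selected values in row-major order.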
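Two further arguments, x and y, can be passed to where. In that case the output contains the values of x where the condition is True and the values of y where the condition is False.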
# Set to zero every element of a that is NOT in the range 7 < a < 13, using np.where(condition, x, y)
a = np.where(condition, a, a*0)
print(a)
# Out: [[ 0  0  0  0  0  0  0  0  8  9]
#       [10 11 12  0  0  0  0  0  0  0]]
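The x and y arguments do not have to be full arrays. As a small sketch, the example below passes scalars, which are broadcast against the condition, and stores the results under new names (zeroed and flagged, both illustrative) so the original array is left untouched.

```python
import numpy as np

a = np.arange(20).reshape(2, 10)
condition = (a > 7) & (a < 13)

# A scalar y is broadcast, so a*0 can be written simply as 0
zeroed = np.where(condition, a, 0)

# Both x and y may be scalars: 1 where True, -1 where False
flagged = np.where(condition, 1, -1)

print(zeroed)
print(flagged)
```

Because where returns a new array, a itself changes only if the result is assigned back to it.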