NumPy Master Class¶

Chapter2 Arithmetic Operations¶

Notebook4 argmax, argmin values¶

In this notebook we look at items 3 and 4 of the list below, argmax and argmin.

1. ndarray.max()
2. ndarray.min()
3. ndarray.argmax()
4. ndarray.argmin()

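max() and min(), covered in items 1 and 2, are methods that return the maximum and minimum values. argmax and argmin, just as in mathematics, instead return the value of the independent variable at which the maximum or minimum is attained.

In other words, from the ndarray's point of view, they return the index at which the maximum or minimum value is located.

They are used together with max() and min() in many cases, so it is worth studying them side by side.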
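The relationship between the value methods and the index methods can be checked directly. The following is a minimal sketch on a small hypothetical array `x` (not part of the notebook's data): indexing the array with the result of argmax or argmin should give back what max or min returns.

```python
import numpy as np

# Hypothetical example array, chosen so that max and min are unique.
x = np.array([3, 7, 1, 5])

max_idx = x.argmax()   # index at which the maximum occurs
min_idx = x.argmin()   # index at which the minimum occurs

# Indexing with the arg* result recovers the max/min value itself.
assert x[max_idx] == x.max()
assert x[min_idx] == x.min()

print("argmax:", max_idx, "-> value:", x[max_idx])
print("argmin:", min_idx, "-> value:", x[min_idx])
```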

import numpy as np

ndarray.argmax(), ndarray.argmin()¶

test_np = np.random.randint(low = 0, high = 10, size = (10,))
print(test_np)

[4 9 6 9 9 9 2 2 6 0]

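First, create 10 random integers from 0 to 9 as above (the `high` bound of np.random.randint is exclusive, so 10 itself never appears). Extracting the indices that produce the maximum and minimum then gives the following.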

print("test_np.argmax():", test_np.argmax())
print("test_np.argmin():", test_np.argmin())

test_np.argmax(): 1
test_np.argmin(): 9

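As shown above, the methods return the indices at which the maximum and minimum occur, not the maximum and minimum values themselves.

Combined with the methods from the previous notebook, they can be used as follows.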
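In the printed array the maximum 9 appears several times, yet argmax reports a single index. NumPy returns the first occurrence in that case. A minimal sketch, with the array hard-coded to the printed values so the result is reproducible, and with `np.flatnonzero` used as one possible way (an addition here, not from the notebook) to list every position of the maximum:

```python
import numpy as np

# Same values as the printed output above, fixed instead of random.
test_np = np.array([4, 9, 6, 9, 9, 9, 2, 2, 6, 0])

# argmax reports only the first index at which the maximum occurs.
first_max = test_np.argmax()

# Comparing against the maximum and collecting the True positions
# lists every index at which the maximum occurs.
all_max = np.flatnonzero(test_np == test_np.max())

print("first max index:", first_max)
print("all max indices:", all_max)
```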

# Unpack the (index, value) pairs for the maximum and the minimum.
max_loc, max_val = test_np.argmax(), test_np.max()
min_loc, min_val = test_np.argmin(), test_np.min()

print(test_np)
print("Max location:", max_loc, "and Max value:", max_val)
print("min location:", min_loc, "and min value:", min_val)

[4 9 6 9 9 9 2 2 6 0]
Max location: 1 and Max value: 9
min location: 9 and min value: 0

ndarray.argmax(), ndarray.argmin() with 2-dim ndarray¶

Let's bring back the data used in the previous notebook.

import pandas as pd
maths = np.random.randint(low = 30, high = 100, size = (20,))
english = np.random.randint(low = 30, high = 100, size = (20,))
physics = np.random.randint(low = 30, high = 100, size = (20,))

score_table = np.vstack((maths, english, physics)).T
d = {"Math scores": maths, "English scores": english, "Physics scores": physics}
df = pd.DataFrame(data=d)
print(df)

    Math scores  English scores  Physics scores
0            35              61              74
1            50              82              42
2            52              83              99
3            30              48              55
4            68              59              68
5            87              73              90
6            34              77              93
7            79              96              86
8            91              74              33
9            53              73              43
10           64              91              83
11           91              67              31
12           91              54              34
13           56              70              91
14           41              93              74
15           95              68              97
16           76              50              73
17           42              38              74
18           48              78              41
19           43              43              84

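Last time, max() and min() gave us

1. the maximum and minimum of each subject
2. the maximum and minimum of each student

The corresponding argmax() and argmin() results have the following meanings.

1. The location of each subject's maximum or minimum tells which student (by index) has the highest or lowest score in that subject.
2. The location of each student's maximum or minimum tells which of math, English, and physics that student did best or worst in.

Let's confirm this in the following code.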

print(score_table.shape)

print("score_table.argmax(axis = 0):", score_table.argmax(axis = 0))
print("score_table.argmax(axis = 1):", score_table.argmax(axis = 1), '\n')

print("score_table.argmin(axis = 0):", score_table.argmin(axis = 0))
print("score_table.argmin(axis = 1):", score_table.argmin(axis = 1))

(20, 3)
score_table.argmax(axis = 0): [15  7  2]
score_table.argmax(axis = 1): [2 1 2 2 0 2 2 1 0 1 1 0 0 2 1 2 0 2 1 2] 

score_table.argmin(axis = 0): [ 3 17 11]
score_table.argmin(axis = 1): [0 2 0 0 1 1 0 0 2 2 0 2 2 0 0 1 1 1 2 0]

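That is, when axis is 0, the student dimension is collapsed and only per-subject information remains.

Since argmax returns the index holding the maximum in each subject, it shows which student got the top score in each subject.

Conversely, when axis is 1, the subject dimension is collapsed and only per-student information remains.

Here argmax returns, for each student, the index of the subject with the highest score, where

0: math, 1: English, 2: physics

so we can find out the subject in which each student scored highest.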
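The 0/1/2 mapping can be applied in code by indexing an array of subject names with the argmax result. This is a minimal sketch, assuming only the first four rows of the table printed above; `subject_names` is a hypothetical helper array that is not in the notebook.

```python
import numpy as np

# First four rows of the printed table (columns: Math, English, Physics).
score_table = np.array([[35, 61, 74],
                        [50, 82, 42],
                        [52, 83, 99],
                        [30, 48, 55]])

# Hypothetical lookup array; its order must match the column order.
subject_names = np.array(["Math", "English", "Physics"])

best_idx = score_table.argmax(axis=1)    # one column index per student
worst_idx = score_table.argmin(axis=1)

# Integer-array indexing turns the column indices into subject names.
best_names = subject_names[best_idx]
worst_names = subject_names[worst_idx]

print(best_names)
print(worst_names)
```

Keeping the names in a NumPy array rather than a Python list is what allows the whole index array to be used at once.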

ndarray.argmax() and ndarray.max()¶

Now let's see how the two methods can be used together.

First, to pick the top student in each subject, the student dimension has to be collapsed, so we set axis=0.

best_students = score_table.argmax(axis = 0)
print(best_students)

[15  7  2]

According to this result, the students with the best math, English, and physics scores are those at index 15, 7, and 2, respectively.

To check those scores,

best_scores = score_table.max(axis = 0)
print(best_scores)

[95 96 99]

we can extract the top scores as above.

As another problem, suppose we want to know in which subject each student got their highest score, and what that score is.
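Because argmax already records where each maximum sits, the same scores can also be read back from those indices instead of calling max a second time. A minimal sketch, assuming a four-row subset of the printed table (rows 0, 2, 7, and 15), so the row indices differ from the full table:

```python
import numpy as np

# Rows 0, 2, 7, 15 of the printed table (columns: Math, English, Physics).
score_table = np.array([[35, 61, 74],
                        [52, 83, 99],
                        [79, 96, 86],
                        [95, 68, 97]])

best_students = score_table.argmax(axis=0)       # one row index per subject
columns = np.arange(score_table.shape[1])        # 0, 1, 2

# Pair each subject's winning row with that subject's column.
scores_from_index = score_table[best_students, columns]

print(best_students)
print(scores_from_index)
```

The result should match `score_table.max(axis=0)`, which is a quick consistency check between the two methods.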

best_subjects = score_table.argmax(axis = 1)
best_scores = score_table.max(axis = 1)
print(best_subjects)
print(best_scores)

[2 1 2 2 0 2 2 1 0 1 1 0 0 2 1 2 0 2 1 2]
[74 82 99 55 68 90 93 96 91 73 91 91 91 91 93 97 76 74 78 84]

As above, we obtain the index of the subject in which each student scored highest, together with that score.
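The per-student scores can likewise be gathered from the argmax result alone. The sketch below uses `np.take_along_axis`, which is one way to do this and not something the notebook itself uses; the data is rows 0, 1, and 4 of the printed table.

```python
import numpy as np

# Rows 0, 1, 4 of the printed table (columns: Math, English, Physics).
score_table = np.array([[35, 61, 74],
                        [50, 82, 42],
                        [68, 59, 68]])

best_subjects = score_table.argmax(axis=1)       # one column index per student

# take_along_axis needs an index array with the same number of
# dimensions as score_table, hence the added trailing axis.
picked = np.take_along_axis(score_table, best_subjects[:, None], axis=1)
best_scores = picked.squeeze(axis=1)             # back to shape (3,)

print(best_subjects)
print(best_scores)
```

The last row has a tie between math and physics (68 each), and argmax reports the first of the two, index 0.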
