Deep Learning 6

[Deep Learning] Structure of a Neural Network

Neural network
Input layer - the layer that receives the training data (input data)
Output layer = outputs the learning result
Middle layer (hidden layer) = the layer that extracts features from the data
Each layer holds nodes drawn as "○", and nodes are connected by edges (links) drawn as "-". Each edge carries a value called a weight.
Forward propagation = computation proceeds from the input layer toward the right
Backpropagation = computation proceeds from the output layer toward the left
Structure of forward propagation
Multiply each node value in the previous layer by the weight of its edge, then add up all the results. Pass the sum through an activation function; the result is the value of that node, which is passed on to the next node.
Activation function by learning type: classification - softmax function; regression (demand forecasting..) - identity function
Structure of backpropagation
The result computed in forward propagation and the ground-truth data..
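The forward-propagation steps described above can be sketched in plain Python. This is a minimal illustration, not the post's own code: the input values, the weights, and the helper names `layer_sums`, `softmax`, and `identity` are all made up for the example, and bias terms are left out because the excerpt does not mention them.

```python
import math

def softmax(values):
    """Turn raw scores into probabilities that sum to 1 (used for classification)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def identity(values):
    """Return the raw scores unchanged (used for regression)."""
    return list(values)

def layer_sums(inputs, weights):
    """For each node, sum (previous node value * edge weight).

    weights[j][i] is the weight of the edge from input node i to node j.
    """
    return [sum(x * w for x, w in zip(inputs, row)) for row in weights]

# Hypothetical values, chosen only for illustration.
inputs = [1.0, 2.0]
weights = [[0.5, -1.0],
           [0.25, 0.75]]

sums = layer_sums(inputs, weights)   # weighted sums arriving at the two output nodes
class_probs = softmax(sums)          # output for a classification task
regression_out = identity(sums)      # output for a regression task

print(sums)
print(class_probs)
print(regression_out)
```

With softmax, the node with the larger weighted sum receives the larger probability; with the identity function the weighted sums are passed through as they are.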

Natural Language Processing (NLP) with Deep Learning

[[ 0  0  1  2]
 [ 0  0  0  3]
 [ 4  5  6  7]
 [ 0  8  9 10]
 [ 0 11 12 13]
 [ 0  0  0 14]
 [ 0  0  0 15]
 [ 0  0 16 17]
 [ 0  0 18 19]
 [ 0  0  0 20]]
Natural language = the speech or text we use in everyday life
Natural Language Processing (NLP): having a computer recognize and process natural language
Text preprocessing
Tokenization: splitting the input text into small pieces
text_to_word_sequence() function in the Keras text module: splits a sentence into words
from tensorflow.keras.preprocessing.text import text_to_word_sequence
text ..
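To make the tokenization step concrete, here is a rough pure-Python sketch. It does not call Keras: `simple_word_sequence` is a hypothetical stand-in that only approximates `text_to_word_sequence` (lowercase, drop punctuation, split on whitespace). The array at the start of the excerpt looks like integer-encoded sentences padded on the left with zeros to length 4; that reading is an assumption, and `pad_left` plus the first-appearance word index below are simplifications written only for this example.

```python
import string

def simple_word_sequence(text):
    """Approximate word splitting: lowercase, replace punctuation with spaces, split."""
    table = str.maketrans({ch: " " for ch in string.punctuation})
    return text.lower().translate(table).split()

def pad_left(sequences, length, pad_value=0):
    """Left-pad (or left-truncate) each sequence so all have the same length."""
    padded = []
    for seq in sequences:
        seq = seq[-length:]  # keep only the last `length` items
        padded.append([pad_value] * (length - len(seq)) + seq)
    return padded

# Hypothetical sentences, chosen only for illustration.
sentences = ["Deep learning is fun", "Tokenize this!", "Hello"]
tokens = [simple_word_sequence(s) for s in sentences]

# Assign each word an integer index in order of first appearance (starting at 1).
word_index = {}
for words in tokens:
    for word in words:
        word_index.setdefault(word, len(word_index) + 1)

encoded = [[word_index[w] for w in words] for words in tokens]
padded = pad_left(encoded, 4)

print(tokens)
print(padded)
```

Index 0 is reserved for padding here, which is why word indices start at 1.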

[Deep Learning] Image Recognition, Convolutional Neural Network (CNN)

MNIST dataset
- Built by the US National Institute of Standards and Technology (NIST) from handwriting by high school students, Census Bureau employees, and others
- A dataset of 70,000 handwritten character images, each labeled with a digit from 0 to 9
What percentage of the handwritten images can be identified correctly?
MNIST data can be loaded through Keras.
mnist.load_data() function: loads the data to use
X: the loaded image data
Y_class: the 0-9 label attached to each image
• Used for training: X_train, Y_class_train
• Used for testing: X_test, Y_class_test
from keras.datasets import mnist
(X_train, Y_class_train), (X_test, Y_c..

[Deep Learning] Applying Linear Regression

Checking the data
import pandas as pd
df = pd.read_csv("../dataset/housing.csv", delim_whitespace=True, header=None)
print(df.info())
RangeIndex: 506 entries, 0 to 505
Data columns (total 14 columns):
0 506 non-null float64
1 506 non-null float64
… … … …
13 506 non-null float64
dtypes: float64(12), int64(2)
memory usage: 55.4 KB
Index of 506 entries = 506 samples in total
14 columns = 13 attributes and 1 class
0 1 2 3 … 12 13
0 0.00632 18..

[Deep Learning] Predicting the Type of Wine

Load the data into a variable named df_pre.
sample() function: specifies what fraction of the original data to use, and draws that fraction of rows at random from the original data
frac = 1: use 100% of the original data (frac = 0.5 draws a random 50%)
df_pre = pd.read_csv('../dataset/wine.csv', header=None)
df = df_pre.sample(frac=1)
print(df.info())
Data columns (total 13 columns):
0 6497 non-null float64
1 6497 non-null float64
2 6497 non-null float64
3 6497 non-null float64
4 6497 non-nul..
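The effect of `frac` is easy to check on a small in-memory table. The sketch below does not use wine.csv; the 10-row DataFrame is made up for illustration, and `random_state=0` is added only so that repeated runs draw the same rows.

```python
import pandas as pd

# Hypothetical 10-row table standing in for the real dataset.
df_pre = pd.DataFrame({0: range(10), 1: [i % 2 for i in range(10)]})

# frac=1 keeps every row but returns them in random order (a shuffle).
df = df_pre.sample(frac=1, random_state=0)

# frac=0.5 draws half of the rows at random.
half = df_pre.sample(frac=0.5, random_state=0)

print(len(df), len(half))
print(df.index.tolist())
```

Because `sample(frac=1)` keeps the original index labels, the shuffled order is visible in `df.index`; calling `reset_index(drop=True)` afterwards would renumber the rows if that is preferred.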

[Deep Learning] Sonar Mineral Data: Avoiding Overfitting

import pandas as pd
df = pd.read_csv('../dataset/sonar.csv', header=None)
print(df.info())
RangeIndex: 208 entries, 0 to 207
Data columns (total 61 columns):
0 208 non-null float64
1 208 non-null float64
… … … …
59 208 non-null float64
60 208 non-null object
dtypes: float64(60), object(1)
memory usage: 99.2+ KB
The index has 208 entries, so there are 208 samples in total, and there are 61 columns, so the data consists of 60 attributes and 1 class. All columns are real-valued (flo..