๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
Study/AI & ML

ํšŒ๊ท€ ์•Œ๊ณ ๋ฆฌ์ฆ˜๊ณผ ๋ชจ๋ธ ๊ทœ์ œ(3) - ํŠน์„ฑ ๊ณตํ•™๊ณผ ๊ทœ์ œ

by sumping 2024. 3. 15.

๐Ÿ‘€2์ฃผ์ฐจ 220117 ~ 220123 ๊ณต๋ถ€๊ธฐ๋ก

 

๐Ÿ“ ๋ณธ ํฌ์ŠคํŒ…์€ <ํ˜ผ์ž ๊ณต๋ถ€ํ•˜๋Š” ๋จธ์‹ ๋Ÿฌ๋‹+๋”ฅ๋Ÿฌ๋‹> ์ฑ…์„ ๋ฐ”ํƒ•์œผ๋กœ ์ž‘์„ฑํ•จ์„ ์•Œ๋ฆฝ๋‹ˆ๋‹ค.


โœ…Ch.03-3 ํŠน์„ฑ ๊ณตํ•™๊ณผ ๊ทœ์ œ

๐Ÿ”ฅํŠน์„ฑ๊ณตํ•™(feature engineering)

 

ํŠน์„ฑ ๊ณตํ•™(Feature engineering)์€ ๋จธ์‹ ๋Ÿฌ๋‹์˜ pre-processing ๋‹จ๊ณ„๋กœ, ๊ธฐ์กด์˜ ๋ฐ์ดํ„ฐ๋กœ๋ถ€ํ„ฐ ์ƒˆ๋กœ์šด ํŠน์„ฑ์„ ์ถ”์ถœํ•˜๋Š” ์ž‘์—…์„ ์˜๋ฏธํ•œ๋‹ค. Feature engineering์€ ๋” ์ข‹์€ ๋ฐฉ๋ฒ•์œผ๋กœ ์˜ˆ์ธก ๋ชจ๋ธ์—์„œ ๊ทผ๋ณธ์ ์ธ ๋ฌธ์ œ๋ฅผ ๋‚˜ํƒ€๋‚ด๋„๋ก ๋•๋Š”๋‹ค. ๊ฒฐ๊ณผ์ ์œผ๋กœ ๋ณด์ด์ง€ ์•Š๋Š” ๋ฐ์ดํ„ฐ๋กœ๋ถ€ํ„ฐ ๋ชจ๋ธ์˜ ์ •ํ™•์„ฑ์„ ํ–ฅ์ƒ์‹œํ‚จ๋‹ค. ๋จธ์‹ ๋Ÿฌ๋‹์—์„œ Feature engineering์€ ์ฃผ๋กœ 4๊ฐ€์ง€์˜ ๊ณผ์ •(Feature Creation, Transformation, Feature Extraction, and Feature Selection)์„ ๊ฑฐ์นœ๋‹ค. ๋ณธ ํฌ์ŠคํŒ…์€ ์‚ฌ์ดํ‚ท๋Ÿฐ์„ ์ด์šฉํ•˜์—ฌ Transformation์„ ๋ณด์—ฌ์ฃผ๊ธฐ ๋•Œ๋ฌธ์—, ๋” ๋งŽ์€ ์„ค๋ช…์„ ์›ํ•œ๋‹ค๋ฉด, ๋ณธ ํฌ์ŠคํŒ… ํ•˜๋‹จ์˜ ์ฐธ๊ณ ์ž๋ฃŒ๋ฅผ ์ฐธ๊ณ ํ•˜๋ฉด ๋œ๋‹ค.

Feature engineering์˜ ๋ชฉํ‘œ

1. ๋จธ์‹ ๋Ÿฌ๋‹ ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๊ฑธ๋งž๋Š” ์ ๋‹นํ•œ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์…‹์„ ์ค€๋น„

2. ๋จธ์‹  ๋Ÿฌ๋‹ ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒ์‹œํ‚ค๋Š” ๊ฒƒ์ด๋‹ค.

Feature engineering์„ ์‚ฌ์šฉํ•˜๋Š” ์ด์œ 

๊ณผ์ ํ•ฉ๊ณผ ํŽธํ–ฅ์„ ๋ฐฉ์ง€ํ•˜๊ธฐ ์œ„ํ•ด์„œ์ด๋‹ค. ๊ณผ์ ํ•ฉ์„ ๋ฐฉ์ง€ํ•˜๋Š” ์ด์œ ๋Š” ๋ถˆํ•„์š”ํ•œ ์š”์†Œ๋“ค์ด ๋ถ„์„์— ์‚ฌ์šฉ๋  ๊ฒฝ์šฐ ๊ณผ์ ํ•ฉ์ด ๋ฐœ์ƒํ•˜์—ฌ ํ…Œ์ŠคํŠธ ์„ธํŠธ๊ฐ€ ์ œ๋Œ€๋กœ ๋™์ž‘ํ•˜์ง€ ์•Š๊ฑฐ๋‚˜, ๋ชจ๋ธ์ด ๋‹จ์ˆœํ•ด์งˆ ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค. ํŽธํ–ฅ ๋ฐฉ์ง€๋ฅผ ํ•˜๋Š” ์ด์œ ๋Š” ๋ถ€์ •ํ™•ํ•œ ์ •๋ณด๋“ค์ด ๋ถ„์„์— ์ ์šฉ๋  ๊ฒฝ์šฐ ํŽธํ–ฅ์ด ๋ฐœ์ƒํ•˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค. 

+) ๋ฐ์ดํ„ฐ๊ฐ€ ๋ฐฉ๋Œ€ํ•˜๋‹ค๊ณ  ํ•ด๋„ ๊ทธ ๋ฐ์ดํ„ฐ๋ฅผ ๋ชจ๋‘ ๊ฒฐ๊ณผ๋ฅผ ๋„์ถœํ•˜๋Š”๋ฐ ์“ฐ๋ฉด ์ •ํ™•ํžˆ ๋‚˜ํƒ€๋‚  ๋“ฏํ•˜์ง€๋งŒ ์˜คํžˆ๋ ค ๊ฒฐ๊ณผ๋ฅผ ์ž˜๋ชป๋˜๊ฒŒ ๋„์ถœํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ๋‹ค. ์ด๋Š” ํ†ต๊ณ„๋ถ„์„์—์„œ ์„ ํ˜• ํ•จ์ˆ˜์˜ ๋…๋ฆฝ๋ณ€์ˆ˜๊ฐ€ ๋งŽ๋‹ค๊ณ  ํ•ด์„œ ์ข…์†๋ณ€์ˆ˜์˜ ๊ธฐ๋Œ€๊ฐ’์˜ ์ •ํ™•๋„๊ฐ€ ๋ฌด์กฐ๊ฑด ์˜ฌ๋ผ๊ฐ€์ง€ ์•Š๋Š” ์ด์œ ๋ผ๊ณ ๋„ ํ•  ์ˆ˜ ์žˆ๋‹ค. ์ฆ‰, ๋จธ์‹  ๋Ÿฌ๋‹์˜ ์„ฑ๋Šฅ์€ ์–ด๋–ค ๋ฐ์ดํ„ฐ๋ฅผ ์ž…๋ ฅํ•˜๋Š”์ง€๊ฐ€ ๊ต‰์žฅํžˆ ์˜์กด์ ์ด๋‹ค. ๊ฐ€์žฅ ์ด์ƒ์ ์ธ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋Š” ๋ถ€์กฑํ•˜์ง€๋„ ๊ณผํ•˜์ง€๋„ ์•Š์€ ์ •ํ™•ํ•œ ์ •๋ณด๋งŒ์„ ํฌํ•จ๋  ๋•Œ์ด๋‹ค. ๊ทธ๋ž˜์„œ Feature engineering๋‹จ๊ณ„๋ฅผ ๊ฑฐ์ณ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ์ถ”์ถœํ•ด์•ผํ•œ๋‹ค.

Feature engineering์˜ ์ค‘์š”์„ฑ 

๋ฐ์ดํ„ฐ์˜ ํŠน์„ฑ์€ ์‚ฌ์šฉํ•˜๋Š” ์˜ˆ์ธก ๋ชจ๋ธ๊ณผ ๋ชฉํ‘œํ•˜๋Š” ๊ฒฐ๊ณผ์— ์ง์ ‘์ ์œผ๋กœ ์˜ํ–ฅ์„ ๋ฏธ์นœ๋‹ค. ๊ทธ๋ ‡๊ธฐ ๋•Œ๋ฌธ์— ๋ฐ์ดํ„ฐ์—์„œ ๋” ๋‚˜์€ ํŠน์„ฑ์„ ์ถ”์ถœํ•  ํ•„์š”๊ฐ€ ์žˆ๋‹ค.

1. Better features means flexibility. The flexibility of good features will allow you to use less complex models that are faster to run, easier to understand and easier to maintain.

2. Better features means simpler models. With well engineered features, you can choose “the wrong parameters” (less than optimal) and still get good results, for much the same reasons. You do not need to work as hard to pick the right models and the most optimized parameters. With good features, you are closer to the underlying problem and a representation of all the data you have available and could use to best characterize that underlying problem.

3. Better features means better results. As already discussed, in machine learning the output depends on the data we provide. So, to obtain better results, we need to use better features.


โ“์–ด๋–ป๊ฒŒ ๋ณ€ํ™˜์„ ํ•˜๋‚˜์š”?

์‚ฌ์ดํ‚ท๋Ÿฐ์„ ์ด์šฉํ•˜์—ฌ ์‚ฌ์ดํ‚ท๋Ÿฐ์˜ ํด๋ž˜์Šค์ธ ๋ณ€ํ™˜๊ธฐ๋ฅผ ๋ถ€๋ฅธ๋‹ค. ๋ณธ ํฌ์ŠคํŒ…์—์„œ ์‚ฌ์šฉํ•  ๋ณ€ํ™˜๊ธฐ๋Š” PolynominalFeatures ํด๋ž˜์Šค์ด๋‹ค. (๋ณ€ํ™˜๊ธฐ๋Š” ํƒ€๊นƒ ๋ฐ์ดํ„ฐ ์—†์ด ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ๋ณ€ํ™˜ํ•œ๋‹ค.)

์•„๋ž˜ ์ฝ”๋“œ์˜ ๊ฒฐ๊ณผ๋ฅผ ๋ณด์ž. 2๊ฐœ์˜ ํŠน์„ฑ([2, 3])์„ ํ›ˆ๋ จ์‹œํ‚ค๊ณ  ๋ณ€ํ™˜์‹œํ‚ค๋ฉด, [1. 2. 3. 4. 6. 9]๋ผ๋Š” ๊ฒฐ๊ณผ๋ฅผ ๋ณผ ์ˆ˜ ์žˆ๋‹ค. ์ด ๊ฒฐ๊ณผ๊ฐ€ ์ฆ‰ 2๊ฐœ์˜ ํŠน์„ฑ์„ ๋ณ€ํ™˜ํ•œ ๊ฐ’์ด๋‹ค. ๊ฒฐ๊ณผ๋Š” 2์™€ 3์„ ์ด์šฉํ•˜์—ฌ ๊ฐ ํŠน์„ฑ์„ ์ œ๊ณฑํ•œ ํ•ญ๊ณผ ํŠน์„ฑ๋ผ๋ฆฌ ์„œ๋กœ ๊ณฑํ•œ ํ•ญ์„ ์ถ”๊ฐ€ํ•˜์˜€๋‹ค. 1์€ ์„ ํ˜• ๋ฐฉ์ ์‹์˜ ์ ˆํŽธ์— ํ•ญ์ƒ ๊ณฑํ•ด์ง€๋Š” ๊ณ„์ˆ˜๋ผ๊ณ  ๋ณผ ์ˆ˜ ์žˆ๋‹ค. 

from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures()
poly.fit([[2,3]])
print(poly.transform([[2,3]]))

#[[1. 2. 3. 4. 6. 9.]]

 

(์‚ฌ์ดํ‚ท๋Ÿฐ ๋ชจ๋ธ์€ ์ž๋™์œผ๋กœ ํŠน์„ฑ์— ์ถ”๊ฐ€๋œ ์ ˆํŽธ ํ•ญ์„ ๋ฌด์‹œํ•˜์ง€๋งŒ, ํ—ท๊ฐˆ๋ฆฐ๋‹ค๋ฉด ๋ช…์‹œ์ ์œผ๋กœ 1์„ ์—†์•จ ์ˆ˜ ์žˆ๋‹ค. 1์„ ์—†์• ๋Š” ๋ฐฉ๋ฒ•์€ include_bias=False๋ฅผ ์ง€์ •ํ•˜๋ฉด ๋œ๋‹ค. ๋งŒ์•ฝ include_bias=True๋ผ๋ฉด ๊ฑฐ๋“ญ์ œ๊ณฑ ํ•ญ์€ ์ œ์™ธ๋˜๊ณ  ํŠน์„ฑ ๊ฐ„์˜ ๊ณฑ์…ˆ ํ•ญ๋งŒ ์ถ”๊ฐ€๋œ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ False์ด๋‹ค.)

#Specifying include_bias=False
poly = PolynomialFeatures(include_bias=False)
poly.fit([[2, 3]])
print(poly.transform([[2, 3]]))

#[[2. 3. 4. 6. 9.]]
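
As a brief hedged aside, the interaction_only option mentioned above can be sketched the same way: with interaction_only=True the squared terms are dropped and only the cross term between the two features remains.

#interaction_only=True keeps only products of distinct features
poly = PolynomialFeatures(include_bias=False, interaction_only=True)
poly.fit([[2, 3]])
print(poly.transform([[2, 3]]))

#[[2. 3. 6.]]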

์•ž์„œ ์„ค๋ช…ํ•œ ๋‚ด์šฉ์ด์ง€๋งŒ, ๋ณ€ํ™˜ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ •๋ฆฌํ•˜์ž๋ฉด 

1๏ธโƒฃ PolynominalFeatures ํด๋ž˜์Šค๋ฅผ ์ž„ํฌํŠธํ•œ๋‹ค.

2๏ธโƒฃ ๋ณ€ํ™˜๊ธฐ ํด๋ž˜์Šค๊ฐ€ ์ œ๊ณตํ•˜๋Š” fit(), transform() ๋ฉ”์„œ๋“œ๋ฅผ ์ฐจ๋ก€๋Œ€๋กœ ํ˜ธ์ถœํ•œ๋‹ค. (๋ณ€ํ™˜)

3๏ธโƒฃ ์ž˜ ๋ณ€ํ™˜๋˜์—ˆ๋Š”์ง€ ํ›ˆ๋ จ ์„ธํŠธ์™€ ํ…Œ์ŠคํŠธ ์„ธํŠธ๋กœ ํ™•์ธํ•œ๋‹ค. shape() ๋ฉ”์„œ๋“œ๋ฅผ ์ด์šฉํ•˜์—ฌ ๋ฐฐ์—ด์˜ ํฌ๊ธฐ๋กœ ํ™•์ธํ•˜๋ฉด ๋œ๋‹ค. 

(42, 9)๋ฅผ ๋ณด๋ฉด 9๊ฐœ์˜ ํŠน์„ฑ์„ ๋งŒ๋“ค์—ˆ๋‹ค.(3๊ฐœ์˜ ํŠน์„ฑ์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค. train_input์•ˆ์— 3๊ฐœ์˜ ํŠน์„ฑ์„ ๋„ฃ์—ˆ์Œ) 9๊ฐœ์˜ ํŠน์„ฑ์ด ๊ฐ๊ฐ ์–ด๋–ค ์ž…๋ ฅ์˜ ์กฐํ•ฉ์œผ๋กœ ๋งŒ๋“ค์–ด์กŒ๋Š”์ง€ ์•Œ๊ณ  ์‹ถ๋‹ค๋ฉด get_feature_names()๋ฉ”์„œ๋“œ๋ฅผ ์ด์šฉํ•˜๋ฉด ๋œ๋‹ค.

from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(include_bias=False)
poly.fit(train_input)
train_poly = poly.transform(train_input)
print(train_poly.shape) #training data
#(42, 9)

#test data
test_poly = poly.transform(test_input)
#(14, 9)

#Which combination of inputs makes up each of the 9 features
poly.get_feature_names()
#['x0', 'x1', 'x2', 'x0^2', 'x0 x1', 'x0 x2', 'x1^2', 'x1 x2', 'x2^2']

๐Ÿ“Transformations: The transformation step of feature engineering involves adjusting the predictor variable to improve the accuracy and performance of the model. For example, it ensures that the model is flexible to take input of the variety of data; it ensures that all the variables are on the same scale, making the model easier to understand. It improves the model's accuracy and ensures that all the features are within the acceptable range to avoid any computational error.


๐Ÿ”ฅ๊ทœ์ œ(regularization)

๊ทœ์ œ(regularization)๋Š” ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์ด ํ›ˆ๋ จ ์„ธํŠธ๋ฅผ ๋„ˆ๋ฌด ๊ณผ๋„ํ•˜๊ฒŒ ํ•™์Šตํ•˜์ง€ ๋ชปํ•˜๋„๋ก ํ›ผ๋ฐฉํ•˜๋Š” ๊ฒƒ์„ ๋งํ•œ๋‹ค. ์ฆ‰ ๋ชจ๋ธ์ด ํ›ˆ๋ จ์„ธํŠธ์— ๊ณผ๋Œ€์ ํ•ฉ๋˜์ง€ ์•Š๋„๋ก ๋งŒ๋“œ๋Š” ๊ฒƒ์ด๋‹ค. ์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ์˜ ๊ฒฝ์šฐ ํŠน์„ฑ์— ๊ณฑํ•ด์ง€๋Š” ๊ณ„์ˆ˜(๋˜๋Š” ๊ธฐ์šธ๊ธฐ)์˜ ํฌ๊ธฐ๋ฅผ ์ž‘๊ฒŒ ๋งŒ๋“œ๋Š” ์ผ์ด๋‹ค. ํ•˜๋‹จ์˜ ๊ทธ๋ฆผ์—์„œ ์ขŒ์ธก์€ ํ›ˆ๋ จ์„ธํŠธ๋ฅผ ๊ณผ๋„ํ•˜๊ฒŒ ํ•™์Šตํ•œ ๊ทธ๋ž˜ํ”„์ด๊ณ  ์šฐ์ธก์€ ๊ทœ์ œ๋ฅผ ํ†ตํ•ด ๋ณดํŽธ์ ์ธ ํŒจํ„ด์œผ๋กœ ํ•™์Šตํ•œ ๊ฒƒ์ด๋‹ค.

โ—ํŠน์„ฑ์˜ ์Šค์ผ€์ผ ์ •๊ทœํ™”

ํŠน์„ฑ์˜ ์Šค์ผ€์ผ์ด ์ •๊ทœํ™”๋˜์ง€ ์•Š๋Š”๋‹ค๋ฉด ๊ณฑํ•ด์ง€๋Š” ๊ณ„์ˆ˜ ๊ฐ’๋„ ์ฐจ์ด๊ฐ€ ๋‚˜๊ฒŒ ๋˜๊ธฐ ๋•Œ๋ฌธ์— ๊ทœ์ œ๋ฅผ ํ•˜๊ธฐ ์ „์— ์ •๊ทœํ™”๋ฅผ ํ•ด์ฃผ์–ด์•ผ ํ•œ๋‹ค. ์ •๊ทœํ™”๋ฅผ ํ•ด์ฃผ๋Š” ๋ฐฉ๋ฒ• ์ค‘ ํ•˜๋‚˜๋Š” ์‚ฌ์ดํ‚ท๋Ÿฐ์˜ ๋ณ€ํ™˜๊ธฐ ํด๋ž˜์Šค๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ด๋‹ค.

 

โ“์ •๊ทœํ™”๋ฅผ ํ•˜๋Š” ๋ฐฉ๋ฒ•

1๏ธโƒฃ์‚ฌ์ดํ‚ท๋Ÿฐ์—์„œ ์ œ๊ณตํ•˜๋Š” StandardScaler ํด๋ž˜์Šค๋ฅผ ์ž„ํฌํŠธ ํ›„ ํด๋ž˜์Šค์˜ ๊ฐ์ฒด ss๋ฅผ ๋งŒ๋“ ๋‹ค.

2๏ธโƒฃPolynominalFeatures ํด๋ž˜์Šค๋กœ ๋งŒ๋“ (๋ณ€ํ™˜์‹œํ‚จ) train_poly๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ์ฒด๋ฅผ ํ›ˆ๋ จ์‹œํ‚จ๋‹ค.

3๏ธโƒฃํ›ˆ๋ จ ์„ธํŠธ์— ์ ์šฉํ–ˆ๋˜ ๋ณ€ํ™˜๊ธฐ๋กœ ํ…Œ์ŠคํŠธ ์„ธํŠธ์—๋„ ์ ์šฉํ•˜์—ฌ ํ‘œ์ค€์ ์ˆ˜๋กœ ๋ณ€ํ™˜ํ•ด ์ค€๋‹ค.

*) ํ‘œ์ค€์ ์ˆ˜(z์ ์ˆ˜)๋Š” ๊ฐ€์žฅ ๋„๋ฆฌ ์‚ฌ์šฉํ•˜๋Š” ์ „์ฒ˜๋ฆฌ ๋ฐฉ๋ฒ• ์ค‘ ํ•˜๋‚˜์ด๋‹ค.

from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
ss.fit(train_poly)
#Training set and test set converted to standard scores
train_scaled = ss.transform(train_poly)
test_scaled = ss.transform(test_poly)

์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ์— ๊ทœ์ œ๋ฅผ ์ถ”๊ฐ€ํ•œ ๋ชจ๋ธ์„ ๋ฆฟ์ง€(ridge)์™€ ๋ž์˜(lasso)๋ผ๊ณ  ๋ถ€๋ฅธ๋‹ค. ๋‘ ๋ชจ๋ธ์€ ๊ทœ์ œ๋ฅผ ๊ฐ€ํ•˜๋Š” ๋ฐฉ๋ฒ•์ด ๋‹ค๋ฅด๋‹ค. ๋ฆฟ์ง€๋Š” ๊ณ„์ˆ˜๋ฅผ ์ œ๊ณฑํ•œ ๊ฐ’์„ ๊ธฐ์ค€์œผ๋กœ ๊ทœ์ œ๋ฅผ ์ ์šฉํ•˜๊ณ , ๋ผ์˜๋Š” ๊ณ„์ˆ˜์˜ ์ ˆ๋Œ“๊ฐ’์„ ๊ธฐ์ค€์œผ๋กœ ๊ทœ์ œ๋ฅผ ์ ์šฉํ•œ๋‹ค.  ๋‘ ์•Œ๊ณ ๋ฆฌ์ฆ˜ ๋ชจ๋‘ ๊ณ„์ˆ˜์˜ ํฌ๊ธฐ๋ฅผ ์ค„์ด์ง€๋งŒ ๋ผ์˜๋Š” ์•„์˜ˆ ๊ณ„์ˆ˜๋ฅผ 0์œผ๋กœ ๋งŒ๋“ค ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ๋ฆฟ์ง€๋ฅผ ์ผ๋ฐ˜์ ์œผ๋กœ ์กฐ๊ธˆ ๋” ์„ ํ˜ธํ•˜๋Š” ํŽธ์ด๋‹ค.

๐Ÿ”ฅ๋ฆฟ์ง€ ํšŒ๊ท€ (L2 Regression, Ridge regression)

๋ฆฟ์ง€ ํšŒ๊ท€๋Š” ๊ทœ์ œ๊ฐ€ ์žˆ๋Š” ์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ ์ค‘ ํ•˜๋‚˜์ด๋ฉฐ, ์„ ํ˜• ๋ชจ๋ธ์˜ ๊ณ„์ˆ˜๋ฅผ ์ž‘๊ฒŒ ๋งŒ๋“ค์–ด ๊ณผ๋Œ€์ ํ•ฉ์„ ์™„ํ™”์‹œํ‚จ๋‹ค. ๋ฆฟ์ง€๋Š” ๋น„๊ต์  ํšจ๊ณผ๊ฐ€ ์ข‹์•„ ๋„๋ฆฌ ์‚ฌ์šฉํ•˜๋Š” ๊ทœ์ œ ๋ฐฉ๋ฒ•์ด๋‹ค.

* ๊ทœ์ œ์˜ ๊ฐ•๋„๋ฅผ ์ž„์˜๋กœ ์กฐ์ ˆํ•˜๊ธฐ ์œ„ํ•ด alpha ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค. alpha ๊ฐ’์ด ํด์ˆ˜๋ก ๊ทœ์ œ๊ฐ€ ์„ธ์ง€๊ณ  ๊ธฐ๋ณธ๊ฐ’์€ 1์ด๋‹ค. ์ ์ ˆํ•œ alpha๊ฐ’์„ ์ฐพ๋Š” ํ•œ ๊ฐ€์ง€ ๋ฐฉ๋ฒ•์€ alpha๊ฐ’์— ๋Œ€ํ•œ R²๊ฐ’์˜ ๊ทธ๋ž˜ํ”„๋ฅผ ๊ทธ๋ฆฌ๋Š” ๊ฒƒ์ด๋‹ค. ํ›ˆ๋ จ์„ธํŠธ์™€ ํ…Œ์ŠคํŠธ ์„ธํŠธ์˜ ์ ์ˆ˜๊ฐ€ ๊ฐ€์žฅ ๊ฐ€๊นŒ์šด ์ง€์ ์ด ์ตœ์ ์˜ alpha๊ฐ’์ด ๋œ๋‹ค.

(* ํ•˜์ดํผ ํŒŒ๋ผ๋ฏธํ„ฐ : ๋จธ์‹ ๋Ÿฌ๋‹ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด ํ•™์Šตํ•˜์ง€ ์•Š๋Š” ํŒŒ๋ผ๋ฏธํ„ฐ์ด๋‹ค. ์ด๋Ÿฐ ํŒŒ๋ผ๋ฏธํ„ฐ๋Š” ์‚ฌ๋žŒ์ด ์‚ฌ์ „์— ์ง€์ •ํ•ด์•ผ ํ•œ๋‹ค. ๋Œ€ํ‘œ์ ์œผ๋กœ ๋ฆฟ์ง€์™€ ๋ผ์˜์˜ ๊ทœ์ œ ๊ฐ•๋„ alpha ํŒŒ๋ผ๋ฏธํ„ฐ์ด๋‹ค. ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์ด ํŠน์„ฑํ•ด์„œ ํ•™์Šตํ•œ ๋ชจ๋ธ ํŒŒ๋ผ๋ฏธํ„ฐ์™€๋Š” ์ •๋ฐ˜๋Œ€์˜ ๊ฐœ๋…์ด๋‹ค.)

 

โ—์ ์ ˆํ•œ alpha๊ฐ’์„ ์ฐพ๊ธฐ ์œ„ํ•ด R²๊ฐ’์˜ ๊ทธ๋ž˜ํ”„๋ฅผ ๊ทธ๋ฆฌ๊ธฐ

 

alpha๊ฐ’์„ ๋ฐ”๊ฟ€ ๋•Œ๋งˆ๋‹ค score() ๋ฉ”์„œ๋“œ์˜ ๊ฒฐ๊ณผ๋ฅผ ์ €์žฅํ•  ๋ฆฌ์ŠคํŠธ๋ฅผ ๋งŒ๋“ค๊ณ  alpha๊ฐ’์„ 10๋ฐฐ์”ฉ ๋Š˜๋ ค๊ฐ€๋ฉฐ ๋ฆฟ์ง€ ํšŒ๊ท€ ๋ชจ๋ธ ํ›ˆ๋ จ์„ ํ•œ๋‹ค. ๊ทธ ํ›„ ํ›ˆ๋ จ ์„ธํŠธ์™€ ํ…Œ์ŠคํŠธ ์„ธํŠธ์˜ ์ ์ˆ˜๋ฅผ ํŒŒ์ด์ฌ ๋ฆฌ์ŠคํŠธ์— ์ €์žฅํ•œ๋‹ค.

#ridge๋ชจ๋ธ ํ›ˆ๋ จ
from sklearn.linear_model import Ridge
ridge = Ridge()
ridge.fit(train_scaled, train_target)

#alpha๊ฐ’์„ ๋ฐ”๊ฟ€ ๋•Œ๋งˆ๋‹ค score() ๋ฉ”์„œ๋“œ์˜ ๊ฒฐ๊ณผ๋ฅผ ์ €์žฅํ•  ๋ฆฌ์ŠคํŠธ
import matplotlib.pyplot as plt
train_score = []
test_score = []

#alpha๊ฐ’์„ 0.001์• ์„œ 100๊นŒ์ง€ 10๋ฐฐ์”ฉ ๋Š˜๋ ค๊ฐ€๋ฉฐ ๋ฆฟ์ง€ ํšŒ๊ท€ ๋ชจ๋ธ ํ›ˆ๋ จ
alpha_list = [0.001, 0.01, 0.1, 1, 10, 100]
for alpha in alpha_list:
  #๋ฆฟ์ง€ ๋ชจ๋ธ ๋งŒ๋“ค๊ธฐ
  ridge = Ridge(alpha=alpha)
  #๋ฆฟ์ง€ ๋ชจ๋ธ ํ›ˆ๋ จ
  ridge.fit(train_scaled, train_target)
  #ํ›ˆ๋ จ ์ ์ˆ˜์™€ ํ…Œ์ŠคํŠธ ์ ์ˆ˜๋ฅผ ์ €์žฅ
  train_score.append(ridge.score(train_scaled, train_target))
  test_score.append(ridge.score(test_scaled, test_target))

#๊ทธ๋ž˜ํ”„๊ทธ๋ฆฌ๊ธฐ ์ฃผ์˜_๋กœ๊ทธํ•จ์ˆ˜๋กœ ๋ฐ”๊ฟ” ์ง€์ˆ˜๋กœ ํ‘œํ˜„ํ•˜๊ธฐ
plt.plot(np.log10(alpha_list), train_score)
plt.plot(np.log10(alpha_list), test_score)
plt.xlabel('alpha')
plt.ylabel('R²')
plt.show()

์œ„๋Š” ํ›ˆ๋ จ ์„ธํŠธ ๊ทธ๋ž˜ํ”„, ์•„๋ž˜๋Š” ํ…Œ์ŠคํŠธ ์„ธํŠธ ๊ทธ๋ž˜ํ”„์ด๋‹ค. ์ ์ ˆํ•œ alpha๊ฐ’์€ ๋‘ ๊ทธ๋ž˜ํ”„๊ฐ€ ๊ฐ€์žฅ ๊ฐ€๊น๊ณ  ํ…Œ์ŠคํŠธ ์ ์ˆ˜๊ฐ€ ๊ฐ€์žฅ ๋†’์€ -1, ์ฆ‰ 10โป¹=0.1์ด๋‹ค. 

๐Ÿ”ฅ๋ผ์˜ ํšŒ๊ท€ (L1 Regression, Lasso regression)

Lasso๋Š” ๊ทœ์ œ๊ฐ€ ์žˆ๋Š” ํšŒ๊ท€ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ธ ๋ผ์˜ ํšŒ๊ท€ ๋ชจ๋ธ์„ ํ›ˆ๋ จํ•œ๋‹ค. ์ด ํด๋ž˜์Šค๋Š” ์ตœ์ ์˜ ๋ชจ๋ธ์„ ์ฐพ๊ธฐ ์œ„ํ•ด ์ขŒํ‘œ์ถ•์„ ๋”ฐ๋ผ ์ตœ์ ํ™”๋ฅผ ์ˆ˜ํ–‰ํ•ด๊ฐ€๋Š” ์ขŒํ‘œ ํ•˜๊ฐ•๋ฒ•(coordinate descent)์„ ์‚ฌ์šฉํ•œ๋‹ค. ๋ฆฟ์ง€์™€ ๋‹ฌ๋ฆฌ ๊ณ„์ˆ˜ ๊ฐ’์„ ์•„์˜ˆ 0์œผ๋กœ ๋งŒ๋“ค ์ˆ˜๋„ ์žˆ๋‹ค.

 

โ—์ ์ ˆํ•œ alpha๊ฐ’์„ ์ฐพ๊ธฐ ์œ„ํ•ด R²๊ฐ’์˜ ๊ทธ๋ž˜ํ”„๋ฅผ ๊ทธ๋ฆฌ๊ธฐ

Lasso ๋ชจ๋ธ์„ ํ›ˆ๋ จํ•˜๋Š” ๊ฒƒ์€ Ridge์™€ ๋งค์šฐ ๋น„์Šทํ•˜๋‹ค. Ridge๋ชจ๋ธ๊ณผ ๊ฐ™์ด Lasso ๋˜ํ•œ ์ ์ ˆํ•œ alpha๊ฐ’์„ ๋ฐ”๊พธ์–ด๊ฐ€๋ฉฐ ๊ทœ์ œ์˜ ๊ฐ•๋„๋ฅผ ์กฐ์ ˆํ•œ๋‹ค.

#lasso ๋ชจ๋ธ ํ›ˆ๋ จ
from sklearn.linear_model import Lasso
lasso = Lasso()
lasso.fit(train_scaled, train_target)

train_score = []
test_score = []
alpha_list = [0.001, 0.01, 0.1, 10, 100]
for alpha in alpha_list:
  #๋ผ์˜ ๋ชจ๋ธ์„ ๋งŒ๋“ญ๋‹ˆ๋‹ค.
  lasso = Lasso(alpha=alpha, max_iter=10000)
  #๋ผ์˜ ๋ชจ๋ธ์„ ํ›ˆ๋ จํ•ฉ๋‹ˆ๋‹ค.
  lasso.fit(train_scaled, train_target)
  #ํ›ˆ๋ จ ์ ์ˆ˜์™€ ํ…Œ์ŠคํŠธ ์ ์ˆ˜๋ฅผ ์ €์žฅ
  train_score.append(lasso.score(train_scaled, train_target))
  test_score.append(lasso.score(test_scaled, test_target))

#๊ทธ๋ž˜ํ”„ ๊ทธ๋ฆฌ๊ธฐ
plt.plot(np.log10(alpha_list), train_score)
plt.plot(np.log10(alpha_list), test_score)
plt.xlabel('alpha')
plt.ylabel('R²')
plt.show()

Lasso ๋ชจ๋ธ์—์„œ ์ตœ์ ์˜ alpha๊ฐ’์€ 1, ์ฆ‰ 10¹=10 ์ด๋‹ค.

โ“๋ผ์˜ ๋ชจ๋ธ์ด ๋ฆฟ์ง€์™€ ๋‹ฌ๋ฆฌ ๊ณ„์ˆ˜ ๊ฐ’์„ ์•„์˜ˆ 0์œผ๋กœ ๋งŒ๋“ค ์ˆ˜๋„ ์žˆ๋‹ค๋Š” ๊ฒƒ์€ ๋ฌด์Šจ ๋œป์ธ๊ฐ€์š”?

#๋ผ์˜ ๋ชจ๋ธ ๊ณ„์ˆ˜ ์ค‘ 0์ธ ๊ฒƒ ์ถ”๋ ค๋‚ด๊ธฐ
print(np.sum(lasso.coef_ == 0))

#40

๋ผ์˜ ๋ชจ๋ธ์˜ ๊ณ„์ˆ˜๋Š” coef_์†์„ฑ์— ์ €์žฅ๋˜์–ด ์žˆ๋‹ค. ์ด ์ค‘ 0์ธ ๊ฒƒ์„ ํ—ค์•„๋ฆฌ๋ฉด 40๊ฐœ๊ฐ€ ๋œ๋‹ค๋Š” ๊ฒƒ์„ ์•Œ ์ˆ˜ ์žˆ๋‹ค. 55๊ฐœ์˜ ํŠน์„ฑ์„ ๋ชจ๋ธ์— ์ฃผ์ž…ํ–ˆ์ง€๋งŒ ๋ผ์˜ ๋ชจ๋ธ์ด ์‚ฌ์šฉํ•œ ํŠน์„ฑ์€ 15๊ฐœ๋ฐ–์— ๋˜์ง€ ์•Š๋Š”๋‹ค. ์ด์ฒ˜๋Ÿผ ๋ผ์˜ ๋ชจ๋ธ์€ ๊ณ„์ˆ˜ ๊ฐ’์„ ์•„์˜ˆ 0์œผ๋กœ ๋งŒ๋“ค์–ด ๋‚ผ ์ˆ˜ ์žˆ๋‹ค. ์ด๋Ÿฐ ํŠน์ง• ๋•Œ๋ฌธ์— ๋ผ์˜ ๋ชจ๋ธ์„ ์œ ์šฉํ•œ ํŠน์„ฑ์„ ๊ณจ๋ผ๋‚ด๋Š” ์šฉ๋„๋กœ๋„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค. 

* ) ๋ผ์˜ ํšŒ๊ท€๋Š” ํŒŒ๋ผ๋ฏธํ„ฐ์˜ ํฌ๊ธฐ์— ๊ด€๊ณ„์—†์ด ๊ฐ™์€ ์ˆ˜์ค€์˜ Regularization์„ ์ ์šฉํ•˜๊ธฐ ๋•Œ๋ฌธ์— ์ž‘์€ ๊ฐ’์˜ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ 0์œผ๋กœ ๋งŒ๋“ค์–ด ํ•ด๋‹น ๋ณ€์ˆ˜๋ฅผ ์‚ญ์ œํ•˜๊ณ  ๋”ฐ๋ผ์„œ ๋ชจ๋ธ์„ ๋‹จ์ˆœํ•˜๊ฒŒ ๋งŒ๋“ค์–ด์ฃผ๊ณ  ํ•ด์„์— ์šฉ์ดํ•˜๊ฒŒ ๋งŒ๋“ค์–ด์ค€๋‹ค.


๐Ÿ’ก๋‹ค์ค‘ ํšŒ๊ท€(multiple regression)

์—ฌ๋Ÿฌ ๊ฐœ์˜ ํŠน์„ฑ์„ ์‚ฌ์šฉํ•œ ์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ์ด๋‹ค. ํŠน์„ฑ์ด ๋งŽ์œผ๋ฉด ์„ ํ˜• ๋ชจ๋ธ์€ ๊ฐ•๋ ฅํ•œ ์„ฑ๋Šฅ์„ ๋ฐœํœ˜ํ•œ๋‹ค. ๊ณ ๋ คํ•ด์•ผํ•˜๋Š” ๋ณ€์ˆ˜๊ฐ€ ๋งŽ๊ธฐ ๋•Œ๋ฌธ์— ์ธ๊ฐ„์˜ ์ƒ๊ฐ๊ณผ ์ƒ์ƒ์œผ๋กœ๋Š” ๋ถˆ๊ฐ€๋Šฅํ•˜๋‹ค. ํ•˜๋‹จ์˜ ์ด๋ฏธ์ง€์™€ ๊ฐ™์ด ํŠน์„ฑ 2๊ฐœ๋งŒ์„ ์‚ฌ์šฉํ•˜์—ฌ 3์ฐจ์› ๊ณต๊ฐ„์„ ํ˜•์„ฑํ•  ์ˆ˜์žˆ์ง€๋งŒ, ๊ทธ ์ด์ƒ์€ ๋ถˆ๊ฐ€๋Šฅํ•˜๊ธฐ์— ๋‹ค์ค‘ ํšŒ๊ท€๋ฅผ ์ด์šฉํ•œ๋‹ค.

* )๋‹ค์ค‘ ์„ ํ˜• ํšŒ๊ท€์˜ ์˜ˆ์ธก ํ•จ์ˆ˜์ด๋‹ค. ํŠน์„ฑ์€ ์ด p+1๊ฐœ, ๊ทธ์— ๋”ฐ๋ผ ๊ฐ€์ค‘์น˜๋„ p+1๊ฐœ์ด๋‹ค. ์ฃผ์–ด์ง„ ์—ฌ๋Ÿฌ ์ƒ˜ํ”Œ๋“ค์˜ p+1๊ฐœ์˜ ํŠน์ง•(x[0]~x[p])๊ณผ ๋ผ๋ฒจ๊ฐ’(y) ์‚ฌ์ด์˜ ๊ด€๊ณ„๋ฅผ ์ž˜ ๋‚˜ํƒ€๋‚ด์ฃผ๋Š” w์™€ b๋ฅผ ์ฐพ์•„์•ผ ํ•œ๋‹ค. ํŠน์ง•์ด 1๊ฐœ์ธ ์„ ํ˜•ํšŒ๊ท€์—์„  ๋ชจ๋ธ์ด ์ง์„ ์ด์—ˆ์ง€๋งŒ, 2๊ฐœ๋ฉด ํ‰๋ฉด์ด ๋˜๊ณ , ๊ทธ ์ด์ƒ์ด๋ฉด ์ดˆํ‰๋ฉด(hyperplane)์ด ๋œ๋‹ค. 

 

+) ๋‹คํ–ฅํšŒ๊ท€(polynomial regression) : ๋‹คํ•ญ์‹์„ ์‚ฌ์šฉํ•˜์—ฌ ํŠน์„ฑ๊ณผ ํƒ€๊นƒ ์‚ฌ์ด์˜ ๊ด€๊ณ„๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ํšŒ๊ท€. ๋‹ค์ค‘ํšŒ๊ท€์™€ ๋‹คํ•ญํšŒ๊ท€๋Š” ์—„์—ฐํžˆ ๋‹ค๋ฅด๊ฒŒ ๋•Œ๋ฌธ์— ํ—ท๊ฐˆ๋ฆฌ์ง€ ๋ง๋„๋ก ํ•˜์ž.

๐Ÿ’กํŒ๋‹ค์Šค(pandas)

ํŒ๋‹ค์Šค๋Š” ์œ ๋ช…ํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ์ด๋‹ค. ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„(dataframe)์€ ํŒ๋‹ค์Šค์˜ ํ•ต์‹ฌ ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ์ด๋‹ค. ๋‹ค์ค‘ ํšŒ๊ท€์—์„œ ํŠน์„ฑ์ด ๋งŽ์„ ๋•Œ, ๋Š˜์–ด๋‚œ ๋ฐ์ดํ„ฐ๋ฅผ ๋ณต๋ถ™ํ•˜๋Š” ๊ฒƒ์€ ๊ต‰์žฅํžˆ ๋ฒˆ๊ฑฐ๋กญ๋‹ค. ๊ทธ๋ ‡๊ธฐ ๋•Œ๋ฌธ์— ํŒ๋‹ค์Šค๋ฅผ ์ด์šฉํ•˜์—ฌ ์‰ฝ๊ฒŒ ๋„˜ํŒŒ์ด ๋ฐฐ์—ด ๋˜๋Š” ๋‹ค์ฐจ์› ๋ฐฐ์—ด๋กœ ์†์‰ฝ๊ฒŒ ๋ฐ”๊พธ๋ฉด ์‚ถ์ด ์œคํƒํ•ด์งˆ ๊ฒƒ์ด๋‹ค.

read_csv() ํ•จ์ˆ˜์— ๋ฐ์ดํ„ฐ ์ฃผ์†Œ๋ฅผ ๋„ฃ๊ณ  ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์„ ๋งŒ๋“ ๋‹ค. ๊ทธ ๋‹ค์Œ์— to_numpy() ๋ฉ”์†Œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋„˜ํŒŒ์ด ๋ฐฐ์—ด๋กœ ๋ฐ”๊พธ๋ฉด ๋œ๋‹ค.

import pandas as pd
df = pd.read_csv('https://bit.ly/perch_csv_data') #create a dataframe
perch_full = df.to_numpy() #convert to a NumPy array
print(perch_full)

๐Ÿ’ก์‚ฌ์ดํ‚ท๋Ÿฐ API์˜ ์„ธ ๊ฐ€์ง€ ์œ ํ˜•

1๏ธโƒฃ์ถ”์ •๊ธฐ(estimator)

- ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ์…‹๊ณผ ๊ด€๋ จ๋œ ํŠน์ • ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’๋“ค์„ ์ถ”์ •ํ•˜๋Š” ๊ฐ์ฒด

- fit() ๋ฉ”์„œ๋“œ ํ™œ์šฉ: ํŠน์ • ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’์„ ์ €์žฅํ•œ ์†์„ฑ์ด ์—…๋ฐ์ดํŠธ๋œ ๊ฐ์ฒด ์ž์‹  ๋ฐ˜ํ™˜

 

2๏ธโƒฃ๋ณ€ํ™˜๊ธฐ(transformer)

- fit() ๋ฉ”์„œ๋“œ์— ์˜ํ•ด ํ•™์Šต๋œ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ด์šฉํ•˜์—ฌ ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ์…‹ ๋ณ€ํ™˜ transform() ๋ฉ”์„œ๋“œ ํ™œ์šฉ

- fit() ๋ฉ”์„œ๋“œ์™€ transform() ๋ฉ”์„œ๋“œ๋ฅผ ์—ฐ์†ํ•ด์„œ ํ˜ธ์ถœํ•˜๋Š” fit_transform() ๋„ ํ™œ์šฉ ๊ฐ€๋Šฅ

 

3๏ธโƒฃ์˜ˆ์ธก๊ธฐ(predictor)

- ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ์…‹๊ณผ ๊ด€๋ จ๋œ ๊ฐ’์„ ์˜ˆ์ธกํ•˜๋Š” ๊ธฐ๋Šฅ์„ ์ œ๊ณตํ•˜๋Š” ์ถ”์ •๊ธฐ

- predict() ๋ฉ”์„œ๋“œ ํ™œ์šฉ

- fit() ๊ณผ predict() ๋ฉ”์„œ๋“œ๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ์–ด์•ผ ํ•จ

- predict() ๋ฉ”์„œ๋“œ๊ฐ€ ์ถ”์ •ํ•œ ๊ฐ’์˜ ์„ฑ๋Šฅ์„ ์ธก์ •ํ•˜๋Š” score() ๋ฉ”์„œ๋“œ๋„ ํฌํ•จ

- ์ผ๋ถ€ ์˜ˆ์ธก๊ธฐ๋Š” ์ถ”์ •์น˜์˜ ์‹ ๋ขฐ๋„๋ฅผ ํ‰๊ฐ€ํ•˜๋Š” ๊ธฐ๋Šฅ๋„ ์ œ๊ณต


[์ฝ”๋“œ]

์‚ฌ์ดํ‚ท๋Ÿฐ์˜ ๋ณ€ํ™˜๊ธฐ

#์‚ฌ์ดํ‚ท๋Ÿฐ์˜ ๋ณ€ํ™˜๊ธฐ
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(include_bias=False) #์ ˆํŽธ์„ ์œ„ํ•œ ํ•ญ ์ œ๊ฑฐ
poly.fit(train_input)
train_poly = poly.transform(train_input) #ํ›ˆ๋ จ ์„ธํŠธ ํ›ˆ๋ จ ๋ฐ ๋ณ€ํ™˜
print(train_poly.shape)
#(42, 9)

poly.get_feature_names() #ํ”ผ์ณ ์—”์ง€๋‹ˆ์–ด๋ง ์กฐํ•ฉ ํ™•์ธ
#['x0', 'x1', 'x2', 'x0^2', 'x0 x1', 'x0 x2', 'x1^2', 'x1 x2', 'x2^2']

test_poly = poly.transform(test_input) #ํ›ˆ๋ จ์„ธํŠธ์— ์ ์šฉํ–ˆ๋˜ ๋ณ€ํ™˜๊ธฐ๋กœ ํ…Œ์ŠคํŠธ ์„ธํŠธ ๋ณ€ํ™˜
print(test_poly.shape) 
#(14, 9)

๋‹ค์ค‘ ํšŒ๊ท€ ๋ชจ๋ธ ํ›ˆ๋ จํ•˜๊ธฐ (๊ณผ๋Œ€์ ํ•ฉ ๋ฌธ์ œ)

55๊ฐœ์˜ ํŠน์„ฑ์ด ์˜๋ฏธํ•˜๋Š” ๋ฐ” : ์ƒ˜ํ”Œ ๊ฐœ์ˆ˜๋ณด๋‹ค ํŠน์„ฑ์ด ๋งŽ๋‹ค → ๊ณผ๋Œ€์ ํ•ฉ ๋ฌธ์ œ.

#๋‹ค์ค‘ ํšŒ๊ท€ ๋ชจ๋ธ ํ›ˆ๋ จํ•˜๊ธฐ
from sklearn.linear_model import LinearRegression #์„ ํ˜•ํšŒ๊ท€
lr = LinearRegression()
lr.fit(train_poly, train_target)
print(lr.score(train_poly, train_target)) #ํ›ˆ๋ จ ์„ธํŠธ score ์ถœ๋ ฅ
#0.9903183436982124

print(lr.score(test_poly, test_target)) #ํ…Œ์ŠคํŠธ ์„ธํŠธ score ์ถœ๋ ฅ
#0.9714559911594132

#ํ…Œ์ŠคํŠธ ์„ธํŠธ์— ๋Œ€ํ•œ ์ ์ˆ˜๋ฅผ ๋†’์ด๊ธฐ ์œ„ํ•ด ํŠน์„ฑ ์ถ”๊ฐ€
poly = PolynomialFeatures(degree=5, include_bias=False) #degree ๋งค๊ฐœ๋ณ€์ˆ˜ ์‚ฌ์šฉ : ์ตœ๋Œ€ ์ฐจ์ˆ˜ ๊ฒฐ์ •
poly.fit(train_input)
train_poly = poly.transform(train_input)
test_poly = poly.transform(test_input)
print(train_poly.shape)
#(42, 55) 55๊ฐœ์˜ ํŠน์„ฑ ํ™•์ธ
#ํ›ˆ๋ จ์„ธํŠธ์— ๋น„ํ•ด ํ…Œ์ŠคํŠธ ์„ธํŠธ์˜ ์ ์ˆ˜๋Š” ์Œ์ˆ˜ -> ๊ณผ๋Œ€์ ํ•ฉ
lr.fit(train_poly, train_target)
print(lr.score(train_poly, train_target))
#0.9999999999991096

print(lr.score(test_poly, test_target))
#-144.40579242335605

[์ฐธ๊ณ ์ž๋ฃŒ]

ํŠน์„ฑ๊ณตํ•™์— ๋Œ€ํ•ด์„œ ํ•œ๊ตญ์–ด๋กœ ์„ค๋ช…๋˜์–ด์žˆ์Œ. ๊ฐ„๋‹จํ•˜๊ฒŒ ์„ค๋ช…๋˜์–ด ์žˆ์–ด์„œ ๊ฐ€๋ณ๊ฒŒ ์ฝ๊ธฐ ์ข‹๋‹ค.

https://itwiki.kr/w/%ED%8A%B9%EC%84%B1_%EA%B3%B5%ED%95%99

 

ํŠน์„ฑ ๊ณตํ•™ - IT์œ„ํ‚ค

 

itwiki.kr

ํŠน์„ฑ๊ณตํ•™ ์ด๋ฏธ์ง€ ์‚ฌ์šฉ, ์ž์„ธํ•œ ์„ค๋ช…. 

https://www.javatpoint.com/feature-engineering-for-machine-learning

 


ํŠน์„ฑ๊ณตํ•™์— ๋Œ€ํ•œ ์ž์„ธํ•œ ์„ค๋ช…. ๋ฐ”๋กœ ์œ„์˜ ์ž๋ฃŒ์™€ ๋ณ‘ํ–‰ํ•ด์„œ ์ฝ์–ด๋„ ๊ดœ์ฐฎ๋‹ค.

https://machinelearningmastery.com/discover-feature-engineering-how-to-engineer-features-and-how-to-get-good-at-it/

 


ํ•œ๊ตญ์–ด๋กœ ๋˜์–ด์žˆ์–ด์„œ ์ฝ๊ธฐ ํŽธํ•จ. 

https://velog.io/@guide333/%EC%95%84%EC%9D%B4%ED%9A%A8-Feature-Engineering

 

[์•„์ดํšจ] Feature Engineering

์ด ํฌ์ŠคํŒ…์€ ์Šคํ„ฐ๋”” ์ค€๋น„ํ•˜๋ฉด์„œ ๋งŒ๋“  ์ž๋ฃŒ๋ฅผ ์ •๋ฆฌํ•œ ๊ฒƒ์ž…๋‹ˆ๋‹ค. Feature Engineering์„ ์ž˜ ํ‘œํ˜„ํ•œ ๋ฌธ์žฅ์ด๋‹ค. Feature Engineering์€ ๋ฐ์ดํ„ฐ ๋ถ„์„์—์„œ ๋งŽ์€ ์ง€๋ถ„์„ ์ฐจ์ง€ํ•˜๋Š” ๋ถ€๋ถ„์ด๋‹ค.

velog.io

ํŠน์„ฑ๊ณตํ•™์˜ ํ•„์š”์„ฑ์— ๋Œ€ํ•ด์„œ ์ž˜ ์„ค๋ช…ํ•ด์ฃผ์‹ ๋‹ค.

http://www.incodom.kr/%EA%B8%B0%EA%B3%84%ED%95%99%EC%8A%B5/feature_engineering

 

๊ธฐ๊ณ„ํ•™์Šต/feature engineering - ์ธ์ฝ”๋ค, ์ƒ๋ฌผ์ •๋ณด ์ „๋ฌธ์œ„ํ‚ค

# feature engineering

www.incodom.kr

๋‹ค์ค‘์„ ํ˜•ํšŒ๊ท€ ์ด๋ฏธ์ง€ ์ฐธ๊ณ 

https://hleecaster.com/ml-multiple-linear-regression-example/

 

๋‹ค์ค‘์„ ํ˜•ํšŒ๊ท€(Multiple Linear Regression) - ํŒŒ์ด์ฌ ์ฝ”๋“œ ์˜ˆ์ œ - ์•„๋ฌดํŠผ ์›Œ๋ผ๋ฐธ

ํŒŒ์ด์ฌ scikit-learn์œผ๋กœ ๋‹ค์ค‘์„ ํ˜•ํšŒ๊ท€(Multiple Linear Regression) ๋ถ„์„ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ฝ”๋“œ ์˜ˆ์ œ์™€ ํ•จ๊ป˜ ์‚ดํŽด๋ณด์ž.

hleecaster.com

์‚ฌ์ดํ‚ท๋Ÿฐ์˜ API ์„ธ ๊ฐ€์ง€ ์œ ํ˜• ์ฐธ๊ณ 

https://codingalzi.github.io/handson-ml2/slides/handson-ml2-02b-slides.pdf

 

๋‹ค์ค‘์„ ํ˜•ํšŒ๊ท€์— ๋Œ€ํ•ด์„œ ์‰ฝ๊ฒŒ ์„ค๋ช…๋˜์–ด์žˆ๋‹ค. + ๋ผ์˜

https://otugi.tistory.com/127

 

์„ ํ˜•ํšŒ๊ท€(linear regression), ๋ผ์˜(LASSO), ๋ฆฌ์ง€(Ridge)

์„ ํ˜• ํšŒ๊ท€๋Š” ์‚ฌ์šฉ๋˜๋Š” ํŠน์„ฑ(feature)์˜ ๊ฐฏ์ˆ˜์— ๋”ฐ๋ผ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๊ตฌ๋ถ„๋œ๋‹ค. - ๋‹จ์ˆœ ์„ ํ˜• ํšŒ๊ท€(simple linear regression) : ํŠน์ง•์ด 1๊ฐœ - ๋‹ค์ค‘ ์„ ํ˜• ํšŒ๊ท€(multiple linear regression) : ํŠน์ง•์ด ์—ฌ๋Ÿฌ๊ฐœ LASSO์™€ Ri..

otugi.tistory.com

 

๋ผ์˜์™€ ๋ฆฟ์ง€์— ๋Œ€ํ•ด ๋” ์ž์„ธํžˆ ์•Œ๊ณ ์‹ถ๋‹ค๋ฉด?

https://sanghyu.tistory.com/13

 

Regularization(์ •๊ทœํ™”): Ridge regression/LASSO

์ด ๊ฐ•์˜๋ฅผ ๋ณด๊ณ  ์ •๋ฆฌํ•œ ๋‚ด์šฉ์ด๊ณ  ์ž๋ฃŒ๋„ ๊ฐ•์˜์—์„œ ๊ฐ€์ ธ์˜จ ์ž๋ฃŒ์ž„์„ ๋ฐํžˆ๊ณ  ์‹œ์ž‘ํ•œ๋‹ค. ์ด์ „ ํฌ์ŠคํŒ…์—์„œ ์‚ดํŽด๋ณธ linear regression ๋ชจ๋ธ์„ ๋‹ค์‹œ ์‚ดํŽด๋ณด์ž. ์ด๋ ‡๊ฒŒ least square solution์„ ๊ตฌํ•˜๋ฉด ๋„ˆ๋ฌด ๋ชจ๋ธ

sanghyu.tistory.com

https://rk1993.tistory.com/entry/Ridge-regression%EC%99%80-Lasso-regression-%EC%89%BD%EA%B2%8C-%EC%9D%B4%ED%95%B4%ED%95%98%EA%B8%B0

 

Ridge regression(๋ฆฟ์ง€ ํšŒ๊ท€)์™€ Lasso regression(๋ผ์˜ ํšŒ๊ท€) ์‰ฝ๊ฒŒ ์ดํ•ดํ•˜๊ธฐ

Ridge regression์™€ Lasso regression๋ฅผ ์ดํ•ดํ•˜๋ ค๋ฉด ์ผ๋‹จ ์ •๊ทœํ™”(regularization)๋ฅผ ์•Œ์•„์•ผํ•ฉ๋‹ˆ๋‹ค. ์ฒซ๋ฒˆ์งธ ๊ทธ๋ฆผ์„ ๋ณด๋ฉด ์ง์„  ๋ฐฉ์ •์‹์„ ์ด์šฉํ•˜์—ฌ ์„ ์„ ๊ทธ์—ˆ์Šต๋‹ˆ๋‹ค. ๋ฐ์ดํ„ฐ์™€ ์ง์„ ์˜ ์ฐจ์ด๊ฐ€ ๊ฝค ๋‚˜๋„ค์š”. ์ •ํ™•ํ•œ

rk1993.tistory.com