
Category: All posts (402)

[Data Analysis for Marketing 3-1] Technical Methods (Funnel, Descriptive/Predictive Analytics, ROI & ROAS)

Funnel: the word literally means a funnel, but in marketing it refers to the process by which a consumer becomes a customer.
Sales funnel = purchase funnel - a consumer-centered marketing model that shows the theoretical customer journey toward purchasing a product or service.
1. Awareness: the stage where the user is aware of the brand or product.
2. Interest: also called understanding or discovery.
3. Decision: also called consideration. The stage where the user starts making a decision.
4. Action, Sales, Conversion: the stage where the user actually purchases.
5. Loyalty: repurchase. The stage where the user buys again.
For marketers, the funnel is a way to visualize the journey by which customers purchase and repurchase..
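As a rough illustration of the five stages listed in this excerpt, the sketch below computes the conversion rate between adjacent funnel stages. The user counts and the helper name `step_conversion` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical user counts at each funnel stage (illustrative numbers only).
funnel = [
    ("awareness", 10000),
    ("interest", 4000),
    ("decision", 1000),
    ("action", 250),
    ("loyalty", 50),
]


def step_conversion(stages):
    """Return (from_stage, to_stage, rate) for each adjacent pair of stages."""
    rates = []
    for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
        rates.append((name_a, name_b, count_b / count_a))
    return rates


rates = step_conversion(funnel)
for src, dst, rate in rates:
    print(f"{src} -> {dst}: {rate:.1%}")

# Share of aware users who end up purchasing (awareness -> action).
overall = funnel[3][1] / funnel[0][1]
print(f"awareness -> action: {overall:.2%}")
```

Reading the rates stage by stage shows where most users drop out, which is the kind of question the funnel view is meant to answer.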

[Data Analysis for Marketing 2] Segmentation Using Clustering

Clustering lets you find patterns in data that are not visible on the surface. The key is to find how many clusters produce a good segmentation. The representative clustering method is k-means clustering.
k-means clustering
- groups similar data points
- iterative approach
- Starting point: randomly selected cluster centers. Variables = whatever you are interested in (location, demographics, ...)
----> re-evaluate how good your random choice was and improve it!
Process 1..
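The excerpt breaks off before the step-by-step process, so the following is only a minimal sketch of the standard k-means loop it describes: pick random centers, assign each point to the nearest center, move each center to the mean of its points, and repeat. The function name `kmeans`, its parameters, and the sample points are assumptions for illustration, not the post's own code.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # starting point: randomly selected centers

    def nearest(p):
        # Index of the center with the smallest squared Euclidean distance to p.
        distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        return distances.index(min(distances))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        new_centers = []
        for cluster, old in zip(clusters, centers):
            if cluster:
                # Move the center to the mean of the points assigned to it.
                new_centers.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers

    labels = [nearest(p) for p in points]
    return centers, labels


# Two visibly separate groups of hypothetical 2-D points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, labels = kmeans(points, k=2)
print(centers)
print(labels)
```

With a real dataset the result depends on the random starting centers, which is why the excerpt stresses re-evaluating and improving the initial choice.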

[Data Analysis for Marketing 1] Segmentation in Marketing

*This post summarizes what I studied about data analysis methodology for marketing.
Segmentation
- grouping people who share common characteristics
- can be used for targeting on platforms (Facebook, Google, ...)
- segmentation must be performed before targeting.
Segmentation takes two forms.
1. developed from a persona - classified by a person's attributes (age, occupation, etc.)
2. developed from data analytics - k-means clustering, statistical analysis, ...
Why segment? - Segmentation helps us reach the right users!
Segmentation variables: demographic, psychogr..

[Machine Learning 4] Logistic Regression python

Binary classification is the problem of classifying data that falls into two classes. The predicted value is not continuous but either 0 or 1.
Examples
Email: spam or not?
Online transaction: Fraudulent Financial Statement (FFS) or not?
Tumor: malignant (cancer) or benign?
In this case we turn the prediction into a probability, then classify it as 1 if the probability is higher than our threshold and 0 otherwise. The method for solving this kind of problem is called logistic regression. With three or more classes, it is multi-class classification.
To run logistic regression, the output must be constrained to a value between 0 and 1. This ..
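The excerpt is cut off before it names how the output is squeezed into the 0 to 1 range. The usual choice in logistic regression is the sigmoid function, so the sketch below assumes that. The weights, bias, and 0.5 threshold are hypothetical values picked for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias, threshold=0.5):
    """Return (probability, label) for one input vector x."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    p = sigmoid(z)
    # Label 1 if the probability is higher than the threshold, otherwise 0.
    return p, int(p > threshold)


# Hypothetical single-feature model: weight 1.0, bias 0.0.
p_pos, label_pos = predict([2.0], weights=[1.0], bias=0.0)
p_neg, label_neg = predict([-2.0], weights=[1.0], bias=0.0)
print(p_pos, label_pos)
print(p_neg, label_neg)
```

Raising the threshold makes the classifier more conservative about predicting 1, which matters in cases like spam or fraud detection where the two kinds of mistakes have different costs.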

[Machine Learning 3] Multiple Linear Regression python

In practice, making a prediction usually requires considering more than one variable. Multiple linear regression is a predictive model that uses several input variables. Taking the image above as an example, to predict the house price (y) there are four features (n=4): x1 = number of bedrooms, x2 = number of floors, x3 = age of the house, x4 = size. feature = dimension = attribute. x(2) = [3 2 40 127] (as a column vector), and x3(2) is 30. The default is always a column vector; to write it as a row vector, i.e. [3 2 40 127], use the transpose. The prediction model is as follows. theta0,1,2,3 are the weights of each variable, and x1,2,3 are each fea..
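The model formula itself is not visible in this excerpt. The sketch below assumes the standard form h(x) = theta_0 + theta_1*x_1 + ... + theta_n*x_n and applies it to the feature vector [3 2 40 127] quoted above. The weights in `theta` are made-up numbers, not fitted values.

```python
def predict_price(theta, x):
    """Compute h(x) = theta_0 + theta_1*x_1 + ... + theta_n*x_n."""
    # Prepend x_0 = 1 so the intercept theta_0 fits the same dot product.
    x_with_bias = [1.0] + list(x)
    return sum(t * xi for t, xi in zip(theta, x_with_bias))


# Second training example from the excerpt: bedrooms, floors, age, size.
x2 = [3, 2, 40, 127]

# Hypothetical weights: theta_0 (intercept) followed by one weight per feature.
theta = [50.0, 10.0, 5.0, -0.5, 2.0]

price = predict_price(theta, x2)
print(price)
```

Prepending the constant 1 is the same trick that lets the whole model be written as a single transpose product between the weight vector and the feature vector.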

[Machine Learning 2] Polynomial Regression python

Consider a one-variable regression that predicts price from house size. Looking at the distribution of the data, the relationship may not follow a straight line (size and price are not proportional). In that case, instead of using the variable as-is, you can add transformed values of x, such as sqrt(x), powers of x, or sin(x), as new columns and build the prediction model on them. After that it is just multiple regression (apply the loss function and gradient descent). In Python, you can use scikit-learn's PolynomialFeatures. from sklearn.preprocessing import PolynomialFeatures https://scikit-learn.org/stable/modules/generated/sklearn.p..
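To make the "add transformed columns" idea concrete without depending on scikit-learn, here is a hand-rolled sketch that expands one input into [x, x^2, sqrt(x)]. The helper `expand_features` and the sample sizes are hypothetical. PolynomialFeatures automates the power terms, but not the square root shown here.

```python
import math


def expand_features(x):
    """Return [x, x^2, sqrt(x)] for a single non-negative input x."""
    return [x, x ** 2, math.sqrt(x)]


# Hypothetical house sizes; each row of X becomes one training example
# with three columns instead of one.
sizes = [4.0, 9.0, 16.0]
X = [expand_features(s) for s in sizes]
for row in X:
    print(row)
```

Once the extra columns exist, the model is still linear in its weights, which is why the ordinary multiple-regression machinery applies unchanged.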

[kaggle competition 1] Store Sales - Time Series Forecasting: Use machine learning to predict grocery sales, 1 - Variable descriptions

https://www.kaggle.com/competitions/store-sales-time-series-forecasting
Competition overview - predict sales for the thousands of product families sold at Favorita stores located in Ecuador.
Dataset and variable descriptions
1. train.csv
- store_nbr: id of the store where the product is sold
- family: the type of product sold ( ['AUTOMOTIVE', 'BABY CARE', 'BEAUTY', 'BEVERAGES', 'BOOKS', 'BREAD/BAKERY', 'CELEBRATION', 'CLEANING', 'DAIRY', 'DELI', 'EGGS',..

[Machine Learning 1] Linear Regression, gradient descent python

Based on past data, the model should produce an output (prediction) when given an input. The variable to predict = target variable. If the target variable is a real number = regression problem; if the target variable is categorical = classification (representative method: logistic regression). Both are supervised learning. Unsupervised learning includes clustering (k-means) and others.
Linear Regression
- a method for modeling the linear relationship between a dependent variable y and one or more independent variables X (= finding a first-degree line)
- the process of finding the best-fit line to derive the relationship between the independent and dependent variables
Independent variable = input value..
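The title mentions gradient descent, which the excerpt does not reach. As a minimal sketch under the usual assumptions (one input variable, mean squared error as the loss), the loop below repeatedly nudges the slope `w` and intercept `b` against the gradient. The learning rate, epoch count, and toy data are illustrative choices.

```python
def gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w*x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training example under the current line.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should recover w = 2, b = 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print(w, b)
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small needs many more epochs, so the value usually has to be tuned per dataset.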

[Linear Algebra] What a Basis Vector Means

While studying linear algebra, I hit my first hurdle at basis vectors. Even after looking up explanations, they were too hard for me to understand. I can't claim to understand it 100%, but I will try to explain basis vectors based on what I have understood so far. A basis vector is the most essential element among the elements that make up a space. The space is usually explained using the coordinate plane. For example, if the space is the coordinate plane (R^2) formed by the x-axis and y-axis, {(1,0),(0,1)} is a basis. This is because multiplying these two vectors by real numbers and adding them, i.e. taking linear combinations, can represent every point on the coordinate plane. However, {(1,0),(2,0)} cannot be a basis. No matter what numbers you multiply by, the points only ever land on the x-axis, so the whole coordinate plane..
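For two vectors in R^2, the condition described in this excerpt can be checked with a 2x2 determinant: a nonzero determinant means the pair spans the plane, and zero means both vectors lie on one line. The sketch below is one way to write that check. The function names are hypothetical, and `coordinates` uses Cramer's rule to find the multipliers for a given point.

```python
def is_basis_of_r2(v1, v2):
    """Two vectors form a basis of R^2 exactly when their determinant is nonzero."""
    det = v1[0] * v2[1] - v1[1] * v2[0]
    return det != 0


def coordinates(v1, v2, point):
    """Solve a*v1 + b*v2 = point for (a, b) using Cramer's rule."""
    det = v1[0] * v2[1] - v1[1] * v2[0]
    if det == 0:
        raise ValueError("v1 and v2 are linearly dependent, so they are not a basis")
    a = (point[0] * v2[1] - point[1] * v2[0]) / det
    b = (v1[0] * point[1] - v1[1] * point[0]) / det
    return a, b


print(is_basis_of_r2((1, 0), (0, 1)))       # the standard basis from the excerpt
print(is_basis_of_r2((1, 0), (2, 0)))       # both vectors lie on the x-axis
print(coordinates((1, 0), (0, 1), (3, 4)))  # multipliers that reach the point (3, 4)
```

The same point gets different multipliers under a different basis, for example (1, 1) and (1, -1), which is a useful way to see that a basis is a choice rather than a fixed property of the plane.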

[python] packing & unpacking: lists and tuples

Tuple unpacking example - for lists, replacing () with [] gives list packing and unpacking.

    a, b = (1, 10)
    print(a)  # 1
    print(b)  # 10

    a, *b, c = (1, 2, 3, 4, 5)
    print(a)  # 1
    print(b)  # [2, 3, 4]
    print(c)  # 5

It helps to think of * as bundling up whatever is left over.
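A short sketch of the list variant mentioned above, plus packing in the other direction. The variable names are arbitrary. Note that the starred name always receives a list, even when the right-hand side is a tuple.

```python
# Unpacking works the same way when the right-hand side is a list.
first, *middle, last = [1, 2, 3, 4, 5]
print(first, middle, last)

# The starred target can sit at the end to collect everything after the head.
head, *tail = (1, 2, 3)
print(head, tail)

# Packing: comma-separated values on the right are bundled into a tuple.
packed = 1, 10
print(packed)
```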
