
2. ํŒŒ์ด์ฌ ํ”„๋กœ๊ทธ๋ž˜๋ฐ๊ณผ ๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ

by 엔카코 2024. 12. 12.
๋ฐ˜์‘ํ˜•
๋ฐ์ดํ„ฐ๊ฐ€ ๊ณง ์ž์‚ฐ์ธ ์‹œ๋Œ€, ๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•˜๊ณ  ๋ถ„์„ํ•˜๋Š” ๋Šฅ๋ ฅ์€ ํ•„์ˆ˜ ์Šคํ‚ฌ๋กœ ์ž๋ฆฌ ์žก์•˜์Šต๋‹ˆ๋‹ค.
๊ทธ์ค‘์—์„œ๋„ ํŒŒ์ด์ฌ(Python)์€ ๊ฐ„๊ฒฐํ•œ ๋ฌธ๋ฒ•๊ณผ ๋ฐฉ๋Œ€ํ•œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋กœ ๋ฐ์ดํ„ฐ ๋ถ„์„์— ๊ฐ€์žฅ ์ ํ•ฉํ•œ ํ”„๋กœ๊ทธ๋ž˜๋ฐ ์–ธ์–ด๋กœ ํ‰๊ฐ€๋ฐ›๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
์ด๋ฒˆ ๊ธ€์—์„œ๋Š” ํŒŒ์ด์ฌ์„ ์‚ฌ์šฉํ•ด ๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•˜๊ณ  ๋ถ„์„ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์†Œ๊ฐœํ•ฉ๋‹ˆ๋‹ค.
์ดˆ๋ณด์ž๋„ ์‰ฝ๊ฒŒ ๋”ฐ๋ผ ํ•  ์ˆ˜ ์žˆ๋Š” ๋‹จ๊ณ„๋ณ„ ๊ฐ€์ด๋“œ๋ฅผ ํ†ตํ•ด ๋ฐ์ดํ„ฐ ๋ถ„์„์˜ ๊ธฐ์ดˆ๋ฅผ ์ตํ˜€๋ณด์„ธ์š”!


Theory [ Python basic syntax, data structures, the NumPy and pandas libraries ]

 

◼ Python basic syntax

  • Variables and data types: Python supports a variety of data types, including numbers, strings, lists, and dictionaries.
  • Control statements: conditionals and loops control the flow of your logic.
name = "Alice"
age = 25
scores = [85, 90, 88]

for score in scores:
    if score > 80:
        print(f"Great score: {score}")
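
The list above also mentions dictionaries, which the snippet does not use. As a minimal sketch, the same loop-and-condition pattern can be applied to a dictionary; the `student` record and its keys are hypothetical and only serve as an illustration.

```python
# Hypothetical record; the keys and values are illustrative only.
student = {"name": "Alice", "age": 25, "scores": [85, 90, 88]}

# Look up values by key and compute an average over the list.
average = sum(student["scores"]) / len(student["scores"])

# Collect the scores that pass the same threshold used above.
great_scores = [s for s in student["scores"] if s > 80]

print(f"{student['name']} averages {average:.1f}")
print("Scores above 80:", great_scores)
```

A dictionary keeps related values under named keys, which tends to be easier to read than several separate variables once a record has more than a few fields.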

◼ Installing numpy

On Windows, numpy is installed with pip, Python's package manager.

1. Open the Command Prompt

 - In Windows, search for cmd or Command Prompt and run it.

2. Install numpy

 - Enter the command below in the Command Prompt to install numpy.

pip install numpy

3. Verify the installation

 - Once the installation finishes, enter the command below to print the numpy version and confirm that numpy was installed correctly.

 
python -c "import numpy; print(numpy.__version__)"

4. NumPy (a powerful tool for array operations): creating multidimensional arrays and performing mathematical operations

import numpy as np

data = np.array([1, 2, 3, 4])
print("Array:", data)
print("Array mean:", data.mean())
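
The heading mentions multidimensional arrays, while the snippet above uses a one-dimensional one. The following sketch, with illustrative values, shows a 2x3 array and how the `axis` argument selects the direction of an aggregation.

```python
import numpy as np

# A 2x3 array: two rows, three columns (values are illustrative).
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print("shape:", matrix.shape)                 # (rows, columns)
print("column means:", matrix.mean(axis=0))   # average down each column
print("row sums:", matrix.sum(axis=1))        # total across each row
print("doubled:", (matrix * 2).tolist())      # element-wise arithmetic
```

`axis=0` aggregates down the rows and yields one value per column; `axis=1` aggregates across the columns and yields one value per row.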

 

◼ Installing pandas

pandas is a Python data analysis library that provides a wide range of features for working with data efficiently.

It can be installed with the pip command.

1. Open the Command Prompt

 - In Windows, search for cmd and run it to open the Command Prompt.

2. Install pandas

 - Enter the command below in the Command Prompt to install pandas.

pip install pandas

 

3. Verify the installation
 - After the installation finishes, enter the command below to confirm that pandas was installed properly.

python -c "import pandas; print(pandas.__version__)"

 

4. Pandas (manipulating data with DataFrames): loading a CSV file and processing data

import pandas as pd

# Read the CSV file
df = pd.read_csv("data.csv")

# Print the first 5 rows
print(df.head())
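
The snippet above needs a `data.csv` file on disk. To try `read_csv` without one, a sketch like the following reads inline text through `io.StringIO`; the CSV content and the name `df_demo` are hypothetical.

```python
import io
import pandas as pd

# Inline CSV text standing in for a file on disk (hypothetical data).
csv_text = "Name,Age,Salary\nAlice,25,50000\nBob,30,60000\nCharlie,35,70000\n"

df_demo = pd.read_csv(io.StringIO(csv_text))

print(df_demo.head())    # first rows of the table
print(df_demo.shape)     # (number of rows, number of columns)
print(df_demo.dtypes)    # column types inferred by pandas
```

Checking `shape` and `dtypes` right after loading is a quick way to confirm that the file was parsed the way you expected.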


Practice [ data processing and analysis exercises ]

◼ Reading a csv file and processing data

import pandas as pd

# Create a fictional dataset
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [25, 30, 35, 40, 45],
    'Salary': [50000, 60000, 70000, 80000, 90000],
    'Date of Joining': ['2020-01-15', '2019-05-12', '2021-03-20', '2018-11-03', '2022-08-09']
}

# Create the DataFrame
df = pd.DataFrame(data)

# Save it as a CSV file
df.to_csv('employee_data.csv', index=False)

 

 - ๋ฐ์ดํ„ฐ ์ •๋ฆฌ ๋ฐ ๋ณ€ํ™˜ [ ๊ฒฐ์ธก์น˜ ์ฒ˜๋ฆฌ, ๋ฌธ์ž์—ด ๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ, ๋‚ ์งœ ๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ ]

 - ๋ฐ์ดํ„ฐ ์ง‘๊ณ„ ๋ฐ ํ†ต๊ณ„ ์š”์•ฝ [ ๊ธฐ๋ณธ ํ†ต๊ณ„ ์š”์•ฝ, ์—ฐ๋„๋ณ„ ํ‰๊ท  ์—ฐ๋ด‰ ]

import pandas as pd

# Read the CSV file with pandas
df = pd.read_csv('employee_data.csv')

# Inspect the raw data
print(df)

# Separator line
print("---------------------------------------------------------------------")

# Handle missing values (direct assignment instead of inplace=True)
# If Salary has missing values, this fills them with the column mean.
df['Salary'] = df['Salary'].fillna(df['Salary'].mean())

# Process string data
# Capitalize the first letter of each value in the Name column (the remaining letters become lowercase).
df['Name'] = df['Name'].str.capitalize()

# Process date data
# Convert the Date of Joining column to datetime, then extract the year and month.
df['Date of Joining'] = pd.to_datetime(df['Date of Joining'])
df['Year of Joining'] = df['Date of Joining'].dt.year
df['Month of Joining'] = df['Date of Joining'].dt.month

# Basic summary statistics
# describe() gives a basic statistical summary of the data.
print(df.describe())

# Separator line
print("---------------------------------------------------------------------")

# Average salary by year
# Compute the average salary for each joining year.
salary_by_year = df.groupby('Year of Joining')['Salary'].mean()
print(salary_by_year)
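
The sample `employee_data.csv` contains no missing values, so the `fillna` step above changes nothing. The sketch below uses a small hypothetical table with one deliberately missing salary to show what the step does.

```python
import numpy as np
import pandas as pd

salaries = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Salary": [50000, np.nan, 70000, 90000],  # Bob's salary is deliberately missing
})

print("Missing before:", salaries["Salary"].isna().sum())

# mean() skips NaN by default, so the fill value is (50000 + 70000 + 90000) / 3.
fill_value = salaries["Salary"].mean()
salaries["Salary"] = salaries["Salary"].fillna(fill_value)

print("Fill value:", fill_value)
print("Missing after:", salaries["Salary"].isna().sum())
```

Filling with the mean keeps the column average unchanged but reduces its variance, so whether it is appropriate depends on the analysis.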

 

◼ Output (image not preserved)


โ—ผ ๋ฐ์ดํ„ฐ ์‹œ๊ฐํ™”

 - matplotlib์„ ์‚ฌ์šฉํ•˜์—ฌ ์—ฐ๋ น๋Œ€๋ณ„ ์—ฐ๋ด‰์„ ์‹œ๊ฐํ™”ํ•ฉ๋‹ˆ๋‹ค.

 - matplotlib ์„ค์น˜

pip install matplotlib

 

  • Here is an example that visualizes salary by age.
import matplotlib.pyplot as plt

# Bar chart of salary by age (one bar per employee in df)
plt.figure(figsize=(8, 6))
plt.bar(df['Age'], df['Salary'], color='skyblue')
plt.xlabel('Age')
plt.ylabel('Salary')
plt.title('Age vs Salary')
plt.show()
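
The chart above draws one bar per employee. To get an average per age group, the ages first need to be binned. A minimal sketch with `pd.cut` follows; the bin edges, the labels, and the names `df_age` and `salary_by_group` are assumptions made for illustration.

```python
import pandas as pd

df_age = pd.DataFrame({
    "Age": [25, 30, 35, 40, 45],
    "Salary": [50000, 60000, 70000, 80000, 90000],
})

# Bin edges and labels are assumptions: [20, 30), [30, 40), [40, 50).
df_age["Age group"] = pd.cut(
    df_age["Age"],
    bins=[20, 30, 40, 50],
    labels=["20s", "30s", "40s"],
    right=False,  # left-closed intervals, so age 30 falls in "30s"
)

# observed=True keeps only the groups that actually occur in the data.
salary_by_group = df_age.groupby("Age group", observed=True)["Salary"].mean()
print(salary_by_group)
```

The resulting Series can be passed to `plt.bar(salary_by_group.index.astype(str), salary_by_group.values)` to plot one bar per age group.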

 

 - Use seaborn to visualize the relationship between Salary and Age.

 - Install seaborn

pip install seaborn

 

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# Read the CSV file with pandas
df = pd.read_csv('employee_data.csv')

# Visualize the relationship between Salary and Age
sns.scatterplot(x='Age', y='Salary', data=df)
plt.title('Age vs Salary')
plt.show()


Project [ writing a data analysis report with a real dataset ]

 - Finally, try running an analysis project with data from Kaggle or a public data portal.

  • Dataset: 소상공인시장진흥공단_상가(상권)정보 (Small Enterprise and Market Service, store (commercial district) information)
  • Goal 1: Quantify and visualize the distribution of stores in Busan by industry, along with the top 10 industries.
  • Goal 2: Quantify and visualize the density of cafes operating in Busan by administrative dong (행정동).
  • Tools: Python, NumPy, Pandas, Matplotlib
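
The project script reads the CSV with `encoding='UTF-8'`. Korean CSV files are sometimes distributed in CP949 instead, in which case that call fails with a `UnicodeDecodeError`. As a hedged sketch, the hypothetical helper `read_csv_with_fallback` below tries a list of encodings in order; the demo file and its contents are made up for illustration.

```python
import os
import tempfile
import pandas as pd

def read_csv_with_fallback(path, encodings=("utf-8", "cp949")):
    """Try each encoding in turn and return (DataFrame, encoding used)."""
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc, low_memory=False), enc
        except UnicodeDecodeError as err:
            last_error = err  # remember the failure and try the next encoding
    raise last_error

# Demo: write a tiny CP949-encoded file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", encoding="cp949") as f:
        f.write("상호명,행정동명\n가나다,우동\n")
    demo_df, used_encoding = read_csv_with_fallback(demo_path)

print(used_encoding)
print(demo_df)
```

Trying UTF-8 first is a deliberate choice: CP949-encoded Korean text is almost never valid UTF-8, so a wrong first guess fails loudly instead of silently producing garbled text.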
pip install chardet

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm
import seaborn as sns

# Set the font path (e.g. the "Malgun Gothic" font on Windows)
font_path = "C:/Windows/Fonts/malgun.ttf"
font_prop = fm.FontProperties(fname=font_path)

# Apply the font setting
plt.rc('font', family=font_prop.get_name())

# Load the file (with low_memory=False)
file_path = r"소상공인시장진흥공단_상가(상권)정보_부산_202409.csv"
df = pd.read_csv(file_path, encoding='UTF-8', low_memory=False)

# Drop columns not needed for the analysis
df = df.drop(['지점명', '동정보', '호정보'], axis=1)

# Fill one of the columns that has many missing values (e.g. fill 층정보 with "알 수 없음", meaning "unknown")
df['층정보'] = df['층정보'].fillna('알 수 없음')

# Basic missing-value handling
df = df.dropna(subset=['상호명', '위도', '경도'])  # drop rows with no store name or location

# Count rows per industry
industry_count = df['상권업종대분류명'].value_counts()

# Visualize the top 10 industries
plt.figure(figsize=(10, 6))
sns.barplot(
    x=industry_count.values[:10],
    y=industry_count.index[:10],
    palette='viridis',
    hue=industry_count.index[:10],  # assign the y values to `hue`
    legend=False  # add if needed
)
plt.title('Distribution of Stores in Busan by Industry (Top 10)')
plt.xlabel('Number of stores')
plt.ylabel('Industry')
plt.show()
plt.show()
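
Goal 1 asks for the distribution to be quantified, and raw counts are easier to compare when shown alongside each industry's share of the total. A minimal sketch with `value_counts(normalize=True)` follows; the toy `categories` series stands in for the `상권업종대분류명` column and its values are hypothetical.

```python
import pandas as pd

# Hypothetical category values standing in for the 상권업종대분류명 column.
categories = pd.Series(
    ["음식", "음식", "음식", "소매", "소매", "교육"],
    name="상권업종대분류명",
)

counts = categories.value_counts()                # absolute counts, sorted descending
shares = categories.value_counts(normalize=True)  # fractions of the total, same order

summary = pd.DataFrame({"count": counts, "share_pct": (shares * 100).round(1)})
print(summary.head(10))  # top 10 rows; this toy series only has 3 categories
```

On the real dataset, replacing `categories` with `df['상권업종대분류명']` should produce the same kind of table for the top 10 industries.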

pip install folium
pip install geopandas

import pandas as pd
import folium
import matplotlib.font_manager as fm
from branca.colormap import linear
from folium.features import DivIcon

# Set the font path (e.g. the "Malgun Gothic" font on Windows)
font_path = "C:/Windows/Fonts/malgun.ttf"
font_prop = fm.FontProperties(fname=font_path)

# Load the file
file_path = r"소상공인시장진흥공단_상가(상권)정보_부산_202409.csv"
df = pd.read_csv(file_path, encoding='UTF-8', low_memory=False)

# Filter the data by standard industry classification code I56221 (taken here as the cafe category from Goal 2)
df_filtered = df[df['표준산업분류코드'] == 'I56221']

# Missing-value handling: keep only businesses that have latitude, longitude, and administrative dong data
df_filtered = df_filtered.dropna(subset=['위도', '경도', '행정동명'])

# Count businesses per administrative dong
dong_count = df_filtered.groupby('행정동명').size().reset_index(name='업소수')

# Average latitude and longitude per administrative dong
dong_coords = df_filtered.groupby('행정동명')[['위도', '경도']].mean().reset_index()

# Merge the business counts with the coordinates
dong_data = pd.merge(dong_count, dong_coords, on='행정동명')

# Create the base map of Busan (latitude, longitude)
map_busan = folium.Map(location=[35.1796, 129.0756], zoom_start=12)

# Map business counts to colors (RdYlBu: red for low counts through blue for high counts)
colormap = linear.RdYlBu_09.scale(dong_data['업소수'].min(), dong_data['업소수'].max())

# Apply the color and show the business count for each administrative dong
for _, row in dong_data.iterrows():
    dong_name = row['행정동명']
    location = [row['위도'], row['경도']]
    업소수 = row['업소수']

    # Visualize business density with a circle
    folium.CircleMarker(
        location=location,
        radius=20,  # circle radius
        color=colormap(row['업소수']),
        fill=True,
        fill_color=colormap(row['업소수']),
        fill_opacity=0.6,
        popup=f"Dong: {dong_name}<br>Businesses: {업소수}",  # popups render HTML, so <br> gives the line break
        tooltip=dong_name  # hovering shows the administrative dong name
    ).add_to(map_busan)

    # Show the business count at the center of the circle
    folium.Marker(
        location=location,
        icon=DivIcon(
            icon_size=(30, 30),  # icon size
            icon_anchor=(15, 15),
            html=f'<div style="font-size: 12px; color: black; text-align: center;">{업소수}</div>'
        )
    ).add_to(map_busan)

# Add the color bar
colormap.add_to(map_busan)

# Save the map to a file
map_busan.save("busan_cake_shop_density_map_dong_with_boundary_and_popup.html")
print("Saved the density map by administrative dong (with business counts)!")
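
The map above draws every circle with `radius=20`, so density is conveyed only by color. One possible variation is to scale the radius with the business count as well. The helper `scaled_radius` and its default pixel sizes are hypothetical, not part of folium.

```python
import math

def scaled_radius(count, max_count, r_min=6.0, r_max=30.0):
    """Map a business count to a circle radius between r_min and r_max.

    A square-root scale is used so that a few very dense dongs
    do not dwarf all the others.
    """
    if max_count <= 0:
        return r_min
    fraction = math.sqrt(max(count, 0) / max_count)
    return r_min + (r_max - r_min) * min(fraction, 1.0)

# Hypothetical counts for three dongs, with 100 as the largest count.
for n in (1, 25, 100):
    print(n, round(scaled_radius(n, max_count=100), 1))
```

Inside the loop, the fixed value could then be replaced with `radius=scaled_radius(row['업소수'], dong_data['업소수'].max())`.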

 

※ To set a goal and reach the result, I relied mainly on AI among the various approaches available. Compared with developing by hand as I did in the past, the work is now far simpler and faster, which I found both fascinating and sobering. ChatGPT can be used to carry out Python development. However, the result is chosen by a human, not by the AI. I will keep in mind that making wise choices requires learning across many fields.


In this exercise, we learned how to read a CSV file and process, transform, and aggregate its data. We also learned how to visualize data with Matplotlib and Seaborn and draw insights from it. Work through the exercises to try a range of data analysis techniques, from the basics to more advanced ones.


 
