pandas - 7

2025. 4. 6. 18:04ㆍDev Study / Generative AI-Based Developer Course

Unique

When a column holds many values, this is how to check which distinct values it contains, with duplicates removed.

import pandas as pd

# Sample data: some names and jobs repeat on purpose.
job_list = [{'name': 'John', 'job': "teacher"},
            {'name': 'Nate', 'job': "teacher"},
            {'name': 'Fred', 'job': "teacher"},
            {'name': 'Abraham', 'job': "student"},
            {'name': 'Brian', 'job': "student"},
            {'name': 'Janny', 'job': "developer"},
            {'name': 'Nate', 'job': "teacher"},
            {'name': 'Obrian', 'job': "dentist"},
            {'name': 'Yuna', 'job': "teacher"},
            {'name': 'Rob', 'job': "lawyer"},
            {'name': 'Brian', 'job': "student"},
            {'name': 'Matt', 'job': "student"},
            {'name': 'Wendy', 'job': "banker"},
            {'name': 'Edward', 'job': "teacher"},
            {'name': 'Ian', 'job': "teacher"},
            {'name': 'Chris', 'job': "banker"},
            {'name': 'Philip', 'job': "lawyer"},
            {'name': 'Janny', 'job': "basketball player"},
            {'name': 'Gwen', 'job': "teacher"},
            {'name': 'Jessy', 'job': "student"}
            ]
df = pd.DataFrame(job_list, columns = ['name', 'job'])

Calling unique() on a column (a Series) returns every value in that column with duplicates removed, in order of first appearance.

print(df.job.unique())

['teacher' 'student' 'developer' 'dentist' 'lawyer' 'banker'
 'basketball player']
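As a small companion to unique(), the sketch below also calls nunique(), which returns only the number of distinct values. The `jobs` Series is a hypothetical miniature of the job column, not the data above.

```python
import pandas as pd

# Hypothetical miniature of the job column from the example above.
jobs = pd.Series(["teacher", "teacher", "student", "developer", "student"], name="job")

unique_jobs = jobs.unique()   # distinct values, in order of first appearance
n_unique = jobs.nunique()     # number of distinct values (NaN is excluded by default)

print(list(unique_jobs))  # ['teacher', 'student', 'developer']
print(n_unique)           # 3
```

Bracket access such as df['job'] behaves the same as df.job and also works when a column name contains spaces or clashes with a DataFrame attribute.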

value_counts() shows how many rows belong to each unique value, sorted from most to least frequent.

df.job.value_counts()

teacher              8
student              5
banker               2
lawyer               2
basketball player    1
dentist              1
developer            1
Name: job, dtype: int64
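If proportions are more useful than raw counts, value_counts() accepts normalize=True. The following is a minimal sketch on a hypothetical `jobs` Series, not the data above.

```python
import pandas as pd

# Hypothetical sample: 3 teachers, 2 students, 1 banker.
jobs = pd.Series(
    ["teacher", "teacher", "teacher", "student", "student", "banker"], name="job"
)

counts = jobs.value_counts()                # absolute counts, largest first
shares = jobs.value_counts(normalize=True)  # relative frequencies that sum to 1.0

print(counts["teacher"])  # 3
print(shares["teacher"])  # 0.5
```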

Combining two DataFrames

l1 = [{'name': 'John', 'job': "teacher"},
      {'name': 'Nate', 'job': "student"},
      {'name': 'Fred', 'job': "developer"}]

l2 = [{'name': 'Ed', 'job': "dentist"},
      {'name': 'Jack', 'job': "farmer"},
      {'name': 'Ted', 'job': "designer"}]

df1 = pd.DataFrame(l1, columns = ['name', 'job'])
df2 = pd.DataFrame(l2, columns = ['name', 'job'])

pd.concat

pd.concat attaches the second DataFrame to the first as new rows. Passing ignore_index=True renumbers the rows of the result from 0.

frames = [df1, df2]
result = pd.concat(frames, ignore_index=True)

result

	name	job
0	John	teacher
1	Nate	student
2	Fred	developer
3	Ed	dentist
4	Jack	farmer
5	Ted	designer
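To see what ignore_index=True changes, the sketch below concatenates two small, hypothetical DataFrames with and without it. Without the flag, each frame keeps its own row labels, so the labels repeat.

```python
import pandas as pd

# Hypothetical two-row frames, shaped like df1 and df2 above.
df1 = pd.DataFrame({"name": ["John", "Nate"], "job": ["teacher", "student"]})
df2 = pd.DataFrame({"name": ["Ed", "Jack"], "job": ["dentist", "farmer"]})

kept = pd.concat([df1, df2])                           # original labels are kept
renumbered = pd.concat([df1, df2], ignore_index=True)  # labels are rebuilt from 0

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```

Repeated labels are legal in pandas, but kept.loc[0] would then return two rows, which is usually not what you want after stacking.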

df.append

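df.append attaches the second DataFrame to the first as new rows, like the row-wise pd.concat above. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so it is only available on older versions.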

l1 = [{'name': 'John', 'job': "teacher"},
      {'name': 'Nate', 'job': "student"},
      {'name': 'Fred', 'job': "developer"}]

l2 = [{'name': 'Ed', 'job': "dentist"},
      {'name': 'Jack', 'job': "farmer"},
      {'name': 'Ted', 'job': "designer"}]

df1 = pd.DataFrame(l1, columns = ['name', 'job'])
df2 = pd.DataFrame(l2, columns = ['name', 'job'])
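# On pandas < 2.0 this was written as: result = df1.append(df2, ignore_index=True)
# DataFrame.append was removed in pandas 2.0, so the pd.concat equivalent is used here.
result = pd.concat([df1, df2], ignore_index=True)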

result

	name	job
0	John	teacher
1	Nate	student
2	Fred	developer
3	Ed	dentist
4	Jack	farmer
5	Ted	designer
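A common use of append was adding a single row. On pandas 2.0 or later, one way to do the same thing is to wrap the row in a one-row DataFrame and concatenate it. This is a sketch only, and `new_row` is a hypothetical example record.

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["John", "Nate"], "job": ["teacher", "student"]})
new_row = {"name": "Ed", "job": "dentist"}  # hypothetical row to add

# Wrapping the dict in a list makes pandas build a one-row DataFrame.
result = pd.concat([df1, pd.DataFrame([new_row])], ignore_index=True)

print(result.shape)           # (3, 2)
print(result.loc[2, "name"])  # Ed
```

When adding many rows, collecting them in a list and calling pd.concat once is generally cheaper than concatenating inside a loop, because every call copies the data.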

pd.concat

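With axis=1, pd.concat attaches the second DataFrame to the first as new columns. Note that ignore_index=True applies to the axis being concatenated, so here the column names are replaced by 0, 1, 2, 3, as the output shows.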

l1 = [{'name': 'John', 'job': "teacher"},
      {'name': 'Nate', 'job': "student"},
      {'name': 'Jack', 'job': "developer"}]

l2 = [{'age': 25, 'country': "U.S"},
      {'age': 30, 'country': "U.K"},
      {'age': 45, 'country': "Korea"}]

df1 = pd.DataFrame(l1, columns = ['name', 'job'])
df2 = pd.DataFrame(l2, columns = ['age', 'country'])
result = pd.concat([df1, df2], axis=1, ignore_index=True)

result

	0	1	2	3
0	John	teacher	25	U.S
1	Nate	student	30	U.K
2	Jack	developer	45	Korea
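If the original column names should survive, ignore_index can simply be left out when concatenating along axis=1. A minimal sketch with hypothetical two-row frames:

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["John", "Nate"], "job": ["teacher", "student"]})
df2 = pd.DataFrame({"age": [25, 30], "country": ["U.S", "U.K"]})

# Without ignore_index, the column labels of both frames are kept.
result = pd.concat([df1, df2], axis=1)

print(list(result.columns))  # ['name', 'job', 'age', 'country']
```

Rows are matched by index label, not by position, so both frames should share the same index (here the default 0, 1) for the rows to line up.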

Building a DataFrame from two lists

label = [1,2,3,4,5]
prediction = [1,2,2,5,5]

comparison = pd.DataFrame(
    {'label': label,
     'prediction': prediction
    })

comparison

	label	prediction
0	1	1
1	2	2
2	3	2
3	4	5
4	5	5
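Once the two lists sit side by side in one DataFrame, they can be compared column against column. The sketch below assumes the goal is to check how often prediction agrees with label; the `match` column and the `accuracy` variable are illustrative additions, not part of the original example.

```python
import pandas as pd

label = [1, 2, 3, 4, 5]
prediction = [1, 2, 2, 5, 5]
comparison = pd.DataFrame({"label": label, "prediction": prediction})

# Element-wise comparison yields a boolean column: True where the values agree.
comparison["match"] = comparison["label"] == comparison["prediction"]

# The mean of a boolean column is the fraction of True values.
accuracy = comparison["match"].mean()
print(accuracy)  # 0.6
```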
