Positive thinking, moderation in eating, regular exercise

[python] Sort the generated Excel file by Frequency, then by word when the Frequency is the same

홍반장水_ 2024. 5. 2. 18:59

import pandas as pd
from collections import Counter
import re

def read_text_file(file_path):
    """Read a text file and return its contents."""
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()

def count_word_frequencies(text):
    """Count word frequencies in the given text."""
    words = re.findall(r'\b\w+\b', text.lower())
    return Counter(words)

def save_frequencies_to_excel(frequencies, output_file):
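    """Save the word frequencies to an Excel file."""
    # Convert to a pandas DataFrame
    df = pd.DataFrame(list(frequencies.items()), columns=['Word', 'Frequency'])
    # Sort by frequency descending, then by word in ascending alphabetical order
    df = df.sort_values(by=['Frequency', 'Word'], ascending=[False, True])
    # Write the data to an Excel file (writing .xlsx requires an Excel engine such as openpyxl)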
    df.to_excel(output_file, index=False)

# File paths
file_path = 'example.txt'
output_excel = 'word_frequencies.xlsx'

# Read the file
text = read_text_file(file_path)

# Analyze frequencies
frequencies = count_word_frequencies(text)

# Save to Excel
save_frequencies_to_excel(frequencies, output_excel)

print("The word frequencies have been sorted and saved to an Excel file.")

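DataFrame conversion and sorting: The frequency data is converted to a DataFrame with pandas.DataFrame, then sorted with the sort_values method: first by the Frequency column in descending order, and, for entries with the same frequency, by the Word column in ascending order. The values in the ascending=[False, True] parameter apply to the Frequency and Word columns respectively, in the same order as the by list.
Saving the Excel file: The sorted data is saved to a file in .xlsx format.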
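
To check the ordering rule on a small input without creating an Excel file, the same "Frequency descending, Word ascending" order can be reproduced with the standard library alone. This is only a sketch for illustration, not the approach used in the post: the helper name sorted_frequencies and the sample text are hypothetical, and it uses sorted with a tuple key instead of sort_values.

```python
from collections import Counter
import re

def sorted_frequencies(text):
    """Return (word, count) pairs: count descending, then word ascending."""
    # Same tokenization as count_word_frequencies in the post
    words = re.findall(r'\b\w+\b', text.lower())
    counts = Counter(words)
    # Negating the count lets a single ascending sort behave like
    # "Frequency descending, Word ascending".
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

sample = "pear apple pear fig apple kiwi"  # hypothetical sample text
rows = sorted_frequencies(sample)
for word, count in rows:
    print(word, count)
# apple 2
# pear 2
# fig 1
# kiwi 1
```

Here "apple" and "pear" both appear twice, so the tie is broken alphabetically, as is the tie between "fig" and "kiwi". Passing these rows to pd.DataFrame(rows, columns=['Word', 'Frequency']) should give the same row order that sort_values produces in the post.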

Attribution, NonCommercial, NoDerivatives
