반응형
반응형

[python] 한글 자음 확인해서 치환하기 

 

""" 한글 자음인지 확인 
"""
# Define the list of Korean initial consonants (초성)
INITIAL_CONSONANTS = [
    'ㄱ', 'ㄲ', 'ㄴ', 'ㄷ', 'ㄸ', 'ㄹ', 
    'ㅁ', 'ㅂ', 'ㅃ', 'ㅅ', 'ㅆ', 'ㅇ', 
    'ㅈ', 'ㅉ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ'
]

def is_korean_char(ch):
    """Check if the character is a Korean syllable."""
    return 0xAC00 <= ord(ch) <= 0xD7A3

def get_initial_consonant(ch):
    """Extract the initial consonant from a Korean syllable."""
    if not is_korean_char(ch):
        return None  # or you could return an empty string or raise an exception
    
    # Calculate the index for the initial consonant
    initial_index = (ord(ch) - 0xAC00) // (21 * 28)
    return INITIAL_CONSONANTS[initial_index]

# Example usage
sentence = "안녕하세요"
initials = [get_initial_consonant(ch) for ch in sentence if is_korean_char(ch)]
print(''.join(initials))  # Output: ㅇㄴㅎㅅㅇ
반응형
반응형

https://pypi.org/project/chardet/

 

Project description

Chardet: The Universal Character Encoding Detector

 
 
 
Detects
  • ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
  • Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
  • EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)
  • EUC-KR, ISO-2022-KR, Johab (Korean)
  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
  • ISO-8859-5, windows-1251 (Bulgarian)
  • ISO-8859-1, windows-1252, MacRoman (Western European languages)
  • ISO-8859-7, windows-1253 (Greek)
  • ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
  • TIS-620 (Thai)

Note

Our ISO-8859-2 and windows-1250 (Hungarian) probers have been temporarily disabled until we can retrain the models.

Requires Python 3.7+.

반응형
반응형

[python] 엑셀 읽고 쓰기 

 

https://pypi.org/project/openpyxl/

 

openpyxl

A Python library to read/write Excel 2010 xlsx/xlsm files

pypi.org

 

pip install openpyxl

 

반응형
반응형

[python] 한글 자음 모음 분리하기 

 

 

pip install jamo

https://pypi.org/project/jamo/

 

jamo

A Hangul syllable and jamo analyzer.

pypi.org

https://github.com/jdongian/python-jamo  

 

------------

 

pip install jamotools

https://pypi.org/project/jamotools/

A library for Korean Jamo split and vectorize.  한국어 Jamo를 분할하고 벡터화하는 라이브러리입니다.

 

 

>>> import jamotools
>>> print(jamotools.split_syllable_char(u"안"))
('ㅇ', 'ㅏ', 'ㄴ')

>>> print(jamotools.split_syllables(u"안녕하세요"))
ㅇㅏㄴㄴㅕㅇㅎㅏㅅㅔㅇㅛ

>>> sentence = u"앞 집 팥죽은 붉은 팥 풋팥죽이고, 뒷집 콩죽은 햇콩 단콩 콩죽.우리 집
    깨죽은 검은 깨 깨죽인데 사람들은 햇콩 단콩 콩죽 깨죽 죽먹기를 싫어하더라."
>>> s = jamotools.split_syllables(sentence)
>>> print(s)
ㅇㅏㅍ ㅈㅣㅂ ㅍㅏㅌㅈㅜㄱㅇㅡㄴ ㅂㅜㄺㅇㅡㄴ ㅍㅏㅌ ㅍㅜㅅㅍㅏㅌㅈㅜㄱㅇㅣㄱㅗ,
ㄷㅟㅅㅈㅣㅂ ㅋㅗㅇㅈㅜㄱㅇㅡㄴ ㅎㅐㅅㅋㅗㅇ ㄷㅏㄴㅋㅗㅇ ㅋㅗㅇㅈㅜㄱ.ㅇㅜㄹㅣ
ㅈㅣㅂ ㄲㅐㅈㅜㄱㅇㅡㄴ ㄱㅓㅁㅇㅡㄴ ㄲㅐ ㄲㅐㅈㅜㄱㅇㅣㄴㄷㅔ ㅅㅏㄹㅏㅁㄷㅡㄹㅇㅡㄴ
ㅎㅐㅅㅋㅗㅇ ㄷㅏㄴㅋㅗㅇ ㅋㅗㅇㅈㅜㄱ ㄲㅐㅈㅜㄱ ㅈㅜㄱㅁㅓㄱㄱㅣㄹㅡㄹ
ㅅㅣㅀㅇㅓㅎㅏㄷㅓㄹㅏ.

>>> sentence2 = jamotools.join_jamos(s)
>>> print(sentence2)
앞 집 팥죽은 붉은 팥 풋팥죽이고, 뒷집 콩죽은 햇콩 단콩 콩죽.우리 집 깨죽은 검은 깨
깨죽인데 사람들은 햇콩 단콩 콩죽 깨죽 죽먹기를 싫어하더라.

>>> print(sentence == sentence2)
True
반응형
반응형

[python] gTTS 한글 speak

 

# gTTS : Google Text to Speech API : Google에서 제공하는 TTS 서비스. gTTS라는 모듈을 인스톨해야 함
"""
    'en'으로 지정했을 때 한글이 text에 포함되어 있으면 이를 무시하지만, 'ko'일때 text내의 영문은 무시되지 않고 음성합성을 수행한다(아주 이상함)
    영어는 여자 성우, 한글은 남자 성우이다(변경 불가능)
"""

from gtts import gTTS
import pygame
import time

text ="안녕하세요, 여러분. 파이썬으로 노는 것은 재미있습니다!!!"

tts = gTTS(text=text, lang='ko')

tts.save("helloKO.mp3")



# Initialize Pygame mixer
pygame.mixer.init()

# Load the audio file
pygame.mixer.music.load("helloKO.mp3")

# Play the audio file
pygame.mixer.music.play()

# Allow time for the audio to play
time.sleep(10)

# Stop the playback
pygame.mixer.music.stop()

 

반응형
반응형

pyttsx3

*** https://medium.com/analytics-vidhya/easy-way-to-build-an-audiobook-using-python-20fc7d6fb1af

 

Easy Way To Build An Audiobook Using Python

An audiobook is nothing but a book that was recorded in an audio format. It can also be stated as a book that is being read aloud…

medium.com

 

pip install gTTS  https://pypi.org/project/gTTS/

 

https://pypi.org/project/pyttsx3/

 

pyttsx3

Text to Speech (TTS) library for Python 2 and 3. Works without internet connection or delay. Supports multiple TTS engines, including Sapi5, nsss, and espeak.

pypi.org

Text to Speech (TTS) library for Python 2 and 3. Works without internet connection or delay. Supports multiple TTS engines, including Sapi5, nsss, and espeak.

 

pip install pyttsx3

Project description

pyttsx3 is a text-to-speech conversion library in Python. Unlike alternative libraries, it works offline, and is compatible with both Python 2 and 3.

Installation

pip install pyttsx3

If you recieve errors such as No module named win32com.client, No module named win32, or No module named win32api, you will need to additionally install pypiwin32.

Usage :

import pyttsx3
engine = pyttsx3.init()
engine.say("I will speak this text")
engine.runAndWait()

Changing Voice , Rate and Volume :

import pyttsx3
engine = pyttsx3.init() # object creation

""" RATE"""
rate = engine.getProperty('rate')   # getting details of current speaking rate
print (rate)                        #printing current voice rate
engine.setProperty('rate', 125)     # setting up new voice rate


"""VOLUME"""
volume = engine.getProperty('volume')   #getting to know current volume level (min=0 and max=1)
print (volume)                          #printing current volume level
engine.setProperty('volume',1.0)    # setting up volume level  between 0 and 1

"""VOICE"""
voices = engine.getProperty('voices')       #getting details of current voice
#engine.setProperty('voice', voices[0].id)  #changing index, changes voices. o for male
engine.setProperty('voice', voices[1].id)   #changing index, changes voices. 1 for female

engine.say("Hello World!")
engine.say('My current speaking rate is ' + str(rate))
engine.runAndWait()
engine.stop()

"""Saving Voice to a file"""
# On linux make sure that 'espeak' and 'ffmpeg' are installed
engine.save_to_file('Hello World', 'test.mp3')
engine.runAndWait()

Full documentation of the Library

https://pyttsx3.readthedocs.io/en/latest/

Included TTS engines:

  • sapi5
  • nsss
  • espeak

Feel free to wrap another text-to-speech engine for use with pyttsx3.

반응형

+ Recent posts