
Naver sentiment movie corpus

www.lucypark.kr/docs/2015-pyconkr/#1

 

Korean Meets NLTK and Gensim - PyCon Korea 2015

 

www.lucypark.kr

github.com/e9t/nsmc

 

e9t/nsmc

Naver sentiment movie corpus. Contribute to e9t/nsmc development by creating an account on GitHub.

github.com

 


POS Tagging with the Stanford POS Tagger

from nltk.tag import StanfordPOSTagger
from nltk.tokenize import word_tokenize

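# Replace <extracted directory> with the folder where the tagger archive was unzipped.
STANFORD_POS_MODEL_PATH = "<extracted directory>/stanford-postagger-full-2018-02-27/models/english-bidirectional-distsim.tagger"
STANFORD_POS_JAR_PATH = "<extracted directory>/stanford-postagger-full-2018-02-27/stanford-postagger-3.9.1.jar"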

pos_tagger = StanfordPOSTagger(STANFORD_POS_MODEL_PATH, STANFORD_POS_JAR_PATH)

text = """Facebook CEO Mark Zuckerberg acknowledged a range of mistakes on Wednesday, 
including allowing most of its two billion users to have their public profile data scraped by outsiders. 
However, even as he took responsibility, he maintained he was the best person to fix the problems he created."""

tokens = word_tokenize(text)
print(tokens)
print()
print(pos_tagger.tag(tokens))

['Facebook', 'CEO', 'Mark', 'Zuckerberg', 'acknowledged', 'a', 'range', 'of', 'mistakes', 'on', 'Wednesday', ',', 'including', 'allowing', 'most', 'of', 'its', 'two', 'billion', 'users', 'to', 'have', 'their', 'public', 'profile', 'data', 'scraped', 'by', 'outsiders', '.', 'However', ',', 'even', 'as', 'he', 'took', 'responsibility', ',', 'he', 'maintained', 'he', 'was', 'the', 'best', 'person', 'to', 'fix', 'the', 'problems', 'he', 'created', '.']

[('Facebook', 'NNP'), ('CEO', 'NNP'), ('Mark', 'NNP'), ('Zuckerberg', 'NNP'), ('acknowledged', 'VBD'), ('a', 'DT'), ('range', 'NN'), ('of', 'IN'), ('mistakes', 'NNS'), ('on', 'IN'), ('Wednesday', 'NNP'), (',', ','), ('including', 'VBG'), ('allowing', 'VBG'), ('most', 'JJS'), ('of', 'IN'), ('its', 'PRP$'), ('two', 'CD'), ('billion', 'CD'), ('users', 'NNS'), ('to', 'TO'), ('have', 'VB'), ('their', 'PRP$'), ('public', 'JJ'), ('profile', 'NN'), ('data', 'NNS'), ('scraped', 'VBN'), ('by', 'IN'), ('outsiders', 'NNS'), ('.', '.'), ('However', 'RB'), (',', ','), ('even', 'RB'), ('as', 'IN'), ('he', 'PRP'), ('took', 'VBD'), ('responsibility', 'NN'), (',', ','), ('he', 'PRP'), ('maintained', 'VBD'), ('he', 'PRP'), ('was', 'VBD'), ('the', 'DT'), ('best', 'JJS'), ('person', 'NN'), ('to', 'TO'), ('fix', 'VB'), ('the', 'DT'), ('problems', 'NNS'), ('he', 'PRP'), ('created', 'VBD'), ('.', '.')]

# Keep only the tokens whose tag starts with "N" (nouns) or "V" (verbs).
noun_and_verbs = []
for word, tag in pos_tagger.tag(tokens):
    if tag.startswith(("N", "V")):
        noun_and_verbs.append(word)
print(', '.join(noun_and_verbs))

Facebook, CEO, Mark, Zuckerberg, acknowledged, range, mistakes, Wednesday, including, allowing, users, have, profile, data, scraped, outsiders, took, responsibility, maintained, was, person, fix, problems, created
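
The same filter can be tried without the Java tagger by running it on pairs that are already tagged. The following is a minimal sketch: the helper name filter_by_tag_prefix and the short sample list are assumptions made for illustration, with the pairs copied from the tagger output above.

```python
def filter_by_tag_prefix(tagged_tokens, prefixes=("N", "V")):
    """Return the words whose POS tag starts with any of the given prefixes."""
    # str.startswith accepts a tuple, so one call covers every prefix.
    return [word for word, tag in tagged_tokens if tag.startswith(tuple(prefixes))]


# A few (word, tag) pairs copied from the tagger output above.
sample = [
    ("Facebook", "NNP"), ("acknowledged", "VBD"), ("a", "DT"),
    ("range", "NN"), ("of", "IN"), ("mistakes", "NNS"),
]

nouns_and_verbs = filter_by_tag_prefix(sample)
nouns_only = filter_by_tag_prefix(sample, prefixes=("N",))
print(", ".join(nouns_and_verbs))
print(", ".join(nouns_only))
```

Passing a different prefixes tuple, for example ("J",) for adjectives, reuses the same function for other tag families.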

novdov.github.io/nlp/2018/04/05/NLP-POS-Tagging-%ED%92%88%EC%82%AC-%ED%83%9C%EA%B9%85/

 

POS Tagging with the Stanford POS Tagger

A brief look at how to do POS tagging with the Stanford POS Tagger.

novdov.github.io

POS tag abbreviations

www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

 

Penn Treebank P.O.S. Tags

31. VBP Verb, non-3rd person singular present

www.ling.upenn.edu

Number  Tag  Description

1. CC Coordinating conjunction
2. CD Cardinal number
3. DT Determiner
4. EX Existential there
5. FW Foreign word
6. IN Preposition or subordinating conjunction
7. JJ Adjective
8. JJR Adjective, comparative
9. JJS Adjective, superlative
10. LS List item marker
11. MD Modal
12. NN Noun, singular or mass
13. NNS Noun, plural
14. NNP Proper noun, singular
15. NNPS Proper noun, plural
16. PDT Predeterminer
17. POS Possessive ending
18. PRP Personal pronoun
19. PRP$ Possessive pronoun
20. RB Adverb
21. RBR Adverb, comparative
22. RBS Adverb, superlative
23. RP Particle
24. SYM Symbol
25. TO to
26. UH Interjection
27. VB Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund or present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd person singular present
32. VBZ Verb, 3rd person singular present
33. WDT Wh-determiner
34. WP Wh-pronoun
35. WP$ Possessive wh-pronoun
36. WRB Wh-adverb
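
To read tagger output more easily, the table can be kept as a dictionary and looked up per token. This is only a sketch: PENN_TAGS holds a small subset of the table above, and the names PENN_TAGS and describe are hypothetical.

```python
# A small subset of the Penn Treebank table above, keyed by tag.
PENN_TAGS = {
    "DT": "Determiner",
    "IN": "Preposition or subordinating conjunction",
    "NN": "Noun, singular or mass",
    "NNS": "Noun, plural",
    "NNP": "Proper noun, singular",
    "VBD": "Verb, past tense",
}


def describe(tagged_tokens):
    """Attach a description to each (word, tag) pair; unknown tags get a placeholder."""
    return [
        (word, tag, PENN_TAGS.get(tag, "(not in this subset)"))
        for word, tag in tagged_tokens
    ]


described = describe([("Zuckerberg", "NNP"), ("acknowledged", "VBD"), (",", ",")])
for word, tag, description in described:
    print(f"{word:15}{tag:5}{description}")
```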

Python - setuptools, setup.py

python.flowdas.com/install/index.html

 

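Installing Python Modules (Legacy version) - Python documentation, annotated edition

Introduction: In Python 2.0, the distutils API was first added to the standard library. This provided Linux distro maintainers with a standard way of converting Python projects into Linux distro packages, and system...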

python.flowdas.com

python setup.py install


Managing module dependencies: install_requires
Installing private modules: dependency_links
Installing console scripts: entry_points
Development-mode deployment: setup.py develop

 

setup.py

from setuptools import setup, find_packages

setup_requires = [
    ]

install_requires = [
    'django==1.6b4',
    'markdown==2.3.1',
    'pyyaml==3.10',
    'pillow==2.1.0',
    'lxml==3.2.3',
    'beautifulsoup4==4.3.1',
    ]

dependency_links = [
    'git+https://github.com/django/django.git@stable/1.6.x#egg=Django-1.6b4',
    ]

setup(
    name='Flowdas-Books',
    version='0.1',
    description='Flowdas Books',
    author='Flowdas',
    author_email='spammustdie@flowdas.com',
    packages=find_packages(),
    install_requires=install_requires,
    setup_requires=setup_requires,
    dependency_links=dependency_links,
    scripts=['manage.py'],
    entry_points={
        'console_scripts': [
            'publish = flowdas.books.script:main',
            'scan = flowdas.books.script:main',
            'update = flowdas.books.script:main',
            ],
        },
    )
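
Each console_scripts entry above has the form "command = module:function": installing the package creates a command with that name which calls the named function. The sketch below only splits such a string into its parts to make the format visible. The helper parse_console_script is hypothetical and much simpler than what setuptools does internally.

```python
def parse_console_script(spec):
    """Split 'command = package.module:function' into (command, module, function)."""
    command, _, target = spec.partition("=")
    module, _, function = target.strip().partition(":")
    return command.strip(), module.strip(), function.strip()


# Entry copied from the setup.py above.
parsed = parse_console_script("publish = flowdas.books.script:main")
print(parsed)
```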

www.flowdas.com/blog/%ED%8C%8C%EC%9D%B4%EC%8D%AC-%ED%94%84%EB%A1%9C%EC%A0%9D%ED%8A%B8-%EC%8B%9C%EC%9E%91%ED%95%98%EA%B8%B0-setuptools/

 

Getting Started with a Python Project - Setuptools - flowdas

 

www.flowdas.com

 


Remove Elements From A Counter

 

Removing elements from a Counter

Question:

How do you remove an element from a counter?

Answer:

Setting a count to zero does not remove an element from a counter. Use del to remove it entirely.

Source: (example.py)

from collections import Counter
 
c = Counter(x=10, y=7, z=3)
print(c)
 
c['z'] = 0
print(c)
 
del c['z']
print(c)

Output:

$ python example.py
Counter({'x': 10, 'y': 7, 'z': 3})
Counter({'x': 10, 'y': 7, 'z': 0})
Counter({'x': 10, 'y': 7})
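
When several counts have dropped to zero, deleting keys one by one gets tedious. A small sketch of an alternative, assuming it is acceptable to drop negative counts too: the unary plus operator on a Counter returns a new Counter that keeps only the positive counts.

```python
from collections import Counter

c = Counter(x=10, y=7, z=3)
c["z"] = 0

# The key is still present after its count is set to zero.
still_present = "z" in c

# Unary plus builds a new Counter without zero or negative counts.
positive_only = +c

print(still_present)
print(positive_only)
```

The original counter c is left unchanged, so this suits cases where a cleaned copy is enough.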

How to add or increment single item of the Python Counter class

 

How to add or increment single item of the Python Counter class

A set uses .update to add multiple items, and .add to add a single one. Why doesn't collections.Counter work the same way? To increment a single Counter item using Counter.update, you have to add i...

stackoverflow.com

Incrementing Counter items in Python and applying them to a list.

>>> import collections
>>> c = collections.Counter(a=23, b=-9)

#You can add a new element and set its value like this:

>>> c['d'] = 8
>>> c
Counter({'a': 23, 'd': 8, 'b': -9})


#Increment:

>>> c['d'] += 1
>>> c
Counter({'a': 23, 'd': 9, 'b': -9})


#Note though that c['b'] = 0 does not delete:

>>> c['b'] = 0
>>> c
Counter({'a': 23, 'd': 9, 'b': 0})


#To delete use del:

>>> del c['b']
>>> c
Counter({'a': 23, 'd': 9})

Counter is a dict subclass, so item assignment and del work on it just as they do on a dict.
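
For incrementing from a list rather than one key at a time, Counter.update accepts any iterable of items and adds one per occurrence. It also accepts a mapping of counts to add. A minimal sketch with made-up values:

```python
from collections import Counter

c = Counter(a=23, b=-9)

# Each occurrence in the list adds 1 to that key; missing keys start at 0.
c.update(["a", "d", "d"])

# A mapping adds the given amounts instead of replacing the counts.
c.update({"b": 10})

print(c)
```

Unlike dict.update, this adds to the existing counts instead of overwriting them.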

Yes, Cyber's solution is the best.

For beginners

  1. Open the file in read mode.
  2. Iterate over the lines with readlines() or readline().
  3. Use the split(",") method to split each line on ','.
  4. Use float() to convert each string value to a float. (eval() also works.)
  5. Use the list append() method to append each tuple to the list.
  6. Use try/except so that a bad line does not break the code.
My text file is:

-34.968398,-6.487265
-34.969448,-6.488250
-34.967364,-6.492370
-34.965735,-6.582322


Code:

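p = "/home/vivek/Desktop/test.txt"
result = []
# Open in text mode so each line is a str that can be split on ",".
with open(p, "r") as fp:
    for i in fp.readlines():
        tmp = i.split(",")
        try:
            result.append((float(tmp[0]), float(tmp[1])))
            #result.append((eval(tmp[0]), eval(tmp[1])))
        except (ValueError, IndexError):
            # Skip blank or malformed lines.
            continue

print(result)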


Output:

$ python test.py 
[(-34.968398, -6.487265), (-34.969448, -6.48825), (-34.967364, -6.49237), (-34.965735, -6.582322)]
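
The same steps can also be written with the standard csv module, which does the splitting. This is a sketch and not the answer's own method: the function name read_pairs is hypothetical, and io.StringIO stands in for the real file so the example runs without touching the disk.

```python
import csv
import io


def read_pairs(lines):
    """Parse 'x,y' rows into a list of (float, float) tuples, skipping bad rows."""
    result = []
    for row in csv.reader(lines):
        try:
            result.append((float(row[0]), float(row[1])))
        except (ValueError, IndexError):
            # Blank lines give an empty row; non-numeric fields raise ValueError.
            continue
    return result


# In-memory stand-in for the text file, with one blank and one malformed line added.
sample = io.StringIO("-34.968398,-6.487265\n\nnot,numbers\n-34.969448,-6.488250\n")
pairs = read_pairs(sample)
print(pairs)
```

With a real file, the open file object can be passed to read_pairs in place of the StringIO object.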

Read a file as a list of tuples

 

 
