Sentiment Analysis on Text: TextBlob

Posted by jopen 10 years ago | 43K views | TextBlob, Python development

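TextBlob is an open-source text processing library written in Python (2 and 3). It can be used for many natural language processing tasks, such as part-of-speech tagging, noun phrase extraction, sentiment analysis, and text translation. You can read about all of TextBlob's features in the official documentation.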

Why should I care about TextBlob?

I am learning TextBlob for the following reasons:

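  1. I want to build applications that need text processing. Adding text processing to an application lets it understand people's behavior better, which makes it feel more human. Text processing is hard to get right. TextBlob stands on the shoulders of a giant, NLTK, a leading choice for building Python programs that process natural language.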

  2. I want to learn how to do text processing in Python.

from textblob import TextBlob

text = '''
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact.
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
'''

blob = TextBlob(text)
blob.tags           # [(u'The', u'DT'), (u'titular', u'JJ'),
                    #  (u'threat', u'NN'), (u'of', u'IN'), ...]

blob.noun_phrases   # WordList(['titular threat', 'blob',
                    #            'ultimate movie monster',
                    #            'amoeba-like mass', ...])

for sentence in blob.sentences:
    print(sentence.sentiment.polarity)
# 0.060
# -0.341

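blob.translate(to="es")  # 'La amenaza titular de The Blob...'
                         # Note: needs network access to Google Translate;
                         # recent TextBlob releases have removed translate().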
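
To give a rough sense of what a score like `sentence.sentiment.polarity` represents, here is a minimal pure-Python sketch of lexicon-based scoring. It is not TextBlob's implementation: the `TOY_LEXICON` values, the `toy_polarity` function, and the one-word negation rule are all hypothetical and chosen only for illustration. The idea is that each known word carries a score in [-1, 1], and the sentence score is the average over the words found in the lexicon.

```python
import re

# Hypothetical toy lexicon: word -> polarity in [-1.0, 1.0].
# These values are made up for illustration; they are not TextBlob's data.
TOY_LEXICON = {
    "great": 0.8,
    "good": 0.7,
    "hungry": -0.2,
    "devastating": -1.0,
    "terrible": -1.0,
}

NEGATIONS = {"not", "never", "no"}


def toy_polarity(sentence):
    """Average the polarity of known words; flip the sign after a negation."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    scores = []
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        if token in TOY_LEXICON:
            score = TOY_LEXICON[token]
            scores.append(-score if negate else score)
        negate = False  # a negation only affects the next word
    # No known words: treat the sentence as neutral.
    return sum(scores) / len(scores) if scores else 0.0


print(toy_polarity("The movie was great"))           # 0.8
print(toy_polarity("The movie was not good"))        # -0.7
print(toy_polarity("A great cast, a terrible plot"))  # mixed: close to neutral
print(toy_polarity("Nothing to report"))             # 0.0
```

A real analyzer needs a far larger lexicon and more careful handling of negation and intensifiers, which is why a library such as TextBlob is worth using for this.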

Features

  • Noun phrase extraction
  • Part-of-speech tagging
  • Sentiment analysis
  • Classification (Naive Bayes, Decision Tree)
  • Language translation and detection powered by Google Translate
  • Tokenization (splitting text into words and sentences)
  • Word and phrase frequencies
  • Parsing
  • n-grams
  • Word inflection (pluralization and singularization) and lemmatization
  • Spelling correction
  • Add new models or languages through extensions
  • WordNet integration
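
Several of the features above are easy to picture in plain Python. The sketch below uses only the standard library and is not how TextBlob implements them; it simply shows what tokenization, word frequencies, and n-grams produce. The helper names `tokenize`, `word_frequencies`, and `ngrams` are hypothetical. In TextBlob itself the corresponding entry points are `blob.words`, `blob.word_counts`, and `blob.ngrams(n=...)`.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a crude regex tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("The blob ate the town, and the blob grew.")
print(tokens)
# ['the', 'blob', 'ate', 'the', 'town', 'and', 'the', 'blob', 'grew']

print(word_frequencies(tokens).most_common(2))
# [('the', 3), ('blob', 2)]

print(ngrams(tokens, n=2)[:3])
# [('the', 'blob'), ('blob', 'ate'), ('ate', 'the')]
```

The regex tokenizer here ignores punctuation and sentence boundaries, which a proper tokenizer would handle.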

Project homepage: http://www.baiduhome.net/lib/view/home/1408694649350
