Example of Counting Word Frequency in Articles with Python

  • 2021-07-10 20:14:07
  • OfStack

This article walks through an example of counting the words in articles with Python. It is shared here for your reference:

The problem is as follows: you have a directory containing one month of diary entries, all stored as .txt files. To avoid word-segmentation issues, assume the contents are entirely in English. Find the most important word in each diary entry.

In other words, the task is to find the word that appears most often in each file, ignoring common articles, conjunctions, prepositions, and verbs such as "is". The code is:


# coding=utf-8
import collections
import os
import re

# Common words (articles, conjunctions, prepositions, "is") to ignore
useless_words = ('the', 'a', 'an', 'and', 'by', 'of', 'in', 'on', 'is', 'to')

def get_important_word(file):
    # Count every word in the file, case-insensitively
    word_counter = collections.Counter()
    with open(file) as f:
        for line in f:
            words = re.findall(r'\w+', line.lower())
            word_counter.update(words)
    # most_common() returns (word, count) pairs sorted by count, so the first
    # word that is not in useless_words is the most important one
    for word, num in word_counter.most_common():
        if word not in useless_words:
            print('the most important word in %s is %s, it appears %d times' % (file, word, num))
            return
    # Empty file, or a file that contains only useless words
    print('no important word found in %s' % file)

if __name__ == '__main__':
    filepath = '.'
    # Walk the directory tree and process every .txt file
    for dirpath, dirnames, dirfiles in os.walk(filepath):
        for file in dirfiles:
            if os.path.splitext(file)[1] == '.txt':
                abspath = os.path.join(dirpath, file)
                if os.path.isfile(abspath):
                    get_important_word(abspath)
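
The same idea can also be written as a function that returns its result instead of printing it, which makes it easier to check on a small string. The following is only a sketch, not the article's own code: the names `most_important_word` and `STOP_WORDS` are hypothetical, and the stop-word list simply repeats `useless_words` from the listing above. It filters the stop words out before counting, so the top entry of the counter is already the answer.

```python
import collections
import re

# Hypothetical stop-word set; the same words as useless_words in the article
STOP_WORDS = frozenset(('the', 'a', 'an', 'and', 'by', 'of', 'in', 'on', 'is', 'to'))

def most_important_word(text, stop_words=STOP_WORDS):
    """Return (word, count) for the most frequent non-stop word, or None."""
    words = re.findall(r'\w+', text.lower())
    # Drop stop words before counting, so no skipping loop is needed afterwards
    counter = collections.Counter(w for w in words if w not in stop_words)
    top = counter.most_common(1)
    return top[0] if top else None

sample = "The cat and the dog. The cat is in the garden."
print(most_important_word(sample))  # ('cat', 2)
```

Returning `None` for text that contains only stop words (or nothing at all) avoids the index error that an unconditional `most_common(1)[0]` would raise on an empty counter.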

Study notes:

The collections module is part of Python's standard library and provides many useful container classes. Here we use its Counter class and the most_common() method.
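
As a small illustration of these two pieces, the sketch below builds a Counter from two batches of words and asks for the two most frequent ones. The word lists are made up for the example.

```python
import collections

counter = collections.Counter()
# update() adds to the existing counts rather than replacing them
counter.update(['rain', 'sun', 'rain'])
counter.update(['rain', 'sun', 'wind'])

# most_common(n) returns the n most frequent items as (item, count) pairs,
# sorted from most to least frequent
top_two = counter.most_common(2)
print(top_two)          # [('rain', 3), ('sun', 2)]

# A missing key counts as 0 instead of raising KeyError
print(counter['snow'])  # 0
```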

I hope this article helps with your Python programming.

