Yucefs-MacBook-Pro2:ITP-ROY yucefmerhi$ python wordfreq.py
This program analyzes word frequency in a file
and prints a report on the n most frequent words.

File to analyze: emails
Output analysis of how many words? 60
de 81
la 54
que 45
en 34
a 31
y 31
el 26
mi 24
por 23
es 20
un 20
una 19
http 17
i 17
las 16
para 16
los 15
como 14
me 14
the 14
con 13
to 13
com 12
no 11
tu 11
and 10
arte 10
lo 10
se 10
yucef 10
al 9
te 9
you 9
dos 8
cibernetic 7
del 7
is 7
it 7
muy 7
página 7
años 6
bien 6
he 6
hola 6
org 6
www 6
cualquier 5
espero 5
gracias 5
mis 5
mucho 5
my 5
número 5
sin 5
web 5
artista 4
at 4
entre 4
for 4
have 4
Yucefs-MacBook-Pro2:ITP-ROY yucefmerhi$

Code
# wordfreq.py
import string

# Comparison function for sort(): highest count first,
# ties broken alphabetically by word.
def compareItems((w1,c1), (w2,c2)):
    if c1 > c2:
        return -1
    elif c1 == c2:
        return cmp(w1, w2)
    else:
        return 1

def main():
    print "This program analyzes word frequency in a file"
    print "and prints a report on the n most frequent words.\n"

    # get the sequence of words from the file
    fname = raw_input("File to analyze: ")
    text = open(fname,'r').read()
    text = string.lower(text)
    # replace each punctuation character with a space
    for ch in """!"#$%&()*+,-./:;<=>?@[\\]?_'`{|}?""":
        text = string.replace(text, ch,' ')
    words = string.split(text)

    # construct a dictionary of word counts
    counts = {}
    for w in words:
        try:
            counts[w] = counts[w] + 1
        except KeyError:
            counts[w] = 1

    # output analysis of n most frequent words.
    n = input("Output analysis of how many words? ")
    items = counts.items()
    items.sort(compareItems)
    # never index past the end when n exceeds the number of distinct words
    for i in range(min(n, len(items))):
        print "%-10s%5d" % items[i]

if __name__ == '__main__': main()
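
The listing above is Python 2 (print statements, raw_input, a cmp-style sort function). As a rough sketch only, the same counting and ranking could look like this in Python 3. The names top_words and report are hypothetical, and string.punctuation is assumed as a stand-in for the hand-written punctuation list, so the exact set of stripped characters differs slightly from the script.

```python
import string
from collections import Counter


def top_words(text, n):
    """Return up to n (word, count) pairs, most frequent first."""
    # Lowercase, then replace every ASCII punctuation character with a space.
    # Assumption: string.punctuation approximates the script's own list.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = text.lower().translate(table).split()
    counts = Counter(words)
    # Highest count first; ties broken alphabetically, like compareItems.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


def report(pairs):
    """Format pairs with the same "%-10s%5d" column layout as the script."""
    return ["%-10s%5d" % pair for pair in pairs]


if __name__ == "__main__":
    sample = "Hola, hola! El arte de la web: el arte."
    for line in report(top_words(sample, 3)):
        print(line)
```

The key function (-count, word) takes the place of compareItems, since Python 3 sorting no longer accepts a comparison function directly. Slicing with ranked[:n] simply returns fewer rows when n is larger than the number of distinct words.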
