Can anyone help me create an n-gram frequency program in Python?
I don't know what an 'n-gram frequency' is, but I would always advise people to check out good libraries for the stuff they want to do. Especially if the stuff you want to do is supported by Python's powerful standard libraries :-D
Where can I find such libraries? Sorry if the question was dumb, beginner here. N-gram frequency supposedly works like this: Consider the following sentence: “Jack and Jill ran up the hill.” Suppose n = 3, which means we’re looking at triplets of adjacent words. Our example sentence has the following sequence of triplets. Just as in question #1, we will ignore the case of letters along with punctuation, numbers, etc. (“jack”, “and”, “jill”) (“and”, “jill”, “ran”) (“jill”, “ran”, “up”) (“ran”, “up”, “the”) (“up”, “the”, “hill”)
Python has powerful built-in string methods, such as str.split() (replace str with your sentence). For example, the following code: sentence = "Jack and Jill went up the hill" gets you the list ['Jack', 'and', 'Jill', 'went', 'up', 'the', 'hill']
oops forgot to include sentence.split() up there, before the result :P
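To make the corrected example concrete, here is a minimal sketch of splitting a sentence into words. The `tokenize` helper is a hypothetical name that does not appear in the thread; it assumes that "ignore case, punctuation, numbers" means keeping only lowercase runs of ASCII letters, which may not match the assignment exactly.

```python
import re

def tokenize(sentence):
    # Hypothetical helper: lowercase the text, then keep only runs of
    # ASCII letters, so punctuation and digits are dropped (an assumption).
    return re.findall(r"[a-z]+", sentence.lower())

sentence = "Jack and Jill went up the hill"
words = sentence.split()  # plain whitespace split; keeps case and punctuation

tokens = tokenize("Jack and Jill ran up the hill.")
```

Note that under this assumption a word like "don't" is split into two tokens, so the pattern may need adjusting for real text.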
Oh okay, so does that mean a library is also called a module? I am quite familiar with the built-in function that you showed me. I think that function can be used as soon as you import string..
I thought split was built into Python 2.7 :D No need to import any module for that. Try the itertools module as well.
oh okay i'll check that out! Thanks.. i think you are right it is a built in function, I just have been using it with other functions with the string module..
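Putting the suggestions so far together, one possible sketch of an n-gram frequency counter using only the standard library is shown below. The function name `ngram_counts` is made up for illustration, the letters-only tokenization is an assumption about what "ignore case and punctuation" means, and `collections.Counter` is a choice of this sketch rather than something proposed in the thread (it is available from Python 2.7 onward).

```python
import re
from collections import Counter
from itertools import islice

def ngram_counts(text, n=3):
    # Assumed normalization: lowercase, letters only.
    words = re.findall(r"[a-z]+", text.lower())
    # Build n views of the word list, each starting one word later;
    # zipping them yields every run of n adjacent words as a tuple.
    shifted = [islice(words, i, None) for i in range(n)]
    return Counter(zip(*shifted))

counts = ngram_counts("Jack and Jill ran up the hill.", n=3)
for gram, freq in sorted(counts.items()):
    print("%s: %d" % (" ".join(gram), freq))
```

For the example sentence every triplet occurs once, so the counts only become interesting on longer text where word sequences repeat.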
The Natural Language Toolkit ( http://nltk.org ) has functions for finding n-gram collocations. Read the following chapters: http://nltk.googlecode.com/svn/trunk/doc/book/ch01.html (section on Collocations and Bigrams) http://nltk.googlecode.com/svn/trunk/doc/book/ch03.html (more in-depth information) http://nltk.googlecode.com/svn/trunk/doc/howto/collocations.html (documentation for collocations)
import sys

class ostream(object):
    def __init__(self, file):
        self.file = file

    def __lshift__(self, obj):
        self.file.write(obj)
        return self

cout = ostream(sys.stdout)
cerr = ostream(sys.stderr)
endl = '\n'

char = raw_input("Character is ")
num = int(raw_input("Number is "))
n = num - 1
char = char.capitalize()
for i in xrange(num):
    if char == 'L':
        for k in xrange(n):
            cout << ' '
        n -= 1
    if char == 'R' or char == 'L':
        for j in xrange(i + 1):
            cout << 'X'
        cout << endl