I'm working on problem set ps3b. I've gotten def comp_choose_word to return a word to me, which is good. However, it's slow as crap because of all the searching it has to do. Like, go get coffee and add some milk and sugar slow. I'm going to continue to get 6A, 6B, and 6C to function since I don't have that much left, and then my first goal will be to try to speed up this search business. (Along with a bunch of other stuff I could probably improve because I'm new to programming.) Anyways, this is my code for comp_choose_word, if you have any suggestions:
def comp_choose_word(hand, word_list): """ Given a hand and a word_dict, find the word that gives the maximum value score, and return it. This word should be calculated by considering all possible permutations of lengths 1 to HAND_SIZE. hand: dictionary (string -> int) word_list: list (string) """ # TO DO... #hand = deal_hand(n) newhand = hand.copy() print newhand global comphand comphand = newhand print "Computer's hand: ", display_hand(hand) words = load_words() for i in range(1, n+1): possi = get_perms(hand, i) for p in possi: for w in words: if p == w: global biggest biggest = p break return biggest
I see several problems here. 1. Everything before the for loop is not necessary as hand and word_list are sent to the function. 2. There are no variables before the for loop to keep track of the highest scoring word and it's score. 3. Your for loop, loops through the possible word lengths, get's the perms at each length, checks to see if they are in the word list and if so sets an unneeded global and breaks out of the loop. Then continues to check all other possibilities and returns the last biggest word it finds. Instead of using global biggest just set biggest = '' before the for loop. 4. You never call get_word_score so you can return the best scoring word and simply return the last longest word you find. 5. Instead of using "for w in words:, if p == w: and break" simply use "if p in words:" I changed from the second option to your way and it almost doubled the time in my test. Using "if p in words:" >>> import cProfile >>> cProfile.run("comp_choose_word({'t':2,'s':2,'e':3}, word_list)") 187173 function calls (66884 primitive calls) in 67.559 seconds Using "for w in words ..." >>> import cProfile >>> cProfile.run("comp_choose_word({'t':2,'s':2,'e':3}, word_list)") 187173 function calls (66884 primitive calls) in 132.940 seconds So with that said here is your code slightly modified as above. biggest = '' for i in range(1, n+1): possi = get_perms(hand, i) for p in possi: if p in words: biggest = p return biggest Now let's add the best scoring word and not just the last longest word. biggest = '' best_score = 0 for i in range(1, n+1): possi = get_perms(hand, i) for p in possi: if p in words: # call get_word_score to get score of current word # if score is better than best_score # update best_score and biggest biggest = p return biggest Fill in the three commented lines and you will have a fully working function. As you can see from cProfile above this can still be slow but to make a significant difference above this you would have to change the dictionary structure to likely a hash table of some sort or similar concept, which of course then requires changes to any code that uses it.
One thing I missed is that without the code above the for loop 'words' needs to be changed to 'word_list' that is sent to the function.
the 2008 pset (ps6) went a bit further with the idea. when the words are loaded you can make 2 dictionaries: {word:points} and {sorted word: word} e.g. ... {'ehlp':'help'} now you can search through the "sorted" dictionary and use the "points" dictionary to get the points. since the strings you are searching are sorted, you can get away with generating combinations instead of permutations --> this is a huge savings, especially for larger hands. this has to do with whether order matters or not. the 2 letter permutations of 'love' are lo , lv , le , ol , ov , oe , vl , vo , ve , el , eo , ev the 2 letter (sorted) combinations are: el , eo , ev , lo , lv , ov you get a speed hit at startup for making the two dictionaries. you get a speed increase for getting the points for each word you find you get a speed increase because you don't have as many iterations you can use itertools.permutations() or itertools.combinations() to generate the sets and subsets. http://dpaste.com/1057264/ http://docs.python.org/2.7/library/itertools.html
msmithhnova just changing it to 'if p in words' made a significant improvement on the time the computer took to choose a word. I'm going to continue messing with the program and implementing some of the other suggestions, including bwCA's idea of turning the lists into dictionaries. Another thing I'm curious about is timers. I wouldn't mind the program stopping after a set time and returning whatever current word 'biggest' is set to at that time. Thanks to both of you for your help!
I know I closed this quetion, but looking at bwCA's code comparison of combinations vs. permutations caused me to have a question. I noticed that the combination list was MUCH shorter, but it didn't include everything the permutation list had. With combinations, will some of the possible words be left out?
I believe what he means by the {sorted word: word} dictionary is that since the words 'these' and 'sheet' have the same letters and number of letters they would be represented something like this {eehst:[these,sheet]}. If all the keys had multiple values like this then you only need to search for the sorted combinations and not all permutations which would cut down your search. If you happened to have the letters 'eehst' then it could look at the words corresponding to that key and if not it could skip over that particular set so rather than having to search for all 'permutations', you only search for the smaller set of sorted "combinations'. Thus the 'combinations' you search for are shorter and so is the keys in the dictionary you are searching through. Something I would note though is that making the dictionary of all word scores he mentions {word:points} has questionable worth as you likely won't use all words during a game and thus do not need the scores for them all. In my trial runs above the get_word_score portion only took 0.017 seconds of the total time so unless pre-calculating them all took less than a few seconds I likely would not do that. I would have to test it.
loading the words from the file into a dictionary {word : points} takes 0.2 seconds. The point calculation is in the 'load' function - the function returns the dictionary.
0,2 seconds to get the sorted word dictionary
Cool, 0.2 seconds isn't bad. Wouldn't make a lot of difference either way then as both times are small. The 0.017 seconds above is for 372 calls on get_word_score so you would have to call get_word_score about 4376 times for it to use up 0.2 seconds. So, for a run of the program that 0.2 seconds is still larger than the time get_word_score would use but for 0.2 seconds does it really matter? lol You wouldn't notice the difference either way.
Join our real-time social learning platform and learn together with your friends!