MIT 6.00 Intro Computer Science (OCW) 27 Online
OpenStudy (anonymous):

Prob Set 3. I'm scratching my head about the starts1(2) part. Not sure how to proceed. I'm reading it as a fuzzy match type problem. Any thoughts?

OpenStudy (anonymous):

ps. My code so far on the string search part seems to work fine: def subStringMatchExact(w,l): """w = word, l = search string""" import string s = 0 index = () for i in w: z = string.find(w, l, s) s = z + 1 if z not in index and z != -1: index = index + (z,) print "Indexes of your search string are:", index Any help would be appreciated - Maybe I just need some sleep. ;-)

OpenStudy (anonymous):

I remember that my original solution to this problem was full of ugly kludges. If you want to work through it together later tonight I'd be in.

OpenStudy (anonymous):

Cool - I won't have much time tonight, so I wrote up a quick bit of code that may work. I haven't assigned variables or anything, so it is very raw. Here is my basic idea: def fuzzymatching(): import string w = "abcaabaacabbaccacb" p = "abc" print "Word is:", w print "Search item is:", p s1 = p[0] s2 = p[1] s3 = p[2] s4 = p[:2] s5 = p[1:3] s6 = p[0]+p[2] lista = [s1,s2,s3,s4,s5,s6,] for d in lista: g = string.find(w,d,0) print d, "is at index", g

OpenStudy (anonymous):

Ok well, I quickly ran over my code and produced this instead, which works, albeit its ugly! lol def fuzzymatching(): import string w = "abcaabaacabbaccacb" p = "abc" print "Word is:", w print "Search item is:", p s1 = p[0] s2 = p[1] s3 = p[2] s4 = p[:2] s5 = p[1:3] s6 = p[0]+p[2] lista = [s1,s2,s3,s4,s5,s6,] print "list a is:", lista #print s1, s2, s3, s4, s5, s6 h = 0 for e in w: for d in lista: #while h < len(w): g = string.find(w,d,h) h = h + 1 if g != -1: print d, "is at index", g

OpenStudy (anonymous):

Cool. I'm looking at a way to save the locations in lists or a big list

OpenStudy (anonymous):

Ya - right now my main problem is writing a function to break down the abc into "a,b,c,ab,ac" automatically, for any size string, and for any combination I specify...

OpenStudy (anonymous):

yeah, that was the real nightmare last time I tried this--two different segments meant two tuples of location tuples, then matching them up....This time I think i want to do it in sequence--so find [1:], record all locations, find [0] and [2:], record all locations, etc.

OpenStudy (anonymous):

Well I can't be too hard on myself - my total programming experience is less than 3 weeks! I'm off to bed - have a good one...let me know how your attempts turns out. I'm still working on that ps.

OpenStudy (anonymous):

cool, I'll put this away for the night after I finish this bit.

OpenStudy (anonymous):

I made some progress...I am working on this now: def fuzzier3(): """For W, add a word in quotes. For S, add a letter(s) to find in w.""" import string w = "abcabdabeabcmmmma" s = "a" i = 0 z = 0 print s while i <= len(w): j = string.find(w, s, i) i = i + 1 if j != -1: print "Index for each letter combination is:", j i = j + 1 z = z + 1 print "The amount of times the string is found in your word:", z You think I'm on the right track?

OpenStudy (anonymous):

Getting the first one right is going to be crucial for the 3b, 3c,and 3d. First, I suggest using more descriptive names, and combining the while with the if j != -1. Also note that he wants you be using tuples and a specific function setup, Here is what I came up with for the iterative part for 3a: def subStringMatchExact(target,key): locations = () found = find(target,key) while found != -1: locations = locations[:] + (found,) found = find(target,key,found+1) return locations While it looks simple, there are a few tricks in there you should know, like how to dynamically create and append to tuples. Hopefully this will help you see the rest of the assignment a little more clearly

OpenStudy (anonymous):

if you want to return the number of matches use this at the end instead: return len(locations)

OpenStudy (anonymous):

I never thought of writing this: locations = locations[:] + (found,). Thanks! My other working trial has better descriptions of variables. I start by using very simple single letters, otherwise I find I get mixed up. Thanks for the advice...and I am still reading about tuples...haven't read everything yet though.

OpenStudy (anonymous):

Aslander, I'm not entirely sure what part of the problem you're solving with the previous. I still think what you need is a way to generate all the different substrings you need to look for. I suggest using an integer to represent an index number in the original string, and search for two strings broken using that integer. Something like targetstring = the string I want to find stuff in keystring = the string I want to find indexnumber = 0 for indexnumber in range (length of keystring): firstsubstring = the part of the keystring from its beginning to the current indexnumber secondsubstring = the part of the keystring from the current indexnumber to the end then you can store those to look for them later, or you can look for them and figure out how to match them up to find strings with one element changed, or whatever.

OpenStudy (anonymous):

Oh, and for either firstsubstring or secondsubstring you'll have to add/subtract from indexnumber to make sure you're skipping a letter, and you'll have to do something (if statement or whatever) to make sure you don't get an indexerror when that addition/subtraction puts the indexnumber out of the index of the keystring.

OpenStudy (anonymous):

Somnamniac: What I was trying was to first solve the small problem of finding a search letter (s) in a word(w). If that code worked I would then tweak it to find a combination of letters in "w". If this worked, then I would try and enter the tuples into the code, and try finding combinations of the two letter combos in "w".

OpenStudy (anonymous):

For some reason I can find the index of combos, but not make it recur...or iterate, for any combination of two letters in the key.

OpenStudy (anonymous):

I'm not sure what you mean by "a combination of letters." is that one string, or two strings, or a series of strings of single letters? The way I see it, the first thing you posted in this thread will give you (or can be made to give you, I forget) a tuple of locations of a key string in a target string. If you just call that function a bunch of times with different key strings, you'll get a tuple of the locations of each keystring. so if your original keystring is '12345678', you can call that function with '' then '2345678', then '1' and '345678' etc, and the problem becomes one of taking the tuples of locations you get and matching them up to find where the first string is separated from the second by one index number.

OpenStudy (anonymous):

Ok - I see now, I was reading the problem incorrectly, and got confused...my bad. :-) I ended up writing code that would concatenate abc as abc -a - ab -ac - b - bc. But where is the ac? I'm working on that now...thanks for making me realize that somnamniac.

OpenStudy (anonymous):

ps. The "abc as abc -a - ab -ac - b - bc. ", should be without the ac....there should be a way to fix typos! :-)

OpenStudy (anonymous):

you don't want ac, you want 'a c', right?

OpenStudy (anonymous):

or rather, 'a' and 'c' with something else between them.

OpenStudy (anonymous):

Exactly...a and c with something in between.

OpenStudy (anonymous):

Now if I use the key "abc". I would need to get "something"+bc or a+"something"+c, or ab+"something", where Something = a, b or c. Would that be right?

OpenStudy (anonymous):

If anyone is interested - the code I got to work is this: def findmatch(word, search): """Finds a search item in a word.""" import string n = 0 searchtuple = () for i in word: index = string.find(word,search,n) n = index + 1 if index != -1 and index not in searchtuple: searchtuple = searchtuple + (index,) print "hi", checkkeys(search) return searchtuple, "Keys from search item:", checkkeys(search) ps. checkkeys(search) is another function I wrote to return all those substrings/keys. It works pretty good, but I could still write it better...or finetune it.

OpenStudy (anonymous):

Interesting, I made a different assumption than you did. I decided that overlapping results were not really separate results. So if you're searching for 'abab' in 'ababab', you'd just get 0. This is what I came up with: def subStringMatchExact(target, key) : lastfind = find(target, key), while find (target, key, lastfind[-1] + len(key)) > -1 : lastfind += find(target, key, lastfind[-1] + len(key)), if lastfind == (-1,) : return () else: return lastfind

OpenStudy (anonymous):

I understood the problem to demand returns of all instances of the key, regardless of overlapping. It also called for fuzzy matching, or if the key is abcd, then we would need to return any matches for a(n)cd, ab(n)d, abc, and bcd. In these cases, anything close might hit in the target string. My checkkeys(search) function tries to do that, returning everything in tuples.

OpenStudy (anonymous):

I'm slightly confused on part 4 of the problem. The sentence that says "If we keep only those elements of the second tuple that don't occur in the first tuple, we will have the matches with exactly one substitution" Using target1 and key12, if I input them into subStringExactMatch I get (5,15). If I input these into subStringMatchOneSub I get (5,15) for all subs except for when the key is broken up to "atg" and " ", where the possible matches are (0,5,15). From what I am understanding then the match with one substitution would be the tuple (0). Am I missing something here? Thanks.

OpenStudy (anonymous):

What I understood was that the functions we wrote will return hits of keys with and without substitutions. What problem 4 says to me is: do not get exact matches, but only those with one sub, by using your answers(functions) to problem 2+3, and use only thoes elements from Tuple 2 that do not match (or occur) in Tuple1. Now if I misunderstood things as well, then I have a bigger problem than you do! :-) Food for thought though, and I think I'll have to go over that problems set now!

OpenStudy (anonymous):

I'll give it another look a bit later. Thanks for the help :)