Good evening. I am trying to find out how to design a kernel that would let me align proteins, ideally with a worked example using an SVM.
Does anyone have a good link that shows how to calculate a protein alignment score, a sample kernel computation, or an SVM (or even RapidMiner) application for predicting protein or polynucleotide alignments?
http://videolectures.net/mlg08_joachims_sop/ does not show the computation in enough detail.
Example: Proteins can be represented as sequences of variable length, typically several hundred characters long, from the alphabet of 20 amino acids. In order to use learning algorithms that require vector inputs, we must first find a suitable feature vector representation, mapping sequence x into a vector space by x → Φ(x). Kernel methods, such as SVMs, need to compute only inner products, called kernels, K(x, y) = ⟨Φ(x), Φ(y)⟩, for training and testing. Thus, we can accomplish the above mapping using a kernel for sequence data. Biologically motivated sequence comparison scores, such as SW or BLAST, provide an appealing representation of sequence data. The SW algorithm (Smith and Waterman, 1981) uses dynamic programming to compute the optimal local gapped alignment score between two sequences, whereas BLAST (Altschul et al., 1990) approximates SW by computing a heuristic alignment score. Both methods return empirically estimated E-values indicating the confidence of the score. These alignment-based scores do not define a positive definite kernel; however, one can use a feature representation based on the empirical kernel map Φ(x) = (d(x_1, x), ..., d(x_m, x)), where d(x, y) is the pairwise score (or E-value) between x and y and x_i, i = 1, ..., m, are the training sequences. Using E-values of the SW algorithm in this fashion gives strong classification performance (Liao and Noble, 2002). Note, however, that the method is slow, both because computing each SW score is O(|x|^2) and because computing each empirically mapped kernel value is O(m).
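To make the empirical kernel map in the passage above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the scoring values (match = 2, mismatch = -1, linear gap = -2) and the function names sw_score, empirical_map and empirical_kernel are made up for this example. Real protein work would normally use a substitution matrix such as BLOSUM62 with affine gap penalties, and Liao and Noble used E-values rather than raw scores.

```python
def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score (score only, no traceback).

    Assumption: toy match/mismatch scores and a linear gap penalty,
    not a biological substitution matrix.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def empirical_map(x, train):
    """Feature vector of x: its alignment score against each training sequence."""
    return [sw_score(t, x) for t in train]


def empirical_kernel(x, y, train):
    """Inner product of the two empirical feature vectors."""
    fx = empirical_map(x, train)
    fy = empirical_map(y, train)
    return sum(p * q for p, q in zip(fx, fy))


if __name__ == "__main__":
    train = ["ACD", "CCC"]  # hypothetical toy training sequences
    print(empirical_map("ACD", train))
    print(empirical_kernel("ACD", "CCC", train))
```

Because the kernel is an ordinary dot product of feature vectors, the resulting Gram matrix is positive semidefinite even though the raw alignment score is not a valid kernel. The Gram matrix over the training set could then be handed to any SVM implementation that accepts a precomputed kernel. The cost matches the note in the passage: each score is quadratic in sequence length, and each kernel value needs 2m scores.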