Posts

Showing posts from November, 2006

Notes on Sequential Pattern Mining (2) -- Partial Order Pattern Mining and Contrast Mining

1. In , the authors introduce the TEIRESIAS algorithm for mining combinatorial patterns with gap constraints in biological sequences. The patterns TEIRESIAS mines are similar to common sequential patterns, but they may contain the wildcard ".", which is added to the alphabet of the sequence database and stands for any available item. For example, the pattern "A..B" is a length-4 pattern with two arbitrary items between the first A and the last B. The patterns "AC.B" and "AADB" are both said to be more specific than the pattern "A..B". TEIRESIAS mines all the maximal patterns () with a support above a minimum threshold K. There are some key points of the TEIRESIAS algorithm: 1) The growth of the patterns. Pattern growth is accomplished by convolving the current pattern with a short pattern. Pattern A and pattern B are convolvable if the last L (very small) characters of pattern A are the same as the first L characters of pattern B, then