
The trick of counting support in structured data mining

In most structured data mining problems, there is a subtlety in how pattern support is counted. Take sequence mining for example. Suppose we have a dataset of 2 sequences as below:

(AB)(C)(AB)(BC)
(BC)(A)(BC)

We have 2 ways of counting the support of pattern (A)(C):
1) Count every appearance of (A)(C), where an appearance is an itemset containing A followed by a later itemset containing C. The first sequence has 3 such appearances and the second has 1, so the support of (A)(C) would be 4 (we call it support-all).
2) Count once for all appearances of (A)(C) in one sequence. In this case, the support of (A)(C) would be 2 (we call it support-byseq).

Now we have 2 types of support. When dealing with practical problems, we usually choose one of them as the threshold for frequent patterns. The other support still has properties related to the chosen one; for instance, support-byseq can never exceed support-all. The BIDE algorithm is an example: it uses support-byseq as the threshold and uses the other support to form a pruning scheme.
