The segmentation of Time Series

In 2003, E. Keogh, S. Chu, D. Hart, and M. Pazzani wrote "Segmenting Time Series: A Survey and Novel Approach", published in Data Mining in Time Series Databases by World Scientific Publishing Company. The article discusses how to convert time series data with high-frequency noise into a piecewise linear representation, in which each segment is a line fitted by linear regression. If the original time series has N timestamps and the piecewise linear result has K segments (with K far smaller than N), the space required to store and transmit the data is greatly reduced.

The article summarizes three typical methods of segmenting time series: sliding window, top-down, and bottom-up.

  • The sliding window method treats the time series as a data stream and segments it from the earliest data to the latest. Each time the method steps forward one timestamp, it either adds the current value to the existing window or, if the error of the window would exceed a threshold, closes the current window and opens a new one for the incoming data. Because it only looks at data as it arrives, it can be used for online processing.
  • The top-down method recursively cuts the current data segment into two sub-segments until each segment satisfies some error threshold.
  • The bottom-up method starts from a large number of very small segments and iteratively merges the pair of adjacent segments whose merge adds the least error with respect to the original time series, until some error threshold is reached.
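The three methods above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the article: the function names (`fit_error`, `sliding_window`, `top_down`, `bottom_up`), the use of the sum of squared residuals as the error measure, the single `max_error` threshold, and the half-open `(start, end)` segment format are all assumptions made for the example.

```python
def fit_error(values):
    """Sum of squared residuals of a least-squares line fitted to values."""
    n = len(values)
    if n <= 2:
        return 0.0  # one or two points are always fitted exactly
    mx = (n - 1) / 2
    my = sum(values) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (v - my) for x, v in enumerate(values))
    slope = sxy / sxx
    return sum((v - (my + slope * (x - mx))) ** 2 for x, v in enumerate(values))


def sliding_window(y, max_error):
    """Grow a window point by point; close it when the fit gets too bad."""
    segments, start = [], 0
    for end in range(2, len(y) + 1):
        if fit_error(y[start:end]) > max_error:
            segments.append((start, end - 1))  # close the window before the new point
            start = end - 1                    # the new point opens the next window
    segments.append((start, len(y)))
    return segments


def top_down(y, max_error, lo=0, hi=None):
    """Recursively split y[lo:hi] until every piece fits within max_error."""
    if hi is None:
        hi = len(y)
    if hi - lo <= 2 or fit_error(y[lo:hi]) <= max_error:
        return [(lo, hi)]
    # Choose the split point that minimizes the combined error of both halves.
    split = min(range(lo + 1, hi),
                key=lambda s: fit_error(y[lo:s]) + fit_error(y[s:hi]))
    return top_down(y, max_error, lo, split) + top_down(y, max_error, split, hi)


def bottom_up(y, max_error):
    """Start from two-point segments and keep merging the cheapest adjacent pair."""
    bounds = list(range(0, len(y), 2)) + [len(y)]
    while len(bounds) > 2:
        # Error of the segment that would result from merging each adjacent pair.
        costs = [fit_error(y[bounds[i]:bounds[i + 2]])
                 for i in range(len(bounds) - 2)]
        i = min(range(len(costs)), key=costs.__getitem__)
        if costs[i] > max_error:
            break
        del bounds[i + 1]  # drop the boundary between the two merged segments
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    series = list(range(10)) + list(range(10, 0, -1))  # rises, then falls
    for method in (sliding_window, top_down, bottom_up):
        print(method.__name__, method(series, max_error=0.01))
```

The sketch recomputes every fit from scratch, so it is much slower than a careful implementation would be; it is only meant to show the control flow of each method.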

The results show that the top-down and bottom-up methods approximate the original time series better than the sliding window method, because the sliding window method lacks a global view of the data. However, top-down and bottom-up need the whole series in advance, so they cannot be used for online time series segmentation.

The article proposes a new algorithm, SWAB (Sliding Window And Bottom-up), which combines the two methods to provide a semi-global view of the time series data. The algorithm keeps a buffer large enough to hold several sliding-window segments and uses the bottom-up method to merge the data in the buffer. The article reports that SWAB can even achieve a better approximation than the bottom-up method.
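The idea behind SWAB can be sketched as a Python generator. This is a simplified illustration under stated assumptions, not the algorithm exactly as published: the names `swab`, `bottom_up`, `fit_error`, and the `buffer_size` parameter are hypothetical, the error measure is the sum of squared residuals, and the buffer is not kept within fixed size bounds as a fuller implementation would do.

```python
from itertools import islice


def fit_error(values):
    """Sum of squared residuals of a least-squares line fitted to values."""
    n = len(values)
    if n <= 2:
        return 0.0
    mx = (n - 1) / 2
    my = sum(values) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (v - my) for x, v in enumerate(values))
    slope = sxy / sxx
    return sum((v - (my + slope * (x - mx))) ** 2 for x, v in enumerate(values))


def bottom_up(values, max_error):
    """Merge two-point segments greedily; return half-open (start, end) pairs."""
    bounds = list(range(0, len(values), 2)) + [len(values)]
    while len(bounds) > 2:
        costs = [fit_error(values[bounds[i]:bounds[i + 2]])
                 for i in range(len(bounds) - 2)]
        i = min(range(len(costs)), key=costs.__getitem__)
        if costs[i] > max_error:
            break
        del bounds[i + 1]
    return list(zip(bounds[:-1], bounds[1:]))


def swab(stream, max_error, buffer_size=20):
    """Yield (start, end) segments online from an iterable of values."""
    it = iter(stream)

    def read_chunk():
        # Sliding-window step: read points until one line no longer fits them.
        # The point that breaks the fit stays in the chunk (a simplification).
        chunk = []
        for value in it:
            chunk.append(value)
            if fit_error(chunk) > max_error:
                break
        return chunk

    buf = list(islice(it, buffer_size))  # initial buffer
    offset = 0                           # stream index of buf[0]
    while buf:
        segments = bottom_up(buf, max_error)
        chunk = read_chunk()
        if not chunk:
            # Stream exhausted: flush every remaining segment.
            for lo, hi in segments:
                yield (offset + lo, offset + hi)
            return
        # Emit only the leftmost segment; the rest may still change.
        _, hi = segments[0]
        yield (offset, offset + hi)
        buf = buf[hi:] + chunk
        offset += hi


if __name__ == "__main__":
    series = list(range(10)) + list(range(10, 0, -1))
    print(list(swab(series, max_error=0.01, buffer_size=8)))
```

Only the leftmost segment is emitted on each pass, so the segments to its right can still be re-merged once more data has been read. That is what gives the method its semi-global view while it remains usable on a stream.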

Comments

Daniel Lemire said…
Here is a related paper you might enjoy:

Daniel Lemire, A Better Alternative to Piecewise Linear Time Series Segmentation, SIAM Data Mining 2007, 2007.

http://arxiv.org/abs/cs.DB/0605103

Cheers!