Skip to main content

Notes on algorithms(1) -- Finding Patterns in a File -- An example on Parallelized Programming

Notes on (Gregory R. Andrews).

The problem is to find a string pattern in a file, and print lines containing the patterns.

The Sequential program of the problem is as below:

string line;

read a line of input from stdin into line;

while(!EOF){

look for pattern in line;

if (pattern is in line)

write line;

read next line of input;

}

The point we can parallelize is inside the while loop, the pattern-finding step and the line-reading step. But here we use the same variable line, and we must eliminate it.

So here we induces a variable line2 to perform the reading operation, and variable line1 to perform the finding operation, another variable buffer is to communicate between line1 and line2. The program is as below:

string buffer;
bool done = false;

co #process 1
string line1;
while (true){
wait for buffer to be full or done to be true;
if (done) break;
line1 = buffer;
signal that buffer is empty;
look for pattern in line1;
if (pattern is in line1)
write line1;
}

//#process 2
string line2;
while(true){
read next line of input into line2;
if (EOF) {done = true; break;}
wait for buffer to be empty;
buffer = line2;
signal that buffer is full;
}
oc;

The program use signal - wait mechanism to ensure the use of buffer. The idea is that while reading the buffer, writing operation can't perform.

Comments

Popular posts from this blog

A simple implementation of DTW(Dynamic Time Warping) in C#/python

DTW(Dynamic Time Warping) is a very useful tools for time series analysis. This is a very simple (but not very efficient) c# implementation of DTW, the source code is available at  https://gist.github.com/1966342  . Use the program as below: double[] x = {9,3,1,5,1,2,0,1,0,2,2,8,1,7,0,6,4,4,5}; double[] y = {1,0,5,5,0,1,0,1,0,3,3,2,8,1,0,6,4,4,5}; SimpleDTW dtw = new SimpleDTW(x,y); dtw.calculateDTW(); The python implementation is available at  https://gist.github.com/3265694  . from python-dtw import Dtw import math dtw = Dtw([1, 2, 3, 4, 6], [1, 2, 3, 5],           distance_func=lambda x, y: math.fabs(x - y)) print dtw.calculate() #calculate the distance print dtw.get_path() #calculate the mapping path

Install mysql-python with mariadb

mysql-python requires libmysqlclient-dev in ubuntu, but the installation of mariadb will have the lib with unmet dependenccies, so the error of "mysql_config not found" may occurred if you install mysql-python via pip. The case is that mariadb has a compatible package, if you have the ppa setup as in  http://downloads.mariadb.org/ . Just "sudo apt-get install libmariadbclient-dev".

The default CREATE TABLE options for Aria Engine in mariadb

The official document of mariadb does not mention the default CREATE TABLE options for tables using Aria Engine.  The default options are list as below: TRANSACTIONAL,  the default value is TRANSACTIONAL=0, i.e., non-transactional. ROW_FORMAT, the default value is ROW_FORMAT=PAGE, which may suits both transactional and non-transactional tables. PAGE_CHECKSUM,  the default value will follow aria_page_checksum system variable, which has default value ON. For the TRANSACTIONAL option, you may consider create a table as below(and ALTER the TRANSACTIONAL=1): CREATE TABLE `test_table` ( `id` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`id`) ) ENGINE=Aria; If you change the ROW_FORMAT to DYNAMIC or FIXED, everything just goes fine. But if you have ALTER the table with TRANSACTION=1 and change the ROW_FORMAT to DYNAMIC or FIXED, you may got a warning: SHOW WARNINGS; +-------+------+----------------------------------------------------------+ | Level | Code | Message | +-------+--