Skip to main content

Notes on algorithms(1) -- Finding Patterns in a File -- An example on Parallelized Programming

Notes on (Gregory R. Andrews).

The problem is to find a string pattern in a file, and print lines containing the patterns.

The Sequential program of the problem is as below:

string line;

read a line of input from stdin into line;

while(!EOF){

look for pattern in line;

if (pattern is in line)

write line;

read next line of input;

}

The point we can parallelize is inside the while loop, the pattern-finding step and the line-reading step. But here we use the same variable line, and we must eliminate it.

So here we induces a variable line2 to perform the reading operation, and variable line1 to perform the finding operation, another variable buffer is to communicate between line1 and line2. The program is as below:

string buffer;
bool done = false;

co #process 1
string line1;
while (true){
wait for buffer to be full or done to be true;
if (done) break;
line1 = buffer;
signal that buffer is empty;
look for pattern in line1;
if (pattern is in line1)
write line1;
}

//#process 2
string line2;
while(true){
read next line of input into line2;
if (EOF) {done = true; break;}
wait for buffer to be empty;
buffer = line2;
signal that buffer is full;
}
oc;

The program use signal - wait mechanism to ensure the use of buffer. The idea is that while reading the buffer, writing operation can't perform.

Comments

Popular posts from this blog

A simple implementation of DTW(Dynamic Time Warping) in C#/python

DTW(Dynamic Time Warping) is a very useful tools for time series analysis. This is a very simple (but not very efficient) c# implementation of DTW, the source code is available at  https://gist.github.com/1966342  . Use the program as below: double[] x = {9,3,1,5,1,2,0,1,0,2,2,8,1,7,0,6,4,4,5}; double[] y = {1,0,5,5,0,1,0,1,0,3,3,2,8,1,0,6,4,4,5}; SimpleDTW dtw = new SimpleDTW(x,y); dtw.calculateDTW(); The python implementation is available at  https://gist.github.com/3265694  . from python-dtw import Dtw import math dtw = Dtw([1, 2, 3, 4, 6], [1, 2, 3, 5],           distance_func=lambda x, y: math.fabs(x - y)) print dtw.calculate() #calculate the distance print dtw.get_path() #calculate the mapping path

Change the default user when start a docker container

When run(start) a docker container from an image, we can specify the default user by passing -u option in command line(In https://docs.docker.com/engine/reference/run/#user ). For example docker run -i -t -u ubuntu ubuntu:latest /bin/bash We can also use the USER instruction in DOCKERFILE to do the same thing(In https://docs.docker.com/engine/reference/builder/#user), note that the option in command line will override the one in the DOCKERFILE. And there is actually another way to start a container with neither DOCKERFILE nor -u option, just by a command like: docker run -i -t ubuntu:latest /bin/bash # with ubuntu as the default user This happens when your start the container from an image committed by a container with ubuntu as the default user. Or in detail: Run a container from some basic images, create ubuntu user inside it, commit the container to CUSTOM_IMAGE:1 . Run a container from CUSTOM_IMAGE:1 with "-u ubuntu" option, and commit the container to CUSTOM...

Install mysql-python with mariadb

mysql-python requires libmysqlclient-dev in ubuntu, but the installation of mariadb will have the lib with unmet dependenccies, so the error of "mysql_config not found" may occurred if you install mysql-python via pip. The case is that mariadb has a compatible package, if you have the ppa setup as in  http://downloads.mariadb.org/ . Just "sudo apt-get install libmariadbclient-dev".