Tuesday, February 28, 2012

Parallel execution of distributed SVM using MPI (CoDLib)

That is the title of our (mine, my supervisor's and co-supervisor's) conference paper. Alhamdulillah, our names finally appear in IEEE Xplore. This paper was presented at the 2011 ICIMu International Conference. The journey was not easy. We took months to write it… and the conference fees are not cheap. Thank Allah, most of the effort in publishing this paper (fees, review, 'add-some-peppers') was made by my supervisor and co-supervisor. Yes, thanks also to my loyal proofreader friend, Nur Syahinaz. Her English (or anglais, in French) is great. Stay loyal, my dear. A lot more papers are coming.

http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?reload=true&arnumber=6122723

Okay. So what is this paper all about?

Support Vector Machine (SVM) is an efficient data mining approach for data classification. However, the SVM algorithm requires very large memory and computational time to deal with very large datasets. To reduce the computational time during SVM training, a combination of distributed and parallel computing methods, CoDLib, has been proposed. Instead of using a single machine for parallel computing, multiple machines in a cluster are used. The Message Passing Interface (MPI) is used for communication between the machines in the cluster. The original dataset is split and distributed to the respective machines. Experimental results show a great speed-up in training on the MNIST dataset, where training time is significantly reduced compared with standard LIBSVM without affecting the quality of the SVM.
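To make the "split and distribute" step concrete, here is a minimal sketch in plain Python of how a dataset might be partitioned into one chunk per machine, the way an MPI scatter would hand them out across the cluster. This is my own illustration, not the paper's actual CoDLib code, and the function name `split_dataset` is made up for this example.

```python
# Sketch (not the paper's code): partition a dataset into nearly equal
# chunks, one per machine, mimicking what an MPI scatter would distribute.

def split_dataset(samples, n_machines):
    """Split `samples` into n_machines chunks whose sizes differ by at most 1."""
    base, extra = divmod(len(samples), n_machines)
    chunks, start = [], 0
    for rank in range(n_machines):
        # The first `extra` machines each take one leftover sample.
        size = base + (1 if rank < extra else 0)
        chunks.append(samples[start:start + size])
        start += size
    return chunks

if __name__ == "__main__":
    dataset = list(range(10))  # stand-in for MNIST training samples
    for rank, chunk in enumerate(split_dataset(dataset, 4)):
        print(f"machine {rank}: {chunk}")
```

In a real MPI program each machine would then train a local SVM on its own chunk and the partial results would be combined, but the partitioning logic itself looks like the above.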

Too technical? Let me explain in a simpler way.

There is an approach to data classification called SVM. This method performs quite slowly when dealing with a large input dataset. Because of that, we proposed CoDLib, a combination of distributed and parallel computing methods. With this method, we used a cluster consisting of multiple computers. In the end, the training time showed a huge speed-up compared to the original sequential program.
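The "huge speed-up" can be quantified very simply: speed-up is the sequential training time divided by the parallel training time. The numbers below are made up purely for illustration and are not the paper's measurements.

```python
# Illustrative only: the timings here are invented, not the paper's results.

def speedup(t_sequential, t_parallel):
    """Classic speed-up ratio: how many times faster the parallel run is."""
    return t_sequential / t_parallel

if __name__ == "__main__":
    # Hypothetical example: a 400 s sequential run cut to 100 s on a cluster.
    print(speedup(400.0, 100.0))
```

Ideal scaling would give a speed-up equal to the number of machines; communication overhead between machines usually keeps real clusters somewhat below that.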

I am happy. Really happy. And of course my family will be so proud of me. This makes me want to write more conference papers in the future. I hope Allah will give me a great team again.
