Seminar: Building Scalable Big Data Pipelines: Graph Processing and Genome Assembly
Kisung Lee
Assistant Professor, LSU Division of Computer Science and Engineering
Friday, September 4, 2020
3:00 pm
Location: online-only
Abstract
The volume of real-world data in many domains is growing at an unprecedented rate. To address such big data challenges, I have been working on several research projects to design and build scalable techniques and frameworks. This presentation will focus on two distributed frameworks: one for scalable assembly of third-generation genome sequences and the other for scalable graph data processing using a NoSQL store.
I will first present a distributed genome assembly framework that speeds up the assembly of large-scale third-generation sequence datasets by using thousands of cores. The framework is built on the MapReduce computation model. I will then describe a distributed graph processing framework for iterative algorithms. The framework uses a disk-based NoSQL system to process big graph data in a scalable manner and improves overall performance through several optimization techniques.
Bio
Dr. Kisung Lee is an assistant professor in the Division of Computer Science and Engineering at Louisiana State University. He received his doctoral degree in computer science from the Georgia Institute of Technology in 2015. During his doctoral studies, he spent three summers at IBM Research T.J. Watson as a research intern. His research interests lie at the intersection of big data and distributed data-intensive systems. He is also working on research problems in spatial data management, social network analytics, and bioinformatics. He received the Tiger Athletic Foundation Undergraduate Teaching Award in 2020 and served as a Program Committee Vice-Chair for IEEE BigData 2018.