Title: Exploring Multicore Architectures For Streaming Applications
Author: Kulkarni, Rujuta (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Al-Ars, Zaid (mentor)
Degree granting institution: Delft University of Technology
Programme: Computer Engineering
Date: 2019-11-04

Abstract: The Smith-Waterman algorithm performs local alignment of biological sequences by calculating a similarity matrix. This process is computation-intensive. Because of the data dependencies in the algorithm, only the elements along a minor diagonal of the matrix can be calculated in parallel. In the past, CPUs, GPUs and FPGAs have been used to implement the Smith-Waterman algorithm. While GPUs offer better performance than FPGAs and are easier to program, they consume more power. FPGA implementations typically employ systolic arrays, which consist of processing elements connected in a regular manner through which data is streamed. Designing custom processing elements for an FPGA implementation requires considerable effort. In this thesis, we investigate alternative architectures that provide performance with a lower power profile and ease of programmability. We design a systolic array architecture built from general-purpose processors and map the Smith-Waterman algorithm onto it. The systolic array design uses scratchpad memories to store intermediate data. Since employing multiple processors is a common way to extract more performance nowadays, we compare our architecture with a multicore architecture. Simulation results show that the systolic array architecture promises more speedup than the multicore architecture, achieving a performance of up to 1.5 MCUPS (million cell updates per second) with 16 processing elements, which is 4 times faster than a 16-processor multicore architecture.
Moreover, the performance of the systolic array architecture scales well with an increasing number of processors compared to the multicore architecture. Mapping the Smith-Waterman (SW) algorithm onto the systolic array architecture takes only 100 lines of C, a standard and familiar language, written within 2 person-weeks. Our experience with mapping the algorithm onto the systolic array architecture shows that it could lead to a CUDA-like programming paradigm.

Subject: Multicore architecture; Systolic array; Smith Waterman
To reference this document use: http://resolver.tudelft.nl/uuid:3cbbd723-5fc8-481a-9dc3-263163589f0e
Part of collection: Student theses
Document type: master thesis
Rights: © 2019 Rujuta Kulkarni
Files: Thesis_Rujuta_Kulkarni.pdf (PDF, 3.28 MB)
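
The dependency pattern mentioned in the abstract (only cells on the same minor diagonal of the similarity matrix are independent) can be illustrated with a minimal C sketch. This is not the thesis code: the function name `sw_score`, the scoring values (match +2, mismatch -1) and the linear gap penalty of -1 are assumptions chosen for illustration. The outer loop walks the minor diagonals in order, and every iteration of the inner loop reads only cells from earlier diagonals, which is why those iterations could in principle be distributed over parallel processing elements.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative scoring scheme (assumed, not taken from the thesis). */
enum { MATCH = 2, MISMATCH = -1, GAP = -1 };

static int max2(int x, int y) { return x > y ? x : y; }

/* Returns the best local alignment score of a and b,
 * or -1 if the matrix cannot be allocated. */
int sw_score(const char *a, const char *b)
{
    size_t m = strlen(a), n = strlen(b);
    size_t cols = n + 1;
    /* calloc zeroes row 0 and column 0, which are the boundary values. */
    int *H = calloc((m + 1) * cols, sizeof *H);
    int best = 0;

    if (H == NULL)
        return -1;

    /* All cells with i + j == d lie on one minor diagonal. */
    for (size_t d = 2; d <= m + n; d++) {
        size_t i_lo = (d > n) ? d - n : 1;
        size_t i_hi = (d - 1 < m) ? d - 1 : m;

        /* Each iteration reads only cells from diagonals d-1 and d-2,
         * so the iterations of this loop do not depend on each other. */
        for (size_t i = i_lo; i <= i_hi; i++) {
            size_t j = d - i;
            int s = (a[i - 1] == b[j - 1]) ? MATCH : MISMATCH;
            int h = H[(i - 1) * cols + (j - 1)] + s;   /* diagonal neighbour */
            h = max2(h, H[(i - 1) * cols + j] + GAP);  /* upper neighbour */
            h = max2(h, H[i * cols + (j - 1)] + GAP);  /* left neighbour */
            h = max2(h, 0);                            /* local alignment floor */
            H[i * cols + j] = h;
            best = max2(best, h);
        }
    }

    free(H);
    return best;
}
```

Under these assumed scores, calling `sw_score("ACGT", "ACGT")` yields 8 (four matches), and two sequences with no characters in common yield 0. The sketch stores the full matrix for clarity; a streaming implementation would keep only the two preceding diagonals.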