Computer Architecture and SysTems Research Lab (CASTL)

Address: 114 Milton Carothers Hall (MCH), Tallahassee, FL 32306; Contact: Dr. Weikuan Yu (yuw@cs.fsu.edu), 850-644-5442

Tadoop: A Dual-Purpose Framework for Data Analytics and HPC

This project is supported by NSF award ACI-1432892, titled "EAGER: Tadoop: A Dual-Purpose Framework Taming the Bipolarity of Storage and Communication for High-Performance Computing and Data Analytics."

Contact: Dr. Weikuan Yu

Project Mission

High-performance computing (HPC) providers and applications need next-generation solutions to process big data from scientific simulations. Conventional HPC systems in national laboratories and universities are built on a compute-centric paradigm, while enterprise big data analytics applications prefer a data-centric paradigm such as MapReduce. The architectural differences between these two paradigms demand unconventional approaches. This project takes a radically different approach: it investigates key architectural components in the compute-centric and data-centric paradigms, designs a transformative dual-purpose framework called Tadoop that addresses their bipolarity in storage and communication management, and unifies them for both HPC and enterprise analytics applications.

Research Activities

The Tadoop framework is high-risk, but it can enable a transformative data infrastructure for both HPC and data analytics applications. Its broader impacts include demonstrating the transformation of existing HPC infrastructures into dual-purpose systems for computing and analytics, improving computer science curricula and instruction effectiveness, strengthening multidisciplinary data analytics research, releasing open-source software, and transferring technologies for commercial service.

Research Accomplishments


  1. Dr. Jianhui Yue
  1. Yandong Wang
  2. Cong Xu
  3. Zhuo Liu
  4. Fang Zhou
  5. Teng Wang
  6. Kevin Vasko
  7. Hai Pham


  1. Y. Wang, R. Goldstone, W. Yu, T. Wang. Characterization and Optimization of Memory-Resident MapReduce on HPC Systems. 28th IEEE International Parallel and Distributed Processing Symposium (Acceptance rate: 21%). Tucson, AZ. May 2014.
  2. C. Xu*, R. Goldstone, Z. Liu*, H. Chen*, B. Neitzel, W. Yu. Exploiting Analytics Shipping with Virtualized MapReduce on HPC Backend Storage Servers. IEEE Transactions on Parallel and Distributed Systems. DOI: 10.1109/TPDS.2015.2389262.


This work is funded in part by National Science Foundation awards ACI-1432892 (while at Auburn) and ACI-1561041 (while at FSU).

Get Source Code

If you would like a copy of our source code that enables virtualized analytics shipping on Lustre, please fill in a request via this form. An email with a link to the code will be sent to you.
