Publications

Selected Conference Publications

ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching. Accepted to the 51st IEEE/ACM International Symposium on Computer Architecture (ISCA ’24).
Time Series Prediction with Anomaly-Aware Recurrent Neural Networks. Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD), September 2022.
Integrating Security into the Big Data Ecosystem. IEEE Military Communications Conference (MilCom). 29 November-2 December 2021, San Diego, CA, USA.
CT-Net: Channel Tensorization Network for Video Classification. International Conference on Learning Representations (ICLR’21), 2021.
Lelantus: Fine-Granularity Copy-On-Write Operations for Secure Non-Volatile Memories. ACM/IEEE 47th International Symposium on Computer Architecture (ISCA), pp. 597-609. https://www.iscaconf.org/isca2020
Disperse Access Considered Energy Inefficiency in Intel Optane DC Persistent Memory Servers. The 40th IEEE International Conference on Distributed Computing Systems (ICDCS 2020). July 8 – 10, 2020, Singapore.
ArchSampler: Architecture-Aware Memory Sampling Library for In-Memory Applications. ICCD 2018: 258-265.
Sapprox: Enabling Efficient and Accurate Approximations on Sub-datasets with Distribution-aware Online Sampling. Accepted to VLDB ’17.
DataNet: A Data Distribution-aware Method for Sub-dataset Analysis On Distributed File Systems, Accepted to the 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS’16).
GreenMatch: Renewable-Aware Workload Scheduling for Massive Storage Systems, Accepted to the 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS’16).
Experiences in Using OS-level Virtualization for Block I/O. PDSW 2015: the 10th Parallel Data Storage Workshop, held in conjunction with SC15. Monday, November 16, 2015, Austin, TX.
Opass: Analysis and Optimization of Parallel Data Access on Distributed File Systems. In the 29th IEEE International Parallel & Distributed Processing Symposium. *This work is conducted at a PRObE staging cluster—128-node Marmot cluster, which is supported in part by the National Science Foundation under awards CNS-1042537 and CNS-1042543 (PRObE).
SLAM: Scalable Locality-Aware Middleware for I/O in Scientific Analysis and Visualization. The 23rd International Symposium on High Performance Distributed Computing (ACM HPDC2014). *This work is conducted at a PRObE staging cluster—128-node Marmot cluster, which is supported in part by the National Science Foundation under awards CNS-1042537 and CNS-1042543 (PRObE).
DL-MPI: Enabling Data Locality Computation for MPI-based Data-Intensive Applications. 2013 IEEE International Conference on Big Data (BigData 2013), Oct 6-9, 2013, Santa Clara, CA, USA. *This work is conducted at a PRObE staging cluster—128-node Marmot cluster, which is supported in part by the National Science Foundation under awards CNS-1042537 and CNS-1042543 (PRObE). http://www.nmc-probe.org/. See our projects using PRObE resources.
A Scalable Reverse Lookup Scheme using Group-based Shifted Declustering Layout, with Junyao Zhang, Pengju Shang. The 25th IEEE International Parallel & Distributed Processing Symposium May 16-20, 2011, Anchorage (Alaska) USA.
A Novel Power Management for CMP Systems in Data-intensive Environment, with Pengju Shang. The 25th IEEE International Parallel & Distributed Processing Symposium, May 16-20, 2011, Anchorage (Alaska) USA.
VisIO: Enabling Interactive Visualization of Ultra-Scale, Time Series Data via High-Bandwidth Distributed I/O Systems, with Christopher Mitchell, James Ahrens (LANL). The 25th IEEE International Parallel & Distributed Processing Symposium, May 16-20, 2011, Anchorage (Alaska) USA.
MRAP: A Novel MapReduce-based Framework to Support HPC Analytics Applications with Access Patterns, with Saba Sehrish, Grant Mackey and John Bent (LANL), the ACM High Performance Distributed Computing (ACM HPDC’10). June 2010, Chicago, IL, USA.
Bridging the Gap between Parallel File Systems and Local File Systems: A Case Study with PVFS, with Peng Gu and Robert Ross (ANL), the 37th International Conference on Parallel Processing (ICPP08). September 8–12, 2008, Portland, Oregon, USA.
Shifted Declustering: An Ideal-placement Layout Scheme for Multi-way Replication Storage Architecture, with Huijun Zhu and Peng Gu, the 22nd ACM International Conference on Supercomputing (ICS08). June 7–12, 2008, Island of Kos, Aegean Sea, Greece.
RIMAC: A Redundancy-based, Hierarchical Cache Architecture for Energy-efficient Storage Systems, with Xiaoyu Yao, the 1st ACM EuroSys conference (EuroSys2006), pp. 249-262, April 2006, Belgium.
Nexus: A Novel Weighted-Graph-Based Prefetching Algorithm for Metadata Servers in Petabyte-Scale Storage Systems, with Peng Gu, Yifeng Zhu and Hong Jiang, ACM/IEEE CCGrid2006, April 2006, Singapore.
Foreseer: A Novel, Locality-Aware Peer-to-peer System Architecture for Keyword Searches, with Hailong Cai, in Proceedings of ACM/IFIP/USENIX 5th International Middleware Conference (Middleware 2004), Oct. 19-23, 2004, Toronto, Ontario, Canada.
EERAID: Energy-efficient Redundant And Inexpensive Disk Array, with Dong Li, in Proceedings of 11th ACM SIGOPS European Workshop, September 20-22, 2004, Leuven, Belgium.
WOLF — A Novel Reordering Write Buffer for Log-structured File System, with Yiming Hu, in USENIX Conference Proceedings on File And Storage Technologies (FAST’02), pp 47-61, Jan. 28-30, 2002, Monterey, California.

Selected Journals

- MAR: A Novel Power Management for CMP Systems in Data Intensive Environment. IEEE Transactions on Computers, 2015. DOI: 10.1109/tc.2015.2458854.
- DRAW: A New Data-gRouping-AWare Data Placement Scheme for Data Intensive Applications with Interest Locality. Accepted by IEEE Transactions on Magnetics.
- Supporting HPC Analytics Applications with Access Patterns using Data Restructuring and Data-centric Scheduling Techniques. IEEE Transactions on Parallel and Distributed Systems, Vol. 24, No. 1, pp. 158-169, Jan. 2013. ISSN: 1045-9219.
- TRAID: Exploiting Temporal Redundancy and Spatial Redundancy to Boost Transaction Processing Systems Performance, with Pengju Shang and Saba Sehrish. IEEE Transactions on Computers, Vol. 61, No. 4, pp 517-530, April 2012.
- A New Placement-ideal Layout for Multi-way Replication Storage System, with Pengju Shang, Huijun Zhu and Peng Gu. IEEE Trans. Computers 60(8): 1142-1156 (2011).
- Nexus: A Novel Weighted-Graph-Based Prefetching Algorithm for Metadata Servers in Large Scale Storage Systems, with Peng Gu, Hong Jiang, Yifeng Zhu and Pengju Shang. IEEE Transactions on Computers, Vol. 59, No. 1, pp. 1-15, January 2010.
- A New Hierarchical Data Cache Architecture for iSCSI Storage Server, with Christopher Mitchell, Xiaoyu Yao and Peng Gu, IEEE Transactions on Computers, Vol. 58, No. 4, pp. 433-447, April 2009.
- Exploiting In-memory and On-disk Redundancy to Conserve Energy in Parity Disk Array, with Xiaoyu Yao, and Huijun Zhu, IEEE Transactions on Computers. Vol. 57, No. 6, pp. 733-747, June 2008.
- eRAID: Conserving Energy in Conventional Disk based RAID System, with Huijun Zhu, and Dong Li, IEEE Transactions on Computers. Vol. 57, No. 3, pp. 359-374, March 2008.
- HBA: Distributed Metadata Management for Large Cluster-based Storage Systems, with Yifeng Zhu, Hong Jiang, and Feng Xian, IEEE Transactions on Parallel and Distributed Systems. Vol. 19, No. 6, pp. 750-763, June 2008.
- Exploiting Geographical and Temporal Locality to Boost Search Efficiency in Peer-to-peer Systems, with Hailong Cai, IEEE Transactions on Parallel and Distributed Systems, Vol. 17, No. 10, pp: 1189-1203, Oct. 2006.
- A Novel Reorder Write Buffer to Boost Write Performance for Log-structured File System, with Yiming Hu, IEEE Transactions on Computers, Vol. 52, No. 12, pp 1559-1572, December 2003.
- UCFS — A User-space, High Performance, Custom File System for Web Proxy Servers, with Rui Min, Yingwu Zhu, and Yiming Hu, IEEE Transactions on Computers, Vol. 51, No. 9, pp: 1056-1073, Sep. 2002.
