Recent PDL Publications


The PDL Packet - Fall 2024 Newsletter

 

H4H: Hybrid Convolution-Transformer Architecture Search for NPU-CIM Heterogeneous Systems for AR/VR Applications

Yiwei Zhao, Jinhui Chen, Sai Qian Zhang, Syed Shakib Sarwar, Kleber Hugo Stangherlin, Jorge Tomas Gomez, Jae-Sun Seo, Barbara De Salvo, Chiao Liu, Phillip B. Gibbons, Ziyun Li

30th Asia and South Pacific Design Automation Conference (ASPDAC ’25), January 20–23, 2025, Tokyo, Japan.

Low-latency and low-power edge AI is crucial for Augmented/Virtual Reality applications. Recent advances demonstrate that hybrid models, combining convolution layers (CNN) and transformers (ViT), often achieve a superior accuracy/performance tradeoff on various computer vision and machine learning (ML) tasks. However, hybrid ML models can present system challenges for latency and energy efficiency due to their diverse nature in dataflow and memory access patterns. In this work, we leverage architecture heterogeneity from Neural Processing Units (NPU) and Compute-In-Memory (CIM) and explore diverse execution schemas for efficient hybrid model executions. [...more]

 

Can Increasing the Hit Ratio Hurt Cache Throughput?
BEST PAPER AWARD!

Ziyue Qiu, Juncheng Yang, Mor Harchol-Balter

EAI International Conference on Performance Evaluation Methodologies and Tools, December 12–13, 2024, Milan, Italy.

Software caches are an intrinsic component of almost every computer system. Consequently, caching algorithms, particularly eviction policies, are the topic of many papers. Almost all these prior papers evaluate the caching algorithm based on its hit ratio, namely the fraction of requests that are found in the cache, as opposed to disk. The "hit ratio" is viewed as a proxy for traditional performance metrics like system throughput or request latency. Intuitively it makes sense that higher hit ratio should lead to higher throughput (and lower request latency), since more requests are found in the cache (low access time) as opposed to the disk (high access time). [...more]

 

The Key to Effective UDF Optimization: Before Inlining, First Perform Outlining

Samuel Arch, Yuchen Liu, Todd Mowry, Jignesh Patel, Andrew Pavlo

Proceedings of the VLDB Endowment, Vol. 18, No. 1, December 2024.

Although user-defined functions (UDFs) are a popular way to augment SQL’s declarative approach with procedural code, the mismatch between programming paradigms creates a fundamental optimization challenge. UDF inlining automatically removes all UDF calls by replacing them with equivalent SQL subqueries. Although inlining leaves queries entirely in SQL (resulting in large performance gains), we observe that inlining the entire UDF often leads to sub-optimal performance. A better approach is to analyze the UDF, deconstruct it into smaller pieces, and inline only the pieces that help query optimization. [...more]

 

Morph: Efficient File-Lifetime Redundancy Management for Cluster File Systems

Timothy Kim, Sanjith Athlur, Saurabh Kadekodi, Francisco Maturana, Dax Delvira, Arif Merchant, Gregory R. Ganger, K. V. Rashmi

SOSP ’24, November 4–6, 2024, Austin, TX, USA.

Many data services tune and change redundancy configurations of files over their lifetimes to address changes in data temperature and latency requirements. Unfortunately, changing redundancy configs (transcode) is IO-intensive. The Morph cluster file system introduces new transcode-efficient redundancy schemes to minimize overheads as files progress through lifetime phases. [...more]

 

Reducing Cross-Cloud/Region Costs with the Auto-Configuring MACARON Cache

Hojin Park, Ziyue Qiu, Gregory R. Ganger, George Amvrosiadis

SOSP ’24, November 4–6, 2024, Austin, TX, USA.

An increasing demand for cross-cloud and cross-region data access is bringing forth challenges related to high data transfer costs and latency. In response, we introduce Macaron, an auto-configuring cache system designed to minimize cost for remote data access. A key insight behind Macaron is that cloud cache size is tied to cost, not hardware limits, shifting the way we think about cache design and eviction policies. [...more]

 


Recent PDL News

Zhihao Jia Named a 2025 Sloan Research Fellow

Congratulations to Zhihao Jia, who has been named a 2025 Sloan Research Fellow. The 126 scholars awarded this honor represent the most promising early-career scientists working today. Their achievements and potential place them among the next generation of scientific leaders in the U.S. and Canada. ...


Gauri Joshi Named 2025 Goldsmith Lecturer

The PDL, along with the IEEE Information Theory Society, is pleased to announce that Gauri Joshi has been named the 2025 Goldsmith Lecturer. The Goldsmith Lecturer is a woman, no more than ten years beyond having her highest degree conferred, selected for the quality of her research contributions...


Sophia Cao Wins ACM Student Research Competition at SOSP 2024!

Congratulations to Sophia on winning the ACM Student Research Competition at SOSP this year. Her research on "Possum: A Tail of Dynamic Flash Capacity for Sustainability" investigates managing flash storage density for improved performance and device endurance...
