:: Akasha: Breaking the I/O Wall in Hierarchical KV Cache Management Amey Agrawal*, Elton Pinto*, Souradeep Bera*, Sukrit Kumar, Irene Paul Stephen, Chus Antonanzas, Anirudha Agrawal, Aldinash Seitenov, Prakhar Jagwani, Anmol Agarwal, Haoran Qiu, Alexey Tumanov
:: Setu: A Global Tensor Exchange for Composable Transfers in Large-Scale ML Systems Elton Pinto*, Amey Agrawal*, Aldinash Seitenov, Anirudha Agrawal, Bhumika Chopra, Roshan Dathathri, Sadjad Fouladi, Alexey Tumanov
:: Revati: Transparent GPU-Free Time-Warp Emulation for LLM Serving Amey Agrawal*, Mayank Yadav*, Sukrit Kumar, Anirudha Agrawal, Garv Ghai, Souradeep Bera, Elton Pinto, Sirish Gambhira, Mohammad Adain, Kasra Sohrab, Chus Antonanzas, Alexey Tumanov [PDF]
:: On Evaluating Performance of LLM Inference Serving Systems Amey Agrawal, Nitin Kedia, Anmol Agarwal, Jayashree Mohan, Nipun Kwatra, Souvik Kundu, Ramachandran Ramjee, Alexey Tumanov [PDF]
:: No Request Left Behind: Tackling Heterogeneity in Long-Context LLM Inference with Medha Amey Agrawal, Haoran Qiu, Junda Chen, Íñigo Goiri, Chaojie Zhang, Rayyan Shahid, Ramachandran Ramjee, Alexey Tumanov, Esha Choukse [PDF]
:: Maya: Optimizing Deep Learning Training Workloads using Emulated Virtual Accelerators Srihas Yarlagadda*, Amey Agrawal*, Elton Pinto*, Hakesh Darapaneni, Mitali Meratwal, Shivam Mittal, Pranavi Bajjuri, Srinivas Sridharan, Alexey Tumanov » EuroSys'26 [PDF]
:: Inshrinkerator: Compressing Deep Learning Training Checkpoints via Dynamic Quantization Amey Agrawal, Sameer Reddy, Satwik Bhattamishra, Venkata Prabhakara Sarath Nookala, Vidushi Vashishth, Kexin Rong, Alexey Tumanov » SoCC'24 [PDF]
:: Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve Amey Agrawal, Nitin Kedia, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav S. Gulavani, Alexey Tumanov, and Ramachandran Ramjee » OSDI'24 [PDF] [Code] [Video]
:: Metron: Holistic Performance Evaluation Framework for LLM Inference Systems Amey Agrawal*, Anmol Agarwal*, Nitin Kedia, Jayashree Mohan, Souvik Kundu, Nipun Kwatra, Ramachandran Ramjee, Alexey Tumanov [PDF] [Code]
:: Vidur: A Large Scale Simulation Framework For LLM Inference Amey Agrawal, Nitin Kedia, Jayashree Mohan, Ashish Panwar, Nipun Kwatra, Bhargav S. Gulavani, Ramachandran Ramjee, and Alexey Tumanov » MLSys'24 [PDF] [Code] [Video]
:: Sarathi: Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills Amey Agrawal, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav S. Gulavani and Ramachandran Ramjee [PDF]
:: Singularity: Planet-Scale, Preemptible and Elastic Scheduling of AI Workloads Singularity Team, Microsoft [PDF]
:: Learning Digital Circuits: A Journey Through Weight Invariant Self-Pruning Neural Networks Amey Agrawal and Rohit Karlupia » NeurIPS 2019 Workshop, SNN 2021 Workshop [PDF] [Code]
:: Delog: A Privacy Preserving Log Filtering Framework for Online Compute Platforms Amey Agrawal, Abhishek Dixit, Namrata Shettar, Darshil Kapadia, Rohit Karlupia, Vikram Agrawal, and Rajat Gupta » IEEE Big Data 2019 [PDF]
:: Logan: A Distributed Online Log Parser Amey Agrawal, Rajat Gupta, and Rohit Karlupia » ICDE 2019 [PDF] [Website]