CV | Jiaming (Jamin) Cheng

Contact Information

Name	Jiaming Cheng
Professional Title	Efficient ML for AIoT and On-Device LLMs
Email	jiaming@jiamingcheng.me

Research Interests

Efficient on-device LLM inference and compression for edge/AIoT at the algorithm and software level: structured pruning, low-bit quantization, knowledge distillation, and reproducible deployment benchmarks for large language models under tight compute, memory, and energy budgets.

Education

2020 - 2024

Columbus, Ohio, USA
B.S., cum laude

The Ohio State University

Computer and Information Science
- GPA: 3.51/4.00

Research Experience

2024 - present

Columbus, Ohio, USA
Research Volunteer, advised by Profs. Rajiv Ramnath and Brijesh Soni

The Ohio State University

Efficient Machine Learning & Model Compression for Edge / AIoT.
- Joined the group through a capstone project and investigated efficient ML and model compression for edge/AIoT (vision-model structured pruning and on-device LLM inference) across four papers: two published, one to appear, and one under review.
- As primary code author, designed and implemented the pruning methods in EPIC and SPICE (an L2 extension of DepGraph and a Taylor+L2+KD hybrid, TaLK); trained models on Ohio Supercomputer Center GPUs and deployed them to Raspberry Pi.
- Built a phase-wise on-device LLM inference benchmark on GPU, CPU, and Raspberry Pi, comparing standard Transformers with a sub-quadratic (Qwen3.5 GatedDeltaNet) architecture.
- Authored the experimental sections and produced the analysis and figures; onboarded new members onto the codebase and edge pipeline with deployment scripts and memos.

Publications

2025

EPIC: Efficient Pruning for Inference on Constrained Devices

Practice and Experience in Advanced Research Computing (PEARC '25)

2026

SPICE: Structured Pruning for Inference on Constrained Edge Devices

IEEE Consumer Communications & Networking Conference (CCNC)

2026

Phase-Wise Analysis of LLM Inference Acceleration on GPU, CPU, and Edge Device

Practice and Experience in Advanced Research Computing (PEARC '26), to appear

2026

Large Models for Small Devices: Recent Advances and Empirical Analysis of Edge AI Deployment

ACM Transactions on Internet of Things (TIOT), under review

Skills

Programming: Python, Rust, TypeScript

ML & Systems: PyTorch, CUDA, llama.cpp, torch-pruning, bitsandbytes, FlashAttention

Research Engineering: uv, Pydantic, Ruff, ty, Weights & Biases, Typer

Infrastructure: Slurm, Docker, Kubernetes, Ceph, Terraform, Ansible
