CV

Curriculum vitae.

Contact Information

Name: Jiaming Cheng
Professional Title: Efficient ML for AIoT and On-Device LLMs
Email: jiaming@jiamingcheng.me

Research Interests

Efficient on-device LLM inference and compression for edge/AIoT at the algorithm and software level: structured pruning, low-bit quantization, knowledge distillation, and reproducible deployment benchmarks for large language models under tight compute, memory, and energy budgets.

Education

  • 2020 - 2024

    Columbus, Ohio, USA

    B.S., cum laude
    The Ohio State University
    Computer and Information Science
    • GPA: 3.51/4.00

Research Experience

  • 2024 - present

    Columbus, Ohio, USA

    Researcher (advised by Prof. Rajiv Ramnath and Prof. Brijesh Soni)
    The Ohio State University
    Efficient ML and model compression for edge/AIoT, spanning vision-model structured pruning and on-device LLM inference, across four papers (two published, one to appear, one under review).
    • Designed and implemented the pruning methods in EPIC and SPICE (an L2 extension of DepGraph and TaLK, a Taylor+L2+KD hybrid) as primary code author; trained models on the Ohio Supercomputer Center and deployed them to Raspberry Pi.
    • Built a phase-wise on-device LLM inference benchmark on GPU, CPU, and Raspberry Pi, comparing standard Transformers with a sub-quadratic (Qwen3.5 GatedDeltaNet) architecture.
    • Authored the experimental sections and produced the analysis and figures; onboarded new members onto the codebase and edge pipeline.

Publications

  • 2025
    EPIC: Efficient Pruning for Inference on Constrained Devices
    Practice and Experience in Advanced Research Computing (PEARC '25)
  • 2026
    SPICE: Structured Pruning for Inference on Constrained Edge Devices
    IEEE Consumer Communications & Networking Conference (CCNC)
  • 2026
    Phase-Wise Analysis of LLM Inference Acceleration on GPU, CPU, and Edge Device
    Practice and Experience in Advanced Research Computing (PEARC '26), to appear
  • 2026
    An Empirical Survey of AI Model Compression Techniques for Edge Deployments
    IEEE Internet of Things Journal (under review)

Skills

Programming: Python, Rust, TypeScript
ML & Systems: PyTorch, CUDA, llama.cpp, torch-pruning, bitsandbytes, FlashAttention
Research Engineering: uv, Pydantic, Ruff, ty, Weights & Biases, Typer
Infrastructure: Slurm, Docker, Kubernetes, Ceph, Terraform, Ansible