About

Hi, I’m Chris Zhang.

Engineer. Researcher. Builder of strange and smart things.

I work at the crossroads of AI systems, cybersecurity, and machine learning infrastructure — exploring how large models can think faster, act smarter, and run anywhere from cloud clusters to mobile chips. My curiosity usually leads me where performance meets intelligence.


Research & Technical Interests

I’m fascinated by how intelligence scales across hardware and software. My current work touches on:

  • LLM fine-tuning & inference optimization — quantization, mixed-precision, and compiler-level scheduling for ARM and mobile GPUs.
  • Agentic & multimodal AI — agents that can see, reason, and act, combining visual, textual, and behavioral understanding.
  • Retrieval-Augmented Generation (RAG) — designing systems that retrieve and reason with external memory intelligently.
  • Heterogeneous computing — balancing workloads between CPU, GPU, and neural accelerators for transformer inference.

Projects & Experiments

Here are a few recent things I’ve built or explored:

On-Device LLM Inference: Optimizing llama.cpp on ARM SoCs to run large models efficiently using GPU offloading and memory-aware scheduling.

Agentic AI for Threat Investigation: Built a multi-agent framework that performs autonomous phishing analysis, combining LLM reasoning, Playwright-driven browsing, and network forensics.

Security Simulation Platform: Created a controlled environment that mirrors real login portals pixel-for-pixel to test how AI-driven phishing detection systems respond to 0-day lookalikes.


Philosophy

I believe the frontier of AI isn’t just about bigger models — it’s about smarter systems: ones that adapt, interact, and make efficient use of the world’s messy hardware and data. My work lives in that tension between precision and chaos, where science meets engineering and curiosity refuses to stay in its lane.


Currently Exploring

  • Efficient fine-tuning for multimodal transformers
  • Adaptive GPU pipelines for inference scheduling
  • Lightweight agents for mobile and edge inference

Connect

I’m always open to conversations about AI for cybersecurity, LLM systems, AI agents, or performance engineering. Find me between experiments, probably with too many terminals open.