Ka Lok Ng

Independent Researcher

Data Analyst

Gastronomist

When I was in my final year at HKU, my team and I won first runner-up at a hackathon hosted by Microsoft Hong Kong. We built a prototype Q&A tool with named entity recognition (NER), powered by a vanilla Transformer, to help legal professionals search and query case information more efficiently. At the time, it was definitely an experimental toy, but looking back, it’s kind of wild how practical (and common) this sort of thing has become.

We won a Microsoft Surface laptop, which I ended up giving to my dad as my first “proper” gift to him (and honestly, it was a fun machine to play with too). But the most rewarding part wasn’t the prize—it was seeing, firsthand, how powerful the Transformer architecture could be. This was before ChatGPT, Gemini, SAM, and all the other Transformer-based tools that are everywhere now. Back then, most people hadn’t even heard of GPT, attention, or any of the buzzwords that later became shorthand for “AI.”

In those days, deep learning still felt a lot like alchemy—and in many ways, it still does. But that win really stuck with me, and it pushed me to dive deeper into the theory behind modern AI.

Recently, though, I’ve started to feel like I finally have a more complete mental model, one that lets me see deep learning as a serious, structured science rather than just trial-and-error magic. A big part of that shift came from a series of works by Professor Yi Ma’s team. That’s also why I started this blog: to organize my thoughts, practice writing, and document my research journey as I go.

news

Apr 01, 2026	Project: Resemblance of Cross Attention like Operator with Conditional GMM Denoiser
Mar 09, 2026	Project: Whitebox Transformer Implementation is published

latest posts

May 22, 2026	Temporal Straightening for Latent Planning in Control Theory Language
Mar 03, 2026	Compression via denoising
Jan 28, 2026	Optimization from basics to ISTA