andy's blog

hi! this is my corner of the internet...

i'm a machine learning engineer and cs undergrad with a deep passion for ML research and infra/systems.

i'm currently diving into post-training of LLMs, CUDA programming and inference optimization.

research and engineering interests:

optimizing tokenization compression rates in multilingual NLP and semantic tokenizer modeling.
exploring the trade-offs between hierarchical softmax and full softmax in block-sparse attention.
CUDA programming and fast ML kernels.
optimizing language bucketing via data-driven clustering of language distributions for multilingual tokenizer modeling.
fast and efficient distributed systems.

currently reading: Rasbt's Build a Large Language Model (From Scratch)

things i'm into:

thanks for dropping by!

social handles / contacts 👇🏻