hi! this is my corner of the internet...
i'm a machine learning engineer and cs undergrad with a deep passion for ML research and infra/systems.
i'm currently diving into post-training of LLMs, CUDA programming and inference optimization.
research and engineering interests:
- optimizing tokenization compression rates in multilingual NLP and semantic tokenizer modeling.
- exploring the trade-offs between hierarchical softmax and full softmax in block-sparse attention.
- cuda programming and fast ml kernels.
- optimizing language bucketing for multilingual tokenizers via data-driven clustering of language distributions.
- fast and efficient distributed systems.
currently reading: rasbt's Build a Large Language Model (From Scratch)
things i'm into:
- math, ml and computers in general.
- writing cuda kernels to make gpus go brrr...
- reading ML papers and implementing small prototypes from scratch.
- reading one piece and binging anime in my free time (just soothes my soul ☺️).
thanks for dropping by!
social handles / contacts 👇🏻
- X / Twitter
- DeepML [i solve ml problems here]
- consider buying me a coffee to support more open research!