🏠 Home
I am documenting and posting my learning, reading, and coding notes here. Half of these notes are drawn from
my personal experiences and insights in ML, security, and coding. I have developed a habit of systematically
taking notes whenever I learn a new body of knowledge. This practice has helped me a lot in quickly
understanding a new field, capturing key points, and facilitating future review. Previously, I kept these
notes private, but I am now reorganizing and sharing them publicly. I hope they can be helpful to a broader
audience interested in these topics.
I also consider it important to maintain a paper list for each research topic I am working on. By reading,
summarizing, and categorizing these papers, I can gain a comprehensive understanding of the state of the art,
identify its limitations, and uncover new opportunities. With the support of my excellent students and
collaborators, my group maintains multiple paper lists related to our research interests and ongoing projects.
We have also decided to share these lists publicly to benefit the broader community. I hope they will be
useful for your research as well~
- Machine learning basics (Jan. 2025)
    - This post summarizes the basics of machine learning, drawn from the following three books [1, 2, 3].
      It contains the minimum background needed for deep learning, along with some basic statistical
      learning concepts and knowledge worth knowing.
      It helps build an ML foundation from a statistical point of view, shows how widely used ML
      modeling and training techniques are derived (e.g., weight decay is equivalent to placing a
      Gaussian prior on the weights), and clarifies some common confusions (e.g., model uncertainty).
- Multilayer perceptron (MLP) (Feb. 2025)
    - This post summarizes almost everything you need to know about the MLP, the most basic deep learning
      architecture.
      It contains material you probably cannot find elsewhere, e.g., deriving the backward pass of the
      whole model, how the computation changes under tensor parallelism, and training tricks summarized
      from my own experience.
- Transformers (Mar. 2025)
    - This post introduces the basics and some advanced topics of transformers,
      including the original transformer model, BERT, GPT, calculating the backward pass, parallel
      training, prompting, efficient inference, etc.
      It can serve as a good starting point for those who want to understand the internal mechanisms
      of transformers and find research topics related to LLMs.
- LLM safety and security: taxonomy, status, and future (July 2024)
- This document categorizes and summarizes existing papers on the security and safety threats/risks of
LLMs, including benchmark evaluation and individual attack and defense techniques.
- AI agent security: taxonomy, status, and future (Dec. 2024)
- This post categorizes and summarizes existing works on the security threats of LLM-enabled agent
systems, including benchmark evaluation and individual attack and defense techniques.
- All you need to know about Code LLM: Datasets, foundation models,
fine-tuning, reasoning, and agents (Feb. 2025)
- This post categorizes and summarizes recent papers on the whole pipeline of code LLMs, including
pre-training and fine-tuning datasets, code foundation models, fine-tuning techniques, and agents.
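One concrete point from the machine learning notes above, that weight decay is equivalent to placing a Gaussian prior on the weights, admits a short standard derivation. A sketch (here $\mathcal{D}$ denotes the training data and $\sigma^2$ the prior variance):

```latex
% MAP estimation with an isotropic Gaussian prior w ~ N(0, sigma^2 I),
% whose log-density is  log p(w) = -\frac{1}{2\sigma^2}\lVert w \rVert_2^2 + \text{const}:
\hat{w}_{\mathrm{MAP}}
  = \arg\max_{w}\; \log p(\mathcal{D} \mid w) + \log p(w)
  = \arg\min_{w}\; -\log p(\mathcal{D} \mid w) + \frac{1}{2\sigma^2}\,\lVert w \rVert_2^2 ,
% i.e., the usual training loss (negative log-likelihood) plus an L2
% weight-decay penalty with coefficient \lambda = 1/(2\sigma^2).
```

The same argument with a Laplace prior on the weights yields L1 regularization instead.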