I am documenting and posting my learning, reading, and coding notes. Half of these notes are about my personal experiences and insights in ML, security, and coding. I have developed a habit of systematically taking notes whenever I learn a new body of knowledge. This practice has helped me a lot in quickly understanding a new field, capturing key points, and facilitating future review. Previously, I kept these notes private, but I am now reorganizing and sharing them publicly. I hope they can be helpful to a broader audience interested in these topics.
I also consider it important to maintain a paper list for each research topic I am working on. By reading, summarizing, and categorizing these papers, I can gain a comprehensive understanding of the state of the art, identify its limitations, and uncover new opportunities. With the support of my excellent students and collaborators, my group maintains multiple paper lists related to our research interests and ongoing projects. We have also decided to share these lists publicly to benefit the broader community. I hope they will be useful for your research as well~
- Machine learning basics (Jan. 2025)
- This post summarizes the basics of machine learning, drawn from three books [1, 2, 3]. It contains the minimum background needed for deep learning, along with some basic statistical learning concepts that are worth knowing.
It helps build an ML foundation from a statistical point of view, explains how widely used ML modeling and training techniques are derived (e.g., weight decay is equivalent to placing a Gaussian prior on the weights; see the short derivation below), and clarifies some common points of confusion (e.g., model uncertainty).
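To make the weight-decay example concrete, here is a minimal sketch of the equivalence, assuming a zero-mean isotropic Gaussian prior $p(w) = \mathcal{N}(0, \sigma^2 I)$ over the weights and MAP estimation:

```latex
% MAP estimation with a Gaussian prior on the weights recovers weight decay.
% Assumption: p(w) = N(0, sigma^2 I), so log p(w) = -||w||^2 / (2 sigma^2) + const.
\hat{w}_{\mathrm{MAP}}
  = \arg\max_w \big[ \log p(\mathcal{D} \mid w) + \log p(w) \big]
  = \arg\min_w \Big[ \underbrace{-\log p(\mathcal{D} \mid w)}_{\text{data-fit loss}}
    + \frac{1}{2\sigma^2} \lVert w \rVert_2^2 \Big]
```

That is, the L2 penalty coefficient is $\lambda = 1/(2\sigma^2)$: a tighter prior (smaller $\sigma$) corresponds to stronger weight decay.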
- Multilayer perceptron (MLP) (Feb. 2025)
- This post summarizes almost everything you need to know about the MLP, the most basic deep learning architecture.
It contains material you probably cannot find elsewhere, e.g., deriving the backward pass of the whole model, both in the standard setting and under tensor parallelism, as well as training tricks drawn from my own experience (a minimal backward-pass sketch follows below).
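As a taste of the backward-pass material, here is a minimal sketch, not the post's full derivation, of manual backpropagation through a one-hidden-layer MLP; the shapes and the mean-squared-error loss are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 16))          # batch of 32 examples, 16 features (made-up shapes)
y = rng.normal(size=(32, 4))           # regression targets
W1 = 0.1 * rng.normal(size=(16, 64))   # hidden-layer weights
W2 = 0.1 * rng.normal(size=(64, 4))    # output-layer weights

# Forward pass.
h_pre = x @ W1                   # pre-activation, shape (32, 64)
h = np.maximum(h_pre, 0.0)       # ReLU
y_hat = h @ W2                   # predictions, shape (32, 4)
loss = ((y_hat - y) ** 2).mean()

# Backward pass: the chain rule applied layer by layer, from the loss back to W1.
d_y_hat = 2.0 * (y_hat - y) / y_hat.size   # dL/dy_hat for the MSE loss
d_W2 = h.T @ d_y_hat                       # dL/dW2
d_h = d_y_hat @ W2.T                       # dL/dh
d_h_pre = d_h * (h_pre > 0)                # ReLU passes gradient only where h_pre > 0
d_W1 = x.T @ d_h_pre                       # dL/dW1
```

Under tensor parallelism the same chain rule applies, but each device holds a shard of W1/W2 and the partial gradients are combined across devices.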
- Transformers (Mar. 2025)
- This post introduces the basics and some advanced topics of transformers,
including the original transformer model, BERT, GPT, calculating the backward pass, parallel training, prompting, efficient inference, etc.
It could serve as a good starting point for those who want to understand the internal mechanisms of transformers and identify research topics around LLMs (a minimal attention sketch follows below).
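As a pointer to what the post covers, here is a minimal sketch of single-head scaled dot-product self-attention, the core transformer operation; the shapes and random weights are illustrative assumptions, and real implementations add masking, multiple heads, and output projections:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """x: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_head)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])                # (seq_len, seq_len)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                     # row-wise softmax
    return w @ v                                           # (seq_len, d_head)

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 32))                              # 10 tokens, d_model = 32
Wq, Wk, Wv = (rng.normal(size=(32, 8)) for _ in range(3))  # d_head = 8
out = self_attention(x, Wq, Wk, Wv)                        # (10, 8)
```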
- LLM safety and security: taxonomy, status, and future (July 2024)
- This post categorizes and summarizes existing papers on the security and safety threats/risks of LLMs, including benchmark evaluation and individual attack and defense techniques.
- AI agent security: taxonomy, status, and future (Dec. 2024)
- This post categorizes and summarizes existing works on the security threats of LLM-enabled agent systems, including benchmark evaluation and individual attack and defense techniques.
- All you need to know about code LLMs: Datasets, foundation models, fine-tuning, reasoning, and agents (Feb. 2025)
- This post categorizes and summarizes recent papers on the whole pipeline of code LLMs, including pre-training and fine-tuning datasets, code foundation models, fine-tuning techniques, and agents.