Yansheng Mao’s Website

Biography

Yansheng Mao is an undergraduate at Peking University, China (expected to graduate in 2026). He is in the Zhi Class, a special class for outstanding students majoring in artificial intelligence. He studies large language models, especially reasoning, as an intern in Muhan Zhang’s research group, GraphPKU. He was awarded the 2024 Zhi Class scholarship.

Research Interests

Large language model reasoning. While LLMs appear intelligent when answering some questions, they still make reasoning mistakes on trivial but fundamental tasks, and they fail at complex reasoning such as proving mathematical theorems (though there has recently been significant progress, led by models like GPT-4o). I am studying long-context reasoning, a subfield of LLM reasoning.

LLM architecture. The Transformer architecture dominates nearly all AI fields due to its robustness. However, one of its issues is its quadratic complexity (in both memory and time) with respect to the sequence length, which hinders training and inference on long texts with LLMs. Many works try to develop subquadratic architectures such as Mamba, while others try to boost the performance of smaller models.
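The quadratic cost mentioned above comes from the self-attention score matrix, whose size grows as the square of the sequence length. A minimal NumPy sketch (function and variable names are illustrative, not from any particular codebase) makes this concrete:

```python
import numpy as np

def attention_scores(seq_len: int, d_model: int = 64) -> np.ndarray:
    """Toy self-attention score computation illustrating the
    O(n^2) cost in sequence length n."""
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((seq_len, d_model))  # queries
    K = rng.standard_normal((seq_len, d_model))  # keys
    # The score matrix is (seq_len x seq_len): both the memory to
    # store it and the time to compute it grow quadratically in n.
    return Q @ K.T / np.sqrt(d_model)

print(attention_scores(128).shape)   # (128, 128)
print(attention_scores(1024).shape)  # (1024, 1024) -- 64x the entries
```

Doubling the sequence length quadruples the number of score entries, which is why long-context training and inference become expensive and why subquadratic alternatives are an active research direction.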