【BAYES COFFEE HOUSE TECH TALK SERIES】Leveraging Large Language Models for Formal Mathematical Theorem Proving: Data Synthesis and Chain-of-Thought Annotation in the DeepSeek-Prover Series
While Large Language Models (LLMs) excel at knowledge-intensive reasoning tasks, they face significant challenges in formal mathematical theorem proving owing to data scarcity and strict requirements for logical precision. This talk introduces the DeepSeek-Prover series, highlighting how automated dataset construction and reasoning annotation transfer knowledge from data-rich to data-scarce domains, achieving state-of-the-art results in formal proving. Formal theorem proving will also be discussed as an ideal benchmark for evaluating the rigorous reasoning capabilities of language models.
Title: Leveraging Large Language Models for Formal Mathematical Theorem Proving: Data Synthesis and Chain-of-Thought Annotation in the DeepSeek-Prover Series
Large language models (LLMs) excel at information retrieval, knowledge memorization, and reasoning, yet they face challenges in domains with scarce data and long-tail distributions, such as formal mathematical theorem proving. In this domain, LLMs must generate rigorous, verifiable proof scripts for interactive theorem-proving systems. Challenges stem from limited natural-language corpora, the demand for logical precision, and strict accuracy requirements. In developing DeepSeek-Prover V1, we automated large-scale translation of natural-language mathematical problems into formal propositions and synthesized corresponding proofs. After rigorous multi-level filtering, fine-tuning on this dataset achieved state-of-the-art results, effectively distilling knowledge from data-rich to data-scarce parallel domains. DeepSeek-Prover V1.5 further introduced automatic annotation of chain-of-thought reasoning within proof code data, enabling smaller models to adopt explicit reasoning strategies that significantly improve long-reasoning tasks. This talk will also introduce formal mathematical theorem proving as an ideal benchmark for evaluating and training language models’ capabilities in rigorous reasoning.
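To make the autoformalization task concrete, the sketch below shows the kind of translation the abstract describes: a natural-language statement rendered as a formal proposition with a machine-checkable proof in Lean 4 (the proof assistant targeted by the DeepSeek-Prover models). This is an illustrative example assuming Mathlib is available; it is not taken from the talk or the DeepSeek-Prover datasets.

```lean
import Mathlib

-- Natural-language problem: "Show that the sum of two odd integers is even."
-- One possible formalization and proof script:
theorem odd_add_odd (a b : ℤ) (ha : Odd a) (hb : Odd b) : Even (a + b) := by
  -- Unpack the witnesses: a = 2m + 1 and b = 2n + 1
  obtain ⟨m, hm⟩ := ha
  obtain ⟨n, hn⟩ := hb
  -- Exhibit a witness for evenness: a + b = (m + n + 1) + (m + n + 1)
  exact ⟨m + n + 1, by omega⟩
```

A verifier such as the Lean kernel accepts or rejects each candidate script outright, which is what makes this setting both data-scarce (correct scripts are rare) and attractive as a training signal: every synthesized proof can be filtered automatically for correctness before fine-tuning.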
Bio:
Huajian Xin is currently a PhD student at the University of Edinburgh, supervised by Professor Wenda Li. His research focuses on enabling large language models (LLMs) to perform formal mathematical theorem proving, with a particular emphasis on autoformalization tasks that bridge natural language and formal language, as well as agent tasks that leverage computer algebra systems and other tools. He is also interning with the ByteDance Doubao/Seed Team and previously interned at DeepSeek, where he led the research on the DeepSeek-Prover series of models. For more information, please visit his personal homepage at https://xinhuajian.wordpress.com.