Improving Reasoning Abilities of Large Language Models through Imperfect Synthetic Data
- 2025-04-07 (Mon.), 10:30 AM
- Auditorium B1, Institute of Statistical Science; tea reception at 10:10 AM.
- Held in person and simultaneously online via video conferencing.
- Dr. Da-Shan Shiu (許大山)
- MediaTek
Abstract
In this presentation, we will discuss our recent publication, "RL-STaR: Theoretical Analysis of Reinforcement Learning Frameworks for Self-Taught Reasoner." Our findings indicate that synthetic data need not be completely accurate: errors in intermediate reasoning steps are acceptable, as large language models (LLMs) can still learn the correct answers over time. In parallel with the theoretical treatment, we reflect on the use of synthetic data in LLM-based assistants for programming in general and for Register Transfer Level (RTL) design in particular.
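For readers unfamiliar with the self-taught reasoner (STaR) setup that RL-STaR analyzes, the sketch below illustrates the basic loop as published in the original STaR recipe: sample chains of thought, keep only those whose final answer matches the gold label, and fine-tune on the kept traces. Because intermediate steps are never checked, imperfect reasoning can enter the training set, which is the regime the talk's analysis addresses. This is a minimal illustration under those assumptions; the callables and names are hypothetical stand-ins, not APIs from the talk or the paper.

```python
from typing import Callable, List, Tuple

# (rationale text, final answer) produced by one sampled chain of thought
Rationale = Tuple[str, str]


def star_round(
    sample: Callable[[str], Rationale],  # hypothetical: draw one rationale + answer from the model
    fine_tune: Callable[[List[Tuple[str, str, str]]], None],  # hypothetical: update the model on kept traces
    problems: List[str],
    gold_answers: List[str],
    n_samples: int = 4,
) -> None:
    """One STaR-style round: keep sampled rationales whose FINAL answer is
    correct; intermediate reasoning steps are never verified, so flawed
    steps can survive into the fine-tuning set."""
    kept: List[Tuple[str, str, str]] = []
    for problem, gold in zip(problems, gold_answers):
        for _ in range(n_samples):
            rationale, prediction = sample(problem)
            # Filtering is on the final answer only.
            if prediction.strip() == gold.strip():
                kept.append((problem, rationale, gold))
    fine_tune(kept)
```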