process-cxr

🎯 Focusing

SeptRan process-cxr

Chen Xinran. Undergraduate at SJTU; graduate student at UCAS / ISCAS. Research focus on post-training; Baidu AMU Team.

17 followers · 43 following

Pinned

Self-KDRL Public

Forked from lasgroup/SDPO

Reinforcement Learning via Self-Distillation (SDPO)

Python

rllm Public

Forked from rllm-org/rllm

Democratizing Reinforcement Learning for LLMs

Python

verl-recipe-opkd Public

Forked from verl-project/verl-recipe

A set of examples based on verl for end-to-end RL training recipes.

Python

verl-upstream Public

Forked from verl-project/verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python

ai-library Public template

Forked from jackyzha0/quartz

🍊 My personal AI learning notes on LLMs, reinforcement learning, and machine learning.

TypeScript 4

lasgroup/SDPO Public

Reinforcement Learning via Self-Distillation (SDPO)

Python · 946 stars · 106 forks