oh-lab/CPQL

CPQL: Peng's Q(λ) for Conservative Value Estimation in Offline Reinforcement Learning

arXiv

🧵 This paper introduces CPQL (Conservative Peng's Q($\lambda$)), which mitigates overly pessimistic value estimation, achieves performance greater than or equal to that of the behavior policy, and provides near-optimal performance guarantees. This codebase is heavily inspired by CORL, an offline RL codebase.
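For readers unfamiliar with Peng's Q($\lambda$), the sketch below shows the standard textbook recursion for its multi-step target along a single trajectory: $G_t = r_t + \gamma\,[(1-\lambda)\,V(s_{t+1}) + \lambda\,G_{t+1}]$, where $V(s_{t+1})$ is a bootstrap value such as $\max_a Q(s_{t+1}, a)$. It is an illustration only, not the code in `algorithms/cpql.py`, and it leaves out the conservative part of CPQL. The function name, argument names, and default values are assumptions made for this example.

```python
import numpy as np


def peng_q_lambda_targets(rewards, next_values, dones, gamma=0.99, lam=0.7):
    """Peng's Q(lambda) targets for one trajectory (hypothetical helper).

    rewards[t]     : reward r_t
    next_values[t] : bootstrap value at s_{t+1}, e.g. max_a Q(s_{t+1}, a)
    dones[t]       : 1.0 if s_{t+1} is terminal, else 0.0
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    next_values = np.asarray(next_values, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)

    horizon = len(rewards)
    targets = np.zeros(horizon)
    # Past the last stored step there is no longer return available,
    # so the recursion starts from the one-step bootstrap value.
    next_return = next_values[-1]
    for t in reversed(range(horizon)):
        # Mix the one-step bootstrap with the longer lambda-return.
        mixed = (1.0 - lam) * next_values[t] + lam * next_return
        # Terminal transitions do not bootstrap.
        targets[t] = rewards[t] + gamma * (1.0 - dones[t]) * mixed
        next_return = targets[t]
    return targets
```

With `lam=0` this reduces to the usual one-step target, and with `lam=1` on a trajectory that ends in a terminal state it becomes the discounted Monte Carlo return.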

Getting started

For first-time installation, please follow the installation instructions provided in the CORL GitHub repository.

git clone https://github.com/tinkoff-ai/CORL.git && cd CORL
pip install -r requirements/requirements_dev.txt

Training

To train on a D4RL dataset, run:

python algorithms/cpql.py --config configs/cpql/hopper/random_v2.yaml

Citing CPQL

If you use CPQL in your work, please cite it with the following BibTeX entry:

@inproceedings{kim2026peng,
  title={Peng's Q($\lambda$) for Conservative Value Estimation in Offline Reinforcement Learning},
  author={Kim, Byeongchan and Oh, Min-hwan},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}

About

Official code for ICLR'26 paper [Peng's Q(λ) for Conservative Value Estimation in Offline Reinforcement Learning]
