ShimBoi

Jay Shim

ML Researcher · Continual Learning · Post-Training · VLA Models

I'm an undergrad researcher in the LARG Lab at UT Austin (Turing Scholar, CS Honors), advised by Prof. Peter Stone. I study how large models retain previously learned capabilities during sequential fine-tuning — a problem at the intersection of continual learning, reinforcement learning, and safe deployment.

Research

Simple Recipe Works: VLAs are Natural Continual Learners with RL

Jiaheng Hu*, Jay Shim*, Chen Tang, Yoonchang Sung, Bo Liu, Peter Stone, Roberto Martin-Martin

We show that simple sequential LoRA fine-tuning with RL avoids catastrophic forgetting in VLAs, matching or outperforming dedicated continual learning methods (EWC, Experience Replay, Weight Merge). For this work, I built a distributed training framework that extends RLinf with custom post-training techniques and efficient dataloaders, and scaled the JAX training infrastructure by 1000x through XLA profiling and a redesign of JIT compilation.

Contrastive Decoding for Improved CoT Reasoning in LLMs

Jay Shim et al. · SoCal NLP Symposium 2024

Zero-shot inference-time decoding method achieving ~6% improvement on reasoning benchmarks (GSM8K, HotpotQA, CommonsenseQA) with Mistral-7B and Phi-1.5.

Selected Projects

continual-vla-rl	Continual RL for VLAs — PPO, GRPO, 5 CRL baselines on LIBERO. 196 commits, open-sourced.
MJX-PureJaxRL	GPU-accelerated RL — 50M env steps in 15 min via MJX + PureJaxRL integration.
AttentionIsAllYouNeed	Full Transformer reimplementation from scratch — 15.34 BLEU on WMT'14 DE→EN.

Writing

Dedication is All We Need: Recreating the Original Transformer

B.S. Computer Science Honors (Turing Scholar) · UT Austin · GPA 3.97/4.00
