coldfinity/typed-sr

typed-sr

Symbolic regression via grammar-guided deep reinforcement learning, with a typed CFG that enforces dimensional consistency.

Motivation

Standard symbolic regression systems treat the expression grammar as unconstrained: any syntactically valid tree is a candidate, regardless of whether it makes physical sense. This project adds a typed context-free grammar (CFG) in which each production rule carries dimensional type signatures (length, time, mass, etc.), so the search never spends capacity on dimensionally inconsistent expressions such as sin(velocity + mass).

Research Question

Does enforcing dimensional consistency via a typed CFG improve symbolic regression on physics datasets, compared to an untyped baseline?

Architecture

Data (X, y)
    │
    ▼
Context Encoder       small MLP: (X, y) → embedding z
    │
    ▼
RNN Sampler           LSTM conditioned on z, samples token sequences
    │  ↑ grammar mask (valid tokens only at each step)
    ▼
Expression Evaluator  builds tree, evaluates on X, computes reward
    │
    ▼
REINFORCE Update      policy gradient + entropy bonus
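
The grammar mask step in the diagram can be sketched as follows. This is an illustrative stand-in, not the code in cfg.py or rnn.py: the token set, the prefix-notation encoding, the length limit, and the `logits_fn` callback (which takes the place of the LSTM) are assumptions. The sketch covers only the untyped validity constraint, namely that every sampled sequence closes into a complete tree within a maximum length.

```python
import math
import random

# Hypothetical token vocabulary with arities; the real vocab lives in cfg.py.
ARITY = {"add": 2, "mul": 2, "sin": 1, "x": 0, "c": 0}
TOKENS = list(ARITY)


def valid_mask(open_slots, steps_left):
    """Return one bool per token: may it be emitted at this step?

    open_slots: number of operand positions still unfilled.
    steps_left: how many tokens may still be emitted, including this one.
    """
    mask = []
    for tok in TOKENS:
        # Slots that would remain open after emitting tok.
        remaining = open_slots - 1 + ARITY[tok]
        # Each open slot needs at least one more token to close it.
        mask.append(remaining <= steps_left - 1)
    return mask


def masked_softmax(logits, mask):
    """Softmax over the allowed tokens; masked tokens get probability 0."""
    top = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_expression(logits_fn, max_len, rng):
    """Sample a prefix-notation token sequence that always forms a complete tree."""
    tokens, open_slots = [], 1  # one open slot: the root
    while open_slots > 0:
        mask = valid_mask(open_slots, max_len - len(tokens))
        probs = masked_softmax(logits_fn(tokens), mask)
        tok = rng.choices(TOKENS, weights=probs)[0]
        tokens.append(tok)
        open_slots += ARITY[tok] - 1
    return tokens


# Uniform logits stand in for the LSTM's output at each step.
uniform = lambda prefix: [0.0] * len(TOKENS)
print(sample_expression(uniform, max_len=7, rng=random.Random(0)))
```

Because a terminal is always allowed whenever the open slots still fit in the remaining budget, the mask is never all-false and the loop ends after at most `max_len` tokens. A typed mask would add a further condition per token on the dimension expected at the current slot.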

Project Structure

typed-sr/
├── grammar/
│   ├── cfg.py          # production rules, token vocab, validity mask
│   ├── typed_cfg.py    # dimension type system + typed masks
│   └── tree.py         # expression tree, evaluator
├── model/
│   ├── encoder.py      # dataset → embedding
│   ├── rnn.py          # LSTM sampler
│   └── dsr.py          # full DSR (deep symbolic regression) loop
├── train/
│   ├── reinforce.py    # policy gradient, entropy bonus
│   └── reward.py       # NMSE, complexity penalty, Pareto frontier
├── eval/
│   ├── feynman.py      # Feynman benchmark loader
│   └── metrics.py      # recovery rate, complexity
└── experiments/
    ├── baseline.py     # untyped DSR
    └── typed.py        # typed DSR
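
The reward terms listed for reward.py (NMSE, complexity penalty, Pareto frontier) could look roughly like the sketch below. The README does not give the exact formulas, so the 1 / (1 + NMSE) squashing, the linear node-count penalty, its default weight, and all function names here are assumptions rather than the project's implementation.

```python
def nmse(y_true, y_pred):
    """Mean squared error divided by the variance of y_true."""
    n = len(y_true)
    mean = sum(y_true) / n
    var = sum((y - mean) ** 2 for y in y_true) / n
    if var == 0:
        raise ValueError("y_true is constant; NMSE is undefined")
    mse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n
    return mse / var


def reward(y_true, y_pred, n_nodes, penalty=0.01):
    """Squash NMSE into (0, 1], then subtract a linear complexity penalty.

    Both the squashing and the penalty weight are illustrative choices.
    """
    return 1.0 / (1.0 + nmse(y_true, y_pred)) - penalty * n_nodes


def pareto_front(candidates):
    """Keep the (error, complexity) pairs that no other pair dominates.

    A pair dominates another if it is no worse on both axes and strictly
    better on at least one.
    """
    front = []
    for i, (e_i, c_i) in enumerate(candidates):
        dominated = any(
            e_j <= e_i and c_j <= c_i and (e_j < e_i or c_j < c_i)
            for j, (e_j, c_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append((e_i, c_i))
    return front


y = [1.0, 2.0, 3.0]
print(nmse(y, [2.0, 2.0, 2.0]))  # 1.0: predicting the mean gives NMSE of 1
print(pareto_front([(0.1, 5), (0.2, 3), (0.3, 4), (0.05, 9)]))
# [(0.1, 5), (0.2, 3), (0.05, 9)]
```

In the last call, (0.3, 4) is dropped because (0.2, 3) has both lower error and lower complexity. The other three each win on at least one axis against every rival.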

Roadmap

  • Expression tree + evaluator (tree.py)
  • Untyped CFG + validity mask (cfg.py)
  • Reward function + Pareto frontier (reward.py)
  • LSTM sampler (rnn.py)
  • REINFORCE training loop (reinforce.py)
  • Typed CFG extension (typed_cfg.py)
  • Feynman benchmark (feynman.py)
  • Experiments + results
