Add tiny transformer LLM notebook #2163
Add a new gallery notebook demonstrating a tiny decoder-only transformer LLM implemented with pytensor/xtensor (doc/gallery/transformers/tiny_transformer_llm.ipynb). Update .gitignore to exclude AI tool artifacts, gallery downloaded data, and JupyterLab session files. Also apply related updates to math implementation and rewrites (pytensor/tensor/math.py, pytensor/xtensor/rewriting/math.py) and adjust tests (tests/tensor/test_math.py, tests/xtensor/test_math.py) to match the changes.
While working on the notebook I found a few things that were adding overhead in xtensor. After adjusting them, xtensor now ends up even faster than plain tensor.
constant when possible, instead of a chain of ``Mul`` nodes over individual
``ScalarConstant``s.
Why? PyTensor will rewrite away the mul constants, so this is just eager stuff? We don't want to eagerly use static shapes in actual inputs.
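To make the point about rewriting away multiplied constants concrete, here is a toy sketch in plain Python. It is not PyTensor's rewrite machinery: the `Const`, `Var`, and `Mul` classes and the `fold_mul_constants` helper are hypothetical names invented for illustration. It only shows the general idea that a graph rewrite can merge a chain of constant factors after the graph is built, so the graph does not need to be constructed eagerly in folded form.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Mul:
    inputs: tuple


def fold_mul_constants(node):
    """Flatten nested Mul nodes and merge all Const inputs into one Const."""
    if not isinstance(node, Mul):
        return node
    flat = []
    for inp in node.inputs:
        inp = fold_mul_constants(inp)
        # Splice nested products into the parent product.
        flat.extend(inp.inputs if isinstance(inp, Mul) else (inp,))
    consts = [i.value for i in flat if isinstance(i, Const)]
    others = [i for i in flat if not isinstance(i, Const)]
    if not consts:
        return Mul(tuple(others))
    folded = Const(prod(consts))
    # A product of only constants collapses to a single constant.
    return Mul((folded, *others)) if others else folded


# (2 * x) * 3 is rewritten into a single product with one merged constant.
expr = Mul((Mul((Const(2.0), Var("x"))), Const(3.0)))
folded = fold_mul_constants(expr)
```

In this sketch the caller builds the naive chain and the rewrite pass does the merging afterwards, which is the division of labor the comment argues for.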
@@ -0,0 +1,1246 @@
{
Line #1. from pytensor.xtensor.shape import stack as xstack
Use `import pytensor.xtensor as ptx`, and then use `ptx.stack` and the like.
Line #32. scores = px.dot(q, k, dim="hd") / scale # (batch, head, time_q, time_k)
You can do `assert scores.dims == ("batch", "head", "time_q", "time_k")` to self-document the dims instead of using a comment. It also teaches that the dims are always around for introspection.
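As a rough illustration of why that assertion is expected to hold, the sketch below computes the output dims of a named-dimension contraction in plain Python. It is a stand-in, not the xtensor implementation: `dot_dims` is a hypothetical helper, and the ordering rule it uses (the left operand's remaining dims first, then dims that appear only on the right) is an assumption chosen to match the comment on the quoted notebook line.

```python
def dot_dims(x_dims, y_dims, dim):
    """Output dims of a named contraction over `dim`.

    Assumed convention: drop `dim`, keep the left operand's order,
    then append dims that only appear in the right operand.
    """
    if dim not in x_dims or dim not in y_dims:
        raise ValueError(f"{dim!r} must be present in both operands")
    out = [d for d in x_dims if d != dim]
    out += [d for d in y_dims if d != dim and d not in out]
    return tuple(out)


q_dims = ("batch", "head", "time_q", "hd")
k_dims = ("batch", "head", "time_k", "hd")
scores_dims = dot_dims(q_dims, k_dims, dim="hd")

# The assertion documents the layout and fails loudly if it ever changes.
assert scores_dims == ("batch", "head", "time_q", "time_k")
```

Unlike a trailing comment, the assertion is checked every time the cell runs, so it cannot silently drift out of date.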
Line #4. def gen_step(context, rng):
You can still work with xtensor variables: convert to tensor before going into the scan, convert back to xtensor inside the scan, convert to tensor before returning from the scan, and convert the scan outputs to xtensor outside as soon as you get them. Basically, handle the boundary.
Also, you could make a while scan that runs until the termination token is emitted.
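The control flow of such a "run until the termination token" loop can be sketched in plain Python. This is only a stand-in for a while-style scan, not PyTensor code: `EOS`, `sample_next`, and `max_new_tokens` are hypothetical names, and the "model" is a random sampler used so the example runs on its own.

```python
import random

EOS = 0  # hypothetical id of the termination token


def sample_next(context, rng):
    """Stand-in for the model: returns a random token id in 0..4."""
    return rng.randrange(5)


def generate(prompt, rng, max_new_tokens=50):
    """Append sampled tokens until EOS is emitted or the step budget runs out."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        nxt = sample_next(tokens, rng)
        tokens.append(nxt)
        if nxt == EOS:
            # Termination condition: stop as soon as EOS appears.
            break
    return tokens


out = generate([3, 1], random.Random(0))
```

Keeping a maximum step count alongside the termination condition guards against a model that never emits the termination token.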
This is nice, but I don't want the random xtensor changes. We need to investigate why it was not simplifying in your case; it may be another symptom of #2056 or something else, but it shouldn't be done in a docs PR.
@ricardoV94 I followed some of your comments and came up with this: #2164
Summary
- [...] matmul and keep outer products on the existing einsum fallback.
- [...] tensordot reshape shapes and add focused regression coverage.
Test plan
conda run -n pytensor-dev python -m ruff check pytensor/tensor/math.py pytensor/xtensor/rewriting/math.py tests/tensor/test_math.py tests/xtensor/test_math.py