Feature/telemetry demo notebook #1308
* Fix type of `HookedTransformerConfig.device`

  This is typed as `Optional[str]` but sometimes returns `torch.device`. Updated the code to just return the `str` instead of wrapping with a device. I'm not confident that every function which takes a device will always be passed a string, so I didn't change functions like `warn_if_mps`.

  Found while working on TransformerLensOrg#1219
* more cleanup
* 3.0 CI Bugs (TransformerLensOrg#1261)
* Fixing `utils` imports
* skip gated notebooks on PR from forks
* Updating notebooks
* Ensure LLaMA only runs when HF_TOKEN is available

Co-authored-by: jlarson4 <jonahalarson@comcast.net>
TransformerLens 3.1.0
Hey @jlarson4 -- I've opened this PR to address the task (1148) assigned to me. You'll notice a few minor changes from my initial concept and code. These updates focus specifically on streamlining the loop and eliminating caching overhead, but the final result fully aligns with the original submission goals. I'll be available this week to tweak or refactor anything based on your critical review. Thanks!

Thank you for putting this together @jonathanrbelanger-lang, it looks awesome. I should have time today to give it a thorough review & send over any comments if I have them.
A couple of file-wide notes:
- Can we rename the demo to `Realtime_Training_Telemetry_Demo`?
- Can you add some detailed text cells explaining what this notebook does, how, and why? A majority of the demo notebooks include some level of explanation; see `Main_Demo.ipynb` for an example of what I'm looking for here.
- Can we update the setup cell to function similarly to the setup cells in our other notebooks, so that this notebook can be run locally via Jupyter or in Colab? Specifically, this new demo uses `matplotlib`, which is not installed by default in TransformerLens, and the demo cannot be run locally because of it.
Let me know if you have any questions
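For reference, a minimal sketch of the kind of check a setup cell could perform so the notebook works both locally and in Colab. This is not the setup cell used in the other TransformerLens notebooks; the names `EXTRA_PACKAGES`, `running_in_colab`, `missing_packages`, and `pip_install_command` are hypothetical, and the sketch only reports what would need installing rather than installing it.

```python
import importlib.util
import sys

# Hypothetical list for illustration: extra packages the demo needs
# beyond what TransformerLens installs by default.
EXTRA_PACKAGES = ["matplotlib"]


def running_in_colab() -> bool:
    """Return True if the google.colab module can be imported."""
    try:
        import google.colab  # noqa: F401
    except ImportError:
        return False
    return True


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pip_install_command(packages):
    """Build (but do not run) a pip command for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", *packages]


to_install = missing_packages(EXTRA_PACKAGES)
if to_install:
    # In a notebook this could instead be a `%pip install ...` line.
    print("Missing packages, install with:", " ".join(pip_install_command(to_install)))
else:
    print("All extra packages are already available.")
print("Running in Colab:", running_in_colab())
```

Building the command against `sys.executable` targets the interpreter the kernel is actually using, which matters when Jupyter runs inside a virtual environment.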
Yes, absolutely. Let me get to work on that, probably tomorrow evening as I'm knee deep in another project tonight, if that's amenable.
Absolutely, no rush. Thank you!
Description
Adds a new educational demo notebook (`demos/TL_Demo_RT_Viz.ipynb`) that provides a lightweight, zero-dependency bridge to extract and visualize mechanistic telemetry (Attention Coherence and Head Agreement) during a training loop.

Motivation and Context:
- `model.run_with_cache` is only called at log intervals, saving roughly 10x memory/compute overhead compared to naive caching loops.
- … `n_layers`, making it highly forkable for users experimenting with larger architectures.
- … `ruff` and passes all modern syntax and formatting checks cleanly.

Fixes # N/A
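The log-interval caching described above can be sketched roughly as follows. This is an illustration of the gating pattern only, not the notebook's code: `StubModel`, `train`, and the `log_every` parameter are hypothetical, and the stub's `run_with_cache` merely borrows the name of the TransformerLens method so the control flow is recognizable. With an assumed interval of 10, the cached path runs on one step in ten.

```python
class StubModel:
    """Hypothetical stand-in for a model; it counts how often each path runs."""

    def __init__(self):
        self.plain_calls = 0
        self.cached_calls = 0

    def forward(self, batch):
        # Cheap path: no activations are stored.
        self.plain_calls += 1
        return sum(batch)

    def run_with_cache(self, batch):
        # Expensive path: also returns a cache of intermediate values.
        self.cached_calls += 1
        cache = {"activations": list(batch)}
        return sum(batch), cache


def train(model, n_steps, log_every):
    """Run a toy loop, taking the cached path only on logging steps."""
    telemetry = []
    for step in range(n_steps):
        batch = [step, step + 1]
        if step % log_every == 0:
            output, cache = model.run_with_cache(batch)
            # A real loop would compute telemetry metrics from the cache here.
            telemetry.append((step, len(cache["activations"])))
        else:
            output = model.forward(batch)
    return telemetry


model = StubModel()
log = train(model, n_steps=100, log_every=10)
print(model.cached_calls, model.plain_calls)  # prints "10 90"
```

The actual saving in a real training loop depends on the chosen interval and on how much of the cache is retained, so the counts here only show how often each path is taken.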
Type of change
Screenshots
Checklist: