About This Research
The Paper
This blog post accompanies our paper "Understanding State-Tracking in Linear RNNs for Code Execution", which investigates why architectures that are theoretically capable of state-tracking fail to learn it from standard training data.
Key Contributions
- We translate permutation group tasks into Python REPL traces for next-token prediction
- We identify that dense intermediate state supervision is critical for learnability
- We show DeltaNet[-1,1] learns and extrapolates state-tracking where transformers fail
- We analyze challenges in real code: probabilistic transitions and tokenization discontinuity
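To make the first two contributions concrete, here is a minimal sketch of how a permutation task might be rendered as a Python REPL trace, with and without dense intermediate state. The function name `make_trace`, the swap-based encoding, and the exact trace layout are illustrative assumptions, not the format used in the paper.

```python
import random


def make_trace(n_items=5, n_steps=4, dense=True, seed=0):
    """Build a REPL-style trace of random swaps on a list (hypothetical format).

    dense=True prints the list after every swap (intermediate state supervision);
    dense=False prints it only once, at the end.
    """
    rng = random.Random(seed)
    state = list(range(n_items))
    lines = [f">>> x = {state}"]
    for _ in range(n_steps):
        # Each swap is a transposition, i.e. one element of the permutation group.
        i, j = rng.sample(range(n_items), 2)
        state[i], state[j] = state[j], state[i]
        lines.append(f">>> x[{i}], x[{j}] = x[{j}], x[{i}]")
        if dense:
            lines.append(">>> x")
            lines.append(str(state))
    if not dense:
        lines.append(">>> x")
        lines.append(str(state))
    return "\n".join(lines), state


dense_trace, final_state = make_trace(dense=True)
sparse_trace, _ = make_trace(dense=False)
print(dense_trace)
```

In the dense variant, every swap is followed by the printed list, so a next-token predictor is supervised on the state at each step. In the sparse variant, it only sees the state once, after all swaps have been applied.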
Resources
Contact
For questions about this research, please contact the authors.