An Experiment on Implementing Reservoir Sampling in StreamingLLM

Authors

  • Jiyang Pan
  • Jiayi Weng
  • Yansen Huang
  • Patrick Pan
  • Qianbiao Zhao

DOI:

https://doi.org/10.61173/a09ce072

Keywords:

StreamingLLM, Reservoir Sampling, Large Language Models, KV-cache, tokens

Abstract

When the input text is longer than the training sequence length, StreamingLLM successfully improves computational speed while guaranteeing a certain degree of accuracy. However, the method is only suitable for short-term-memory question answering, because StreamingLLM does not improve a model's long-term memory: tokens evicted from the KV-cache are simply discarded. We therefore tried combining reservoir sampling with StreamingLLM, so that reservoir sampling randomly retains samples from the tokens that were meant to be discarded. After adding reservoir sampling, we find the results to be more accurate and representative.
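
The abstract describes drawing a uniform random sample from the stream of tokens that StreamingLLM evicts. Below is a minimal sketch of how that could look, using classic Algorithm R reservoir sampling; the class and method names (EvictionReservoir, offer) and the standalone demo are illustrative assumptions, not the paper's actual implementation.

```python
import random

class EvictionReservoir:
    """Sketch: keep a uniform random sample of size k over the stream of
    tokens that StreamingLLM would otherwise discard (Algorithm R).
    All names here are hypothetical, not the paper's API."""

    def __init__(self, k, seed=0):
        self.k = k                  # reservoir capacity
        self.seen = 0               # number of evicted tokens observed so far
        self.reservoir = []         # sampled (position, kv_entry) pairs
        self.rng = random.Random(seed)

    def offer(self, position, kv_entry):
        """Call once for each token evicted from the sliding window."""
        self.seen += 1
        if len(self.reservoir) < self.k:
            self.reservoir.append((position, kv_entry))
        else:
            j = self.rng.randrange(self.seen)   # uniform draw in [0, seen)
            if j < self.k:
                self.reservoir[j] = (position, kv_entry)

if __name__ == "__main__":
    r = EvictionReservoir(k=4, seed=42)
    for pos in range(1000):              # simulate 1000 evicted tokens
        r.offer(pos, kv_entry=None)      # kv_entry stands in for real K/V tensors
    print([p for p, _ in r.reservoir])   # 4 positions, each kept with prob 4/1000
```

After n evictions, each discarded token survives with equal probability k/n, so the retained sample stays representative of the entire discarded history; the sampled entries could then be kept in the KV-cache alongside StreamingLLM's attention sinks and recent-token window.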

Published

2025-02-26

Section

Articles