An Experiment on Applying Reservoir Sampling to StreamingLLM
DOI: https://doi.org/10.61173/a09ce072

Keywords: streamingLLM, Reservoir Sampling, Large Language Models, KV-cache, tokens

Abstract
When the input text is longer than the training sequence length, StreamingLLM can improve computational speed while preserving a certain degree of accuracy. However, the method is only suitable for short-term-memory question answering, because StreamingLLM does not improve a model's long-term memory. We therefore combine reservoir sampling with StreamingLLM: on top of StreamingLLM's eviction policy, reservoir sampling randomly retains tokens from the portion of the KV-cache that would otherwise be discarded. By adding reservoir sampling, we find that the results become more accurate and representative.
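As a rough illustration of the idea described above (not the exact implementation evaluated in this paper), the sketch below applies classic reservoir sampling (Vitter's Algorithm R) to the stream of tokens that StreamingLLM's eviction policy would discard. The function name, the reservoir size `k`, and the use of token ids as stream items are our assumptions for exposition only.

```python
import random


def reservoir_sample_evicted(evicted_tokens, k, seed=None):
    """Keep a uniform random sample of k items from a stream of
    evicted tokens (Algorithm R). Hypothetical helper; the paper's
    actual KV-cache integration may differ."""
    rng = random.Random(seed)
    reservoir = []
    for i, token in enumerate(evicted_tokens):
        if i < k:
            # Fill the reservoir with the first k evicted tokens.
            reservoir.append(token)
        else:
            # Replace a reservoir slot with probability k / (i + 1),
            # which keeps every token's inclusion probability equal.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = token
    return reservoir


if __name__ == "__main__":
    evicted = range(10_000)  # stand-in for evicted token ids
    print(reservoir_sample_evicted(evicted, k=8, seed=0))
```

The key property is that after n evictions, every discarded token remains in the reservoir with equal probability k/n, which is why the retained sample can be considered representative of the full discarded context rather than biased toward recent tokens.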