An Experiment on Applying Reservoir Sampling to StreamingLLM
DOI: https://doi.org/10.61173/a09ce072

Keywords: streamingLLM, Reservoir Sampling, Large Language Models, KV-cache, tokens

Abstract
When the input text is longer than the training sequence length, StreamingLLM can improve computational speed while preserving a certain degree of accuracy. However, the method is only suitable for short-term-memory question answering, because StreamingLLM does not improve a model's long-term memory. We therefore combine reservoir sampling with StreamingLLM: on top of StreamingLLM's eviction policy, reservoir sampling randomly retains tokens from the portion of the KV-cache that would otherwise be discarded. By adding reservoir sampling, we find that the results become more accurate and representative.
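As a rough illustration of the idea described above (not the exact implementation evaluated in this paper), the sketch below applies classic reservoir sampling (Vitter's Algorithm R) to the stream of tokens that StreamingLLM's eviction policy would discard. The function name, the reservoir size `k`, and the use of token ids as stream items are our assumptions for exposition only.

```python
import random


def reservoir_sample_evicted(evicted_tokens, k, seed=None):
    """Keep a uniform random sample of k items from a stream of
    evicted tokens (Algorithm R). Hypothetical helper; the paper's
    actual KV-cache integration may differ."""
    rng = random.Random(seed)
    reservoir = []
    for i, token in enumerate(evicted_tokens):
        if i < k:
            # Fill the reservoir with the first k evicted tokens.
            reservoir.append(token)
        else:
            # Replace a reservoir slot with probability k / (i + 1),
            # which keeps every token's inclusion probability equal.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = token
    return reservoir


if __name__ == "__main__":
    evicted = range(10_000)  # stand-in for evicted token ids
    print(reservoir_sample_evicted(evicted, k=8, seed=0))
```

The key property is that after n evictions, every discarded token remains in the reservoir with equal probability k/n, which is why the retained sample can be considered representative of the full discarded context rather than biased toward recent tokens.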