Cache pre-processing of large documents fully embedded into context #66

neilmehta24 · 2024-12-26T23:03:16Z

mlx_lm has cache_prompt and load_prompt features that make it easier to work with long prompts. When LM Studio injects an entire document into context, pre-processing the document can take a long time, and that work is lost whenever the cache is invalidated. If users had the option to save and load the cache, the pre-processing would not have to be repeated.
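To illustrate the request, here is a minimal sketch of the save/load idea using only the Python standard library. It does not call mlx_lm or LM Studio: `cache_key`, `load_or_preprocess`, and `fake_preprocess` are hypothetical names, and `fake_preprocess` merely stands in for the expensive prompt-processing step. The assumption is that the processed state can be keyed by model id plus document text, so a change to either counts as a cache miss.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cache_key(model_id: str, document: str) -> str:
    """Hash the model id and document text; changing either yields a new key."""
    h = hashlib.sha256()
    h.update(model_id.encode("utf-8"))
    h.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") hash differently
    h.update(document.encode("utf-8"))
    return h.hexdigest()


def load_or_preprocess(cache_dir, model_id, document, preprocess):
    """Return (state, was_cached), running preprocess only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(model_id, document)}.pkl"
    if path.exists():
        # Cache hit: reuse the saved state instead of re-processing.
        with path.open("rb") as f:
            return pickle.load(f), True
    # Cache miss: do the expensive work once, then persist it.
    state = preprocess(document)
    with path.open("wb") as f:
        pickle.dump(state, f)
    return state, False


def fake_preprocess(document: str) -> dict:
    # Hypothetical placeholder for the slow prompt-processing step.
    return {"n_tokens": len(document.split())}


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_preprocess(tmp, "model-a", "a long document", fake_preprocess)
    second = load_or_preprocess(tmp, "model-a", "a long document", fake_preprocess)
```

In this sketch the first call processes the document and writes it to disk, and the second call with the same model id and text loads the saved state. A real implementation would persist the model's actual prompt cache rather than a pickled dictionary, and would only load cache files it wrote itself.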

neilmehta24 added the enhancement (New feature or request) label on Dec 26, 2024
