diff --git a/README.md b/README.md
index b21c82a..2674128 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,8 @@
 Gradient Cached Dense Passage Retrieval (`GC-DPR`) is an extension of the original [DPR](https://github.com/facebookresearch/DPR) library.
 We introduce the Gradient Cache technique, which enables scaling the batch size of the contrastive loss, and therefore the number of in-batch negatives, far beyond the GPU memory limit. With `GC-DPR`, you can reproduce the state-of-the-art open Q&A system trained on 8 x 32GB V100 GPUs with a single 11 GB GPU.
 
+To use Gradient Cache in your own project, check out our [GradCache package](https://github.com/luyug/GradCache).
+
 ## Gradient Cache Technique
 - Retriever training quality depends on its effective batch size.
 - The contrastive loss (NLL with in-batch negatives) conditions on the entire batch and requires fitting the entire batch into GPU memory.
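
For readers unfamiliar with the loss the bullets above refer to, here is a minimal PyTorch sketch of NLL with in-batch negatives. It is illustrative only, not GC-DPR's actual training code; the names `in_batch_nll`, `q_reps`, and `p_reps` are made up for this example, and dot-product similarity is assumed. It shows why memory scales with batch size: every passage in the batch serves as a negative for every other question, so the full batch of representations must be materialized at once.

```python
# Illustrative sketch, not GC-DPR's implementation: contrastive NLL
# with in-batch negatives, where the B x B score matrix ties every
# question to every passage in the batch.
import torch
import torch.nn.functional as F

def in_batch_nll(q_reps: torch.Tensor, p_reps: torch.Tensor) -> torch.Tensor:
    """q_reps: [B, d] question vectors; p_reps: [B, d] gold-passage vectors."""
    scores = q_reps @ p_reps.T  # [B, B]: row i scores question i against all B passages
    targets = torch.arange(q_reps.size(0), device=q_reps.device)  # gold pairs on the diagonal
    return F.cross_entropy(scores, targets)  # NLL over B-1 in-batch negatives per question

# Toy usage: a batch of 8 question/passage pairs with 768-dim encodings.
q = torch.randn(8, 768, requires_grad=True)
p = torch.randn(8, 768, requires_grad=True)
loss = in_batch_nll(q, p)
loss.backward()
```

Because the loss conditions on all `B` representations jointly, naively growing `B` grows activation memory with it; Gradient Cache decouples the two so the effective batch size is no longer bounded by a single GPU's memory.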