AC-Cache: A Memory-Efficient Caching System for Small Objects via Exploiting Access Correlations

AC-Cache consists of two main processes: correlation analysis and KV object distribution. Accordingly, the source code is structured into two execution phases, and the tester must manually carry the data from the first phase over to the second. Because the test process takes a long time and cannot be automated with shell scripts, this document describes in detail how to test AC-Cache. The tester needs to manually modify certain scripts and source files along the way.

Requirements

platform: Linux
build tools: cmake (>=3.20)
compiler: gcc (>=4.8)
python: python3 (==3.10)
library: jsoncpp, libisal, libmemcached, libfmt, python-prtpy

Build

git clone https://github.com/nankeys/AC-Cache.git
cd AC-Cache/src && mkdir _build && cd _build
cmake ..
make -j

Trace process

Download

Twitter: Refer to Twitter cache trace
- https://ftp.pdl.cmu.edu/pub/datasets/twemcacheWorkload/open_source/
Meta: Refer to Running cachebench with the trace workload
- kvcache/202206: aws s3 cp --no-sign-request --recursive s3://cachelib-workload-sharing/pub/kvcache/202206/ ./
- kvcache/202401: aws s3 cp --no-sign-request --recursive s3://cachelib-workload-sharing/pub/kvcache/202401/ ./

Preprocess

# All the steps work under directory `Preproccess/`
# Change the workload path used in each Python file (e.g. `stats.py`)

Uncompress the trace, for example

zstd -d cluster2.sort.zst

Split the trace in days

# change the fname in split_in_days.py
python3 split_in_days.py
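
As an illustration of what splitting by day involves, here is a minimal sketch. It is not the code in `split_in_days.py`. It assumes a comma-separated trace whose first column is a timestamp in seconds and whose rows are sorted by time; the function name `split_in_days` and the sample rows are hypothetical.

```python
import csv
from collections import defaultdict

SECONDS_PER_DAY = 86_400

def split_in_days(rows, seconds_per_day=SECONDS_PER_DAY):
    """Group trace rows by day index, counted from the first timestamp seen.

    Assumption: row[0] is a timestamp in seconds and rows are time-ordered.
    """
    days = defaultdict(list)
    first_ts = None
    for row in rows:
        ts = int(row[0])
        if first_ts is None:
            first_ts = ts
        days[(ts - first_ts) // seconds_per_day].append(row)
    return dict(days)

# Hypothetical sample rows; real traces are read from the uncompressed file.
sample = [
    "0,keyA,4,100,1,get,0",
    "10,keyB,4,120,1,get,0",
    "86400,keyA,4,100,2,get,0",
]
by_day = split_in_days(csv.reader(sample))
for day, rows in sorted(by_day.items()):
    print(day, len(rows))
```

For a real trace, pass `csv.reader(open(fname, newline=""))` instead of the sample list and write each day's rows to its own file.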

Generate the stat file

# change the traceno in stats.py
python3 stats.py

Split the traces in threads

# change the traceno in thread_split.py
python3 thread_split.py
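
The idea of a per-thread split can be sketched as follows. The policy used by `thread_split.py` is not described here, so this example assumes a simple round-robin assignment that preserves request order within each thread; `split_for_threads` is a hypothetical name.

```python
def split_for_threads(requests, num_threads):
    """Deal requests out to threads in round-robin order.

    Assumption: round-robin is only one possible policy; the actual
    script may split differently (for example by key or by time range).
    """
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    parts = [[] for _ in range(num_threads)]
    for i, req in enumerate(requests):
        parts[i % num_threads].append(req)
    return parts

parts = split_for_threads(list(range(10)), 4)
for thread_id, part in enumerate(parts):
    print(thread_id, part)
```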

For each trace, extract the position of hot objects

# change the traceno in FreqExtraction.py
python3 FreqExtraction.py
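
A minimal sketch of frequency-based hot-object extraction is shown below. It only illustrates the idea of ranking keys by access count; the output format of `FreqExtraction.py` is not specified here, and `hot_objects` and `top_k` are hypothetical names.

```python
from collections import Counter

def hot_objects(keys, top_k):
    """Return the top_k most frequently accessed keys with their counts."""
    counts = Counter(keys)
    return counts.most_common(top_k)

# Hypothetical access sequence of object keys.
accesses = ["a", "b", "a", "c", "a", "b"]
hottest = hot_objects(accesses, top_k=2)
print(hottest)
```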

Put the information into parameter.h. Put the variations into variation.
Change the information in src/config.json.

Correlation Analysis

Change the information in main_correlation.cpp.
Change to the build directory and rebuild

cd _build
make -j

Run correlation to generate the correlation graph. Note: generating the correlation graph can take a long time. The run produces a file named louvaion_node_{trace_no}_{flimit}

./correlation
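
To give an intuition for what a correlation graph contains, the sketch below builds a weighted co-access graph from a key sequence. This is an assumed, simplified construction (keys accessed within a sliding window are linked), not the algorithm implemented in `main_correlation.cpp`; `window` and `min_weight` are hypothetical parameters and are not claimed to match `flimit`.

```python
from collections import Counter, deque

def cooccurrence_edges(keys, window=2, min_weight=1):
    """Count how often two distinct keys appear within `window` accesses.

    Returns a dict mapping an undirected edge (k1, k2) with k1 < k2
    to its co-access count, keeping only edges with count >= min_weight.
    """
    recent = deque(maxlen=window)  # the last `window` keys seen
    weights = Counter()
    for key in keys:
        for other in recent:
            if other != key:
                edge = (other, key) if other < key else (key, other)
                weights[edge] += 1
        recent.append(key)
    return {e: w for e, w in weights.items() if w >= min_weight}

edges = cooccurrence_edges(["a", "b", "a", "b", "c"], window=1, min_weight=2)
print(edges)
```

Filtering out low-weight edges keeps the graph small, which matters because community detection runs on it in the next step.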

Graph partition

Download and compile Louvain.

wget -O louvain-generic.tar.gz "https://master.dl.sourceforge.net/project/louvain/louvain-generic.tar.gz?viasf=1"
tar -zvxf louvain-generic.tar.gz
cd louvain-generic/
make

Generate initial groups

cd GroupDivision/
bash initial.sh

Merge the groups

# change the trace information in merge_discrete_file.py
python3 merge_discrete_file.py

Execute Algorithm 1: Partition correlation graph

python3 divided_graph.py

This produces the graph file of the CGroups.

Objects Distribution

Put the generated CGroup information into parameter.h.
Choose the experiments you want to run and change the variation in cache.h.
Set up the Memcached nodes.
Record the Memcached node information in config.json.

"server_info": [
    {
      "ip": "172.18.96.10",
      "port": 11211
    },{
      "ip": "172.18.96.11",
      "port": 11211
    }
]
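
Before rebuilding, it can help to check that the node list parses as expected. The sketch below is a hypothetical helper, not part of the repository: it assumes `server_info` is a top-level key of a JSON object, wraps the fragment above in braces to make it valid JSON, and returns `(ip, port)` pairs.

```python
import json

def load_servers(config_text):
    """Parse the server_info list and return a list of (ip, port) tuples."""
    config = json.loads(config_text)
    servers = []
    for entry in config["server_info"]:
        ip, port = entry["ip"], int(entry["port"])
        if not 0 < port < 65536:
            raise ValueError(f"invalid port for {ip}: {port}")
        servers.append((ip, port))
    return servers

# The fragment above, wrapped in braces (assumed top-level layout).
EXAMPLE_CONFIG = """
{
  "server_info": [
    {"ip": "172.18.96.10", "port": 11211},
    {"ip": "172.18.96.11", "port": 11211}
  ]
}
"""
servers = load_servers(EXAMPLE_CONFIG)
print(servers)
```

To check a real file, read it first, for example `load_servers(open("config.json").read())`.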

Change to the build directory and rebuild

cd _build
make -j

Run the executable file

./CorAna

The result will be written to result.txt.

Various evaluations

To adapt to a new evaluation, change config.h so that the parameters are read from your own source file instead of config.json.

Plot

All the scripts for plotting the graphs are under the directory plot. Record the results in the format shown in the *.csv files, then run the Python scripts.

Notes

All the paths in the scripts and source code should be carefully checked.
Preprocessing is important and takes a long time.
Correlation analysis also takes a long time.
