This repository has been archived by the owner on Oct 14, 2023. It is now read-only.
Merge pull request #13 from yangyang233333/db_api
Commits related to the DB API
Showing 22 changed files with 798 additions and 26 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,4 +14,7 @@ test_temp | |
|
||
# others | ||
ref | ||
build/* | ||
build/* | ||
|
||
storage | ||
db_storage |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,6 +13,7 @@ smallkv is a column-oriented storage engine based on the LSM architecture. | |
**The project is under rapid, active iteration!** | ||
|
||
--- | ||
|
||
## Progress | ||
|
||
- [x] Skip list | ||
|
@@ -27,13 +28,18 @@ smallkv is a column-oriented storage engine based on the LSM architecture. | |
- [ ] Read path | ||
- [ ] Write path | ||
- [ ] Compaction module | ||
- [ ] Replace the system memory allocator with FreeListAllocate (src/memory/allocate.h) | ||
|
||
--- | ||
|
||
## BUILD | ||
|
||
You must use the g++ compiler and Ubuntu 22.04 system. | ||
You must use the g++ compiler (with C++17 support) on an Ubuntu 22.04 system. | ||
|
||
### build from docker (Highly recommended) | ||
|
||
```shell | ||
git clone git@github.com:yangyang233333/smallkv.git | ||
docker pull qianyy2333/smallkv-test | ||
docker run -it -v /{directory containing the smallkv code}:/test qianyy2333/smallkv-test /bin/bash | ||
./build.sh ## build | ||
|
@@ -42,6 +48,7 @@ docker run -it -v /{directory containing the smallkv code}:/test qianyy2333/smallkv-test | |
``` | ||
|
||
### build from source code: | ||
|
||
```shell | ||
# install dependencies | ||
apt update && apt upgrade -y && apt install cmake make git g++ gcc -y && cd ~ \ | ||
|
@@ -50,40 +57,60 @@ apt update && apt upgrade -y && apt install cmake make git g++ gcc -y && cd ~ \ | |
&& git clone https://github.com/nlohmann/json && cd json && mkdir build && cd build && cmake .. && make -j && sudo make install && cd ~ \ | ||
&& git clone https://github.com/abseil/abseil-cpp.git && cd abseil-cpp && mkdir build && cd build && cmake .. && make -j && make install && cd ~ \ | ||
&& rm -rf spdlog googletest json | ||
git clone git@github.com:yangyang233333/smallkv.git | ||
cd smallkv | ||
./build.sh ## build | ||
./main_run.sh ## main program | ||
./unittest_run.sh ## unit tests | ||
``` | ||
|
||
--- | ||
|
||
## Design | ||
|
||
### 1. **Memory pool design** | ||
|
||
![mem_pool](./img/mem_pool_design.png) | ||
|
||
### 2. **Cache design** | ||
|
||
![cache](./img/cache_design.png) | ||
Cache holds N (5 by default) pointers to CachePolicy, which act as 5 shards; this reduces hash collisions and narrows the scope of each lock. LRUCache and LFUCache are both subclasses of CachePolicy. | ||
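The sharding idea can be sketched roughly as follows. This is a minimal illustration rather than smallkv's actual code: the `Insert`/`Get` signatures, the `ShardedCache` name, and the choice of `std::hash` for shard selection are all assumptions, and only an LRU policy is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical policy interface; the real CachePolicy in smallkv may differ.
class CachePolicy {
public:
    virtual ~CachePolicy() = default;
    virtual void Insert(const std::string &key, const std::string &value) = 0;
    virtual std::optional<std::string> Get(const std::string &key) = 0;
};

// A small LRU policy. Each instance owns its own mutex, so a lock only
// covers one shard instead of the whole cache.
class LRUCache : public CachePolicy {
public:
    explicit LRUCache(std::size_t capacity) : capacity_(capacity) {}

    void Insert(const std::string &key, const std::string &value) override {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = index_.find(key);
        if (it != index_.end()) {  // update in place and mark as most recent
            it->second->second = value;
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (capacity_ == 0) return;
        if (items_.size() == capacity_) {  // evict the least recently used item
            index_.erase(items_.back().first);
            items_.pop_back();
        }
        items_.emplace_front(key, value);
        index_[key] = items_.begin();
    }

    std::optional<std::string> Get(const std::string &key) override {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        items_.splice(items_.begin(), items_, it->second);
        return it->second->second;
    }

private:
    using Item = std::pair<std::string, std::string>;
    std::size_t capacity_;
    std::mutex mu_;
    std::list<Item> items_;  // front = most recently used
    std::unordered_map<std::string, std::list<Item>::iterator> index_;
};

// Routes every key to one of N independent policies (N must be > 0).
class ShardedCache {
public:
    explicit ShardedCache(std::size_t per_shard_capacity, std::size_t shards = 5) {
        for (std::size_t i = 0; i < shards; ++i)
            shards_.push_back(std::make_unique<LRUCache>(per_shard_capacity));
    }
    void Insert(const std::string &key, const std::string &value) {
        Shard(key).Insert(key, value);
    }
    std::optional<std::string> Get(const std::string &key) {
        return Shard(key).Get(key);
    }

private:
    CachePolicy &Shard(const std::string &key) {
        return *shards_[std::hash<std::string>{}(key) % shards_.size()];
    }
    std::vector<std::unique_ptr<CachePolicy>> shards_;
};
```

Because each shard locks independently, two threads touching keys that hash to different shards do not contend with each other.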
|
||
### 3. **SSTable design** | ||
|
||
Each .sst file stores one SSTable, structured as follows: | ||
![sstable_schema](./img/sstable.png) | ||
The contents of each block are described below: | ||
|
||
- #### 3.1 DataBlock | ||
|
||
![data_block_schema](./img/data_block_schema.png) | ||
1) In the figure above, each Record stores the actual KV data and also records the length it shares with the neighbouring key (used for delta compression); | ||
2) Restart entries are mainly used for binary search. The offset recorded in a Restart lets the reader parse out the smallest key of the corresponding Record Group, and comparing the keys of consecutive Restarts quickly locates a K-V pair. Each Restart records the number of Records in its Record Group together with the group's size and offset; each Restart is 12 bytes long; | ||
3) Restart_NUM records the number of Restarts; | ||
4) Restart_Offset records the size and offset of the Restarts; | ||
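As a rough in-memory illustration of shared-prefix compression combined with restart points, consider the sketch below. It is not the on-disk format described above: the `Record`, `Block`, `Encode`, and `Get` names and the use of record indices in place of byte offsets are assumptions made to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One entry of a prefix-compressed block: only the part of the key that
// differs from the previous key is stored.
struct Record {
    std::size_t shared;      // number of leading bytes shared with the previous key
    std::string key_suffix;  // remaining bytes of the key
    std::string value;
};

struct Block {
    std::vector<Record> records;
    std::vector<std::size_t> restarts;  // indices of records stored with a full key
};

// Hypothetical encoder: `sorted` must be ordered by key and `restart_interval`
// must be > 0. Every restart_interval-th record starts a new group.
Block Encode(const std::vector<std::pair<std::string, std::string>> &sorted,
             std::size_t restart_interval) {
    Block block;
    std::string prev;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        const std::string &key = sorted[i].first;
        std::size_t shared = 0;
        if (i % restart_interval == 0) {
            block.restarts.push_back(i);  // full key, shared stays 0
        } else {
            const std::size_t limit = std::min(prev.size(), key.size());
            while (shared < limit && prev[shared] == key[shared]) ++shared;
        }
        block.records.push_back({shared, key.substr(shared), sorted[i].second});
        prev = key;
    }
    return block;
}

// Binary-search the restart points, then scan linearly inside one group.
bool Get(const Block &block, const std::string &target, std::string *value) {
    if (block.restarts.empty()) return false;
    std::size_t lo = 0, hi = block.restarts.size() - 1;
    while (lo < hi) {  // find the last restart whose (full) key is <= target
        const std::size_t mid = (lo + hi + 1) / 2;
        if (block.records[block.restarts[mid]].key_suffix <= target) lo = mid;
        else hi = mid - 1;
    }
    const std::size_t end = lo + 1 < block.restarts.size()
                                ? block.restarts[lo + 1]
                                : block.records.size();
    std::string key;
    for (std::size_t j = block.restarts[lo]; j < end; ++j) {
        key.resize(block.records[j].shared);  // keep the shared prefix
        key += block.records[j].key_suffix;   // append the stored suffix
        if (key == target) {
            *value = block.records[j].value;
            return true;
        }
    }
    return false;
}
```

The restart interval trades space for lookup speed: a larger interval compresses more keys but lengthens the linear scan inside a group.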
|
||
- #### 3.2 MetaBlock | ||
|
||
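MetaBlock stores the Filter information (a bit array and the number of hash functions), i.e. the Bloom filter data. Why is this data needed? Because an sst is a sequential, append-only structure, writes are fast (O(1)) but lookups are very slow (O(N)), so a Bloom filter is used to pre-screen requests (it can filter out KV pairs that definitely do not exist). | ||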
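A Bloom filter built from a bit array and a hash-function count can be sketched as below. This is a generic illustration, not smallkv's filter: the class name, the use of `std::hash`, and the double-hashing scheme for deriving the k positions are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

class BloomFilter {
public:
    // `bits` must be > 0; `hash_count` is the number of probes per key.
    BloomFilter(std::size_t bits, std::size_t hash_count)
        : bits_(bits, false), hash_count_(hash_count) {}

    void Add(const std::string &key) {
        for (std::size_t i = 0; i < hash_count_; ++i) bits_[Position(key, i)] = true;
    }

    // false: the key was definitely never added.
    // true:  the key may have been added (false positives are possible).
    bool MayContain(const std::string &key) const {
        for (std::size_t i = 0; i < hash_count_; ++i) {
            if (!bits_[Position(key, i)]) return false;
        }
        return true;
    }

private:
    // Double hashing: the i-th probe is h1 + i * h2, where h2 is cheaply
    // derived from h1 by mixing its bits.
    std::size_t Position(const std::string &key, std::size_t i) const {
        const std::size_t h1 = std::hash<std::string>{}(key);
        const std::size_t h2 = (h1 >> 17) | (h1 << 15);
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t hash_count_;
};
```

A reader would consult `MayContain` before touching the data blocks: a `false` answer lets it skip the file entirely, while a `true` answer still requires the real lookup.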
|
||
- #### 3.3 IndexBlock | ||
|
||
![index_block_schema](./img/index_block_schema.png) | ||
IndexBlock stores the largest-key information of the corresponding DataBlock (note: what is actually stored is shortest_key, where shortest_key = min{shortest_key > largest key of the corresponding DataBlock}; this reduces the number of comparisons and eases pressure under high concurrency); Offset_Info stores the size and offset of the corresponding DataBlock. | ||
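To show how such an index might be consulted, here is a minimal lookup sketch. It assumes each entry's shortest_key is greater than every key in its own DataBlock and smaller than every key in the next one; the `IndexEntry`, `OffsetInfo`, and `FindBlock` names and the 32-bit field widths are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct OffsetInfo {
    std::uint32_t size;
    std::uint32_t offset;
};

struct IndexEntry {
    std::string shortest_key;  // separator for the block (see assumption above)
    OffsetInfo block;          // where the DataBlock lives in the file
};

// Returns the only DataBlock that could contain `key`, or nullopt when the
// key is larger than every separator. `index` must be sorted by shortest_key.
std::optional<OffsetInfo> FindBlock(const std::vector<IndexEntry> &index,
                                    const std::string &key) {
    auto it = std::lower_bound(
        index.begin(), index.end(), key,
        [](const IndexEntry &entry, const std::string &k) {
            return entry.shortest_key < k;
        });
    if (it == index.end()) return std::nullopt;
    return it->block;
}
```

Shorter separators make each comparison in the binary search cheaper, which is presumably the benefit the note above refers to.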
|
||
- #### 3.4 Footer | ||
|
||
![footer_schema](./img/footer_schema.png) | ||
MetaBlock_OffsetInfo records the size and offset of the MetaBlock; IndexBlock_OffsetInfo records the offset of the IndexBlock (the offset of the first IndexBlock) and its size (the total size of all IndexBlocks). | ||
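A fixed-size footer of this shape could be serialized as in the sketch below. The field widths (4-byte little-endian integers, 16 bytes in total), the field order, and all names are assumptions for illustration; the actual layout is defined by the figure above and the source code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct OffsetInfo {
    std::uint32_t size;
    std::uint32_t offset;
};

struct Footer {
    OffsetInfo meta_block;   // size and offset of the MetaBlock
    OffsetInfo index_block;  // total size and first offset of the IndexBlocks
};

constexpr std::size_t kFooterSize = 16;  // assumed: four 4-byte fields

// Write a 32-bit value in little-endian byte order.
inline void PutFixed32(unsigned char *dst, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) dst[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xff);
}

// Read a 32-bit little-endian value.
inline std::uint32_t GetFixed32(const unsigned char *src) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(src[i]) << (8 * i);
    return v;
}

std::array<unsigned char, kFooterSize> EncodeFooter(const Footer &f) {
    std::array<unsigned char, kFooterSize> buf{};
    PutFixed32(buf.data() + 0, f.meta_block.size);
    PutFixed32(buf.data() + 4, f.meta_block.offset);
    PutFixed32(buf.data() + 8, f.index_block.size);
    PutFixed32(buf.data() + 12, f.index_block.offset);
    return buf;
}

Footer DecodeFooter(const std::array<unsigned char, kFooterSize> &buf) {
    Footer f{};
    f.meta_block.size = GetFixed32(buf.data() + 0);
    f.meta_block.offset = GetFixed32(buf.data() + 4);
    f.index_block.size = GetFixed32(buf.data() + 8);
    f.index_block.offset = GetFixed32(buf.data() + 12);
    return f;
}
```

Keeping the footer at a fixed size means a reader can seek to the last `kFooterSize` bytes of the file and find every other block from there.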
|
||
--- | ||
|
||
## Third-party dependencies: | ||
|
||
1. [spdlog](https://github.com/gabime/spdlog) | ||
|
@@ -92,16 +119,19 @@ MetaBlock_OffsetInfo records the size and offset of the MetaBlock; IndexBlock_OffsetInfo | |
4. [abseil](https://github.com/abseil/abseil-cpp) | ||
|
||
--- | ||
## References: | ||
|
||
## Useful references: | ||
|
||
1. [Alibaba Cloud NewSQL Database Competition](https://tianchi.aliyun.com/competition/entrance/531980/introduction) | ||
2. [corekv](https://github.com/hardcore-os/coreKV-CPP) | ||
3. [leveldb](https://github.com/google/leveldb) | ||
4. [How LSM trees work](https://zhuanlan.zhihu.com/p/181498475) | ||
5. [What is an LSM tree?](https://www.zhihu.com/question/446544471/answer/2348883977) | ||
6. [WAL](https://zhuanlan.zhihu.com/p/258091002) | ||
7. [Linux I/O: fsync, fflush, fwrite, mmap](https://juejin.cn/post/7001665675907301412) | ||
|
||
--- | ||
|
||
Thanks to [JetBrains](https://jb.gg/OpenSourceSupport) for donating product licenses to help develop **smallkv** <a href="https://jb.gg/OpenSourceSupport"><img src="img/jb_beam.svg" width="94" align="center" /></a> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
// | ||
// Created by qianyy on 2023/1/28. | ||
// | ||
#include "db.h" | ||
#include "db_impl.h" | ||
|
||
namespace smallkv { | ||
DB::DB(const Options &options) { | ||
db_impl = std::make_unique<DBImpl>(options); | ||
} | ||
|
||
DBStatus DB::Put(const WriteOptions &options, | ||
const std::string_view &key, | ||
const std::string_view &value) { | ||
return db_impl->Put(options, key, value); | ||
} | ||
|
||
DBStatus DB::Delete(const WriteOptions &options, | ||
const std::string_view &key) { | ||
return db_impl->Delete(options, key); | ||
} | ||
|
||
DBStatus DB::Get(const ReadOptions &options, | ||
const std::string_view &key, | ||
std::string *value) { | ||
return db_impl->Get(options, key, value); | ||
} | ||
|
||
DBStatus DB::BatchPut(const WriteOptions &options) { | ||
return db_impl->BatchPut(options); | ||
} | ||
|
||
DBStatus DB::BatchDelete(const ReadOptions &options) { | ||
return db_impl->BatchDelete(options); | ||
} | ||
|
||
DBStatus DB::Close() { | ||
return db_impl->Close(); | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
// | ||
// Created by qianyy on 2023/1/27. | ||
// | ||
#include <memory> | ||
#include <string>
#include <string_view>
#include "status.h" | ||
#include "options.h" | ||
|
||
#ifndef SMALLKV_DB_H | ||
#define SMALLKV_DB_H | ||
namespace smallkv { | ||
class DBImpl; | ||
|
||
class DB { | ||
public: | ||
explicit DB(const Options& options); | ||
|
||
~DB() = default; | ||
|
||
// DB should be a singleton: copying and assignment are disabled | ||
DB(const DB &) = delete; | ||
|
||
DB &operator=(const DB &) = delete; | ||
|
||
DBStatus Put(const WriteOptions &options, | ||
const std::string_view &key, | ||
const std::string_view &value); | ||
|
||
DBStatus Delete(const WriteOptions &options, | ||
const std::string_view &key); | ||
|
||
// Write the value associated with key to the address pointed to by value | ||
DBStatus Get(const ReadOptions &options, | ||
const std::string_view &key, | ||
std::string *value); | ||
|
||
// Batch write | ||
DBStatus BatchPut(const WriteOptions &options); | ||
|
||
DBStatus BatchDelete(const ReadOptions &options); | ||
|
||
// Close the database: calling this function guarantees that all written data is persisted to disk. | ||
DBStatus Close(); | ||
|
||
private: | ||
std::unique_ptr<DBImpl> db_impl; | ||
}; | ||
} | ||
#endif //SMALLKV_DB_H |
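The `DB` class forwards every call to a `DBImpl` held behind a `std::unique_ptr` (the pimpl idiom). The sketch below reproduces that pattern in a self-contained form. `Store`, `Status`, and the `std::map` backing are hypothetical stand-ins, not smallkv's types. One detail worth noting with this idiom: because `std::unique_ptr` needs a complete type to delete its pointee, the sketch only declares the destructor in the class and defaults it after the implementation type is defined.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <string_view>

enum class Status { kOk, kNotFound };  // hypothetical stand-in for DBStatus

// ---- what would live in the public header ----
class Store {
public:
    Store();
    ~Store();  // declared only; defined where Impl is a complete type

    Store(const Store &) = delete;
    Store &operator=(const Store &) = delete;

    Status Put(std::string_view key, std::string_view value);
    Status Get(std::string_view key, std::string *value) const;
    Status Delete(std::string_view key);

private:
    class Impl;  // forward declaration hides all implementation details
    std::unique_ptr<Impl> impl_;
};

// ---- what would live in the .cpp file ----
class Store::Impl {
public:
    std::map<std::string, std::string> table;  // toy in-memory backing store
};

Store::Store() : impl_(std::make_unique<Impl>()) {}
Store::~Store() = default;  // Impl is complete here, so unique_ptr can delete it

Status Store::Put(std::string_view key, std::string_view value) {
    impl_->table.insert_or_assign(std::string(key), std::string(value));
    return Status::kOk;
}

Status Store::Get(std::string_view key, std::string *value) const {
    auto it = impl_->table.find(std::string(key));
    if (it == impl_->table.end()) return Status::kNotFound;
    *value = it->second;
    return Status::kOk;
}

Status Store::Delete(std::string_view key) {
    auto it = impl_->table.find(std::string(key));
    if (it == impl_->table.end()) return Status::kNotFound;
    impl_->table.erase(it);
    return Status::kOk;
}
```

Callers only see the thin wrapper, so the implementation class can change without forcing code that includes the header to recompile.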