Background
Doris currently uses BDBJE to persist the metadata log, which performs poorly under highly concurrent writes.
BDBJE write latency ranges from 1 ms to tens of ms, and writes are serialized: only one write can be in flight at a time. At 1 ms per write, the theoretical maximum is therefore about 1000 TPS, which is a serious bottleneck in high-concurrency write scenarios.
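The 1000 TPS ceiling follows directly from serialized writes: with one write in flight at a time, throughput is the inverse of per-write latency. A minimal sketch of that arithmetic, where the class and method names are illustrative and not part of Doris:

```java
// Illustrative helper (not Doris code): upper bound on throughput when
// writes are fully serialized, i.e. only one write is in flight at a time.
final class SerialWriteThroughput {
    private SerialWriteThroughput() {}

    // With one write at a time, max TPS = 1000 ms per second / latency in ms.
    static double maxTps(double writeLatencyMs) {
        if (writeLatencyMs <= 0) {
            throw new IllegalArgumentException("latency must be positive");
        }
        return 1000.0 / writeLatencyMs;
    }
}
```

At the best-case latency of 1 ms this gives 1000 TPS; at 10 ms it drops to 100 TPS, so the ceiling quoted above is the optimistic end of the range.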
Pressure test
We stress-tested Doris's BDBJE on a single FE node using a create-database workload. With the lock in place, write throughput was about 800+ TPS; with the lock removed, it was about 1500+ TPS.
Industry Mainstream
ZooKeeper (ZK): write TPS of 30000+.
RaftKeeper: write TPS of 70000+ (see the RaftKeeper Benchmark).
Proposed Raft-based scheme
Current situation of Doris FE:
The general idea of managing the log with Raft:
Implement a RaftCore module in C++ that provides Doris FE log storage and replaces the existing BDBJE.
In the Java layer, implement a RaftJournal class that implements the Journal interface.
RaftCore Design:
1. Interface module: the C++ RaftCore provides interfaces for adding, deleting, and querying log entries, for changing cluster membership, and for initialization and shutdown. Only a JNI interface needs to be provided. The write and read JNI interfaces support concurrent calls.
2. Log module, state machine module, and leader election.
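The interface module described above could look roughly like the following from the Java side. This is only a sketch under assumptions: `RaftCoreApi`, `InMemoryRaftCore`, and all method names and parameters are hypothetical, and the in-memory class stands in for the C++ implementation that the proposal would reach through JNI, so that the example runs without a native library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical Java-side view of the RaftCore operations (names are assumptions).
// In the proposal these would be native methods implemented in C++.
interface RaftCoreApi {
    void init(String dataDir, List<String> peers); // initialization
    long append(byte[] entry);                     // add: returns the assigned log index
    byte[] read(long index);                       // query: null if the entry is absent
    void deleteUpTo(long index);                   // delete entries with index <= given
    void changeMembers(List<String> peers);        // membership change
    void close();                                  // shutdown
}

// In-memory stand-in (no replication, no election) used only for illustration.
final class InMemoryRaftCore implements RaftCoreApi {
    // Concurrent structures so append/read can be called from many threads.
    private final ConcurrentSkipListMap<Long, byte[]> log = new ConcurrentSkipListMap<>();
    private final AtomicLong nextIndex = new AtomicLong(1);
    private volatile List<String> members = new ArrayList<>();
    private volatile boolean open;

    public void init(String dataDir, List<String> peers) {
        members = new ArrayList<>(peers);
        open = true;
    }

    public long append(byte[] entry) {
        ensureOpen();
        long index = nextIndex.getAndIncrement();
        log.put(index, entry.clone());
        return index;
    }

    public byte[] read(long index) {
        ensureOpen();
        byte[] value = log.get(index);
        return value == null ? null : value.clone();
    }

    public void deleteUpTo(long index) {
        ensureOpen();
        log.headMap(index, true).clear();
    }

    public void changeMembers(List<String> peers) {
        ensureOpen();
        members = new ArrayList<>(peers);
    }

    public List<String> members() {
        return new ArrayList<>(members);
    }

    public void close() {
        open = false;
    }

    private void ensureOpen() {
        if (!open) {
            throw new IllegalStateException("RaftCore is not initialized");
        }
    }
}
```

Keeping the surface this small means the log module, state machine, and leader election stay entirely on the C++ side, and the Java layer only sees log indexes and byte arrays.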
RaftJournal Design:
When FE initializes RaftJournal, it calls the JNI interface to initialize the underlying C++ RaftCore.
RaftJournal internally uses the JNI interface to store and read metadata.
When FE shuts down, it calls the JNI interface to close the underlying C++ RaftCore.
Membership changes are also made through the JNI interface.
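The RaftJournal lifecycle above can be sketched as a thin Java wrapper that forwards each step to the native layer. Everything here is an assumption made for illustration: `SimpleJournal` is a simplified stand-in for the Doris Journal interface (which has more methods), `RaftCoreBridge` is a hypothetical name for the JNI-backed calls, and `RecordingBridge` is a fake that records calls so the example runs without a native library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Simplified, hypothetical journal contract (not the real Doris Journal interface).
interface SimpleJournal {
    void open();
    long write(byte[] record);
    byte[] read(long journalId);
    void close();
}

// Hypothetical bridge; in the proposal each method would be a JNI call into C++.
interface RaftCoreBridge {
    void init();
    long put(byte[] data);
    byte[] get(long id);
    void updateMembers(List<String> peers);
    void shutdown();
}

// Fake bridge that stores data in memory and records which calls were made.
final class RecordingBridge implements RaftCoreBridge {
    final List<String> calls = Collections.synchronizedList(new ArrayList<>());
    private final ConcurrentHashMap<Long, byte[]> store = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    public void init() { calls.add("init"); }
    public long put(byte[] data) {
        calls.add("put");
        long id = nextId.getAndIncrement();
        store.put(id, data.clone());
        return id;
    }
    public byte[] get(long id) { calls.add("get"); return store.get(id); }
    public void updateMembers(List<String> peers) { calls.add("updateMembers"); }
    public void shutdown() { calls.add("shutdown"); }
}

// Sketch of RaftJournal: every operation delegates to the bridge.
final class RaftJournalSketch implements SimpleJournal {
    private final RaftCoreBridge core;

    RaftJournalSketch(RaftCoreBridge core) { this.core = core; }

    public void open() { core.init(); }                         // FE startup
    public long write(byte[] record) { return core.put(record); }
    public byte[] read(long journalId) { return core.get(journalId); }
    public void changeMembers(List<String> peers) { core.updateMembers(peers); }
    public void close() { core.shutdown(); }                    // FE shutdown
}
```

Injecting the bridge through the constructor is a design choice of this sketch, not something stated in the proposal; it lets the Java layer be exercised in unit tests without loading the C++ library.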