How Query Node loads data #32698
Replies: 4 comments 3 replies
-
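Roughly speaking, each collection has a data channel, and one querynode is responsible for managing that channel; we call it the shard leader. It receives data from Pulsar through the channel and accumulates it in memory as a growing segment. Once the data reaches a certain size, for example 100 MB, that chunk is flushed to disk and becomes a sealed segment, and the other querynodes then load the sealed segments. Read this doc to learn more: https://milvus.io/blog/deep-dive-3-data-processing.md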
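The growing-to-sealed transition described above can be sketched with a toy model. This is only an illustration of the reply's description, not Milvus code: the `ShardLeader` class, its `consume` method, and the fixed 100 MB threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field

# Assumption: the "100 MB" figure is the example size from the reply,
# not a confirmed Milvus default.
SEAL_THRESHOLD_BYTES = 100 * 1024 * 1024


@dataclass
class ShardLeader:
    """Toy model of the querynode that owns a collection's data channel."""

    growing_bytes: int = 0  # size of the in-memory growing segment
    sealed_segments: list = field(default_factory=list)  # sizes of sealed segments

    def consume(self, batch_bytes: int) -> None:
        """Accumulate one batch from the channel; seal once the threshold is reached."""
        self.growing_bytes += batch_bytes
        if self.growing_bytes >= SEAL_THRESHOLD_BYTES:
            # The growing segment becomes a sealed segment and a new one starts.
            self.sealed_segments.append(self.growing_bytes)
            self.growing_bytes = 0


leader = ShardLeader()
for _ in range(25):
    leader.consume(10 * 1024 * 1024)  # 25 batches of 10 MB each

print(len(leader.sealed_segments), leader.growing_bytes)
```

In this sketch, only the sealed segments would be candidates for loading on other querynodes; whatever is still growing stays with the shard leader.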
-
So if one of my collections holds a very large amount of data, will the data be loaded across different querynodes?
-
"In Milvus, data cannot be read unless they are loaded. When the proxy receives a data load request, it sends the request to query coordinator which decides the way of assigning shards to different query nodes. The assigning information (i.e., the names of vchannels and the mapping between vchannels and their corresponding pchannels) is sent to query nodes via method call or RPC (remote procedure call). Subsequently, the query nodes create corresponding MsgStream objects to consume data." Based on this, my understanding is that a collection's data can be loaded onto different querynodes. You mentioned above that after inserting 10 GB of data, the other nodes will also show data. Is this 10 GB a built-in Milvus parameter, or a rule of thumb? And is the load onto the querynodes balanced?
-
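It looks like you have quite a lot of partitions. Every partition has its own growing data, and all growing data lives on the shard delegator. By default there is only one delegator, so most of the data ends up on a single machine.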
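The effect described above can be illustrated with a small simulation. It is a sketch under stated assumptions, not the actual Milvus balancing logic: `simulate_load` is a hypothetical helper, the round-robin placement of sealed segments is an assumption made for simplicity, and the partition count and sizes are invented numbers.

```python
def simulate_load(nodes, delegator, growing_mb_per_partition, sealed_segments_mb):
    """Return memory (MB) per node for a toy load.

    Growing data from every partition lands on the single delegator;
    sealed segments are spread round-robin (an assumption for this sketch).
    """
    memory = {node: 0 for node in nodes}
    for mb in growing_mb_per_partition:
        memory[delegator] += mb
    for i, mb in enumerate(sealed_segments_mb):
        memory[nodes[i % len(nodes)]] += mb
    return memory


nodes = [f"qn-{i}" for i in range(10)]
# Hypothetical workload: 90 partitions with 50 MB of growing data each,
# plus 10 sealed segments of 100 MB.
usage = simulate_load(nodes, "qn-1", [50] * 90, [100] * 10)

for node, mb in usage.items():
    print(node, mb)
```

With these made-up numbers, the delegator holds far more memory than any other node, which matches the kind of skew reported in this thread.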
-
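We currently have 10 QueryNode nodes. We noticed that when loading a collection into the query nodes, only one querynode's memory increased significantly (before -> after):
0.8G -> 0.89G
0.76 -> 5.51
0.71 -> 0.80
0.70 -> 0.74
0.65 -> 0.68
0.69 -> 0.71
0.89 -> 0.94
0.89 -> 0.99
0.74 -> 0.77
0.79 -> 0.79
The other nodes changed very little. Is a collection's data loaded onto only a single QueryNode? How exactly does this work?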