-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The system shuts down automatically if the template creation fails. #463
Conversation
@@ -66,6 +66,7 @@ public abstract class BaseMode { | |||
protected List<DataClient> dataClients = new ArrayList<>(); | |||
protected List<SchemaClient> schemaClients = new ArrayList<>(); | |||
protected Measurement baseModeMeasurement = new Measurement(); | |||
public static volatile boolean stopAllSchemaClient = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
感觉加这个有点依赖全局变量了,不太优雅
目前看代码这个变量有两个作用,一是元数据注册线程之间的通信,二是BaseMode依赖这个变量判断是否注册成功
对于一,我问了chatgpt ”如果一个线程想让其它正在等待的线程出现异常该怎么做“,它说“调用 reset() 方法会打破当前的屏障,并使所有正在等待的线程抛出 BrokenBarrierException”
对于二,我看了一遍调用过程,是从BaseMode一层层调用到ModelStrategy的,最好通过返回值的方式去传达元数据注册是否成功
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
方法一:这种之前尝试过,reset 后其余的线程能顺利执行,但我觉得既然注册 template 都失败了,那注册 database、与实践序列也没有必要了。
方法二:这个方式逻辑上确实清晰,但这个可能需要把其他数据库得方法也要修改,改动的多一些,当失败后打印失败信息,然后直接 return false。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
其实不存在方法一、二之分吧,只是说如果不用全局变量的话怎么处理这俩问题
对于“reset 后其余的线程能顺利执行”,reset后其它线程应该是否为BrokenBarrierException?catch到这个异常就不继续执行
对于“其他数据库得方法也要修改”,这个确实是个问题,能否给registerSchema接口改一下让接口抛异常?其它数据库的实现就不用改了
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| 对于“reset 后其余的线程能顺利执行”,reset后其它线程应该是否为BrokenBarrierException?catch到这个异常就不继续执行
感觉可以按照这个搞?如果是这样的话那就处理一下异常就行,不用额外维护一个全局变量,可能更清晰?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
当存在线程被屏障阻塞时,使用 reset 后能使其余线程抛出 BrokenBarrierException,然后 catch 后不继续运行也没有问题。对于二返回值有两个想法:
一、使用 callable 替换 runnable ,在主线程中获取每一个 schemaclient 的结果,当存在一个线程结果为 false,主线程注册元数据也返回 false,然后停止 bm。但是 callable 开销要比 runnable 大,担心影响测试。
二、使用全局变量,默认为 true,当一个 schemaClient 线程失败后,将其改为 false。当所有 schemaClient 结束后,主线程根据全局变量的结果决定是否继续执行,这个开销小一些。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
使用方法一。
try { | ||
if (config.isTEMPLATE() && templateInit.compareAndSet(false, true)) { | ||
Template template = null; | ||
if (config.isTEMPLATE() && !schemaList.isEmpty()) { | ||
template = createTemplate(schemaList.get(0)); | ||
} | ||
int sessionIndex = random.nextInt(sessionListMap.size()); | ||
Session templateSession = new ArrayList<>(sessionListMap.keySet()).get(sessionIndex); | ||
registerTemplate(templateSession, template); | ||
} | ||
} finally { | ||
templateBarrier.await(); | ||
if (BaseMode.isStopAllSchemaClient()) { | ||
return; | ||
} | ||
int sessionIndex = random.nextInt(sessionListMap.size()); | ||
Session templateSession = new ArrayList<>(sessionListMap.keySet()).get(sessionIndex); | ||
registerTemplate(templateSession, template); | ||
} | ||
templateBarrier.await(); | ||
for (Map.Entry<Session, List<TimeseriesSchema>> pair : sessionListMap.entrySet()) { | ||
registerDatabases(pair.getKey(), pair.getValue()); | ||
try { | ||
for (Map.Entry<Session, List<TimeseriesSchema>> pair : sessionListMap.entrySet()) { | ||
registerDatabases(pair.getKey(), pair.getValue()); | ||
} | ||
} finally { | ||
schemaBarrier.await(); | ||
if (BaseMode.isStopAllSchemaClient()) { | ||
return; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
感觉没必要try多次?一个大try就行?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
使用多个 try 是保证每个屏障都能正确执行,防止死锁。不过要是使用上面提到的方法二,这里确实使用一个 try 就 ok。这样我去看看 方法二 都要修改哪些部分。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如果上面用 reset 的话应该就用一个就行?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Nice and clear~
问题
场景
将 device 均分给每个 schemaClient 创建,创建 template -> 创建 database -> 激活 template -> 创建时间序列,使用 template 元数据的流程存在顺序关系:
所以,此处使用 3 个 CyclicBarrier :templateBarrier、schemaBarrier、activateTemplateBarrier。
解法