
Daemon process #18

Merged
merged 26 commits into master from daemon_process on Aug 25, 2017

Conversation


@martinyunify martinyunify commented Aug 23, 2017

daemon process #9
-- fill nic pool with nic ids
-- maintain nic info table
-- delete cached nics on exit
-- manage and allocate gateway nic
-- add version number generator script
-- validate vxnet before launching server #16
-- update README
-- refactor nic pool
-- add prometheus monitoring
-- add clean cache command

Fan Shang Xiang added 4 commits August 18, 2017 15:58
adopt grpc
define protocol
update deps
add nicpool event loop
refactor provider
@martinyunify martinyunify requested a review from jolestar August 23, 2017 03:00
@jolestar jolestar left a comment


Apart from the small issues above, there's nothing major overall, but two points need attention:

  1. There are quite a few places that do asynchronous work via pool.Add / pool.Done; a bug there could lead to a deadlock.
  2. The NicPool logic needs to be verified with unit tests somehow.


resoureStub *qingcloud.QCNicProvider

sync.WaitGroup

Here the pool itself is both a sync.RWMutex and a sync.WaitGroup; it feels odd when reading the later code. It would be easier to understand if they were separate named fields. Just a suggestion.
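A minimal sketch of the suggestion (the field names and the trimmed-down struct are illustrative, not the actual NicPool definition): keeping the mutex and wait group as named fields makes call sites read more explicitly than the embedded version.

```go
package pool

import "sync"

// NicInfo stands in for the real NIC record type.
type NicInfo struct{ ID string }

// Embedded style (as in the PR): pool.RLock() / pool.Done() read like
// NicPool methods, which is what makes the later code harder to follow.
type NicPool struct {
	sync.RWMutex
	sync.WaitGroup
	nicDict map[string]*NicInfo
}

// Named-field style (the suggestion): each primitive's role is explicit.
type nicPoolAlt struct {
	mu      sync.RWMutex   // guards nicDict
	jobs    sync.WaitGroup // tracks in-flight background work
	nicDict map[string]*NicInfo
}

func (p *nicPoolAlt) getNic(id string) (*NicInfo, bool) {
	p.mu.RLock()
	defer p.mu.RUnlock()
	nic, ok := p.nicDict[id]
	return nic, ok
}
```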

go func() {
defer pool.Done()
//check if nic is in dict
pool.RLock()

This lock mainly protects pool.nicDict. I think you should either factor out a synchronized map, or wrap the map lookups in a func; otherwise these lock/unlock pairs show up in many places and are easy to get wrong.
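A rough sketch of the second option, assuming illustrative names: nicDict access goes behind small helpers so the lock/unlock pairs live in one place.

```go
package pool

import "sync"

// NicInfo stands in for the real NIC record type.
type NicInfo struct{ ID string }

// nicDict wraps the map and its lock so callers never touch the mutex.
type nicDict struct {
	mu sync.RWMutex
	m  map[string]*NicInfo
}

func newNicDict() *nicDict {
	return &nicDict{m: make(map[string]*NicInfo)}
}

func (d *nicDict) Get(id string) (*NicInfo, bool) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	nic, ok := d.m[id]
	return nic, ok
}

func (d *nicDict) Set(id string, nic *NicInfo) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.m[id] = nic
}

func (d *nicDict) Delete(id string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	delete(d.m, id)
}
```

Later commits in this PR take the other route and switch to a concurrent map (the cmap.New() call in the snippet further down), which addresses the same concern.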

return err
}
nic, err := nicProvider.CreateNic()
conn, err := grpc.Dial(conf.BindAddr,grpc.WithInsecure())

BindAddr should have a default value; if it isn't configured, fall back to the default.
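A hedged sketch of the fallback (DefaultBindAddr, the Config field layout, and the address value are assumptions, not the project's actual identifiers):

```go
package config

// DefaultBindAddr is a hypothetical default; the real value would follow
// the project's conventions (e.g. a localhost gRPC endpoint).
const DefaultBindAddr = "127.0.0.1:31080"

type Config struct {
	BindAddr string
}

// Normalize fills in defaults for fields the user left empty, so callers
// like grpc.Dial(conf.BindAddr, ...) always see a usable address.
func (c *Config) Normalize() {
	if c.BindAddr == "" {
		c.BindAddr = DefaultBindAddr
	}
}
```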

Fan Shang Xiang added 10 commits August 23, 2017 15:03
validate vxnet during bootstrap
validate
validate vxnet
validate vxnet
add ready pool to recycle nic first

migrate to concurrent map in gateway and nicinfo
get all items in gateway map
@martinyunify martinyunify (author) commented

To avoid deadlocks, the NIC info table and the NIC buffer pool are maintained separately. They can get out of sync, so a validation func was added to make sure the NIC we hand out is online. The WaitGroup is there because returned NICs need to be deleted on shutdown, and Wait is only called during shutdown.
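A simplified sketch of the handout path described above (the validationFunc name, signature, and BorrowNic are assumptions, not the PR's exact code): NICs drawn from the buffer pool are only returned to callers after the validation hook confirms they are still online, dropping stale entries where the info table and the pool diverged.

```go
package pool

import "errors"

// validationFunc reports whether a cached NIC is still usable.
type validationFunc func(nicID string) bool

type NicPool struct {
	nicpool  chan string
	validate validationFunc
}

var errPoolEmpty = errors.New("nic pool exhausted")

// BorrowNic keeps drawing from the pool until a NIC passes validation,
// silently discarding entries that went offline out of band.
func (p *NicPool) BorrowNic() (string, error) {
	for {
		select {
		case nicID := <-p.nicpool:
			if p.validate == nil || p.validate(nicID) {
				return nicID, nil
			}
			// stale entry: the info table and the pool were out of sync
		default:
			return "", errPoolEmpty
		}
	}
}
```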

cachedlist = append(cachedlist, pkg.StringPtr(nic))
log.Debugf("Got nic %s in nic pool", nic)
}
err := pool.nicProvider.ReclaimNic(cachedlist)

The operation that releases idle NICs could be exposed as an interface and invoked by the system. Don't do it while shutting down the pool: the hostnic daemon is shut down inside an IaaS job, and launching a new job from within a job will deadlock.
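A sketch of that suggestion (CleanUpCachedNics and the trimmed struct are hypothetical): the reclaim call becomes an explicitly invoked operation rather than part of pool shutdown, so the daemon's own shutdown job never spawns a nested IaaS job. The "add clean cache command" bullet in the PR description points at this kind of externally triggered cleanup.

```go
package pool

// nicProvider matches the reclaim call seen in the PR snippet
// (ReclaimNic takes a slice of *string NIC IDs).
type nicProvider interface {
	ReclaimNic(nicIDs []*string) error
}

// NicPool is trimmed down to the fields this sketch needs.
type NicPool struct {
	nicpool     chan string
	nicProvider nicProvider
}

// CleanUpCachedNics drains the cached NICs and asks the provider to reclaim
// them. It is meant to be triggered externally (e.g. a "clean cache"
// command), not from the pool shutdown path.
func (p *NicPool) CleanUpCachedNics() error {
	var cached []*string
	for {
		select {
		case nic := <-p.nicpool:
			id := nic
			cached = append(cached, &id)
		default:
			if len(cached) == 0 {
				return nil
			}
			return p.nicProvider.ReclaimNic(cached)
		}
	}
}
```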

pool := &NicPool{
nicDict: cmap.New(),
nicpool: make(chan string, size),
nicReadyPool: make(chan string, ReadyPoolSize),

I feel the nicReadyPool size could simply be set to the maximum number of NICs the system supports. If the ready pool is full, that means the limit has been reached and the NIC can just be deleted. A background goroutine could then periodically trim the ready pool back to a reasonable size, e.g. ReadyPoolSize. The logic seems clearer that way, and it handles bursts of containers being repeatedly created and destroyed more gracefully. Please consider it.
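A rough sketch of this idea (MaxNicCount, trimInterval, and the release hook are illustrative assumptions): the ready pool is sized at the system's NIC limit, puts never block, and a background goroutine periodically trims it back toward ReadyPoolSize.

```go
package pool

import "time"

const (
	// MaxNicCount stands in for the maximum number of NICs the system
	// supports; the real limit would come from the IaaS side.
	MaxNicCount   = 60
	ReadyPoolSize = 8
	trimInterval  = time.Minute
)

type readyPool struct {
	ch      chan string
	release func(nicID string) // deletes/reclaims a NIC we no longer keep
}

func newReadyPool(release func(string)) *readyPool {
	return &readyPool{ch: make(chan string, MaxNicCount), release: release}
}

// put never blocks: a full pool means the NIC limit has been reached,
// so the NIC is released instead of being cached.
func (p *readyPool) put(nicID string) {
	select {
	case p.ch <- nicID:
	default:
		p.release(nicID)
	}
}

// trimLoop periodically shrinks the ready pool back to ReadyPoolSize so a
// burst of container churn does not leave excess NICs cached forever.
func (p *readyPool) trimLoop(stop <-chan struct{}) {
	ticker := time.NewTicker(trimInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
		drain:
			for len(p.ch) > ReadyPoolSize {
				select {
				case nic := <-p.ch:
					p.release(nic)
				default:
					break drain
				}
			}
		case <-stop:
			return
		}
	}
}
```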

Fan Shang Xiang added 5 commits August 24, 2017 16:42
@martinyunify martinyunify changed the title from WIP: Daemon process to Daemon process on Aug 25, 2017
Fan Shang Xiang added 2 commits August 25, 2017 15:26
add unit test cases
fix unit test cases
add test case for clearcache
@martinyunify martinyunify merged commit 999f318 into master Aug 25, 2017
@martinyunify martinyunify deleted the daemon_process branch August 25, 2017 08:17
@martinyunify martinyunify restored the daemon_process branch August 25, 2017 08:26
@martinyunify martinyunify deleted the daemon_process branch August 25, 2017 08:27