feat(crawler): add canceled context to Visit func #27
pkg/crawler/crawler.go
Outdated
@@ -63,6 +63,9 @@ func NewCrawler(opt Option) Crawler {

func (c *Crawler) Crawl(ctx context.Context) error {
	log.Println("Crawl maven repository and save indexes")
	ctx, ctxCancelFunc := context.WithCancel(ctx)
- ctx, ctxCancelFunc := context.WithCancel(ctx)
+ ctx, cancel := context.WithCancel(ctx)
nit: I think cancel is the conventional name.
https://go.dev/doc/database/cancel-operations
Done - 8f9c98b
pkg/crawler/crawler.go
Outdated
go func() {
	for _, child := range children {
		c.urlCh <- url + child
	}
}()
What about just checking the context here?
- go func() {
- 	for _, child := range children {
- 		c.urlCh <- url + child
- 	}
- }()
+ go func() {
+ 	for _, child := range children {
+ 		select {
+ 		// Context can be canceled if we receive an error from another Visit function.
+ 		case <-ctx.Done():
+ 			return
+ 		default:
+ 			c.urlCh <- url + child
+ 		}
+ 	}
+ }()
pkg/crawler/crawler.go
Outdated
case <-ctx.Done():
	return nil
default:
	resp, err := c.http.Get(url)
It is a good opportunity to make the HTTP request context-aware.
- resp, err := c.http.Get(url)
+ req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
+ if err != nil {
+ 	return xerrors.Errorf("unable to create HTTP request: %w", err)
+ }
+ resp, err := client.Do(req)
+ if err != nil {
Good point!
Created this in 262f22d.
Description
When we get an error from the Visit function, we close c.urlCh. But there are cases when another Visit function tries to write to c.urlCh before we return the error. In this case we get a panic instead of an error.
e.g. - https://github.com/aquasecurity/trivy-java-db/actions/runs/8041756834/job/21961450007#step:5:619
To avoid this, we need to add a context with a cancel function for the Visit function.