-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Datadogfleetautomation extension POC #5272
base: prod
Are you sure you want to change the base?
Conversation
GetModuleInfos() service.ModuleInfos | ||
} | ||
|
||
if host, ok := host.(exportModules); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔵 Code Quality Violation
Avoid modifying function parameters (...read more)
Assigning new values to function parameters exhibits several bad coding practices and should be avoided for several reasons:
- Redefining parameters: The code redefines a parameter within the function body by assigning a new value. This is considered a bad practice because it can lead to confusion and make the code harder to understand. Modifying a function parameter in this manner breaks the expected behavior and can cause unexpected side effects. It is generally best to treat function parameters as read-only values and avoid reassigning them.
- Shadowing variables: The code further exacerbates the issue by using the short variable declaration
:=
to define a new variable within the function body. This shadows the original parameter, making it inaccessible within the function. Shadowing variables can cause confusion and make the code harder to reason about. It is better to use distinct variable names to maintain clarity and avoid any unintended side effects.
To write more maintainable and understandable code, it is advisable to adhere to the following practices:
- Avoid redefining function parameters.
- Use descriptive and unambiguous variable names.
- Avoid shadowing variables.
- Maintain consistency in variable references.
By following these best practices, the code becomes more readable and easier to manage and avoids introducing unnecessary complexity and confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: update to GetModulesInfo() in service/hostcapabilities
extension/datadogfleetautomationextension/fleetautomationextension.go
Outdated
Show resolved
Hide resolved
extension/datadogfleetautomationextension/fleetautomationextension_test.go
Show resolved
Hide resolved
extension/datadogfleetautomationextension/fleetautomationextension.go
Outdated
Show resolved
Hide resolved
extension/datadogfleetautomationextension/fleetautomationextension.go
Outdated
Show resolved
Hide resolved
Addressing the golint violations |
cb116ca
to
34ae621
Compare
34ae621
to
48930cd
Compare
extensionConfig *Config | ||
extensionID component.ID | ||
telemetry component.TelemetrySettings | ||
collectorConfig *confmap.Conf | ||
collectorConfigStringMap map[string]any | ||
ticker *time.Ticker | ||
done chan bool | ||
mu sync.RWMutex | ||
uuid uuid.UUID | ||
|
||
buildInfo component.BuildInfo | ||
moduleInfo service.ModuleInfos | ||
moduleInfoJSON *moduleInfoJSON | ||
activeComponentsJSON *activeComponentsJSON | ||
version string | ||
|
||
forwarder defaultForwarderInterface | ||
compressor *compression.Compressor | ||
serializer serializer.MetricSerializer | ||
|
||
agentMetadataPayload AgentMetadata | ||
otelMetadataPayload OtelMetadata | ||
|
||
httpServer *http.Server | ||
healthCheckV2Enabled bool | ||
healthCheckV2ID *component.ID // currently first healthcheckv2 extension found; could expand to multiple health checks if needed | ||
healthCheckV2Config map[string]any | ||
componentStatus map[string]any // retrieved from healthcheckv2 extension, if enabled/configured | ||
|
||
hostnameProvider source.Provider |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it is doing many things. Can we split this multiple internal packages. Like http server can be its own package
package datadogfleetautomationextension // import "github.com/open-telemetry/opentelemetry-collector-contrib/extension/datadogfleetautomationextension" | ||
|
||
import ( | ||
"encoding/json" | ||
"fmt" | ||
|
||
"github.com/DataDog/datadog-agent/pkg/serializer/marshaler" | ||
"go.opentelemetry.io/collector/component" | ||
) | ||
|
||
type OtelCollector struct { | ||
HostKey string `json:"host_key"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move it to internal/payload package.
// Copyright The OpenTelemetry Authors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
package datadogfleetautomationextension // import "github.com/open-telemetry/opentelemetry-collector-contrib/extension/datadogfleetautomationextension" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be name it datadogfaextension
?
type defaultComponentChecker struct { | ||
extension *fleetAutomationExtension | ||
} | ||
|
||
func (d *defaultComponentChecker) isComponentConfigured(typ string, componentsKind string) (bool, *component.ID) { | ||
return d.extension.isComponentConfigured(typ, componentsKind) | ||
} | ||
|
||
func (d *defaultComponentChecker) isModuleAvailable(componentType string, componentKind string) bool { | ||
return d.extension.isModuleAvailable(componentType, componentKind) | ||
} | ||
|
||
func (d *defaultComponentChecker) isHealthCheckV2Enabled() (bool, error) { | ||
return d.extension.isHealthCheckV2Enabled() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we decouple component checker and extension. may be make component checker its own module
if componentsKind == extensionsKind { | ||
if componentStatus, ok := componentsConfig[componentsKind].(map[string]any); ok { | ||
if receiversStatus, ok := componentStatus["components"].(map[string]any); ok { | ||
componentNameInPipeline := componentKind + ":" + id | ||
if componentStatus, ok := receiversStatus[componentNameInPipeline].(map[string]any); ok { | ||
result = componentStatus | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should try to avoid arrow pattern. see https://blog.codinghorror.com/flattening-arrow-code/
try changing to
if componentsKind != extensionsKind {
continue
}
if pipelineMap, ok := value.(map[string]any); ok { | ||
if components, ok := pipelineMap["components"].(map[string]any); ok { | ||
for component, status := range components { | ||
componentParts := strings.Split(component, ":") | ||
if len(componentParts) != 2 { | ||
continue | ||
} | ||
kind := componentParts[0] | ||
fullID := componentParts[1] | ||
if kind == componentKind && id == fullID { | ||
if componentStatus, ok := status.(map[string]any); ok { | ||
result[key] = map[string]any{ | ||
component: componentStatus, | ||
} | ||
} | ||
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
e.mu.Lock() | ||
defer e.mu.Unlock() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need mutex here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't think too hard about this one; just knew that other extensions that used ConfigWatcher also used a mutex, and I also am not too clear on how multiple instances of the same extension would work. maybe not necessary for this.
48930cd
to
9effea7
Compare
// ErrUnsetAPIKey is returned when the API key is not set. | ||
ErrUnsetAPIKey = datadogconfig.ErrUnsetAPIKey | ||
// ErrEmptyEndpoint is returned when endpoint is empty | ||
ErrEmptyEndpoint = datadogconfig.ErrEmptyEndpoint | ||
// ErrAPIKeyFormat is returned if API key contains invalid characters | ||
ErrAPIKeyFormat = datadogconfig.ErrAPIKeyFormat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to export all of these, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's true I can probably move most things to un-exported variables/functions
// NonHexRegex is a regex of characters that are always invalid in a Datadog API Key | ||
NonHexRegex = datadogconfig.NonHexRegex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
// DefaultSite is the default site for the Datadog API. | ||
DefaultSite = datadogconfig.DefaultSite | ||
// NonHexChars is a regex of characters that are always invalid in a Datadog API key. | ||
NonHexChars = datadogconfig.NonHexChars | ||
// DefaultReporterPeriod is the default amount of time between sending fleet automation payloads to Datadog. | ||
DefaultReporterPeriod = 20 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
// ReporterPeriod sets the amount of time between sending fleet automation payloads to Datadog. | ||
ReporterPeriod time.Duration `mapstructure:"reporter_period"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to make this configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we allow it to be configured in hostmetadata reporter so I figured it made sense to keep the configuration the same. I don't actually care if we let users configure it or not and would just as easily remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just remove it for this first version
// APIKeyValidator is a function that validates the API key, provided to newExtension for mocking | ||
APIKeyValidator func(context.Context, string, *zap.Logger, *datadog.APIClient) error | ||
// SourceProviderGetter is a function that returns a source.Provider, provided to newExtension for mocking | ||
SourceProviderGetter func(component.TelemetrySettings, string, time.Duration) (source.Provider, error) | ||
// ForwarderGetter is a function that returns a defaultforwarder.Forwarder, provided to newExtension for mocking | ||
ForwarderGetter func(coreconfig.Component, corelog.Component) defaultforwarder.Forwarder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These can all be private
collectorConfig *confmap.Conf | ||
collectorConfigStringMap map[string]any | ||
ticker *time.Ticker | ||
done chan bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the done
channel used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is used to close the ticker which sends the metadata every "reporterperiod" amount of time, but it looks like you found it and we can just do a cancellable context instead
} | ||
} | ||
|
||
func (e *fleetAutomationExtension) checkHealthCheckV2() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why we do all this. We could just call the same API that the healthcheck extension uses and I think this would be much simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. I don't know that it makes sense to duplicate healthCheckV2 functionality in our own extension, but it would make us less tightly coupled to healthcheckv2 and allow us to iterate on our own cadence.
I'll work on a refactor that does it that way instead and see if that is simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to duplicate its functionality, just the bits that get the info from the host. We don't need to necessarily use the format that the extension is using
if err != nil { | ||
e.telemetry.Logger.Error("Failed to prepare and send fleet automation payloads", zap.Error(err)) | ||
} | ||
case <-e.done: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this I would pass a context and wait for the context to be cancelled
e.mu.RLock() | ||
defer e.mu.RUnlock() | ||
if e.hostnameSource == "unset" { | ||
e.telemetry.Logger.Info("Skipping fleet automation payloads since the hostname is empty") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case should just not happen, we should have bailed out if the extension cannot work properly with the user provided configuration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I thought maybe it made sense to log it, and allow the user to update the config (collector technically allows config changes while running, right?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, well, there is very limited functionality related to this, we can always revert back to not adding an error in the future (but it's harder to go from "no error" -> error)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be in its own internal package
7bb0557
to
3176d4f
Compare
3176d4f
to
e7ed96a
Compare
Description
initial POC for Datadog Fleet Automation Extension
Link to tracking issue
Fixes Jira OTEL-2445
Testing
Full unit testing (todo: will open a card to write a few integration tests)
Documentation
Have only written a barebones README.md to pass checks; figured that would make sense to do after we make changes or add/remove/adjust functionality.
Additional notes
Does not yet send the new, third payload type. I am working on a separate branch to add that payload and am planning to open a second PR after this one is accepted to our fork