-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: simplify health/readiness checks #79
Conversation
Signed-off-by: Valery Piashchynski <[email protected]>
WalkthroughThe changes in this pull request involve modifications to several files related to the health check and job reporting functionalities of a Go application. Key updates include the introduction of new data structures for reporting plugin and job statuses, enhancements to the health check mechanism for improved observability, and adjustments to configuration files to support better job management. Additionally, some linters were removed from the configuration, and tests were updated to improve reliability and clarity. Changes
Assessment against linked issues
Possibly related PRs
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Outside diff range and nitpick comments (2)
tests/configs/.rr-jobs-status.yaml (1)
14-14
: Looks good! Consider monitoring performance and resource utilization post-deployment.Increasing
num_pollers
from 1 to 2 can potentially improve the system's ability to process jobs concurrently, leading to better performance and reduced processing times.However, it's important to monitor the system's resource utilization (CPU, memory) and overall performance after deploying this change to production. If the increased concurrency leads to resource contention or degraded performance, you may need to adjust the
num_pollers
value or scale the system's resources accordingly.Consider collecting metrics and logs related to job processing, queue lengths, and resource utilization. Analyze this data to determine if further tuning of the
num_pollers
parameter is necessary to achieve the desired performance characteristics while ensuring the system remains stable and responsive.ready.go (1)
36-37
: Consider initializing thereport
slice with a more appropriate capacity.The current initialization of the
report
slice with a capacity of 2 seems arbitrary and may not be sufficient for all scenarios. Consider initializing the slice with a capacity equal to the number of plugins in thestatusRegistry
to avoid unnecessary allocations and improve performance.-report := make([]*Report, 0, 2) +report := make([]*Report, 0, len(rd.statusRegistry))
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (2)
go.work.sum
is excluded by!**/*.sum
tests/go.sum
is excluded by!**/*.sum
Files selected for processing (9)
- .golangci.yml (0 hunks)
- api.go (1 hunks)
- health.go (2 hunks)
- jobs.go (2 hunks)
- plugin.go (0 hunks)
- ready.go (2 hunks)
- tests/configs/.rr-jobs-status.yaml (1 hunks)
- tests/configs/.rr-status-503.yaml (2 hunks)
- tests/plugin_test.go (14 hunks)
Files not reviewed due to no reviewable changes (2)
- .golangci.yml
- plugin.go
Additional comments not posted (17)
tests/configs/.rr-status-503.yaml (2)
14-20
: LGTM! Consider adjusting the settings for production deployment.The introduction of the
jobs
section with parameters for job management aligns with the PR objective of enhancing job processing capabilities. The current settings seem suitable for testing or low-traffic scenarios.For production deployment, consider the following:
- Adjust
num_pollers
andnum_workers
based on the expected workload and system resources to ensure optimal performance and scalability.- Fine-tune the
pipeline_size
value based on the specific use case and available system resources to strike a balance between throughput and resource utilization.- Monitor the
allocate_timeout
anddestroy_timeout
values in production and adjust them if necessary to ensure efficient resource management without causing resource contention or starvation.
32-32
: Looks good!The
grace_period
value of 10 seconds seems appropriate for allowing the server to shut down gracefully. It provides sufficient time for the server to complete any ongoing tasks and release resources gracefully before terminating.api.go (2)
3-7
: LGTM!The
Report
struct is well-defined and aligns with the intended usage. The fields are appropriately named and tagged for JSON serialization.
9-19
: LGTM!The
JobsReport
struct is well-defined and captures essential job-related metrics. The fields are appropriately named and tagged for JSON serialization. This struct will be valuable for monitoring and debugging job performance and health.jobs.go (2)
49-60
: LGTM!The code correctly populates the
report
slice withJobsReport
instances created from thejobStates
elements. The field mappings are accurate, and the logic is sound.
62-66
: LGTM!The code correctly marshals the
report
slice into JSON format and handles any errors that may occur during the process. The error handling ensures that any issues are logged appropriately.health.go (5)
4-4
: LGTM!The
"encoding/json"
package import is necessary for JSON serialization and deserialization in theServeHTTP
method.
33-34
: LGTM!The
report
slice is initialized with a capacity of 2, which is a good practice to avoid unnecessary allocations. The comment provides clarity on the purpose of thereport
slice.
37-106
: Comprehensive health check implementation!The code block handles the case when no plugins are provided in the query. It performs a comprehensive health check by iterating over all registered plugins and checking their status. The detailed reports appended to the
report
slice provide valuable information about each plugin's health status, including error messages and status codes.The code handles various scenarios such as uninitialized plugins, plugins returning nil, and unexpected status codes. The response is properly serialized to JSON and written to the response writer.
Overall, this code block enhances the health check functionality by providing a comprehensive overview of all plugins' health status when no specific plugins are requested.
108-160
: Targeted health check implementation!The code block iterates over all provided plugins in the query and performs a targeted health check. It checks the status of each plugin and appends detailed reports to the
report
slice, including error messages and status codes.The code handles various scenarios such as uninitialized plugins, plugins returning nil, and unexpected status codes. Setting the response status code to
unavailableStatusCode
for status codes >= 500 is appropriate.Additionally, the code logs information when a plugin does not support health checks, providing visibility into unsupported plugins.
Overall, this code block enhances the health check functionality by allowing targeted health checks for specific plugins provided in the query.
161-168
: LGTM!The
report
slice is serialized to JSON, and the serialized JSON response is written to the response writer, which is the correct way to send the response.Any error during JSON marshaling is logged, which is important for debugging purposes.
ready.go (4)
40-96
: LGTM!The code changes for handling the readiness checks of all plugins when no specific plugins are provided in the query parameters look good. The code flow, error handling, and logging seem appropriate. The
report
slice is being populated correctly based on the readiness status of each plugin.
111-159
: LGTM!The code changes for handling the readiness checks of the provided plugins in the query parameters look good. The code flow, error handling, and HTTP status code setting seem appropriate. The
report
slice is being populated correctly based on the readiness status of each provided plugin.
160-162
: LGTM!The code correctly logs an informational message when a provided plugin does not support readiness checks and continues execution without adding any report for the unsupported plugin.
169-170
: LGTM!The code correctly writes the JSON response to the
ResponseWriter
. Although the error returned byw.Write
is being ignored, it is a common practice in HTTP handlers.tests/plugin_test.go (2)
661-680
: LGTM!The
checkHTTPStatusAll
function correctly verifies the response from the/health
endpoint without any query parameters. The changes improve the reliability of the test by unmarshalling the JSON response and asserting the contents of thestatus.Report
slice.
719-733
: LGTM!The
checkHTTPReadinessAll
function correctly verifies the response from the/ready
endpoint without any query parameters. The changes improve the reliability of the test by unmarshalling the JSON response into a slice ofstatus.Report
structs and asserting the response status code.
Signed-off-by: Valery Piashchynski <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (1)
health.go (1)
33-35
: Consider setting thereport
slice capacity based on the expected number of plugins.The current capacity of 2 seems arbitrary. To avoid unnecessary allocations, set the capacity based on the expected number of plugins.
For example, if the expected number of plugins is known, you can set the capacity as follows:
report := make([]*Report, 0, expectedPluginCount)If the expected number of plugins is unknown, you can use the length of the
statusRegistry
map as a reasonable estimate:report := make([]*Report, 0, len(rd.statusRegistry))
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- health.go (2 hunks)
- jobs.go (2 hunks)
- ready.go (2 hunks)
Files skipped from review as they are similar to previous changes (1)
- ready.go
Additional comments not posted (8)
jobs.go (4)
45-46
: Suggestion implemented: Usinglen(jobStates)
as the capacity for thereport
slice.The code has been updated to set the capacity of the
report
slice tolen(jobStates)
, as suggested in the previous review. This ensures that the slice has enough capacity to store all the job reports without the need for reallocation.
49-60
: LGTM: Structured approach for job state reporting.The code segment populates the
report
slice by iterating overjobStates
and creatingJobsReport
instances. This change enhances code clarity and maintainability by using a structured approach to represent job state information. The implementation looks good.
62-66
: LGTM: JSON serialization and error handling.The code segment serializes the
report
slice into JSON format usingjson.Marshal
, which is a standardized format that is easier for clients to consume. Error handling has been introduced to ensure that any issues during the JSON marshaling process are logged appropriately. The implementation looks good.
68-70
: Suggestion implemented: Handling the error returned byw.Write
.The code has been updated to handle the error returned by
w.Write
, as suggested in the previous review. This ensures that any issues during the write operation are logged appropriately.health.go (4)
38-106
: LGTM!The code segment handling the scenario where no plugins are provided in the query looks good. It comprehensively checks the health of all registered plugins, handles various error scenarios, and provides a structured JSON response. The error handling and logging are also implemented appropriately.
111-163
: LGTM!The code segment handling the scenario where specific plugins are provided in the query looks good. It checks the health of the specified plugins, handles various error scenarios, and provides a structured JSON response. The error handling and logging are also implemented appropriately.
165-168
: LGTM!The code segment for serializing the
report
slice to JSON looks good. It properly marshals the data and logs any errors that occur during the process.
170-174
: LGTM!The code segment for writing the serialized JSON response to the response writer looks good. It properly writes the data and logs any errors that occur during the process.
Reason for This PR
closes: roadrunner-server/roadrunner#1998
Description of Changes
For the
/jobs
endpoint, array:License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.
PR Checklist
[Author TODO: Meet these criteria.]
[Reviewer TODO: Verify that these criteria are met. Request changes if not]
git commit -s
).CHANGELOG.md
.Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests