From 61086c220933b22e617a0bc0829e5d8139b8fa32 Mon Sep 17 00:00:00 2001
From: sg
Date: Mon, 28 Oct 2024 14:04:17 +0000
Subject: [PATCH] component release channels RFC

---
 docs/dep/dep-006.md | 127 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 127 insertions(+)
 create mode 100644 docs/dep/dep-006.md

diff --git a/docs/dep/dep-006.md b/docs/dep/dep-006.md
new file mode 100644
index 000000000..d60fb5f75
--- /dev/null
+++ b/docs/dep/dep-006.md
@@ -0,0 +1,127 @@
+# Component Release Channels and Component Maturity Levels
+
+## Introduction
+
+The new SDK makes it trivial to add a component, but with great power come
+great component-writing requirements.
+The more components we have, the harder it is to check that they are of high
+quality and user-friendly.
+This document attempts to lay a foundation of requirements: the attributes a
+component should have in order to be considered mature enough for
+wider use.
+It also describes component release channels and the graduation
+criteria for every channel.
+
+## The problem
+
+We have currently written ~50 components, all of which are useful in one way or
+another.
+However, this organic and largely unplanned growth of the component base has
+incurred a cost in quality.
+A lot of these components do *something*, but they lack documentation, are often
+unstable or, worse, can be run only with specific arguments and do not integrate
+well with a target system.
+Further, a lot of the enrichers pass their unit tests but have not been
+validated against real-world usage, or have not been validated for accuracy at
+all (e.g. reachability).
+Furthermore, both scanners and reporters tend to use fields aimed at humans
+(e.g. the issue Description field) as data dumps.
+
+## The solution
+
+The usual way to solve this is with:
+* extensive testing
+* release channels
+* strict graduation criteria
+* proper communication of component maturity to users
+
+### Release channels
+
+The following are suggested:
+
+* `experimental`: the component has no guarantees and is heavily work in
+  progress. This channel is aimed at internal use.
+* `alpha`: the component works and is usable but unreliable; it may fail or
+  produce unpredictable results from time to time. This channel is for
+  cutting-edge users who don't mind the unreliability but need a quick solution.
+* `beta`: the component works relatively well; we may know or suspect a few edge
+  cases that are on the roadmap to be fixed. Users can use this channel with
+  the edge-case warning in mind.
+* `released`: the component works, is very well documented, and is very reliable.
+
+
+### Guidelines for what is required for a component to be considered `released`
+
+* All components:
+  * There is a README.md and a Documentation.md file, containing
+    developer-facing and user-facing documentation respectively.
+  * The user-facing description clearly and concisely states the intended
+    use of the component, with an example, e.g.
+
+    ```text
+    this git-source component is used for cloning git repositories, it supports
+    cloning both with private keys and github PATs, has features a,b,c,d.
+    Parameters supported are: ...., you can use it as such:...
+    ```
+  * The developer-facing description clearly and concisely states how the
+    component is implemented and any design decisions or liberties taken with it.
+  * The component has defaults that make sense (it can be run without
+    configuration, save for API keys).
+  * If the component accepts customization, there’s no customization combination
+    that breaks the component. There are tests that validate this.
+  * There are extensive tests for the incoming data format, for orchestrating
+    any remote system, and for any logic.
+  * If the component talks to a remote system, there is an end-to-end test that
+    validates that it works correctly.
+  * No mocking in e2e tests if we can help it.
+  * There’s an example workflow that uses it.
+  * There’s example input and output data stored with the component.
+  * If a component can be slow under some circumstances (remote API integration,
+    heavy operations), this is clearly communicated to the user.
+  * If a component relies on a remote API, there’s backoff-retry functionality.
+  * The component produces logs at an appropriate level documenting its actions.
+  * The component publishes metrics on how much data it processed, what it did
+    with it, etc.
+  * The component has automated integration tests that ensure it runs end to
+    end with the tool it orchestrates.
+
+* Scanners:
+  * Supports at least one specific use case for a product or integrates with a
+    single API.
+  * If it needs tool output in a special format (e.g. JSON), then that format is
+    not customizable.
+  * There are no machine-intended fields squashed into a human-intended API
+    field (e.g. an issue description that contains JSON).
+  * Provides metrics that make sense via the metrics channel
+    (number of issues processed, number of issues dropped due to bad formatting,
+    number of issues of each priority).
+  * If a scanner needs to output an extra summary result (e.g. scorecard
+    results against a repository), it is clearly marked as such.
+
+* Enrichers:
+  * There are tests that measure the accuracy of the result (generate static
+    sample data, feed it to the enricher, ensure result accuracy is acceptable).
+
+* Reporters:
+  * Unless otherwise specified, reporters are heavily opinionated; no reporter
+    is just data dumping, e.g. the Jira issue description should never contain
+    arbitrary JSON.
+  * A reporter ONLY represents the Smithy output to the remote target; it does
+    not make decisions.
+  * It supports at least one use case of the remote component (e.g.
+    it can open a vuln ticket on DefectDojo, can upload a BOM on dt, can open a
+    ticket on Jira).
+  * There’s no output combination that looks ugly/broken.
+  * The default output is well presented.
+  * If the reporter output is customizable (e.g. via go-templates), the
+    default is well presented.
+
+### Next steps & automation ideas
+
+* Files present: we can have a linter rule or script to check that
+  all the required files for a component exist.
+* Description present: smithyctl could have a command to load/audit a component
+  and ensure all the necessary fields exist.
+* We can have a PR checklist for component PRs via GitHub config.
+* We can leverage e2e tests to ensure components do what they need to with
+  configuration.
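
Reviewer sketch for the "Files present" idea: a minimal script, not part of the patch, that checks each component for the two files the `released` guidelines name (README.md and Documentation.md). The one-directory-per-component layout and the function names `missing_files` and `audit_components` are assumptions made for illustration; the RFC does not specify either.

```python
from pathlib import Path

# File names taken from the `released` guidelines in the RFC.
REQUIRED_FILES = ("README.md", "Documentation.md")


def missing_files(component_dir: Path, required=REQUIRED_FILES) -> list[str]:
    """Return the required file names that are absent from one component directory."""
    return [name for name in required if not (component_dir / name).is_file()]


def audit_components(root: Path) -> dict[str, list[str]]:
    """Map each component directory under `root` to its missing files.

    Assumes one sub-directory per component (hypothetical layout).
    Components with nothing missing are left out of the result.
    """
    report = {}
    for component_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        missing = missing_files(component_dir)
        if missing:
            report[component_dir.name] = missing
    return report
```

A CI job could call `audit_components` on the components directory and fail the build when the returned mapping is non-empty; the same check could equally live behind a smithyctl subcommand if that turns out to be the preferred home.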