component release channels RFC

smithy-security · Oct 28, 2024 · 810d32a
1 parent 71536bd
commit 810d32a
Showing 1 changed file with 78 additions and 0 deletions.
diff --git a/docs/dep/dep-006.md b/docs/dep/dep-006.md
@@ -0,0 +1,78 @@
+# Component Release Channels and Component Maturity Levels
+
+## Introduction
+
+The new SDK makes it trivial to add a component, but with great power come great component-writing requirements.
+The more components we have, the harder it is to check that they are of high quality and user-friendly.
+This document attempts to lay a foundation of requirements: the attributes a component should have in order to be considered mature enough for wider use.
+It also describes component release channels and the graduation criteria for each channel.
+
+## The problem
+
+We have currently written ~50 components, all of which are useful in one way or another.
+However, this organic and largely unplanned growth of the component base has incurred a cost in quality.
+Many of these components do *something*, but they lack documentation, are often unstable or, worse, can be run only with specific arguments and do not integrate well with a target system.
+Further, many of the enrichers pass their unit tests but have not been validated against real-world usage or, in some cases, for accuracy at all (e.g. reachability).
+Furthermore, both scanners and reporters tend to use fields meant for humans (e.g. the issue Description field) as data dumps.
+
+## The solution
+
+The usual way to solve this is with:
+* extensive testing
+* release channels
+* strict graduation criteria
+* proper communication of component maturity to users
+
+### Release channels
+
+The following are suggested:
+
+* `experimental` : the component has no guarantees and is heavily work in progress. This channel is aimed at internal use.
+* `alpha` : the component works and is usable but unreliable; it may fail or produce unpredictable results from time to time. This channel is for cutting-edge users who don't mind the unreliability but need a quick solution.
+* `beta` : the component works relatively well; we may know of or suspect a few edge cases that are on the roadmap to be fixed. Users can use this channel given the edge-case warning.
+* `released` : the component works, is very well documented, and is very reliable.
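The channel list above implies an ordering from least to most mature. As an illustrative sketch rather than part of the RFC, that ordering could be made machine-checkable roughly as follows. The names `ReleaseChannel`, `parse_channel` and `is_allowed` are hypothetical, and Python is used only for illustration; nothing here reflects the actual SDK.

```python
from enum import IntEnum


class ReleaseChannel(IntEnum):
    """Channels ordered from least to most mature (ordering assumed from the list above)."""

    EXPERIMENTAL = 0
    ALPHA = 1
    BETA = 2
    RELEASED = 3


def parse_channel(name: str) -> ReleaseChannel:
    """Map a channel name such as 'beta' to its enum member."""
    try:
        return ReleaseChannel[name.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown release channel: {name!r}") from None


def is_allowed(component_channel: str, minimum: str) -> bool:
    """Return True if a component's channel is at least as mature as `minimum`."""
    return parse_channel(component_channel) >= parse_channel(minimum)
```

A workflow that only accepts `beta` or better would then call `is_allowed(channel, "beta")` for each component it loads.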
+
+
+### Guidelines for what is required for a component to be considered `released`
+
+* All components:
+  * There are README.md and Documentation.md files that hold developer-facing and user-facing documentation respectively
+  * The user-facing description clearly and concisely states the intended use of the component, with an example, e.g. `this git-source component is used for cloning git repositories, it supports cloning both with private keys and github PATs, has features a,b,c,d. Parameters supported are: ...., you can use it as such:...`
+  * The developer-facing description clearly and concisely states how the component is implemented and any design decisions or liberties taken with it.
+  * The component has defaults that make sense (can be run without configuration, save for API keys)
+  * If the component accepts customization, there’s no customization combination that breaks the component. There are tests that validate this.
+  * There are extensive tests for the incoming data format, for orchestrating any remote system, and for any logic.
+  * If the component talks to a remote system there is an end to end test that validates that it works correctly
+  * No mocking in e2e tests if we can help it
+  * There’s an example workflow that uses it
+  * There’s example input and output data stored with the component.
+  * If a component can be slow under some circumstances (remote API integration, heavy operations), this is clearly communicated to the user
+  * If a component relies on a remote API, there’s backoff-retry functionality
+  * The component produces logs of appropriate level documenting its actions
+  * The component publishes metrics of how much data it processed, what it did with it etc.
+  * The component has automated integration tests that ensure it runs end to end, with the tool it orchestrates.
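The backoff-retry requirement above could look roughly like the sketch below. It is an assumption-laden illustration, not the SDK's implementation: `retry_with_backoff` and all of its parameters are hypothetical, and it uses exponential backoff with full jitter as one common choice among several.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    *,
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` until it succeeds or `attempts` tries have been made."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the ceiling.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps tests fast, since a test can record the requested delays instead of actually waiting.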
+
+* scanners:
+    * Supports at least one specific use case for a product or integrates with a single api
+    * If it needs tool output in a special format (e.g. JSON) then that is not customizable.
+    * There are no machine-intended fields squashed into a human-intended API field (e.g. the issue description contains JSON)
+    * Provides metrics that make sense via the metrics channel (num of issues processed, num of issues dropped due to bad formatting, num of issues of each priority)
+    * If a scanner needs to output an extra summary result (e.g. scorecard results against a repository), it is clearly marked as such
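The scanner metrics listed above (issues processed, issues dropped due to bad formatting, issues per priority) can be illustrated with a small counter. This is only a sketch under stated assumptions: the issue shape (a dict with `title` and `priority`), the priority names, and the metric names are all hypothetical, and how metrics reach the metrics channel is left out.

```python
from collections import Counter
from typing import Any, Iterable

# Hypothetical priority vocabulary; the real one is not specified in the RFC.
PRIORITIES = ("low", "medium", "high", "critical")


def summarize_scan(raw_issues: Iterable[Any]) -> Counter:
    """Count processed, dropped and per-priority issues for one scanner run."""
    metrics: Counter = Counter()
    for issue in raw_issues:
        metrics["issues_processed"] += 1
        well_formed = (
            isinstance(issue, dict)
            and isinstance(issue.get("title"), str)
            and issue.get("priority") in PRIORITIES
        )
        if not well_formed:
            # Badly formatted issues are counted rather than silently skipped.
            metrics["issues_dropped_bad_format"] += 1
            continue
        metrics[f"issues_priority_{issue['priority']}"] += 1
    return metrics
```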
+
+* enrichers:
+    * There are tests that measure the accuracy of the result (generate static sample data, feed it to the enricher, ensure the accuracy of the results is acceptable)
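The enricher accuracy check described above (static samples in, measured accuracy out) might be sketched as follows. The enricher here is a toy stand-in for a reachability enricher, and `accuracy`, `toy_reachability_enricher`, the sample data and any threshold are assumptions made for illustration only.

```python
from typing import Any, Callable, Iterable, Tuple


def accuracy(
    enrich: Callable[[Any], Any],
    samples: Iterable[Tuple[Any, Any]],
) -> float:
    """Return the fraction of (input, expected) samples the enricher gets right."""
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    correct = sum(1 for finding, expected in samples if enrich(finding) == expected)
    return correct / len(samples)


# Toy enricher: a finding is "reachable" if its function is in a fixed call set.
CALLED_FUNCTIONS = frozenset({"parse", "render"})


def toy_reachability_enricher(finding: dict) -> bool:
    return finding["function"] in CALLED_FUNCTIONS


# Static, hand-labelled sample data stored next to the component (hypothetical).
SAMPLES = [
    ({"function": "parse"}, True),
    ({"function": "render"}, True),
    ({"function": "unused_helper"}, False),
    ({"function": "legacy_export"}, True),  # the toy enricher misses this one
]
```

A test would then assert something like `accuracy(toy_reachability_enricher, SAMPLES) >= 0.7`, with the acceptable threshold chosen per enricher.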
+
+* reporters:
+    * Unless otherwise specified, reporters are heavily opinionated; no reporter just dumps data, e.g. the Jira issue description should never contain arbitrary JSON.
+    * A reporter ONLY represents the Smithy output to the remote target, it does not make decisions.
+    * It supports at least one use case of the remote component (e.g. it can open a vuln ticket on DefectDojo, upload a BOM to dt, or open a ticket on Jira)
+    * There’s no output combination that looks ugly/broken
+    * The default output is well-presented
+    * If the reporter output is customizable (e.g. via go-templates) the default is well presented
+
+### Next steps & automation ideas
+
+* Files present: We can have a linter rule or script that checks that all the required files for a component exist
+* Description present: smithyctl could have a command to load/audit a component and ensure all the necessary fields exist
+* We can have a PR checklist for component PRs via GitHub config
+* We can leverage e2e tests to ensure components do what they need to with their configuration
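The "Files present" idea above could start as a script along these lines. It is a sketch, not an existing tool: it assumes each component lives in its own directory directly under one root, and it takes the required file names from the README.md and Documentation.md guideline; `missing_files` and `audit_components` are hypothetical names.

```python
from __future__ import annotations

from pathlib import Path

# Required files, taken from the documentation guideline for released components.
REQUIRED_FILES = ("README.md", "Documentation.md")


def missing_files(component_dir: Path, required: tuple[str, ...] = REQUIRED_FILES) -> list[str]:
    """Return the required file names that are absent from one component directory."""
    return [name for name in required if not (component_dir / name).is_file()]


def audit_components(root: Path) -> dict[str, list[str]]:
    """Map each component directory under `root` to its missing files, if any."""
    report: dict[str, list[str]] = {}
    for component_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        missing = missing_files(component_dir)
        if missing:
            report[component_dir.name] = missing
    return report
```

An empty report means every component directory has the required files; a CI job could fail the build whenever the report is non-empty.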