Pipeline-cntr-guide #961
Merged
doc/Integrated-Circuit_pipeline_ggregated_counters_guide.md
# Integrated Circuit aggregated pipeline counters guide
## Introduction
This guide discusses the semantics of the counters provided under the
`openconfig-platform/components/component/integrated-circuit/pipeline-counters` container.
The term "Integrated Circuit" (I-C) is used in this document as an abstract term for an ASIC or NPU (or a combination of both) that provides packet processing capabilities.

## Per-block packets/octets counters
[TODO] more detailed description
## Drop packets/octets counters
The drop container collects counters for packets dropped for various reasons and at various places inside the Integrated Circuit.

### Aggregated drop counters
These four counters should cover all packets dropped inside the I-C, with one exception: packets dropped due to QoS queue tail-drop or AQM (RED/WRED). The aggregated drop counters are modeled as below:

```
module: openconfig-platform
  +--rw components
     +--rw component* [name]
        +--rw integrated-circuit
           +--ro oc-ppc:pipeline-counters
              +--ro oc-ppc:drop
                 +--ro oc-ppc:state
                    +--ro oc-ppc:adverse-aggregate?             oc-yang:counter64
                    +--ro oc-ppc:congestion-aggregate?          oc-yang:counter64
                    +--ro oc-ppc:packet-processing-aggregate?   oc-yang:counter64
                    +--ro oc-ppc:urpf-aggregate?                oc-yang:counter64
```
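
As a rough illustration of how a collector might address these four leaves, the sketch below builds one path string per counter. The leaf names come from the tree above. The helper name, the gNMI-style `[name=...]` key syntax, and the component name `NPU0` are assumptions made for the example, not something this guide defines.

```python
# Leaf names are taken from the tree above; everything else here is a
# hypothetical helper written for illustration only.
AGGREGATE_DROP_LEAVES = (
    "adverse-aggregate",
    "congestion-aggregate",
    "packet-processing-aggregate",
    "urpf-aggregate",
)


def aggregate_drop_paths(component_name: str) -> list[str]:
    """Return one path string per aggregated drop counter of a component."""
    base = (
        f"/components/component[name={component_name}]"
        "/integrated-circuit/pipeline-counters/drop/state"
    )
    return [f"{base}/{leaf}" for leaf in AGGREGATE_DROP_LEAVES]


# "NPU0" is an assumed component name; real names are platform specific.
paths = aggregate_drop_paths("NPU0")
for path in paths:
    print(path)
```
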
#### urpf-aggregate
> From OpenConfig definition:\
> This aggregation of counters represents the conditions in which packets are dropped due to failing uRPF lookup check. This counter and the packet-processing-aggregate counter should be incremented for each uRPF packet drop.

This counter counts packets discarded as a result of Unicast Reverse Path Forwarding verification ([RFC2827](https://datatracker.ietf.org/doc/html/rfc2827), [RFC3704](https://datatracker.ietf.org/doc/html/rfc3704)).

##### Usability
Increments of this counter are typically a signal of some form of attack using spoofed source addresses, typically of the DDoS class.

#### packet-processing-aggregate
> From OpenConfig definition:\
> This aggregation of counters represents the conditions in which packets are dropped due to legitimate forwarding decisions (ACL drops, No Route etc.)

This counter counts packets discarded as a result of processing a **non-corrupted packet** against the **legitimate, non-corrupted** state of the I-C program (FIB content, ACL content, rate-limiting token buckets) that mandates a packet drop. Examples of this class of discard are:
- dropping packets whose destination address does not match any FIB entry
- dropping packets whose destination address matches a FIB entry pointing to a discard next-hop (e.g. a route to null0)
- dropping packets due to an ACL/packet filter decision
- dropping packets because their TTL = 1
- dropping packets because their size exceeds the egress interface MTU and the packet can't be fragmented (IPv6, or the Don't-Fragment bit is set)
- etc.

Note: From the I-C perspective, it is doing exactly what it is told (programmed) to do, and the packet is parsable.

##### Usability
Increments of this counter are expected during convergence events as well as during stable operation. However, a rapid increase in the drop rate **may** be a signal that the network is unhealthy and typically requires further investigation.
A further breakdown of this counter, if available as a vendor extension under the `/openconfig-platform:components/component/integrated-circuit/openconfig-platform-pipeline-counters:pipeline-counters/drop/vendor` container, could help to narrow down the cause of drops.
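
One possible way to turn two samples of this counter into a drop rate is sketched below. The function names and the factor-of-ten threshold are illustrative assumptions; this guide does not define what counts as a "rapid" increase, so any real threshold has to come from the operator's own baseline.

```python
COUNTER64_MODULUS = 2**64


def counter_delta(previous: int, current: int) -> int:
    """Increase between two counter64 samples, allowing for a single wrap.

    A counter reset (e.g. after a restart) is indistinguishable from a wrap
    here and would need separate handling in a real collector.
    """
    return (current - previous) % COUNTER64_MODULUS


def drop_rate(previous: int, current: int, interval_seconds: float) -> float:
    """Average drops per second between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter_delta(previous, current) / interval_seconds


def is_rapid_increase(rate: float, baseline_rate: float, factor: float = 10.0) -> bool:
    """Flag a rate well above the baseline; the factor is an arbitrary example."""
    return rate > baseline_rate * factor


# Two hypothetical samples taken 30 seconds apart.
rate = drop_rate(previous=1000, current=4000, interval_seconds=30.0)
print(rate, is_rapid_increase(rate, baseline_rate=5.0))
```
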

If prolonged packet drops are found to be caused by a missing FIB entry for incoming packets, this suggests an inconsistency between the network control plane protocols (BGP, IGP, RSVP, gRIBI), the FIB calculated by the Controller Card, and the FIB programmed into the given Integrated Circuit.

If the implementation supports the `urpf-aggregate` counter, packets discarded due to uRPF should not be counted as `packet-processing-aggregate`. Otherwise, uRPF-discarded packets should be counted against this counter.

#### congestion-aggregate
> From OpenConfig definition:\
> This tracks the aggregation of all counters where the expected conditions of packet drops due to internal congestion in some block of the hardware that **may not be visible** in through other congestion indicators like interface discards or **queue drop counters**.

This counter counts packets discarded as a result of exceeding the performance limits of the Integrated Circuit while it processes non-corrupted packets against the legitimate, non-corrupted programming state of the I-C.

A typical example is overloading a given I-C with a higher packet rate (pps) than the chip can handle. For example, assume chip X can process 3.6 Tbps of incoming traffic and 2000 Mpps. If the average incoming packet size is 150 B, the full ingress rate corresponds to 3000 Mpps. Hence 1/3 of the packets would be dropped and should be counted against `congestion-aggregate`.

Another example is the case where some I-C data bus is too narrow or too slow to handle the traffic. For example, assume chip X needs to send 3 Tbps of its ingress traffic to an external buffer memory that has only 2 Tbps of access I/O. In this case packets would be discarded because of internal congestion of the memory I/O bus. Note that these packets are discarded even if the queues are only lightly used; hence these are NOT QoS queue tail-drops or WRED drops.

Yet another example is the case where an extremely large and long ACL/filter requires more cycles to process than the NPU is budgeted for.
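
The arithmetic behind the first two examples above can be checked with a short sketch. The function names are hypothetical, and the calculation ignores per-packet layer 1 overhead (preamble, inter-frame gap), as the examples in the text do.

```python
def offered_packet_rate(bits_per_second: float, avg_packet_bytes: float) -> float:
    """Packets per second for a given bit rate and average packet size."""
    return bits_per_second / (avg_packet_bytes * 8)


def dropped_fraction(offered: float, capacity: float) -> float:
    """Fraction of the offered load that exceeds capacity (0.0 if none)."""
    if offered <= capacity:
        return 0.0
    return (offered - capacity) / offered


# Example 1: 3.6 Tbps of 150-byte packets against a 2000 Mpps budget.
pps = offered_packet_rate(3.6e12, 150)  # 3.0e9 pps, i.e. 3000 Mpps
pps_loss = dropped_fraction(pps, 2000e6)

# Example 2: 3 Tbps sent to a buffer memory with only 2 Tbps of access I/O.
memory_loss = dropped_fraction(3e12, 2e12)

print(f"{pps / 1e6:.0f} Mpps offered, {pps_loss:.3f} dropped")
print(f"memory I/O: {memory_loss:.3f} dropped")
```

In both cases one third of the offered load exceeds what the chip can handle, which is the share that would be counted against `congestion-aggregate`.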
##### Usability
Increments of this counter are a signal that the given Integrated Circuit is overwhelmed by incoming traffic and by the complexity of the packet processing required.

#### adverse-aggregate
> From OpenConfig definition:\
> This captures the aggregation of all counters where the switch is **unexpectedly** dropping packets. Occurrence of these drops on a stable (no recent hardware or config changes) and otherwise healthy switch needs further investigation.

This counter counts packets discarded as a result of **corrupted** programming state in the I-C or **corrupted** data structures of packet descriptors.

Note: corrupted packets received on an ingress interface should be counted separately in `/interfaces/interface/state/counters/in-errors` and NOT counted as `adverse-aggregate`. This is because incoming corrupted packets are NOT a signal of an adverse state of the given I-C (but rather of the upstream system). Therefore it is better not to count such drops as `adverse-aggregate`, to keep it a clean signal of I-C adverse state.
##### Usability
Increments of this counter are generally a signal of some hardware defect (e.g. memory errors or signal integrity issues) or of (micro)code software defects.

#### Queue tail and AQM drops exception discussion
Drops associated with QoS queue tail or AQM are the result of egress interface congestion. This is NOT the same as I-C congestion, and should be considered a normal, expected state from the platform (router) point of view. It may not be an expected state from the network design point of view, but that perspective is not something an individual network device is aware of.
The OpenConfig definition for `congestion-aggregate` clearly excludes "queue drop counters". It also makes perfect sense not to count QoS queue drops under `congestion-aggregate`, in order to keep it a clear signal of hitting I-C performance limitations rather than blending it with basic egress interface speed limitations.
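
The accounting rules discussed in this section can be summarized as a small classifier. This is only a sketch: the `DropReason` names and the `supports_urpf_aggregate` flag are hypothetical, chosen to mirror the cases described in this guide, and the uRPF branch follows the rule stated in the `packet-processing-aggregate` section of this guide.

```python
from enum import Enum, auto
from typing import Optional


class DropReason(Enum):
    """Hypothetical drop reasons named after the cases in this guide."""
    NO_FIB_ENTRY = auto()
    DISCARD_NEXT_HOP = auto()
    ACL_DENY = auto()
    TTL_EXPIRED = auto()
    MTU_EXCEEDED = auto()
    URPF_FAILURE = auto()
    PPS_OVERLOAD = auto()
    INTERNAL_BUS_CONGESTION = auto()
    CYCLE_BUDGET_EXCEEDED = auto()
    CORRUPTED_IC_STATE = auto()
    QUEUE_TAIL_DROP = auto()
    AQM_DROP = auto()
    CORRUPTED_INGRESS_PACKET = auto()


_PACKET_PROCESSING = {
    DropReason.NO_FIB_ENTRY,
    DropReason.DISCARD_NEXT_HOP,
    DropReason.ACL_DENY,
    DropReason.TTL_EXPIRED,
    DropReason.MTU_EXCEEDED,
}
_CONGESTION = {
    DropReason.PPS_OVERLOAD,
    DropReason.INTERNAL_BUS_CONGESTION,
    DropReason.CYCLE_BUDGET_EXCEEDED,
}


def aggregate_counter_for(
    reason: DropReason, supports_urpf_aggregate: bool = True
) -> Optional[str]:
    """Return the aggregate drop leaf for a reason, or None if excluded."""
    if reason is DropReason.URPF_FAILURE:
        if supports_urpf_aggregate:
            return "urpf-aggregate"
        return "packet-processing-aggregate"
    if reason in _PACKET_PROCESSING:
        return "packet-processing-aggregate"
    if reason in _CONGESTION:
        return "congestion-aggregate"
    if reason is DropReason.CORRUPTED_IC_STATE:
        return "adverse-aggregate"
    # Queue tail/AQM drops and corrupted ingress packets are counted
    # elsewhere (queue counters, interface in-errors), not in the aggregates.
    return None


print(aggregate_counter_for(DropReason.QUEUE_TAIL_DROP))
```
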
### Per-block drop counters
[TODO] more detailed description for standard OpenConfig drop counters defined for Interface-, Lookup-, Queueing-, Fabric- and Host-Interface- blocks. Also discuss relationship with Control plane traffic packets/octets counters.
### Vendor extensions
Please refer to [Vendor-Specific Augmentation for Pipeline Counter](vendor_counter_guide.md)
## Error counters
These counters do not count packets or bytes. They count error events per block.
[TODO] more detailed description
## Control plane traffic packets/octets counters
[TODO] more detailed description. Also discuss relationship with Host-Interface block counters.
### Standard OpenConfig counters
### Vendor extensions
There's a typo in the filename: an `a` is missing in `aggregated`.
fixed