Daily BGP DFZ stats generated by the DFZ Name and Shame bot

DFZ-Name-and-Shame/dnas_stats

DNAS Reports

This repo contains the daily reports generated by the DFZ Name and Shame ("DNAS") parser.

Overview

The daily reports generated by DNAS, which are published in this repo, contain only the worst data found within the DFZ. The reports are the Top Trumps of garbage being advertised into the global BGP DFZ. They do not cover all the unwanted/bad information advertised into the DFZ. There is a large amount of unwanted data in the DFZ, and reporting all of it wouldn't be very interesting or entertaining.

To produce a report of only the "worst" data in the DFZ, the statistics from the parsed MRT files are aggregated based on a property of the BGP UPDATE message. The aggregating property depends on the kind of statistic being reported. For example, not every bogon prefix found in the DFZ is listed in the daily report; only the bogon prefixes with the highest number of unique origin ASNs are listed. In this example, a higher number of unique origin ASNs for a bogon prefix is used as a proxy metric to infer that the prefix was more "visible" across the DFZ than another bogon prefix with fewer origin ASNs.
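The aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the DNAS implementation: the function name `top_by_unique_origins` and the `(prefix, origin_asn)` input shape are assumptions, and the sample ASNs come from the documentation range.

```python
from collections import defaultdict


def top_by_unique_origins(updates):
    """Group (prefix, origin_asn) pairs by prefix and keep only the prefixes
    that share the highest number of unique origin ASNs.

    Hypothetical helper for illustration; the real parser works on MRT data.
    """
    origins = defaultdict(set)
    for prefix, origin_asn in updates:
        origins[prefix].add(origin_asn)
    if not origins:
        return 0, {}
    worst = max(len(asns) for asns in origins.values())
    return worst, {p: a for p, a in origins.items() if len(a) == worst}


# Illustrative input: two bogon prefixes, one seen from two different origins.
sample_updates = [
    ("10.0.0.0/8", 64496),
    ("10.0.0.0/8", 64497),
    ("10.0.0.0/8", 64496),  # duplicate origin, counted once
    ("192.168.0.0/16", 64498),
]
worst_count, worst_prefixes = top_by_unique_origins(sample_updates)
```

Only `10.0.0.0/8` survives here, because it has two unique origins while `192.168.0.0/16` has one.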

The reports currently exist in a plain-text, human-readable format only. Additional formats, such as machine-parseable JSON, are coming soon.

Data

Reporting Period

The DNAS parser is running continuously, parsing MRT archives of BGP UPDATES from various sources. The same stats (which are explained below) are extracted from each MRT file from each source.

Note: Each MRT source is one perspective of the DFZ. It's important to note that there is no single DFZ, only different perspectives of the same phenomenon.

Each day a report is generated for the previous day. The stats from each source are compared and the worst stats from each source are merged into a list of "the worst of the worst" across all sources. This means that in the report, the prefix with the most BOGON source ASNs may have been seen by $source1, and the prefix with the highest number of updates may have been seen from $source2.
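As a rough sketch of that merge step, assume each source has already been reduced to its own worst value plus the prefixes that reached it. The function name `merge_worst` and the data layout are hypothetical, chosen only to illustrate the idea.

```python
def merge_worst(per_source):
    """Merge per-source results into the overall worst.

    per_source maps a source name to (worst_value, set_of_prefixes).
    Returns (overall_worst, {prefix: set_of_sources_that_saw_it}).
    Hypothetical structure; the actual DNAS data model may differ.
    """
    if not per_source:
        return None, {}
    overall = max(value for value, _ in per_source.values())
    merged = {}
    for source, (value, prefixes) in per_source.items():
        if value != overall:
            continue  # this source's worst is not the worst overall
        for prefix in prefixes:
            merged.setdefault(prefix, set()).add(source)
    return overall, merged


# Illustrative: longest AS path length seen by three collectors.
longest_path = {
    "rrc00": (255, {"203.0.113.0/24"}),
    "rrc01": (255, {"198.51.100.0/24"}),
    "rrc23": (120, {"192.0.2.0/24"}),
}
overall, merged = merge_worst(longest_path)
```

The merged list contains prefixes from two different sources, which is why the report notes that listed prefixes "may have been seen from different MRT sources".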

Data Sources

The original goal was to gather data from all the Tier 1 networks and the largest Tier 2 networks (don't get me started, I think this terminology is ridiculous!). The table below lists which of these Tier 1/2 networks are contributing data to the DNAS reports. Note that many other networks contribute data to the same route collectors, meaning data is being collected from many more networks than those listed below.

ASN | Name | Tier | Collector(s), or why not collected
174 | Cogent | 1 | RRC25
701 | Verizon/UUNET | 1 | Doesn't peer with RIS. Peers with route-views.routeviews.org but v6 only and this collector doesn't export MRTs. Peers with route-views6.routeviews.org but v6 only.
1239 | T-Mobile/Sprint | 1 | RRC01, RRC12
1273 | Vodafone | 2 | RRC03
1299 | Arelion | 1 | RRC01, RRC25
2914 | NTT | 1 | RRC01, RRC12, RRC14
3257 | GTT | 1 | RRC01, RRC12, RRC23
3320 | DTAG | 2 | RRC01, RRC03
3356 | Colt | 1 | Not peering with RIS or RV
3491 | PCCW Global | 1 | RRC01, RRC23
4134 | ChinaNet Backbone | 2 | RRC01
4367 | Telstra | 2 | Not peering with RIS or RV
4809 | China Telecom | 2 | Not peering with RIS or RV
5511 | FT/Orange | 1 | RRC01
6453 | TATA | 1 | RRC03
6461 | Zayo | 1 | RRC01, RRC03, RRC12, RRC20, RRC21
6762 | TI Sparkle/Seabone | 1 | RRC12, RRC14
6830 | Liberty Global | 2 | RRC01, RRC03
6939 | Hurricane Electric | 1* | RRC01, RRC03, RRC05, RRC07, RRC10, RRC11, RRC12, RRC13, RRC14, RRC16
7018 | AT&T | 1 | RRC00
7922 | Comcast | 2 | Not peering with RIS or RV
9002 | RETN | 2 | RRC01, RRC07, RRC11, RRC12, RRC23
12956 | Telxius/Telefonica | 1 | RRC03

Note: Hurricane Electric purchases transit for IPv4 but not for IPv6.

The collectors configured in the DNAS parser are shown in the table below. They are drawn from the collectors in the list above, plus a few additional collectors that provide wider coverage of the DFZ, because the collectors required to cover the Tier 1/2 networks in the table above are all located in Europe. Sadly, I can't simply add all RIS and RV collectors because I don't have the compute power.

Collector | Name/Location | Notes | Example URL
RRC00 | RIPE-NCC Multihop, Amsterdam, Netherlands | Part of Tier 1 coverage | https://data.ris.ripe.net/rrc00/2024.07/
RRC01 | LINX / LONAP, London, United Kingdom | Part of Tier 1 coverage | https://data.ris.ripe.net/rrc01/2024.07/
RRC03 | AMS-IX / NL-IX, Amsterdam, Netherlands | Part of Tier 1 coverage | https://data.ris.ripe.net/rrc03/2024.07/
RRC12 | DE-CIX, Frankfurt, Germany | Part of Tier 1 coverage | https://data.ris.ripe.net/rrc12/2024.07/
RRC25 | RIPE-NCC Multihop, Amsterdam, Netherlands | Part of Tier 1 coverage | https://data.ris.ripe.net/rrc25/2024.07/
RRC15 | PTTMetro, Sao Paulo, Brazil | Part of wider visibility | https://data.ris.ripe.net/rrc15/2024.07/
RRC19 | NAP Africa JB, Johannesburg, South Africa | Part of wider visibility | https://data.ris.ripe.net/rrc19/2024.07/
RRC23 | Equinix SG, Singapore, Singapore | Part of wider visibility | https://data.ris.ripe.net/rrc23/2024.07/
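
The example URLs above follow a per-collector, per-month directory pattern. A small sketch of building such a URL is shown below. The helper name `ris_month_url` is hypothetical, and the pattern is inferred only from the example URLs in the table, not from any documented API.

```python
from datetime import date


def ris_month_url(collector: str, day: date) -> str:
    """Build the monthly directory URL for a RIS collector.

    Assumes the 'https://data.ris.ripe.net/<collector>/<YYYY.MM>/' layout
    seen in the example URLs; hypothetical helper, for illustration only.
    """
    return f"https://data.ris.ripe.net/{collector.lower()}/{day:%Y.%m}/"


# Collectors configured in the DNAS parser, per the table above.
COLLECTORS = ["RRC00", "RRC01", "RRC03", "RRC12", "RRC25", "RRC15", "RRC19", "RRC23"]
urls = [ris_month_url(c, date(2024, 7, 15)) for c in COLLECTORS]
```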

Report Definitions

Only the worst stats are collected and shown in the report. The definition of "worst" varies by the type of statistic being reported. For example, a prefix originated by more origin ASNs is more "visible" in the DFZ than a prefix with fewer origin ASNs. Thus, a BOGON prefix announced by multiple origin ASNs is considered worse than a BOGON prefix announced by a single origin, because the more ASNs originate the prefix, the more likely it is to propagate across the DFZ. This matters because there is no single view of the DFZ: "pollution" seen in the DFZ might not have been seen by many networks, or even by any networks at all, if good filtering is in place.

As a further example, if the statistic is the highest MED seen, 20 prefixes may have been seen with a MED of 3 billion (out of a maximum of roughly 4.29 billion, because MED is a 32-bit value), but even if only 1 prefix was seen with a MED of 4 billion (leaving just a small percentage of the total metric space still usable), the 1 prefix would be shown in the report, not the 20 prefixes (because the 1 prefix leaves even less usable metric space than the 20 for onward propagation without hitting the max MED).

All of this means that there is lots of DFZ "noise" not shown in the report. Showing it all would result in a huge report, and some of it isn't very interesting or much of a problem. This is why only the worst data from the past 24 hours is reported. The report serves as a kind of warning of stuff that should probably be fixed.

Additionally, some statistics are cumulative across MRT sources while some are not, depending on whether it makes sense for each specific statistic (this is explained below).

The report is address family agnostic; statistics are independent of IP version.

Prefixes with most bogon origin ASNs per prefix

All the unique BOGON origin ASNs seen for each prefix across all MRT sources.

The prefixes listed may have been seen from different MRT sources.

Prefixes with more BOGON origin ASNs are considered worse.

Bogon prefixes with most origin ASNs per prefix

All the unique origin ASNs seen for each BOGON prefix across all MRT sources.

The prefixes listed may have been seen from different MRT sources.

BOGON prefixes with more origin ASNs are considered worse.

ASNs originating the most bogon ASNs

When the origin ASN is a BOGON ASN, walk up the AS PATH until the first non-BOGON ASN is found. This is a list of those ASNs propagating routes with a BOGON origin ASN.

The ASNs listed may have been seen from different MRT sources.

ASNs originating more prefixes with a BOGON origin ASN are considered worse.
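
The path walk described above might look something like the sketch below. The bogon ASN ranges are an assumption based on the IANA special-purpose AS number allocations; the list DNAS actually uses may differ, and both function names are hypothetical.

```python
# Assumed bogon ASN ranges (inclusive). DNAS may use a different list.
BOGON_ASN_RANGES = (
    (0, 0),                    # reserved (RFC 7607)
    (23456, 23456),            # AS_TRANS (RFC 6793)
    (64496, 64511),            # documentation (RFC 5398)
    (64512, 65534),            # private use (RFC 6996)
    (65535, 65535),            # reserved (RFC 7300)
    (65536, 65551),            # documentation (RFC 5398)
    (65552, 131071),           # reserved
    (4200000000, 4294967294),  # private use (RFC 6996)
    (4294967295, 4294967295),  # reserved (RFC 7300)
)


def is_bogon_asn(asn: int) -> bool:
    return any(low <= asn <= high for low, high in BOGON_ASN_RANGES)


def first_non_bogon_upstream(as_path):
    """as_path is a list of ASNs with the peer first and the origin last.

    If the origin is a bogon ASN, walk back towards the peer and return the
    first non-bogon ASN found; otherwise (or if none exists) return None.
    """
    if not as_path or not is_bogon_asn(as_path[-1]):
        return None
    for asn in reversed(as_path[:-1]):
        if not is_bogon_asn(asn):
            return asn
    return None


# Illustrative path: the origin and its neighbour are both private-use ASNs.
culprit = first_non_bogon_upstream([6939, 174, 65001, 64512])
```

In this illustrative path, AS174 is the first non-bogon ASN above the bogon origin, so it is the one that would be counted.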

Prefixes with the highest MED

The list of prefixes seen with the same highest MED.

The prefixes listed may have been seen from different MRT sources.

A higher MED is considered worse.

Longest AS path

A list of prefixes seen with the same highest AS Path length (they often all have a path length of 255 ASNs because this is the maximum BGPv4 supports).

The prefixes listed may have been seen from different MRT sources.

A longer AS Path is considered worse.

Longest community set

A list of prefixes all with the same highest number of communities attached. This includes standard and large communities. When the MRT source is operating at an IXP, any communities which match the IXP community prefix (e.g., 65535:*) are stripped before the communities on the UPDATE are counted. This is because the local IXP communities should not be forwarded and are expected on UPDATEs at IXPs. This means that when a prefix is shown in these reports with hundreds of communities attached which look like typical IXP communities, those communities might have been added at another IXP the prefix passed through, and were never stripped.

The prefixes listed may have been seen from different MRT sources.

A higher number of communities is considered worse.
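
A minimal sketch of the strip-then-count step is shown below. It assumes communities are available as colon-separated strings (`"asn:value"` for standard, `"asn:value1:value2"` for large) and that the IXP prefix is the first field; the function name and input format are hypothetical.

```python
def count_communities(communities, ixp_prefix=None):
    """Count communities on an UPDATE, ignoring the local IXP's own.

    communities: list of strings such as "174:990" or "6939:1:2".
    ixp_prefix: first field of the local IXP's communities (e.g. 65535),
    or None if the MRT source is not at an IXP. Illustrative only.
    """
    if ixp_prefix is not None:
        local = str(ixp_prefix)
        communities = [c for c in communities if c.split(":")[0] != local]
    return len(communities)


attached = ["65535:1", "65535:2", "174:990", "6939:1:2"]
at_ixp = count_communities(attached, ixp_prefix=65535)
not_at_ixp = count_communities(attached)
```

With the local IXP prefix stripped, only the two non-IXP communities are counted; without it, all four are.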

Abnormally large/small prefixes with most origin ASNs per prefix

A list of prefixes whose length is shorter than /8 or longer than /24 for IPv4, or shorter than /16 or longer than /56 for IPv6, with the most origin ASNs. An IPv4 /32 from 1 origin ASN is less visible than an IPv6 /64 that has 3 origin ASNs; in this case the /32 wouldn't be shown in the report.

The prefixes listed may have been seen from different MRT sources.

A higher number of origin ASNs is considered worse.
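
The length check itself can be sketched with Python's standard `ipaddress` module. The limits are taken from the definition above; the names `LENGTH_LIMITS` and `is_abnormal_length` are hypothetical.

```python
import ipaddress

# (shortest, longest) normal prefix length per IP version, per the text above.
LENGTH_LIMITS = {4: (8, 24), 6: (16, 56)}


def is_abnormal_length(prefix: str) -> bool:
    """Return True if the prefix is shorter or longer than the normal range."""
    network = ipaddress.ip_network(prefix, strict=False)
    shortest, longest = LENGTH_LIMITS[network.version]
    return network.prefixlen < shortest or network.prefixlen > longest


checks = {
    p: is_abnormal_length(p)
    for p in ("192.0.2.0/25", "198.51.100.0/24", "2001:db8::/64", "2001:db8::/32")
}
```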

Most BGP advertisements per prefix

A list of prefixes which were included in the most BGP messages (UPDATEs and WITHDRAWs). The prefixes shown were included in the same (highest) number of BGP messages.

The prefixes listed and number of messages which contained those prefixes is the highest number seen from a single MRT source. This is because it's not possible to determine if a BGP message seen by two MRT sources is the same message or not (deduplication).

A higher number of messages which contain the same prefix is considered worse.
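
Because messages cannot be deduplicated across sources, the per-source counts are compared rather than summed. A rough sketch of that choice follows, with a hypothetical function name and data layout.

```python
def highest_single_source_count(counts_by_source):
    """counts_by_source maps source -> {prefix: message_count}.

    Returns (highest_count, {prefix: source}) using the highest count seen
    from any single source, never the sum across sources. Illustrative only.
    """
    worst = 0
    winners = {}
    for source, counts in counts_by_source.items():
        for prefix, count in counts.items():
            if count > worst:
                worst, winners = count, {prefix: source}
            elif count == worst and count > 0:
                winners[prefix] = source
    return worst, winners


# Illustrative counts: the same prefix is seen by two collectors.
message_counts = {
    "rrc00": {"203.0.113.0/24": 900, "198.51.100.0/24": 40},
    "rrc01": {"203.0.113.0/24": 700},
}
worst, winners = highest_single_source_count(message_counts)
```

The reported figure here is 900 (from rrc00), not 1600, since some of the 700 messages seen by rrc01 may be the same messages rrc00 saw.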

Most BGP updates per prefix

A list of prefixes which were included in the most BGP UPDATEs.

The prefixes listed and number of UPDATEs which contained those prefixes is the highest number seen from a single MRT source. This is because it's not possible to determine if a BGP UPDATE seen by two MRT sources is the same UPDATE or not (deduplication).

A higher number of UPDATEs which contain the same prefix is considered worse.

Most BGP withdraws per prefix

A list of prefixes which were included in the most BGP WITHDRAWs.

The prefixes listed and number of WITHDRAWs which contained those prefixes is the highest number seen from a single MRT source. This is because it's not possible to determine if a BGP WITHDRAW seen by two MRT sources is the same WITHDRAW or not (deduplication).

A higher number of WITHDRAWs which contain the same prefix is considered worse.

Most BGP advertisements per origin ASN

A list of origin ASNs which were included in the most BGP messages (UPDATEs and WITHDRAWs). The origin ASNs shown were included in the same (highest) number of BGP messages.

The origin ASNs listed and number of messages which contained those ASNs is the highest number seen from a single MRT source. This is because it's not possible to determine if a BGP message seen by two MRT sources is the same message or not (deduplication).

A higher number of messages which contain the same origin ASN is considered worse.

Most BGP advertisements per peer ASN

A list of peer ASNs (1st ASN in the path) which sent the most BGP messages (UPDATEs and WITHDRAWs) to the MRT collector. The peer ASNs shown sent the same (highest) number of BGP messages.

The peer ASNs listed and number of messages sent by those peer ASNs is the highest number seen from a single MRT source. This is because it's not possible to determine if a BGP message seen by two MRT sources is the same message or not (deduplication).

A higher number of messages which contain the same peer ASN is considered worse.

Most BGP updates per peer ASN

A list of peer ASNs which sent the most BGP UPDATEs to the MRT collector.

The peer ASNs listed and number of UPDATEs which came from those peers is the highest number seen from a single MRT source. This is because it's not possible to determine if a BGP UPDATE seen by two MRT sources is the same UPDATE or not (deduplication).

A higher number of UPDATEs which came from the same peer is considered worse.

Most BGP withdraws per peer ASN

A list of peer ASNs which sent the most BGP WITHDRAWs to the MRT collector.

The peer ASNs listed and number of WITHDRAWs which came from those peers is the highest number seen from a single MRT source. This is because it's not possible to determine if a BGP WITHDRAW seen by two MRT sources is the same WITHDRAW or not (deduplication).

A higher number of WITHDRAWs which came from the same peer is considered worse.

Most origin ASNs per prefix

A list of prefixes which have the same highest number of origin ASNs.

The prefixes listed may have been seen from different MRT sources.

Prefixes with more origin ASNs are considered worse.

Most unknown attributes per prefix

A list of prefixes which have the same highest number of unknown BGP attributes attached.

The prefixes listed may have been seen from different MRT sources.

Prefixes with more unknown attributes are considered worse.

Contact

If the ASN you operate or prefixes you originate are in the report and you need help, or want to understand what this might mean, you can contact James at the following address: (jwbensley) [@] (gmail) dot com.
