generate-asinfo generates wrong information #57

Open
twiddern opened this issue Feb 14, 2017 · 3 comments

Comments

@twiddern
Contributor

I noticed this some months ago but didn't raise an issue here, and I couldn't manage to write a fix myself.

Currently generate-asinfo.py generates broken output. The information seems OK if you look at it manually, but once asinfo.txt is loaded, RIPE entries are often broken. Information that Cymru collected from other databases also sometimes seems a bit "wrong".

I'm not sure whether the Python script needs to be adjusted or whether as-stats should be adjusted to handle a "new format" from Cymru. I'm sorry that I can't give an example right now.

For whoever fixes this: I also noticed that you can't query 300k ASNs at once via netcat to Cymru, so the query should probably be split up in future.
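
One way the splitting could look is sketched below. It assumes the Team Cymru bulk whois interface (queries wrapped in `begin` / `verbose` / `end` and sent to whois.cymru.com on port 43); the helper name `make_cymru_batches`, the batch size and the output directory are hypothetical and not taken from generate-asinfo.py.

```sh
# Hypothetical helper: read one AS number per line on stdin and write
# bulk-query files of at most $1 AS numbers each into directory $2.
make_cymru_batches() {
    batch_size="$1"
    out_dir="$2"
    mkdir -p "$out_dir" || return 1
    # split(1) cuts stdin into chunk.aa, chunk.ab, ... of batch_size lines.
    split -l "$batch_size" - "$out_dir/chunk."
    for chunk in "$out_dir"/chunk.*; do
        [ -e "$chunk" ] || continue        # empty input: nothing to do
        {
            echo "begin"
            echo "verbose"
            sed 's/^/AS/' "$chunk"         # bulk mode takes "AS<number>" lines
            echo "end"
        } > "$out_dir/query.${chunk##*chunk.}"
        rm -f "$chunk"
    done
}

# Possible usage (batch size and pause are guesses, not documented limits):
#   seq 1 300000 | make_cymru_batches 10000 /tmp/cymru
#   for q in /tmp/cymru/query.*; do nc whois.cymru.com 43 < "$q"; sleep 1; done
```

Sending each query file over its own connection keeps every request small, at the cost of more connections.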

@crazzy

crazzy commented Feb 15, 2017

I have spent some time writing a new script that follows the same format as the existing bundled file. My script is far from perfect, but it works. Some points that still need fixing are:

  • Validation of the resulting file
  • Way too many external dependencies for my taste (whois, grep, awk, sed and so on)
  • More research into potential edge cases

But what my script does do is go directly to the source: it checks at IANA where everything is assigned, then queries the correct whois server for the data. Additionally, it downloads a file from the RIPE FTP server to do the ASN-to-country mapping for RIPE. Also, my script doesn't look up data for every single possible ASN, only for the ASNs we've actually seen traffic from. So maybe this is something to refine and possibly include in as-stats under contrib?

#!/usr/bin/zsh

zmodload -m -F zsh/files b:zf_\*

rrd_path=/root/AS-Stats/rrd
asinfo_path=/tmp/asinfo.txt
asn16_tmp=""
asn32_tmp=""
asn_mapper_tmp=""
ripe_as_cc_tmp=""
curl=/usr/bin/curl
whois=/usr/bin/whois
sed=/bin/sed
grep=/bin/grep
awk=/usr/bin/awk
iana_as_alloc_16="http://www.iana.org/assignments/as-numbers/as-numbers-1.csv"
iana_as_alloc_32="http://www.iana.org/assignments/as-numbers/as-numbers-2.csv"
ripe_as_to_country="ftp://ftp.ripe.net/pub/stats/ripencc/delegated-ripencc-latest"

get_as_data() {
	case $asnum in # There are a number of special purpose AS numbers that we should handle manually
		0)
			return # Reserved RFC 7607
		;;
		112)
			echo -e "112\tROOTSERV\tDNS-OARC,US\tUS" # RFC 7534
		;;
		23456)
			echo -e "23456\tIANA-ASTRANS\tIANA,US\tUS" # RFC 6793
		;;
		<64496-64511>)
			return # Reserved for documentation RFC 5398
		;;
		<64512-65534>)
			echo -e "$asnum\tPRIVATE-AS-$asnum\tPRIVATE,US\tUS" # RFC 6996, could be valid for the purpose of AS-Stats
		;;
		65535)
			return # Reserved last AS RFC 7300
		;;
		<65536-65551>)
			return # Reserved for documentation RFC 5398
		;;
		<4200000000-4294967294>)
			echo -e "$asnum\tPRIVATE-AS-$asnum\tPRIVATE,US\tUS" # RFC 6996, could be valid for the purpose of AS-Stats
		;;
		4294967295)
			return # Reserved last AS RFC 7300
		;;
		*)
			query_for_as $asnum
		;;
	esac
}

handler_arin() {
	asnum="$1"
	whois_server="$2"
	rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
	asname=$(echo "$rawdata" | $grep -m1 '^ASName:' | $awk '{print $NF;}')
	orgname=$(echo "$rawdata" | $grep -m1 '^OrgName:' | $awk -F : '{print $NF;}' | $sed 's,^[[:space:]]*,,')
	country=$(echo "$rawdata" | $grep -m1 '^Country:' | $awk '{print $NF;}')
	if [ "$asname" = "" ]; then
		return
	fi
	echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

handler_ripe() {
	asnum="$1"
	whois_server="$2"
	rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
	asname=$(echo "$rawdata" | $grep -m1 '^as-name:' | $awk '{print $NF;}')
	orgname=$(echo "$rawdata" | $grep -m1 '^org-name:' | $awk -F : '{print $NF;}' | $sed 's,^[[:space:]]*,,')
	# delegated-ripencc-latest fields: registry|cc|type|start|value|date|status
	country=$($grep -m1 "^ripencc|[A-Z]*|asn|$asnum|" $ripe_as_cc_tmp | $awk -F '|' '{print $2}')
	if [ "$asname" = "" ]; then
		return
	fi
	echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

handler_apnic() {
	asnum="$1"
	whois_server="$2"
	rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
	echo "$rawdata" | $grep -q '^remarks:.*whois.nic.ad.jp'
	if [ $? -ne 0 ]; then # general apnic
		asname=$(echo "$rawdata" | $grep -m1 '^as-name:' | $awk '{print $NF;}')
		orgname=$(echo "$rawdata" | $grep -A1 -m1 '^as-name:' | $grep '^descr:' | $awk -F : '{print $NF;}' | $sed 's,^[[:space:]]*,,')
		country=$(echo "$rawdata" | $grep -m1 '^country:' | $awk '{print $NF;}')
	else #JPNIC
		rawdata=$($whois -h whois.nic.ad.jp AS$asnum/e 2>/dev/null)
		asname=$(echo "$rawdata" | $grep -m1 '^b\.' | $awk '{print $NF;}')
		orgname=$(echo "$rawdata" | $grep -m1 '^g\.' | $awk -F ']' '{print $NF;}' | $sed 's,^[[:space:]]*,,')
		country="JP"
	fi
	if [ "$asname" = "" ]; then
		return
	fi
	echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

handler_lacnic() {
	asnum="$1"
	whois_server="$2"
	rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
	asname="UNSPECIFIED"
	orgname=$(echo "$rawdata" | $grep -m1 '^owner:' | $awk -F : '{print $NF;}' | $sed 's,^[[:space:]]*,,')
	country=$(echo "$rawdata" | $grep -m1 '^country:' | $awk '{print $NF;}')
	if [ "$orgname" = "" ]; then # asname is a fixed placeholder here, so test the looked-up owner instead
		return
	fi
	echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

handler_afrinic() {
	asnum="$1"
	whois_server="$2"
	rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
	asname=$(echo "$rawdata" | $grep -m1 '^as-name:' | $awk '{print $NF;}')
	orgname=$(echo "$rawdata" | $grep -m1 '^org-name:' | $awk -F : '{print $NF;}' | $sed 's,^[[:space:]]*,,')
	country=$(echo "$rawdata" | $grep -m1 '^country:' | $awk '{print $NF;}')
	echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

query_for_as() {
	prepare_rir_maps
	asnum="$1"
	. $asn_mapper_tmp
	case $as_alloc in
		*ARIN*)
			handler_arin $asnum $whois_server
		;;
		*RIPE*)
			handler_ripe $asnum $whois_server
		;;
		*LACNIC*)
			handler_lacnic $asnum $whois_server
		;;
		*APNIC*)
			handler_apnic $asnum $whois_server
		;;
		*AFRINIC*)
			handler_afrinic $asnum $whois_server
		;;
		*)
			return # Unknown RIR
		;;
	esac
}

prepare_rir_maps() {
	if [[ -z "${asn_mapper_tmp// }" ]]; then
		asn16_tmp==(:)
		asn32_tmp==(:)
		ripe_as_cc_tmp==(:)
		$curl -sS -o $asn16_tmp $iana_as_alloc_16
		if [ $? -ne 0 ]; then
			echo "Failed to fetch AS number assignment plan from IANA" >&2
			exit 1
		fi
		$curl -sS -o $asn32_tmp $iana_as_alloc_32
		if [ $? -ne 0 ]; then
			echo "Failed to fetch AS number assignment plan from IANA" >&2
			exit 1
		fi
		$curl -sS -o $ripe_as_cc_tmp $ripe_as_to_country
		asn_mapper_tmp==(:)
		echo "case \$asnum in" > $asn_mapper_tmp
		(<$asn16_tmp <$asn32_tmp) | while read line; do
			parts=("${(@s/,/)line}")
			as_range=$parts[1]
			alloc=$parts[2]
			whois_server=$parts[3]
			if [[ -z "${whois_server// }" ]]; then
				continue # Non-interesting allocations do not have a whois server set
			fi
			if ! [[ $as_range =~ [0-9]+ ]]; then
				continue # Filters out the header at the top of the files
			fi
			if [[ $as_range = "0-65535" ]]; then
				continue # Only present in 32-bit registry, covers whole 16-bit registry referring to it
			fi
			if [[ $as_range =~ \- ]]; then
				is_range=1 # This is a range of AS numbers
				echo "<$as_range>)"
			else
				is_range=0 # This is a single AS number allocation
				echo "$as_range)"
			fi
			echo -e "\tas_alloc=\"$alloc\""
			echo -e "\twhois_server=\"$whois_server\""
			echo ";;"
		done >> $asn_mapper_tmp
		echo "esac" >> $asn_mapper_tmp
	fi
}

asinfo_tmp==(:)

for rrd in $(echo $rrd_path/??/*); do # We only update ASINFO for ASes we've seen traffic to/from
	bits=("${(@s,/,)rrd}")
	last_bit="${bits[${#bits}]}"
	bits=("${(@s,.,)last_bit}")
	asnum=$bits[1]
	get_as_data $asnum
done > $asinfo_tmp
zf_mv $asinfo_tmp $asinfo_path
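
The "validation of the resulting file" item from the list above could start from something like the following sketch. It assumes the four tab-separated fields that the script's `echo -e` lines produce (AS number, AS name, description, country); the function name `validate_asinfo` is hypothetical, and whether as-stats itself accepts other layouts is not checked here.

```sh
# Hypothetical validator: report every line of an asinfo file that does not
# look like "ASN<TAB>AS-NAME<TAB>DESCRIPTION<TAB>CC"; exit status 1 if any.
validate_asinfo() {
    awk -F '\t' '
        NF != 4 {
            printf "line %d: expected 4 tab-separated fields, found %d\n", NR, NF
            bad = 1; next
        }
        $1 !~ /^[0-9]+$/ {
            printf "line %d: AS number \"%s\" is not numeric\n", NR, $1
            bad = 1; next
        }
        $2 == "" {
            printf "line %d: empty AS name\n", NR
            bad = 1; next
        }
        $4 !~ /^[A-Z][A-Z]$/ {
            printf "line %d: country \"%s\" is not a two-letter code\n", NR, $4
            bad = 1; next
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Running it on the temporary file before the final move would keep a broken result from replacing a working asinfo.txt.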

@twiddern
Contributor Author

Thanks for this, but something isn't working correctly. Most country/flag information for RIPE countries is missing, and if a flag is displayed, it's from a region not served by RIPE.

@crazzy

crazzy commented Feb 22, 2017

Yeah, I noticed there are still a few things to work out. I had to pause this project for a bit for other stuff (stuff that brings my employer money). But most of the data is there.

As can be seen in the script, I took the RIPE country data from ftp://ftp.ripe.net/pub/stats/ripencc/delegated-ripencc-latest, which I assumed would work out, but apparently not. Or I have a parsing bug. The thing is, RIPE doesn't reliably publish the country for AS numbers in whois like the rest of the RIRs do.
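
For anyone checking the parsing, here is a minimal lookup sketch. It assumes the delegated file uses the RIR statistics exchange layout `registry|cc|type|start|value|date|status`, where for `asn` records `value` is the number of consecutive AS numbers covered. Under that layout the country code is the second field and the fourth is the first AS number of the record. The function name `country_for_asn` is hypothetical.

```sh
# Hypothetical helper: print the country code recorded for AS number $1 in
# the delegated statistics file $2 (prints nothing if there is no match).
country_for_asn() {
    awk -F '|' -v asn="$1" '
        # Keep only asn records with a numeric start; this skips the version
        # header and the "*" summary lines. A record covers start .. start+value-1.
        $3 == "asn" && $4 ~ /^[0-9]+$/ && asn >= $4 + 0 && asn < $4 + $5 {
            print $2
            exit
        }
    ' "$2"
}
```

Comparing numerically, instead of matching the AS number as a string, also avoids confusing AS3333 with AS33330 and handles records that cover more than one AS number.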
