build: Update DB [Mon Nov 11 00:52:09 UTC 2024]
ghost committed Nov 11, 2024
0 parents commit 81b85e0
Showing 10 changed files with 696 additions and 0 deletions.
40 changes: 40 additions & 0 deletions .README.tmpl
@@ -0,0 +1,40 @@
# teler Resource Collections

![Kitabisa SecLab](https://img.shields.io/badge/kitabisa-security%20project-blue)

This collection serves as the primary repository of external resources/datasets utilized by teler to identify potential threats. It is updated on a daily basis and undergoes automatic commit-and-push processes.

## Statistics

{{stats}}

> [!NOTE]
> Last updated at **{{updated_date}}**.

## Contributions

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a>

We express deep gratitude to the projects that have integrated their datasets. We explicitly state that we do **NOT** assert ownership or authorship of any of these resources. This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.

If you have any suggestions for additional resources that can aid in the development and advancement of the [teler](https://github.com/kitabisa/teler) or [teler-waf](https://github.com/kitabisa/teler-waf) project, we'd love to hear about them. :heart:

- **Common Web Attack**

This is taken from the [PHPIDS](https://github.com/PHPIDS/PHPIDS) project; please see their [README](https://github.com/PHPIDS/PHPIDS#credits) for names of those who have contributed.

- **CVEs**

The curated list of CVEs is derived from [Nuclei templates](https://github.com/projectdiscovery/nuclei-templates) provided by Project Discovery team, as well as contributions from the community.

- **Bad IP Address** & **Bad Referrer**

Both collections belong to [Nginx Ultimate Bad Bot Blocker](https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker) project. Bad IP Addresses collection also includes a list of [ipsum](https://github.com/stamparm/ipsum) (level: **3+**).

- **Bad Crawler**

This is taken from the [Crawler Detect](https://github.com/JayBizzle/Crawler-Detect) project, and you can find the names of contributors in the [Crawlers.txt](https://github.com/JayBizzle/Crawler-Detect/blob/master/raw/Crawlers.txt) file.

- **Directory Bruteforce**

This collection is obtained from the [fuzz.txt](https://github.com/Bo0oM/fuzz.txt).
8 changes: 8 additions & 0 deletions .github/dependabot.yaml
@@ -0,0 +1,8 @@
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "daily"
    labels:
      - "package-ecosystem"
50 changes: 50 additions & 0 deletions .github/scripts/convert-cves.py
@@ -0,0 +1,50 @@
import json
import os
import yaml

def get_parent_directory(path, levels=1):
    levels = levels + 1 if os.path.isfile(path) else 0
    for _ in range(levels):
        path = os.path.dirname(path)
    return path

def convert_yaml_to_json(input_file):
    with open(input_file, 'r') as f:
        yaml_data = yaml.safe_load(f)

    try:
        json_data = {
            "id": yaml_data['id'],
            "info": {
                "name": yaml_data['info']['name'],
                "severity": yaml_data['info']['severity']
            },
            "requests": yaml_data['http']
        }
    except KeyError:
        return False

    return json_data

def convert_cves(directory_path):
    json_output = {"templates": []}
    for root, _, files in os.walk(directory_path):
        for file_name in files:
            if file_name.endswith(".yaml"):
                input_file = os.path.join(root, file_name)
                json_data = convert_yaml_to_json(input_file)
                if not json_data:
                    continue
                json_output["templates"].append(json_data)

    return json_output

if __name__ == "__main__":
    input_dir = "/tmp/nuclei-templates-main/http/cves/"
    workspace_dir = get_parent_directory(os.path.abspath(__file__), levels=2)
    output_file = os.path.join(workspace_dir, "db", "cves.json")

    converted_data = convert_cves(input_dir)

    with open(output_file, 'w') as f:
        json.dump(converted_data, f, separators=(',', ':'))
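
To illustrate the reshaping step in `convert_yaml_to_json`, here is a minimal sketch that works on an already-parsed template, so it needs no PyYAML. The function name `reshape_template` and the sample template values are hypothetical and made up for illustration; only the key names (`id`, `info.name`, `info.severity`, `http`, `requests`) come from the script.

```python
def reshape_template(data):
    """Keep only the fields the converter emits; return False if any are missing."""
    try:
        return {
            "id": data["id"],
            "info": {
                "name": data["info"]["name"],
                "severity": data["info"]["severity"],
            },
            # The source key is "http", but the output key is "requests".
            "requests": data["http"],
        }
    except KeyError:
        return False


# Hypothetical sample input (not a real template).
sample = {
    "id": "CVE-0000-0000",
    "info": {"name": "Example", "severity": "high", "author": "someone"},
    "http": [{"method": "GET", "path": ["{{BaseURL}}/"]}],
}

converted = reshape_template(sample)
# A template without an "http" section is rejected.
skipped = reshape_template({"id": "no-http", "info": {"name": "x", "severity": "low"}})
```

Templates that lack any of the expected keys come back as `False`, which is why `convert_cves` skips them with `continue`.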
45 changes: 45 additions & 0 deletions .github/scripts/generate-readme.sh
@@ -0,0 +1,45 @@
#!/bin/bash

function count() {
    local path="db/${1}"

    if [[ ! "${1}" =~ \.json$ ]]; then
        wc -l "${path}" | awk '{print $1}'
        return 0
    fi

    local prop
    case "${1}" in
        common-web-attacks* )
            prop="filters"
        ;;
        cves* )
            prop="templates"
        ;;
    esac
    jq -r ".${prop} | length" "${path}"
}

CWA_count=$(count "common-web-attacks.json")
CVEs_count=$(count "cves.json")
BadIP_count=$(count "bad-ip-addresses.txt")
BadRef_count=$(count "bad-referrers.txt")
BadCrawl_count=$(count "bad-crawlers.txt")
DirBrute_count=$(count "directory-bruteforces.txt")
total_count=$((CWA_count + CVEs_count + BadIP_count + BadRef_count + BadCrawl_count + DirBrute_count))

STATS_TABLE=${STATS_TABLE_TMPL}
STATS_TABLE=$(echo "${STATS_TABLE}" | sed "s/{{CWA_count}}/${CWA_count}/")
STATS_TABLE=$(echo "${STATS_TABLE}" | sed "s/{{CVEs_count}}/${CVEs_count}/")
STATS_TABLE=$(echo "${STATS_TABLE}" | sed "s/{{BadIP_count}}/${BadIP_count}/")
STATS_TABLE=$(echo "${STATS_TABLE}" | sed "s/{{BadRef_count}}/${BadRef_count}/")
STATS_TABLE=$(echo "${STATS_TABLE}" | sed "s/{{BadCrawl_count}}/${BadCrawl_count}/")
STATS_TABLE=$(echo "${STATS_TABLE}" | sed "s/{{DirBrute_count}}/${DirBrute_count}/")
STATS_TABLE=$(echo "${STATS_TABLE}" | sed "s/{{total_count}}/${total_count}/")
STATS_TABLE=$(echo "${STATS_TABLE}" | sed ':a;N;$!ba;s/\n/\\n/g')

README_TMPL=$(cat .README.tmpl)
README_TMPL=$(echo "${README_TMPL}" | sed "s/{{stats}}/${STATS_TABLE}/") # it doesn't work
README_TMPL=$(echo "${README_TMPL}" | sed "s/{{updated_date}}/$(date)/")

echo "${README_TMPL}"
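
The `{{stats}}` substitution in this script is marked as not working, and the commit does not say why. One plausible cause, stated here purely as an assumption, is that `sed` gives special meaning to characters such as `/`, `&`, and `\` in the replacement text. A minimal sketch of the same templating step in Python follows, where `str.replace` treats both the placeholder and the value literally. The function `render` and the sample table and README strings are hypothetical.

```python
def render(template, values):
    """Replace each {{name}} placeholder with its value, taken literally."""
    out = template
    for name, value in values.items():
        out = out.replace("{{" + name + "}}", str(value))
    return out


# Hypothetical inputs for illustration only.
stats_table = "| Resource | Count |\n|---|---|\n| CVEs | {{CVEs_count}} |"
readme_tmpl = "## Statistics\n\n{{stats}}\n\n> Last updated at **{{updated_date}}**."

table = render(stats_table, {"CVEs_count": 123})
# The date value deliberately contains "/" and "&" to show they pass through unchanged.
readme = render(readme_tmpl, {"stats": table, "updated_date": "a/b & c"})
```

Because nothing in the value is interpreted, a multi-line table can be inserted without first escaping its newlines.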
17 changes: 17 additions & 0 deletions .github/workflows/deploy-teler-waf-demo.yaml
@@ -0,0 +1,17 @@
name: Deploy waf.teler.app

on:
  workflow_run:
    workflows: ["Update resources database"]
    types:
      - completed

jobs:
  deploy:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    steps:
      - name: Call Deploy Hook
        env:
          HOOK_URL: ${{ secrets.TELER_WAF_DEMO_DEPLOY_HOOK_URL }}
        run: curl "${HOOK_URL}"
69 changes: 69 additions & 0 deletions .github/workflows/update-db.yaml
@@ -0,0 +1,69 @@
name: "Update resources database"

on:
  schedule:
    - cron: "0 0 * * *"
  workflow_dispatch:

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@master

      - name: Configure Git
        env:
          USER: "ghost"
          HOST: "users.noreply.github.com"
        run: |
          git config --local user.email "${USER}@${HOST}"
          git config --local user.name "${USER}"

      - name: Filtering branch (rm DB history)
        env:
          FILTER_BRANCH_SQUELCH_WARNING: "1"
        run: git filter-branch --index-filter "git rm --cached --ignore-unmatch -r db/" $GITHUB_REF

      - name: Install dependencies
        run: sudo apt install jq zstd tar wget -y

      - name: Downloading latest nuclei-templates
        run: wget -q https://github.com/projectdiscovery/nuclei-templates/archive/refs/heads/main.zip && unzip main.zip
        working-directory: "/tmp"

      - name: Downloading other resources
        run: |
          mkdir -p db
          wget -q "https://raw.githubusercontent.com/Bo0oM/fuzz.txt/master/fuzz.txt" -O "db/directory-bruteforces.txt" &
          wget -q "https://raw.githubusercontent.com/dwisiswant0/cwa-filter-rules/master/dist/filters.json" -O "db/common-web-attacks.json" &
          wget -q "https://raw.githubusercontent.com/JayBizzle/Crawler-Detect/master/raw/Crawlers.txt" -O "db/bad-crawlers.txt" &
          wget -q "https://raw.githubusercontent.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/master/_generator_lists/bad-referrers.list" -O "db/bad-referrers.txt" &
          wget -qP /tmp "https://raw.githubusercontent.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/master/_generator_lists/bad-ip-addresses.list" &
          wget -qP /tmp "https://raw.githubusercontent.com/stamparm/ipsum/master/levels/3.txt" &
          wait
          sort -u /tmp/bad-ip-addresses.list /tmp/3.txt > "db/bad-ip-addresses.txt"

      - name: Convert CVEs resources
        run: python .github/scripts/convert-cves.py

      - name: Build DB
        env:
          DB: "db.tar.zst"
        working-directory: "db/"
        run: rm -f *.zst MD5SUMS; tar -cf - * | zstd -o "${DB}"; md5sum * > MD5SUMS

      - name: Generate README
        env:
          STATS_TABLE_TMPL: ${{ vars.STATS_TABLE_TMPL }}
        run: bash .github/scripts/generate-readme.sh | tee README.md

      - name: Push resources
        run: |
          COMMIT_MSG="build: "
          [[ "${{ github.event_name }}" == "workflow_dispatch" ]] && COMMIT_MSG+="[force] "
          COMMIT_MSG+="Update DB [$(date)]"
          git add .
          git commit -m "${COMMIT_MSG}"
          git push origin master -f
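
The "Build DB" step writes an `MD5SUMS` file next to the resources. The commit does not show how a consumer checks it, so the following is only a sketch of one way to do that, assuming the usual `md5sum` output format of a hex digest followed by the file name. The function `verify_md5sums` and the demo file contents are hypothetical.

```python
import hashlib
import os
import tempfile


def verify_md5sums(directory, sums_name="MD5SUMS"):
    """Return the names listed in the checksum file whose MD5 does not match."""
    mismatched = []
    with open(os.path.join(directory, sums_name), "r") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            # Assumed format: "<hex digest>  <name>" (or " *<name>" in binary mode).
            expected, name = line.split(" ", 1)
            name = name.lstrip(" *")
            with open(os.path.join(directory, name), "rb") as data:
                actual = hashlib.md5(data.read()).hexdigest()
            if actual != expected:
                mismatched.append(name)
    return mismatched


# Self-contained demo using a throwaway directory and a made-up file.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "bad-crawlers.txt")
    with open(target, "wb") as f:
        f.write(b"example\n")
    digest = hashlib.md5(b"example\n").hexdigest()
    with open(os.path.join(tmp, "MD5SUMS"), "w") as f:
        f.write(f"{digest}  bad-crawlers.txt\n")
    ok_result = verify_md5sums(tmp)

    # Change the file after the checksum was recorded.
    with open(target, "wb") as f:
        f.write(b"tampered\n")
    bad_result = verify_md5sums(tmp)
```

An empty list means every listed file matched its recorded digest. MD5 is suitable here only for detecting accidental corruption, not for defending against deliberate tampering.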
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
get-resources.sh
auto-commit.sh
cves.go
go.*