Fix some FP cases for base64 parts #627

Merged: 4 commits, Dec 5, 2024
12 changes: 6 additions & 6 deletions .ci/benchmark.txt
@@ -1,4 +1,4 @@
-META MD5 72b4b7db8a2ffef0f19e802c09032e14
+META MD5 414228344bac7e55c5127be7b244e460
DATA MD5 abd9c025d5c323af814fbeb33f469c90
DATA: 16342283 interested lines. MARKUP: 62020 items
FileType FileNumber ValidLines Positives Negatives Templates
@@ -82,7 +82,7 @@ FileType FileNumber ValidLines Positives Negatives Templat
.ipynb 1 134 5
.j 1 241 4
.j2 30 5530 6 186 10
-.java 621 134132 362 1363 172
+.java 621 134132 362 1365 171
.jenkinsfile 1 58 2 6
.jinja2 1 64 2
.js 659 536413 531 2497 331
@@ -222,7 +222,7 @@ FileType FileNumber ValidLines Positives Negatives Templat
.yml 419 36169 559 889 376
.zsh 6 872 12
.zsh-theme 1 97 1
-TOTAL: 10232 16342283 12255 49690 5102
+TOTAL: 10232 16342283 12255 49692 5101
credsweeper result_cnt : 11517, lost_cnt : 0, true_cnt : 11342, false_cnt : 175
Rules Positives Negatives Templates Reported TP FP TN FN FPR FNR ACC PRC RCL F1
------------------------------ ----------- ----------- ----------- ---------- ----- ---- ----- ---- -------- -------- -------- -------- -------- --------
@@ -261,14 +261,14 @@ Key 3909 15717 485 394
Nonce 91 49 0 89 88 1 48 3 0.020408 0.032967 0.971429 0.988764 0.967033 0.977778
Other 8 7445 1 0 0 7446 8 0.000000 1.000000 0.998927 0.000000
PEM Private Key 1019 1483 0 1023 1019 4 1479 0 0.002697 0.000000 0.998401 0.996090 1.000000 0.998041
-Password 1869 7535 2680 1776 1758 18 10197 111 0.001762 0.059390 0.989325 0.989865 0.940610 0.964609
+Password 1869 7536 2680 1776 1758 18 10198 111 0.001762 0.059390 0.989326 0.989865 0.940610 0.964609
Salt 47 76 1 44 44 0 77 3 0.000000 0.063830 0.975806 1.000000 0.936170 0.967033
Secret 1297 1576 802 1288 1283 5 2373 14 0.002103 0.010794 0.994830 0.996118 0.989206 0.992650
Seed 1 6 0 0 0 6 1 0.000000 1.000000 0.857143 0.000000
Slack Token 4 1 0 4 4 0 1 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
Tencent WeChat API App ID 6 0 0 6 6 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
Token 643 4170 454 616 614 2 4622 29 0.000433 0.045101 0.994114 0.996753 0.954899 0.975377
Twilio Credentials 30 39 0 30 30 0 39 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
-URL Credentials 210 156 216 205 205 0 372 5 0.000000 0.023810 0.991409 1.000000 0.976190 0.987952
+URL Credentials 210 157 215 205 205 0 372 5 0.000000 0.023810 0.991409 1.000000 0.976190 0.987952
UUID 1069 265 0 1068 1067 1 264 2 0.003774 0.001871 0.997751 0.999064 0.998129 0.998596
-12255 49690 5102 11524 11342 175 49515 913 0.003522 0.074500 0.982436 0.984805 0.925500 0.954232
+12255 49692 5101 11524 11342 175 49517 913 0.003522 0.074500 0.982437 0.984805 0.925500 0.954232
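The metric columns in the totals row appear consistent with the standard confusion-matrix definitions. A minimal Python sketch, assuming those definitions (the function `metrics` is a hypothetical helper, not part of CredSweeper), recomputes the row from its TP, FP, TN and FN counts before and after the change:

```python
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "PRC": precision,
        "RCL": recall,
        "F1": 2 * precision * recall / (precision + recall),
    }

# Counts copied from the old and new totals rows of the benchmark
before = metrics(tp=11342, fp=175, tn=49515, fn=913)
after = metrics(tp=11342, fp=175, tn=49517, fn=913)
for name in after:
    print(f"{name}: {before[name]:.6f} -> {after[name]:.6f}")
```

Only the accuracy moves at six decimal places, since two additional true negatives are small against roughly 62,000 scored items.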
2 changes: 1 addition & 1 deletion credsweeper/__init__.py
@@ -20,4 +20,4 @@
     '__version__'
 ]

-__version__ = "1.9.4"
+__version__ = "1.9.5"
3 changes: 2 additions & 1 deletion credsweeper/filters/value_atlassian_token_check.py
@@ -63,7 +63,8 @@ def check_atlassian_struct(value: str) -> bool:
     # there is limit for big integer value: math.log10(1<<64) = 19.265919722494797
     if 0 < delimiter_pos <= 20:
         val = decoded[:delimiter_pos].decode(LATIN_1)
-        if int(val):
+        # at least 3 digits in the token
+        if 100 < int(val):
             # test for ascii and Shannon entropy - there should be random data
             data = decoded[delimiter_pos + 1:]
             return Util.is_ascii_entropy_validate(data)
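The tightened condition can be looked at in isolation. The sketch below is not the project's function: `old_check` and `new_check` are hypothetical helpers that compare only the two predicates applied to the decoded numeric prefix.

```python
import math

def old_check(prefix: str) -> bool:
    """Previous predicate: any non-zero integer prefix passes."""
    return bool(int(prefix))

def new_check(prefix: str) -> bool:
    """Updated predicate: the integer prefix must be greater than 100."""
    return 100 < int(prefix)

# The source comment ties the 20-character bound on the prefix to this value
print(math.log10(1 << 64))

for prefix in ("0", "7", "100", "101", "1234567"):
    print(prefix, old_check(prefix), new_check(prefix))
```

Note that `100` has three digits but does not satisfy `100 < int(val)`. Whether that boundary matters depends on the numeric prefixes real tokens carry, which this diff does not show.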
4 changes: 2 additions & 2 deletions credsweeper/filters/value_base64_part_check.py
@@ -30,7 +30,7 @@ def run(self, line_data: LineData, target: AnalysisTarget) -> bool:
         """

         with contextlib.suppress(Exception):
-            if line_data.value_start and '/' == line_data.line[line_data.value_start - 1]:
+            if line_data.value_start and line_data.line[line_data.value_start - 1] in ('/', '+'):
                 if '-' in line_data.value or '_' in line_data.value:
                     # the value contains url-safe chars, so '/' is a delimiter
                     return False
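A simplified view of the widened boundary test, assuming plain strings in place of `LineData` (the function `preceded_by_base64_char` is a hypothetical helper, not part of the filter):

```python
def preceded_by_base64_char(line: str, value_start: int) -> bool:
    """Return True when the character just before the value is '/' or '+'.

    Both characters belong to the standard base64 alphabet, so either one
    suggests the matched value may continue a longer encoded run.
    """
    return bool(value_start) and line[value_start - 1] in ('/', '+')

line = "data: QUJD+EAAAxyz"
start = line.index("EAAA")
print(preceded_by_base64_char(line, start))       # '+' precedes the value
print(preceded_by_base64_char("EAAAxyz", 0))      # value sits at the line start
print(preceded_by_base64_char("key=EAAAxyz", 4))  # '=' precedes the value
```

Before this change only `/` triggered the branch, so a value following `+` skipped the check entirely.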
@@ -48,7 +48,7 @@ def run(self, line_data: LineData, target: AnalysisTarget) -> bool:
             data = [value_entropy, left_entropy, right_entropy]
             avg = statistics.mean(data)
             stdev = statistics.stdev(data, avg)
-            avg_min = avg - stdev
+            avg_min = avg - 1.1 * stdev
             if avg_min < left_entropy and avg_min < right_entropy:
                 # high entropy of bound parts looks like a part of base64 long line
                 return True
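The effect of the new factor can be checked with a small standalone sketch. The function `bounds_look_like_base64` and its `factor` parameter are hypothetical, introduced only to compare the two thresholds; the entropy values are illustrative, not measured.

```python
import statistics

def bounds_look_like_base64(value_entropy: float, left_entropy: float,
                            right_entropy: float, factor: float) -> bool:
    """Return True when both neighbouring entropies exceed avg - factor * stdev."""
    data = [value_entropy, left_entropy, right_entropy]
    avg = statistics.mean(data)
    stdev = statistics.stdev(data, avg)
    avg_min = avg - factor * stdev
    return avg_min < left_entropy and avg_min < right_entropy

# Illustrative entropies: (value, left, right)
sample = (4.6, 4.0, 4.4)
print(bounds_look_like_base64(*sample, factor=1.0))  # previous threshold
print(bounds_look_like_base64(*sample, factor=1.1))  # updated threshold
```

With three samples, the smallest value can lie at most 2/√3 ≈ 1.155 sample standard deviations below the mean. A factor of 1.1 is close to that ceiling, so the condition now fails only when one bound is a pronounced low outlier relative to the other two entropies.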
4 changes: 3 additions & 1 deletion credsweeper/rules/config.yaml
@@ -593,7 +593,9 @@
   type: pattern
   values:
   - (?<![0-9A-Za-z_-])(?P<value>EAAA[0-9A-Za-z_-]{60})(?![0-9A-Za-z_-])
-  filter_type: GeneralPattern
+  filter_type:
+  - ValuePatternCheck
+  - ValueBase64PartCheck
   validations:
   - SquareAccessTokenValidation
   required_substrings:
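The rule's pattern can be exercised with Python's `re` module to see why a base64-part filter is plausibly useful here. This is a sketch with synthetic strings; the motivation given in the note after the code is an inference from the diff, not something the PR states.

```python
import re

# Pattern copied from the rule in config.yaml
PATTERN = re.compile(
    r"(?<![0-9A-Za-z_-])(?P<value>EAAA[0-9A-Za-z_-]{60})(?![0-9A-Za-z_-])"
)

token = "EAAA" + "x" * 60  # synthetic 64-character value, not a real credential

standalone = f"token = {token}"
inside_base64 = f"QUJDREVG+{token}/R0hJSktM"
inside_longer_run = f"abc{token}def"

for line in (standalone, inside_base64, inside_longer_run):
    match = PATTERN.search(line)
    print(bool(match), match.start("value") if match else None)
```

The lookarounds exclude only URL-safe alphabet characters, so a candidate bounded by `+` or `/` inside a standard base64 blob still matches the pattern. A filter such as `ValueBase64PartCheck` is then the place where such matches can be rejected.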