retrain
babenek committed Sep 5, 2024
1 parent f4ea9a2 commit 926f286
Showing 7 changed files with 1,284 additions and 1,581 deletions.
54 changes: 27 additions & 27 deletions .ci/benchmark.txt
@@ -1,9 +1,9 @@
- META MD5 e9f6945acb9321f73d44b8ed37b811ed
- DATA MD5 d1b647616f0a004042d24c64a77327a0
- DATA: 16345596 interested lines. MARKUP: 62811 items
+ META MD5 78cb5d3d6472d88c6f785225321626ea
+ DATA MD5 79ac8163a2132e6d6588718e81ced074
+ DATA: 16345596 interested lines. MARKUP: 62813 items
FileType FileNumber ValidLines Positives Negatives Templates
--------------- ------------ ------------ ----------- ----------- -----------
- 194 28318 70 417 91
+ 194 28318 71 417 90
.1 2 641 2 5
.admx 1 26 1
.adoc 1 158 13 6 1
@@ -55,15 +55,15 @@ FileType FileNumber ValidLines Positives Negatives Templat
.erb 13 323 27
.erl 4 96 7
.ex 25 4968 5 98 5
- .example 17 1838 74 38 54
+ .example 17 1838 75 38 53
.exs 24 4842 8 187 4
.ext 5 211 1 4 2
.fsproj 1 75 1 2
.g4 2 201 2
.gd 1 37 1
.gml 3 3075 16
.gni 3 5017 19
- .go 1080 566476 693 4124 738
+ .go 1080 566476 694 4123 738
.golden 5 1168 1 13 29
.gradle 45 3265 4 90 100
.graphql 7 420 13
@@ -82,11 +82,11 @@ FileType FileNumber ValidLines Positives Negatives Templat
.ipynb 1 134 5
.j 1 241 4
.j2 30 5530 6 186 10
- .java 621 134132 363 1364 170
+ .java 621 134132 361 1366 170
.jenkinsfile 1 58 2 6
.jinja2 1 64 2
- .js 659 536413 535 2496 328
- .json 851 13046493 1071 10911 140
+ .js 659 536413 536 2496 328
+ .json 851 13046493 1074 10910 140
.jsp 13 3202 1 40
.jsx 7 857 19
.jwt 1 1 2
@@ -136,7 +136,7 @@ FileType FileNumber ValidLines Positives Negatives Templat
.pbxproj 1 941 2
.pem 48 1169 47 8
.php 371 75710 129 1620 79
- .pl 16 14727 6 34
+ .pl 16 14727 7 33
.pm 3 744 7
.po 3 2994 15
.pod 9 1859 1 23
@@ -153,13 +153,13 @@ FileType FileNumber ValidLines Positives Negatives Templat
.pug 2 193 2
.purs 1 69 4
.pxd 1 150 5 2
- .py 890 291553 683 3301 723
+ .py 890 291553 684 3301 723
.pyi 4 1361 9
.pyp 1 167 1
.pyx 2 1094 23
.r 4 62 6 3 1
.rake 2 51 2
- .rb 860 131838 259 3337 613
+ .rb 860 131838 260 3337 612
.re 1 31 1
.red 1 159 1
.release 1 13 4
@@ -179,7 +179,7 @@ FileType FileNumber ValidLines Positives Negatives Templat
.scala 40 5071 22 101
.scss 16 8553 32 1
.secrets 1 11 1
- .sh 143 21525 54 480 30
+ .sh 143 21525 55 479 30
.slim 1 153 1 2
.smali 1 775 18
.snap 3 1708 9 30 2
@@ -209,7 +209,7 @@ FileType FileNumber ValidLines Positives Negatives Templat
.ts 583 106730 159 1800 201
.tsx 54 7914 1 114 5
.ttar 1 452 1
- .txt 440 78102 5272 6373 49
+ .txt 440 78102 5277 6368 49
.utf8 1 77 2
.vsixmanifest 1 36 1
.vsmdi 1 6 2
@@ -220,19 +220,19 @@ FileType FileNumber ValidLines Positives Negatives Templat
.xml 9 689 9
.xsl 1 311 1
.yaml 137 19004 125 345 42
- .yml 419 36169 552 890 379
+ .yml 419 36169 555 888 378
.zsh 6 872 12
.zsh-theme 1 97 1
- TOTAL: 10264 16345596 12187 50518 5102
- credsweeper result_cnt : 11522, lost_cnt : 0, true_cnt : 11225, false_cnt : 297
+ TOTAL: 10264 16345596 12204 50509 5098
+ credsweeper result_cnt : 11482, lost_cnt : 0, true_cnt : 11234, false_cnt : 248
Rules Positives Negatives Templates Reported TP FP TN FN FPR FNR ACC PRC RCL F1
------------------------------ ----------- ----------- ----------- ---------- ----- ---- ----- ---- -------- -------- -------- -------- -------- --------
API 129 3162 189 127 125 2 3349 4 0.000597 0.031008 0.998276 0.984252 0.968992 0.976562
AWS Client ID 168 21 0 160 160 0 21 8 0.000000 0.047619 0.957672 1.000000 0.952381 0.975610
AWS Multi 75 16 0 87 75 11 5 0 0.687500 0.000000 0.879121 0.872093 1.000000 0.931677
AWS S3 Bucket 66 24 0 92 66 24 0 0 1.000000 0.000000 0.733333 0.733333 1.000000 0.846154
Atlassian Old PAT token 27 308 3 12 3 8 303 24 0.025723 0.888889 0.905325 0.272727 0.111111 0.157895
- Auth 419 2738 76 400 387 13 2801 32 0.004620 0.076372 0.986081 0.967500 0.923628 0.945055
+ Auth 419 2739 76 391 386 5 2810 33 0.001776 0.078759 0.988250 0.987212 0.921241 0.953086
Azure Access Token 19 0 0 12 12 0 0 7 0.368421 0.631579 1.000000 0.631579 0.774194
BASE64 Private Key 7 4 0 7 7 0 4 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
BASE64 encoded PEM Private Key 7 0 0 5 5 0 0 2 0.285714 0.714286 1.000000 0.714286 0.833333
@@ -242,8 +242,8 @@ CMD ConvertTo-SecureString 13 4 0 1
CMD Password 21 128 6 19 19 0 134 2 0.000000 0.095238 0.987097 1.000000 0.904762 0.950000
CMD Secret 1 1 0 1 1 0 1 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
CMD Token 6 0 0 6 6 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
- Certificate 23 471 1 25 19 6 466 4 0.012712 0.173913 0.979798 0.760000 0.826087 0.791667
- Credential 95 420 74 95 94 1 493 1 0.002024 0.010526 0.996604 0.989474 0.989474 0.989474
+ Certificate 23 471 1 23 17 6 466 6 0.012712 0.260870 0.975758 0.739130 0.739130 0.739130
+ Credential 95 420 74 92 91 1 493 4 0.002024 0.042105 0.991511 0.989130 0.957895 0.973262
Docker Swarm Token 2 0 0 1 1 0 0 1 0.500000 0.500000 1.000000 0.500000 0.666667
Dropbox App secret 64 139 1 46 35 10 130 29 0.071429 0.453125 0.808824 0.777778 0.546875 0.642202
Facebook Access Token 0 1 0 0 0 1 0 0.000000 1.000000
@@ -258,17 +258,17 @@ Grafana Provisioned API Key 22 1 0
JSON Web Token 170 61 0 131 131 0 61 39 0.000000 0.229412 0.831169 1.000000 0.770588 0.870432
Jira / Confluence PAT token 0 4 0 0 0 4 0 0.000000 1.000000
Jira 2FA 15 6 1 12 12 0 7 3 0.000000 0.200000 0.863636 1.000000 0.800000 0.888889
- Key 3905 15723 482 3972 3882 90 16115 23 0.005554 0.005890 0.994381 0.977341 0.994110 0.985654
+ Key 3908 15720 482 3963 3877 86 16116 31 0.005308 0.007932 0.994182 0.978299 0.992068 0.985135
Nonce 91 49 0 89 89 0 49 2 0.000000 0.021978 0.985714 1.000000 0.978022 0.988889
Other 0 8291 1 0 0 8292 0 0.000000 1.000000
PEM Private Key 1019 1483 0 1023 1019 4 1479 0 0.002697 0.000000 0.998401 0.996090 1.000000 0.998041
- Password 1847 7539 2702 1777 1708 63 10178 139 0.006152 0.075257 0.983289 0.964427 0.924743 0.944168
- Salt 45 76 2 43 42 1 77 3 0.012821 0.066667 0.967480 0.976744 0.933333 0.954545
- Secret 1298 1576 798 1278 1276 2 2372 22 0.000842 0.016949 0.993464 0.998435 0.983051 0.990683
+ Password 1854 7537 2698 1746 1710 36 10199 144 0.003517 0.077670 0.985110 0.979381 0.922330 0.950000
+ Salt 45 76 2 43 43 0 78 2 0.000000 0.044444 0.983740 1.000000 0.955556 0.977273
+ Secret 1300 1576 798 1279 1278 1 2373 22 0.000421 0.016923 0.993740 0.999218 0.983077 0.991082
Seed 1 6 0 0 0 6 1 0.000000 1.000000 0.857143 0.000000
Slack Token 4 1 0 4 4 0 1 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
- Token 647 4176 438 617 604 13 4601 43 0.002818 0.066461 0.989356 0.978930 0.933539 0.955696
+ Token 652 4171 438 623 618 5 4604 34 0.001085 0.052147 0.992587 0.991974 0.947853 0.969412
Twilio API Key 0 5 2 0 0 7 0 0.000000 1.000000
- URL Credentials 210 151 220 209 206 3 368 4 0.008086 0.019048 0.987952 0.985646 0.980952 0.983294
+ URL Credentials 210 151 220 210 207 3 368 3 0.008086 0.014286 0.989673 0.985714 0.985714 0.985714
UUID 1069 265 0 1068 1067 1 264 2 0.003774 0.001871 0.997751 0.999064 0.998129 0.998596
- 12187 50518 5102 11535 11225 297 50221 962 0.005879 0.078937 0.979922 0.974223 0.921063 0.946898
+ 12204 50509 5098 11489 11234 248 50261 970 0.004910 0.079482 0.980578 0.978401 0.920518 0.948577
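
The rate columns in the rules table (FPR, FNR, ACC, PRC, RCL, F1) appear to follow the usual confusion-matrix definitions. The sketch below is not taken from the repository; `confusion_metrics` and `safe_ratio` are hypothetical names, and the formulas are an assumption inferred from the summary row, which they reproduce to six decimals. A ratio with a zero denominator is returned as `None`, which would correspond to the blank cells in rows such as "Azure Access Token".

```python
from typing import Dict, Optional


def safe_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Optional[float]]:
    """Compute the six rate columns from raw confusion-matrix counts.

    Assumed (standard) definitions, not confirmed by the source:
      FPR = FP / (FP + TN)      FNR = FN / (FN + TP)
      ACC = (TP + TN) / total   PRC = TP / (TP + FP)
      RCL = TP / (TP + FN)      F1  = 2*TP / (2*TP + FP + FN)
    """
    return {
        "FPR": safe_ratio(fp, fp + tn),
        "FNR": safe_ratio(fn, fn + tp),
        "ACC": safe_ratio(tp + tn, tp + fp + tn + fn),
        "PRC": safe_ratio(tp, tp + fp),
        "RCL": safe_ratio(tp, tp + fn),
        "F1": safe_ratio(2 * tp, 2 * tp + fp + fn),
    }


# Counts from the later of the two summary rows: TP=11234, FP=248, TN=50261, FN=970.
totals = confusion_metrics(tp=11234, fp=248, tn=50261, fn=970)
for name, value in totals.items():
    print(f"{name}: {value:.6f}")
```

Under the same assumption, the earlier summary row (TP=11225, FP=297, TN=50221, FN=962) can be passed to the same function to compare the two model versions.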
Binary file modified credsweeper/ml_model/ml_model.onnx
Binary file not shown.
10 changes: 5 additions & 5 deletions tests/__init__.py
@@ -8,18 +8,18 @@
NEGLIGIBLE_ML_THRESHOLD = 0.0001

# credentials count after scan
- SAMPLES_CRED_COUNT: int = 380
- SAMPLES_CRED_LINE_COUNT: int = 397
+ SAMPLES_CRED_COUNT: int = 384
+ SAMPLES_CRED_LINE_COUNT: int = 401

# credentials count after post-processing
- SAMPLES_POST_CRED_COUNT: int = 351
+ SAMPLES_POST_CRED_COUNT: int = 344

# with option --doc
SAMPLES_IN_DOC = 420

# archived credentials that are not found without --depth
- SAMPLES_IN_DEEP_1 = SAMPLES_POST_CRED_COUNT + 23
- SAMPLES_IN_DEEP_2 = SAMPLES_IN_DEEP_1 + 21
+ SAMPLES_IN_DEEP_1 = SAMPLES_POST_CRED_COUNT + 24
+ SAMPLES_IN_DEEP_2 = SAMPLES_IN_DEEP_1 + 19
SAMPLES_IN_DEEP_3 = SAMPLES_IN_DEEP_2 + 1

# well known string with all latin letters
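
Because the `SAMPLES_IN_DEEP_*` constants are chained, changing the base count and the increments shifts every derived value. The sketch below works through that arithmetic; `deep_counts` is a hypothetical helper, not part of the test module, and it assumes the first value of each changed pair is the pre-commit one and the second is the post-commit one.

```python
from typing import Tuple


def deep_counts(post_cred_count: int, step1: int, step2: int, step3: int) -> Tuple[int, int, int]:
    """Mirror the chained definitions of SAMPLES_IN_DEEP_1..3."""
    deep_1 = post_cred_count + step1
    deep_2 = deep_1 + step2
    deep_3 = deep_2 + step3
    return deep_1, deep_2, deep_3


# Values as they appear in the hunk (assumed ordering: old first, new second).
before = deep_counts(351, 23, 21, 1)
after = deep_counts(344, 24, 19, 1)

print("before:", before)
print("after: ", after)
```

Under that assumption the expected deep-scan totals fall at every depth, driven mostly by the lower post-processing count.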