Windows binaries are in Downloads.
If you want to take a look at the code, see the hg repo.
I believe a picture is worth a thousand words, so here's the pic:
cmd> clamscan --database=clamsrch.ndb --database=clamsrch.ldb --database=clampeid.ndb input.bin
cs clamsrch Multi-Sig searcher results
cs bm: 00021eb0 Lucifer (outerbridge) DFLTKY [8.byt.16].UNOFFICIAL
cs bm: 00021ee0 SHA256 Hash constant words K (0x428a2f98) [32.lil.256].UNOFFICIAL
cs bm: 00021fe4 Crypton kp [32.lil.16].UNOFFICIAL
cs bm: 00021ff4 Crypton kq [32.lil.16].UNOFFICIAL
cs bm: 000b7230 rfc3548 Base 64 Encoding with URL and Filename Safe Alphabet [8.byt.ASC.62].UNOFFICIAL
cs lo: ........ SHA256 Initial hash value H (0x6a09e667UL) [32.lil.LOGIC].UNOFFICIAL
cs lo: ........ MD4 digest [32.lil.LOGIC].UNOFFICIAL
cs lo: ........ RIPEMD [32.lil.LOGIC].UNOFFICIAL
cs lo: ........ SHA1 / SHA0 / RIPEMD-160 initialization [32.lil.LOGIC].UNOFFICIAL
cs lo: ........ RIPEMD k values [32.lil.LOGIC].UNOFFICIAL
cs ac: 0000004e (gl) dUP v2.x Patcher --> www.diablo2oo2.cjb.net.UNOFFICIAL
cs ac: 00000400 (ep) MingWin32 - Dev C++ v4.x (h).UNOFFICIAL
cs ac: 00000400 (gl) Dev-C++ v5.UNOFFICIAL
cs ac: 00000420 (gl) Dev-C++ v5.UNOFFICIAL
cs ac: 00022004 SHA224 [32.lil.AND].UNOFFICIAL
cs ac: 0005b5b8 RIPEMD-128 InitState [32.lil.AND].UNOFFICIAL
cs ac: 0005b638 RIPEMD-128 InitState [32.lil.AND].UNOFFICIAL
cs ac: 0005b85c RIPEMD-128 InitState [32.lil.AND].UNOFFICIAL
cs ac: 0005b5b8 RIPEMD-128 InitState [32.lil.AND].UNOFFICIAL
cs ac: 0005b638 RIPEMD-128 InitState [32.lil.AND].UNOFFICIAL
cs ac: 0005b85c RIPEMD-128 InitState [32.lil.AND].UNOFFICIAL
d:\projects\idfier\input.bin: OK
----------- SCAN SUMMARY -----------
Known viruses: 5844
Engine version: devel-clamav-0.97-475-gfcbc30b
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 1.33 MB
Data read: 1.25 MB (ratio 1.06:1)
Time: 0.288 sec (0 m 0 s)
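If you want to post-process results like the ones above, the match lines are easy to pick apart. Here is a minimal Python sketch, assuming every match line has the shape `cs <engine>: <offset> <name>` (that shape is inferred from this sample only; `Match` and `parse_matches` are made-up names, not part of clamsrch):

```python
import re
from typing import List, NamedTuple, Optional

# Assumed line shape, inferred only from the sample output:
#   cs <engine>: <8 hex digits or 8 dots> <signature name>
LINE_RE = re.compile(r"^cs (bm|lo|ac): ([0-9a-fA-F]{8}|\.{8}) (.+)$")


class Match(NamedTuple):
    engine: str            # "bm", "lo" or "ac", exactly as printed
    offset: Optional[int]  # None when the offset column is "........"
    name: str


def parse_matches(output: str) -> List[Match]:
    """Collect match lines, skipping the header, file status and summary."""
    matches = []
    for line in output.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        engine, raw_offset, name = m.groups()
        offset = None if raw_offset.startswith(".") else int(raw_offset, 16)
        matches.append(Match(engine, offset, name))
    return matches


sample = """\
cs clamsrch Multi-Sig searcher results
cs bm: 00021eb0 Lucifer (outerbridge) DFLTKY [8.byt.16].UNOFFICIAL
cs lo: ........ MD4 digest [32.lil.LOGIC].UNOFFICIAL
cs ac: 00000400 (gl) Dev-C++ v5.UNOFFICIAL
d:\\projects\\idfier\\input.bin: OK
"""

for match in parse_matches(sample):
    print(match)
```

Offsets are kept as integers so they can be sorted or compared directly; logical-signature hits, which print dots instead of an offset, come back with `offset=None`.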
You may know Luigi Auriemma's signsrch tool.
It's a really nice tool and it does its job, but it's not really suited for scanning large groups of files.
You've probably also heard about ClamAV. As you might know, it has an uber-fast search engine based on Aho-Corasick and a multi-pattern version of Boyer-Moore.
The main goal of clamifier is to convert a signsrch-like database into a format understood by ClamAV's clamscan utility.
Conversion alone, however, wouldn't do the trick: small modifications inside libclamav are also needed, so that we can get all matches.
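To give a rough idea of what such a conversion involves, here is a small Python sketch that turns a table of integer constants into one line of ClamAV's `.ndb` body-signature format (`MalwareName:TargetType:Offset:HexSignature`). This is not clamifier's code: the helper names are invented for illustration, and target type `0` (any file) with offset `*` (anywhere) are simply the choices made for this example.

```python
import struct


def words_to_hex(words, bits=32, little_endian=True):
    """Serialise integer constants into the hex string used as a signature body."""
    fmt_char = {8: "B", 16: "H", 32: "I", 64: "Q"}[bits]
    prefix = "<" if little_endian else ">"
    raw = struct.pack(prefix + fmt_char * len(words), *words)
    return raw.hex()


def ndb_line(name, hex_sig, target_type=0, offset="*"):
    # .ndb layout: MalwareName:TargetType:Offset:HexSignature
    # The name field is colon-delimited, so colons in the name are replaced.
    return f"{name.replace(':', '_')}:{target_type}:{offset}:{hex_sig}"


# First four SHA-256 round constants K[0..3], stored as 32-bit little-endian words.
sha256_k = [0x428A2F98, 0x71374491, 0xB5C0FBCF, 0xE9B5DBA5]
print(ndb_line("SHA256 K, first 4 words (illustrative)", words_to_hex(sha256_k)))
```

The same table serialised big-endian gives a different byte pattern, which is presumably why the signature names carry a width and byte-order tag such as `32.lil`.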
Oh, did I mention I've also made small changes to the signsrch.sig database?
Keep in mind that the original db was made mainly by Luigi, so kudos go to him.
The db format is a little bit more strict than the original signsrch format,
mainly to make parsing easier. You can find some details in dbformat.
As a bonus, I've also made a simple script for converting PEiD userdb.txt into ClamAV format. (This actually was simple, as userdb has an "almost" proper format.)
I had some old ~600 KB userdb, but it's kinda crappy. The converted db is inside the clampeid.ndb file.
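For the curious, a conversion along these lines could look like the Python sketch below. It is not the script mentioned above, just an illustration under a few assumptions: userdb entries consist of a `[name]` line, a `signature = ...` line and an `ep_only = ...` line in that order; `(ep)` and `(gl)` are read as "entry point" and "global" prefixes like the ones in the scan output; and signatures go out as target type `1` (PE) with `EP+0` or `*` as the offset. The entry names in the sample are made up.

```python
def peid_to_ndb(text):
    """Yield one .ndb line per complete [name] / signature / ep_only entry."""
    name = sig = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            name, sig = line[1:-1], None
        elif "=" in line and name is not None:
            key, value = (part.strip() for part in line.split("=", 1))
            key = key.lower()
            if key == "signature":
                # "60 E8 ?? ??" -> "60e8????"; the ?? wildcard is kept as-is
                sig = value.replace(" ", "").lower()
            elif key == "ep_only" and sig is not None:
                at_entry_point = value.lower() == "true"
                prefix = "(ep)" if at_entry_point else "(gl)"
                offset = "EP+0" if at_entry_point else "*"
                safe_name = name.replace(":", "_")  # ':' separates .ndb fields
                yield f"{prefix} {safe_name}:1:{offset}:{sig}"
                name = sig = None


sample_userdb = """\
[Example Packer v1.0]
signature = 60 E8 ?? ?? ?? ?? 5D
ep_only = true

[Example Compiler: runtime]
signature = 55 8B EC 6A FF
ep_only = false
"""

for ndb in peid_to_ndb(sample_userdb):
    print(ndb)
```

The sketch does no validation: patterns that ClamAV would reject are passed through unchanged, so a real converter would want to filter or report them.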
clamsrch.bat itself is only a batch script that calls clamscan with the following arguments:
--scan-archive=no --max-recursion=1
- do not delve into archives
-r
- scan directories recursively
--database=clamsrch.ndb --database=clamsrch.ldb
- databases generated by clamifier
--dev-ac-only
- this option forces the use of Aho-Corasick only; it has bigger memory requirements, but the scan is a lot faster (see the comparison below)
- this option is somewhat "hidden": you won't see it with clamscan --help
data scanned | data read | W/O ac-only | with ac-only |
---|---|---|---|
307.80 MB | 162.17 MB | 52.341 sec | 22.235 sec |
(many packed files, that's why there's a difference between data scanned and data read)
The same data scanned with original signsrch:
Execution time: 124.016 s
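Putting the three timings side by side, a few lines of Python give the relative speed-ups (the numbers are the ones quoted above; `speedup` is just a helper made up for this example):

```python
def speedup(baseline_sec, candidate_sec):
    """How many times faster the candidate run is than the baseline run."""
    return baseline_sec / candidate_sec


clamscan_default = 52.341   # clamscan without ac-only
clamscan_ac_only = 22.235   # clamscan with ac-only
signsrch_original = 124.016  # original signsrch

print(f"ac-only vs. default clamscan: {speedup(clamscan_default, clamscan_ac_only):.2f}x")
print(f"ac-only vs. signsrch:         {speedup(signsrch_original, clamscan_ac_only):.2f}x")
print(f"default clamscan vs. signsrch: {speedup(signsrch_original, clamscan_default):.2f}x")
```

So on this data set the ac-only run is roughly 2.4 times faster than the default clamscan run and roughly 5.6 times faster than signsrch. These are single measurements on one machine, so treat them as ballpark figures.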
...that you might want to consider
--database=clampeid.ndb
- use the converted PEiD database
--verbose
- by default, offsets for logical signatures are not printed, but if you pass --verbose they should be visible, like:
cs lo: ........ SHA1 / SHA0 / RIPEMD-160 initialization [32.lil.LOGIC].UNOFFICIAL
cs 0005b5bf 0005a033 0005a048 0005a041 0005a03a [end]
cs lo: ........ RIPEMD k values [32.lil.LOGIC].UNOFFICIAL
cs 00057479 0005c843 0005c604 0005cd62 0005c372 0005cb07 00057432 000574d7 [end]
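If you would rather drive clamscan from a script than from clamsrch.bat, the options described above can be assembled as in this Python sketch. The function name and its parameters are invented for illustration; the flags are the ones listed above, and `--dev-ac-only` is the "hidden" option mentioned earlier. The database files are assumed to sit in the current directory.

```python
import subprocess


def build_clamsrch_command(target, use_peid=False, verbose=False):
    """Return the clamscan argument list described in this README."""
    cmd = [
        "clamscan",
        "--scan-archive=no", "--max-recursion=1",          # do not delve into archives
        "-r",                                               # scan directories recursively
        "--database=clamsrch.ndb", "--database=clamsrch.ldb",
        "--dev-ac-only",                                    # Aho-Corasick only
    ]
    if use_peid:
        cmd.append("--database=clampeid.ndb")               # converted PEiD database
    if verbose:
        cmd.append("--verbose")                             # print logical-signature offsets
    cmd.append(target)
    return cmd


def run_clamsrch(target, **options):
    # Runs the scan and returns its text output; needs clamscan on PATH.
    result = subprocess.run(build_clamsrch_command(target, **options),
                            capture_output=True, text=True)
    return result.stdout


print(" ".join(build_clamsrch_command("input.bin", use_peid=True, verbose=True)))
```

Building the argument list separately from running it makes it easy to print or log the exact command before a long scan.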
By default, ClamAV appends the .UNOFFICIAL string to - yes, you've guessed it - unofficial signatures. I really didn't want to change that, as it felt somehow wrong. If you want to get rid of it in the provided Windows binaries, you can patch them.
If you compile libclamav yourself, you can easily find and change that.
- as for the patch for clamscan/libclamav: since ClamAV itself is (L)GPL, the patch itself is GPL
- the rest of the stuff is under the MIT license