This tool is designed to make it easy to generate signatures for potentially unique parts of RTF files.
It was written by David Cannings (@edeca) and released by PwC UK under the Apache 2.0 license.
To install, you'll need Python 3 and some basic libraries. These are handled automatically if you install using pip:
$ pip install rtfsig
Then run it like this:
$ rtfsig -f badfile.rtf -y output.yar
This will scan the file for potentially unique RTF tags, print details to the screen and save a Yara rule to output.yar.
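To give a feel for the kind of tag being searched for, here is a minimal sketch that pulls RSID values out of raw RTF bytes with a regular expression. It is an illustration only and not rtfsig's own parser; the name `find_rsids` is hypothetical, and a plain regex like this does not cope with obfuscated documents.

```python
import re
from typing import List

# Matches the \rsidN control word only. Related control words such as
# \insrsidN or \rsidrootN are not matched, because the pattern needs a
# backslash directly before "rsid" and a digit directly after it.
RSID_PATTERN = re.compile(rb"\\rsid(\d+)")


def find_rsids(data: bytes) -> List[int]:
    """Return the distinct \\rsidN values in data, in order of first appearance."""
    seen = set()
    ordered = []
    for match in RSID_PATTERN.finditer(data):
        value = int(match.group(1))
        if value not in seen:
            seen.add(value)
            ordered.append(value)
    return ordered


sample = rb"{\rtf1{\*\rsidtbl \rsid7043998\rsid7238080}\pard\insrsid7043998 text}"
print(find_rsids(sample))  # [7043998, 7238080]
```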
Please raise bugs as GitHub issues, and note that this tool is in beta.
Basic output is shown on the console, which can be used to search VirusTotal (try a search like content:rsid7043998).
-> % rtfsig -f 0b06052d3b5954594cf0e28bd9c50d9110eb8fb78cb78c9a99686eb4ba3391df.hostile
INFO:root:Starting to parse file 0b06052d3b5954594cf0e28bd9c50d9110eb8fb78cb78c9a99686eb4ba3391df.hostile
INFO:root:Non-standard RTF magic marker, should be {\rtf1, often a sign of malicious docs
INFO:root:Found an RSID table in this document
INFO:root:Found 1 embedded image(s) with set height/width
INFO:root:Found 2 document information group tags
INFO:root:Interesting strings (higher chance of FP): \rsid7043998, \rsid7476075, insrsid7043998, \rsid10243744, \rsid7604251, insrsid10243744, {\author blue}, rsidroot10243744, \rsid9200135, tblrsid10243744, charrsid10243744, \picw1\pich1\picwgoal1\pichgoal1 , pararsid10243744, \rsid7238080, insrsid7476075, \rsid11666446, insrsid12343406, \rsid12343406, {\operator blue}
INFO:root:Found some unique strings! Consider using vtgrep or deploying Yara rules
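The interesting strings reported above can be turned into VirusTotal searches of the form content:rsid7043998. The sketch below is one possible way to do that by hand. The function name `vt_content_queries` is hypothetical, and skipping anything that is not a plain alphanumeric token is an assumption made here to avoid guessing at quoting rules for strings containing braces or spaces.

```python
import re
from typing import Iterable, List

# Assumption: only plain tokens (letters and digits) are converted.
SIMPLE_TOKEN = re.compile(r"^[A-Za-z0-9]+$")


def vt_content_queries(strings: Iterable[str]) -> List[str]:
    """Build content: searches from strings such as those printed by rtfsig."""
    queries = []
    for raw in strings:
        # Drop surrounding whitespace and the leading backslash of a control word.
        token = raw.strip().lstrip("\\")
        query = f"content:{token}"
        if SIMPLE_TOKEN.match(token) and query not in queries:
            queries.append(query)
    return queries


found = ["\\rsid7043998", "insrsid7043998", "{\\author blue}"]
print(vt_content_queries(found))
# ['content:rsid7043998', 'content:insrsid7043998']
```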
Debug output can be generated using -v, which is helpful if you are reporting a bug.
The tool will automatically generate Yara rules if the -y option is passed. Two Yara rules are created, one which should generate low false positives (strict_rule) and one which may have a higher false positive rate (loose_rule).
It is recommended to review the strings carefully and to change any of them in the condition to a sensible threshold, for example 3 of them.
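If you prefer to script this adjustment rather than edit the file by hand, something like the following sketch could work. It is not part of rtfsig; the function name `require_matches` and its parameters are hypothetical. It only touches the condition of the named rule, so the strict rule (which contains a single string) and the description text are left alone.

```python
import re


def require_matches(yara_text: str, rule_name: str = "loose_rule", minimum: int = 3) -> str:
    """Replace 'any of them' with '<minimum> of them' in one rule's condition."""
    pattern = re.compile(
        # Lazily skip from the rule header to its condition, then to the
        # first "any of them" that follows the condition keyword.
        r"(rule\s+" + re.escape(rule_name) + r"\b.*?condition:.*?)\bany of them\b",
        re.DOTALL,
    )
    # A function is used as the replacement so that backslashes in the
    # captured rule text are copied literally.
    return pattern.sub(lambda m: f"{m.group(1)}{minimum} of them", yara_text, count=1)
```

For example, `require_matches(open("output.yar").read())` returns the rule text with the loose rule's condition ending in `3 of them`. Writing the result back to disk is left to the caller.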
An example rule generated from 0b06052d3b5954594cf0e28bd9c50d9110eb8fb78cb78c9a99686eb4ba3391df looks like:
rule loose_rule {
  meta:
    description = "RTF file matching known unique identifiers (higher chance of FP, adjust 'any of them' if required)"
    generated_by = "rtfsig version 0.0.2"
  strings:
    $ = "{\\author blue}" ascii
    $ = "\\rsid7238080" ascii
    $ = "pararsid10243744" ascii
    $ = "insrsid7043998" ascii
    $ = "\\rsid7043998" ascii
    $ = "rsidroot10243744" ascii
    $ = "\\rsid9200135" ascii
    $ = "\\rsid7604251" ascii
    $ = "insrsid7476075" ascii
    $ = "\\rsid10243744" ascii
    $ = "insrsid12343406" ascii
    $ = "{\\operator blue}" ascii
    $ = "insrsid10243744" ascii
    $ = "charrsid10243744" ascii
    $ = "\\rsid11666446" ascii
    $ = "\\rsid12343406" ascii
    $ = "\\picw1\\pich1\\picwgoal1\\pichgoal1 " ascii
    $ = "tblrsid10243744" ascii
    $ = "\\rsid7476075" ascii
  condition:
    uint32be(0) == 0x7b5c7274 and any of them
}
rule strict_rule {
  meta:
    description = "RTF file matching known unique identifiers (lower chance of FP)"
    generated_by = "rtfsig version 0.0.2"
  strings:
    $ = "\\rsid7043998\\rsid7238080\\rsid7476075\\rsid7604251\\rsid9200135\\rsid10243744\\rsid11666446\\rsid12343406" ascii
  condition:
    uint32be(0) == 0x7b5c7274 and any of them
}
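Both conditions start with uint32be(0) == 0x7b5c7274, which compares the first four bytes of the file, read as a big-endian integer, against the bytes `{\rt`. The following sketch reproduces that check in Python to show what it does and does not cover; the name `has_rtf_magic` is hypothetical and the code is not taken from rtfsig.

```python
import struct

# 0x7b = "{", 0x5c = "\", 0x72 = "r", 0x74 = "t"
RTF_MAGIC = 0x7B5C7274


def has_rtf_magic(data: bytes) -> bool:
    """Mirror the Yara check uint32be(0) == 0x7b5c7274 on a byte string."""
    if len(data) < 4:
        return False
    (value,) = struct.unpack(">I", data[:4])  # big-endian unsigned 32-bit
    return value == RTF_MAGIC


print(has_rtf_magic(b"{\\rtf1\\ansi"))  # True
print(has_rtf_magic(b"{\\rt0"))         # True
print(has_rtf_magic(b"PK\x03\x04"))     # False
```

Because only four bytes are compared, files whose header deviates from the standard {\rtf1 after the `{\rt` prefix still satisfy this part of the condition.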
- At present, documents containing lots of obfuscation (e.g. comments between control words and their values) may not be parsed correctly. Please raise an issue with sample files for further inspection.
To set up a development environment, clone the git repository and run the following inside a virtualenv:
$ pip install -e ".[dev]"
Before submitting a pull request, please check all tests pass and there is 100% coverage of the core module.
This is as simple as running tox and checking the output:
$ tox
.. tool output ..
py37: commands succeeded
congratulations :)
Packaging:
$ python setup.py sdist bdist_wheel
Check and upload to PyPI, signing with GPG:
$ twine check dist/*
$ twine upload dist/* --sign --identity FCEC8AAA140C74C826592AC357974C5B48A00D9B
- v0.0.1 (18th October 2019) - Initial version, supports RSID control words and generating Yara rules
- v0.0.2 (23rd October 2019) - Second beta, added support for unique image identifiers and document information
- v0.0.3 (23rd October 2019) - Third beta, added support for picture sizes
- v0.1.0 (19th September 2020) - First public release, packaged as a Python module for PyPI
- v0.1.1 (26th January 2024) - Bumped Jinja2 dependency to a current version
- v0.1.2 (7th January 2025) - Bumped Jinja2 dependency to a current version
- v0.1.3 (7th January 2025) - Tests fixed and integrated with GitHub actions