###################
Mercurius
###################
Mercurius started as a fork of Christian Martorella's Metagoofil and has since been completely refactored, so it is now almost entirely new.
Install from git:
pip install git+git://github.com/SilentFrogNet/mercurius.git
The name Mercurius is inspired by the Greek god Hermes. Among other things, he is the god of luck, trickery, and thieves.
He is also known as the "keeper of the boundaries" for his role as a bridge between the upper and lower worlds.
Mercurius is a tool for extracting metadata from public documents (pdf, doc, xls, ppt, docx, xlsx, pptx, odt, ods, odp, jpg, jpeg, tiff) available on the target websites. This information can be useful because it may reveal valid usernames, people's names, hosts, and email addresses that can later be used in brute-force password attacks (VPN, FTP, web apps).
The tool first queries Google for file types that can carry useful metadata (pdf, doc, xls, ppt, ...), then downloads those documents to disk and extracts their metadata using libraries specific to each file type (Hachoir, Pdfminer, etc.).
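The search step described above can be sketched as follows. This is only an illustration, not the tool's actual code: it assumes the queries combine Google's `site:` and `filetype:` operators, and `build_queries` is a hypothetical helper name.

```python
def build_queries(domain, file_types):
    """Return one Google query string per file type for the given domain.

    Assumption: the search relies on the `site:` and `filetype:` operators.
    The function name and signature are hypothetical.
    """
    queries = []
    for ext in file_types:
        # Normalize entries such as ".PDF" or " pdf " to "pdf".
        ext = ext.strip().lstrip(".").lower()
        if ext:
            queries.append(f"site:{domain} filetype:{ext}")
    return queries


if __name__ == "__main__":
    for query in build_queries("example.com", ["pdf", "doc", "xls", "ppt"]):
        print(query)
```

Each resulting query would then be sent to the search engine, and the links it returns would be passed to the downloader.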
At the moment this tool can parse and extract metadata from:
- Microsoft Office 97 documents (doc, xls, ppt)
- Microsoft Office 2007+ documents (docx, xlsx, pptx)
- PDF (pdf)
- Images with Exif data (jpg/jpeg, tif/tiff)
- OpenOffice documents (odt, ods, odp) (not yet supported)
- Apple Office documents (pages, numbers, key) (not yet supported)
These are the available extractors:
- PDFExtractor
- ImageExtractor
- MSOfficeExtractor
- MSOfficeXMLExtractor
- OpenOfficeExtractor
- AppleOfficeExtractor
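One plausible way to connect the supported file types to the extractors listed above is a lookup table keyed by file extension. The sketch below is an assumption about how such a dispatch could work, not the project's actual implementation. The extractor names come from the list above; the mapping itself and the `pick_extractor` helper are hypothetical.

```python
import os

# Hypothetical mapping from file extension to extractor class name.
# The OpenOffice and Apple Office entries are included for completeness,
# although those formats are not yet supported.
EXTRACTOR_BY_EXTENSION = {
    "pdf": "PDFExtractor",
    "jpg": "ImageExtractor",
    "jpeg": "ImageExtractor",
    "tif": "ImageExtractor",
    "tiff": "ImageExtractor",
    "doc": "MSOfficeExtractor",
    "xls": "MSOfficeExtractor",
    "ppt": "MSOfficeExtractor",
    "docx": "MSOfficeXMLExtractor",
    "xlsx": "MSOfficeXMLExtractor",
    "pptx": "MSOfficeXMLExtractor",
    "odt": "OpenOfficeExtractor",
    "ods": "OpenOfficeExtractor",
    "odp": "OpenOfficeExtractor",
    "pages": "AppleOfficeExtractor",
    "numbers": "AppleOfficeExtractor",
    "key": "AppleOfficeExtractor",
}


def pick_extractor(filename):
    """Return the extractor name for a file, or None if the type is unknown."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return EXTRACTOR_BY_EXTENSION.get(ext)
```

Returning `None` for unknown extensions lets the caller skip files that no extractor can handle instead of failing.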
The tool implements a plugin architecture through the pluggy plugin system.
To enable a new plugin, put it in the mercurius/extractors folder and then enable it in the configuration file with an entry like <plugin_file_name>=<class_extractor_name>.
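To make the configuration entry format concrete, here is a minimal sketch of how `<plugin_file_name>=<class_extractor_name>` entries could be parsed and resolved to classes. It uses only the standard library and does not reproduce the pluggy-based loading the tool actually performs. The function names, the comment syntax, and the example file names in the usage note are assumptions.

```python
import importlib


def parse_plugin_entries(lines):
    """Parse '<plugin_file_name>=<class_extractor_name>' lines into a dict.

    Blank lines and lines starting with '#' are skipped (an assumption about
    the configuration format).
    """
    plugins = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        module_name, sep, class_name = line.partition("=")
        module_name, class_name = module_name.strip(), class_name.strip()
        if not sep or not module_name or not class_name:
            raise ValueError(f"malformed plugin entry: {raw!r}")
        plugins[module_name] = class_name
    return plugins


def load_extractor_class(module_name, class_name, package="mercurius.extractors"):
    """Import '<package>.<module_name>' and return the named class from it."""
    module = importlib.import_module(f"{package}.{module_name}")
    return getattr(module, class_name)
```

For example, a hypothetical entry `pdf_extractor=PDFExtractor` would be parsed into `{"pdf_extractor": "PDFExtractor"}`, and `load_extractor_class("pdf_extractor", "PDFExtractor")` would then import the class from the mercurius/extractors folder, provided such a file exists there.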
TODO:
- Integrate Bing Search
- Integrate Exalead Search
- Make it Python-version-agnostic (working on both Python 2 and 3) with six?
- Manage the application's context
  - Keep track of already downloaded files
  - Keep domain context
    - Further searches on the same domain will extend the data
    - If the domain is changed or a local analysis is performed, ask whether to clean up or extend
- Change the plugin system: move from "pick from folder" to "get through setuptools"
Changelog:
- Changed/Fixed Google Search
- Fixed downloader
- Fixed/Enhanced page parser
- Fixed metadataMSOfficeXML extractor
- Added Image Exif metadata extractor
- Fixed metadataPDF extractor
- Removed external projects
- Modified cli interface (using click)
- Added shell interface (using a modified version of click-shell)
- Added a random ASCII art banner, like Metasploit ;)
- Other little fixes
- Moved all dependencies to the setup.py file
- Set up a plugin architecture for the extractors with pluggy