Merge branch 'release/3.0.0'
fedelemantuano committed Dec 3, 2017
2 parents 50c6947 + 577f7cc commit 2081e12
Showing 14 changed files with 508 additions and 412 deletions.
37 changes: 35 additions & 2 deletions .travis.yml
@@ -13,7 +13,27 @@ before_install:
- sudo apt-get -qq update

# Install msgconvert
- sudo apt-get install -y libemail-outlook-message-perl
- sudo apt-get install -y libemail-outlook-message-perl pandoc

# Build latest images spamscope-root, spamscope-elasticsearch

# make images
- if [ "$TRAVIS_PYTHON_VERSION" == "2.7" ]; then

if [ "$TRAVIS_BRANCH" == "master" ]; then
cd docker &&
docker build --build-arg BRANCH=$TRAVIS_BRANCH -t $DOCKER_USERNAME/spamscope-mail-parser:$TRAVIS_BRANCH . &&
docker run -i -t --rm $DOCKER_USERNAME/spamscope-mail-parser &&
cd -;
fi

if [ "$TRAVIS_BRANCH" == "develop" ]; then
cd docker &&
docker build --build-arg BRANCH=$TRAVIS_BRANCH -t $DOCKER_USERNAME/spamscope-mail-parser:$TRAVIS_BRANCH . &&
docker run -i -t --rm $DOCKER_USERNAME/spamscope-mail-parser:$TRAVIS_BRANCH &&
cd -;
fi
fi

# command to install dependencies
install:
@@ -35,7 +55,20 @@ script:
- python -m mailparser -f tests/mails/mail_test_6 -j

after_success:
coveralls
- coveralls

- if [ "$TRAVIS_PYTHON_VERSION" == "2.7" ]; then

if [ "$TRAVIS_BRANCH" == "master" ]; then
docker login -u="$DOCKER_USERNAME" -p="$DOCKER_PASSWORD";
docker push $DOCKER_USERNAME/spamscope-mail-parser;
fi

if [ "$TRAVIS_BRANCH" == "develop" ]; then
docker login -u="$DOCKER_USERNAME" -p="$DOCKER_PASSWORD";
docker push $DOCKER_USERNAME/spamscope-mail-parser:$TRAVIS_BRANCH;
fi
fi

notifications:
email: false
78 changes: 51 additions & 27 deletions README.md
@@ -7,7 +7,8 @@

## Overview

mail-parser is a wrapper for [email](https://docs.python.org/2/library/email.message.html) Python Standard Library.
mail-parser is more than a wrapper for the [email](https://docs.python.org/2/library/email.message.html) module of the Python Standard Library.
It gives you an easy way to go from a raw mail to a Python object that you can use in your code.
It's the key module of [SpamScope](https://github.com/SpamScope/spamscope).

mail-parser can parse the Outlook email format (.msg). To use this feature, you need to install the `libemail-outlook-message-perl` package. On Debian-based systems:
@@ -24,25 +25,46 @@ $ apt-cache show libemail-outlook-message-perl

mail-parser supports Python 3.


## Description

mail-parser takes as input a raw email and generates a parsed object. This object is a tokenized email with some indicator:
- body
- headers
mail-parser takes as input a raw email and generates a parsed object. The properties of this object have the same names as the
[RFC headers](https://www.iana.org/assignments/message-headers/message-headers.xhtml):

- bcc
- cc
- date
- delivered_to
- from\_ (not `from`, because `from` is a Python keyword)
- message_id
- received
- reply_to
- subject
- from
- to

There are other properties to get:
- body
- headers
- attachments
- message id
- date
- charset mail
- sender IP address
- receiveds

We have also two types of indicator:
- anomalies: mail without message id or date
mail-parser can detect defects in a mail:
- [defects](https://docs.python.org/2/library/email.message.html#email.message.Message.defects): mail with parts that are not RFC compliant

Every property has a JSON and a raw counterpart that you can get with:
- name_json
- name_raw

Example:

```
$ mail.to (Python object)
$ mail.to_json (JSON)
$ mail.to_raw (raw header)
```
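
The three views can be illustrated with a minimal sketch that uses only the standard library `email` module, not mail-parser itself. The helper `three_views` and the sample addresses are hypothetical, and mail-parser's own `to`, `to_json` and `to_raw` may format their values differently.

```python
import email
import json
from email.utils import getaddresses

# Sample mail, made up for this illustration.
RAW_MAIL = (
    "From: Alice <alice@example.com>\n"
    "To: Bob <bob@example.com>, carol@example.com\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)


def three_views(raw_mail, header):
    """Return (python_object, json_string, raw_header) for one header.

    Hypothetical helper, for illustration only.
    """
    message = email.message_from_string(raw_mail)
    # Raw view: the header value exactly as it appears in the mail.
    raw = message.get(header, "")
    # Python view: getaddresses splits the value into (name, address) pairs.
    parsed = getaddresses([raw])
    # JSON view: the same data serialized to a string.
    as_json = json.dumps(parsed)
    return parsed, as_json, raw


parsed, as_json, raw = three_views(RAW_MAIL, "To")
print(parsed)
print(as_json)
print(raw)
```

The point of the sketch is only the shape of the idea: one header, three representations, each suited to a different consumer (Python code, a JSON pipeline, or forensic inspection of the original text).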

The command line tool uses the JSON format.

### Defects
These defects can be used to evade antispam filters. One example is a mail with a malformed boundary, which can hide an illegitimate epilogue (often malware).
This library can extract these epilogues.
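
As a rough illustration of what an epilogue and a defect look like, the sketch below uses only the standard library `email` module that mail-parser wraps. It does not show mail-parser's own detection logic, and both sample mails are made up for the example.

```python
import email

# A multipart mail with text after the closing boundary (the epilogue).
WITH_EPILOGUE = (
    "Subject: epilogue demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "visible part\n"
    "--XYZ--\n"
    "trailing data after the last boundary\n"
)

# A mail that claims to be multipart but declares no boundary at all.
NO_BOUNDARY = (
    "Subject: defect demo\n"
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed\n"
    "\n"
    "some payload\n"
)

# The parser keeps whatever follows the final boundary in `epilogue`.
message = email.message_from_string(WITH_EPILOGUE)
epilogue = message.epilogue

# The parser does not raise on malformed input; it records defects instead.
broken = email.message_from_string(NO_BOUNDARY)
defect_names = [type(defect).__name__ for defect in broken.defects]

print(repr(epilogue))
print(defect_names)
```

Most mail clients never display the epilogue, which is why content placed there is worth inspecting separately.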
@@ -56,7 +78,7 @@ mail-parser can be downloaded, used, and modified free of charge. It is availabl
## Authors

### Main Author
Fedele Mantuano (**Twitter**: [@fedelemantuano](https://twitter.com/fedelemantuano))
**Fedele Mantuano**: [LinkedIn](https://www.linkedin.com/in/fmantuano/)


## Installation
@@ -97,24 +119,25 @@ mail = mailparser.parse_from_bytes(byte_mail)
Then you can get all parts

```
mail.attachments: list of all attachments
mail.body
mail.date: datetime object in UTC
mail.defects: defect RFC not compliance
mail.defects_categories: only defects categories
mail.delivered_to
mail.from_
mail.get_server_ipaddress(trust="my_server_mail_trust")
mail.has_defects
mail.headers
mail.headers
mail.mail: tokenized mail in a object
mail.message: email.message.Message object
mail.message_as_string: message as string
mail.message_id
mail.to_
mail.from_
mail.received
mail.subject
mail.text_plain_list: only text plain mail parts in a list
mail.attachments_list: list of all attachments
mail.date_mail
mail.parsed_mail_obj: tokenized mail in a object
mail.parsed_mail_json: tokenized mail in a JSON
mail.defects: defect RFC not compliance
mail.defects_category: only defects categories
mail.has_defects
mail.anomalies
mail.has_anomalies
mail.get_server_ipaddress(trust="my_server_mail_trust")
mail.receiveds
mail.text_plain: only text plain mail parts in a list
mail.to
```

## Usage from command-line
@@ -124,7 +147,7 @@ If you installed mailparser with `pip` or `setup.py` you can use it with command
These are all the switches:

```
usage: mailparser.py [-h] (-f FILE | -s STRING | -k) [-j] [-b] [-a] [-r] [-t] [-m]
usage: mailparser.py [-h] (-f FILE | -s STRING | -k) [-j] [-b] [-a] [-r] [-t] [-dt] [-m]
[-u] [-c] [-d] [-n] [-i Trust mail server string] [-p] [-z]
[-v]
@@ -141,6 +164,7 @@ optional arguments:
-a, --attachments Print the attachments of mail (default: False)
-r, --headers Print the headers of mail (default: False)
-t, --to Print the to of mail (default: False)
-dt, --delivered-to Print the delivered-to of mail (default: False)
-m, --from Print the from of mail (default: False)
-u, --subject Print the subject of mail (default: False)
-c, --receiveds Print all receiveds of mail (default: False)
123 changes: 79 additions & 44 deletions README → README.rst
@@ -6,28 +6,65 @@ mail-parser
Overview
--------

mail-parser is a wrapper for `email`_ Python Standard Library. It’s the
key module of `SpamScope`_.
mail-parser is more than a wrapper for the
`email <https://docs.python.org/2/library/email.message.html>`__ module
of the Python Standard Library. It gives you an easy way to go from a
raw mail to a Python object that you can use in your code. It's the key
module of `SpamScope <https://github.com/SpamScope/spamscope>`__.

mail-parser can parse Outlook email format (.msg). To use this feature, you need to install ``libemail-outlook-message-perl`` package. For Debian based systems:
mail-parser can parse the Outlook email format (.msg). To use this
feature, you need to install the ``libemail-outlook-message-perl``
package. On Debian-based systems:

::

$ apt-get install libemail-outlook-message-perl

For more details:

::

$ apt-cache show libemail-outlook-message-perl

mail-parser supports Python 3.

Description
-----------

mail-parser takes as input a raw mail and generates a parsed object.
This object is a tokenized email with some indicator:
- body - headers - subject - from - to - attachments - message id - date
- charset mail - sender IP address - receiveds
mail-parser takes as input a raw email and generates a parsed object.
The properties of this object have the same names as the `RFC
headers <https://www.iana.org/assignments/message-headers/message-headers.xhtml>`__:

- bcc
- cc
- date
- delivered\_to
- from\_ (not ``from``, because ``from`` is a Python keyword)
- message\_id
- received
- reply\_to
- subject
- to

There are other properties to get:

- body
- headers
- attachments
- sender IP address

mail-parser can detect defects in a mail:

- `defects <https://docs.python.org/2/library/email.message.html#email.message.Message.defects>`__:
  mail with parts that are not RFC compliant

Every property has a JSON and a raw counterpart that you can get with:

- name\_json
- name\_raw

Example:

::

$ mail.to (Python object)
$ mail.to_json (JSON)
$ mail.to_raw (raw header)

We have also two types of indicator: - anomalies: mail without message id or date
- `defects`_: mail with some not compliance RFC part
The command line tool uses the JSON format.

Defects
~~~~~~~
@@ -40,16 +77,16 @@ Apache 2 Open Source License
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mail-parser can be downloaded, used, and modified free of charge. It is
available under the Apache 2 license.
available under the Apache 2 license. |Donate|

Authors
-------

Main Author
~~~~~~~~~~~

Fedele Mantuano (**Twitter**:
[@fedelemantuano](https://twitter.com/fedelemantuano))
**Fedele Mantuano**:
`LinkedIn <https://www.linkedin.com/in/fmantuano/>`__

Installation
------------
@@ -64,18 +101,18 @@ and install mail-parser with ``setup.py``:

::

cd mail-parser
$ cd mail-parser

python setup.py install
$ python setup.py install

or use ``pip``:

::

pip install mail-parser
$ pip install mail-parser

Usage in a project
-------------------
------------------

Import ``mailparser`` module:

@@ -92,57 +129,54 @@ Then you can get all parts

::

mail.attachments: list of all attachments
mail.body
mail.date: datetime object in UTC
mail.defects: defect RFC not compliance
mail.defects_categories: only defects categories
mail.delivered_to
mail.from_
mail.get_server_ipaddress(trust="my_server_mail_trust")
mail.has_defects
mail.headers
mail.headers
mail.mail: tokenized mail in a object
mail.message: email.message.Message object
mail.message_as_string: message as string
mail.message_id
mail.to_
mail.from_
mail.received
mail.subject
mail.text_plain_list: only text plain mail parts in a list
mail.attachments_list: list of all attachments
mail.date_mail
mail.parsed_mail_obj: tokenized mail in a object
mail.parsed_mail_json: tokenized mail in a JSON
mail.defects: defect RFC not compliance
mail.defects_category: only defects categories
mail.has_defects
mail.anomalies
mail.has_anomalies
mail.get_server_ipaddress(trust="my_server_mail_trust")
mail.receiveds

.. _email: https://docs.python.org/2/library/email.message.html
.. _SpamScope: https://github.com/SpamScope/spamscope
.. _defects: https://docs.python.org/2/library/email.message.html#email.message.Message.defects
mail.text_plain: only text plain mail parts in a list
mail.to

Usage from command-line
-----------------------

If you installed mailparser with ``pip`` or ``setup.py`` you can use it with
command-line.
If you installed mailparser with ``pip`` or ``setup.py`` you can use it
from the command line.

These are all the switches:

::

usage: mailparser [-h] (-f FILE | -s STRING | -k) [-j] [-b] [-a] [-r] [-t] [-m]
[-u] [-c] [-d] [-n] [-i Trust mail server string] [-p] [-z]
[-v]
usage: mailparser.py [-h] (-f FILE | -s STRING | -k) [-j] [-b] [-a] [-r] [-t] [-dt] [-m]
[-u] [-c] [-d] [-n] [-i Trust mail server string] [-p] [-z]
[-v]

Wrapper for email Python Standard Library

optional arguments:
-h, --help show this help message and exit
-f FILE_, --file FILE_
Raw email file (default: None)
-s STRING_, --string STRING_
-f FILE, --file FILE Raw email file (default: None)
-s STRING, --string STRING
Raw email string (default: None)
-k, --stdin Enable parsing from stdin (default: False)
-j, --json Show the JSON of parsed mail (default: False)
-b, --body Print the body of mail (default: False)
-a, --attachments Print the attachments of mail (default: False)
-r, --headers Print the headers of mail (default: False)
-t, --to Print the to of mail (default: False)
-dt, --delivered-to Print the delivered-to of mail (default: False)
-m, --from Print the from of mail (default: False)
-u, --subject Print the subject of mail (default: False)
-c, --receiveds Print all receiveds of mail (default: False)
@@ -168,12 +202,13 @@ Example:
This example will show you the tokenized mail in a JSON pretty format.


.. |PyPI version| image:: https://badge.fury.io/py/mail-parser.svg
:target: https://badge.fury.io/py/mail-parser
.. |Build Status| image:: https://travis-ci.org/SpamScope/mail-parser.svg?branch=develop
:target: https://travis-ci.org/SpamScope/mail-parser
.. |Coverage Status| image:: https://coveralls.io/repos/github/SpamScope/mail-parser/badge.svg?branch=develop
:target: https://coveralls.io/github/SpamScope/mail-parser?branch=develop
.. |BCH compliance| image:: https://bettercodehub.com/edge/badge/SpamScope/mail-parser?branch=devel
.. |BCH compliance| image:: https://bettercodehub.com/edge/badge/SpamScope/mail-parser?branch=develop
:target: https://bettercodehub.com/
.. |Donate| image:: https://www.paypal.com/en_US/i/btn/btn_donateCC_LG.gif
:target: https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=VEPXYP745KJF2