Add pe_source module from pe-reports #1

Status: Open (40 commits, base branch: develop)

Commits:
2bad91e
Copy pe_source module for pe_reports repo
aloftus23 Jul 6, 2023
9c23a77
Create new setup.py and update import directories
aloftus23 Jul 17, 2023
7289af6
Run linter
aloftus23 Jul 18, 2023
ad3d391
Fix pre-commit checks
aloftus23 Jul 18, 2023
ab4e5a7
Update README.md with examples
aloftus23 Jul 18, 2023
f489fa6
fix readme linter
aloftus23 Jul 18, 2023
83ea436
Fix pytests
aloftus23 Jul 18, 2023
75f5e54
fix lint
aloftus23 Jul 18, 2023
77951cb
Fix dnstwist script
aloftus23 Jul 18, 2023
c42e28d
fix lint
aloftus23 Jul 18, 2023
3dbd17c
Don't tesst on python3.7 packages since it will fail with new package…
aloftus23 Jul 18, 2023
99e738b
Also remove python 3.6
aloftus23 Jul 18, 2023
0bcb398
fix build.yml file
aloftus23 Jul 18, 2023
0a5e6a4
Remove python 3.6/3.7 and add 3.10/3.11 to setup.py
aloftus23 Jul 26, 2023
3e7eab1
Correct comment and alphabetize package_data
aloftus23 Jul 26, 2023
ccbeaac
Remove E712 ignore
aloftus23 Aug 18, 2023
b23a404
Change pe_scripts to pe_source
aloftus23 Sep 24, 2023
e25ecd0
Delete .DS_Store
aloftus23 Sep 24, 2023
f58ca92
Fix .gitignore
aloftus23 Sep 24, 2023
3ed5c3d
Fix formatting in README
aloftus23 Sep 24, 2023
036eac1
Fix setup.py mistakes
aloftus23 Sep 24, 2023
412515b
Fix commenting in __init__
aloftus23 Sep 24, 2023
6c3d078
Remove ubuntu-10.04 from build.yml
aloftus23 Sep 24, 2023
5d1b723
Alphabetize .pre-commit configs
aloftus23 Sep 24, 2023
3bbb6a2
Merge branch 'AL-copy-pe-reports' of https://github.com/cisagov/pe-so…
aloftus23 Sep 24, 2023
d8e46a6
Fix options in docopt and readme
aloftus23 Sep 24, 2023
d67d2d7
Alphabetize database.ini
aloftus23 Sep 24, 2023
e63fb27
Delete tests/.DS_Store
aloftus23 Sep 24, 2023
ac7c6ea
Re-add CPython to setup.oy
aloftus23 Sep 24, 2023
00275f0
Fix comment in src/pe_source/cybersixgill.py
aloftus23 Sep 24, 2023
95615a1
Simplify pe_org log in src/pe_source/cybersixgill.py
aloftus23 Sep 24, 2023
3d24f0b
Run linter
aloftus23 Sep 24, 2023
f66a62e
Merge branch 'AL-copy-pe-reports' of https://github.com/cisagov/pe-so…
aloftus23 Sep 24, 2023
37e73ba
Use ipaddress python library to determine v6 vs v4
aloftus23 Sep 24, 2023
53e8e28
Sort root_domains_dnsmonitor by domain
aloftus23 Sep 24, 2023
f618d12
Sort cybersixgill.py sites
aloftus23 Sep 24, 2023
3a2e262
Fix docopt for soc-med-included
aloftus23 Sep 24, 2023
02fa638
Merge branch 'AL-copy-pe-reports' of https://github.com/cisagov/pe-so…
aloftus23 Sep 24, 2023
10f0e6d
Address magic numbers
aloftus23 Sep 24, 2023
c362c8e
Replace double for-loop with merge in dnsmonitor/source
aloftus23 Sep 24, 2023
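
Commit 37e73ba switches to the standard-library `ipaddress` module to tell IPv4 from IPv6 addresses. The changed code itself is not visible in this view, so the following is only a minimal sketch of the general approach; the helper name `ip_version` is hypothetical and not taken from the PR.

```python
import ipaddress


def ip_version(address: str) -> int:
    """Return 4 or 6 for a valid IP address string (hypothetical helper)."""
    # ip_address() returns an IPv4Address or IPv6Address object and raises
    # ValueError when the string is not a valid address of either family.
    return ipaddress.ip_address(address).version


print(ip_version("192.0.2.10"))   # IPv4 documentation address
print(ip_version("2001:db8::1"))  # IPv6 documentation address
```

Letting the library parse the address avoids hand-written checks such as looking for a `:` in the string, and invalid input fails loudly with `ValueError`.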
7 changes: 6 additions & 1 deletion .flake8
@@ -22,4 +22,9 @@ select = C,D,E,F,W,B,B950
# operators. It no longer agrees with PEP8. See, for example, here:
# https://github.com/ambv/black/issues/21. Guido agrees here:
# https://github.com/python/peps/commit/c59c4376ad233a62ca4b3a6060c81368bd21e85b.
ignore = E501,W503
#
# Also ignore flake8's error about whitespace before a ':' (E203).
# This rule no longer agrees with PEP8. See, for example, here:
# https://github.com/PyCQA/pycodestyle/issues/373
# and here: https://github.com/psf/black/issues/315
ignore = E203,E501,W503
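
For context, here is a small sketch of the kind of code that triggers E203. Black treats `:` in a slice with complex bounds like a binary operator and puts whitespace on both sides, which pycodestyle reports as "whitespace before ':'". The variable names are illustrative only and do not come from this PR.

```python
values = list(range(10))
lower, upper, offset = 1, 4, 2

# Black's preferred formatting for a slice with complex bounds.
# Without E203 in the ignore list, flake8 flags the space before ':'.
window = values[lower + offset : upper + offset]

print(window)
```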
12 changes: 0 additions & 12 deletions .github/workflows/build.yml
@@ -111,14 +111,10 @@ jobs:
os:
- ubuntu-latest
python-version:
- "3.7"
- "3.8"
- "3.9"
- "3.10"
- "3.11"
include:
- os: ubuntu-20.04
python-version: "3.6"
steps:
- uses: actions/checkout@v3
- id: setup-python
@@ -207,14 +203,10 @@ jobs:
os:
- ubuntu-latest
python-version:
- "3.7"
- "3.8"
- "3.9"
- "3.10"
- "3.11"
include:
- os: ubuntu-20.04
python-version: "3.6"
steps:
- uses: actions/checkout@v3
- id: setup-python
@@ -260,14 +252,10 @@ jobs:
os:
- ubuntu-latest
python-version:
- "3.7"
- "3.8"
- "3.9"
- "3.10"
- "3.11"
include:
- os: ubuntu-20.04
python-version: "3.6"
steps:
- uses: actions/checkout@v3
- id: setup-python
10 changes: 10 additions & 0 deletions .gitignore
@@ -2,6 +2,13 @@
# Files already tracked by Git are not affected.
# See: https://git-scm.com/docs/gitignore

## macOS ##
.DS_Store

## Project Specific ##
pe_reports_logging.log
src/pe_source/data/dnstwist_output.txt

## Python ##
__pycache__
.coverage
@@ -10,3 +17,6 @@ __pycache__
.python-version
*.egg-info
dist

## IDE ##
.vscode
15 changes: 14 additions & 1 deletion .pre-commit-config.yaml
@@ -116,6 +116,19 @@ repos:
hooks:
- id: mypy
additional_dependencies:
- boto3-stubs
- celery-types
- pandas-stubs
- types-chevron
- types-colorama
- types-docopt
- types-Flask-Migrate
- types-psycopg2
- types-Pygments
- types-PyYAML
- types-python-dateutil==2.8.19
- types-requests
- types-retry
- types-setuptools
- repo: https://github.com/asottile/pyupgrade
rev: v3.3.1
@@ -124,7 +137,7 @@ repos:

# Ansible hooks
- repo: https://github.com/ansible-community/ansible-lint
rev: v5.4.0
rev: v6.17.2
hooks:
- id: ansible-lint
# files: molecule/default/playbook.yml
73 changes: 59 additions & 14 deletions README.md
@@ -5,20 +5,65 @@
[![Coverage Status](https://coveralls.io/repos/github/cisagov/pe-source/badge.svg?branch=develop)](https://coveralls.io/github/cisagov/pe-source?branch=develop)
[![Known Vulnerabilities](https://snyk.io/test/github/cisagov/pe-source/develop/badge.svg)](https://snyk.io/test/github/cisagov/pe-source)

This is a generic skeleton project that can be used to quickly get a
new [cisagov](https://github.com/cisagov) Python library GitHub
project started. This skeleton project contains [licensing
information](LICENSE), as well as
[pre-commit hooks](https://pre-commit.com) and
[GitHub Actions](https://github.com/features/actions) configurations
appropriate for a Python library project.

## New Repositories from a Skeleton ##

Please see our [Project Setup guide](https://github.com/cisagov/development-guide/tree/develop/project_setup)
for step-by-step instructions on how to start a new repository from
a skeleton. This will save you time and effort when configuring a
new repository!
This package is used to gather and store data for the CISA
[Posture & Exposure Reports](https://github.com/cisagov/pe-reports).

Data of interest include *Exposed Credentials, Domain Masquerading, Malware,
Inferred Vulnerabilities, and the Dark Web*. The data collected for the reports
is gathered on the 1st and 15th of each month.

## Requirements ##

- [Python Environment](CONTRIBUTING.md#creating-the-python-virtual-environment)

## Installation ##

- `git clone https://github.com/cisagov/pe-source.git`
- Add database/API credentials to `src/pe_source/data/pe_db/database.ini`
- `pip install -e .`

## Run P&E Source ##

```console
Usage:
pe-source DATA_SOURCE [--log-level=LEVEL] [--orgs=ORG_LIST] [--cybersix-methods=METHODS] [--soc_med_included]
Review comment (Member):

The --soc_med_included flag is not listed under "Options:" below.

Also, the other options use - instead of _. I suggest renaming the option --soc-med-included for consistency.

Reply (Collaborator Author):

Fixed here: d8e46a6


Arguments:
DATA_SOURCE Source to collect data from. Valid values are "cybersixgill",
"dnstwist", "hibp", "intelx", and "shodan".

Options:
-h --help Show this message.
-v --version Show version information.
-l --log-level=LEVEL If specified, then the log level will be set to
the specified value. Valid values are "debug", "info",
"warning", "error", and "critical". [default: info]
-o --orgs=ORG_LIST A comma-separated list of orgs to collect data for.
If not specified, data will be collected for all
orgs in the pe database. Orgs in the list must match the
IDs in the cyhy-db. E.g. DHS,DHS_ICE,DOC
[default: all]
-c --cybersix-methods=METHODS A comma-separated list of Cybersixgill methods to run.
If not specified, all will run. Valid values are "alerts",
"credentials", "mentions", "topCVEs". E.g. alerts,mentions.
[default: all]
-s --soc-med-included Include social media posts from cybersixgill in data collection.

```

## Examples ##

Run shodan on DHS and DOT:

```console
pe-source shodan --orgs=DHS,DOT
```

Run Cybersixgill mentions on DHS and include social media data:

```console
pe-source cybersixgill --cybersix-methods=mentions --orgs=DHS --soc-med-included
```
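
The `--orgs` option takes a comma-separated list and defaults to `all`. As a rough illustration of how such a value might be turned into a Python list, here is a minimal sketch. It is not the package's actual parsing code; the function name `parse_org_list` and the choice of `None` to mean "all orgs" are assumptions made for this example.

```python
from typing import List, Optional


def parse_org_list(raw: str) -> Optional[List[str]]:
    """Split a comma-separated --orgs value (hypothetical helper).

    Returns None when the value is "all", meaning no filtering by org.
    """
    if raw == "all":
        return None
    # Strip surrounding whitespace and drop empty entries, e.g. from a
    # trailing comma.
    return [org.strip() for org in raw.split(",") if org.strip()]


print(parse_org_list("DHS,DOT"))
print(parse_org_list("all"))
```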

## Contributing ##

45 changes: 33 additions & 12 deletions setup.py
@@ -1,5 +1,5 @@
"""
This is the setup module for the example project.
This is the setup module for the pe-source project.

Based on:

@@ -42,10 +42,10 @@ def get_version(version_file):


setup(
name="example",
name="pe_source",
# Versions should comply with PEP440
version=get_version("src/example/_version.py"),
description="Example Python library",
version=get_version("src/pe_source/_version.py"),
description="Posture and Exposure Source Library",
long_description=readme(),
long_description_content_type="text/markdown",
# Landing page for CISA's cybersecurity mission
@@ -74,9 +74,6 @@ def get_version(version_file):
# Specify the Python versions you support here. In particular, ensure
# that you indicate whether you support Python 2, Python 3 or both.
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.6",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
@@ -85,13 +82,33 @@ def get_version(version_file):
],
python_requires=">=3.6",
# What does your project relate to?
keywords="skeleton",
keywords="posture exposure source",
packages=find_packages(where="src"),
package_dir={"": "src"},
package_data={"example": ["data/*.txt"]},
package_data={
"pe_source": [
"data/*",
"data/dnsmonitor/*",
"data/pe_db/*",
"data/shodan/*",
"data/sixgill/*",
],
},
py_modules=[splitext(basename(path))[0] for path in glob("src/*.py")],
include_package_data=True,
install_requires=["docopt", "schema", "setuptools >= 24.2.0"],
install_requires=[
"click",
"docopt",
"dnstwist",
"dshield",
"dnspython == 2.2.1",
"importlib_resources == 5.4.0",
"pandas == 1.5.1",
"psycopg2-binary == 2.9.3",
"retry == 0.9.2",
"schema == 0.7.5",
"shodan == 1.27.0",
],
extras_require={
"test": [
"coverage",
@@ -107,6 +124,10 @@ def get_version(version_file):
"pytest",
]
},
# Conveniently allows one to run the CLI tool as `example`
entry_points={"console_scripts": ["example = example.example:main"]},
    # Conveniently allows one to run the CLI tool as `pe-source`
entry_points={
"console_scripts": [
"pe-source = pe_source.pe_source:main",
]
},
)
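
The `console_scripts` entry above maps the command name `pe-source` to the `main` callable in the `pe_source.pe_source` module. As a small sketch of how that `module:attribute` string is interpreted, the standard-library `importlib.metadata.EntryPoint` class can split it without the package being installed. This assumes Python 3.9 or later, where the `module` and `attr` properties are available.

```python
from importlib.metadata import EntryPoint

# Build an EntryPoint object by hand from the values declared in setup.py.
ep = EntryPoint(
    name="pe-source",
    value="pe_source.pe_source:main",
    group="console_scripts",
)

print(ep.module)  # the module the generated script imports
print(ep.attr)    # the callable the generated script invokes
```

Calling `ep.load()` would additionally import the module and return the callable, which only works once the package is installed in the current environment.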
9 changes: 0 additions & 9 deletions src/example/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion src/example/data/secret.txt

This file was deleted.

103 changes: 0 additions & 103 deletions src/example/example.py

This file was deleted.
