Add support for database URLs #54

Open
wants to merge 1 commit into
base: master
20 changes: 17 additions & 3 deletions README.md
@@ -16,7 +16,7 @@ pip install tap-mongodb

## Set up Config file
Create json file called `config.json`, with the following contents:
```
```json
{
"password": "<password>",
"user": "<username>",
@@ -25,15 +25,29 @@ Create json file called `config.json`, with the following contents:
"database": "<database name>"
}
```
The folowing parameters are optional for your config file:

All of the above attributes are required by the tap to connect to your mongo instance.


Alternatively, you can use a database URL to connect in which case the above settings will be ignored and are thus optional.
Note that you have to include settings like `replica_set` or `ssl` directly in the URL instead of
providing them separately if you use a database URL to establish the connection.
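
The rule described above (the individual settings are only required when no database URL is given) could be checked with something like the following minimal sketch. The helper name and the exact key list are assumptions: the list is pieced together from the keys visible in this diff, which truncates the real one.

```python
# Keys needed when no `database_url` is given. Assumption: reconstructed from
# the hunks visible in this diff; the tap's actual list is truncated here.
REQUIRED_KEYS_WITHOUT_URL = ["host", "port", "user", "password", "database"]


def missing_config_keys(config):
    """Return the required keys absent from `config` (hypothetical helper)."""
    if config.get("database_url"):
        # A database URL carries the connection details itself,
        # so no other key is required.
        return []
    return [key for key in REQUIRED_KEYS_WITHOUT_URL if key not in config]


print(missing_config_keys({"database_url": "mongodb://localhost/test"}))  # []
print(missing_config_keys({"host": "localhost", "port": "27017"}))
```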

For example, to connect to a database that uses DNS SRV records:
```json
{
"database_url": "mongodb+srv://user:<password>@<host>/test?w=majority&tls=true"
}
```
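
Since options such as the replica set name or SSL have to live in the URL itself, here is a rough sketch of how the separate settings might be folded into one. `build_database_url` is a hypothetical helper, not part of the tap; `replicaSet` and `ssl` are standard MongoDB connection string options, and the credentials are percent-encoded so that characters like `@` or `:` cannot break the URL.

```python
from urllib.parse import quote_plus, urlencode


def build_database_url(user, password, host, database, replica_set=None, use_ssl=False):
    """Assemble a MongoDB URL with options in the query string (hypothetical helper)."""
    options = {}
    if replica_set:
        options["replicaSet"] = replica_set
    if use_ssl:
        options["ssl"] = "true"
    # Percent-encode the credentials before placing them in the URL.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    url = f"mongodb://{credentials}@{host}/{database}"
    if options:
        url += "?" + urlencode(options)
    return url


print(build_database_url("user", "p@ss", "localhost:27017", "test",
                         replica_set="rs0", use_ssl=True))
# mongodb://user:p%40ss@localhost:27017/test?replicaSet=rs0&ssl=true
```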

The following parameters are optional for your config file:

| Name | Type | Description |
| -----|------|------------ |
| `replica_set` | string | name of replica set |
|`ssl` | Boolean | can be set to true to connect using ssl |
| `include_schema_in_destination_stream_name` | Boolean | forces the stream names to take the form `<database_name>_<collection_name>` instead of `<collection_name>`|
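
The `__init__.py` hunk in this diff reads these flags with comparisons such as `config.get('ssl') == 'true'`. Assuming that check is what the tap uses, the value apparently needs to be the string `"true"` rather than a JSON boolean. A small sketch of the difference, with a hypothetical helper name:

```python
import json


def flag_enabled(config, key, default="false"):
    """Mirror a `config.get(key) == 'true'` style check (hypothetical helper)."""
    return config.get(key, default) == "true"


as_string = json.loads('{"ssl": "true"}')
as_boolean = json.loads('{"ssl": true}')

print(flag_enabled(as_string, "ssl"))   # True
print(flag_enabled(as_boolean, "ssl"))  # False: Python's True is not equal to "true"
```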

All of the above attributes are required by the tap to connect to your mongo instance.

## Run in discovery mode
Run the following command and redirect the output into the catalog file
1 change: 1 addition & 0 deletions setup.py
@@ -14,6 +14,7 @@
'pymongo==3.8.0',
'tzlocal==2.0.0',
'terminaltables==3.1.0',
'dnspython==2.1.0'
],
extras_require={
'dev': [
46 changes: 29 additions & 17 deletions tap_mongodb/__init__.py
@@ -8,6 +8,7 @@
from bson import timestamp

import singer
from urllib.parse import urlparse
from singer import metadata, metrics, utils

import tap_mongodb.sync_strategies.common as common
@@ -18,7 +19,7 @@

LOGGER = singer.get_logger()

-REQUIRED_CONFIG_KEYS = [
+REQUIRED_CONFIG_KEYS_IF_NO_DATABASE_URL_PROVIDED = [
'host',
'port',
'user',
@@ -350,27 +351,38 @@ def do_sync(client, catalog, state):


 def main_impl():
-    args = utils.parse_args(REQUIRED_CONFIG_KEYS)
+    args = utils.parse_args([])
     config = args.config

-    # Default SSL verify mode to true, give option to disable
-    verify_mode = config.get('verify_mode', 'true') == 'true'
-    use_ssl = config.get('ssl') == 'true'
+    if not config.get('database_url', None):
+        args = utils.parse_args(REQUIRED_CONFIG_KEYS_IF_NO_DATABASE_URL_PROVIDED)
+        config = args.config
+        connection_params = {"host": config['host'],
+                             "port": int(config['port']),
+                             "username": config.get('user', None),
+                             "password": config.get('password', None),
+                             "authSource": config['database'],
+                             "ssl": use_ssl,
+                             "replicaset": config.get('replica_set', None),
+                             "readPreference": 'secondaryPreferred'}

-    connection_params = {"host": config['host'],
-                         "port": int(config['port']),
-                         "username": config.get('user', None),
-                         "password": config.get('password', None),
-                         "authSource": config['database'],
-                         "ssl": use_ssl,
-                         "replicaset": config.get('replica_set', None),
-                         "readPreference": 'secondaryPreferred'}
+        # Default SSL verify mode to true, give option to disable
+        verify_mode = config.get('verify_mode', 'true') == 'true'
+        use_ssl = config.get('ssl') == 'true'

-    # NB: "ssl_cert_reqs" must ONLY be supplied if `SSL` is true.
-    if not verify_mode and use_ssl:
-        connection_params["ssl_cert_reqs"] = ssl.CERT_NONE
+        # NB: "ssl_cert_reqs" must ONLY be supplied if `SSL` is true.
+        if not verify_mode and use_ssl:
+            connection_params["ssl_cert_reqs"] = ssl.CERT_NONE

-    client = pymongo.MongoClient(**connection_params)
+        client = pymongo.MongoClient(**connection_params)
+
+    else:
+        url = config.get('database_url')
+        url_parsed = urlparse(url)
+        config['user'] = url_parsed.username
+        config['database'] = url_parsed.path[1:]
+        config['host'] = url_parsed.netloc
+        client = pymongo.MongoClient(url)

     LOGGER.info('Connected to MongoDB host: %s, version: %s',
                 config['host'],
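
Two details in the `main_impl` hunk may be worth a second look. First, `connection_params` reads `use_ssl` several lines before `use_ssl` is assigned, which would likely raise `UnboundLocalError` on the non-URL path. Second, `urlparse(url).netloc` keeps the `user:password@` prefix, so the value stored in `config['host']` and passed to `LOGGER.info` would include the credentials. The sketch below shows one possible arrangement; `resolve_connection` is a hypothetical helper that only builds the keyword arguments one might pass to `pymongo.MongoClient`, so it runs without a database.

```python
import ssl
from urllib.parse import urlparse


def resolve_connection(config):
    """Return (client_kwargs, host_for_logging). Hypothetical helper, not the tap's API."""
    url = config.get("database_url")
    if url:
        parsed = urlparse(url)
        # `hostname` drops the "user:password@" prefix that `netloc` keeps,
        # so it is safer to log (shown here for a single-host URL).
        return {"host": url}, parsed.hostname

    # Compute the flags first so they exist before the dict refers to them.
    verify_mode = config.get("verify_mode", "true") == "true"
    use_ssl = config.get("ssl") == "true"
    params = {
        "host": config["host"],
        "port": int(config["port"]),
        "username": config.get("user"),
        "password": config.get("password"),
        "authSource": config["database"],
        "ssl": use_ssl,
        "replicaset": config.get("replica_set"),
        "readPreference": "secondaryPreferred",
    }
    # "ssl_cert_reqs" is only supplied when SSL is on and verification is off.
    if not verify_mode and use_ssl:
        params["ssl_cert_reqs"] = ssl.CERT_NONE
    return params, config["host"]


example_url = "mongodb+srv://user:secret@cluster.example.net/test?w=majority"
kwargs, host = resolve_connection({"database_url": example_url})
print(urlparse(example_url).netloc)  # user:secret@cluster.example.net
print(host)                          # cluster.example.net
```

The example URL and host name are made up for illustration.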