Commit

source-mongodb: update docs
mdibaiee committed Oct 10, 2023
1 parent 1496c5f commit 69021ac
Showing 1 changed file with 29 additions and 9 deletions.
38 changes: 29 additions & 9 deletions site/docs/reference/Connectors/capture-connectors/mongodb.md
@@ -21,14 +21,20 @@ You'll need:

* Credentials for connecting to your MongoDB instance and database

-* Read access to your MongoDB database and desired collections, see
+* Read access to your MongoDB database(s), see
[Role-Based Access
Control](https://www.mongodb.com/docs/manual/core/authorization/) for more
information.
* Read access to the `local` database and `oplog.rs` collection in that
-database are also necessary.
+database.
+* We recommend giving access to read all databases, as this allows us to
+watch an instance-level change stream, allowing for better guarantees of
+reliability, and possibility of capturing multiple databases in the same
+task. However, if access to all databases is not possible, you can
+give us access to a single database and we will watch a change stream on
+that specific database.

-In order to grant these permissions with a command like so:
+In order to create a user with access to all databases, use a command like so:
```
use admin;
db.createUser({
@@ -37,6 +43,18 @@ You'll need:
roles: [ "readAnyDatabase" ]
})
```
+
+In order to create a user with access to a specific database and the `local` database,
+use a command like so:
+
+```
+use <your-db>;
+db.createUser({
+user: "<username>",
+pwd: "<password>",
+roles: ["read", { role: "read", db: "local" }]
+})
+```

* ReplicaSet enabled on your database, see [Deploy a Replica
Set](https://www.mongodb.com/docs/manual/tutorial/deploy-replica-set/).
@@ -93,16 +111,18 @@ The connector starts by backfilling data from the specified collections until it
reaches the current time. Once all the data up to the current time has been
backfilled, the connector then uses [**change
streams**](https://www.mongodb.com/docs/manual/changeStreams/) to capture
-change events from the database and emit those updates to the flow collection.
+change events and emit those updates to their respective flow collections.
If the connector's process is paused for a while, it will attempt to resume
capturing change events since the last received change event, however the
connector's ability to do this depends on the size of the [replica set
oplog](https://www.mongodb.com/docs/manual/core/replica-set-oplog/), and in
certain circumstances, when the pause has been long enough for the oplog to have
evicted old change events, the connector will need to re-do the backfill to
-ensure data consistency. In such cases, the connector will error, and to resolve
-this case, first try to increase the size of your oplog to avoid this issue in
-the future, and then you need to remove the binding that is unable to be
-captured, publishing your task, and then adding the binding back so the backfill
-is restarted.
+ensure data consistency. In these cases it is necessary to [resize your
+oplog](https://www.mongodb.com/docs/manual/tutorial/change-oplog-size/#c.-change-the-oplog-size-of-the-replica-set-member) or
+[set a minimum retention
+period](https://www.mongodb.com/docs/manual/reference/command/replSetResizeOplog/#minimum-oplog-retention-period)
+for your oplog to be able to reliably capture data.
+The recommended minimum retention period is at least 24 hours, but we recommend
+higher values to improve reliability.
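
The oplog guidance in the last hunk can be illustrated with a small helper that builds a `replSetResizeOplog` command document. This is a minimal sketch and not part of the commit: the function name `buildOplogResizeCommand` and its parameter names are hypothetical, the sketch always sends both `size` and `minRetentionHours`, and the 990 MB floor is the minimum stated in MongoDB's documentation, which should be checked against your server version.

```javascript
// Hypothetical helper (not from the commit): builds the document for
// MongoDB's replSetResizeOplog admin command.
// Assumptions: size is given in megabytes, minRetentionHours in hours,
// and 24 hours is used as the default, matching the minimum the docs recommend.
function buildOplogResizeCommand({ sizeMB, minRetentionHours = 24 } = {}) {
  // MongoDB's documentation lists 990 MB as the smallest allowed oplog size.
  if (!Number.isFinite(sizeMB) || sizeMB < 990) {
    throw new RangeError("sizeMB must be a number of at least 990");
  }
  if (!Number.isFinite(minRetentionHours) || minRetentionHours < 0) {
    throw new RangeError("minRetentionHours must be a non-negative number");
  }
  return {
    replSetResizeOplog: 1,
    size: sizeMB,
    minRetentionHours: minRetentionHours,
  };
}

// Example: a 16000 MB oplog that keeps at least 48 hours of entries.
const command = buildOplogResizeCommand({ sizeMB: 16000, minRetentionHours: 48 });
console.log(JSON.stringify(command));
```

In `mongosh`, connected to a replica set member with sufficient privileges, the resulting document could be passed to `db.adminCommand(command)`. Validating the values before sending them keeps an accidental undersized oplog from being requested.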
