Extra MongoDB Tools

This repository provides additional MongoDB tools with the following functionality:

  - mongo-archive: dumps a MongoDB database and uploads the archive to cloud storage (Azure Blob Storage, AWS S3, or GCP Cloud Storage), optionally on a recurring cron schedule.
  - mongo-unarchive: downloads an archived dump from cloud storage and restores it to a target database, optionally applying update specifications afterwards.

Building Tools

To build the MongoDB Tools, follow these steps:

  1. Clone the repository:

    git clone https://github.com/egose/database-tools
    cd database-tools
    
  2. Install dependencies and build the Go binaries:

    go mod tidy
    make build
    

This installs all necessary dependencies and then builds the Go binaries into the dist directory.
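
Once the build completes, the compiled binaries (mongo-archive and mongo-unarchive) should be available under dist and can be run directly. A quick sanity check, assuming the standard --help flag is available:

./dist/mongo-archive --help
./dist/mongo-unarchive --help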

Binary Arguments and Environment Variables

The binaries in this repository wrap MongoDB's native tools, so the interface stays familiar: the command arguments closely follow the behavior and command structure of mongodump and mongorestore, with minimal modifications. Every flag can also be supplied through a corresponding environment variable, as listed in the tables below.
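
As a minimal sketch of that mapping, the two invocations below are equivalent; the values are illustrative, and a real run would also include a storage destination such as the Azure flags shown in the examples further down:

# flag form
mongo-archive --uri="mongodb://localhost:27017" --db=mydb

# equivalent environment-variable form
export MONGOARCHIVE__URI="mongodb://localhost:27017"
export MONGOARCHIVE__DB=mydb
mongo-archive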

mongo-archive

| flag | environment variable | type | description |
| --- | --- | --- | --- |
| uri | MONGOARCHIVE__URI | string | MongoDB URI connection string |
| db | MONGOARCHIVE__DB | string | database to use |
| collection | MONGOARCHIVE__COLLECTION | string | collection to use |
| host | MONGOARCHIVE__HOST | string | MongoDB host to connect to |
| port | MONGOARCHIVE__PORT | string | MongoDB port |
| ssl | MONGOARCHIVE__SSL | bool | connect to a mongod or mongos that has SSL enabled |
| ssl-ca-file | MONGOARCHIVE__SSL_CA_FILE | string | the .pem file containing the root certificate chain |
| ssl-pem-key-file | MONGOARCHIVE__SSL_PEM_KEY_FILE | string | the .pem file containing the certificate and key |
| ssl-pem-key-password | MONGOARCHIVE__SSL_PEM_KEY_PASSWORD | string | the password to decrypt the sslPEMKeyFile, if necessary |
| ssl-crl-file | MONGOARCHIVE__SSL_CRL_FILE | string | the .pem file containing the certificate revocation list |
| ssl-allow-invalid-certificates | MONGOARCHIVE__SSL_ALLOW_INVALID_CERTIFICATES | bool | bypass validation of server certificates |
| ssl-allow-invalid-hostnames | MONGOARCHIVE__SSL_ALLOW_INVALID_HOSTNAMES | bool | bypass validation of server hostnames |
| ssl-fips-mode | MONGOARCHIVE__SSL_FIPS_MODE | bool | use FIPS mode of the installed OpenSSL library |
| username | MONGOARCHIVE__USERNAME | string | username for authentication |
| password | MONGOARCHIVE__PASSWORD | string | password for authentication |
| authentication-database | MONGOARCHIVE__AUTHENTICATION_DATABASE | string | database that holds the user's credentials |
| authentication-mechanism | MONGOARCHIVE__AUTHENTICATION_MECHANISM | string | authentication mechanism to use |
| gssapi-service-name | MONGOARCHIVE__GSSAPI_SERVICE_NAME | string | service name to use when authenticating using GSSAPI/Kerberos |
| gssapi-host-name | MONGOARCHIVE__GSSAPI_HOST_NAME | string | hostname to use when authenticating using GSSAPI/Kerberos |
| query | MONGOARCHIVE__QUERY | string | query filter, as a v2 Extended JSON string |
| query-file | MONGOARCHIVE__QUERY_FILE | string | path to a file containing a query filter (v2 Extended JSON) |
| read-preference | MONGOARCHIVE__READ_PREFERENCE | string | specify either a preference mode or a preference JSON object |
| force-table-scan | MONGOARCHIVE__FORCE_TABLE_SCAN | bool | force a table scan |
| verbose | MONGOARCHIVE__VERBOSE | string | more detailed log output |
| quiet | MONGOARCHIVE__QUIET | bool | hide all log output |
| az-account-name | MONGOARCHIVE__AZ_ACCOUNT_NAME | string | Azure Blob Storage account name |
| az-account-key | MONGOARCHIVE__AZ_ACCOUNT_KEY | string | Azure Blob Storage account key |
| az-container-name | MONGOARCHIVE__AZ_CONTAINER_NAME | string | Azure Blob Storage container name |
| aws-access-key-id | MONGOARCHIVE__AWS_ACCESS_KEY_ID | string | AWS access key associated with an IAM account |
| aws-secret-access-key | MONGOARCHIVE__AWS_SECRET_ACCESS_KEY | string | AWS secret key associated with the access key |
| aws-region | MONGOARCHIVE__AWS_REGION | string | AWS region whose servers you want to send requests to |
| aws-bucket | MONGOARCHIVE__AWS_BUCKET | string | AWS S3 bucket name |
| gcp-bucket | MONGOARCHIVE__GCP_BUCKET | string | GCP storage bucket name |
| gcp-creds-file | MONGOARCHIVE__GCP_CREDS_FILE | string | GCP service account's credentials file |
| gcp-project-id | MONGOARCHIVE__GCP_PROJECT_ID | string | GCP service account's project ID |
| gcp-private-key-id | MONGOARCHIVE__GCP_PRIVATE_KEY_ID | string | GCP service account's private key ID |
| gcp-private-key | MONGOARCHIVE__GCP_PRIVATE_KEY | string | GCP service account's private key |
| gcp-client-email | MONGOARCHIVE__GCP_CLIENT_EMAIL | string | GCP service account's client email |
| gcp-client-id | MONGOARCHIVE__GCP_CLIENT_ID | string | GCP service account's client ID |
| cron | MONGOARCHIVE__CRON | bool | run a cron scheduler and block the current execution path |
| cron-expression | MONGOARCHIVE__CRON_EXPRESSION | string | a string describing the cron schedule |
| tz | MONGOARCHIVE__TZ | string | user-specified time zone |
| keep | MONGOARCHIVE__KEEP | bool | keep the data dump |
| uri-prune | MONGOARCHIVE__URI_PRUNE | bool | prune the MongoDB URI connection string |

mongo-unarchive

| flag | environment variable | type | description |
| --- | --- | --- | --- |
| uri | MONGOUNARCHIVE__URI | string | MongoDB URI connection string |
| db | MONGOUNARCHIVE__DB | string | database to use |
| collection | MONGOUNARCHIVE__COLLECTION | string | collection to use |
| ns-exclude | MONGOUNARCHIVE__NS_EXCLUDE | string | exclude matching namespaces |
| ns-include | MONGOUNARCHIVE__NS_INCLUDE | string | include matching namespaces |
| ns-from | MONGOUNARCHIVE__NS_FROM | string | rename matching namespaces, must have matching ns-to |
| ns-to | MONGOUNARCHIVE__NS_TO | string | rename matched namespaces, must have matching ns-from |
| host | MONGOUNARCHIVE__HOST | string | MongoDB host to connect to |
| port | MONGOUNARCHIVE__PORT | string | MongoDB port |
| ssl | MONGOUNARCHIVE__SSL | bool | connect to a mongod or mongos that has SSL enabled |
| ssl-ca-file | MONGOUNARCHIVE__SSL_CA_FILE | string | the .pem file containing the root certificate chain |
| ssl-pem-key-file | MONGOUNARCHIVE__SSL_PEM_KEY_FILE | string | the .pem file containing the certificate and key |
| ssl-pem-key-password | MONGOUNARCHIVE__SSL_PEM_KEY_PASSWORD | string | the password to decrypt the sslPEMKeyFile, if necessary |
| ssl-crl-file | MONGOUNARCHIVE__SSL_CRL_FILE | string | the .pem file containing the certificate revocation list |
| ssl-allow-invalid-certificates | MONGOUNARCHIVE__SSL_ALLOW_INVALID_CERTIFICATES | bool | bypass validation of server certificates |
| ssl-allow-invalid-hostnames | MONGOUNARCHIVE__SSL_ALLOW_INVALID_HOSTNAMES | bool | bypass validation of server hostnames |
| ssl-fips-mode | MONGOUNARCHIVE__SSL_FIPS_MODE | bool | use FIPS mode of the installed OpenSSL library |
| username | MONGOUNARCHIVE__USERNAME | string | username for authentication |
| password | MONGOUNARCHIVE__PASSWORD | string | password for authentication |
| authentication-database | MONGOUNARCHIVE__AUTHENTICATION_DATABASE | string | database that holds the user's credentials |
| authentication-mechanism | MONGOUNARCHIVE__AUTHENTICATION_MECHANISM | string | authentication mechanism to use |
| gssapi-service-name | MONGOUNARCHIVE__GSSAPI_SERVICE_NAME | string | service name to use when authenticating using GSSAPI/Kerberos |
| gssapi-host-name | MONGOUNARCHIVE__GSSAPI_HOST_NAME | string | hostname to use when authenticating using GSSAPI/Kerberos |
| drop | MONGOUNARCHIVE__DROP | bool | drop each collection before import |
| dry-run | MONGOUNARCHIVE__DRY_RUN | bool | view a summary without importing anything; recommended with verbosity |
| write-concern | MONGOUNARCHIVE__WRITE_CONCERN | string | write concern options |
| no-index-restore | MONGOUNARCHIVE__NO_INDEX_RESTORE | bool | don't restore indexes |
| no-options-restore | MONGOUNARCHIVE__NO_OPTIONS_RESTORE | bool | don't restore collection options |
| keep-index-version | MONGOUNARCHIVE__KEEP_INDEX_VERSION | bool | don't update index version |
| maintain-insertion-order | MONGOUNARCHIVE__MAINTAIN_INSERTION_ORDER | bool | restore documents in the order of the input source |
| num-parallel-collections | MONGOUNARCHIVE__NUM_PARALLEL_COLLECTIONS | string | number of collections to restore in parallel |
| num-insertion-workers-per-collection | MONGOUNARCHIVE__NUM_INSERTION_WORKERS_PER_COLLECTION | string | number of insert operations to run concurrently per collection |
| stop-on-error | MONGOUNARCHIVE__STOP_ON_ERROR | string | halt after encountering any error during insertion |
| bypass-document-validation | MONGOUNARCHIVE__BYPASS_DOCUMENT_VALIDATION | string | bypass document validation |
| preserve-uuid | MONGOUNARCHIVE__PRESERVE_UUID | string | preserve original collection UUIDs |
| verbose | MONGOUNARCHIVE__VERBOSE | string | more detailed log output |
| quiet | MONGOUNARCHIVE__QUIET | bool | hide all log output |
| az-account-name | MONGOUNARCHIVE__AZ_ACCOUNT_NAME | string | Azure Blob Storage account name |
| az-account-key | MONGOUNARCHIVE__AZ_ACCOUNT_KEY | string | Azure Blob Storage account key |
| az-container-name | MONGOUNARCHIVE__AZ_CONTAINER_NAME | string | Azure Blob Storage container name |
| aws-access-key-id | MONGOUNARCHIVE__AWS_ACCESS_KEY_ID | string | AWS access key associated with an IAM account |
| aws-secret-access-key | MONGOUNARCHIVE__AWS_SECRET_ACCESS_KEY | string | AWS secret key associated with the access key |
| aws-region | MONGOUNARCHIVE__AWS_REGION | string | AWS region whose servers you want to send requests to |
| aws-bucket | MONGOUNARCHIVE__AWS_BUCKET | string | AWS S3 bucket name |
| gcp-bucket | MONGOUNARCHIVE__GCP_BUCKET | string | GCP storage bucket name |
| gcp-creds-file | MONGOUNARCHIVE__GCP_CREDS_FILE | string | GCP service account's credentials file |
| gcp-project-id | MONGOUNARCHIVE__GCP_PROJECT_ID | string | GCP service account's project ID |
| gcp-private-key-id | MONGOUNARCHIVE__GCP_PRIVATE_KEY_ID | string | GCP service account's private key ID |
| gcp-private-key | MONGOUNARCHIVE__GCP_PRIVATE_KEY | string | GCP service account's private key |
| gcp-client-email | MONGOUNARCHIVE__GCP_CLIENT_EMAIL | string | GCP service account's client email |
| gcp-client-id | MONGOUNARCHIVE__GCP_CLIENT_ID | string | GCP service account's client ID |
| object-name | MONGOUNARCHIVE__OBJECT_NAME | string | object name of the archived file in the storage |
| dir | MONGOUNARCHIVE__DIR | string | directory name that contains the dumped files |
| updates | MONGOUNARCHIVE__UPDATES | string | array of update specifications as a JSON string |
| updates-file | MONGOUNARCHIVE__UPDATES_FILE | string | path to a file containing an array of update specifications |
| keep | MONGOUNARCHIVE__KEEP | bool | keep the data dump |
| uri-prune | MONGOUNARCHIVE__URI_PRUNE | bool | prune the MongoDB URI connection string |

Examples

Dump Database and Upload to Azure Storage

mongo-archive \
--uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name>

This example demonstrates how to dump the data from a specified database and upload it to Azure storage. Replace <username>, <password>, <dbname>, <az_account_name>, <az_account_key>, and <az_container_name> with the appropriate values for your setup.
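
The same dump can be sent to AWS S3 or GCP Cloud Storage instead by swapping in the corresponding storage flags from the table above. A sketch of an S3 variant, using only documented flags with placeholder values, assuming the tool uploads to whichever storage backend the supplied credentials belong to:

mongo-archive \
--uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
--db=<dbname> \
--aws-access-key-id=<aws_access_key_id> \
--aws-secret-access-key=<aws_secret_access_key> \
--aws-region=<aws_region> \
--aws-bucket=<aws_bucket>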

Run Persistent Server for Regular Database Archival

mongo-archive \
--uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name> \
--cron \
--cron-expression="* * * * *"

This example demonstrates how to run a persistent server that regularly archives a database. The server executes the archival process according to the specified cron expression. Replace <username>, <password>, <dbname>, <az_account_name>, <az_account_key>, and <az_container_name> with your own values, and adjust the cron expression to your desired schedule.
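
For a nightly run rather than one every minute, the schedule and time zone can be set with the documented --cron-expression and --tz flags. A sketch with illustrative values (2:00 AM in the America/Vancouver time zone):

mongo-archive \
--uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name> \
--cron \
--cron-expression="0 2 * * *" \
--tz="America/Vancouver"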

Restore the Target Database from Azure Storage

mongo-unarchive \
--uri="mongodb://localhost:27017" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name>

This example shows how to restore the target database from Azure storage. Replace <dbname>, <az_account_name>, <az_account_key>, and <az_container_name> with your own values. The database will be restored to the MongoDB instance running on localhost:27017.
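
To restore from a particular archived object in the container, the documented --object-name flag can be added. The object name below is a placeholder, since the actual naming of uploaded archives depends on how they were created:

mongo-unarchive \
--uri="mongodb://localhost:27017" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name> \
--object-name=<archived_object_name>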

Restore the Target Database from Azure Storage and Apply Changes

mongo-unarchive \
--uri="mongodb://localhost:27017" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name> \
--updates-file=/home/nonroot/updates.json

This example demonstrates how to restore the target database from Azure storage and apply changes contained in an updates file. Replace <dbname>, <az_account_name>, <az_account_key>, <az_container_name>, and /home/nonroot/updates.json with your own values. The updates file should contain the necessary instructions to modify the restored database.

An example of updates.json:

[
  {
    "collection": "users",
    "filter": {
      "email": {
        "$exists": true
      }
    },
    "update": [
      {
        "$set": {
          "email": {
            "$replaceOne": {
              "input": "$email",
              "find": "@",
              "replacement": "_"
            }
          }
        }
      }
    ]
  }
]

This JSON file provides an example of updating the users collection in the restored database: it replaces the "@" in each user's email address with "_".
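
The same update specifications can also be passed inline through the documented --updates flag instead of a file. A sketch, single-quoted so the shell leaves the JSON untouched:

mongo-unarchive \
--uri="mongodb://localhost:27017" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name> \
--updates='[{"collection": "users", "filter": {"email": {"$exists": true}}, "update": [{"$set": {"email": {"$replaceOne": {"input": "$email", "find": "@", "replacement": "_"}}}}]}]'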

Execute Binary Using Docker Container Image

To execute a binary using a Docker container image, you can use the following command:

docker run --rm \
    -v "$(pwd)/tmp:/tmp" \
    -e MONGOARCHIVE__DUMP_PATH=/tmp/datadump \
    ghcr.io/egose/database-tools:latest \
    mongo-archive \
    --uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
    --db=<dbname> \
    --az-account-name=<az_account_name> \
    --az-account-key=<az_account_key> \
    --az-container-name=<az_container_name> \
    --keep
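
The same image can be used to run mongo-unarchive. A sketch of a containerized restore, assuming the target MongoDB instance is reachable from inside the container (so localhost would not work here):

docker run --rm \
    -v "$(pwd)/tmp:/tmp" \
    ghcr.io/egose/database-tools:latest \
    mongo-unarchive \
    --uri="mongodb://<username>:<password>@<host>:<port>" \
    --db=<dbname> \
    --az-account-name=<az_account_name> \
    --az-account-key=<az_account_key> \
    --az-container-name=<az_container_name>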

Run Kubernetes CronJob with Mounted Volume

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongo-archive
spec:
  schedule: "0 12 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 3
      template:
        spec:
          restartPolicy: Never
          initContainers:
            - name: backup-permission
              image: alpine:3.18
              imagePullPolicy: IfNotPresent
              command: ["/bin/sh", "-c"]
              args:
                - |
                  rm -rf /tmp/*;
                  adduser -D -u 1000 nonroot;
                  chown nonroot:nonroot /tmp;
              volumeMounts:
                - mountPath: /tmp
                  name: backup-volume
          containers:
            - name: backup-job
              image: ghcr.io/egose/database-tools:0.2.6
              imagePullPolicy: IfNotPresent
              command: ["/bin/sh", "-c"]
              args:
                - |
                  mongo-archive --db=mydb --read-preference=primary --force-table-scan
              env:
                - name: MONGOARCHIVE__URI
                  value: "mongodb+srv://user:password@cluster0.my.mongodb.net"
                - name: MONGOARCHIVE__AZ_ACCOUNT_NAME
                  value: mystorage
                - name: MONGOARCHIVE__AZ_ACCOUNT_KEY
                  value: myaccountkey
                - name: MONGOARCHIVE__AZ_CONTAINER_NAME
                  value: mybackup
              volumeMounts:
                - mountPath: /tmp
                  name: backup-volume
          volumes:
            - name: backup-volume
              persistentVolumeClaim:
                claimName: backup-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
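
The manifests above can be applied with kubectl, and a one-off Job can be created from the CronJob to verify the setup before waiting for the schedule; the manifest filename is illustrative:

kubectl apply -f mongo-archive-cronjob.yaml
kubectl create job --from=cronjob/mongo-archive mongo-archive-manual
kubectl logs -f job/mongo-archive-manual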

Backlog

...