AWS NoSQL database

AWS DocumentDB

[DocumentDB vs DynamoDB](https://cloud.netapp.com/blog/aws-cvo-blg-amazon-documentdb-basics-and-best-practicesH_H3)

DocumentDB and DynamoDB are both services that you can use as document databases. Both provide data portability and support migration through the AWS Database Migration Service. Both services also provide encryption with AWS Key Management Service and auditing with CloudTrail and VPC Flow Logs, and both can be provisioned through CloudFormation.

Despite these similarities, the use cases for the two services differ slightly. DynamoDB is both a document database and a key-value database. It is optimized for applications that look items up by unique keys, but it is less suited to scans or queries on non-key attributes. In contrast, DocumentDB allows more flexible data indexing and is optimized for queries.

Another difference is the cost structure of the services. DynamoDB is priced by read/write units, with on-demand, provisioned, or reserved pricing models. You can maintain small capacities to keep costs low, and the first 25 GB of storage is free.

In contrast, DocumentDB is based on a pay-per-instance pricing model. At the time of the source article, the smallest available instance sizes were the r4.large and r5.large instances. You can provision these instances or use bill-per-hour pricing.
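To make the two pricing models easier to compare, here is a rough sketch of the arithmetic. All prices are parameters, and the numbers in the usage note are made-up placeholders, not AWS prices. The function names are mine, the 730 hours per month figure is a common approximation, and the DocumentDB side only covers instance hours (it ignores storage, I/O, and backup charges).

```python
HOURS_PER_MONTH = 730  # approximation; actual billing depends on the month


def dynamodb_on_demand_cost(read_requests, write_requests, storage_gb, *,
                            price_per_million_reads,
                            price_per_million_writes,
                            price_per_gb_month,
                            free_storage_gb=25):
    """Estimate a monthly DynamoDB on-demand bill from request counts and storage."""
    request_cost = (
        (read_requests / 1_000_000) * price_per_million_reads
        + (write_requests / 1_000_000) * price_per_million_writes
    )
    # Only storage above the free allowance is billed.
    billable_gb = max(0, storage_gb - free_storage_gb)
    return request_cost + billable_gb * price_per_gb_month


def documentdb_instance_cost(instance_count, price_per_instance_hour,
                             hours=HOURS_PER_MONTH):
    """Estimate the monthly instance charge for a DocumentDB cluster."""
    return instance_count * price_per_instance_hour * hours
```

For example, with placeholder prices, `dynamodb_on_demand_cost(2_000_000, 1_000_000, 30, price_per_million_reads=0.5, price_per_million_writes=1.0, price_per_gb_month=0.2)` adds up request charges plus 5 GB of billable storage, while `documentdb_instance_cost(1, 0.25)` is a flat charge that does not depend on traffic.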

[Impact of Indexes](https://cloud.netapp.com/blog/aws-cvo-blg-amazon-documentdb-basics-and-best-practicesH_H3)

Indexing enables you to decrease query times by making it easier to locate the data you need. However, when documents are indexed, each write or modification requires the index to be updated. This means that write times increase according to the number of indexes that must be updated each time. Indexes can also increase I/O operations and storage use.
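A toy in-memory model of the trade-off described above. This is not how DocumentDB implements indexes; `ToyCollection` and its fields are made up for illustration. It only shows that every insert has to touch each index, while an indexed lookup avoids scanning all documents.

```python
class ToyCollection:
    """In-memory stand-in for a collection with secondary indexes."""

    def __init__(self, indexed_fields):
        self.documents = []
        # One dict per indexed field: value -> list of document positions.
        self.indexes = {field: {} for field in indexed_fields}
        self.index_updates = 0  # counts extra work done on writes

    def insert(self, document):
        position = len(self.documents)
        self.documents.append(document)
        # Each index that covers a field in the document must be updated.
        for field, index in self.indexes.items():
            if field in document:
                index.setdefault(document[field], []).append(position)
                self.index_updates += 1

    def find(self, field, value):
        if field in self.indexes:
            # Indexed lookup: jump straight to the matching positions.
            positions = self.indexes[field].get(value, [])
            return [self.documents[i] for i in positions]
        # No index: fall back to scanning every document.
        return [d for d in self.documents if d.get(field) == value]
```

Inserting three documents into `ToyCollection(["city", "age"])` performs six index updates, and the same inserts into a collection with five indexed fields would perform up to fifteen.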

Minimizing the number of indexes you create can help you speed up queries without drastically affecting write times. In general, the recommendation is to use no more than five indexes per collection.
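A small helper to check a collection against the five-index guideline. It expects the dictionary that PyMongo's `Collection.index_information()` returns (index name mapped to index details). The helper names are mine, and leaving the default `_id_` index out of the count is my assumption, since the source does not say whether it counts.

```python
def secondary_index_names(index_info):
    """Return the index names other than the default `_id_` index."""
    return sorted(name for name in index_info if name != "_id_")


def exceeds_index_budget(index_info, limit=5):
    """True if the collection has more secondary indexes than `limit`."""
    return len(secondary_index_names(index_info)) > limit
```

With a live connection this would be called as `exceeds_index_budget(col.index_information())`.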

Change Stream

A change stream is a way to monitor changes in a collection. It is a feature of MongoDB and related databases. It might be useful for post-processing write events (for example, a data anonymization step).
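A minimal sketch of the anonymization idea. `source` and `target` are assumed to be PyMongo collections (only `watch()` and `insert_one()` are used). The field names in `SENSITIVE_FIELDS` and the hashing step are placeholders, and the helper names are mine. Change streams may need to be enabled on the DocumentDB cluster first; I have not verified the setup here.

```python
import hashlib

SENSITIVE_FIELDS = ("email", "name")  # hypothetical field names


def anonymize(document, fields=SENSITIVE_FIELDS):
    """Return a copy of `document` with the given fields replaced by a SHA-256 digest."""
    cleaned = dict(document)
    for field in fields:
        if field in cleaned:
            raw = str(cleaned[field]).encode("utf-8")
            cleaned[field] = hashlib.sha256(raw).hexdigest()
    return cleaned


def anonymize_inserts(source, target, max_events=None):
    """Copy masked versions of newly inserted documents from `source` to `target`.

    Blocks while waiting for events; stops after `max_events` inserts if given.
    """
    handled = 0
    with source.watch() as stream:
        for change in stream:
            # Only insert events are handled in this sketch.
            if change.get("operationType") != "insert":
                continue
            target.insert_one(anonymize(change["fullDocument"]))
            handled += 1
            if max_events is not None and handled >= max_events:
                break
    return handled
```

Hashing keeps values comparable but is closer to pseudonymization than real anonymization, so a production step would likely need something stronger.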

Python Library for using DocumentDB

PyMongo

AWS DocumentDB: Tutorial

AWS DocumentDB: Connecting Programmatically

    import pymongo
    import sys

    # Create a MongoDB client, open a connection to Amazon DocumentDB as a replica set and specify the read preference as secondary preferred
    client = pymongo.MongoClient('mongodb://<sample-user>:<password>@sample-cluster.node.us-east-1.docdb.amazonaws.com:27017/?tls=true&tlsCAFile=rds-combined-ca-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false')

    # Specify the database to be used
    db = client.sample_database

    # Specify the collection to be used
    col = db.sample_collection

    # Insert a single document
    col.insert_one({'hello':'Amazon DocumentDB'})

    # Find the document that was previously written
    x = col.find_one({'hello':'Amazon DocumentDB'})

    # Print the result to the screen
    print(x)

    # Close the connection
    client.close()
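A possible variation on the example above, as a sketch. It builds the same connection string from parts and percent-escapes the credentials, which PyMongo requires when a username or password contains characters such as `@` or `/`. `build_docdb_uri` and `insert_and_find` are my own helper names, and the default CA file name is just the one used in the connection string above.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(user, password, host, port=27017,
                    ca_file="rds-combined-ca-bundle.pem"):
    """Assemble a DocumentDB connection string with escaped credentials."""
    options = urlencode({
        "tls": "true",
        "tlsCAFile": ca_file,
        "replicaSet": "rs0",
        "readPreference": "secondaryPreferred",
        "retryWrites": "false",
    })
    return (f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}:{port}/?{options}")


def insert_and_find(uri):
    """Insert one document and read it back; needs pymongo and a reachable cluster."""
    import pymongo  # imported here so build_docdb_uri works without pymongo installed

    # The client is a context manager, so the connection is closed on exit.
    with pymongo.MongoClient(uri) as client:
        col = client.sample_database.sample_collection
        col.insert_one({"hello": "Amazon DocumentDB"})
        return col.find_one({"hello": "Amazon DocumentDB"})
```

Using the client in a `with` block replaces the explicit `client.close()` and also closes the connection if an insert or query raises.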
