Administration of MongoDB Cluster Operations Tutorial

Welcome to the seventh chapter of the MongoDB tutorial (part of the MongoDB Developer and Administrator Course). This lesson will explain the administrative features of MongoDB. Let us explore the objectives of this lesson in the next section.

Objectives

After completing this lesson, you will be able to:

  • Explain what memory-mapped files are
  • Explain allocation algorithms
  • Describe the storage engines MongoDB supports

Memory-Mapped Files

A memory-mapped file is a file whose data the operating system places in memory by using the mmap() system call. The mmap() call maps the file to a region of virtual memory.

Memory-mapped files are the critical piece of the MMAPv1 storage engine in MongoDB.

The memory-mapped files enable MongoDB to treat the contents of its data files as if they were in memory. This enables MongoDB to access and manipulate data faster. MongoDB uses the memory-mapped files to manage and interact with all types of data.

Memory mapping assigns data files to a block of virtual memory with a direct byte-for-byte correlation. MongoDB memory-maps the data files as it accesses documents. Therefore, only accessed data is mapped to memory, whereas data that has not been accessed is not.

Once memory mapping is complete, the relationship between the file and memory allows MongoDB to interact with the data in the file as if it were in memory.

In the next section, we will discuss journaling mechanics.

Journaling Mechanics

MongoDB uses write-ahead logging to an on-disk journal. This guarantees the durability of write operations.

Before making any changes to the data files, MongoDB first writes the change operation to the journal. If a MongoDB instance encounters an error or terminates before writing the changes from the journal to the data files, MongoDB can re-apply the write operation to maintain a consistent state.
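The journal-then-data-files ordering can be illustrated with a small in-memory model. This is only a sketch of the idea, not how mongod is implemented; the names createStore, write, flush, and recover are hypothetical.

```javascript
// Toy model of write-ahead journaling (illustrative only).
function createStore() {
  return { journal: [], dataFiles: {}, applied: 0 };
}

// Step 1: record the operation in the journal before touching the data files.
function write(store, key, value) {
  store.journal.push({ key: key, value: value });
}

// Step 2: later, apply journaled operations to the data files.
function flush(store) {
  while (store.applied < store.journal.length) {
    const op = store.journal[store.applied];
    store.dataFiles[op.key] = op.value;
    store.applied += 1;
  }
}

// After a crash, rebuild state by replaying the surviving journal.
// Setting a key to a value is idempotent, so replaying is safe.
function recover(journal) {
  const store = createStore();
  store.journal = journal.slice();
  flush(store);
  return store;
}
```

If the process stops after write() but before flush(), calling recover() with the journal reproduces the writes that never reached the data files.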

If a mongod instance exits unexpectedly without a journal, you must assume that your data is in an inconsistent state. You must then run a repair or, preferably, resync from a clean member of the replica set. When journaling is enabled, even if mongod stops unexpectedly, the program can recover all the data written to the journal.

The recovered data is in a consistent state. By default, the most that MongoDB can lose, that is, the writes not yet made to the journal, is the writes made in the last 100 milliseconds. With journaling enabled, if you want the entire data set to reside in RAM, you need enough RAM to hold both the data set and the write working set.

To enable journaling, start mongod with the --journal command-line option.

In the next section, we will discuss storage engines.

Storage Engines

The storage engine of a database manages how data is stored on disk. Typically, databases support multiple storage engines, each suited to specific workloads.

For example, one storage engine may perform better for read-heavy operations, whereas another may support higher throughput for write operations. With multiple storage engines available, you can choose the one that best suits your application.

In the next section, we will discuss the MMAPv1 storage engine.

MMAPv1 Storage Engine

MMAPv1 is a storage engine based on memory-mapped files.

This storage engine can manage high-volume operations, such as inserts, reads, and in-place updates. MMAPv1 is the default storage engine in MongoDB 3.0 and all previous versions.

To ensure that all dataset modifications are stored on the disk, MongoDB records all modifications in a journal. MongoDB writes the journal to the disk more frequently than it writes the data files. The journal lets MongoDB recover modifications after a mongod instance exits without having written all changes to the data files.

In the next section, we will discuss the WiredTiger storage engine.

WiredTiger Storage Engine

In MongoDB version 3.0, an additional storage engine is available. Although MMAPv1 remains the default storage engine, the new WiredTiger engine offers additional flexibility and improved throughput for many workloads.

The WiredTiger storage engine is optionally available in the 64-bit build of MongoDB version 3.0. This engine supports high volumes of read, insert, and more complex update workloads.

Document-Level Locking: All write operations in WiredTiger occur within the context of a document-level lock. Therefore, multiple clients can simultaneously modify more than one document in a single collection. With such control, MongoDB can effectively support read, write, update, and high-throughput concurrent workloads.

For data persistence, WiredTiger uses a write-ahead transaction log in combination with checkpoints. Using WiredTiger, MongoDB commits a checkpoint to the disk either every 60 seconds or when there are 2 gigabytes of data to write.

The checkpoint thresholds are configurable, and the data files are always valid between and during the checkpoints. The WiredTiger journal persists all data modifications between the checkpoints.

If MongoDB exits between the checkpoints, it uses the journal to replay all the data modified since the last checkpoint. By default, the WiredTiger journal is compressed using the snappy algorithm.

In the next section, we will discuss WiredTiger compression support.

WiredTiger Compression Support

With WiredTiger, MongoDB supports compression for collections and indexes by using block compression and prefix compression.

Compression minimizes storage use at the cost of additional CPU usage. By default, prefix compression is used for all indexes in the WiredTiger engine. In addition, by default, all collections with WiredTiger use block compression with the snappy algorithm.

Compression with zlib is also available. You can modify the default compression settings for all collections and indexes. During collection and index creation, you can configure compression on a per-collection and per-index basis.

For most workloads, the default compression settings balance storage efficiency and processing requirements.
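As a rough illustration of per-collection configuration, the sketch below creates a collection that requests zlib instead of the default snappy block compressor. It assumes the mongo shell's db handle is passed in as a parameter and relies on the storageEngine.wiredTiger.configString option of db.createCollection(). The function names and the collection name in the usage note are hypothetical.

```javascript
// Builds createCollection options that request zlib block compression.
function zlibCollectionOptions() {
  return {
    storageEngine: {
      wiredTiger: { configString: "block_compressor=zlib" }
    }
  };
}

// `db` is assumed to be a mongo shell database handle.
function createZlibCollection(db, collectionName) {
  return db.createCollection(collectionName, zlibCollectionOptions());
}
```

In the shell, this could be called as createZlibCollection(db, "archive").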

In the next section, we will discuss the power of 2 sized allocations.

Power of 2-Sized Allocations

MongoDB version 3.0 uses the power of 2 sized allocations as the default record allocation strategy for the MMAPv1 storage engine.

With this strategy, each record has a size in bytes that is a power of 2, for example, 32, 64, 128, 256, 512, and so on up to 2 MB. For documents larger than 2 MB, the allocation is rounded up to the nearest multiple of 2 MB.
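The rounding rule can be sketched as a small function. This is a simplified model that assumes the smallest record is 32 bytes (the first value in the example list) and ignores any per-record header overhead; the function name is hypothetical.

```javascript
const MIN_RECORD_BYTES = 32;                  // assumed smallest allocation
const MAX_POWER_OF_2_BYTES = 2 * 1024 * 1024; // 2 MB

function powerOf2Allocation(documentBytes) {
  // Above 2 MB: round up to the nearest multiple of 2 MB.
  if (documentBytes > MAX_POWER_OF_2_BYTES) {
    return Math.ceil(documentBytes / MAX_POWER_OF_2_BYTES) * MAX_POWER_OF_2_BYTES;
  }
  // Otherwise: round up to the next power of 2.
  let size = MIN_RECORD_BYTES;
  while (size < documentBytes) {
    size *= 2;
  }
  return size;
}
```

Under these assumptions, a 100-byte document is allocated 128 bytes, leaving 28 bytes of padding for growth, and a 5 MB document is allocated 6 MB.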

The power of 2 sized allocation strategy has the following key properties:

  • It can reuse freed records and reduce fragmentation. Quantizing record allocation sizes in a fixed set of values ensures that an insert will fit into the free space available due to deletion or relocation of earlier documents.

  • It can reduce moves. The added padding space gives a document room to grow without requiring a move. In addition to saving the cost of moving, this results in fewer updates to indexes.

In the next section, we will discuss no padding allocation strategy.

No Padding Allocation Strategy

Some collections have workloads that consist of insert-only operations or of update operations that do not increase document sizes.

For such workloads, you can disable the power of 2 allocations using the collMod command with the noPadding flag or the db.createCollection() method with the noPadding option.
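Both options can be sketched as follows, assuming the mongo shell's db handle is passed in as a parameter. The wrapper function names and the collection name in the usage note are hypothetical; the collMod command and the noPadding option are the ones named above.

```javascript
// Disable padding on an existing collection with the collMod command.
function disablePadding(db, collectionName) {
  return db.runCommand({ collMod: collectionName, noPadding: true });
}

// Create a new collection that never uses padding.
function createUnpaddedCollection(db, collectionName) {
  return db.createCollection(collectionName, { noPadding: true });
}
```

For example, disablePadding(db, "events") would apply to an existing insert-only collection named events.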

Prior to version 3.0, MongoDB used an allocation strategy that included dynamically calculated padding as a factor of the document size.

In the next section, we will discuss how to diagnose performance issues.

Diagnosing Performance Issues

Degraded performance in MongoDB is typically a function of the relationship among the following:

  • The quantity of data stored in the database

  • The amount of available system RAM

  • The number of connections to the database

  • The amount of time the database spends in the locked state

In some cases, performance issues are momentary and relate to traffic load, data access patterns, or the availability of hardware on the host system in virtualized environments.

Performance issues can also result from inadequate or inappropriate indexing strategies and poor schema design patterns. In other situations, performance issues may indicate that the database requires additional capacity.

We will continue our discussion on diagnosing performance issues in the next section.

Diagnosing Performance Issues (contd.)

The following are a few causes of performance degradation in MongoDB.

Locks: To ensure dataset consistency, MongoDB uses locks. However, if certain operations run for longer durations or queue up, performance degrades because requests and operations wait for a lock.

Lock-related slowdowns can be irregular.

To know if a lock affects your database performance, review the data in the globalLock section of the serverStatus output. If globalLock.currentQueue.total is consistently high, then it is possible that a large number of requests are waiting for a lock. This indicates that a possible concurrency issue is affecting your database performance.
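The check can be sketched as a function over the serverStatus document. The threshold is a hypothetical cut-off, not a MongoDB setting, and a single sample does not show that the queue is consistently high, so in practice the check would be repeated over time.

```javascript
// `status` is assumed to be the document returned by db.serverStatus().
// `threshold` is a hypothetical cut-off chosen for your workload.
function lockQueueReport(status, threshold) {
  const queue = status.globalLock.currentQueue;
  return {
    total: queue.total,
    readers: queue.readers,
    writers: queue.writers,
    possibleContention: queue.total > threshold
  };
}
```

In the shell, this could be called as lockQueueReport(db.serverStatus(), 10).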

Memory Usage: MongoDB uses memory-mapped files to store data. For a data set of a sufficient size, the MongoDB process allocates all the available memory on the system for its own use. This is part of the design for MongoDB’s enhanced performance. However, memory-mapped files make it difficult to determine if the amount of RAM is sufficient for a data set.

To determine MongoDB’s memory usage, check the memory usage metrics of the serverStatus output. In particular, check the resident memory in the mem.resident field. If this exceeds the system memory and there is a significant amount of data on the disk that is not in the RAM, then it means that you have exceeded the capacity of your system.

In addition, check the amount of mapped memory in the mem.mapped field. If this value is greater than the system memory, some operations will need page faults to read data from the virtual memory. This will negatively impact the system performance.
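A minimal sketch of both comparisons is shown below. It assumes that mem.resident and mem.mapped are reported in megabytes and that you supply the system memory yourself; the function and parameter names are hypothetical.

```javascript
// `status` is assumed to be the document returned by db.serverStatus().
// `systemMemoryMB` is the machine's RAM in megabytes, supplied by the caller.
function memoryReport(status, systemMemoryMB) {
  return {
    residentMB: status.mem.resident,
    mappedMB: status.mem.mapped,
    residentExceedsSystem: status.mem.resident > systemMemoryMB,
    mappedExceedsSystem: status.mem.mapped > systemMemoryMB
  };
}
```

For a host with 16 GB of RAM, the call would look like memoryReport(db.serverStatus(), 16384).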

We will discuss a few more performance issues in the next section.

Diagnosing Performance Issues (contd.)

A few more causes of performance issues are as follows:

Page Faults: When the MMAPv1 storage engine is used, page faults occur as MongoDB reads from or writes data to the data files that are not located in its physical memory.

In contrast, operating system page faults occur when the physical memory is exhausted and pages of the physical memory are swapped to the disk. Page faults triggered by MongoDB are reported as the total number of page faults in one second.

To check for page faults, view the extra_info.page_faults value in the serverStatus output. Page fault counters in MongoDB may increase drastically during periods of poor performance and may correlate with limited physical memory.

Page faults can increase when accessing larger data sets, such as when scanning an entire collection. Limited and sporadic page faults in MongoDB do not necessarily indicate an issue and do not require any corrective measure. To reduce the page fault frequency, increase the RAM available to MongoDB. Alternatively, deploy a sharded cluster or add shards and distribute the load among mongod instances.
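One way to watch for a drastic increase is to sample serverStatus twice and compare the values. The sketch below assumes that extra_info.page_faults behaves as a counter that only grows between the two samples; the function name is hypothetical.

```javascript
// `earlier` and `later` are assumed to be two db.serverStatus() documents
// taken `elapsedSeconds` apart.
function pageFaultsPerSecond(earlier, later, elapsedSeconds) {
  if (elapsedSeconds <= 0) {
    throw new RangeError("elapsedSeconds must be positive");
  }
  const delta = later.extra_info.page_faults - earlier.extra_info.page_faults;
  return delta / elapsedSeconds;
}
```

A rate that stays low suggests the sporadic faults described above, whereas a rate that climbs together with response times points toward memory pressure.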

Number of Connections: The server’s ability to handle a request depends on the number of connections between the application layer and the database. If the number of connections is too high, this can result in performance irregularities.

If the number of requests is high because of numerous concurrent application interactions, the database may not be able to meet the demand. In such a case, you need to increase the capacity of your deployment. For read-heavy applications, increase the size of your replica set and distribute read operations to secondary members.

For write-heavy applications, deploy sharding and add one or more shards to a sharded cluster and distribute the load among mongod instances. MongoDB has no limit on incoming connections unless it is constrained by system-wide limits.

In the next section, we will discuss optimization strategies for MongoDB.

Optimization Strategies for MongoDB

Many factors can affect database performance and responsiveness. These include index usage, query structure, data models, and application design. In addition, operational factors such as architecture and system configuration can also affect database performance.

The following techniques are used for evaluating the operational performance of MongoDB.

Database Profiler

MongoDB provides a database profiler that can be used to identify the performance characteristics of each operation against the database.

For example, using the profiler, you can identify the queries or write operations that are running slow. You can use this information to determine what indexes to create.
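A minimal sketch of this workflow follows, assuming the mongo shell's db handle is passed in as a parameter. Profiling level 1 records only operations slower than the given threshold, and the profiler writes its entries to the system.profile collection. The function names and the 100 ms threshold in the usage note are hypothetical choices.

```javascript
// Record operations slower than `slowMs` milliseconds.
function enableSlowOpProfiling(db, slowMs) {
  return db.setProfilingLevel(1, slowMs);
}

// Return the most recent profiled operations slower than `slowMs`.
function recentSlowOps(db, slowMs, count) {
  return db.getCollection("system.profile")
    .find({ millis: { $gt: slowMs } })
    .sort({ ts: -1 })
    .limit(count)
    .toArray();
}
```

In the shell, enableSlowOpProfiling(db, 100) followed later by recentSlowOps(db, 100, 5) would list up to five recent slow operations to consider for indexing.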

Capped Collections

Capped collections are circular, fixed-size collections that keep documents in insertion order even without the use of an index. Capped collections support very high-speed writes and sequential reads. For faster write operations, use capped collections.

$natural Order

To return documents in the order in which they exist on the disk, sort on the $natural operator. On a capped collection, this also returns the documents in the order in which they were written. The natural order does not use indexes but enables fast operations when you want to select the first or last items on the disk.
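The two techniques can be combined, as in the sketch below. It assumes the mongo shell's db handle and a collection object are passed in as parameters; the function names and the collection name in the usage note are hypothetical.

```javascript
// Create a fixed-size capped collection of `sizeBytes` bytes.
function createCappedCollection(db, name, sizeBytes) {
  return db.createCollection(name, { capped: true, size: sizeBytes });
}

// Fetch the last document on disk by sorting in reverse natural order.
function latestDocument(collection) {
  return collection.find().sort({ $natural: -1 }).limit(1).toArray()[0];
}
```

For example, after createCappedCollection(db, "log", 1048576), the most recently written entry could be read with latestDocument(db.log).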

In the next section, we will discuss how to evaluate the performance of current operations.

Evaluate Performance of Current Operations

You can evaluate the performance of operations using the following methods:

  • db.currentOp(): Use this method to evaluate mongod operations. This method reports the current operations running on a mongod instance.

  • cursor.explain() and db.collection.explain(): Use these explain methods to evaluate the performance of a query, such as the index MongoDB selects to fulfill the query and its execution statistics.

You can run the explain methods in the following three modes to control the amount of information returned:

  • queryPlanner

  • executionStats

  • allPlansExecution
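The three modes can be passed to cursor.explain() as a string. The sketch below wraps that call and rejects any other mode name; it assumes a mongo shell collection object is passed in, and the function name and the query in the usage note are hypothetical.

```javascript
const EXPLAIN_MODES = ["queryPlanner", "executionStats", "allPlansExecution"];

// `collection` is assumed to be a mongo shell collection such as db.posts.
function explainQuery(collection, query, mode) {
  if (EXPLAIN_MODES.indexOf(mode) === -1) {
    throw new Error("Unknown explain mode: " + mode);
  }
  return collection.find(query).explain(mode);
}
```

For example, explainQuery(db.posts, { author: "alice" }, "executionStats") would return the selected plan together with its execution statistics.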

In the next section, we will discuss how to optimize query performance.

Optimize Query Performance

You can optimize query performance in the following ways:

Create Indexes to Support Queries:

Typically, scanning an index is much faster than scanning a collection. For commonly issued queries, create indexes for quick returns. If a query searches on multiple fields, create a compound index for the query. Index structures are smaller than the documents they reference, and they store references in order.

Limit Query Results to Reduce Network Demand:

MongoDB cursors return results in groups of multiple documents. If you know how many results you want, use the limit() method to reduce the demand for network resources. For example, to get 10 results from a query to the posts collection, issue the following command:

db.posts.find().sort( { timestamp : -1 } ).limit(10)

Use Projections to Return Only Necessary Data:

When you need only a subset of fields from documents, you can achieve better performance by returning only those fields. For example, suppose that your query to the posts collection needs only the timestamp, title, author, and abstract fields.

You would then issue the following command:

db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1 } )

Use $hint to Select a Particular Index:

You can force MongoDB to use a specific index using the hint() method. Use the hint() method to support performance testing, or on queries where you must select a field or fields included in several indexes.

Use the Increment Operator to Perform Operations Server-Side

If a field is numeric, use the $inc operator to increment or decrement its value in documents. This operator executes the logic on the server side rather than making changes on the client side before sending the whole document to the server. It also helps avoid race conditions.
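A server-side increment can be sketched as follows. The code assumes a mongo shell collection object is passed in and uses its update() method; the function name and the views field in the usage note are hypothetical.

```javascript
// Ask the server to add `amount` to `field` on the matching document.
// A negative `amount` decrements the field.
function incrementField(collection, filter, field, amount) {
  const update = { $inc: {} };
  update.$inc[field] = amount;
  return collection.update(filter, update);
}
```

For example, incrementField(db.posts, { title: "Hello" }, "views", 1) sends only the small $inc document to the server, so two clients that increment at the same time cannot overwrite each other's result.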

In the next section, we will discuss the monitoring strategies used in MongoDB.

Monitoring Strategies for MongoDB

Monitoring is a critical component of all database administration.

The following three strategies are used to collect data and monitor a MongoDB instance:

  • Utilities distributed with MongoDB provide real-time reporting of database activities.

  • Database commands return statistics regarding the current database state with greater reliability.

  • MMS monitoring collects data from running MongoDB deployments and provides visualization and alerts based on that data.

Note that these strategies are complementary. Each strategy can help answer different questions and is useful in different contexts.

In the next section, we will discuss MongoDB Utilities.

MongoDB Utilities

MongoDB includes a number of utilities that quickly return statistics about the performance and activity of an instance.

Typically, these are useful for diagnosing issues and assessing normal MongoDB operation. These utilities are as follows:

mongostat: It captures and returns the counts of database operations by type, for example, insert, query, update, and delete. These counts report the load distribution on the server. You can use mongostat to understand the distribution of operation types and to inform capacity planning.

mongotop: The mongotop utility tracks the current read and write activities of a MongoDB instance and reports their statistics on a per-collection basis. You can use mongotop to check whether your database activity matches your expectations.

HTTP Console: MongoDB provides a Web interface that exposes diagnostic and monitoring information in a simple Web page. You can access this Web interface at localhost:<port>, where the port number is 1000 more than the mongod port.

For example, if a locally running mongod is using the default port 27017, you can access the HTTP console from the link given below.

http://localhost:28017

In the next section, we will discuss MongoDB commands.

MongoDB Commands

MongoDB uses a number of commands that report the state of the database. This data may provide a finer level of granularity than the utilities. You can use their output in scripts and programs to either develop custom alerts or to modify the behavior of your application.

The commands are as follows:

db.currentOp(): This method helps to identify the in-progress operations of a database instance.

serverStatus: The serverStatus command, or the db.serverStatus() method from the shell, returns an overview of the database status. This command provides details of the disk usage, memory usage, connections, journaling, and index access. The command returns quickly and does not impact the MongoDB performance.

dbStats: This command, when executed from the shell, returns a document that describes storage usage and data volumes. dbStats reflects the amount of storage used, the quantity of data contained in the database, and the object, collection, and index counters.

collStats: The collStats command, or the db.collection.stats() method executed from the shell, provides statistics such as the number of documents in the collection, the size of the collection, the disk space used by the collection, and information regarding its indexes.

replSetGetStatus or rs.status(): replSetGetStatus from the shell returns an overview of the replica set’s status. This command provides details of the state and configuration of the replica set and also provides statistics about its members. These metrics are very useful to check whether a replica set is properly configured or not.

mongodb-ganglia: This is a Python script for Ganglia that reports information such as memory usage, B-tree statistics, master-slave status, and current connections. In addition, mtop, Munin, and Nagios are tools that can be used to check server statistics.

In the next section, we will discuss MongoDB Management Service, or MMS.

MongoDB Management Service (MMS)

MMS is a cloud service that helps you monitor, back up, and scale MongoDB on the infrastructure of your choice. You can monitor more than 100 system metrics and get custom alerts before your system starts degrading.

You can create your own alerts and integrate monitoring with the tools you already use.

In the next section, we will discuss the data backup strategies in MongoDB.

Data Backup Strategies in MongoDB

When you deploy MongoDB in production, you should have a strategy for capturing and restoring backups to prepare for data loss events.

You can perform a backup of MongoDB clusters in the following ways:

  • Back up by copying the underlying data files

  • Back up a database with mongodump

  • Use the MMS cloud backup

In the next section, we will discuss how to copy underlying data files.

Copying Underlying Data Files

You can create a backup for MongoDB by copying its underlying data files. If the volume where MongoDB stores data files supports point-in-time snapshots, you can use these snapshots to create backups of a MongoDB system at an exact moment in time. File system snapshots are not specific to MongoDB.

The mechanics of snapshots depend on the underlying storage system. On a Linux computer, the Logical Volume Manager, or LVM, can create a snapshot.

To get a correct snapshot of a running mongod process, you must enable journaling. The journal must reside on the same logical volume as the other MongoDB data files. If journaling is not enabled, there is no guarantee that the snapshot will be consistent or valid.

To get a consistent snapshot of a sharded cluster, first disable the balancer, and then capture a snapshot from every shard and the config server simultaneously.

If your storage system does not support snapshots, you can copy the files directly using cp, rsync, or a similar tool. Copying multiple files is not an atomic operation. Hence, stop all the writes to mongod before copying the files. Otherwise, the files will be copied in an invalid state.

Backups created by copying the underlying data do not support point-in-time recovery for the replica sets and are difficult to manage for larger sharded clusters. These backups are huge because they include the indexes and duplicate the underlying storage padding and fragmentation.

On the other hand, mongodump creates smaller backups.

In the next section, we will discuss backup with mongodump.

Backup with MongoDump

The mongodump tool reads data from a MongoDB database and creates BSON files. The mongorestore tool populates a MongoDB database with the data from these BSON files. These tools are simple and efficient for backing up small MongoDB deployments. However, they are not suitable for backing up larger systems.

mongodump and mongorestore can operate against a running mongod process. These tools can also manipulate the underlying data files directly. By default, mongodump does not capture the contents of the local database. It captures only the documents in a database, not the index data.

Although the resulting backup is space efficient, mongorestore or mongod must rebuild the indexes after restoring data. When connected to a MongoDB instance, mongodump can adversely affect the mongod performance.

If your data volume is larger than the available system memory, the queries that mongodump issues will push the working set out of the memory. To mitigate the impact of mongodump on the performance of the replica set, use mongodump to capture backups from a secondary member of the replica set.

Alternatively, you can shut down a secondary and use mongodump with the data files directly. If you shut down a secondary to capture data with mongodump, ensure that the operation completes before the secondary's oplog becomes too stale to continue replicating.

To restore a point-in-time backup created with the --oplog option of mongodump, use mongorestore with the --oplogReplay option. If applications modify data while mongodump is creating a backup, mongodump will compete for resources with those applications.

In the next section, we will discuss fsync and lock.

Fsync and Lock

The mongodump and mongorestore tools allow you to perform data backups without shutting down the MongoDB server. However, you lose the ability to get a point-in-time view of the data.

MongoDB’s fsync command allows you to copy the data directory of a running MongoDB server without any risk of corruption. The fsync command forces the MongoDB server to send all the pending writes to the disk.

Optionally, it also prevents any further writes to the database until the server is unlocked. This write-lock feature makes the fsync command suitable for backups.

The example given below shows how to run the command from the shell, forcing an fsync, and acquiring a write lock.

> use admin

> db.runCommand({"fsync" : 1, "lock" : 1});

At this point, the data directory represents a consistent, point-in-time snapshot of your data. As the server is locked for writes, you can safely make a copy of the data directory. This works especially well on a snapshotting file system or volume manager, such as LVM, which allows a quick snapshot of the data directory.

After performing the backup, unlock the database again, either with the db.fsyncUnlock() shell helper or with the command given below.

> db.$cmd.sys.unlock.findOne();

Next, run db.currentOp() to ensure that the lock is released. Note that it may take a moment after unlock is first requested.

The fsync command allows you to perform a flexible backup without shutting down the server or sacrificing the point-in-time nature of the backup. However, writes are blocked while the lock is held. The only way to have a point-in-time snapshot without any downtime for reads or writes is to back up from a secondary.

In the next section, we will discuss Ops Manager, the backup software in MongoDB.

MongoDB Ops Manager Backup Software

MongoDB Ops Manager is a service that allows you to manage, monitor, and back up the MongoDB infrastructure.

Ops Manager provides the following services:

  • Ops Manager Backup provides scheduled snapshots and point-in-time recovery of your MongoDB replica sets and sharded clusters. It can also create snapshots of standalones that are run as single-member replica sets.

  • A lightweight Backup Agent runs within your infrastructure and backs up data from the MongoDB processes you have specified.

In the next section, we will discuss security strategies in MongoDB.

Security Strategies in MongoDB

To maintain a secure MongoDB deployment, you need to implement controls and ensure that users and applications have access to only those data required for their job roles.

MongoDB provides various features that allow you as an administrator to implement these controls and restrictions for deploying any MongoDB instance.

Following are some of the security strategies in MongoDB:

Authentication: It is the mechanism for verifying user and instance access to MongoDB.

Authorization: This allows you to control a user’s or an application’s access to MongoDB instances.

Collection-Level Access Control: This allows you to scope privileges to specific collections.

Network Exposure and Security: This covers potential security risks related to the network and strategies for decreasing the possible network-based attack vectors for MongoDB.

Security and MongoDB API Interfaces: This allows you to reduce and control the potential risks related to MongoDB’s JavaScript, HTTP, and REST interfaces.

Auditing: This includes auditing server and client activities for mongod and mongos instances.

In the next section, we will discuss authentication implementation.

Authentication Implementation in MongoDB

Before gaining access to a system, clients should identify themselves to MongoDB. This ensures that a client cannot access the MongoDB data without authentication.

MongoDB supports various authentication mechanisms to enable clients to verify their identities. It supports a password-based challenge-response protocol and x.509 certificates.

Additionally, MongoDB Enterprise provides support for Lightweight Directory Access Protocol, or LDAP, proxy authentication and Kerberos authentication.

x.509 Certificate Authentication: This authentication mechanism was introduced in MongoDB version 2.6. It supports x.509 certificate authentication for use with a secure SSL connection. To authenticate to servers, clients can use x.509 certificates instead of usernames and passwords. For membership authentication, members of sharded clusters and replica sets can use x.509 certificates instead of key files.

Kerberos Authentication: MongoDB Enterprise supports Kerberos service authentication. This is an industry-standard authentication protocol for large client-server systems. To use MongoDB with Kerberos, you must have a properly configured Kerberos deployment, configure Kerberos service principals for MongoDB, and add Kerberos user principals to MongoDB.

LDAP Proxy Authority Authentication: MongoDB Enterprise supports proxy authentication through an LDAP service.

In the next section, we will discuss authentication in a replica set and sharded cluster.

Authentication in a Replica Set

You can authenticate the members of replica sets and sharded clusters. To authenticate the members of a single MongoDB deployment to each other, use the keyFile or x.509 mechanisms.

Using keyFile authentication for members also enables authorization. You also need to run replica sets and sharded clusters in a trusted networking environment.

Ensure that the network permits only trusted traffic to reach each mongod and mongos instance by using:

  • A firewall and network routing

  • Virtual private networks, or VPNs, or wide area networks, or WANs

Also ensure that:

  • Your network configuration allows every member of the replica set or sharded cluster to contact every other member.

  • The keyFile configuration is done on all the members to permit authentication.

In the next section, we will discuss authentication on sharded clusters.

Authentication on Sharded Clusters

In sharded clusters, applications authenticate directly to mongos instances, using the credentials stored in the admin database of the config servers.

The shards in the sharded cluster also have credentials, and clients can authenticate directly to the shards to perform maintenance on the shards. In general, applications and clients should connect to the sharded cluster through the mongos.

Some maintenance operations, such as cleanupOrphaned, compact, and rs.reconfig(), require direct connections to the specific shards in a sharded cluster. To perform these operations with authentication enabled, you must connect directly to the shard and authenticate as a shard local administrative user.

To create a shard local administrative user, connect directly to the shard and create the user. MongoDB stores shard local users in the admin database of the shard itself. These shard local users are completely independent from the users added to the sharded cluster via mongos.

Shard local users are local to the shard and are inaccessible by mongos. Direct connections to a shard should only be for shard-specific maintenance and configuration.

In the next section, we will discuss authorization.

Authorization

Access control or authorization defines a user’s access to a system’s resources and operations. Ideally, users should be able to perform only those operations that are required to fulfill their defined job roles.

This is the “principle of least privilege”, which limits the potential damage from a compromised application. MongoDB’s role-based access control system lets you, as an administrator, control all access and ensure that granted access applies as narrowly as possible.

MongoDB does not enable authorization by default. When you enable authorization, MongoDB requires authentication for all the connections.

After authorization is enabled, MongoDB controls each user’s access through the roles assigned to that user. Each role consists of a set of privileges, and each privilege consists of a resource and the actions (operations) permitted on that resource. Every user may have one or more assigned roles that describe their access. MongoDB provides several built-in roles, such as the read, readWrite, dbAdmin, and root roles.

Users can also have customized roles suited to clients’ requirements. You can enable authorization by starting mongod with the --auth or the --keyFile option. If you are using a configuration file, you can enable authorization with the security.authorization or the security.keyFile setting.
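To make the configuration file settings concrete, here is a small Python sketch that renders the security section of a YAML configuration file. The function name and the example path are assumptions for illustration, and the sketch assumes a simple path that needs no YAML quoting.

```python
from typing import Optional


def render_security_section(key_file: Optional[str] = None) -> str:
    """Return the text of a 'security' section that enables authorization.

    If key_file is given, security.keyFile is added as well so that
    members can authenticate to each other.
    """
    lines = ["security:", "  authorization: enabled"]
    if key_file is not None:
        lines.append(f"  keyFile: {key_file}")
    return "\n".join(lines) + "\n"


# Example with a hypothetical key file location.
print(render_security_section("/etc/mongodb/keyfile"))
```

The command-line equivalents of these two settings are the --auth and --keyFile options.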

You can also create new roles and privileges to cater to operational needs. You can scope privileges as narrowly as a single collection. When you grant a role to a user, the user receives all the privileges associated with the role.

A user can have several concurrent roles. In such cases, the user receives all the privileges of the respective roles.
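The relationship between roles, privileges, resources, and actions can be sketched in Python as follows. The first function builds the document for MongoDB's createRole command with a privilege scoped to one collection. The second is a deliberately simplified model of how the privileges of several roles combine; it ignores inherited roles and wildcard resources, so it is not how the server resolves access. All function, role, database, and collection names are hypothetical.

```python
from typing import Iterable, List, Set


def build_create_role_command(role_name: str, db: str, collection: str,
                              actions: Iterable[str]) -> dict:
    """Build a createRole command with one collection-level privilege."""
    return {
        "createRole": role_name,
        "privileges": [
            {
                "resource": {"db": db, "collection": collection},
                "actions": list(actions),
            }
        ],
        "roles": [],  # no inherited roles in this sketch
    }


def effective_actions(roles: List[dict], db: str, collection: str) -> Set[str]:
    """Return the union of actions the given roles grant on one collection."""
    granted: Set[str] = set()
    for role in roles:
        for privilege in role["privileges"]:
            if privilege["resource"] == {"db": db, "collection": collection}:
                granted.update(privilege["actions"])
    return granted


reader = build_create_role_command("ordersReader", "sales", "orders", ["find"])
writer = build_create_role_command("ordersWriter", "sales", "orders",
                                   ["insert", "update"])
# A user holding both roles receives the privileges of each.
print(sorted(effective_actions([reader, writer], "sales", "orders")))
```

Keeping each custom role this narrow makes it easier to grant only what a job role actually requires.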

In the next section, we will discuss end-to-end auditing for compliance.

End-to-End Auditing for Compliance

As an administrator, you need to implement security policies to control the activities in a system.

Auditing enables you to verify that the implemented security policies are controlling the activities as desired. Retaining the audit information ensures that you have enough information to perform forensic investigations and comply with the regulations and policies that require the audit data.

The auditing facility allows you to track system activity for deployments with multiple users and applications. It can write audit events to the console, the syslog, a JSON file, or a BSON file.
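The four audit outputs map onto the mongod options --auditDestination, --auditFormat, and --auditPath, which, to my knowledge, are available in MongoDB Enterprise. The Python sketch below assembles the matching argument list and rejects combinations that do not make sense. The function name and the example path are assumptions for illustration.

```python
from typing import List, Optional


def build_audit_args(destination: str,
                     audit_format: Optional[str] = None,
                     path: Optional[str] = None) -> List[str]:
    """Return mongod command-line arguments for one audit output."""
    if destination in ("console", "syslog"):
        # These destinations take no file format or path.
        if audit_format is not None or path is not None:
            raise ValueError("format and path apply only to file output")
        return ["--auditDestination", destination]
    if destination == "file":
        if audit_format not in ("JSON", "BSON"):
            raise ValueError("file output needs a format of JSON or BSON")
        if not path:
            raise ValueError("file output needs a path")
        return ["--auditDestination", "file",
                "--auditFormat", audit_format,
                "--auditPath", path]
    raise ValueError(f"unknown audit destination: {destination}")


# Example: write audit events to a JSON file at a hypothetical location.
print(build_audit_args("file", "JSON", "/var/log/mongodb/audit.json"))
```

Validating the combination before starting the server avoids discovering a missing path or format only when mongod fails to start.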

Summary

Here is a quick recap of what was covered in this lesson:

  • Capped collections in MongoDB are collections with predefined sizes that support insert and retrieve operations.

  • You can create a capped collection by specifying its size, query it, check whether a collection is capped, and convert an existing collection to capped.

  • The GridFS specification splits a file into different parts or chunks and stores each unit as a separate document.

  • GridFS uses two collections—fs.files and fs.chunks.

  • A memory-mapped file contains data stored by the operating system in its memory.

  • MongoDB supports multiple storage engines, such as MMAPv1 and WiredTiger, and allows you to choose the one best suited to your needs.

  • The causes of performance issues in MongoDB include locks, memory usage, page faults, and the number of connections.

Conclusion

This concludes the chapter Administration of MongoDB Cluster Operations, the last lesson of the MongoDB Developer and Administrator Course.
