Indexes in MongoDB

As in any database, indexes in MongoDB support the efficient execution of queries. Without them, MongoDB must scan every document in a collection to select those that match the query. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.

Key Points for Indexing

As you create indexes, consider the following behaviors of indexes:

  • Each index requires at least 8 kB of data space.
  • Adding an index has some negative performance impact for write operations. For collections with a high write-to-read ratio, indexes are expensive since each insert must also update any indexes.
  • Collections with a high read-to-write ratio often benefit from additional indexes. Indexes do not affect un-indexed read operations.
  • When active, each index consumes disk space and memory. This usage can be significant and should be tracked for capacity planning, especially where working set size is a concern.
  • Querying only the index can be much faster than querying documents outside of the index. Index keys are typically smaller than the documents they catalog, and indexes are typically available in RAM or located sequentially on disk.

Types of Indexes in MongoDB

Default _id Index

MongoDB creates this index automatically when you create a new collection. The _id field acts as the primary key of the collection, and the index on it is unique, so you can’t insert two documents with the same _id value. If you don’t specify a value for _id when inserting a document, MongoDB generates one for you. You can’t remove this index.
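
To make the uniqueness rule concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB is implemented: the class TinyCollection is hypothetical, a Map stands in for the unique _id index, and a simple counter stands in for MongoDB's generated identifiers.

```javascript
// Hypothetical model of a collection with a unique _id index.
class TinyCollection {
  constructor() {
    this.byId = new Map(); // plays the role of the unique _id index
    this.counter = 0;      // stand-in for MongoDB's generated identifiers
  }

  // Produce an _id that is not yet present in the index.
  generateId() {
    let candidate;
    do {
      this.counter += 1;
      candidate = `auto-${this.counter}`;
    } while (this.byId.has(candidate));
    return candidate;
  }

  insert(doc) {
    // Use the caller's _id if given, otherwise generate one.
    const _id = doc._id !== undefined ? doc._id : this.generateId();
    if (this.byId.has(_id)) {
      throw new Error(`duplicate _id: ${_id}`);
    }
    const stored = { ...doc, _id };
    this.byId.set(_id, stored);
    return stored;
  }
}

const coll = new TinyCollection();
coll.insert({ _id: 7, name: "mongo" });
const generated = coll.insert({ name: "no id supplied" });
console.log(generated._id);

try {
  coll.insert({ _id: 7, name: "another" });
} catch (err) {
  console.log(err.message);
}
```

The second insert with _id 7 is rejected, which is the behavior the unique _id index enforces.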

Single Field Index

Use this index type when you want to index a single field other than _id.

Example:

db.myColl.createIndex( { name: 1 } )

This creates a single-key ascending index on the name field in the myColl collection.

Compound Index

You can also create an index on multiple fields using a compound index. For this index type, the order in which the fields are listed in the index definition matters. Consider this example:

db.myColl.createIndex({ name: 1, score: -1 })

The entries in this index are sorted first by name in ascending order and then, within each name value, by score in descending order.
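
To see what the ordering of a compound index on { name: 1, score: -1 } looks like, here is a small sketch in plain JavaScript. The sample documents and the compareKeys function are made up for illustration; the code only models the ordering of index entries and does not talk to MongoDB.

```javascript
// Hypothetical in-memory documents; only `name` and `score` matter here.
const docs = [
  { name: "bob", score: 70 },
  { name: "alice", score: 85 },
  { name: "bob", score: 95 },
  { name: "alice", score: 60 },
];

// Compare two index keys the way { name: 1, score: -1 } orders them:
// name ascending first, then score descending within equal names.
function compareKeys(a, b) {
  if (a.name !== b.name) return a.name < b.name ? -1 : 1;
  return b.score - a.score;
}

// Build the ordered list of index entries (key fields only).
const indexEntries = docs
  .map(({ name, score }) => ({ name, score }))
  .sort(compareKeys);

for (const entry of indexEntries) {
  console.log(entry.name, entry.score);
}
```

Because the entries are grouped by name first, an index like this can serve queries on name alone or on name and score together, but it is of little help for queries on score alone. That is why the field order matters.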

Multikey Index

Use this index type to index array data. If a field holds an array, MongoDB creates a separate index entry for each element of the array. You don’t need to request this explicitly: if the indexed field is an array, MongoDB automatically creates a multikey index on it.

Consider this example:

{
    'userid': 1,
    'name': 'mongo',
    'addr': [
        {zip: 12345, ...},
        {zip: 34567, ...}
    ]
}

You can create a multikey index on the zip field inside the addr array by issuing this command in the mongo shell. Note that a dotted field name must be quoted:

db.myColl.createIndex({ "addr.zip": 1 })
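
The idea of one index entry per array element can be sketched in plain JavaScript as follows. The helper multikeyEntries is hypothetical, handles only a single array level, and uses userid as a stand-in for the internal document reference. The fields elided in the example document are simply left out.

```javascript
// The example document, with the elided fields left out.
const doc = {
  userid: 1,
  name: "mongo",
  addr: [{ zip: 12345 }, { zip: 34567 }],
};

// Hypothetical helper: collect one index key per array element for a
// dotted path such as "addr.zip" (assumes the field exists, one array level).
function multikeyEntries(document, path) {
  const [arrayField, subField] = path.split(".");
  const value = document[arrayField];
  // A non-array value yields a single entry, like an ordinary index.
  const elements = Array.isArray(value) ? value : [value];
  return elements.map((element) => ({
    key: element[subField],
    docRef: document.userid, // stand-in for the document reference
  }));
}

const entries = multikeyEntries(doc, "addr.zip");
console.log(entries);
```

Both entries point back to the same document, so a query on either zip value can find it through the index.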

Geospatial Index

Suppose you have stored some coordinates in a MongoDB collection. To index fields that hold geospatial data, you can use a geospatial index. MongoDB supports two types of geospatial indexes.

  • 2d Index: Use this index for data that is stored as points on a 2D plane.

db.collection.createIndex( { <location field> : "2d" } )

  • 2dsphere Index: Use this index when your data is stored as GeoJSON objects or as coordinate pairs (longitude, latitude).

db.collection.createIndex( { <location field> : "2dsphere" } )
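
As an illustration of the data a 2dsphere index works with, the sketch below builds a GeoJSON Point in plain JavaScript. The helper makePoint, the field name location, and the sample coordinates are assumptions made for this example. The one detail worth remembering is that GeoJSON lists longitude first, then latitude.

```javascript
// Hypothetical helper that builds a GeoJSON Point for a geospatial field.
// GeoJSON lists longitude first, then latitude.
function makePoint(longitude, latitude) {
  if (longitude < -180 || longitude > 180) {
    throw new RangeError("longitude must be between -180 and 180");
  }
  if (latitude < -90 || latitude > 90) {
    throw new RangeError("latitude must be between -90 and 90");
  }
  return { type: "Point", coordinates: [longitude, latitude] };
}

// Hypothetical document; the field name `location` is an assumption.
const place = { name: "sample place", location: makePoint(77.45, 28.67) };
console.log(JSON.stringify(place));
```

A document shaped like this could then be indexed with { location: "2dsphere" }.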

Text Index

To support queries that search for text in a collection, you can use a text index.

Example:

db.myColl.createIndex( { address: "text" } )
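
Conceptually, a text index maps words to the documents that contain them. The plain JavaScript sketch below shows that idea with a simple inverted index. It is a simplification: the sample documents and the functions tokenize and buildTextIndex are made up, and a real MongoDB text index also handles things like stemming and stop words, which this sketch ignores.

```javascript
// Hypothetical documents with an `address` field.
const docs = [
  { _id: 1, address: "12 Park Street, Springfield" },
  { _id: 2, address: "7 Lake Road, Springfield" },
  { _id: 3, address: "99 Park Avenue, Shelbyville" },
];

// Split a string into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build a map from each term to the set of _id values containing it.
function buildTextIndex(documents, field) {
  const index = new Map();
  for (const d of documents) {
    for (const term of tokenize(d[field])) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(d._id);
    }
  }
  return index;
}

const textIndex = buildTextIndex(docs, "address");
console.log([...textIndex.get("park")]); // ids of documents mentioning "park"
```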

Hashed Index

MongoDB supports hash-based sharding. A hashed index stores the hash of the value of the indexed field rather than the value itself. MongoDB can use such an index as a hashed shard key to partition the data across your cluster.

Example:

db.myColl.createIndex( { _id: "hashed" } )
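
The reason hashing helps with sharding is that keys that are close together, such as increasing ids, end up scattered. The plain JavaScript sketch below illustrates this under loud assumptions: the FNV-1a hash is only a stand-in for MongoDB's own hash function, and picking a shard with a modulo is a simplification of how MongoDB actually assigns ranges of hashed values to shards.

```javascript
// Stand-in 32-bit FNV-1a hash; MongoDB uses its own hash function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Simplified placement: choose a shard from the hash of the key.
function shardFor(key, shardCount) {
  return fnv1a(String(key)) % shardCount;
}

// Count how many of 3000 sequential ids land on each of 3 shards.
const counts = [0, 0, 0];
for (let id = 1; id <= 3000; id++) {
  counts[shardFor(id, 3)] += 1;
}
console.log(counts);
```

With range-based placement, sequential ids would all go to the same shard at first. With a hash, the counts printed here should come out spread across the three shards.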

Index Scan

Rather than searching through every document, we can search through the ordered index first. You can think of an index as a list of key-value pairs, where the key is the value of the field that we’ve indexed on, and the value is a reference to the actual document.

Remember that _id is automatically indexed.
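
The difference between scanning the collection and searching an ordered index can be sketched in plain JavaScript. Everything here is a toy model: the sample collection is made up, a sorted array with binary search stands in for the real index structure, and indexScan returns only the first match it finds.

```javascript
// Hypothetical collection, stored in insertion order.
const collection = [
  { _id: 1, name: "mongo" },
  { _id: 2, name: "atlas" },
  { _id: 3, name: "zed" },
  { _id: 4, name: "compass" },
  { _id: 5, name: "shell" },
];

// Index on `name`: sorted [key, position] pairs pointing into the collection.
const nameIndex = collection
  .map((doc, position) => [doc.name, position])
  .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));

// Collection scan: inspect documents one by one.
function collectionScan(name) {
  let examined = 0;
  for (const doc of collection) {
    examined += 1;
    if (doc.name === name) return { doc, examined };
  }
  return { doc: null, examined };
}

// Index scan: binary search over the sorted keys, then fetch the document.
function indexScan(name) {
  let low = 0;
  let high = nameIndex.length - 1;
  let examined = 0;
  while (low <= high) {
    const mid = (low + high) >> 1;
    const [key, position] = nameIndex[mid];
    examined += 1;
    if (key === name) return { doc: collection[position], examined };
    if (key < name) low = mid + 1;
    else high = mid - 1;
  }
  return { doc: null, examined };
}

console.log(collectionScan("shell").examined, indexScan("shell").examined);
```

The gap between the two counts grows with the size of the collection: the scan grows linearly, while the search over the ordered index grows only logarithmically.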

It is possible to have many indexes on the same collection. You might create multiple indexes on different fields if you find that you have different queries for different fields.

MongoDB uses a data structure called a B-tree to store its indexes.

The awesome query performance gain that we get with indexes doesn’t come for free. With each additional index, we decrease our write speed for a collection. If a document changes or is removed, one or more of our B-trees might need to be rebalanced. This means that we need to be careful when creating indexes. We don’t want too many unnecessary indexes on a collection, because they cause an unnecessary loss in insert, update, and delete performance. By now, you should have a good idea of what indexes are, their pros and cons, and how they work.
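
The write cost can be made visible with a small model in plain JavaScript. The class IndexedCollection is hypothetical, and a sorted array stands in for B-tree maintenance. The only point is that every insert has to touch every index on the collection.

```javascript
// Hypothetical model: a collection plus one sorted key list per indexed field.
class IndexedCollection {
  constructor(indexedFields) {
    this.docs = [];
    this.indexes = new Map(indexedFields.map((field) => [field, []]));
    this.indexWrites = 0; // how many index updates the inserts caused
  }

  insert(doc) {
    this.docs.push(doc);
    // Every index must receive an entry for the new document.
    for (const [field, keys] of this.indexes) {
      keys.push(doc[field]);
      // Re-sorting is a stand-in for keeping a B-tree ordered.
      keys.sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
      this.indexWrites += 1;
    }
  }
}

const noIndexes = new IndexedCollection([]);
const threeIndexes = new IndexedCollection(["name", "score", "city"]);

for (let i = 0; i < 100; i++) {
  const doc = { name: `user${i}`, score: i % 10, city: `city${i % 5}` };
  noIndexes.insert(doc);
  threeIndexes.insert(doc);
}

console.log(noIndexes.indexWrites, threeIndexes.indexWrites);
```

The same 100 inserts cause no index updates on the first collection and three per document on the second, which is the overhead the paragraph above warns about.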

Conclusion

One obvious takeaway is: create indexes. Based on your queries, you can define different types of indexes on your collections. If you don’t create indexes, each query scans the full collection, which takes a lot of time, makes your application slow, and uses a lot of your server’s resources. On the other hand, don’t create too many indexes either, because unnecessary indexes add overhead to every insert, update, and delete. When you perform any of these operations on an indexed field, the same change has to be applied to the index tree as well, which takes time. MongoDB works best when its indexes fit in RAM, so irrelevant indexes can eat up memory and slow down your server.

Reference

https://docs.mongodb.com/manual/applications/indexes/

Written by 

Munander is a Software Consultant at Knoldus Software LLP. He holds a B.Tech from IMS Engineering College, Ghaziabad. He has a decent knowledge of C, C++, Java, Angular, and Lagom. He always tries to explore new technologies. His hobbies include playing cricket and adventure.