Introduction

When using MongoDB, you have the ability to be flexible with the structure of your data. You are not locked into maintaining a certain schema that all of your documents must fit into. For any given field in a document, you are able to use any of the available data types supported by MongoDB. Despite this default way of working, you are able to impose a JSON Schema in MongoDB to add validation on your collections if desired. We won't go into the details of schema design in this guide, but it can have an effect on data typing if implemented.

Data types specify a general pattern for the data they accept and store. It is paramount to understand when to choose a certain data type over another when planning your database. The type chosen is going to dictate how you're able to operate on your data and how it is stored.

JSON and BSON

Before getting into the details of specific data types, it is important to have an understanding of how MongoDB stores data. MongoDB and many other document-based NoSQL databases use JSON (JavaScript Object Notation) to represent data records as documents.

Storing data as JSON has several advantages:

  • it is easy to read and learn, and familiar to most developers
  • it is flexible in format, whether sparse, hierarchical, or deeply nested
  • it is self-describing, which makes it easy for applications to operate on JSON data
  • it keeps the focus on a minimal number of basic types

JSON supports all the basic data types like string, number, boolean, etc. MongoDB actually stores data records as Binary-encoded JSON (BSON) documents. Like JSON, BSON supports the embedding of documents and arrays within other documents and arrays. BSON allows for additional data types that are not available to JSON.
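
As a rough illustration of why the additional types matter, the following plain JavaScript sketch (run in Node.js, not against MongoDB) round-trips an object through JSON. The variable names are made up for this example. The date does not come back as a date, because JSON has no date type:

```javascript
// A JavaScript object holding a Date and a number.
const original = { createdAt: new Date(0), count: 42 };

// Serialize to JSON text and parse it back.
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.createdAt instanceof Date); // true
console.log(typeof roundTripped.createdAt);      // "string"
console.log(roundTripped.count);                 // 42
```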

What are the data types in MongoDB?

Before going into detail, let's take a broad view of what data types are supported in MongoDB.

MongoDB supports a range of data types suitable for various types of both simple and complex data. These include:

Text

  • String

Numeric

  • 32-Bit Integer
  • 64-Bit Integer
  • Double
  • Decimal128

Date/Time

  • Date
  • Timestamp

Other

  • Object
  • Array
  • Binary Data
  • ObjectId
  • Boolean
  • Null
  • Regular Expression
  • JavaScript
  • Min Key
  • Max Key

In MongoDB, each BSON type has both an integer identifier and a string identifier (its alias). We'll cover the most common of these in more depth throughout this guide.
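
The number and alias pairs can be collected into a small lookup table. This is a plain JavaScript sketch for reference only: BSON_TYPES, aliasFor, and numberFor are hypothetical helpers, not part of MongoDB or its drivers, and the table lists only a subset of the BSON types.

```javascript
// BSON type numbers mapped to their string aliases (subset).
const BSON_TYPES = {
  1: "double",
  2: "string",
  3: "object",
  4: "array",
  5: "binData",
  7: "objectId",
  8: "bool",
  9: "date",
  10: "null",
  11: "regex",
  13: "javascript",
  16: "int",
  17: "timestamp",
  18: "long",
  19: "decimal",
};

// Look up the alias for a type number, or null if it is not in the table.
function aliasFor(typeNumber) {
  return BSON_TYPES[typeNumber] || null;
}

// Look up the type number for an alias, or null if it is not in the table.
function numberFor(alias) {
  const entry = Object.entries(BSON_TYPES).find(([, a]) => a === alias);
  return entry ? Number(entry[0]) : null;
}

console.log(aliasFor(2));       // "string"
console.log(numberFor("long")); // 18
```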

String types

The string type is the most commonly used MongoDB data type. Any value written inside of double quotes "" in JSON is a string value. Any value that you want to be stored as text is going to be best typed as a String. BSON strings are UTF-8 and are represented in MongoDB as:

Type | Number | Alias |
------------------ | ------ | -------- |
String | 2 | "string" |

Generally, drivers for programming languages convert from the language's native string format to UTF-8 when serializing and deserializing BSON. This makes BSON a convenient way to store international characters.
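
To give a feel for what the UTF-8 conversion means, here is a small Node.js sketch. It does not use a MongoDB driver; utf8ByteLength is a hypothetical helper that only reports how many bytes a JavaScript string occupies once encoded as UTF-8.

```javascript
// JavaScript strings are UTF-16 internally; Buffer.byteLength reports the
// size of the same text once it is encoded as UTF-8.
function utf8ByteLength(text) {
  return Buffer.byteLength(text, "utf8");
}

const plain = "Alex";
const accented = "Zo\u00eb"; // "Zoë", written with an escape for clarity

console.log(utf8ByteLength(plain));    // 4
console.log(accented.length);          // 3 UTF-16 code units
console.log(utf8ByteLength(accented)); // 4 (the "ë" takes two bytes)
```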

Inserting a document with a String data type will look something like this:

db.mytestcoll.insertOne({first_name: "Alex"})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d15")
}

Querying the collection will return the following:

db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d15"),
first_name: "Alex"
}

Using the $type operator

Before moving on to our next data type, it is important to know how you can type check the value before making any insertions. We'll use the previous example to demonstrate using the $type operator in MongoDB.

Let's say it has been a while since we worked with the mytestcoll collection from before. We want to insert some additional documents into the collection with the first_name field. To check that we originally stored first_name as a String, we can run the following using either the alias or number value of the data type:

db.mytestcoll.find( { "first_name": { $type: "string" } } )

Or

db.mytestcoll.find( { "first_name": { $type: 2 } } )

Both queries return an output of all documents that have a String value stored for first_name from our previous section's insertion:

[ { _id: ObjectId("614b37296a124db40ae74d15"), first_name: "Alex" } ]

If you query for a type that is not stored in the first_name field of any document, you will get no results. This indicates that first_name holds a different data type.

You are also able to query for multiple data types at a time with the $type operator like the following:

db.mytestcoll.find( { "first_name": { $type: ["string", "null"] } } )

Because we did not insert any Null type values into our collection, the result will be the same:

[ { _id: ObjectId("614b37296a124db40ae74d15"), first_name: "Alex" } ]

You can use the same method with all of the following types that we will discuss.

Numbers and numeric values

MongoDB includes a range of numeric data types suitable for different scenarios. Deciding which type to use depends on the nature of the values you plan to store and your use cases for the data. JSON treats every numeric value as a Number, which forces the system to figure out how to map each value to the nearest native data type. We'll start off by exploring integers and how they work in MongoDB.

Integer

The Integer data type is used to store numbers as whole numbers without any fractions or decimals. Integers can either be positive or negative values. There are two types in MongoDB, 32-Bit Integer and 64-Bit Integer. They can be represented in the two ways depicted in the table below, number and alias:

Integer type | number | alias |
------------ | ----- | ------------ |
`32-bit integer`| 16 | "int" |
`64-bit integer`| 18 | "long" |

The ranges a value can fit into for each type are the following:

Integer type | Applicable signed range | Applicable unsigned range |
------------ | ------------------------------ | ------------------------------- |
`32-bit integer`| -2,147,483,648 to 2,147,483,647 | 0 to 4,294,967,295 |
`64-bit integer`| -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | 0 to 18,446,744,073,709,551,615 |

Note: BSON integers are signed, so the signed range is the one that applies to values stored in MongoDB.

The types above are limited by their valid range. Any value outside of the range will result in an error. Inserting an Integer type into MongoDB will look like the below:

db.mytestcoll.insertOne({age: 26})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d14")
}

And finding the result will return the following:

db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d14"), age: 26
}

As suggested by the names, a 32-Bit Integer has 32 bits of integer precision which is useful for smaller integer values that you do not want to store as a sequence of digits. When growing in number size, you are able to bump up to the 64-Bit Integer which has 64 bits of integer precision and fits the same use case as the former.
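
The signed ranges follow directly from the bit widths. The sketch below, in plain JavaScript using BigInt, picks the smallest integer alias whose signed range can hold a value. smallestIntegerType is a hypothetical helper for illustration; how a particular shell or driver actually chooses a type for a number is not covered here.

```javascript
// Signed ranges for 32-bit and 64-bit integers, computed from the bit widths.
const INT32_MIN = -(2n ** 31n);
const INT32_MAX = 2n ** 31n - 1n;
const INT64_MIN = -(2n ** 63n);
const INT64_MAX = 2n ** 63n - 1n;

// Return "int", "long", or null if the value fits neither signed range.
// Accepts an integer number, a BigInt, or a string of digits.
function smallestIntegerType(value) {
  const v = BigInt(value);
  if (v >= INT32_MIN && v <= INT32_MAX) return "int";
  if (v >= INT64_MIN && v <= INT64_MAX) return "long";
  return null;
}

console.log(smallestIntegerType(26));                    // "int"
console.log(smallestIntegerType(2147483648));            // "long"
console.log(smallestIntegerType("9223372036854775808")); // null
```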

Double

In BSON, the default replacement for JSON's Number is the Double data type. The Double data type is used to store a floating-point value and can be represented in MongoDB like so:

Type | Number | Alias |
------------------ | ------ | -------- |
Double | 1 | "double" |

Floating-point numbers are another way to express decimal numbers. They can work with a large number of decimal places efficiently, but not always exactly. The following is an example of entering a document with the Double type into your collection:

db.mytestcoll.insertOne({testScore: 89.6})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d13")
}

There can be slight differences between input and output when calculating with doubles, which could lead to unexpected behavior. For operations that require exact values, MongoDB has a more precise type.

Decimal128

If you are working with very large numbers or numbers with many decimal places, the Decimal128 BSON data type will be the best option. It is the most useful type for values that require a lot of precision, such as in use cases involving exact monetary operations. The Decimal128 type is represented as:

Type | Number | Alias |
------------------ | ------ | --------- |
Decimal128 | 19 | "decimal" |

The BSON type Decimal128 provides 128 bits of decimal representation for storing numbers where rounding decimals exactly is important. Decimal128 supports 34 decimal digits of precision (the significand) and an exponent range of -6143 to +6144. This allows for a high amount of precision.

Inserting a value using the Decimal128 data type requires using the NumberDecimal() constructor with your number as a String to keep MongoDB from using the default numeric type, Double.

Here, we demonstrate this:

db.mytestcoll.insertOne({price : NumberDecimal("5.099")})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d12")
}

When querying the collection, you then get the following return:

db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d12"),
price: Decimal128("5.099")
}

The numeric value maintains its precision allowing for exact operations. To demonstrate the Decimal128 type versus the Double, we can go through the following exercise.

How precision can be lost based on data type

Say we want to insert a number with many decimal values as a Double into MongoDB with the following:

db.mytestcoll.insertOne({ price: 9999999.4999999999 })
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d24")
}

When we query for this data, we get the following result:

db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d24"),
price: 9999999.5
}

This value rounds up to 9999999.5, losing the exact value we entered. This makes Double ill-suited for storing numbers with many decimal places.
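
This rounding is a property of 64-bit floating-point numbers in general rather than of MongoDB, so it can be reproduced in plain JavaScript, where every number literal is a double. The variable names below are only for illustration:

```javascript
// Both literals are parsed to the same 64-bit double, so they compare equal.
const sameValue = 9999999.4999999999 === 9999999.5;
console.log(sameValue); // true

// A classic example of inexact binary fractions.
const sum = 0.1 + 0.2;
console.log(sum === 0.3); // false
console.log(sum);         // 0.30000000000000004
```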

The next example shows that precision is also lost when a Double is passed to NumberDecimal() instead of a String like in the previous example.

We start by inserting the following Double again but with NumberDecimal() to make it a Decimal128 type:

db.mytestcoll.insertOne({ price: NumberDecimal( 9999999.4999999999 ) })
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d14")
}

Note: When making this insertion in the MongoDB shell, the following warning message is displayed:

Warning: NumberDecimal: specifying a number as argument is deprecated and may lead to
loss of precision, pass a string instead

This warning indicates that the number you are passing could be subject to a loss of precision. It suggests passing a String to NumberDecimal() so that no precision is lost.

If we ignore the warning and insert the document anyway, the loss of precision is visible in the query results, where the value has been rounded up:

db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d14"),
price: Decimal128("9999999.50000000")
}

If we follow the recommended NumberDecimal() approach using a String we'll see the following results with maintained precision:

db.mytestcoll.insertOne({ price: NumberDecimal( "9999999.4999999999" ) } )
db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d14"),
price: Decimal128("9999999.4999999999")
}

For any use case requiring precise, exact values, the rounded results shown earlier could cause issues. Any work involving monetary operations is an example where precision is extremely important and having exact values is critical to accurate calculations. This demonstration highlights the importance of knowing which numeric data type is best suited for your data.

Date

The BSON Date data type is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This data type stores a date and time and can be returned as either a date object or as a string. Date is represented in MongoDB as follows:

Type | Number | Alias |
------------------ | ------ | ------------ |
Date | 9 | "date" |

Note: BSON Date type is signed. Negative values represent dates before 1970.

There are three methods for returning date values.

  1. Date() - returns a string

  2. new Date() - returns a date object using the ISODate() wrapper

  3. ISODate() - also returns a date object using the ISODate() wrapper

We demonstrate these options below:

var date1 = Date()
var date2 = new Date()
var date3 = ISODate()
db.mytestcoll.insertOne({firstDate: date1, secondDate: date2, thirdDate: date3})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d22")
}

And when returning:

db.mytestcoll.find().pretty()
{
"_id" : ObjectId("614b37296a124db40ae74d22"),
firstDate: 'Tue Sep 28 2021 11:28:52 GMT+0200 (Central European Summer Time)',
secondDate: ISODate("2021-09-28T09:29:01.924Z"),
thirdDate: ISODate("2021-09-28T09:29:12.151Z")
}
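
The millisecond representation and the difference between Date() and new Date() can also be seen in plain JavaScript, outside of MongoDB. This is a minimal sketch with made-up variable names; ISODate() is specific to the MongoDB shell and is not used here.

```javascript
// 0 milliseconds is the Unix epoch itself.
const epoch = new Date(0);
console.log(epoch.toISOString()); // "1970-01-01T00:00:00.000Z"

// Negative values represent moments before 1970 (here, one day earlier).
const dayBefore = new Date(-86400000);
console.log(dayBefore.toISOString()); // "1969-12-31T00:00:00.000Z"

// Date() called without "new" returns a string; new Date() returns an object.
const asString = Date();
const asObject = new Date();
console.log(typeof asString);          // "string"
console.log(asObject instanceof Date); // true
```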

Timestamp

There is also the Timestamp data type in MongoDB for representing time. However, Timestamp is mostly intended for internal MongoDB use and is not associated with the Date type. It records when an event occurred as a 64-bit value where:

  • the most significant 32 bits are time_t value (seconds since Unix epoch)
  • the least significant 32 bits are an incrementing ordinal for operations within a given second

Its representation in MongoDB will look as follows:

Type | Number | Alias |
------------------ | ------ | ------------ |
Timestamp | 17 | "timestamp" |

When inserting a document that contains top-level fields with empty timestamps, MongoDB will replace the empty timestamp value with the current timestamp value. The exception to this is the _id field: if _id contains an empty timestamp, the value is always inserted as is and not replaced.

Inserting a new Timestamp value in MongoDB will use the new Timestamp() function and look something like this:

db.mytestcoll.insertOne( {ts: new Timestamp() });
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d23")
}

When querying the collection, you'll return a result resembling:

db.mytestcoll.find().pretty()
{
"_id" : ObjectId("614b37296a124db40ae74d23"),
"ts" : Timestamp( { t: 1412180887, i: 1 })
}
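
The 64-bit layout described above (seconds in the high 32 bits, ordinal in the low 32 bits) can be sketched with BigInt in plain JavaScript. packTimestamp and unpackTimestamp are hypothetical helpers for illustration only; they are not how you create Timestamp values in MongoDB.

```javascript
// Combine seconds (t) and ordinal (i) into one 64-bit value.
function packTimestamp(t, i) {
  return (BigInt(t) << 32n) | BigInt(i);
}

// Split a packed 64-bit value back into its two 32-bit halves.
function unpackTimestamp(value) {
  return {
    t: Number(value >> 32n),         // most significant 32 bits
    i: Number(value & 0xffffffffn),  // least significant 32 bits
  };
}

const packed = packTimestamp(1412180887, 1);
const unpacked = unpackTimestamp(packed);
console.log(unpacked.t); // 1412180887
console.log(unpacked.i); // 1
```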

Object

The Object data type in MongoDB is used for storing embedded documents. An embedded document is a document nested inside another document, stored as key: value pairs. We demonstrate the Object type below:

var classGrades = {"Physics": 88, "German": 92, "LitTheory": 79}
db.mytestcoll.insertOne({student_name: "John Smith", report_card: classGrades})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d18")
}

We can then view our new document:

db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d18"),
student_name: 'John Smith',
report_card: {Physics: 88, German: 92, LitTheory: 79}
}

The Object data type optimizes for storing data that is best accessed together. It provides some efficiencies around storage, speed, and durability as opposed to storing each class mark, from the above example, separately.

Binary Data

The Binary data, or BinData, data type does what its name implies and stores binary data as a field's value. BinData is well suited to storing and searching raw bytes because it represents bit arrays efficiently. This data type can be represented in the following ways:

Type | Number | Alias |
------------------ | ------ | ------------ |
Binary data | 5 | "binData" |

Here is an example of adding some Binary data into a document in a collection:

var data = BinData(1, "111010110111100110100010101")
db.mytestcoll.insertOne({binaryData: data})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d20")
}

To then see the resulting document:

db.mytestcoll.find().pretty()
{
"_id" : ObjectId("614b37296a124db40ae74d20"),
"binaryData" : BinData(1, "111010110111100110100010101")
}
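
The second argument to BinData() is a base64-encoded string. As a minimal sketch of producing such a string, the following Node.js snippet uses the built-in Buffer class; the variable names are hypothetical and no MongoDB connection is involved.

```javascript
// Raw bytes to store (here, the UTF-8 bytes of a short text).
const rawBytes = Buffer.from("hello", "utf8");

// Encode the bytes as base64, the text form that BinData() expects.
const encoded = rawBytes.toString("base64");
console.log(encoded); // "aGVsbG8="

// Decoding the base64 text yields the original bytes again.
const decoded = Buffer.from(encoded, "base64");
console.log(decoded.toString("utf8")); // "hello"
```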

ObjectId

The ObjectId type is specific to MongoDB and it stores the document's unique ID. MongoDB provides an _id field for every document. ObjectId is 12 bytes in size and can be represented as follows:

Type | Number | Alias |
------------------ | ------ | ------------ |
ObjectId | 7 | "objectId" |

ObjectId consists of three parts that make up its 12 bytes:

  • a 4-byte timestamp value, representing the ObjectId's creation, measured in seconds since the Unix epoch
  • a 5-byte random value
  • a 3-byte incrementing counter initialized to a random value

In MongoDB, each document within a collection requires a unique _id to act as a primary key. If the _id field is left empty for an inserted document, MongoDB will automatically generate an ObjectId for the field.

There are several benefits to using ObjectIds for the _id:

  • in mongosh (MongoDB shell), the creation time of the ObjectId is accessible using the ObjectId.getTimestamp() method.
  • sorting on an _id field that stores ObjectId data types is a close equivalent to sorting by creation time.

We've seen ObjectIds throughout the examples so far, and they will look similar to this:

db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d19")
}

Note: ObjectId values should increase over time; however, they are not necessarily monotonic. This is because they:

  • only have one second of temporal resolution, so values created within the same second do not have guaranteed ordering
  • are generated by clients, which may have differing system clocks
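
The 4-byte, 5-byte, and 3-byte layout can be unpacked by hand from the 24-character hexadecimal form of an ObjectId. The sketch below is plain JavaScript; parseObjectId is a hypothetical helper written for this guide, and in mongosh you would normally call getTimestamp() instead.

```javascript
// Split a 24-character hex ObjectId string into its three parts.
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // 4-byte timestamp
  return {
    seconds,
    createdAt: new Date(seconds * 1000),         // seconds to milliseconds
    random: hex.slice(8, 18),                    // 5-byte random value
    counter: parseInt(hex.slice(18, 24), 16),    // 3-byte counter
  };
}

const parts = parseObjectId("614b37296a124db40ae74d15");
console.log(parts.seconds);                 // 1632319273
console.log(parts.createdAt.toISOString()); // "2021-09-22T14:01:13.000Z"
console.log(parts.random);                  // "6a124db40a"
console.log(parts.counter.toString(16));    // "e74d15"
```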

Boolean

MongoDB has the native Boolean data type for storing true and false values within a collection. Boolean in MongoDB can be represented as follows:

Type | Number | Alias |
------------------ | ------ | ------------ |
Boolean | 8 | "bool" |

Inserting a document with a Boolean data type will look something like the following:

db.mytestcoll.insertOne({isCorrect: true, isIncorrect: false})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d21")
}

Then when searching for the document the result will appear as:

db.mytestcoll.find().pretty()
{
"_id" : ObjectId("614b37296a124db40ae74d21"),
"isCorrect" : true,
"isIncorrect" : false
}

Regular Expression

The Regular Expression data type in MongoDB allows for the storage of regular expressions as the value of a field. MongoDB uses PCRE (Perl Compatible Regular Expression) as its regular expression language.

It can be represented in the following way:

Type | Number | Alias |
------------------ | ------ | ------- |
Regular Expression | 11 | "regex" |

BSON allows you to avoid the typical "convert from string" step that is commonly experienced when working with regular expressions and databases. This type is going to be most useful when you are writing database objects that require validation patterns or matching triggers.

For instance, you can insert the Regular Expression data type like this:

db.mytestcoll.insertOne({exampleregex: /tt/})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d16")
}
db.mytestcoll.insertOne({exampleregex: /t+/})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d17")
}

This sequence of statements will add these documents to your collection. You are then able to query your collection to find the inserted documents:

db.mytestcoll.find().pretty()
{
_id: ObjectId("614b37296a124db40ae74d16"), exampleregex: /tt/
}
{
_id: ObjectId("614b37296a124db40ae74d17"), exampleregex: /t+/
}

The regular expression patterns are stored as regex and not as strings. This allows you to query for a particular string and get returned documents that have a regular expression matching the desired string.
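
To see what the two stored patterns match, they can be tried in plain JavaScript. JavaScript's regular expression syntax differs from PCRE in some details, but these two simple patterns behave the same way in both. The sample strings are made up for illustration.

```javascript
const doubleT = /tt/;   // two consecutive "t" characters
const oneOrMoreT = /t+/; // one or more "t" characters

console.log(doubleT.test("butter"));    // true
console.log(doubleT.test("cat"));       // false
console.log(oneOrMoreT.test("cat"));    // true
console.log(oneOrMoreT.test("dog"));    // false
```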

JavaScript (without scope)

Much like the previously mentioned Regular Expression data type, BSON allows for MongoDB to store JavaScript functions without scope as their own type. The JavaScript type can be recognized as follows:

Type | Number | Alias |
------------------ | ------ | ------------ |
JavaScript | 13 | "javascript" |

Adding a document to your collection with the JavaScript data type will look something like this. Note that the function source is wrapped in Code(); a plain quoted string would be stored as a String instead:

db.mytestcoll.insertOne({jsCode: Code("function(){var x; x=1}")})
{
"acknowledged": true,
"insertedId": ObjectId("614b37296a124db40ae74d12")
}

This functionality allows you to store JavaScript functions inside of your MongoDB collections if needed for a particular use case.

Note: As of MongoDB version 4.4, an alternative JavaScript type, the JavaScript with Scope data type, has been deprecated.

Conclusion

In this article, we've covered most of the common data types that are useful when working with MongoDB databases. There are additional types that are not explicitly covered in this guide that may be helpful depending on the use case. Getting started by knowing these types covers most use cases. It is a strong foundation to begin modelling your MongoDB database.

It is important to know what data types are available to you when using a database so that you're using valid values and operating on the data with expected results. There are risks you can run into without properly typing your data, as demonstrated in the Double versus Decimal128 exercise. It's important to think about this before committing to any given type.

If you are interested in checking out Prisma with a MongoDB database, you can check out the data connector documentation.