One of the primary features of relational databases in general is the ability to define schemas or table structures that exactly specify the format of the data they will contain. This is done by prescribing the columns that these structures contain along with their data type and any constraints.
Data types specify a general pattern for the data they accept and store. Values must adhere to the requirements that they outline in order to be accepted by MySQL. While it is possible to define custom requirements, data types provide the basic building blocks that allow MySQL to validate input and work with the data using appropriate operations.
MySQL includes a wide range of data types that are used to label and validate that values conform to appropriate types. In this guide, we will discuss the most common data types available in MySQL, the different input and output formats they use, and how to configure various fields to meet your applications' needs.
Before going into detail, let's take a broad view of what data types MySQL provides.
MySQL supports a reasonable range of data types suitable for various types of simple and complex data. These include:
We'll cover the most common of these in more depth throughout this guide.
As you get started with types, it's important to remember that types alone are not always a complete solution to data validation, but a component. Other database tools, like constraints also have a role to play in defining correctness. Still, data types are often the first line of defense against invalid data.
For many cases, the general types provided by MySQL are appropriate for the kinds of data you'll be storing. For example, while you could store the coordinates of a geometric point in two different number columns, the provided
point type is purpose built to store and validate exactly this type of information. When choosing types, check to see that you are using the most specific type applicable to your use case.
MySQL includes a range of numeric data types suitable for different scenarios. The appropriate type depends on the exact nature of the values you plan to store as well as your precision requirements.
The integer data type is a category of types used to store numbers without any fractions or decimals. These can be either positive or negative values, and different integer types can store different ranges of numbers. Integer types with smaller ranges of acceptable values take up less space than those with wider ranges.
The basic list of integer types includes the following:
|Applicable signed range
|Applicable unsigned range
|-128 to 127
|0 to 255
|-32768 to 32767
|0 to 65535
|-8388608 to 8388607
|0 to 16777215
|-2147483648 to 2147483647
|0 to 4294967295
|-2^63 to 2^63-1
|0 to 2^64-1
The types above are limited by their valid range. Any value outside of the range will result in an error.
In addition to the types mentioned above, MySQL also recognizes an alias called
SERIAL. Marking a column as
SERIAL will give it these properties:
BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE. This is used as a shorthand for common primary key column properties. The column will automatically assign a new unique value whenever a record is added.
Fixed point types are used to control the amount of precision or specificity possible for a number with decimals. In MySQL, this can be controlled by manipulating two factors: precision and scale.
Precision is the maximum amount of total digits that a number can have. In contrast, scale is the number of digits to the right of the decimal point. By manipulating these numbers, you can control how large the fractional and non-fractional components of a number are allowed to be.
These two arguments are used to control arbitrary precision using the
decimal data types (these two types are synonymous in MySQL). The
numeric type takes zero to two arguments.
With no arguments, the column is defined as having a precision of 10 and a scale of 0. This means that the column can hold up to 10 digits, but none of these can be after the decimal point:
When a single argument is provided, it is interpreted as the precision of the column with scale set to 0. This effectively allows you to specify the maximum number of digits in an integer-like number (no fractional or decimal components). For example, if you need a 5 digit whole number, you can specify:
Specify precision followed by scale when configuring a column using both controls. MySQL will round the decimal component of any input to the correct number of digits using the scale number. MySQL will use the precision and scale to determine how many digits are allowed on the left side of the decimal point. If an entry exceeds the allowed number of digits, MySQL will produce an error.
For example, we can specify a column with a total precision of 5 and a scale of 2:
This column would have the following behavior:
|Accepted (fits precision)?
Floating point numbers are another way to express decimal numbers, but without exact, consistent precision. Instead, floating point types only have a concept of a maximum precision which is often related to the architecture and platform of the hardware.
For example, to limit a floating point column to 8 digits of precision, you can use the
FLOAT type, which stores results using 4 bytes with anywhere from 0 to 23 digits of precision:
DOUBLE type uses 8 bytes to store data and can use precisions of 24 to 53 digits.
Because of these design choices, floating point numbers can work with numbers with large number of decimals efficiently, but not always exactly. The internal representation of numbers may cause slight differences between the input and output. This can cause unexpected behavior when comparing values, doing floating point math, or performing operations that require exact values.
Both floating point numbers provided by types like
DOUBLE and fixed point numbers provided by the
DECIMAL types can be used to store decimal values. How do you know which one to use?
The general rule is that if you need exactness in your calculations, the
NUMERIC type is always the better choice. The
NUMERIC type will store values exactly as they are provided, meaning that the results are entirely predictable when retrieving or computing over values. The
NUMERIC type is called arbitrary precision because you specify the amount of precision the type requires and it will store that exact amount of digits in the field.
In contrast, types like
DOUBLE are variable precision types. The amount of precision they maintain depends on the input value. When they reach the end of their allowed level of precision, they may round the remaining digits, leading to differences between the submitted and retrieved values.
So when would you use variable precision types? Variable precision types like
DOUBLE are well suited for scenarios where exact values are not necessary (for example, if they'll be rounded anyways) and when speed is highly valuable. Variable precision will generally offer performance benefits over the
MySQL's character types and string types can be placed into two categories: fixed length and variable length. The choice between these two affects how MySQL allocates space for each value and how it validates input.
The simplest character-based data type within MySQL is the
char type. With no arguments, the
char type accepts a single character as input:
When a positive integer is provided in the declaration, the
char column will store a fixed length character string equal to the number of characters specified:
If a string is provided with fewer characters, blank spaces will be appended to pad the length:
|# of input characters
|# of stored characters
If a string is given with greater than the allowed number of characters, MySQL will raise an error. As an exception to this rule, if the overflowing characters are all spaces, MySQL will simply truncate the excess spaces to fit the field.
The alternative to fixed length character fields are variable length fields. For this, MySQL provides the
varchar type. The
varchar type stores characters with no fixed size. Unlike
varchar cannot be used without specifying the maximum number of characters to store.
By defining a
varchar with a positive integer, you can set a maximum string length:
This differs from using the
char type with an integer in that
varchar will not pad the value if the input does not meet the maximum field length:
|# of input characters
|# of stored characters
If the string is greater than the maximum length, MySQL will throw an error. The same truncation behavior that's present in
char fields occurs here: if the overflowing characters are spaces, they will be truncated to fit inside the maximum character length.
MySQL also supports the
varbinary data types. These operate in a similar manner to the
varchar types, but store binary strings rather than character strings. This has implications on how they are stored and operated on (for things like comparisons, sorting, etc.).
varbinary types, the integer given when defining the column type represents the number of bytes instead of the number of characters.
Two other data types that MySQL provides for strings and character storage are
text. These types operate similar to the
varbinary types respectively and are meant for storing large objects. They operate mostly the same as their counterparts, but have a few differences like not begin able to have default values and requiring a prefix length when creating an index.
MySQL does not actually have a native boolean type to represent true and false values.
MySQL recognizes the types
BOOLEAN in an effort for compatibility with other database systems. Its internal implementation, however, uses a
TINYINT(1) column to store the values and interprets them as true or false based on a set of rules.
When interpreting numeric values in a boolean context, the value of
0 is considered false. All non-zero values are considered true.
MySQL recognizes the boolean literals
FALSE and converts
TRUE to 1 and
FALSE to 0 when storing them.
MySQL has support for representing dates, times, and combinations of the two.
date type can store a date without an associated time value:
When processing input for
date columns, MySQL can interpret different formats to determine the correct date to store. However, the component parts must always come in the same sequence: year, month, and then day. The
STR_TO_DATE() function is available to help convert other date formats to a format MySQL will interpret correctly.
When displaying dates, MySQL uses the
YYYY-MM-DD format. You can use the
DATE_FORMAT() function to format output in other formats.
date type can store values ranging from
time data type can store a specific time of day without an associated timezone or date.
When processing input for
time columns, MySQL can interpret multiple formats to determine the correct time to store. When input has colons, it is generally interpreted as
hh:mm:ss. Any shortened value (using only one column) will be interpreted as using
hh:mm. When the input does not have colons, the time is processed to fill up the smallest value first. For example,
1045 is taken as 10 minutes and 45 seconds.
MySQL also supports fractional seconds if a decimal point is given. It stores up to 6 digits of precision after the decimal. Values in
time columns can range from
When displaying time values, MySQL uses the
hh:mm:ss format. As with dates, a function is provided, called
TIME_FORMAT() to display time values using other formats.
MySQL can represent timestamps, a combination of a date and time used to represent a specific moment in time, in two different variations: using the
timestamp type and the
datetime type can represent values from
1000-01-01 00:00:00 to
9999-12-31 23:59:59. It can also include fractional seconds of up to six digits similar to the
timestamp type can represent values from
1970-01-01 00:00:01 UTC to
2038-01-19 03:14:07 UTC. It can handle fractional seconds as well. When storing
timestamp values, all values are converted from the given timezone to UTC for storage and converted back to the local timezone on retrieval. The
datetime type does not do this.
From MySQL 8.0.19 onward, you can include a timezone offset when storing a
timestamp to explicitly set the timezone for the stored value. You do this by including a value after the time component, with no space to indicate the offset. The range of accepted values goes from
+14:00, which represents the offset of the stored value from UTC.
When deciding on whether to store date and time values using
timezone types, it's often helpful to separate them by what they are best for.
datetime values as a specific date and time, in relationship to the calendar and clock wherever it is retrieved. If a person goes to bed at 11pm at night, a
datetime value can represent that value, regardless of what timezone the person is in currently.
On the other hand,
timezone values are best at representing a specific moment in time that is unambiguous across timezones. To send a video call invite, a
timezone value would be able to make sure the meeting occurs at the same time for everyone, regardless of what timezone the participant is in.
Along with the types we covered with some depth above, there are additional types that are useful in specific scenarios. We'll cover these briefly to give you an idea of how to use them and when they may be useful.
Two related types that allow users to dictate the valid values for a column are the
enum type is a string type that allows the user to define a collection of valid values when the column is created. Any value that matches one of the defined values is accepted and all other values are rejected. This functions similar to a drop down menu in that a choice can be made from a specific set of options. For example, an
season could be created with the values
To create an
enum column, specify the type as
enum, giving the possible values as strings, separated by commas, inside of a set of parentheses, like this:
season ENUM('winter', 'spring', 'summer', 'autumn')
A similar type of user-defined type is the
set type. Like the
set types allow users to specify valid values as strings upon definition. The difference between these two types is that in a
set, more than one value can be stored for each record.
For example, if you needed a column to a represent the days of the week volunteers are available to work, you could have a
set column like this:
availability SET('sunday', 'monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday')
When entering values for the
availability column we just created, you provide a single string with commas separating all of the days the volunteer is available. For example:
set types in MySQL, duplicate values in input are always removed and upon retrieval, the values follow the ordering used in the
set definition regardless of the ordering when input in the column.
MySQL supports columns in JSON using the
json type. Data stored as
json is stored in binary for faster execution and processing so that the server does not have to interpret a string to operate on
To operate on
JSON columns, MySQL provides a number of functions to work with values within the document.
In this article, we've covered a lot of the most common data types that are useful when working with MySQL databases. There are additional types not covered in this guide that are helpful to know about, but these represent a good starting point for most use cases.
It is important to use the type system appropriately so that you can control valid values and operate on data as expected. There are pitfalls you can run into if you choose a type not suited for your data, so giving it thought before you commit to a data type is worthwhile in most cases.
The declaration syntax for a
DECIMAL column is
DECIMAL(M, D). The ranges of values for the arguments are as follows:
- M is the maximum number of digits (the precision). It has a range of 1 to 65.
- D is the number of digits to the right of the decimal point (the scale). It has a range of 0 to 30 and must be no larger than M.
The storage requirements for string types in MySQL can be represented in the following table with L representing the actual length in bytes of a given string value.
|L + 1 bytes, where L < 2^8
|L + 2 bytes, where L < 2^16
|L + 3 bytes, where L < 2^24
|L + 4 bytes, where L < 2^32
VARCHAR similarly store characters with no fixed size.
VARCHAR is different because it cannot be used without specifying the maximum number of characters to store whereas
TEXT does not require this.
The example definition syntax of a
VARCHAR would look like this:
VARCHAR columns are variable-length strings. The maximum length can be specified as a value from 0 to 65,535.
The effective maximum length of a
VARCHAR is subject to the maximum row size among all columns (65,535 bytes).
ENUM is an advantageous storage type in situations where a column has a limited set of possible values.
It has the same storage requirements as all columns:
|1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum)