Generated columns
D1 allows you to define generated columns based on the values of one or more other columns, SQL functions, or even extracted JSON values.
This allows you to normalize your data as you write to it or read it from a table, making it easier to query and reducing the need for complex application logic.
Generated columns can also have indexes defined against them, which can dramatically increase query performance over frequently queried fields.
There are two types of generated columns:
VIRTUAL
(default): the column is generated when read. This has the benefit of not consuming storage, but can increase compute time (and thus reduce query performance), especially for larger queries.STORED
: the column is generated when the row is written. The column takes up storage space just as a regular column would, but the column does not need to be generated on every read, which can improve read query performance.
When omitted from a generated column expression, generated columns default to the VIRTUAL
type. The STORED
type is recommended when the generated column is compute intensive. For example, when parsing large JSON structures.
Generated columns can be defined during table creation in a CREATE TABLE
statement or afterwards via the ALTER TABLE
statement.
To create a table that defines a generated column, you use the AS
keyword:
As a concrete example, to automatically extract the location
value from the following JSON sensor data, you can define a generated column called location
(of type TEXT
), based on a raw_data
column that stores the raw representation of our JSON data.
To define a generated column with the value of $.measurement.location
, you can use the json_extract
function to extract the value from the raw_data
column each time you write to that row:
Generated columns can optionally be specified with the column_name GENERATED ALWAYS AS <function> [STORED|VIRTUAL]
syntax. The GENERATED ALWAYS
syntax is optional and does not change the behavior of the generated column when omitted.
A generated column can also be added to an existing table. If the sensor_readings
table did not have the generated location
column, you could add it by running an ALTER TABLE
statement:
This defines a VIRTUAL
generated column that runs json_extract
on each read query.
Generated column definitions cannot be directly modified. To change how a generated column generates its data, you can use ALTER TABLE table_name REMOVE COLUMN
and then ADD COLUMN
to re-define the generated column, or ALTER TABLE table_name RENAME COLUMN current_name TO new_name
to rename the existing column before calling ADD COLUMN
with a new definition.
Generated columns are not just limited to JSON functions like json_extract
: you can use almost any available function to define how a generated column is generated.
For example, you could generate a date
column based on the timestamp
column from the previous sensor_reading
table, automatically converting a Unix timestamp into a YYYY-MM-dd
format within your database:
Alternatively, you could define an expires_at
column that calculates a future date, and filter on that date in your queries:
- Tables must have at least one non-generated column. You cannot define a table with only generated column(s).
- Expressions can only reference other columns in the same table and row, and must only use deterministic functions ↗. Functions like
random()
, sub-queries or aggregation functions cannot be used to define a generated column. - Columns added to an existing table via
ALTER TABLE ... ADD COLUMN
must beVIRTUAL
. You cannot add aSTORED
column to an existing table.