Understanding when to use constraints or concurrency control tools will help you build scalable systems.
You can't expect application code to always write clean data into your database. If you allow garbage data in, then it won't be long until you're finding yourself manually removing invalid data.
The following examples are for PostgreSQL but could be generalized to SQL standards.
Protecting a single cell
The most basic constraint is one on a single cell, protecting your database from invalid values during row inserts or updates.
CREATE TABLE products (
id integer,
name text,
price numeric CONSTRAINT positive_price CHECK (price > 0)
)
// Imagine setting the price to -20 instead of subtracting it
// Well this would protect you...
Protecting a row
Sometimes you'll want to add a constraint across multiple values in a row.
CREATE TABLE products (
id integer,
name text,
price numeric,
sale_price numeric,
CONSTRAINT valid_sale CHECK (sale_price < price)
)
// This check will return an error if you inserted the sale price in JPY
Protecting a row that is dependent on another row
In general, PostgreSQL doesn't allow database constraints to be dependent on different rows or tables. You'd have to use concurrency control methods like optimistic locking during your db transactions to guarantee a cross-row validity check. However, I'd like to call out two important exceptions:
Enforcing uniqueness
You'll commonly need to enforce uniqueness on a column, so there's a database constraint for that kind of cross-row dependency. A btree index is created for the column upon adding this constraint, which is used to enforce the uniqueness constraint.
CREATE TABLE products (
id integer,
name text UNIQUE,
price numeric,
)
// Because there will only ever be one game called 'Okami'
Enforcing uniqueness, generalized
PostgreSQL provides EXCLUSION
, which is a rather powerful generalized constraint. For cases in which you'd like to enforce "uniqueness" over rows but you are not just comparing equality, EXCLUSION
might help. For example, you might be building an appointment application that needs to prevent multiple bookings at the same time. You can use EXCLUSION
to enforce this kind of generalized "uniqueness":
CREATE TABLE room_reservations (
room_id integer,
reserved_at timestamptz,
reserved_until timestamptz,
canceled boolean DEFAULT false,
EXCLUDE USING gist (
room_id WITH =, tstzrange(reserved_at, reserved_until) WITH &&
) WHERE (not canceled)
);
INSERT INTO room_reservations (room_id, reserved_at, reserved_until) VALUES
(1, '2015-01-01 00:00', '2015-01-01 23:59'),
(1, '2015-01-02 00:00', '2015-01-02 23:59');
INSERT INTO room_reservations (room_id, reserved_at, reserved_until) VALUES
(1, '2015-01-01 10:00', '2015-01-01 14:00');
// This will return an error,
// since room 1 already has a reservation during that time range.
Adapted this example from @fphilipe's gist
Let's break down the syntax for EXCLUSION
constraints as it is a bit hard to follow:
EXCLUDE USING gist (
room_id WITH =,
tstzrange(reserved_at, reserved_until) WITH &&
)
In English: using a Generalized Search Tree index, exclude rows that have with room_id
equal to each other AND overlapping timezone reserved ranges (&&
is the overlapping operator).
That constraint gives you unique reservation intervals for each individual room!
Sources
https://www.postgresql.org/docs/current/ddl-constraints.html https://gist.github.com/fphilipe/0a2a3d50a9f3834683bf https://www.slideshare.net/slideshow/not-just-unique-exclusion-constraints/5356101#24