Conflict-free replicated data types v4
Conflict-free replicated data types (CRDT) support merging values from concurrently modified rows instead of discarding one of the rows as traditional resolution does.
Each CRDT type is implemented as a separate PostgreSQL data type with an extra callback added to the bdr.crdt_handlers catalog. The merge process happens inside the BDR writer on the apply side without any user action needed.
CRDTs require the table to have column-level conflict resolution enabled, as documented in CLCD.
The only action you need to take is to use a particular data type in CREATE/ALTER TABLE rather than standard built-in data types such as integer. For example, consider the following table with one regular integer counter and a single row:
Suppose you issue the following SQL on two nodes at the same time:
After both updates are applied, you can see the resulting values using this query:
This result shows that you lost one of the increments due to the update_if_newer conflict resolver. If you use the CRDT counter data type instead, the result looks like this:
Again, issue the following SQL on two nodes at the same time, and then wait for the changes to be applied:
This example shows that CRDTs correctly allow accumulator columns to work, even in the face of asynchronous concurrent updates that otherwise conflict.
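The difference between the two outcomes can be modeled outside the database. The following is a minimal Python sketch, not BDR code: it assumes the textbook grow-only counter design (one slot per node, merged by element-wise maximum), and the names `last_update_wins`, `GCounter`, `node_a`, and `node_b` are hypothetical. How BDR lays out the internal state of crdt_gcounter isn't described here. In the sketch, `value()` plays the role of the current numeric value, while `slots` stands in for the full internal state.

```python
# Illustrative model only; this is not BDR code.

def last_update_wins(local, remote):
    """Row-level resolution: keep whichever row version is newer."""
    # Each version is a (commit_timestamp, counter) pair.
    return max(local, remote)


class GCounter:
    """Textbook state-based grow-only counter: one slot per node."""

    def __init__(self, slots=None):
        self.slots = dict(slots or {})

    def increment(self, node, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.slots[node] = self.slots.get(node, 0) + amount

    def merge(self, other):
        # Element-wise maximum: commutative, associative, and idempotent.
        nodes = self.slots.keys() | other.slots.keys()
        return GCounter(
            {n: max(self.slots.get(n, 0), other.slots.get(n, 0)) for n in nodes}
        )

    def value(self):
        return sum(self.slots.values())


# Plain integer: both nodes read 0 and write 1, so one increment is lost.
node_a = (1001, 0 + 1)  # hypothetical commit timestamps
node_b = (1002, 0 + 1)
plain_result = last_update_wins(node_a, node_b)[1]

# G-counter: each node increments its own slot, then the states are merged.
a, b = GCounter(), GCounter()
a.increment("node_a")
b.increment("node_b")
crdt_result = a.merge(b).value()

print(plain_result, crdt_result)  # prints: 1 2
```

Because the merge takes a per-node maximum, applying the same remote state twice or in a different order gives the same result, which is why this kind of type tolerates asynchronous replication.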
The crdt_gcounter type is an example of state-based CRDT types that work only with reflexive UPDATE SQL, such as x = x + 1, as the example shows.
The bdr.crdt_raw_value configuration option determines whether queries return the current value or the full internal state of the CRDT type. By default, only the current numeric value is returned. When set to true, queries return a representation of the full state. You can use the special hash operator (#) to request only the current numeric value. With the default setting, the current value is returned without the operator. If the full state is dumped using bdr.crdt_raw_value = on, then the value can be reloaded only with bdr.crdt_raw_value = on.
Note
The bdr.crdt_raw_value setting applies only to the formatting of data returned to clients, that is, simple column references in the select list. Any column references in other parts of the query (such as the WHERE clause or even expressions in the select list) might still require use of the # operator.
Another class of CRDT data types is referred to as delta CRDT types. These are a special subclass of operation-based CRDTs.
With delta CRDTs, any update to a value is compared to the previous value on the same node. Then a change is applied as a delta on all other nodes.
Suppose you issue the following SQL on two nodes at the same time:
After both updates are applied, you can see the resulting values using this query:
With a regular integer column, the result is 2. But when you update the row with a delta CRDT counter, you start with the OLD row version, make a NEW row version, and send both to the remote node. There, they are compared with the version found on that node (the LOCAL version). Standard CRDTs merge the NEW and the LOCAL version, while delta CRDTs compare the OLD and NEW versions and apply the delta to the LOCAL version.
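The delta rule can be written as LOCAL + (NEW - OLD). The following is a minimal Python sketch of that arithmetic, not BDR code: the function name `apply_delta` and the starting value of 0 are assumptions made for illustration.

```python
# Illustrative model only; this is not BDR code.

def apply_delta(local, old, new):
    """Delta CRDT apply: add the change made on the origin node to LOCAL."""
    return local + (new - old)


START = 0  # hypothetical starting value on both nodes

# Each node sets the counter to 2, starting from the same value.
old_a, new_a = START, 2
old_b, new_b = START, 2

# After its own update, each node holds its NEW value as LOCAL,
# then receives the (OLD, NEW) pair from the other node.
final_a = apply_delta(new_a, old_b, new_b)
final_b = apply_delta(new_b, old_a, new_a)

print(final_a, final_b)  # prints: 4 4
```

Under these assumptions, both nodes converge on 4 instead of 2, because each remote change is treated as an increment of 2 and not as an assignment.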
The CRDT types are installed as part of bdr into the bdr schema. For convenience, the basic operators (+, #, and !) and a number of common aggregate functions (min, max, sum, and avg) are created in pg_catalog. This makes them available without having to tweak search_path.
An important question is how query planning and optimization work with these new data types. CRDT types are handled transparently. Both ANALYZE and the optimizer work, so estimation and query planning work without any additional steps.
State-based and operation-based CRDTs
Following the notation from [1], both operation-based and state-based CRDTs are implemented.
Operation-based CRDT types (CmCRDT)
The implementation of operation-based types is trivial because the operation isn't transferred explicitly but computed from the old and new row received from the remote node.
Currently, these operation-based CRDTs are implemented:
crdt_delta_counter: bigint counter (increments/decrements)
crdt_delta_sum