Data Persistence
================
Durability
----------
Persistence to disk is handled by host-side code that runs outside the enclave. This code is not attested and may be controlled by an attacker in the CCF threat model.
As a result, CCF cannot make a formal guarantee about data persistence. Durability relies on the operator maintaining enough healthy nodes, and making regular backups of the ledger. To minimise the risk of data loss on node failure, the CCF host component issues an ``fflush()`` call on every transaction as soon as it becomes committable (i.e. followed by a signature).
The operator can further reduce this risk by running their CCF-based service on a larger number of nodes.
Directories
-----------
When a new node joins the network or when a failed node needs to be recovered, the latest committed snapshot file can be copied to the node before it is started. The node will then automatically resume from the latest snapshot file (see :ref:`operations/ledger_snapshot:Join or Recover From Snapshot`).
The new/recovered node may also need to have access to all ``.committed`` ledger files in some cases, for example, if the node needs to serve historical queries. It is therefore safe to back up all the ``.committed`` ledger and snapshot files. It is recommended to have two separate directories on each node - one being a read-write directory where *all* the ledger and snapshot files reside and another shared read-only directory where *only* the ``.committed`` ledger and snapshot files reside.

.. mermaid::

    graph TD;
        subgraph ccf network;
            A["Node 0 (Primary)<br> [Read-Write directory]"];
            B["Node 1<br> [Read-Write directory]"];
            C["Node 2<br> [Read-Write directory]"];
            D["Shared Mount<br>[Read-Only directory]"]
            style D fill:#bbf,stroke:#f66,stroke-width:2px,color:#fff,stroke-dasharray: 5 5
            A-->D
            A-. copy files .-> D
            B-->D
            C-->D
        end;

The read-only directory could be a shared mounted directory which is accessible to all the nodes in the network. The shared read-only ledger and snapshot directories can be specified via the ``ledger.read_only_directories`` and ``snapshots.read_only_directory`` configuration options respectively.
It is recommended to keep the most up-to-date copies of ``.committed`` ledger and snapshot files (see :ref:`operations/data_persistence:Best Practices`) in the read-only directory. Operators must take care to avoid any race conditions in the copy process.
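One way to avoid a node reading a half-copied file is to copy under a temporary name and rename into place once the copy is complete. The following is a minimal sketch of that idea, not part of CCF itself: the function name ``publish_committed_files`` and the ``.tmp`` suffix are hypothetical, and it assumes that committed files can be recognised by a ``.committed`` file name suffix and that the rename is atomic on the target file system (which may not hold on every network mount).

```python
import os
import shutil
from pathlib import Path


def publish_committed_files(src_dir, dst_dir):
    """Copy new ``.committed`` files from src_dir into dst_dir.

    Each file is first copied under a temporary ``.tmp`` name and then
    renamed, so a reader scanning dst_dir for ``.committed`` files never
    sees a partially written one. Returns the names copied by this call.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        # Assumption: committed ledger and snapshot files end in ".committed".
        if not path.is_file() or not path.name.endswith(".committed"):
            continue
        target = dst / path.name
        if target.exists():
            continue  # already published by an earlier run
        tmp = dst / (path.name + ".tmp")  # hypothetical temporary name
        shutil.copyfile(path, tmp)
        os.replace(tmp, target)  # rename into place once the copy is complete
        copied.append(path.name)
    return copied
```

Calling ``publish_committed_files(rw_dir, ro_dir)`` repeatedly is safe in this sketch, since files already present in the destination are skipped.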
Best Practices
--------------
It is recommended that operators back up the ledger and snapshot files as soon as they become committed (i.e. ``.committed`` included in file name). While a majority of nodes will eventually have an identical copy of the ledger, the ledger file should be the most up-to-date on the current primary node. Snapshot files are only generated by the current primary node. As such, monitoring the directories specified by ``ledger.directory`` and ``snapshots.directory`` for the *current* primary node allows operators to retrieve the latest ledger and snapshot files.
.. note:: It is the responsibility of the operator to move/copy these files safely to avoid "ledger holes", i.e. historical ledger files not being available to a new node that started from a recent snapshot.
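A backup can be checked for such holes before it is handed to a new node. The sketch below is illustrative only: the helper ``find_ledger_holes`` is hypothetical, and it assumes that committed ledger files are named ``ledger_<first>-<last>.committed`` (a naming scheme not stated in this document) and that the ledger starts at sequence number 1.

```python
import re

# Assumed naming scheme (not stated in this document):
#   ledger_<first seqno>-<last seqno>.committed
_CHUNK = re.compile(r"^ledger_(\d+)-(\d+)\.committed$")


def find_ledger_holes(file_names):
    """Return (first_missing, last_missing) ranges not covered by any
    committed ledger file, assuming the ledger starts at seqno 1."""
    ranges = sorted(
        (int(m.group(1)), int(m.group(2)))
        for m in map(_CHUNK.match, file_names)
        if m  # ignore uncommitted files and anything that does not match
    )
    holes = []
    expected = 1  # next sequence number that should be covered
    for first, last in ranges:
        if first > expected:
            holes.append((expected, first - 1))
        expected = max(expected, last + 1)
    return holes
```

For example, a directory holding only ``ledger_1-10.committed`` and ``ledger_21-30.committed`` would be reported as missing the range ``(11, 20)``.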
A low value for ``ledger.chunk_size`` means that smaller ledger files are generated and can thus be backed up by operators more frequently, at the cost of having to manage a larger number of ledger files.
Similarly, a low value for ``snapshots.tx_count`` means that snapshots are generated often and that join/recovery time will be short, at the cost of additional workload on the primary node for snapshot generation.
.. note:: Uncommitted ledger files (which are likely to contain committed transactions) should also be used on recovery, as long as they are copied to the node's ``ledger.directory`` directory.