How Log Proofs Work

Certificate Transparency logs use a special cryptographic mechanism to facilitate public auditing of certificates and logs. This special cryptographic mechanism, known as a Merkle hash tree, is a simple binary tree consisting of hashed leaves and nodes (see figure 1). Leaves are the hashes of individual certificates that have been appended to the log. Nodes are the hashes of paired child leaves or paired child nodes. The root hash, from which all nodes and leaves stem, is known as the Merkle tree hash. When the log server signs the Merkle tree hash (along with other information) it’s known as the signed tree head (STH).


Periodically, perhaps once an hour, a log server appends all of its newly acquired certificates to the log. It does this by creating a separate Merkle tree hash with the newly acquired certificates, and then combining this separate Merkle tree hash with the old Merkle tree hash to form a new Merkle tree hash (see figure 2). The new Merkle tree hash is then signed to create a new signed tree head. This process continues over and over, creating an ever-growing Merkle tree of all certificates ever submitted to the log.

Consistency Proofs and Audit Proofs

Because of the way they're structured, Merkle hash trees make it possible for a log to prove two things very efficiently and quickly:
  • That all certificates have been consistently appended to the log.
  • That a particular certificate has been appended to the log.
A log does this by providing two cryptographic proofs: a Merkle consistency proof and a Merkle audit proof.

Merkle Consistency Proofs

A Merkle consistency proof lets you verify that any two versions of a log are consistent: that is, the later version includes everything in the earlier version, in the same order, and all new entries come after the entries in the older version. If you can prove that a log is consistent it means that no certificates have been back-dated and inserted into the log, no certificates have been modified in the log, and the log has never been branched or f
orked.

To see how Merkle consistency proofs work, assume you want to verify the consistency between the log in figure 2 and the log in figure 3. First, you need to verify that the old Merkle tree hash is a subset of the new Merkle tree hash. Then you need to verify that the new Merkle tree hash is the concatenation of the old Merkle tree hash plus all the intermediate node hashes of the newly appended certificates. The consistency proof is the minimum set of intermediate node hashes you need to compute these two things.

In this case, the consistency proof consists of the following intermediate node hashes: k, l, and m (see figure 4). You can use k and m to create the old Merkle tree hash, thereby verifying that the old tree exists and is unchanged. Then you can use l with k to create n, and then use n with m to create the new Merkle tree hash for the log. If your computed Merkle tree hash matches the one advertised by the log, then you know the log is consistent.

Monitors and auditors regularly use consistency proofs to verify that logs are behaving properly. Because a monitor typically has the entire list of certificates that are in a log, it can calculate the consistency proof by itself and verify the consistency of a log. Auditors can simply query a log server and get the consistency proof for any two signed tree heads.


Merkle Audit Proofs

A Merkle audit proof lets you verify that a specific certificate has been included in a log. This is a critical verification task because the Certificate Transparency model demands that all TLS clients reject any certificates that do not show up in a certificate log.

To see how Merkle audit proofs work, assume you want to verify that certificate d3 (leaf d) has been appended to the log in figure 3. The Merkle audit proof is the missing node hashes required to compute all of the nodes between the leaf and the tree root. If the root hash you compute from the audit path matches the currently advertised Merkle tree hash for the log, then the leaf exists in the tree (or, put another way, the certificate exists in the log).

In this case, the Merkle audit proof consists of the following node hashes: c, i, n (see figure 5). Because you already know d, you can use c to compute j. You can then use i and j to compute m, and you can use n and m to compute the Merkle tree hash for the log. Similarly, if you want to verify that certificate d4 has been appended to the log, the log would send you a consistency proof with the following node hashes: f, l, m. You already know the leaf hash (e), which means you can use leaf hash f to calculate node hash k. You can then use node hash l to compute node hash n, and you can use node hash m and n to compute the Merkle tree hash for the log.


Anyone can request a Merkle audit proof from a log and verify that a certificate is in the log. Auditors routinely send these types of requests to logs so they can verify certificates for TLS clients. If a Merkle audit proof fails to produce a root hash that matches the Merkle tree hash, it means the certificate is not in the log.

Putting Proofs to Work

Proofs provide the cryptographic data an auditor needs to do its job. In general, auditors know very little about a log, but despite this limited knowledge, proofs make it possible for an auditor to verify whether a log is consistent and whether a particular certificate has been appended to the log.

Without proofs--and the Merkle trees that make them possible--an auditor would need to access or fetch all of a log's records in order to perform its auditing role. Proofs make this data exchange much more efficient, requiring the log to exchange substantially less data with the auditor. For example, a log containing 10 million certificates would require a consistency proof that has only 24 node hashes. If you increase the log size to 20 million certificates the number of node hashes in the consistency proof goes to 25.

Proofs are also useful for monitors, although in a slightly different way. Monitors typically save copies of the logs they monitor so they can inspect each of the certificates in the log and keep watch for certificates of interest. If a monitor wants to check the consistency of a particular log that it's monitoring, it can compute the consistency proof itself and then use it to verify the consistency of the log. Likewise, if a monitor ever needs to verify that a particular certificate exists in a log, it can compute the audit proof itself and then use it to verify the certificate.