Obsstore Format

/!\ This page is intended for developers

{i} This is an archive page extracted from ChangesetEvolutionDevel. It contains data related to the discussion and design of the obsstore format. V2 of the format has been implemented for a couple of years, so this page is kept for historical purposes.

Markers are stored in an append-only file, '.hg/store/obsstore'.
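
As a rough illustration of how a tool might peek at this file, the sketch below reads the version header that the V1 section mentions. It assumes the header is a single unsigned byte at offset 0; this page does not spell out the header layout, so treat that as an assumption. The helper name `read_obsstore_version` is hypothetical.

```python
import os
import struct


def read_obsstore_version(repo_root):
    """Return the version number found at the start of the obsstore.

    Assumption: the version header is one unsigned byte at offset 0.
    Returns None when the file is missing or empty.
    """
    path = os.path.join(repo_root, ".hg", "store", "obsstore")
    try:
        with open(path, "rb") as fp:
            header = fp.read(1)
    except FileNotFoundError:
        return None
    if not header:
        return None
    # ">B" decodes a single unsigned byte.
    return struct.unpack(">B", header)[0]
```

Because the file is append-only, the header is written once and later markers only ever extend the file, so reading the first byte is enough to pick a parser.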

V1 (old) Format

(see the inline documentation for the latest data)

quick summary

longer explanation

The file starts with a version header:

The header is followed by the markers. Each marker is made of:

V2 (current) Format


There are two extra pieces of information we would like to see in a second version of the format:

possible changes



We have multiple options for storing parents:

  1. Having an explicit field similar to successors (one byte for the number of parents, then the parents)
  2. Having an explicit field but storing the number of parents in the bit field (since we never have more than 2 parents)
  3. Using the successors field. A negative number of successors means the marker is a prune.

Option (3) saves the most space but prevents us from storing parent information for more changesets if needed in the future (we do not have a final exchange plan yet).

Options (1) and (2) take 2 to 8 bits more than (3) but are more flexible.
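
To make the cost of option (1) concrete, here is a minimal sketch of an explicit parents field: one count byte followed by the parent nodes. The function names and the 20-byte node size are assumptions made for illustration; they are not taken from this page.

```python
import struct

NODE_SIZE = 20  # assumption: 20-byte (SHA-1) node identifiers


def encode_parents(parents):
    """Option (1): one unsigned count byte, then each parent node."""
    if len(parents) > 255:
        raise ValueError("count does not fit in one byte")
    for node in parents:
        if len(node) != NODE_SIZE:
            raise ValueError("unexpected node size")
    return struct.pack(">B", len(parents)) + b"".join(parents)


def decode_parents(data, offset=0):
    """Return (parents, new_offset) read from data starting at offset."""
    (count,) = struct.unpack_from(">B", data, offset)
    offset += 1
    parents = []
    for _ in range(count):
        parents.append(data[offset:offset + NODE_SIZE])
        offset += NODE_SIZE
    return parents, offset
```

The count byte is the 8 bits of overhead mentioned above. Option (2) keeps the same node payload but moves the count into 2 bits of the flag field.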

bit field

If we extend the bit field to 2 bytes, it makes sense to use option (2) for storing parents.

proposed V2 Format

The number of parents (P) would be hidden in the bit field. We need to store 4 possible values here: 0 parents, 1 parent, 2 parents, and no parent information stored (ø). A possible assignment is 00, 01, 10 and 11 respectively. This lets both the 0-parent case and the ø case be 0 modulo 3, so the value modulo 3 gives the number of parent nodes stored.
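
The 2-bit assignment can be sketched as follows. The position of the two bits inside the 2-byte field (`PARENT_SHIFT`) and all names are hypothetical; only the 00/01/10/11 assignment comes from the text above.

```python
PARENT_SHIFT = 14        # hypothetical: top two bits of a 16-bit flag field
PARENT_MASK = 0b11
NO_PARENT_INFO = 0b11    # the "ø" case: no parent information stored


def pack_parent_bits(flags, parents):
    """Store the parent count in the flag field.

    `parents` is None when no parent information is recorded,
    otherwise a sequence of 0, 1 or 2 nodes.
    """
    if parents is None:
        code = NO_PARENT_INFO
    elif len(parents) <= 2:
        code = len(parents)
    else:
        raise ValueError("a changeset has at most 2 parents")
    cleared = flags & ~(PARENT_MASK << PARENT_SHIFT)
    return cleared | (code << PARENT_SHIFT)


def unpack_parent_bits(flags):
    """Return (has_parent_info, number_of_parent_nodes_to_read)."""
    code = (flags >> PARENT_SHIFT) & PARENT_MASK
    # 0 and 3 are both 0 modulo 3: neither case is followed by any node.
    return code != NO_PARENT_INFO, code % 3
```

With this layout a reader can compute how many nodes follow with a single modulo, and only needs the extra comparison when it cares about the difference between "no parents" and "unknown".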

V3 (future) format


proposed changes

Architecture overview

docket file

The principle and content are similar to those of the persistent nodemap docket.

The docket is updated by each transaction that adds obsmarkers.

index file

The index file is mostly similar to the persistent nodemap. A radix tree allows lookup by node prefix. However, the stored data is a bit different: the nodemap stores a revision number, while the obsstore index stores a triplet of addresses, each of one obsmarker block within the data file. The three stored values address:

As with the persistent nodemap, once a non-ambiguous prefix has been found, we need to validate that it matches the full node we are looking for. To do so, we have to check the node stored in one of the pointed-to markers.
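
The lookup-then-validate step can be sketched with a small in-memory radix tree keyed by hex digits. This is an illustration under simplifying assumptions, not the on-disk layout: each leaf holds a single address instead of a triplet, the `data` dict stands in for the data file (address to full hex node), and all names are hypothetical.

```python
def new_block():
    """One radix-tree block: 16 slots, one per hex digit."""
    return [None] * 16


def insert(root, data, hexnode, address):
    """Index `hexnode` under the shortest prefix that is unique so far."""
    data[address] = hexnode
    block, depth = root, 0
    while True:
        slot = int(hexnode[depth], 16)
        entry = block[slot]
        if entry is None:
            block[slot] = address
            return
        if isinstance(entry, list):        # inner block: go one level down
            block, depth = entry, depth + 1
            continue
        other = data[entry]                # leaf collision
        if other == hexnode:
            block[slot] = address
            return
        child = new_block()                # push the existing leaf down
        child[int(other[depth + 1], 16)] = entry
        block[slot] = child
        block, depth = child, depth + 1


def lookup(root, data, prefix):
    """Return the address for `prefix`, or None if nothing matches.

    Raises LookupError when the prefix is ambiguous.
    """
    block = root
    for char in prefix:
        entry = block[int(char, 16)]
        if entry is None:
            return None
        if not isinstance(entry, list):
            # The tree only proves the leading digits match, so the full
            # node stored with the marker must be checked.
            return entry if data[entry].startswith(prefix) else None
        block = entry
    raise LookupError("ambiguous prefix: %r" % prefix)
```

For example, with nodes starting `ab…` and `ac…` indexed, looking up `abf` reaches the `ab…` leaf after two digits, and only the validation against the stored node reveals that it is not a match.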

As with the persistent nodemap, after a transaction, the necessary new radix-tree blocks are appended to the file. When the dead-block / all-block ratio becomes too high, the index file is rewritten from scratch and a new ID is used.
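
The rewrite decision amounts to a ratio check, sketched below. The function name and the default threshold are hypothetical; this page does not state what ratio counts as "too high".

```python
def should_rewrite_index(dead_blocks, total_blocks, max_dead_ratio=0.5):
    """Decide whether the append-only index should be rewritten.

    `dead_blocks` counts blocks superseded by later appends,
    `total_blocks` counts every block in the file.
    `max_dead_ratio` is a hypothetical threshold.
    """
    if total_blocks <= 0:
        return False
    return dead_blocks / total_blocks > max_dead_ratio
```

A transaction would typically evaluate this after appending its new blocks, then either keep the current file or write a fresh one under a new ID.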

data file

The data file contains marker blocks. A marker block starts with a triplet (or, more likely, a tuple) of other block addresses (or None), followed by an obsmarker. The addresses in the initial triplet (more likely tuple) point to:

CEDObsstoreFormat (last edited 2020-09-25 08:48:48 by Pierre-YvesDavid)