This section records some design and implementation details.
The basic relationship between SAX and DOM is shown in the following UML diagram.
The core of the relationship is the Handler concept. From the SAX side, Reader parses JSON from a stream and publishes events to a Handler. Writer implements the Handler concept to handle the same set of events. From the DOM side, Document implements the Handler concept to build a DOM according to the events. Value supports a Value::Accept(Handler&) function, which traverses the DOM to publish events.
With this design, SAX does not depend on DOM. Even Reader and Writer have no dependencies between them. This provides the flexibility to chain event publishers and handlers. Besides, Value does not depend on SAX either. So, in addition to stringifying a DOM to JSON, a user may also pass it to an XML writer, or to any other handler.
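The decoupling described above can be illustrated with a minimal sketch. The event set, CountingHandler, TextHandler, and PublishSample below are hypothetical and much smaller than RapidJSON's real Handler concept; they only show how a publisher written against the concept can drive unrelated handlers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, reduced event set. Any type with these member functions
// satisfies this sketch's "Handler" concept (checked at template instantiation).
struct CountingHandler {
    std::size_t events = 0;
    bool Null()                     { ++events; return true; }
    bool Bool(bool)                 { ++events; return true; }
    bool Int(int)                   { ++events; return true; }
    bool String(const std::string&) { ++events; return true; }
    bool StartArray()               { ++events; return true; }
    bool EndArray(std::size_t)      { ++events; return true; }
};

// Plays the role of a writer: turns the same events back into JSON text.
// String escaping is omitted to keep the sketch short.
struct TextHandler {
    std::string out;
    std::vector<bool> first;  // one entry per open array: true until its first element

    void Prefix() {
        if (!first.empty()) {
            if (!first.back()) out += ',';
            first.back() = false;
        }
    }
    bool Null()                       { Prefix(); out += "null"; return true; }
    bool Bool(bool b)                 { Prefix(); out += b ? "true" : "false"; return true; }
    bool Int(int i)                   { Prefix(); out += std::to_string(i); return true; }
    bool String(const std::string& s) { Prefix(); out += '"' + s + '"'; return true; }
    bool StartArray()                 { Prefix(); out += '['; first.push_back(true); return true; }
    bool EndArray(std::size_t)        { first.pop_back(); out += ']'; return true; }
};

// Plays the role of an event publisher (a reader, or a DOM traversal).
// It depends only on the concept, never on a concrete handler type.
template <typename Handler>
bool PublishSample(Handler& h) {
    return h.StartArray() && h.Int(1) && h.String("two") &&
           h.Null() && h.Bool(true) && h.EndArray(4);
}
```

Because PublishSample is a template, swapping TextHandler for CountingHandler needs no change on the publishing side, which is the property the design relies on.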
Both the SAX and DOM APIs depend on 3 additional concepts: Allocator, Encoding and Stream. Their inheritance hierarchy is shown below.
Value (actually a typedef of GenericValue<UTF8<>>) is the core of the DOM API. This section describes its design.
Value is a variant type. In RapidJSON's context, an instance of Value can contain 1 of 6 JSON value types. This is possible by using a union. Each Value contains two members: union Data data_ and unsigned flags_. The flags_ member indicates the JSON type, along with additional information.
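As a rough illustration of the union-plus-flags layout, here is a miniature variant. MiniValue, MiniType, and their members are hypothetical names for this sketch and are not RapidJSON's definitions; only the overall shape (a data union next to an unsigned flags word) follows the text.

```cpp
#include <cstdint>

// Hypothetical type numbers for this sketch only.
enum MiniType : unsigned { kMiniNull = 0, kMiniBool = 1, kMiniInt = 2, kMiniDouble = 3 };

struct MiniValue {
    // All alternatives share the same storage; flags_ says which one is live.
    union Data {
        bool b;
        std::int32_t i;
        double d;
    } data_;
    unsigned flags_;

    MiniValue() : flags_(kMiniNull) { data_.d = 0.0; }
    explicit MiniValue(bool b) : flags_(kMiniBool) { data_.b = b; }
    explicit MiniValue(std::int32_t i) : flags_(kMiniInt) { data_.i = i; }
    explicit MiniValue(double d) : flags_(kMiniDouble) { data_.d = d; }

    MiniType GetType() const { return static_cast<MiniType>(flags_); }
    bool IsInt() const { return flags_ == kMiniInt; }
    // Caller is expected to check IsInt() first, as with any tagged union.
    std::int32_t GetInt() const { return data_.i; }
};

// The union is only as large as its largest member.
static_assert(sizeof(MiniValue::Data) == sizeof(double), "union shares storage");
```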
The following tables show the data layout of each type. The 32-bit/64-bit columns indicate the size of the field in bytes.
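| JSON type | Field | 32-bit | 64-bit |
|---|---|---|---|
| String | Pointer to the string (may own) | 4 | 8 |
| String | Length of string | 4 | 4 |
| Object | Pointer to array of members (owned) | 4 | 8 |
| Object | Number of members | 4 | 4 |
| Object | Capacity of members | 4 | 4 |
| Array | Pointer to array of values (owned) | 4 | 8 |
| Array | Number of values | 4 | 4 |
| Array | Capacity of values | 4 | 4 |
| Number (Int) | 32-bit signed integer | 4 | 4 |
| Number (Uint) | 32-bit unsigned integer | 4 | 4 |
| Number (Int64) | 64-bit signed integer | 8 | 8 |
| Number (Uint64) | 64-bit unsigned integer | 8 | 8 |
| Number (Double) | Double precision floating-point | 8 | 8 |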
Here are some notes:
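SizeType is a typedef of unsigned, which is why the length, size, and capacity fields take 4 bytes even on 64-bit architectures.
An Int is always an Int64, but the converse is not always true.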
flags_ contains both the JSON type and other additional information. As shown in the above tables, each JSON type contains a redundant kXXXType and kXXXFlag. This design optimizes two operations: testing bit-flags (IsNumber()) and obtaining a sequential number for each type (GetType()).
String has two optional flags.
kCopyFlag means that the string owns a copy of the string.
kInlineStrFlag means using Short-String Optimization.
Number is a bit more complicated. A normal integer value can contain kIntFlag, kUintFlag, kInt64Flag and/or kUint64Flag, according to the range of the integer. Numbers with a fraction, and integers beyond the 64-bit range, are stored as double with kDoubleFlag.
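The range-dependent integer flags can be sketched as follows. The bit values and the helper names FlagsForInt64, IsNumber, and TypeOf are assumptions made for illustration; RapidJSON's actual constants may differ. The sketch keeps a sequential type number in the low bits and capability bits above it, so that both kinds of query are a single mask operation.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical bit assignments, chosen only for this sketch.
enum : unsigned {
    kTypeMask   = 0x07,  // low bits hold the sequential type number
    kNumberType = 6,     // assumed type number for "number"
    kNumberFlag = 0x08,
    kIntFlag    = 0x10,
    kUintFlag   = 0x20,
    kInt64Flag  = 0x40,
    kUint64Flag = 0x80
};

// Sets every integer flag whose range can represent v without loss.
inline unsigned FlagsForInt64(std::int64_t v) {
    unsigned f = kNumberType | kNumberFlag | kInt64Flag;
    if (v >= 0) {
        f |= kUint64Flag;
        if (v <= std::numeric_limits<std::uint32_t>::max()) f |= kUintFlag;
        if (v <= std::numeric_limits<std::int32_t>::max())  f |= kIntFlag;
    } else if (v >= std::numeric_limits<std::int32_t>::min()) {
        f |= kIntFlag;
    }
    return f;
}

// Bit test: one AND, no comparison against several type numbers.
inline bool IsNumber(unsigned flags) { return (flags & kNumberFlag) != 0; }

// Sequential type number: one AND with the mask.
inline unsigned TypeOf(unsigned flags) { return flags & kTypeMask; }
```

In this sketch every value that gets kIntFlag also gets kInt64Flag, while a value such as 3000000000 gets kInt64Flag without kIntFlag, matching the note that an Int is always an Int64 but not the converse.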
Kosta provided a very neat short-string optimization. The idea is as follows. Excluding the flags_, a Value has 12 or 16 bytes (32-bit or 64-bit) for storing actual data. Instead of storing a pointer to a string, it is possible to store short strings in this space directly. For an encoding with a 1-byte character type (e.g. char), it can store a string of at most 11 or 15 characters inside the Value.
| Field | 32-bit | 64-bit |
|---|---|---|
| MaxChars - Length | 1 | 1 |
A special technique is applied. Instead of storing the length of the string directly, it stores (MaxChars - length). This makes it possible to store 11 characters with a trailing \0, because at full length the stored value is 0 and doubles as the terminator.
This optimization can reduce memory usage for copied strings. It can also improve cache locality and thus runtime performance.
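The inverted-length trick can be sketched as below, assuming a 16-byte inline buffer of char (the 64-bit case). The struct ShortString and its members are hypothetical names for this sketch. Unlike the layout in the table, it uses one buffer whose last byte plays the role of the (MaxChars - length) field, which keeps the example within a single array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical inline string storage: 15 characters plus one bookkeeping byte.
struct ShortString {
    static const std::size_t kBufSize  = 16;            // bytes available inline
    static const std::size_t kMaxChars = kBufSize - 1;  // longest storable string
    static const std::size_t kLenPos   = kMaxChars;     // last byte: kMaxChars - length

    char str[kBufSize];

    static bool Usable(std::size_t len) { return len <= kMaxChars; }

    void Set(const char* s, std::size_t len) {
        assert(Usable(len));
        std::memcpy(str, s, len);
        str[len] = '\0';
        // At full length this writes 0 into the same byte as the terminator,
        // so the bookkeeping byte costs nothing extra.
        str[kLenPos] = static_cast<char>(kMaxChars - len);
    }

    std::size_t Length() const {
        return kMaxChars - static_cast<std::size_t>(str[kLenPos]);
    }
};
```

Storing the length directly would need a byte that can never be shared with the terminator, so the maximum inline length would drop by one character.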