The scale factor of a SNB dataset is the number of simulated users. All dataset scales contain data for
three years of social network activity.
The validation scale is 10,000 users. Official SNB results may be published at this scale, at the 30,000
user scale, or at a power of 10 multiple of either: for example 100,000, 300,000, 1,000,000, 3,000,000
and so forth.
The dataset is divided into a bulk loadable initial database population and an update stream. These are
generated by the SNB data generator. The data generator has options for splitting the dataset into any
number of files.
The update stream contains the latest 10% of the events in the simulated social network. These events
form a single serializable sequence in time. Some events will depend on preceding events, for example
a post must exist before a reply to the post is posted. The data generator guarantees that these are
separated by at least 2 minutes of simulation time.
The update stream may be broken into arbitrarily many substreams. Valid partitioning criteria are for
example the user performing the action, if any, or a round-robin division where each successive event
goes into a different substream. When two events occur in the same substream, their order must be
the same as in the original update stream.
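For illustration, the two partitioning criteria named above could be expressed as below. This is a
non-normative sketch; the table and column names (update_stream, event_seq, user_id, sim_timestamp)
are hypothetical and not part of the benchmark.

    -- Sketch of substream assignment over the generated update stream.
    -- Events that land in the same substream retain their original order.
    SELECT event_id,
           MOD(user_id, :n_substreams)   AS substream_by_user,    -- per-user criterion
           MOD(event_seq, :n_substreams) AS substream_round_robin -- round-robin criterion
    FROM update_stream
    ORDER BY sim_timestamp;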
Rationale: The authors are aware that the prevalent practice for online benchmarks is to tie the
reported throughput to the scale, e.g. max 12.5 tpmC per warehouse in TPC-C. The authors depart from
this practice here because with throughput tied to scale, test systems with interesting throughputs
rapidly become very expensive, raising the entry barrier for publishing a result. It is thought that
scaling in buckets lowers the barrier of entry and reduces incentive to use hardware configurations that
would be unusual in a production environment.
Benchmark Schema and Data Model
SNB may be implemented with different data models, e.g. relational, RDF and graph data models.
The reference schema is provided as RDFS and SQL. The data generator produces TTL syntax for RDF
and comma separated values for other data models.
A single attribute has a single data type, as follows:
- Identifier – This is an integer value foreign key or a URI in RDF. If this is an integer column, the
implementation data type should support at least 2^50 distinct values.
- Datetime – A datetime should support a date range from 0000 to 9999 in the year field, with a
resolution of no less than one second.
- Short string – The string column for names may have a variable length and may have a
declared maximum length, e.g. 40 characters.
- Long string – For example a post content may be a long string that is often short in the data
but may not declare a maximum length and must support data sizes of up to 1MB.
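As a non-normative illustration, these types could map to a relational schema as follows; the table
and column names are illustrative only:

    -- Sketch of the four attribute types in SQL.
    CREATE TABLE post (
        ps_id            BIGINT,       -- identifier: BIGINT covers well over 2^50 values
        ps_creation_date TIMESTAMP,    -- datetime: years 0000-9999, 1 s resolution
        ps_creator_id    BIGINT,       -- identifier used as a foreign key
        ps_browser       VARCHAR(40),  -- short string with declared maximum length
        ps_content       CLOB          -- long string: no declared maximum, up to 1 MB
    );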
A single attribute in the reference schema may not be divided into multiple attributes in the target
schema.
A schema on the DBMS is optional. An RDF implementation for example may work without one. An RDF
implementation is allowed to load the RDF reference schema and to take advantage of the data type
and cardinality statements therein.
A relational or graph schema may specify system specific options affecting storage layout. These may
for example specify vertical partitioning. Vertical partitioning means anything from a column store
layout with per-column allocated storage space to use of explicit column groups. Any mix of row or
column-wise storage structures is allowed as long as this is declaratively specified, data structure by
data structure. Data structure here means for example a table or an index.
Covering indices and clustered indices are allowed. If these are defined, then all replications of data
implied by these must be maintained statement by statement, i.e. each auxiliary data structure must be
consistent with any other data structures of the table after each data manipulation operation.
A covering index is an index which materializes a specific order of a specific subset or possibly all
columns of a table. A clustered index is an index which materializes all columns of a table in a specific
order, which order may or may not be that of the primary key of the table. A clustered or covering index
may be the primary or only representation of a table.
Any subset of the columns on a covering or clustered index may be used for ordering the data. Hash
based indices, tree based indices and combinations thereof are all allowed, in row-wise, column-wise
or hybrid forms.
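For example, a covering index might be declared as below. The INCLUDE syntax is that of SQL Server
or PostgreSQL 11+, and the index and column names are illustrative:

    -- Covering index: orders on (creator, creation date) and additionally
    -- materializes the content column, so queries touching only these
    -- columns never need the base table.
    CREATE INDEX post_creator_date
        ON post (ps_creator_id, ps_creation_date)
        INCLUDE (ps_content);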
Implementation Language and Data Access Transparency
The queries and updates may be implemented in a declarative query language or as procedural code
using an API.
If a declarative query language is used, e.g. SPARQL or SQL, then explicit query plans are prohibited in
all the read-only queries. The update transactions may still consist of multiple statements, effectively
amounting to explicit plans.
Explicit query plans include but are not limited to:
- Directives or hints specifying a join order or join type
- Directives or hints specifying an access path, e.g. which index to use
- Directives or hints specifying an expected cardinality, selectivity, fan-out or any other information
that pertains to the expected number of results or cost of all or part of the query.
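For example, a query of the following shape would be disqualified for the read-only queries, since the
Oracle-style hints dictate an access path and a join order (the aliases and index name are illustrative):

    -- NOT allowed in read-only queries: hints forcing an index and a join order.
    SELECT /*+ INDEX(p post_creator_date) LEADING(k p) */ p.ps_content
    FROM knows k, post p
    WHERE k.k_person2_id = p.ps_creator_id
      AND k.k_person1_id = :person;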
Auxiliary Data Structures and Precomputation
Q2 retrieves for a person the latest n posts or comments posted by a contact of the person.
An implementation may choose to precompute this ‘top of the wall’ query.
If doing so, inserting any new post or comment will add the item in question to the materialized top k
post views of each of the contacts of the person. If, after insertion, this list would be longer than 20
items, the transaction deletes the oldest item.
If this precomputation is applied, the update of the ‘top of the wall’ materialization of the users
concerned must be implemented as a single transaction.
This precomputation may be implemented as client side logic in the test driver, as stored procedures or
as triggers. In all cases the operations, whether one or many, must constitute a single transaction. A
SPARQL protocol operation consisting of multiple statements may be a valid implementation if the SUT
executes the statements as a single transaction.
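A minimal sketch of such a transaction is given below, assuming hypothetical tables
knows(k_person1_id, k_person2_id) and top_wall(person_id, post_id, creation_date); none of these
names are part of the benchmark:

    BEGIN;
    -- (the INSERT of the post row itself belongs in this same transaction)
    -- add the new post to the materialized view of each contact of the creator
    INSERT INTO top_wall (person_id, post_id, creation_date)
        SELECT k_person1_id, :post_id, :creation_date
        FROM knows WHERE k_person2_id = :creator_id;
    -- trim any list that now exceeds 20 items by deleting its oldest entry
    DELETE FROM top_wall
        WHERE (person_id, creation_date) IN (
            SELECT person_id, MIN(creation_date)
            FROM top_wall
            GROUP BY person_id
            HAVING COUNT(*) > 20);
    COMMIT;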
Other precomputation of query results is explicitly prohibited.
Components of Benchmark Execution
A benchmark execution is divided into the following steps:
- Data Preparation – This includes running the data generator, placing the generated files in
a staging area, configuring storage, setting up the SUT configuration and preparing any data
partitions in the SUT. This may include preallocating database space but may not include
loading any data or defining any schema having to do with the benchmark.
- Bulk Load – This includes defining the database schema, if any, loading the initial database
population, making this durably stored and gathering any optimizer statistics. The bulk load time
is reported and is equal to the amount of elapsed wall clock time between starting the schema
definition and receiving the confirmation message of the end of statistics gathering.
- Benchmark Run – The run begins after the bulk load or after another benchmark run. If the run
does not directly follow the bulk load, it must start at a point in the update stream that has not
previously been played into the database. In other words, a run may only include update events
whose timestamp is later than the latest post creation date in the database prior to start of run.
The run starts when the first of the test drivers sends its first message to the SUT. If the SUT is
in-process with the driver, the run starts when the driver starts.
- Measurement Window – The measurement window is the timed portion of the benchmark run.
It may begin at any time during the run. The activity during the measurement window must
meet the criteria set forth in Query Mix and must include enough updates to satisfy the criteria
in Minimum Measurement Window. The measurement window is terminated at the discretion
of the test sponsor at any time when the Minimum Measurement Window criteria are met. All
the processes constituting the SUT are to be killed at the end of the window or alternatively all
the hardware components of the SUT are to be powered off.
- Recovery Test – The SUT is to be restarted after the measurement window and the auditor will
verify that the SUT contains the entirety of the last update recorded by the test driver(s) as
successfully committed.
A checkpoint is defined as the operation which causes data persisted in a transaction log to become
durable outside of the transaction log. Specifically, this means that a SUT restart after instantaneous
failure following the completion of the checkpoint may not have recourse to transaction log entries
written before the end of the checkpoint.
A checkpoint typically involves a synchronization barrier at which all data committed prior to that
moment is required to be in durable storage that does not depend on the transaction log.
Not all DBMSs use a checkpointing mechanism for durability. For example a system may rely on
redundant storage of data for durability guarantees against instantaneous failure of a single server.
The measurement window may contain a checkpoint. If the measurement window does not contain
one, then the restart test will involve redoing all the updates in the window as part of the recovery test.
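In a system that exposes an explicit checkpoint operation, the test sponsor can request one directly;
for example, in PostgreSQL:

    -- Force a checkpoint: after it completes, recovery from an instantaneous
    -- failure no longer needs log entries written before the checkpoint.
    CHECKPOINT;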
The base unit of work is the insertion of one post.
For each post inserted, the following number of other operations must be completed:
For a run to qualify, the number of successfully executed operations must not deviate from the above
frequencies by more than 1%.
Minimum Measurement Window
The measurement window starts at the point in simulation time given by the timestamp of the post
with the highest creation date in the database. The update stream must contain at least 35 days worth of events in
simulation time that are in the future from the datetime of the latest pre-run post in the database.
The minimal measurement window corresponds to 30 days worth of events in simulation time, as
generated by the data generator. Thus the number of events in the minimal window is linear in the
scale factor and grows the database by approximately 3%.
When the test sponsor decides to start the measurement window, the time of the latest new post event
successfully completed by any of the test drivers is taken to be the start of the measurement window in
simulation time. The window may be terminated by the test sponsor at any time when the timestamp
of the latest new post event in the log of any of the test drivers is more than 30 days of simulation time
in the future of the starting timestamp.
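Expressed over a hypothetical driver_log table (not part of the benchmark), the termination condition
amounts to the following test:

    -- The window may close once the newest successfully completed new-post
    -- event lies at least 30 simulation days past the window start.
    SELECT MAX(sim_timestamp) >= :window_start + INTERVAL '30' DAY
    FROM driver_log
    WHERE operation = 'new_post' AND status = 'success';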
The test summary tool (see below) may be used for reading the logs being written by a test driver.
A qualifying run must use the SNB test driver provided with the data generator. The test driver may
be modified by the test sponsor for purposes of interfacing to the SUT. The parameter generation,
result recording and workload scheduling parts of the test driver should not be changed. The auditor
needs to have access to the test driver source code used for producing the driver used in the reported
run.
The test driver produces the following artifacts as a by-product of the run:
Start and end time of each execution in real time, recorded with microsecond precision, including the
identifier of the operation and any substitution parameters. If the operation is an update taken from
the simulation timeline then the timestamp in simulation time is also recorded.
The test driver is scale-out capable. Many instances of the test driver may be used in a test run. The
number and configuration of the test drivers must be disclosed, along with hardware details of the
platform running the driver(s), together with details of the network interface connecting the drivers
to the SUT. The SUT hardware may also be used for hosting the driver(s), at the discretion of the test
sponsor.
A separate test summary tool provided with the test driver analyzes the test driver log(s) after a
measurement window is completed.
The tool produces for each of the distinct queries and transactions the following summary:
- Count of executions
- Start and end date of the window in real time and in simulation time.
- Metric in operations per second at scale.
- Number of test drivers
- Number of database sessions (threads) per test driver
- Percentile/maximum execution time.
The interactive workload requires full ACID support from the SUT.
All the updates in a transaction must either all take place or all be cancelled.
If a database object, e.g. table, has auxiliary data structures, e.g. indices, the content of these must be
consistent after the commit or rollback of a transaction. If multiple client application threads share
one transaction context, these may transiently see inconsistent states, e.g. there may be a time when an
insert of a row is reflected in one index of a table but not in another.
If a transaction reads the database with intent to update, the DBMS must guarantee that repeating the
same read within the same transaction will return the same data. This also means that no more and
no less data rows must be returned. In other words, this corresponds to snapshot or to serializable
isolation. This level of isolation is applied for the operations where the transaction mix so specifies.
If the database is accessed without transaction context or without intent to update, then the DBMS
should provide read committed semantics, e.g. repeating the same read may produce different results
but these results may never include effects of pending uncommitted transactions.
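In SQL terms, the two regimes correspond to statements of the following kind (standard SQL syntax;
the exact level names vary by DBMS):

    -- Operations that read with intent to update: snapshot or serializable.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- Reads without intent to update: read committed suffices.
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;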
The effects of a transaction must be made durable against instantaneous failure before the SUT
confirms the successful commit of a transaction to the application.
For systems using a transaction log, this implies syncing the durable media of the transaction log before
confirming success to the application. This will typically entail group commit, where transactions that
fall in the same short window are logged together and the logging device will typically be an SSD or
battery backed RAM on a storage controller. For systems using replication for durability, this will entail
receipt of a confirmation message from the replicating party before confirming successful commit to
the application.
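As one concrete example, PostgreSQL's synchronous_commit setting expresses this requirement: the
commit call does not return until the log record is durably flushed (and, with synchronous replication
configured, acknowledged by the replica):

    -- PostgreSQL example: COMMIT returns only after the transaction log
    -- record is flushed to durable storage.
    SET synchronous_commit = on;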
The full disclosure report consists of the following:
- Name and contact information of the test sponsor
- Description of the DBMS
- Description of the SUT hardware
- Description of additional resources used for driving the test, if separate from the SUT hardware
- 3-year total cost of ownership for the SUT, including hardware and software
- Metric and numerical quantities summary produced by the test driver summary function
- Attestation letter by the auditor
- Supporting Files
The Supporting Files is a file archive (e.g. tar or zip) with the following:
- Complete configuration files for the DBMS
- Configuration files for the test driver(s).
- Any operating system tuning parameters with non-default values in effect at the time of the run.
For example sysctl.conf on Linux.
- Copy of the test driver log(s) for the run.
- Complete modifications of the test driver source code, if any