
DBGEN

ID | Begin | End | Name | Lead | Description | Result | Dep | Status
TG1 | 16/Jan | 23/Jan | Quantization of population | UPC

Description: This task consists of quantizing the populations of countries so that they become more predictable and stable. The goal is to reduce the variation in execution times when different workloads with the same parameters are executed.

Result: A new version of the generator in which the populations of the countries have been quantized, so that the generated data respects these new populations.

Status: DONE. To be validated by OGL.
TG2 | 23/Jan | 20/Mar | Quantization of the rest of characteristics | UPC

Description: Like the previous task, this task consists of quantizing the remaining characteristics of the graph, if any, for the same goal. One example is the number of friends (not yet approved).

Result: Similar to the previous task, but for the rest of the characteristics.

Status: Changes will be developed as they are required. Some of the potential new quantizations will depend on substitution parameters.
TG3 | 6/Feb | 13/Feb | Proposal of changes for Flashmob post generation | UPC

Description: The goal of this task is, first, to research how the generation of posts correlated with real-world flashmob events can be implemented.

Result: A proposal of changes to the generator.

Status: DONE. There is a wiki page explaining the generation process.
TG4 | 13/Feb | 27/Feb | Implementation of Flashmob post generation | UPC

Description: This task consists of implementing the proposal of the previous task.

Result: A new version of the generator in which the changes proposed in the previous task have been implemented, so that the generated post data is correlated with flashmob events.

Dep: TG3

Status: DONE. Arnau Prat has performed some validation of the results produced.

TG9 | 12/Mar | 20/Mar | Deterministic |

Description: Implementation of determinism.

Result: The generator has to be deterministic regardless of the number of machines used.
TG5 | 20/Mar | 26/Mar | Update streams | UPC

Description: Implementation of the update stream workload.

Result: A new version of the generator in which the update streams (or the data required) are generated along with the current data.

Dep: TQ4
TG6 | 26/Mar | 27/Mar | Metadata generation | UPC

Description: Implement the corresponding changes in the generator so that, for a given data set, it provides the metadata the workload generator needs to generate the workload.

Result: A new version of the generator that produces the metadata required by the workload generator.

Status: DISCARDED. The workload generators will use their own metadata and setup files.
TG7 | 27/Mar | 30/Mar | Documentation | UPC

Description: Finish all the documentation regarding the data generation process.

Result: A polished and complete documentation of the generator.
TG8 | - | - | New IDs (URIs) with timestamp prefix | UPC

Description: All entities with a creationDate (e.g. Person, Post, ...) will have new IDs (URIs) formed by concatenating the creationDate and the current ID.

Result: New long IDs (URIs) that can be sorted by creation date.
TG10 | 01/Apr | 02/Apr | Facebook-like | UPC

Description: Implementation of a Facebook-like degree distribution.

Result: The data generator produces a degree distribution similar to that expected in Facebook.
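
As an illustration of TG8, the sketch below shows one possible way to build IDs that sort by creation date. It is only a sketch under stated assumptions, not the generator's actual scheme: the bit layout, the `LOCAL_ID_BITS` width and the function names are hypothetical, and the creation date is assumed to be a timezone-aware timestamp with millisecond precision.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumption: the creation time (epoch milliseconds) occupies the high bits
# and the current ID the low bits. The width below is hypothetical.
LOCAL_ID_BITS = 20


def make_sortable_id(creation_date: datetime, local_id: int) -> int:
    """Concatenate the creation date and the current ID into one integer."""
    if not 0 <= local_id < (1 << LOCAL_ID_BITS):
        raise ValueError("local_id does not fit in the reserved bits")
    # Integer division of timedeltas avoids floating-point rounding.
    millis = (creation_date - EPOCH) // timedelta(milliseconds=1)
    return (millis << LOCAL_ID_BITS) | local_id


def split_sortable_id(sortable_id: int) -> Tuple[datetime, int]:
    """Recover the creation date and the current ID from a combined ID."""
    millis = sortable_id >> LOCAL_ID_BITS
    local_id = sortable_id & ((1 << LOCAL_ID_BITS) - 1)
    return EPOCH + timedelta(milliseconds=millis), local_id
```

Because the timestamp sits in the most significant bits, comparing two such IDs numerically orders the entities by creation date, with the current ID only breaking ties within the same millisecond.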

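The determinism requirement of TG9 (same output regardless of the number of machines) could be approached as sketched below. This is a hypothetical illustration, not the generator's implementation: the block size, the seed derivation and the function names are assumptions, and the workers are simulated sequentially in one process. The idea is that each block of entities gets its own random stream derived from the global seed and the block number, never from the worker that happens to process it.

```python
import random
from typing import List, Tuple

BLOCK_SIZE = 4  # hypothetical number of entities per block


def generate_block(global_seed: int, block_id: int) -> List[Tuple[int, int]]:
    """Generate one block using a seed that depends only on the block."""
    rng = random.Random(global_seed * 1_000_003 + block_id)
    first = block_id * BLOCK_SIZE
    # Each row is (entity id, some randomly generated attribute).
    return [(first + i, rng.randint(1, 100)) for i in range(BLOCK_SIZE)]


def generate(global_seed: int, num_blocks: int, num_workers: int) -> List[Tuple[int, int]]:
    """Assign blocks to workers round-robin and merge their output."""
    per_worker: List[List[int]] = [[] for _ in range(num_workers)]
    for block_id in range(num_blocks):
        per_worker[block_id % num_workers].append(block_id)
    rows: List[Tuple[int, int]] = []
    for blocks in per_worker:  # workers are simulated one after another
        for block_id in blocks:
            rows.extend(generate_block(global_seed, block_id))
    return sorted(rows)
```

With this layout, `generate(42, 8, 1)` and `generate(42, 8, 3)` return the same rows, because the assignment of blocks to workers never influences the random values drawn for a block.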
Interactive Workload & QGEN

ID | Begin | End | Name | Lead | Description | Result | Dep | Status
TQ1 | 12/Feb | 1/Mar | Scale factors for the interactive query set | VUA

Description: This task will identify the scale factors for the interactive benchmark, such as the number of users, the number of years, the required storage space, the number of clients, ... Regarding the transactions in the interactive queries, this task defines the ratio between the transaction load presented to the SUTs, the cardinality of the tables accessed by the transactions, ...

Result: The basic unit of scaling and the scaling factors are defined.
TQ2.1 | 9/Feb | 21/Feb | Finish Interactive workload | OGL

Description: 12 queries done, 8 queries to go; all will be delivered in both SPARQL and SQL. We expect to contribute more to the interactive mix, specifically 2 more short queries and some enlargement of the updates involving some precomputation; otherwise updates will be insignificant in the interactive workload.
TQ2.2 | 9/Feb | 30/Mar | Prototype of BI workload | OGL

Description: 14 queries.
TQ3 | | 30/Mar | Determining the SNB interactive mix ratios | OGL

Description: This depends in part on TUM's implementation of the update driver but will proceed in part without it. As suggested earlier, we aim at 5% update, 60% short and 35% long queries in the update mix; using a Virtuoso SQL-based implementation, we will set the frequencies so that this mix is obtained. After this we will look for feedback towards adjusting the mix components.

Result: Query probabilities in the interactive mix are defined.
TQ4 | 31/Jan | 30/Mar | Update driver | TUM

Description: The objective of the task is to design the driver that will generate the transactional workload for the interactive SN benchmark.

Result: Walking skeleton of the update driver.
TQ5 | 10/Mar | 21/Mar | Interactive benchmark execution rules & metrics | VUA

Description: This task defines the execution rules and the methods for calculating the benchmark metric. This may require learning from the rules and metrics of TPC-C and TPC-E.

Result: Execution rules and metrics are defined.

Status: (Renzo) Draft of the Specification (April 16).
TQ6 | 7/Feb | 30/Mar | Substitution parameters | VUA

Description: The objective of this task is to define methods for the selection and generation of test data for the interactive queries. Test data is the data used to replace the substitution parameters in the query templates and create the instance queries. The selected test data must ensure that the instance queries are comparable, in the sense of having similar execution complexity.

Result: Methods for selecting test data for the interactive queries.

Status: The problem of selecting test data is being investigated from two points of view (Andrey/Peter and Renzo).
TQ7 | 2/Mar | 15/Mar | Interactive benchmark validation | VUA

Description: This task will define the rules for validating the performance results (e.g., define the required precision of the output results).

Result: Rules for validating the results are defined.
TQ8 | 15/Mar | 30/Mar | Paper writing | VUA

Description: This period will be spent writing and revising the paper about the SNB benchmark, which may be submitted to VLDB.

Result: An industrial paper submitted.
TQ9 | 23/Mar | 30/Mar | Final documentation of the interactive query set | NEO

Description: Including description, parameters and results, validation setup, etc. (we have to be sure that the definitions of the queries correspond to the sample queries and results in GitHub).
TQ10 | | 30/Mar | SQL implementation of the SIB workloads | OGL

Description: SQL queries will be used as a baseline for performance measures, while SPARQL queries will be used as examples and to generate the validation results.
TQ11 | | 30/Mar | Integration of the BIBM driver and the new SIB update driver | OGL

Description: This should be minimal, as the BIBM query part should be an easy cut and paste into the update driver.
TQ12 | | 30/Mar | Analysis of the execution and desired query plans for SIB interactive and BI | OGL
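
To make the target mix named in TQ3 concrete (5% update, 60% short, 35% long), the sketch below draws operation classes with those probabilities and measures the shares actually obtained. It is only an illustration: the `MIX` dictionary, the function names and the use of a seeded pseudo-random generator are assumptions, and it says nothing about how the real driver or the Virtuoso-based calibration sets the frequencies.

```python
import random
from collections import Counter
from typing import Dict, List

# Target shares taken from the TQ3 description.
MIX: Dict[str, float] = {"update": 0.05, "short": 0.60, "long": 0.35}


def sample_mix(num_operations: int, seed: int = 0) -> List[str]:
    """Draw a sequence of operation classes following the target mix."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    classes = list(MIX)
    weights = [MIX[c] for c in classes]
    return rng.choices(classes, weights=weights, k=num_operations)


def observed_shares(operations: List[str]) -> Dict[str, float]:
    """Compute the share of each operation class in a generated sequence."""
    counts = Counter(operations)
    return {c: counts[c] / len(operations) for c in MIX}
```

Comparing `observed_shares(sample_mix(100_000))` against `MIX` is one simple way to check that a generated workload stays close to the intended ratios before feedback is used to adjust the mix components.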

Documentation

ID | Begin | End | Lead | Name | Description | Result | Dep | Status
TD1 | | 30/03 | NEO | Definition and implementation of a common presentation/documentation style for all benchmarks
TD2 | | 30/03 | NEO | Final organization and documentation of tools in GitHub based on the previous task