SNTF Concall 2013/08/15

Participants: Peter, Larri, Orri, Alex
Lead: Renzo

- Report on the work done in recent weeks.

Renzo proposes the following tasks for the next weeks:
- preparation of the deliverables.
- work on the design of the workload, in particular the definition of substitution parameters.

Orri presents his comments on the selection of substitution parameters.
He describes several technical issues and requirements for the data (e.g. correlations).

Alex reports that he has developed a data checker for the CSV files of the data generator. It is very useful for checking the consistency of the data.
He talks about the test driver for Neo4j that he is developing and comments on some issues with the substitution parameters.

Peter highlights that a special effort is needed in the generation of input parameters.
We need queries with predictable behavior. Hence the test data must be selected carefully.
This could be somewhat difficult because the data is not uniform.
We need to separate the data into classes. Then the data generator can run a post-processing step to select the test data.
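
The separation into classes mentioned above could look roughly like the following Python sketch. It is only an illustration, not the data generator's actual post-processing: the function name, the use of a friend count as the classification criterion, and the class boundaries are all assumptions.

```python
from collections import defaultdict

def classify_by_count(counts, boundaries):
    """Group parameter candidates into classes by a per-candidate count.

    counts: dict mapping a candidate (e.g. a person id) to a count
            (e.g. its number of friends). Both are hypothetical here.
    boundaries: sorted upper bounds; class i holds counts <= boundaries[i],
            and one extra class holds everything above the last bound.
    """
    classes = defaultdict(list)
    for candidate, count in sorted(counts.items()):
        index = len(boundaries)  # default: the class above the last bound
        for i, bound in enumerate(boundaries):
            if count <= bound:
                index = i
                break
        classes[index].append(candidate)
    return dict(classes)

# Made-up sample data: non-uniform friend counts per person.
friend_counts = {"p1": 2, "p2": 150, "p3": 3, "p4": 40, "p5": 45}
classes = classify_by_count(friend_counts, [10, 100])
```

Parameters drawn from a single class should then produce queries of comparable cost, which is the predictable behavior asked for.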

Larri: if we have deterministic generation, then the parameters can be selected in the same way (independent of the scale).
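
A minimal sketch of this idea in Python, assuming the candidates are available as a collection of identifiers: a fixed seed and a fixed ordering make the selection repeatable. The function name and the seed value are hypothetical.

```python
import random

def select_parameters(candidates, how_many, seed=42):
    """Pick parameter bindings reproducibly from a set of candidates."""
    rng = random.Random(seed)     # private generator, fixed seed (assumed value)
    ordered = sorted(candidates)  # fixed order, independent of input order
    return rng.sample(ordered, min(how_many, len(ordered)))

# The same candidates always yield the same selection.
first = select_parameters(["p3", "p1", "p2", "p4"], 2)
second = select_parameters(["p4", "p2", "p1", "p3"], 2)
```

The same rule can be applied at every scale factor; only the candidate set changes.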

We must avoid different parameters, as in BSBM.
The idea is to have something like the Star Schema Benchmark.
The variants of the queries should be predictable.

Orri comments on data and substitution parameters in TPC-H.

Peter proposes to carry out a data analysis of the datasets, looking for special parameters for individual queries. The objective is to study the data to find interesting parameter bindings.

Larri: we can simplify the selection by thinking of a real application where the parameters are "random".

Orri comments on several issues concerning the analysis and generation of the substitution parameters.

Alex: for the BI workload we could have GraphLab as a partner in LDBC.

Action points for next two weeks:
- Alex: implementation of the data checker and the test driver for Neo4j
- Renzo: design of the new features of the data generator
- Orri: analysis of the interactive workload and the substitution parameters (next Monday)
- All: preparation of deliverables


