These are the notes for the Skype conference call held on Wednesday, January 23, 2013 (11:30-12:45), on the progress and activities of the Social Network Benchmark task force.

Participants:

  • Larri, Miquel, Norbert (UPC); Orri (OGL); Peter (VUA); Alex Averbuch (NEO)
  • Andrey (TUM) - absent

Summary:

Miquel explained that he would be leaving UPC next week and that Norbert is taking over the lead of the task force. Miquel explained that UPC has been studying the S3G2 generator, has derived its schema, and has developed some scripts to convert its RDF output to tabular files. Norbert mentioned that in the future it would probably be better to extend S3G2 with a direct option for tabular output.

UPC's experience so far is that the data in principle seems to comply with its expectations. The focus is now on the SIB query sets, which need improvement. Work on creating an update stream is postponed for the moment, and so are the related action points. The next immediate step for UPC is to run a number of graph analysis algorithms on such datasets in DEX.

Andrey missed the call, hence the status of the choke point analysis is unclear. Orri mentioned that he recently did a full analysis of SQL and the choke points exposed by it. Peter remarked that while relational/TPC choke points are important, the social network/graph nature of the benchmark means such a list would still be incomplete. Peter agreed to have a 1:1 call soon with Andrey to catch up.

Peter argued for including in the third benchmark query set tasks that would map to graph processing frameworks such as Pregel/GraphLab/Giraph/Green-Marl. Alex remarked that such graph programming is not the focus of NEO's Neo4j product.

In order to better focus the discussions around benchmark design, Peter proposed to take a step back and attempt to write a goals and requirements document that would describe at least three benchmarks in terms of which kinds of systems they target, which kinds of scenarios they cover, and which performance dimensions are relevant for them.

Action Points:

  1. Alex to provide NEO feedback
    1. will comment on the SIB query sets (transactional, analytical, graph analysis)
    2. will try to provide NEO-typical queries that are not yet expressed (or expressible) in these, for instance involving state
    3. will try to approach NEO users that use NEO for social network data to obtain query examples
  2. Orri and Peter will create a high-level 'requirements and goals' document for the social network benchmarks
    1. Orri provides input by email, Peter will refine it and add it to the wiki
       
  3. UPC to continue working with the S3G2 data generator and adapt it (document of work here)
    1. run existing DEX code to compute graph metrics
    2. provide pre-generated datasets with README to NEO
    3. together with Duc create a true 'scale factor' that allows to predictably generate a dataset of a certain size
  4. UPC to approach Accesso and Havas Media
    1. possibly include Peter in a call
    2. show the social network schema and query sets and ask for feedback
    3. try to obtain real datasets (Facebook, Twitter, etc)
    4. run existing DEX code to compute graph metrics and compare with S3G2
  5. timestamps in S3G2 [ postponed ]
    1. Duc to explain the current situation (what timestamps are generated and with what constraints and correlations)
    2. Duc to explain and share his stream data generator
    3. UPC to enhance the stream generator to generate an update query workload
    4. UPC to design a mechanism to split an S3G2 dataset into a snapshot and a subsequent stream of updates
  6. query choke points
    1. Andrey to modify the transactional queries so they include optionals and are affected by correlations (add more parameters)
      1. include work to devise a mechanism to "learn" similar parameter bindings of correlated parameters with the same selectivities
    2. Andrey to encode the analytical SIB queries in SPARQL
    3. provide more choke point ideas
      1. Orri offered to provide more, input from others also welcome (Peter?, Andrey/Thomas?, ...)