This page describes the generation process of flashmob posts, that is, posts whose creation date and topic are driven by flashmob events.
Broadly speaking, the FlashmobTagDictionary class picks a random selection of tags to act as flashmob events. The selected tags need not be unique, that is, the same tag can be selected multiple times. We refer to each instance of a tag as a flashmob tag. Along with the tag id, two other values are generated per flashmob tag: a uniformly distributed random date, and a level whose distribution follows a power law (low-relevance events are more frequent). The date indicates where the spike of activity for this flashmob tag is located, while the level indicates the relevance of that flashmob tag.
For instance, we may have three flashmob tags or instances, each of them represented as a tuple <tag, date, level>, as follows: <"Barack_Obama",100,2>, <"Barack_Obama",500,10> and <"Steven_Seagal",250,5>.
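The generation step described above can be sketched as follows. This is a minimal illustration and not the FlashmobTagDictionary implementation: the names FlashmobTag, sample_level and generate_flashmob_tags are hypothetical, dates are assumed to be integer hours, and the power law is assumed to be a discrete one over the allowed level range.

```python
import random
from typing import List, NamedTuple


class FlashmobTag(NamedTuple):
    tag: str    # tag identifier
    date: int   # hour at which the activity spike is centered (assumed unit)
    level: int  # relevance of the event


def sample_level(rng: random.Random, min_level: int, max_level: int,
                 exponent: float) -> int:
    """Draw a level with P(level) proportional to level ** -exponent."""
    levels = list(range(min_level, max_level + 1))
    weights = [lvl ** -exponent for lvl in levels]
    return rng.choices(levels, weights=weights, k=1)[0]


def generate_flashmob_tags(tags: List[str], count: int, max_date: int,
                           min_level: int, max_level: int,
                           exponent: float, seed: int = 0) -> List[FlashmobTag]:
    """Pick `count` tags with replacement, each with a date and a level."""
    rng = random.Random(seed)
    result = []
    for _ in range(count):
        tag = rng.choice(tags)          # the same tag may be picked again
        date = rng.randrange(max_date)  # uniformly distributed date
        level = sample_level(rng, min_level, max_level, exponent)
        result.append(FlashmobTag(tag, date, level))
    return result


if __name__ == "__main__":
    for mob in generate_flashmob_tags(["Barack_Obama", "Steven_Seagal"],
                                      count=3, max_date=1000,
                                      min_level=1, max_level=10, exponent=2.0):
        print(mob)
```

Because low levels receive the largest weights, most generated flashmob tags are low-relevance events, and only a few reach the upper end of the range.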
Given a user, we select a set of flashmob tags from the FlashmobTagDictionary. This set represents the flashmob events the user might be involved in. The selection is guided by the actual interests of the user, as well as by a random factor that is affected by the level of the flashmob tag. Once the flashmob tags are selected, we generate the posts for these tags. The number of posts, or activity level, of a user regarding a flashmob tag depends on the level of the flashmob tag: the larger the level, the larger the number of posts. Each post is assigned a date, which is drawn from a distribution centered at the date of the flashmob tag. Currently, the flashmob tag span is hardcoded to 72 hours. The distribution function is taken from the paper "Meme-tracking and the dynamics of the news cycle". Basically, it is a combination of a logarithmic and an exponential function.
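To illustrate how post dates cluster around a flashmob tag, here is a rough sketch. It does not reproduce the function from the paper: a truncated two-sided exponential is used purely as a stand-in. The name sample_post_hour and the decay_hours value are hypothetical, and the 72-hour span is assumed to be the total window, that is, 36 hours on each side of the flashmob date.

```python
import random

FLASHMOB_SPAN_HOURS = 72  # hardcoded span mentioned in the text


def sample_post_hour(center_hour: float, rng: random.Random,
                     span_hours: float = FLASHMOB_SPAN_HOURS,
                     decay_hours: float = 12.0) -> float:
    """Return a post date (in hours) near center_hour.

    Stand-in distribution: an exponential offset, truncated to half the
    span, placed before or after the center with equal probability.
    """
    half_span = span_hours / 2.0
    while True:
        offset = rng.expovariate(1.0 / decay_hours)  # mean = decay_hours
        if offset <= half_span:                      # reject draws outside the span
            break
    sign = rng.choice((-1.0, 1.0))
    return center_hour + sign * offset


if __name__ == "__main__":
    rng = random.Random(42)
    print([round(sample_post_hour(500, rng), 1) for _ in range(5)])
```

Any distribution that peaks at the flashmob date and falls off towards the edges of the span could be substituted for the stand-in used here.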
The posts for groups are generated in a similar way.
In the param.ini file, there are several new parameters to configure the flashmob post generation.
- probInterestFlashmobTag: Specifies the probability that a user posts about a flashmob tag the user is interested in.
- flashmobTagMinLevel: Specifies the minimum level a flashmob tag can have. (This is an integer greater than 0.)
- flashmobTagMaxLevel: Specifies the maximum level a flashmob tag can have. (This is an integer greater than 0.)
- flashmobTagDistExp: Specifies the exponent of the power law distribution of the flashmob tag levels.
- probRandomPerLevel: Specifies the probability that a user picks a random flashmob tag to post about. This is multiplied by the flashmob tag level to calculate the final probability. Therefore, the larger the level, the larger the probability.
- postPerLevelScaleFactor: Specifies the maximum number of posts a user will write about a flashmob tag, per level. This is multiplied by the flashmob tag level to compute the final maximum number of posts.
- flashmobTagsPerMonth: Specifies the density of flashmob tags on a per-month basis.
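The way the probability and scale parameters above interact with the level can be sketched as follows. This is an assumption-laden illustration, not the generator's code: select_flashmob_tags and max_posts are hypothetical names, flashmob tags are plain (tag, date, level) tuples, and the rule that interest tags use probInterestFlashmobTag while all other tags use probRandomPerLevel times the level is one plausible reading of the descriptions.

```python
import random
from typing import List, Set, Tuple

FlashmobTag = Tuple[str, int, int]  # (tag, date, level)


def select_flashmob_tags(flashmob_tags: List[FlashmobTag],
                         interests: Set[str],
                         prob_interest: float,
                         prob_random_per_level: float,
                         rng: random.Random) -> List[FlashmobTag]:
    """Choose the flashmob tags a user will post about."""
    selected = []
    for tag, date, level in flashmob_tags:
        if tag in interests:
            # Tag matches a user interest: probInterestFlashmobTag applies.
            prob = prob_interest
        else:
            # Random tag: the probability grows with the level (capped at 1).
            prob = min(1.0, prob_random_per_level * level)
        if rng.random() < prob:
            selected.append((tag, date, level))
    return selected


def max_posts(level: int, post_per_level_scale_factor: int) -> int:
    """Upper bound on a user's posts for a flashmob tag of this level."""
    return post_per_level_scale_factor * level


if __name__ == "__main__":
    mobs = [("Barack_Obama", 100, 2), ("Barack_Obama", 500, 10),
            ("Steven_Seagal", 250, 5)]
    rng = random.Random(0)
    for tag, date, level in select_flashmob_tags(mobs, {"Barack_Obama"},
                                                 0.8, 0.01, rng):
        print(tag, date, level, "max posts:", max_posts(level, 3))
```

With this reading, a level 10 flashmob tag is ten times more likely than a level 1 tag to be picked up by a user who has no interest in it, and it also allows ten times as many posts.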
Example of the post distribution for the tag with id 6, with and without flashmob post generation. In this case, with the default parameters, three flashmob tags are generated for tag 6: one around hour 4000, another around hour 9000, and a high-level one around hour 17000.
WITHOUT FLASHMOB POSTS
WITH FLASHMOB POSTS