Validating Models for Online Social Networks

Abstract: Several network models have been proposed to explain the link structure observed in online social networks. Such models are usu- ally justified by showing that they can replicate certain observed features of real networks, such as a power law degree distribution, a high degree of clustering, and the small world property. This talk will addresses the problem of choosing the model that best fits a given real world network. We implement a model selection method based on un-supervised learning. An alternating decision tree is trained using synthetic graphs generated according to each of the models under consideration. Graphs are represented as feature vec- tors incorporating the frequency counts of all connected subgraphs on 3 and 4 vertices as well as features describing the degree distribu- tion and small world property. We show that the subgraph counts alone are suffcient in separating the training data. For the test data, we take four Facebook graphs from various American Univer- sities. Our results show that models incorporating some form of the preferential attachment mechanism tend to perform the best.