Kronecker Graphs: An Approach to Modeling Networks
Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, Zoubin Ghahramani.
Year: 2010, Volume: 11, Issue: 33, Pages: 985−1042
Abstract
How can we generate realistic networks? In addition, how can we do so with
a mathematically tractable model that allows for rigorous analysis of
network properties? Real networks exhibit a long list of surprising
properties: Heavy tails for the in- and out-degree distribution, heavy
tails for the eigenvalues and eigenvectors, small diameters, and
densification and shrinking diameters over time.
Current network models and generators either fail to match several of the above
properties, are complicated to analyze mathematically, or both. Here
we propose a generative model for networks that is both
mathematically tractable and can generate networks that have all the above
mentioned structural properties. Our main idea here is to use a
non-standard matrix operation, the Kronecker product, to generate
graphs which we refer to as "Kronecker graphs".
First, we show that Kronecker graphs naturally obey common network
properties. In fact, we rigorously prove that they do so. We also
provide empirical evidence showing that Kronecker graphs can effectively
model the structure of real networks.
We then present KRONFIT, a fast and scalable algorithm for fitting the
Kronecker graph generation model to large real networks. A naive approach
to fitting would take super-exponential time. In contrast, KRONFIT takes
linear time, by exploiting the structure of Kronecker matrix
multiplication and by using statistical simulation techniques.
Experiments on a wide range of large real and synthetic networks show that KRONFIT finds accurate parameters that very well mimic the properties of target
networks. In fact, using just four parameters we can accurately
model several aspects of global network structure.
Once fitted, the model parameters can be used to gain insights
about the network structure, and the resulting synthetic graphs can be
used for null-models, anonymization, extrapolations, and graph
summarization.