Monika Bhattacharjee, Moulinath Banerjee, George Michailidis.
Year: 2020, Volume: 21, Issue: 107, Pages: 1−59
We consider the problem of estimating the location of a single change point in a network generated by a dynamic stochastic block model mechanism. This model produces community structure in the network that exhibits change at a single time epoch. We propose two methods of estimating the change point, together with the model parameters, before and after its occurrence. The first employs a least-squares criterion function and takes into consideration the full structure of the stochastic block model and is evaluated at each point in time. Hence, as an intermediate step, it requires estimating the community structure based on a clustering algorithm at every time point. The second method comprises the following two steps: in the first one, a least-squares function is used and evaluated at each time point, but ignoring the community structure and only considering a random graph generating mechanism exhibiting a change point. Once the change point is identified, in the second step, all network data before and after it are used together with a clustering algorithm to obtain the corresponding community structures and subsequently estimate the generating stochastic block model parameters. The first method, since it requires knowledge of the community structure and hence clustering at every point in time, is significantly more computationally expensive than the second one. On the other hand, it requires a significantly less stringent identifiability condition for consistent estimation of the change point and the model parameters than the second method; however, it also requires a condition on the misclassification rate of misallocating network nodes to their respective communities that may fail to hold in many realistic settings. Despite the apparent stringency of the identifiability condition for the second method, we show that networks generated by a stochastic block mechanism exhibiting a change in their structure can easily satisfy this condition under a multitude of scenarios, including merging/splitting communities, nodes joining another community, etc. Further, for both methods under their respective identifiability and certain additional regularity conditions, we establish rates of convergence and derive the asymptotic distributions of the change point estimators. The results are illustrated on synthetic data. In summary, this work provides an in-depth investigation of the novel problem of change point analysis for networks generated by stochastic block models, identifies key conditions for the consistent estimation of the change point, and proposes a computationally fast algorithm that solves the problem in many settings that occur in applications. Finally, it discusses challenges posed by employing clustering algorithms in this problem, that require additional investigation for their full resolution.