jump to main area
:::
A- A A+

Seminars

Community Recovery from Beyond-Pairwise Relational and Statistical Information

  • 2017-10-30 (Mon.), 10:30 AM
  • Recreation Hall, 2F, Institute of Statistical Science
  • Prof. I-Hsiang Wang
  • Department of Electrical Engineering, National Taiwan University

Abstract

Community detection is an important task across many disciplines, including data analysis, social networks, and computer science. With a data set consisting of a collection of entities belonging to different communities, the goal of community detection algorithms is to uncover the underlying community structures either by passively observing how the entities interact with one another, or actively taking measurements on the entities. One of the most widely used approach is graph clustering, which leverages the pairwise relational information to uncover the communities. The entities are modeled as vertices, and the pairwise relational information is modeled as (weighted) edges in the graph. However, in many scenarios, there is abundant beyond-pairwise relational and statistical information which can be further harnessed to obtain better community detection results. ??? In this talk, we explore two community detection methods that leverage beyond-pairwise relational information and statistical information respectively. For harnessing beyond-pairwise relational information, we cast it into a hypergraph clustering problem, where each hyperedge represents the relational information among the entities represented by the vertices in that hyperedge. A polynomial running time recovery algorithm is proposed to achieve the fundamental statistical limit of community detection on a hypergraph stochastic block model. For leveraging beyond-pairwise statistical information, we cast it into a data extraction problem where the recovery algorithm can actively take measurements on a group of entities, each of which records the histogram of community labels in that group. An explicit construction of the measured groups along with a linear running time recovery algorithm is proposed to achieve the fundamental limit on the total number of measurements required to recover all the hidden community labels. The fundamental limit is then extended to the setting of partial recovery with noisy measurements. ??? Parts of this talk are joint work with Shao-Lun Huang and my students Chung-Yi Lin, Wei-Ning Chen, I Chien, and Kuan-Yun Lee.

Update:
scroll to top