Graphs are ubiquitous data structures in numerous domains, such as social science (social networks), natural science (physical systems, and protein-protein interaction networks) and knowledge graphs. As new generalizations of traditional deep neural networks to graph structured data, Graph Neural Networks (or GNNs) have demonstrated the power in graph representation learning and have permeated numerous areas of science and technology. However, GNNs also inherited drawbacks of traditional deep neural networks, i.e., lacking interpretability. Moreover, the complexity of graph data introduces the scalability as a new limitation for GNNs because graph structured data are not independent. These drawbacks have raised tremendous concerns to adopt GNNs in many critical applications pertaining to fairness, privacy, and safety. Thus, this project aims to tackle the major drawbacks of GNNs and greatly enlarge their usability in critical applications. To achieve the research goal, this project systematically investigates advanced principles for scalable GNNs and new mechanisms to interpret GNNs. The proposed research extends the state-of-the-art GNNs to a new frontier, investigates original problems that entreat innovative solutions and paves the way for a new research endeavor effectively tame graph mining. As many real-world domains require scalable and interpretable graph mining techniques, the project has potential to benefit many real-world applications from various disciplines such as Computer Science, Social Science, Healthcare and Bioinformatics.
This project proposes novel principles and mechanisms for scalable and interpretable graph neural networks to facilitate the adoption of GNNs on critical domains, investigates associated fundamental research issues and develops effective algorithms. The project offers the first comprehensive investigation on these directions, and the designed novel methodologies and tasks will deepen our understanding on the inner working mechanisms of GNNs and contribute to real-world applications. The success of this project will be (1) New scalable and interpretable GNNs with state-of-the-art graph representation learning and predictive performance; (2) Theoretical analysis such as convergence and complexity; and (3) Open-source implementations of all key algorithms and frameworks. Disparate means are planned to disseminate the project and its findings, such as web enabled data and software repositories, books, journal and conference publications, special purpose workshops or tutorials, and industrial collaborations. The project can be effectively integrated to undergraduate and graduate courses as well as in student research projects.