geometric deep learning

Upload: petteri-teikari-phd

Post on 23-Jan-2018



Page 2: Geometric Deep Learning

Structure of the presentation

What: A high-level overview of the emerging field of geometric deep learning (and graph deep learning).

How: A presentation aimed at startup-style organizations where everyone does a bit of everything and needs to understand a bit of everything. If you operate in this space, the CEO cannot be the ‘idea guy’ who knows nothing about graphs and geometric deep learning.

EFFECTUATION – THE BEST THEORY OF ENTREPRENEURSHIP YOU ACTUALLY FOLLOW, WHETHER YOU’VE HEARD OF IT OR NOT

by Ricardo dos Santos

Page 3: Geometric Deep Learning

Brief Review of Geometric Deep Learning

Page 4: Geometric Deep Learning

Geometric Deep Learning #1

Bronstein et al. (July 2017): “Geometric deep learning (http://geometricdeeplearning.com/) is an umbrella term for emerging techniques attempting to generalize (structured) deep neural models to non-Euclidean domains, such as graphs and manifolds. The purpose of this article is to overview different examples of geometric deep-learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field”

SCNN (2013)

GCNN/ChebNet (2016)

GCN (2016)

GNN (2009)

Geodesic CNN (2015)

Anisotropic CNN (2016)

MoNet (2016)

Localized SCNN (2015)

Page 5: Geometric Deep Learning

Geometric Deep Learning #2

Bronstein et al. (July 2017): “The non-Euclidean nature of data implies that there are no such familiar properties as global parameterization, common system of coordinates, vector space structure, or shift-invariance. Consequently, basic operations like convolution that are taken for granted in the Euclidean case are not even well defined on non-Euclidean domains.”

“First attempts to generalize neural networks to graphs we are aware of are due to Gori et al. (2005) who proposed a scheme combining recurrent neural networks and random walk models. This approach went almost unnoticed, re-emerging in a modern form in Sukhbaatar et al. (2016) and Li et al. (2015) due to the renewed recent interest in deep learning.”

“In a parallel effort in the computer vision and graphics community, Masci et al. (2015) showed the first CNN model on meshed surfaces, resorting to a spatial definition of the convolution operation based on local intrinsic patches. Among other applications, such models were shown to achieve state-of-the-art performance in finding correspondence between deformable 3D shapes. Followup works proposed different constructions of intrinsic patches on point clouds (Boscaini et al. 2016a,b) and general graphs (Monti et al. 2016).”

In calculus, the notion of derivative describes how the value of a function changes with an infinitesimal change of its argument. One of the big differences distinguishing classical calculus from differential geometry is a lack of vector space structure on the manifold, prohibiting us from naïvely using expressions like f(x+dx). The conceptual leap that is required to generalize such notions to manifolds is the need to work locally in the tangent space.

Physically, a tangent vector field can be thought of as a flow of material on a manifold. The divergence measures the net flow of a field at a point, allowing to distinguish between field ‘sources’ and ‘sinks’. Finally, the Laplacian (or Laplace-Beltrami operator in differential-geometric jargon) measures how the value of a function at a point differs from its average over an infinitesimal neighborhood, and can be written as the divergence of the gradient.

“A centerpiece of classical Euclidean signal processing is the property of the Fourier transform diagonalizing the convolution operator, colloquially referred to as the Convolution Theorem. This property allows to express the convolution f⋆g of two functions in the spectral domain as the element-wise product of their Fourier transforms. Unfortunately, in the non-Euclidean case we cannot even define the operation x-x’ on the manifold or graph, so the notion of convolution does not directly extend to this case.”
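Since x-x’ is undefined, the Convolution Theorem itself becomes the definition: transform with the Laplacian eigenbasis (the graph Fourier basis), multiply element-wise by a spectral response, and transform back. A minimal numpy sketch; the cycle graph, heat-kernel response and random signal are illustrative choices, not taken from the paper:

```python
import numpy as np

def graph_laplacian(A):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(A.sum(axis=1)) - A

def spectral_filter(A, x, h):
    """Filter a graph signal x via the convolution-theorem analogue:
    forward GFT (Laplacian eigenbasis), element-wise multiplication by
    the spectral response h(lambda), inverse GFT."""
    lam, U = np.linalg.eigh(graph_laplacian(A))
    return U @ (h(lam) * (U.T @ x))

# Toy example: 6-node cycle graph, low-pass (heat-kernel) response.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

x = np.random.default_rng(0).normal(size=n)
y = spectral_filter(A, x, lambda lam: np.exp(-lam))
```

Because a constant signal lies in the nullspace of the Laplacian (graph frequency zero), any response with h(0) = 1 passes it through unchanged.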

Page 6: Geometric Deep Learning

Geometric Deep Learning #3

Bronstein et al. (July 2017): “We expect the following years to bring exciting new approaches and results, and conclude our review with a few observations of current key difficulties and potential directions of future research.”

Generalization: Generalizing deep learning models to geometric data requires not only finding non-Euclidean counterparts of basic building blocks (such as convolutional and pooling layers), but also generalization across different domains. Generalization capability is a key requirement in many applications, including computer graphics, where a model is learned on a training set of non-Euclidean domains (3D shapes) and then applied to previously unseen ones.

Time-varying domains: An interesting extension of geometric deep learning problems discussed in this review is coping with signals defined over a dynamically changing structure. In this case, we cannot assume a fixed domain and must track how these changes affect signals. This could prove useful to tackle applications such as abnormal activity detection in social or financial networks. In the domain of computer graphics and vision, potential applications deal with dynamic shapes (e.g. 3D video captured by a range sensor).

Computation: The final consideration is a computational one. All existing deep learning software frameworks are primarily optimized for Euclidean data. One of the main reasons for the computational efficiency of deep learning architectures (and one of the factors that contributed to their renaissance) is the assumption of regularly structured data on 1D or 2D grid, allowing to take advantage of modern GPU hardware. Geometric data, on the other hand, in most cases do not have a grid structure, requiring different ways to achieve efficient computations. It seems that computational paradigms developed for large-scale graph processing are more adequate frameworks for such applications.

Page 7: Geometric Deep Learning

Primer on Graphs

Taylor and Wrana (2012). doi: 10.1002/pmic.201100594

Page 8: Geometric Deep Learning

Graph theory especially useful for networks analysis

Barabási and Albert (1999): https://doi.org/10.1126/science.286.5439.509 (cited by 29,071 articles)

Watts and Strogatz (1998): https://doi.org/10.1038/30918 (cited by 33,772)

Random rewiring procedure for interpolating between a regular ring lattice and a random network, without altering the number of vertices or edges in the graph.
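The rewiring procedure described above can be sketched in a few lines of numpy. The parameters n, k, p and the duplicate-avoidance rule are illustrative choices; the invariant is that vertex and edge counts never change:

```python
import numpy as np

def watts_strogatz(n, k, p, seed=0):
    """Ring lattice on n vertices (k neighbours per side); each edge is
    rewired with probability p. Interpolates between a regular lattice
    (p = 0) and a random network (p = 1) without altering the number of
    vertices or edges."""
    rng = np.random.default_rng(seed)
    edges = []
    for i in range(n):
        for j in range(1, k + 1):
            edges.append(tuple(sorted((i, (i + j) % n))))
    for idx, (u, v) in enumerate(edges):
        if rng.random() < p:
            present = set(edges)
            # new endpoint must avoid self-loops and duplicate edges
            choices = [w for w in range(n)
                       if w != u and tuple(sorted((u, w))) not in present]
            if choices:
                edges[idx] = tuple(sorted((u, int(rng.choice(choices)))))
    return edges

regular = watts_strogatz(20, 2, 0.0)
rewired = watts_strogatz(20, 2, 0.5, seed=1)
```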

http://www.bbc.co.uk/newsbeat/article/35500398/how-facebook-updated-six-degrees-of-separation-its-now-357

https://research.fb.com/three-and-a-half-degrees-of-separation/

http://slideplayer.com/slide/9267536/

Page 9: Geometric Deep Learning

Graph theory: common metrics and definitions

Graph-theoretic node importance mining on network topology - Xue et al. (2017)

The graph-theoretic node importance mining methods based on network topologies comprise two main categories: node relevance and shortest path. The method of node relevance is measured by degree analysis. The methods of shortest path that aim at finding optimal spreading paths are measured by several node importance analyses, e.g., betweenness, closeness centrality, eigenvector centrality, Bonacich centrality and alter-based centrality.

Betweenness is used particularly for measurements of power while closeness centrality and eigenvector centrality are used particularly for measurements of centrality. Bonacich centrality is an extension of eigenvector centrality which measures node importance on both centrality and power. The other mining methods for node importance based on network topologies included in this review are via processes such as node deleting, node contraction, and data mining and machine learning embedded techniques. For heterogeneous network structures, fusion methods integrate all the previously mentioned measurements.
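Three of the node-importance measures listed above (degree, closeness centrality, eigenvector centrality) can be computed directly from an adjacency matrix; the star graph used to exercise them below is an illustrative choice:

```python
import numpy as np

def centralities(A):
    """Degree, closeness and eigenvector centrality for an undirected,
    connected graph given by adjacency matrix A."""
    n = len(A)
    degree = A.sum(axis=1)
    # All-pairs shortest path lengths (Floyd-Warshall) for closeness.
    D = np.where(A > 0, 1.0, np.inf)
    np.fill_diagonal(D, 0.0)
    for k in range(n):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    closeness = (n - 1) / D.sum(axis=1)
    # Eigenvector centrality by power iteration; the +I shift avoids
    # oscillation on bipartite graphs without changing the eigenvector.
    M = A + np.eye(n)
    v = np.ones(n) / np.sqrt(n)
    for _ in range(200):
        v = M @ v
        v /= np.linalg.norm(v)
    return degree, closeness, v

# Star graph on 5 nodes: the hub should top every ranking.
A = np.zeros((5, 5))
A[0, 1:] = A[1:, 0] = 1.0
deg, clo, eig = centralities(A)
```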

Google’s Knowledge Graph: one step closer to the semantic web? By Andrew Isidoro, 28 February 2013

Knowledge Graph, a database of over 570m of the most searched-for people, places and things (entities), including around 18bn cross-references.

The knowledge graph as the default data model for learning on heterogeneous knowledge. Wilcke, Xander; Bloem, Peter; de Boer, Victor. Data Science, vol. Preprint, no. Preprint, pp. 1-19, 2017. http://doi.org/10.3233/DS-170007

The FuhSen Architecture. High-level architecture comprising (a) Mediator and wrappers architecture to build the (b) knowledge graph on demand. The answer of a keyword query corresponds to an RDF subject-molecule that integrates RDF molecules collected from the wrappers. (c) The components to enrich the results KG.

FuhSen: A Federated Hybrid Search Engine for building a knowledge graph on-demand. July 2016. https://doi.org/10.1007/978-3-319-48472-3_47 + https://doi.org/10.1109/ICSC.2017.85 (researchgate.net)

Page 10: Geometric Deep Learning

Ranking in time-varying complex networks

Ranking in evolving complex networks. Hao Liao, Manuel Sebastian Mariani, Matúš Medo, Yi-Cheng Zhang, Ming-Yang Zhou. Physics Reports, Volume 689, 19 May 2017, Pages 1-54. https://doi.org/10.1016/j.physrep.2017.05.001

Top: The often-studied Zachary’s karate club network has 34 nodes and 78 links (here visualized with the Gephi software). Bottom: Ranking of the nodes in the Zachary karate club network by the centrality metrics described in this section. Node labels on the horizontal axis correspond to the node labels in the top panel.

For the APS citation data from the period 1893–2015 (560,000 papers in total), we compute the ranking of papers according to various metrics— citation count c, PageRank centrality p (with the teleportation parameter α = 0.5), and rescaled PageRank R(p). The figure shows the median ranking position of the top 1% of papers from each year. The three curves show three distinct patterns. For c, the median rank is stable until approximately 1995; then it starts to grow because the best young papers have not yet reached sufficiently high citation counts. For p, the median rank grows during the whole displayed time period because PageRank applied on an acyclic time-ordered citation network favors old papers. By contrast, the curve is approximately flat for R(p) during the whole period which confirms that the metric is not biased by paper age and gives equal chances to all papers.
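The PageRank computation with teleportation parameter α = 0.5 used above can be sketched by power iteration; the three-paper citation graph is a toy illustration, not the APS data:

```python
import numpy as np

def pagerank(A, alpha=0.5, iters=200):
    """PageRank by power iteration. alpha is the teleportation
    parameter (probability of following an out-link), matching the
    alpha = 0.5 used for the APS citation network above."""
    n = len(A)
    out = A.sum(axis=1)
    P = A / np.maximum(out[:, None], 1)   # row-stochastic transitions
    P[out == 0] = 1.0 / n                 # dangling nodes teleport
    p = np.ones(n) / n
    for _ in range(iters):
        p = alpha * (P.T @ p) + (1 - alpha) / n
    return p

# Toy citation network: papers 1 and 2 both cite paper 0.
A = np.array([[0, 0, 0],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
p = pagerank(A)
```

On a time-ordered citation network like this one, links only point backwards in time, which is exactly why raw PageRank favors old papers and motivates the rescaled variant R(p).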

An illustration of the difference between the first-order Markovian (time-aggregated) and second-order network representation of the same data. Panels A–B represent the destination cities (the right-most column) of flows of passengers from Chicago to other cities, given the previous location (the left-most column). When including memory effects (panel B), the fraction of passengers coming back to the original destination is large, in agreement with our intuition. A similar effect is found for the network of academic journals

Page 11: Geometric Deep Learning

Information diffusion: intro

Many graphs can be used to model or predict how information flows through them.
● How influential are you with your Instagram posts, tweets, LinkedIn posts, etc.?
● How does a tweet affect the stock market, or, more generally, how can causality be inferred from a graph?
● In practice, you also see heat diffusion methods applied to information diffusion
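The heat-diffusion analogy can be made concrete: propagate an initial signal with the heat kernel exp(-tL) of the graph Laplacian. A toy sketch; the path graph, source node and diffusion time are illustrative:

```python
import numpy as np

def heat_diffuse(A, x0, t):
    """Diffuse an initial signal x0 (e.g. where a post originated)
    over the graph for time t using the heat kernel exp(-t L)."""
    L = np.diag(A.sum(axis=1)) - A
    lam, U = np.linalg.eigh(L)
    return U @ (np.exp(-t * lam) * (U.T @ x0))

# Path graph 0-1-2-3; all the "information" starts at node 0.
A = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
x0 = np.array([1.0, 0.0, 0.0, 0.0])
xt = heat_diffuse(A, x0, t=1.0)
```

Diffusion conserves the total mass of the signal (the all-ones vector is in the Laplacian nullspace) while spreading it towards nearby nodes first.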

Random walks and diffusion on networks. Naoki Masuda, Mason A. Porter, Renaud Lambiotte. Physics Reports (available online 31 August 2017). https://doi.org/10.1016/j.physrep.2017.07.007

Fig. 12. The weary random walker retires from the network and heads off into the distant sunset. [This picture was drawn by Yulian Ng.].

Inferring networks of diffusion and influence. Manuel Gomez Rodriguez, Jure Leskovec, Andreas Krause. KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. https://doi.org/10.1145/1835804.1835933

There are several interesting directions for future work. Here we only used time difference to infer edges and thus it would be interesting to utilize more informative features (e.g., textual content of postings etc.) to more accurately estimate the influence probabilities. Moreover, our work considers static propagation networks, however real influence networks are dynamic and thus it would be interesting to relax this assumption. Last, there are many other domains where our methodology could be useful: inferring interaction networks in systems biology (protein-protein and gene interaction networks), neuroscience (inferring physical connections between neurons) and epidemiology. We believe that our results provide a promising step towards understanding complex processes on networks based on partial observations.

Page 12: Geometric Deep Learning

Information diffusion: Social Networks #1

Nonlinear Dynamics of Information Diffusion in Social Networks. ACM Transactions on the Web (TWEB), Volume 11, Issue 2, May 2017, Article No. 11. https://doi.org/10.1145/3057741

Online Social Networks and information diffusion: The role of ego networks. Valerio Arnaboldi, Marco Conti, Andrea Passarella, Robin I.M. Dunbar. Online Social Networks and Media 1 (2017) 44-55. http://dx.doi.org/10.1016/j.osnem.2017.04.001

Data Driven Modeling of Continuous Time Information Diffusion in Social Networks. Liang Liu, Bin Chen, Bo Qu, Lingnan He, Xiaogang Qiu. Data Science in Cyberspace (DSC), 2017 IEEE. https://doi.org/10.1109/DSC.2017.103

Online Bayesian Inference of Diffusion Networks. Shohreh Shaghaghian, Mark Coates. IEEE Transactions on Signal and Information Processing over Networks, Volume 3, Issue 3, Sept. 2017. https://doi.org/10.1109/TSIPN.2017.2731160

Modeling the reemergence of information diffusion in social network. Dingda Yang, Xiangwen Liao, Huawei Shen, Xueqi Cheng, Guolong Chen. Physica A: Statistical Mechanics and its Applications (available online 1 September 2017). http://dx.doi.org/10.1016/j.physa.2017.08.115

Information Diffusion in Online Social Networks: A Survey. Adrien Guille, Hakim Hacid, Cécile Favre, Djamel A. Zighed. ACM SIGMOD Record, Volume 42, Issue 2, May 2013, Pages 17-28. https://doi.org/10.1145/2503792.2503797

Page 13: Geometric Deep Learning

Information diffusion: Social Networks #2

Literature Survey on Interplay of Topics, Information Diffusion and Connections on Social Networks. Kuntal Dey, Saroj Kaushik, L. Venkata Subramaniam (submitted 3 Jun 2017). https://arxiv.org/abs/1706.00921

Page 14: Geometric Deep Learning

Information diffusion: scientific citation networks #1

Integration of Scholarly Communication Metadata Using Knowledge Graphs. Afshin Sadeghi, Christoph Lange, Maria-Esther Vidal, Sören Auer. International Conference on Theory and Practice of Digital Libraries, TPDL 2017: Research and Advanced Technology for Digital Libraries, pp 328-341. https://doi.org/10.1007/978-3-319-67008-9_26

Particularly, we demonstrate the benefits of exploiting semantic web technology to reconcile data about authors, papers, and conferences.

A Recommendation System Based on Hierarchical Clustering of an Article-Level Citation Network. Jevin D. West, Ian Wesley-Smith, Carl T. Bergstrom. IEEE Transactions on Big Data, Volume 2, Issue 2, June 2016. https://doi.org/10.1109/TBDATA.2016.2541167 http://babel.eigenfactor.org/

The scholarly literature is expanding at a rate that necessitates intelligent algorithms for search and navigation. For the most part, the problem of delivering scholarly articles has been solved. If one knows the title of an article, locating it requires little effort and, paywalls permitting, acquiring a digital copy has become trivial. However, the navigational aspect of scientific search - finding relevant, influential articles that one does not know exist - is in its early development.

Big Scholarly Data: A Survey. Feng Xia, Wei Wang, Teshome Megersa Bekele, Huan Liu. IEEE Transactions on Big Data, Volume 3, Issue 1, March 2017. https://doi.org/10.1109/TBDATA.2016.2641460

ASNA - Academic Social Network Analysis

Page 15: Geometric Deep Learning

Information diffusion: scientific citation networks #2

Implicit Multi-Feature Learning for Dynamic Time Series Prediction of the Impact of Institutions. Xiaomei Bai, Fuli Zhang, Jie Hou, Feng Xia, Amr Tolba, Elsayed Elashkarr. IEEE Access, Volume 5. https://doi.org/10.1109/ACCESS.2017.2739179

Predicting the impact of research institutions is an important tool for decision makers, such as resource allocation for funding bodies. Despite significant effort in adopting quantitative indicators to measure the impact of research institutions, little is known about how the impact of institutions evolves over time.

The Role of Positive and Negative Citations in Scientific Evaluation. Xiaomei Bai, Ivan Lee, Zhaolong Ning, Amr Tolba, Feng Xia. IEEE Access, Volume PP, Issue 99. https://doi.org/10.1109/ACCESS.2017.2740226

Recommendation for Cross-Disciplinary Collaboration Based on Potential Research Field DiscoveryWei Liang ; Xiaokang Zhou ; Suzhen Huang ; Chunhua Hu ; Qun JinAdvanced Cloud and Big Data (CBD), 2017https://doi.org/10.1109/CBD.2017.67

The cross-disciplinary information is hidden in tons of publications, and the relationships between different fields are complicated, which makes it challenging to recommend cross-disciplinary collaboration for a specific researcher.

Petteri: Whether to recommend “outliers” i.e. unexpected combinations of fields, or something outside your field that would be useful to you. Or just the typical landmark papers of your field? Depends on your needs for sure.

https://iris.ai/

http://www.bibblio.org/learning-and-knowledge

In the future, we will further explore the relationships between the impact of institutions and the features driving changes in that impact, to enhance prediction performance. In addition, this work is conducted only on literature from the eight top conferences in the Microsoft Academic Graph (MAG) dataset; examining other conferences for the same observed patterns could widen the significance of our findings.

Page 16: Geometric Deep Learning

Information diffusion: Finance, quant trading, decision making

Information Diffusion, Cluster formation and Entropy-based Network Dynamics in Equity and Commodity Markets. Stelios Bekiros, Duc Khuong Nguyen, Leonidas Sandoval Junior, Gazi Salah Uddin. European Journal of Operational Research (2016). http://dx.doi.org/10.1016/j.ejor.2016.06.052

https://www.prowler.io/

https://www.causalitylink.com/

https://www.forbes.com/sites/antoinegara/2017/02/28/kensho-sp-500-million-valuation-jpmorgan-morgan-stanley/#6fe4bb0b5cbf

Technology that brings transparency to complex systemshttps://www.kensho.com/

Our platform uses artificial intelligence to discover, extract and index events, variables and relationships about markets, sectors, industries and equities. It absorbs news articles, analysts’ point-of view or equity-related materials as they are published. Save time and get ahead by letting AI do the repetitive reading for you. Focus on new knowledge.

Analysis of Investment Relationships Between Companies and Organizations Based on Knowledge Graph. Xiaobo Hu, Xinhuai Tang, Feilong Tang. In: Barolli L., Enokido T. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing, IMIS 2017. Advances in Intelligent Systems and Computing, vol 612. https://doi.org/10.1007/978-3-319-61542-4_20

A design for a common-sense knowledge-enhanced decision-support system: Integration of high-frequency market data and real-time news. Kun Chen, Jian Yin, Sulin Pang. Expert Systems (June 2017). doi: 10.1111/exsy.12209

Compared with previous work, our model is the first to incorporate broad common-sense knowledge into a decision support system, thereby improving the news analysis process through the application of a graphic random-walk framework. Prototype and experiments based on Hong Kong stock market data have demonstrated that common-sense knowledge is an important factor in building financial decision models that incorporate news information.

Dynamics of financial markets and transaction costs: A graph-based study. Felipe Lillo, Rodrigo Valdés. Research in International Business and Finance, Volume 38, September 2016, Pages 455-465

Using financialization as a conceptual framework to understand the current trading patterns of financial markets, this work employs a market graph model for studying the stock indexes of geographically separated financial markets. By using an edge creation condition based on a transaction cost threshold, the resulting market graph features a strong connectivity, some traces of a power law in the degree distribution and an intensive presence of cliques.

Ponzi scheme diffusion in complex networks. Anding Zhu, Peihua Fu, Qinghe Zhang, Zhenyue Chen. Physica A: Statistical Mechanics and its Applications, Volume 479, 1 August 2017, Pages 128-136. https://doi.org/10.1016/j.physa.2017.03.015

Page 17: Geometric Deep Learning

“Intelligent knowledge graphs” with “actionable insights”

Model-Driven Analytics: Connecting Data, Domain Knowledge, and Learning. Thomas Hartmann, Assaad Moawad, Francois Fouquet, Gregory Nain, Jacques Klein, Yves Le Traon, Jean-Marc Jezequel (submitted 5 Apr 2017). https://arxiv.org/abs/1704.01320

Gaining profound insights from collected data of today's application domains like IoT, cyber-physical systems, health care, or the financial sector is business-critical and can create the next multi-billion dollar market. However, analyzing these data and turning them into valuable insights is a huge challenge. This is often not due to the large volume of data alone but to an incredibly high domain complexity, which makes it necessary to combine various extrapolation and prediction methods to understand the collected data. Model-driven analytics is a refinement process of raw data driven by a model reflecting deep domain understanding, connecting data, domain knowledge, and learning.

Page 18: Geometric Deep Learning

Graph theory example: applications beyond typical networks

Construction (BIM): “Graph theory based representation of building information models (BIM) for access control applications”. Automation in Construction, Volume 68, August 2016, Pages 44-51. https://doi.org/10.1016/j.autcon.2016.04.001

IFC 4 model, IFC-SPF format.

Medical Imaging (OCT): “Improving Segmentation of 3D Retina Layers Based on Graph Theory Approach for Low Quality OCT Images”. Metrology and Measurement Systems, Volume 23, Issue 2 (Jun 2016). https://doi.org/10.1515/mms-2016-0016

Dijkstra shortest path algorithm

Risk Assessment: “A New Risk Assessment Framework Using Graph Theory for Complex ICT Systems”. MIST '16: Proceedings of the 8th ACM CCS International Workshop on Managing Insider Security Threats. https://doi.org/10.1145/2995959.2995969

Biodiversity management: “Multiscale connectivity and graph theory highlight critical areas for conservation under climate change”. Ecological Applications (8 June 2016). http://doi.org/10.1890/15-0925

Brain Imaging: “‘Small World’ architecture in brain connectivity and hippocampal volume in Alzheimer’s disease: a study via graph theory from EEG data”. Brain Imaging and Behavior, April 2017, Volume 11, Issue 2, pp 473-485. doi: 10.1007/s11682-016-9528-3

Small World trends in the two groups of subjects

Medical Imaging (OCT): “Reconstruction of 3D surface maps from anterior segment optical coherence tomography images using graph theory and genetic algorithms”. Biomedical Signal Processing and Control, Volume 25, March 2016, Pages 91-98. https://doi.org/10.1016/j.bspc.2015.11.004

Cybersecurity: “Big Data Behavioral Analytics Meet Graph Theory: On Effective Botnet Takedowns”. IEEE Network, Volume 31, Issue 1, January/February 2017. https://doi.org/10.1109/MNET.2016.1500116NM
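Several of the applications above, the OCT layer segmentations in particular, reduce to single-source shortest paths. A minimal sketch of the Dijkstra algorithm they cite; the toy weighted graph is an illustrative choice:

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest path lengths on a weighted graph given
    as {node: [(neighbour, weight), ...]} -- the primitive behind the
    graph-based OCT layer segmentation cited above."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Going around via 0 -> 1 -> 2 -> 3 beats the direct heavy edges.
adj = {0: [(1, 1.0), (2, 4.0)], 1: [(2, 1.0), (3, 5.0)], 2: [(3, 1.0)]}
d = dijkstra(adj, 0)
```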

Page 19: Geometric Deep Learning

Graph Signal Processing and quantitative graph theory

Defferrard et al. (2016): “The emerging field of Graph Signal Processing (GSP) aims at bridging the gap between signal processing and spectral graph theory [Shuman et al. 2013], a blend between graph theory and harmonic analysis. A goal is to generalize fundamental analysis operations for signals from regular grids to irregular structures embodied by graphs. We refer the reader to Belkin and Niyogi 2008 for an introduction of the field.”

Matthias Dehmer, Frank Emmert-Streib, Yongtang Shi. https://doi.org/10.1016/j.ins.2017.08.009

The main goal of quantitative graph theory is the structural quantification of information contained in complex networks by employing a measurement approach based on numerical invariants and comparisons. Furthermore, the methods as well as the networks do not need to be deterministic but can be statistical.

Shuman et al. 2013. Perraudin and Vandergheynst 2016: “the proposed Wiener regularization framework offers a compelling way to solve traditional problems such as denoising, regression or semi-supervised learning”

Experiments on the temperature of Molene. Top: A realization of the stochastic graph signal (first measure). Bottom center: the temperature of the Island of Brehat. Bottom right: Recovery errors (inpainting error) for different noise levels
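The flavor of such regularized recovery can be illustrated with a plain graph-Tikhonov denoiser, solving (I + γL)x = y. This is a simplified stand-in for the Wiener framework, not the paper's method; the ring graph, signal and γ are illustrative:

```python
import numpy as np

def graph_denoise(A, y, gamma=1.0):
    """Tikhonov denoising of a graph signal: solve
    argmin_x ||x - y||^2 + gamma * x^T L x  =>  (I + gamma L) x = y."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.solve(np.eye(len(A)) + gamma * L, y)

# Noisy measurements of a smooth signal on an 8-node ring.
rng = np.random.default_rng(0)
n = 8
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
clean = np.cos(2 * np.pi * np.arange(n) / n)
y = clean + 0.3 * rng.normal(size=n)
x = graph_denoise(A, y, gamma=2.0)
```

In the GFT domain this shrinks each coefficient by 1/(1 + γλ), so high graph frequencies (where the noise lives) are attenuated while the mean, the zero-frequency component, is preserved exactly.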

Page 20: Geometric Deep Learning

Graph Fourier Transform (GFT)

The use of Graph Fourier Transform in image processing: A new solution to classical problems. Francesco Verdoja, PhD Thesis, 2017. https://doi.org/10.1109/ICASSP.2017.7952886

On the Graph Fourier Transform for Directed Graphs. Stefania Sardellitti, Sergio Barbarossa, Paolo Di Lorenzo. IEEE Journal of Selected Topics in Signal Processing, Volume 11, Issue 6, Sept. 2017. https://doi.org/10.1109/JSTSP.2017.2726979

The analysis of signals defined over a graph is relevant in many applications, such as social and economic networks, big data or biological networks, and so on. A key tool for analyzing these signals is the so-called Graph Fourier Transform (GFT). Alternative definitions of GFT have been suggested in the literature, based on the eigen-decomposition of either the graph Laplacian or adjacency matrix. In this paper, we address the general case of directed graphs and we propose an alternative approach that builds the graph Fourier basis as the set of orthonormal vectors that minimize a continuous extension of the graph cut size, known as the Lovász extension.

Graph-based approaches have recently seen a spike of interest in the image processing and computer vision communities, and many classical problems are finding new solutions thanks to these techniques. The Graph Fourier Transform (GFT), the equivalent of the Fourier transform for graph signals, is used in many domains to analyze and process data modeled by a graph.

In this thesis we present some classical image processing problems that can be solved through the use of the GFT. We’ll focus our attention on two main research areas: the first is image compression, where the use of the GFT is finding its way into recent literature; we’ll propose two novel ways to deal with the problem of graph weight encoding. We’ll also propose approaches to reduce the overhead costs of shape-adaptive compression methods.

The second research field is image anomaly detection, to which the GFT has never been applied to date; we’ll discuss a novel technique and test its application on hyperspectral and medical (PET tumor scan) images.

Page 21: Geometric Deep Learning

Graph Signal Processing #1

Adaptive Least Mean Squares Estimation of Graph Signals. Paolo Di Lorenzo, Sergio Barbarossa, Paolo Banelli, Stefania Sardellitti. IEEE Transactions on Signal and Information Processing over Networks, Volume 2, Issue 4, Dec. 2016. https://doi.org/10.1109/TSIPN.2016.2613687

Distributed Adaptive Learning of Graph Signals. Paolo Di Lorenzo, Sergio Barbarossa, Paolo Banelli, Stefania Sardellitti. IEEE Transactions on Signal Processing, Volume 65, Issue 16, Aug. 15, 2017. https://doi.org/10.1109/TSP.2017.2708035

The aim of this paper is to propose a least mean squares (LMS) strategy for adaptive estimation of signals defined over graphs. Assuming the graph signal to be band-limited, over a known bandwidth, the method enables reconstruction, with guaranteed performance in terms of mean-square error, and tracking from a limited number of observations over a subset of vertices.

Furthermore, to cope with the case where the bandwidth is not known beforehand, we propose a method that performs a sparse online estimation of the signal support in the (graph) frequency domain, which enables online adaptation of the graph sampling strategy. Finally, we apply the proposed method to build the power spatial density cartography of a given operational region in a cognitive network environment.

“We apply the proposed distributed framework to power density cartography in cognitive radio (CR) networks. We consider a 5G scenario, where a dense deployment of radio access points (RAPs) is envisioned to provide a service environment characterized by very low latency and high rate access. Each RAP collects data related to the transmissions of primary users (PUs) at its geographical position, and communicates with other RAPs with the aim of implementing advanced cooperative sensing techniques”

“This paper represents the first work that merges the well established field of adaptation and learning over networks, and the emerging topic of signal processing over graphs. Several interesting problems are still open, e.g., distributed reconstruction in the presence of directed and/or switching graph topologies, online identification of the graph signal support from streaming data, distributed inference of the (possibly unknown) graph signal topology, adaptation of the sampling strategy to time-varying scenarios, optimization of the sampling probabilities, just to name a few. We plan to investigate on these exciting problems in our future works”
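The core idea of LMS estimation of a bandlimited graph signal observed on a vertex subset can be sketched as follows. The ring graph, bandwidth F, sampling set, step size μ and noise level are all illustrative choices, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n, F = 8, 2                       # graph size, assumed bandwidth

# Ring graph and its Laplacian eigenbasis (the GFT basis).
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
lam, U = np.linalg.eigh(np.diag(A.sum(axis=1)) - A)
B = U[:, :F] @ U[:, :F].T         # projector onto bandlimited signals

x_true = B @ rng.normal(size=n)   # bandlimited ground truth
S = np.diag([1.0, 1, 1, 1, 1, 0, 0, 0])  # observe first 5 vertices

# LMS recursion: project the instantaneous error on the observed
# vertices back onto the bandlimited subspace (mu is the step size).
x, mu = np.zeros(n), 0.5
for _ in range(3000):
    y = S @ (x_true + 0.01 * rng.normal(size=n))  # noisy partial obs
    x = x + mu * B @ (y - S @ x)
```

The estimate converges (in the mean) to the full signal even though three vertices are never observed, because a bandwidth-2 signal is determined by far fewer than five samples here.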

Page 22: Geometric Deep Learning

Graph Signal Processing #2

Kernel Regression for Signals over Graphs. Arun Venkitaraman, Saikat Chatterjee, Peter Händel (submitted 7 Jun 2017). https://arxiv.org/abs/1706.02191

Uncertainty Principles and Sparse Eigenvectors of Graphs. Arun Venkitaraman, Saikat Chatterjee, Peter Händel. IEEE Transactions on Signal Processing, Volume 65, Issue 20, Oct. 15, 2017. https://doi.org/10.1109/TSP.2017.2731299

We propose kernel regression for signals over graphs. The optimal regression coefficients are learnt using a constraint that the target vector is a smooth signal over an underlying graph. The constraint is imposed using a graph-Laplacian based regularization. We discuss how the proposed kernel regression exhibits a smoothing effect, simultaneously achieving noise-reduction and graph-smoothness. We further extend the kernel regression to simultaneously learn the underlying graph and the regression coefficients.

Our hypothesis was that incorporating the graph smoothness constraint would help kernel regression to perform better, particularly when we lack sufficient and reliable training data. Our experiments illustrate that this is indeed the case in practice.

Through experiments we also conclude that graph signals carry sufficient information about the underlying graph structure which may be extracted in the regression setting even with moderately small number of samples in comparison with the graph dimension. Thus, our approach helps both predict and infer the underlying topology of the network or graph.
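A Laplacian-regularized kernel regression of this flavor admits a compact closed-form sketch: in the GFT basis the smoothness penalty decouples, and each graph frequency becomes an ordinary ridge solve. The kernel, toy data and hyperparameters a, b are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def rbf(X, Z, s=1.0):
    """Gaussian kernel matrix between input sets X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * s * s))

def graph_kernel_regression(X, Y, L, a=0.1, b=1.0):
    """Kernel ridge regression whose targets are graph signals, with a
    graph-Laplacian smoothness penalty on the predictions:

        min_C ||Y - K C||_F^2 + a ||C||_F^2 + b tr((K C) L (K C)^T)

    Rotating the columns into the Laplacian eigenbasis decouples the
    problem into independent ridge solves, one per graph frequency.
    Returns the fitted training predictions K C."""
    K = rbf(X, X)
    lam, U = np.linalg.eigh(L)
    Yt = Y @ U                       # targets in the GFT domain
    Ct = np.zeros_like(Yt)
    KtK = K.T @ K
    for j, lj in enumerate(lam):
        M = (1 + b * lj) * KtK + a * np.eye(len(K))
        Ct[:, j] = np.linalg.solve(M, K.T @ Yt[:, j])
    return K @ Ct @ U.T

# 4-node path graph; 10 scalar inputs with noisy graph-signal targets.
rng = np.random.default_rng(0)
A = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
L = np.diag(A.sum(axis=1)) - A
X = np.linspace(0, 1, 10)[:, None]
Y = np.outer(np.sin(2 * np.pi * X[:, 0]), np.ones(4)) \
    + 0.2 * rng.normal(size=(10, 4))
P_smooth = graph_kernel_regression(X, Y, L, a=0.1, b=5.0)
P_plain = graph_kernel_regression(X, Y, L, a=0.1, b=0.0)
```

Increasing b trades data fit for graph-smoothness of the predictions, which is exactly the smoothing effect the abstract describes.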

When the graph has repeated eigenvalues, we explained that a graph Fourier basis (GFB) is not unique, and the derived lower bound can have different values depending on the selected GFB. We provided a constructive method to find a GFB that yields the smallest uncertainty bound. In order to find the signals that achieve the derived lower bound we considered sparse eigenvectors of the graph. We showed that the graph Laplacian has a 2-sparse eigenvector if and only if there exists a pair of nodes with the same neighbors. When this happens, the uncertainty bound is very low and the 2-sparse eigenvectors achieve this bound. We presented examples of both classical and real-world graphs with 2-sparse eigenvectors. We also discussed that, in some examples, the neighborhood structure has a meaningful interpretation.
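The 2-sparse eigenvector condition is easy to verify numerically: give two vertices identical neighborhoods and check that the (normalized) difference of their indicator vectors is a Laplacian eigenvector. The 4-node graph below is an illustrative choice:

```python
import numpy as np

# Vertices 1 and 2 have identical neighbourhoods (both attach only to
# vertex 0), so e1 - e2 is a 2-sparse eigenvector of the Laplacian,
# with eigenvalue equal to their shared degree (here 1).
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
v = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
```

The construction works because L acts identically on vertices with the same neighborhood, so their difference only picks up the diagonal (degree) term.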

Page 23: Geometric Deep Learning

Graph Signal Processing #3: Time-varying graphs
Kernel-Based Reconstruction of Space-Time Functions on Dynamic Graphs
Daniel Romero, Vassilis N. Ioannidis, Georgios B. Giannakis
IEEE Journal of Selected Topics in Signal Processing (Volume: 11, Issue: 6, Sept. 2017)
https://doi.org/10.1109/JSTSP.2017.2726976

Filtering Random Graph Processes Over Random Time-Varying Graphs
Kai Qiu, Xianghui Mao, Xinyue Shen, Xiaohan Wang, Tiejian Li, Yuantao Gu
IEEE Journal of Selected Topics in Signal Processing (Volume: 11, Issue: 6, Sept. 2017)
https://doi.org/10.1109/JSTSP.2017.2726969

DSLR: distributed least squares reconstruction; LMS: least mean-squares; KKF: kernel Kalman filter; ECoG: electrocorticography; NMSE: cumulative normalized mean-square error

This paper investigated kernel-based reconstruction of space-time functions on graphs. The adopted approach relied on the construction of an extended graph, which regards the time dimension just as a spatial dimension. Several kernel designs were introduced together with batch and online function estimators. The latter is a kernel Kalman filter developed from a purely deterministic standpoint, without the need to adopt a state-space model. Future research will deal with multi-kernel and distributed versions of the proposed algorithms.

Schemes tailored for time-evolving functions on graphs include [Bach and Jordan 2004] and [Mei and Moura 2016], which predict the function values at time t given observations up to time t − 1. However, these schemes assume that the function of interest adheres to a specific vector autoregression and all vertices are observed at previous time instances. Moreover, [Bach and Jordan 2004] requires Gaussianity along with an ad hoc form of stationarity.

However, many real-world graph signals are time-varying, and they evolve smoothly, so instead of the signals themselves being bandlimited or smooth on graph, it is more reasonable that their temporal differences are smooth on graph. In this paper, a new batch reconstruction method of time-varying graph signals is proposed by exploiting the smoothness of the temporal difference signals, and the uniqueness of the solution to the corresponding optimization problem is theoretically analyzed. Furthermore, driven by practical applications faced with real-time requirements, huge size of data, lack of computing center, or communication difficulties between two non-neighboring vertices, an online distributed method is proposed by applying local properties of the temporal difference operator and the graph Laplacian matrix.
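A minimal sketch of the batch idea, not the paper's algorithm: reconstruct a partially observed time-varying signal by penalizing temporal differences that are not smooth on the graph, using plain gradient descent (the path graph, mask density, step size, and iteration count are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Path graph with n nodes, T time steps.
n, T = 8, 12
A = np.diag(np.ones(n - 1), 1); A = A + A.T
L = np.diag(A.sum(1)) - A

# Ground-truth signal whose *temporal differences* vary slowly across nodes.
X_true = np.cumsum(0.1 * rng.normal(size=(n, T)), axis=1) + np.linspace(0, 1, n)[:, None]

M = rng.random((n, T)) < 0.6          # observation mask (about 60% of entries seen)
Y = np.where(M, X_true, 0.0)

def reconstruct(Y, M, L, alpha=1.0, lr=0.02, iters=2000):
    """min_X ||M o (X - Y)||^2 + alpha * sum_t (x_{t+1}-x_t)^T L (x_{t+1}-x_t)."""
    X = Y.copy()
    for _ in range(iters):
        D = np.diff(X, axis=1)            # temporal differences, shape (n, T-1)
        G = 2 * M * (X - Y)               # data-fit gradient on observed entries
        LD = L @ D
        G[:, :-1] -= 2 * alpha * LD       # d/dx_t   of (x_{t+1}-x_t)^T L (.)
        G[:, 1:] += 2 * alpha * LD        # d/dx_{t+1} of the same term
        X -= lr * G
    return X

X_hat = reconstruct(Y, M, L)
err = np.linalg.norm((X_hat - X_true)[~M]) / np.linalg.norm(X_true[~M])
print("NMSE on unobserved entries:", err)
```

The online distributed method in the paper exploits the locality of both the temporal difference operator and the Laplacian; the batch gradient above already shows that each update only touches a vertex's neighbors.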

In the future, we will further study the applications of smoothness of temporal difference signals, and may combine it with other properties of signals, such as low rank. Besides, it is also interesting to consider the situation where both the signal and the graph are time-varying.

Page 24: Geometric Deep Learning

Graph Signal Processing #4: Time-varying graphs
Signal Processing on Graphs: Causal Modeling of Unstructured Data
Jonathan Mei, José M. F. Moura
(Submitted on 28 Feb 2015 (v1), last revised 8 Feb 2017 (this version, v6))
https://arxiv.org/abs/1503.00173

Learning Directed Graph Shifts from High-Dimensional Time Series
Lukas Nagel (June 2017)
Master Thesis, Institute of Telecommunications (TU Wien)
https://pdfs.semanticscholar.org/8822/526b7b2862f6374f5f950c89a14a7a931820.pdf

Many applications collect a large number of time series, for example, the financial data of companies quoted in a stock exchange, the health care data of all patients that visit the emergency room of a hospital, or the temperature sequences continuously measured by weather stations across the US. These data are often referred to as unstructured.

A first task in its analytics is to derive a low-dimensional representation, a graph or discrete manifold, that describes well the interrelations among the time series and their intrarelations across time. This paper presents a computationally tractable algorithm for estimating this graph that structures the data. The resulting graph is directed and weighted, possibly capturing causal relations, not just reciprocal correlations as in many existing approaches in the literature. A convergence analysis is carried out. The algorithm is demonstrated on random graph datasets and real network time series datasets, and its performance is compared to that of related methods. The adjacency matrices estimated with the new method are close to the true graph in the simulated data and consistent with prior physical knowledge in the real dataset tested.

Frequency ordering depending on the position of the eigenvalues λ in ℂ. Both graphics are from Sandryhaila and Moura 2014.

Causal graph signal process. Visualization of the information spreading through graph shifts for P3(A, c).
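The core of causal-graph-process estimation, stripped to its simplest special case, is recovering a shift matrix A from x_t ≈ A x_{t−1}. A hedged VAR(1)-style least-squares sketch (the sparse A and all sizes are illustrative assumptions; the actual works fit higher-order polynomials of A with sparsity penalties):

```python
import numpy as np

rng = np.random.default_rng(2)

# A sparse, stable "graph shift" driving a causal process x_t = A x_{t-1} + w_t.
n, T = 5, 4000
A_true = np.zeros((n, n))
A_true[1, 0] = A_true[2, 1] = A_true[3, 2] = A_true[4, 0] = 0.5

X = np.zeros((n, T))
for t in range(1, T):
    X[:, t] = A_true @ X[:, t - 1] + 0.1 * rng.normal(size=n)

# Least squares: A_hat = argmin_A sum_t ||x_t - A x_{t-1}||^2
Past, Fut = X[:, :-1], X[:, 1:]
A_hat = Fut @ Past.T @ np.linalg.inv(Past @ Past.T)

print(np.round(A_hat, 2))   # close to A_true for long enough time series
```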

We want to apply the causal graph process estimation algorithm to stock prices and especially point out some additional points of failure we spotted.

In the shift matrix shown in Figure 4.9a, we observe that the stocks number 2, 16 and 24 have many incoming connections. It appears unlikely that this is due to some economic relations and points towards a numerical problem.

As we were interested in potential interpretations of the shift recovered from the stock data, we chose to visualize the largest possible directions of the shift shown in Figure 4.11 as a graph in Figure 4.12. The only observation we could draw from the graph is that there are multiple bank stocks, which affect multiple other stocks. Otherwise, the connected companies show no common ownership structure nor even similar or related products.

The stocks example, with no clear expectation, did not lead to promising results. Despite this, we described two preprocessing steps, scaling and averaging, that could be applied before starting the estimation algorithm. It is unclear whether further tuning would be needed or whether the domain of daily stock data cannot reasonably be modeled with causal graph processes; we therefore leave this question open for future research.

Page 25: Geometric Deep Learning

Graph Wavelet transform vs. GFT #1
Compression of dynamic 3D point clouds using subdivisional meshes and graph wavelet transforms
Aamir Anis, Philip A. Chou, Antonio Ortega
University of Southern California, Los Angeles, CA; Microsoft Research, Redmond, WA
Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE
https://doi.org/10.1109/ICASSP.2016.7472901

The subdivisional structure also allows us to obtain a sequence of bipartite graphs that facilitate the use of GraphBior [Narang et al. (2012)] to compute the wavelet transform coefficients of the geometry and color attributes.

Compact Support Biorthogonal Wavelet Filterbanks for Arbitrary Undirected Graphs
Sunil K. Narang, Antonio Ortega
(Submitted on 30 Oct 2012 (v1), last revised 19 Nov 2012 (this version, v2))
https://arxiv.org/abs/1210.8129

In this paper, we provide a framework for compression of 3D point cloud sequences. Our approach involves representing sets of frames by a consistently-evolving high-resolution subdivisional triangular mesh. This representation helps us facilitate efficient implementations of motion estimation and graph wavelet transforms. The subdivisional structure plays a crucial role in designing a simple hierarchical method for efficiently estimating these meshes, and the application of Biorthogonal Graph Wavelet Filterbanks for compression. Preliminary experimental results show promising performances of both the estimation and the compression steps, and we believe this work shall open new avenues of research in this emerging field.

Comparison of graph wavelet designs in terms of key properties: zero highpass response for constant graph-signal (DC), critical sampling (CS), perfect reconstruction (PR), compact support (Comp), orthogonal expansion (OE), requires graph simplification (GS).

In this paper we have presented novel graph-wavelet filterbanks that provide a critically sampled representation with compactly supported basis functions.

The filterbanks come in two flavors: a) nonzeroDC filterbanks, and b) zeroDC filterbanks. The former filterbanks are designed as polynomials of the normalized graph Laplacian matrix, and the latter filterbanks are extensions of the former to provide a zero response by the highpass operators.

Preliminary results showed that the filterbanks are useful not only for arbitrary graphs but also for standard regular signal-processing domains. Extensions of this work will focus on the application of these filters to different scenarios, including, for example, social network analysis and sensor networks.
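For contrast with such filterbanks, the GFT baseline they are compared against is a dense eigendecomposition of the Laplacian: O(N³) to compute, with non-localized basis vectors. A minimal sketch on a cycle graph (chosen because its GFT coincides, up to ordering, with the classical DFT basis):

```python
import numpy as np

# Cycle graph on n nodes.
n = 8
A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
L = np.diag(A.sum(1)) - A

lam, U = np.linalg.eigh(L)        # graph frequencies (ascending) and Fourier basis

x = np.sin(2 * np.pi * np.arange(n) / n)   # a smooth signal on the cycle
x_hat = U.T @ x                   # forward GFT
x_rec = U @ x_hat                 # inverse GFT (perfect reconstruction)

print("energy in the 3 lowest graph frequencies:",
      np.sum(x_hat[:3] ** 2) / np.sum(x_hat ** 2))
```

A smooth signal concentrates its energy in the low graph frequencies, which is exactly the compaction property that graph wavelet designs try to combine with spatial localization and critical sampling.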

Page 26: Geometric Deep Learning

Graph Wavelet transform vs. GFT #2
Bipartite Approximation for Graph Wavelet Signal Decomposition
Jin Zeng, Gene Cheung, Antonio Ortega
IEEE Transactions on Signal Processing (Volume: 65, Issue: 20, Oct. 15, 2017)
https://doi.org/10.1109/TSP.2017.2733489

Splines and Wavelets on Circulant Graphs
Madeleine S. Kotzagiannidis, Pier Luigi Dragotti
(Submitted on 15 Mar 2016)
https://arxiv.org/abs/1603.04917

(a) Two-channel wavelet filterbank on a bipartite graph; (b) kernels of H0, H1 in graphBior (Narang et al. 2012) with filter length of 19.

Unlike previous works, our design of the two metrics relates directly to energy compaction for bipartite subgraph decomposition. Comparison with the state-of-the-art schemes validates our proposed metrics for energy compaction and illustrates the efficiency of our approach. We are currently working on different applications of graphBior with our bipartite approximation, e.g., graph-signal denoising, which will benefit from the energy compaction in the wavelet domain.

In this paper, we have introduced novel families of wavelets and associated filterbanks on circulant graphs with vanishing moment properties, which reveal (e-)spline-like functions on graphs, and promote sparse multiscale representations.

Moreover, we have discussed generalizations to arbitrary graphs in the form of a multidimensional wavelet analysis scheme based on graph product decomposition, facilitating a sparsity-promoting generalization with the advantage of lower-dimensional processing. In our future work, we wish to further explore the sets of graph signals which can be annihilated with existing and/or evolved graph wavelets as well as refine its extensions and relevance for arbitrary graphs.

Page 27: Geometric Deep Learning

Graphlets: induced subgraphs of a large network
Estimation of Graphlet Statistics
Ryan A. Rossi, Rong Zhou, and Nesreen K. Ahmed
(Submitted on 6 Jan 2017 (v1), last revised 28 Feb 2017 (this version, v2))
https://arxiv.org/abs/1701.01772

Page 28: Geometric Deep Learning

Graph Computing Accelerations
Parallel Local Algorithms for Core, Truss, and Nucleus Decompositions
Ahmet Erdem Sariyuce, C. Seshadhri, Ali Pinar
Sandia National Laboratories, University of California
(Submitted on 2 Apr 2017)
https://arxiv.org/abs/1704.00386

Finding the dense regions of a graph and relations among them is a fundamental task in network analysis. Nucleus decomposition is a principled framework of algorithms that generalizes the k-core and k-truss decompositions. It can leverage the higher-order structures to locate the dense subgraphs with hierarchical relations. … We present a framework of local algorithms to obtain the exact and approximate nucleus decompositions. Our algorithms are pleasingly parallel and can provide approximations to explore time and quality trade-offs. Our shared-memory implementation verifies the efficiency, scalability, and effectiveness of our algorithms on real-world networks. In particular, using 24 threads, we obtain up to 4.04x and 7.98x speedups for k-truss and (3, 4) nucleus decompositions.
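The k-core decomposition that nucleus decomposition generalizes can be sketched with the classical sequential peeling algorithm (this is the textbook serial version, not the parallel local algorithms of the paper):

```python
def core_numbers(adj):
    """Peeling: repeatedly remove the minimum-degree vertex; a vertex's core
    number is the (running maximum) degree at which it is peeled."""
    deg = {v: len(ns) for v, ns in adj.items()}
    alive = set(adj)
    core = {}
    k = 0
    while alive:
        v = min(alive, key=deg.get)       # current minimum-degree vertex
        k = max(k, deg[v])
        core[v] = k
        alive.remove(v)
        for u in adj[v]:                  # peeling v lowers its neighbours' degrees
            if u in alive:
                deg[u] -= 1
    return core

# Triangle {0, 1, 2} (a 2-core) with a pendant vertex 3 (a 1-core).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(core_numbers(adj))   # -> {3: 1, 0: 2, 1: 2, 2: 2}
```

Trusses and nuclei replace vertex degree with triangle and clique counts, which is where the higher-order structure (and most of the computational cost) comes in.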

Page 29: Geometric Deep Learning

p-Laplacian on graphs
p-Laplacian Regularized Sparse Coding for Human Activity Recognition
Weifeng Liu, Zheng-Jun Zha, Yanjiang Wang, Ke Lu, Dacheng Tao
IEEE Transactions on Industrial Electronics (Volume: 63, Issue: 8, Aug. 2016)
https://doi.org/10.1109/TIE.2016.2552147

On the game p-Laplacian on weighted graphs with applications in image processing and data clustering
A. Elmoataz, X. Desquesnes and M. Toutain (3 July 2017)
European Journal of Applied Mathematics
https://doi.org/10.1017/S0956792517000122

In this paper, we have introduced a new class of normalized p-Laplacian operators as a discrete adaptation of the game-theoretic p-Laplacian on weighted graphs. This class is based on a new partial difference operator which interpolates between the normalized 2-Laplacian, 1-Laplacian and ∞-Laplacian on graphs. This operator is also connected to non-local average operators such as the non-local mean, non-local median and non-local midrange. It generalizes the normalized p-Laplacian on graphs for 1 ≤ p ≤ ∞. We have shown the connections with local and non-local PDEs of p-Laplacian type and the stochastic game Tug-of-War with noise (Peres et al. 2008). We have proved existence and uniqueness of the Dirichlet problem involving operators of this new class. Finally, we have illustrated the interest and behaviour of such operators in some inverse problems in image processing and machine learning.
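A hedged sketch of the plain (unnormalized) graph p-Laplacian that such operators interpolate; the normalized game-theoretic operator in the paper differs, but the p = 2 case below recovers the usual combinatorial Laplacian (up to sign):

```python
import numpy as np

def p_laplacian(W, u, p):
    """(Delta_p u)_i = sum_j W_ij |u_j - u_i|^(p-2) (u_j - u_i).
    p = 2 gives the negated combinatorial Laplacian; p -> 1 and p -> inf
    move toward the 1- and inf-Laplacian behaviours."""
    D = u[None, :] - u[:, None]                 # D[i, j] = u_j - u_i
    mag = np.where(D == 0.0, 1.0, np.abs(D))    # avoid 0**(p-2) = inf when p < 2
    return (W * mag ** (p - 2) * D).sum(axis=1)

# Path graph on 4 nodes with unit weights.
W = np.diag(np.ones(3), 1); W = W + W.T
u = np.array([0.0, 1.0, 3.0, 6.0])

L = np.diag(W.sum(1)) - W
print(p_laplacian(W, u, 2), -(L @ u))    # identical for p = 2
print(p_laplacian(W, u, 1.5))            # a value between the 1- and 2-Laplacian
```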

The framework of human activity recognition. Firstly, we extract representative features of human activity, including SIFT, STIP and MFCC. Then we concatenate the histograms formed by bags of each feature. Thirdly, we learn the sparse codes of each sample and the corresponding dictionary simultaneously with the p-Laplacian regularized sparse coding algorithm. Finally, we input the learned sparse codes into classifiers, i.e., support vector machines, to conduct human activity recognition.

As a sparse representation, the proposed p-Laplacian regularized sparse coding algorithm can also be employed for modern industry using data-based techniques [Jung et al. 2015; Shen et al. 2015] and other computer vision applications such as video summary and visual tracking [Bai and Li 2014; Yu et al. 2016].

In the future, we will apply the proposed p-Laplacian regularized sparse coding for more practical implementations. We will also study the extensions to the multiview learning and deep architecture construction for more attractive performance.

Sparse coding has achieved promising performance in classification. The most prominent Laplacian regularized sparse coding employs Laplacian regularization to preserve the manifold structure; however, Laplacian regularization suffers from poor generalization. To tackle this problem, we present a p-Laplacian regularized sparse coding algorithm by introducing the nonlinear generalization of the standard graph Laplacian to exploit the local geometry. Compared to the conventional graph Laplacian, the p-Laplacian has a tighter isoperimetric inequality, and p-Laplacian regularized sparse coding therefore enjoys stronger theoretical support.

Page 30: Geometric Deep Learning

“Applied Laplacian” Mesh processing #1A

Spectral Mesh Processing
H. Zhang, O. Van Kaick, R. Dyer
Computer Graphics Forum, 9 April 2010
http://dx.doi.org/10.1111/j.1467-8659.2010.01655.x

Page 31: Geometric Deep Learning

Graph Framework for Manifold-valued Data: image processing
Nonlocal Inpainting of Manifold-valued Data on Finite Weighted Graphs
Ronny Bergmann, Daniel Tenbrinck
(Submitted on 21 Apr 2017 (v1), last revised 12 Jul 2017 (this version, v2))
https://arxiv.org/abs/1704.06424
Open-source code: http://www.mathematik.uni-kl.de/imagepro/members/bergmann/mvirt/

A Graph Framework for Manifold-valued Data
Ronny Bergmann, Daniel Tenbrinck
(Submitted on 17 Feb 2017)
https://arxiv.org/abs/1702.05293

Recently, there has been a strong ambition to translate models and algorithms from traditional image processing to non-Euclidean domains, e.g., to manifold-valued data. While the task of denoising has been extensively studied in the last years, there was rarely an attempt to perform image inpainting on manifold-valued data. In this paper we present a nonlocal inpainting method for manifold-valued data given on a finite weighted graph.

First numerical examples using a nonlocal graph construction with patch-based similarity measures demonstrate the capabilities and performance of the inpainting algorithm applied to manifold-valued images.
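A drastically simplified, real-valued and local analogue of graph inpainting (the actual method is nonlocal, patch-based, and manifold-valued): fix the known vertices and relax the unknown ones toward the average of their neighbors, i.e. solve the p = 2 Dirichlet problem by Jacobi iteration. The path graph and iteration count are illustrative assumptions:

```python
import numpy as np

# A 1-D "image" as a path graph; all interior samples are missing.
n = 11
W = np.diag(np.ones(n - 1), 1); W = W + W.T

x_true = np.linspace(0.0, 1.0, n)
known = np.zeros(n, bool); known[[0, n - 1]] = True   # only the endpoints observed
x = np.where(known, x_true, 0.0)

# Jacobi iteration for harmonic inpainting: unknown vertices converge to the
# average of their neighbours, with known vertices as boundary conditions.
for _ in range(2000):
    avg = (W @ x) / W.sum(1)
    x = np.where(known, x, avg)

print(np.round(x, 3))   # converges to the linear interpolation of the endpoints
```

On a path graph the harmonic solution is linear interpolation; with nonlocal patch-based weights the same fixed-point structure propagates texture rather than just smooth values.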

Besides an analytic investigation of the convergence of the presented scheme, future work includes further development of numerical algorithms, as well as properties of the ∞-Laplacian for manifold-valued vertex functions on graphs.

Illustration of the basic definitions and concepts on a Riemannian manifold M.

In the following we present several examples illustrating the large variety of problems that can be tackled using the proposed manifold-valued graph framework. Furthermore, we compare our framework for the special case of nonlocal denoising of phase-valued data to a state-of-the-art method.

Finally, we demonstrate a real-world application of denoising surface normals in digital elevation maps from LiDAR data. Subsequently, we model manifold-valued data measured on samples of an explicitly given surface, and in particular illustrate denoising of diffusion tensors measured on a sphere. Finally, we investigate denoising of real DT-MRI data from medical applications, both on a regular pixel grid as well as on an implicitly given surface. All algorithms were implemented in Mathworks Matlab by extending the open-source software package Manifold-valued Image Restoration Toolbox (MVIRT).

Reconstruction results of measured surface normals in digital elevation maps (DEM) generated by light detection and ranging (LiDAR) measurements of earth’s surface topology.

Reconstruction results of manifold-valued data given on the implicit surface of the open Camino brain data set.

Page 32: Geometric Deep Learning

Segmentation of graphs #1
Convex variational methods for multiclass data segmentation on graphs
Egil Bae, Ekaterina Merkurjev
(Submitted on 4 May 2016 (v1), last revised 16 Feb 2017 (this version, v4))
https://arxiv.org/abs/1605.01443 | https://doi.org/10.1007/s10851-017-0713-9

Theoretical Analysis of Active Contours on Graphs
Christos Sakaridis, Kimon Drakopoulos, Petros Maragos
(Submitted on 24 Oct 2016)
https://arxiv.org/abs/1610.07381

Detection of a triangle on a random geometric graph. Edges are omitted for illustration purposes. (a) Original triangle on the graph. (b)–(f) Instances of active contour evolution at intervals of 60 iterations, with vertices in the contour's interior shown in red and the rest in blue. (g) Final detection result after 300 iterations, using green for true positives, blue for true negatives, red for false positives and black for false negatives.

Experiments on 3D point clouds acquired by a LiDAR in outdoor scenes demonstrate that the scenes can accurately be segmented into object classes such as vegetation, the ground plane and regular structures. The experiments also demonstrate fast and highly accurate convergence of the algorithms, and show that the approximation difference between the convex and original problems vanishes or becomes extremely low in practice.

In the future, it would be interesting to investigate region homogeneity terms for general unsupervised classification problems. In addition to avoiding the problem of trivial global minimizers, the region terms may improve the accuracy compared to models based primarily on boundary terms. Region homogeneity may for instance be defined in terms of the eigendecomposition of the covariance matrix or graph Laplacian.
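A minimal illustration of graph segmentation via the spectral relaxation these convex methods improve upon: threshold the Fiedler vector (the eigenvector of the second-smallest Laplacian eigenvalue) to bisect a graph with two planted clusters. The dumbbell graph below is an illustrative assumption:

```python
import numpy as np

# Two 5-node cliques joined by a single bridge edge.
n = 10
A = np.zeros((n, n))
for i in range(5):
    for j in range(i + 1, 5):
        A[i, j] = A[j, i] = 1.0          # first clique
        A[i + 5, j + 5] = A[j + 5, i + 5] = 1.0  # second clique
A[4, 5] = A[5, 4] = 1.0                  # bridge

L = np.diag(A.sum(1)) - A
lam, V = np.linalg.eigh(L)
fiedler = V[:, 1]                        # second-smallest eigenvalue's eigenvector
labels = (fiedler > 0).astype(int)       # sign pattern gives the bisection
print(labels)
```

The sign of the Fiedler vector recovers the two cliques exactly; multiclass convex variational methods replace this relaxation-plus-threshold step with a tighter convex problem.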

Page 33: Geometric Deep Learning

Segmentation of graphs #2: Scalable Motif-aware Graph Clustering
C. E. Tsourakakis, J. Pachocki, Michael Mitzenmacher
Harvard University, Cambridge, MA, USA
WWW '17 Proceedings of the 26th International Conference on World Wide Web, pages 1451-1460
https://doi.org/10.1145/3038912.3052653

Coarsening Massive Influence Networks for Scalable Diffusion Analysis
Naoto Ohsaka, Tomohiro Sonobe, Sumio Fujita, Ken-ichi Kawarabayashi
SIGMOD '17 Proceedings of the 2017 ACM International Conference on Management of Data, pages 635-650
https://doi.org/10.1145/3035918.3064045

“Superpixelization”/clustering to speed up computations

Higher-order organization of complex networks
Austin R. Benson, David F. Gleich, Jure Leskovec
(Submitted on 26 Dec 2016)
https://arxiv.org/abs/1612.08447 → preprint of the Science article
https://doi.org/10.1126/science.aad9029

Theoretical results in the supplementary materials also explain why classes of hypergraph partitioning methods are more general than previously assumed and how motif-based clustering provides a rigorous framework for the special case of partitioning directed graphs. Finally, the higher-order network clustering framework is generally applicable to a wide range of network types, including directed, undirected, weighted, and signed networks.

Page 34: Geometric Deep Learning

Graph Summarization #1A

Graph Summarization: A Survey
Yike Liu, Abhilash Dighe, Tara Safavi, Danai Koutra
(Submitted on 14 Dec 2016 (v1), last revised 12 Apr 2017 (this version, v2))
https://arxiv.org/abs/1612.04883

The abundance of generated data and its velocity call for data summarization, one of the main data mining tasks. … This survey focuses on summarizing interconnected data, otherwise known as graphs or networks. … . In general, graph summarization or coarsening or aggregation approaches seek to find a short representation of the input graph, often in the form of a summary or sparsified graph, which reveals patterns in the original data and preserves specific structural or other properties, depending on the application domain.
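One common coarsening primitive behind such summaries is heavy-edge matching: repeatedly merge the endpoints of the heaviest remaining edge, then sum the edge weights between the resulting groups. A hedged single-level sketch (the 4-node weighted graph is an illustrative assumption):

```python
import numpy as np

def coarsen(W):
    """One level of heavy-edge matching: greedily merge the endpoints of the
    heaviest unmatched edge; inter-group weights in the summary are summed."""
    n = W.shape[0]
    matched = np.zeros(n, bool)
    groups = []
    # Edge positions sorted by descending weight.
    order = np.dstack(np.unravel_index(np.argsort(-W, axis=None), W.shape))[0]
    for i, j in order:
        if i < j and W[i, j] > 0 and not matched[i] and not matched[j]:
            matched[[i, j]] = True
            groups.append([i, j])
    groups += [[i] for i in range(n) if not matched[i]]   # leftovers stay single
    P = np.zeros((len(groups), n))                        # assignment matrix
    for g, nodes in enumerate(groups):
        P[g, nodes] = 1.0
    Wc = P @ W @ P.T
    np.fill_diagonal(Wc, 0.0)        # drop self-loops in the summary graph
    return Wc, groups

W = np.array([[0, 3, 1, 0],
              [3, 0, 0, 1],
              [1, 0, 0, 2],
              [0, 1, 2, 0]], float)
Wc, groups = coarsen(W)
print(groups, Wc)    # merges {0,1} and {2,3}; summary edge weight 1 + 1 = 2
```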

Page 35: Geometric Deep Learning

Graph Summarization #1B

Table I: Qualitative comparison of static graph summarization techniques. The first six columns describe the type of the input graph (e.g. with weighted/directed edges, and one/multiple types of node entities), followed by three algorithm-specific properties (i.e., user parameters, algorithmic complexity (linear in the number of edges or higher), and type of output). The last column gives the final purpose of each approach. Notation: (1) ∗ indicates that the algorithm can be extended to handle the corresponding type of input, but the authors do not provide details in the paper; for complexity, ∗ indicates sub-linear; (2) + means that at least one parameter can be set by the user, but it is not required (i.e., the algorithm provides a default value). - Liu et al. (2017)

Page 36: Geometric Deep Learning

Point cloud resampling via graphs
Fast Resampling of 3D Point Clouds via Graphs
Siheng Chen, Dong Tian, Chen Feng, Anthony Vetro, Jelena Kovačević
Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE
https://doi.org/10.1109/ICASSP.2017.7952695
https://arxiv.org/abs/1702.06397

The proposed resampling strategy enhances contours of a point cloud. Plots (a) and (b) resample 2% of the points from a 3D point cloud of a building containing 381,903 points. Plot (b) is more visually friendly than Plot (a). Note that the proposed resampling strategy is able to enhance any information depending on users' preferences.
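A toy sketch of the underlying idea, not the paper's graph-filter design: score each point by a Haar-like high-pass response (its distance to the mean of its k nearest neighbors) and keep the highest-scoring points, which concentrate on contours and corners. All sizes and the synthetic cloud are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy "point cloud": a dense flat segment plus a sharp corner feature.
flat = np.c_[np.linspace(0, 1, 60), np.zeros(60)]
corner = np.c_[np.full(15, 1.0), np.linspace(0.0, 0.5, 15)]
P = np.vstack([flat, corner]) + 0.002 * rng.normal(size=(75, 2))

# k-NN graph, then a high-pass response per point:
# score_i = || p_i - mean of its k nearest neighbours ||.
k = 6
D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
nn = np.argsort(D, axis=1)[:, 1:k + 1]          # k nearest neighbours (self excluded)
score = np.linalg.norm(P - P[nn].mean(axis=1), axis=1)

keep = np.argsort(-score)[:10]     # resample the 10 highest-response points
print(sorted(keep.tolist()))       # indices cluster at endpoints and the corner
```

Interior points of a smooth region sit near their neighbors' centroid and score close to zero, so they are dropped first; that is the "contour enhancing" behavior in the figure above.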

Page 37: Geometric Deep Learning

2D Image Processing with graphs
Directional graph weight prediction for image compression
Francesco Verdoja, Marco Grangetto
Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE
https://doi.org/10.1109/ICASSP.2017.7952410

The experimental results showed that the proposed technique is able to improve the compression efficiency; as an example we reported a Bjøntegaard Delta (BD) rate reduction of about 30% over JPEG.

Future work will investigate the integration of the proposed method into more advanced image and video coding tools comprising adaptive block sizes and a richer set of intra prediction modes.

Luminance coding in graph-based representation of multiview images
Thomas Maugey, Yung-Hsuan Chao, Akshay Gadde, Antonio Ortega, Pascal Frossard
Image Processing (ICIP), 2014 IEEE
https://doi.org/10.1109/ICIP.2014.7025025

(a) Wavelet decomposition on graphs in GraphBior, where shapes {circle, triangle, square, and cross} denote coefficients in the LL, LH, HL, HH subbands. (b) Parent-children relationship: a P node in the LH band of level l + 1 has five children from two views in level l, marked with blue. (c) The procedure of finding the children nodes in level l for the parent node in level l + 1.

Page 39: Geometric Deep Learning

Graph structure known or not?

GRAPH KNOWN: “Graph well defined, when the temperature measurement positions are known, and temperature measurement uncertainty is small”

- Perraudin and Vandergheynst 2016

GRAPH “SEMI-KNOWN”: In a way the structure is known, as we can quantify the graph signal as the number of citations with some journal-impact-factor weighting, but does this really represent the impact of an article? Scientists are known to game the system and just respond to the metrics[*]. Are there alternative ways to improve the graph to better represent the impact of an article?

GRAPH NOT KNOWN: “A point cloud measured with a terrestrial laser scanner is an unordered point cloud given on non-grid x,y,z coordinates. It is not trivial to define how the points are connected to each other”

Bibliometric network analysis by Nees Jan van Eck

[*] See e.g. Clauset, Aaron, Daniel B. Larremore, and Roberta Sinatra. "Data-driven predictions in the science of science." Science 355.6324 (2017): 477-480. DOI: 10.1126/science.aal4217

Sinatra, Roberta, et al. "Quantifying the evolution of individual scientific impact." Science 354.6312 (2016): aaf5239. DOI: 10.1126/science.aaf5239

Furlanello, Cesare, et al. "Towards a scientific blockchain framework for reproducible data analysis." arXiv preprint arXiv:1707.06552 (2017).

the R-factor, with R standing for reputation, reproducibility, responsibility, and robustness, http://verumanalytics.io/

Overview of the segmentation method: (a) the initial LiDAR point cloud, (b) height raster image, (c) patches formed with adjacent cells of the same value, (d) hierarchized patches, (e) weighted graph, (f) graph partition, (g) partition result on the raster, (h) segmented point cloud.- Strimbu and Strimbu (2015)

Graphics and Media Lab (GML) is a part of the Department of Computational Mathematics and Cybernetics of M.V. Lomonosov Moscow State University.
http://graphics.cs.msu.ru/en/node/922
http://slideplayer.com/slide/8146222/

Page 40: Geometric Deep Learning

Convolutions for graphs #1
Deep Convolutional Networks on Graph-Structured Data
Mikael Henaff, Joan Bruna, Yann LeCun
(Submitted on 16 Jun 2015)
https://arxiv.org/abs/1506.05163
https://github.com/mdeff/cnn_graph

However, as our results demonstrate, their extension poses significant challenges:

• Although the learning complexity requires O(1) parameters per feature map, the evaluation, both forward and backward, requires a multiplication by the Graph Fourier Transform, which costs O(N²) operations. This is a major difference with respect to traditional ConvNets, which require only O(N). Fourier implementations of ConvNets bring the complexity to O(N log N), thanks again to the specific symmetries of the grid. An open question is whether one can find approximate eigenbases of general Graph Laplacians using Givens decompositions similar to those of the FFT.
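The O(N²) cost referred to above comes from the dense multiplication by the Graph Fourier basis U. A minimal sketch of a spectral graph convolution y = U diag(ĝ) Uᵀ x (the cycle graph and the ideal low-pass filter are illustrative assumptions; real spectral nets learn smooth spectral multipliers):

```python
import numpy as np

# Spectral graph convolution: filter a signal by scaling its GFT coefficients.
# Both dense multiplies by U cost O(N^2) per signal -- the bottleneck above.
n = 16
A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)  # cycle graph
L = np.diag(A.sum(1)) - A
lam, U = np.linalg.eigh(L)

def spectral_conv(x, g_hat):
    """y = U diag(g_hat) U^T x, with one spectral weight per graph frequency."""
    return U @ (g_hat * (U.T @ x))

x = np.random.default_rng(5).normal(size=n)
g_low = (lam < 1.0).astype(float)       # an ideal low-pass spectral filter
y = spectral_conv(x, g_low)

# The output is exactly the projection of x onto the low-frequency eigenspace.
print(np.round(y, 3))
```

Because ĝ here is an indicator, applying the filter twice changes nothing, which makes the projection behavior easy to check.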

• Our experiments show that when the input graph structure is not known a priori, graph estimation is the statistical bottleneck of the model, requiring O(N²) for general graphs and O(MN) for M-dimensional graphs. Supervised graph estimation performs significantly better than unsupervised graph estimation based on low-order moments. Furthermore, we have verified that the architecture is quite sensitive to graph estimation errors. In the supervised setting, this step can be viewed in terms of a bootstrapping mechanism, where an initially unconstrained network is self-adjusted to become more localized and to exhibit weight sharing.

• Finally, the statistical assumptions of stationarity and compositionality are not always verified. In those situations, the constraints imposed by the model risk reducing its capacity for no reason. One possibility for addressing this issue is to insert fully connected layers between the input and the spectral layers, such that data can be transformed into the appropriate statistical model. Another strategy, left for future work, is to relax the notion of weight sharing by introducing instead a commutation error ∥WᵢL − LWᵢ∥ with the graph Laplacian, which puts a soft penalty on transformations that do not commute with the Laplacian, instead of imposing exact commutation as is the case in the spectral net.

We explore for two areas of application for which it has not been possible to apply convolutional networks before: text categorization and bioinformatics. Our results show that our method is capable of matching or outperforming large, fully-connected networks trained with dropout using fewer parameters.

Our main contributions can be summarized as follows:

● We extend the ideas from Bruna et al. (2013) to large-scale classification problems, specifically ImageNet object recognition, text categorization and bioinformatics.

● We consider the most general setting where no prior information on the graph structure is available, and propose unsupervised and new supervised graph estimation strategies in combination with the supervised graph convolutions.

Page 41: Geometric Deep Learning

Convolutions for graphs #2
Learning Convolutional Neural Networks for Graphs
Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2014-2023, 2016
http://proceedings.mlr.press/v48/niepert16.html

A CNN with a receptive field of size 3x3. The field is moved over an image from left to right and top to bottom using a particular stride (here: 1) and zero-padding (here: none) (a). The values read by the receptive fields are transformed into a linear layer and fed to a convolutional architecture (b). The node sequence for which the receptive fields are created and the shapes of the receptive fields are fully determined by the hyper-parameters.

An illustration of the proposed architecture. A node sequence is selected from a graph via a graph labeling procedure. For some nodes in the sequence, a local neighborhood graph is assembled and normalized. The normalized neighborhoods are used as receptive fields and combined with existing CNN components.

The normalization is performed for each of the graphs induced on the neighborhood of a root node v (the red node; node colors indicate distance to the root node). A graph labeling is used to rank the nodes and to create the normalized receptive fields, one of size k (here: k = 9) for node attributes and one of size k × k for edge attributes. Normalization also includes cropping of excess nodes and padding with dummy nodes. Each vertex (edge) attribute corresponds to an input channel with the respective receptive field.
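A hedged sketch of the receptive-field idea, not the paper's exact labeling (which uses Weisfeiler-Lehman refinement and more careful normalization): rank a BFS neighborhood of the root by (distance, degree), then crop to k nodes or pad with dummy nodes:

```python
from collections import deque

def receptive_field(adj, root, k):
    """Assemble a fixed-size receptive field for `root`: BFS the neighbourhood,
    rank nodes by (BFS distance, degree, id) as a simple graph labeling, then
    crop to k nodes or pad with a dummy node (-1)."""
    dist = {root: 0}
    q = deque([root])
    while q and len(dist) < k * 4:        # gather a generous candidate pool
        v = q.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                q.append(u)
    ranked = sorted(dist, key=lambda v: (dist[v], -len(adj[v]), v))
    field = ranked[:k]
    return field + [-1] * (k - len(field))   # pad with dummy nodes

# Small graph: edges 0-1, 0-2, 1-3, 2-3, 3-4.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(receptive_field(adj, 0, k=4))   # -> [0, 1, 2, 3]
print(receptive_field(adj, 4, k=4))   # -> [4, 3, 1, 2]
```

Each normalized field then feeds a standard 1-D convolution, exactly as a 3x3 image patch feeds a 2-D one in the grid case described above.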

Visualization of RBM features learned with 1-dimensional WL normalized receptive fields of size 9 for a torus (periodic lattice, top left), a preferential attachment graph (Barabási & Albert 1999, bottom left), a co-purchasing network of political books (top right), and a random graph (bottom right). Instances of these graphs with about 100 nodes are depicted on the left. A visual representation of the feature’s weights (the darker a pixel, the stronger the corresponding weight) and 3 graphs sampled from the RBMs by setting all but the hidden node corresponding to the feature to zero. Yellow nodes have position 1 in the adjacency matrices

“Directions for future work include the use of alternative neural network architectures such as recurrent neural networks (RNNs); combining different receptive field sizes; pretraining with restricted Boltzmann machines (RBMs) and autoencoders; and statistical relational models based on the ideas of the approach.”

Page 42: Geometric Deep Learning

Convolutions for graphs #3
Geometric deep learning on graphs and manifolds using mixture model CNNs
Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, Michael M. Bronstein
(Submitted on 25 Nov 2016 (v1), last revised 6 Dec 2016 (this version, v3))
https://arxiv.org/abs/1611.08402

Left: intrinsic local polar coordinates (ρ, θ) on a manifold around a point marked in white. Right: patch operator weighting functions wi(ρ, θ) used in different generalizations of convolution on the manifold (hand-crafted in GCNN and ACNN, learned in MoNet). All kernels are L∞-normalized; red curves represent the 0.5 level set.

Representation of images as graphs. Left: regular grid (the graph is fixed for all images). Right: graph of superpixel adjacency (different for each image). Vertices are shown as red circles, edges as red lines.

Learning configuration used for the Cora and PubMed experiments.

Predictions obtained applying MoNet over the Cora dataset. Marker fill color represents the predicted class; marker outline color represents the ground-truth class.

In this paper, we propose a unified framework that generalizes CNN architectures to non-Euclidean domains (graphs and manifolds) and learns local, stationary, and compositional task-specific features. We show that various non-Euclidean CNN methods previously proposed in the literature can be considered as particular instances of our framework. We test the proposed method on standard tasks from the realms of image, graph, and 3D shape analysis and show that it consistently outperforms previous approaches.
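The patch operator underlying MoNet weights each neighbor by a learnable Gaussian mixture over pseudo-coordinates u = (ρ, θ). A minimal numpy sketch of that weighting, with random stand-ins for the learned parameters `mu` and `sigma_inv`:

```python
import numpy as np

def mixture_patch_weights(u, mu, sigma_inv):
    """Sketch of MoNet's patch operator: each of M kernels is a Gaussian
    over pseudo-coordinates u = (rho, theta); mu (M, 2) and sigma_inv
    (M, 2, 2) would be learned parameters (random here)."""
    diff = u[None, :] - mu                          # (M, 2)
    quad = np.einsum('md,mde,me->m', diff, sigma_inv, diff)
    return np.exp(-0.5 * quad)                      # one weight per kernel

rng = np.random.default_rng(0)
M = 4                                               # number of mixture kernels
mu = rng.normal(size=(M, 2))
sigma_inv = np.stack([np.eye(2)] * M)               # identity covariances
w = mixture_patch_weights(np.array([0.5, 0.1]), mu, sigma_inv)
print(w)  # M non-negative weights, one per mixture kernel
```

Hand-crafted choices of these weighting functions recover GCNN and ACNN as special cases; learning them yields MoNet.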

Page 43: Geometric Deep Learning

Convolutions for graphs #4
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst
Advances in Neural Information Processing Systems 29 (NIPS 2016)
https://arxiv.org/abs/1606.09375
https://github.com/mdeff/cnn_graph
https://youtu.be/cIA_m7vwOVQ

Architecture of a CNN on graphs and the four ingredients of a (graph) convolutional layer.
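The fast localized spectral filtering at the heart of this architecture can be sketched with the Chebyshev recurrence the paper uses to avoid an explicit eigendecomposition; `theta` here is a fixed stand-in for the learned filter coefficients:

```python
import numpy as np

def chebyshev_filter(L, x, theta):
    """Sketch of a K-localized spectral filter: y = sum_k theta_k
    T_k(L_tilde) x, computed with the Chebyshev recurrence
    T_k = 2 L_tilde T_{k-1} - T_{k-2} instead of an eigendecomposition."""
    n = L.shape[0]
    lmax = np.linalg.eigvalsh(L).max()
    L_t = 2.0 * L / lmax - np.eye(n)        # rescale spectrum to [-1, 1]
    Tk_prev, Tk = x, L_t @ x
    y = theta[0] * Tk_prev
    if len(theta) > 1:
        y = y + theta[1] * Tk
    for k in range(2, len(theta)):
        Tk_prev, Tk = Tk, 2.0 * L_t @ Tk - Tk_prev
        y = y + theta[k] * Tk
    return y

# Laplacian of a 4-node path graph; an order-2 filter only reaches 2 hops
A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], float)
L = np.diag(A.sum(1)) - A
y = chebyshev_filter(L, np.array([1.0, 0.0, 0.0, 0.0]), theta=[0.5, 0.3, 0.2])
print(y)
```

Because T_k(L̃) is a degree-k polynomial in the Laplacian, a K-order filter is exactly K-hop localized: with the impulse at node 0 above, node 3 (three hops away) receives nothing.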

It is however known that graph clustering is NP-hard [Bui and Jones, 1992] and that approximations must be used. While there exist many clustering techniques, e.g. the popular spectral clustering [von Luxburg, 2007], we are most interested in multilevel clustering algorithms where each level produces a coarser graph which corresponds to the data domain seen at a different resolution.

Future works will investigate two directions.

On one hand, we will enhance the proposed framework with newly developed tools in GSP. On the other hand, we will explore applications of this generic model to important fields where the data naturally lies on graphs, which may then incorporate external information about the structure of the data rather than artificially created graphs whose quality may vary, as seen in the experiments.

Another natural and future approach, pioneered in [Henaff et al. 2015], would be to alternate the learning of the CNN parameters and the graph.

Page 44: Geometric Deep Learning

Convolutions for graphs #5

Top: Schematic illustration of a standard CNN where patches of w×h pixels are convolved with D×E filters to map the D dimensional input features to E dimensional output features.

Middle: same, but representing the CNN parameters as a set of M = w×h weight matrices, each of size D×E. Each weight matrix is associated with a single relative position in the input patch.

Bottom: our graph convolutional network, where each relative position in the input patch is associated in a soft manner to each of the M weight matrices using the function q(xi, xj).

Page 45: Geometric Deep Learning

Convolutions for graphs #6
CayleyNets: Graph Convolutional Neural Networks with Complex Rational Spectral Filters
Ron Levie, Federico Monti, Xavier Bresson, Michael M. Bronstein
(Submitted on 22 May 2017)
https://arxiv.org/abs/1705.07664

The core ingredient of our model is a new class of parametric rational complex functions (Cayley polynomials) that allow efficiently computing localized regular filters on graphs specializing on frequency bands of interest. Our model scales linearly with the size of the input data for sparsely connected graphs, can handle different constructions of Laplacian operators, and typically requires fewer parameters than previous models.
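The Cayley transform behind these filters is simple to illustrate: it maps the real Laplacian spectrum onto the complex unit circle, and the zoom parameter h controls how much of the circle the low frequencies occupy. A toy sketch:

```python
import numpy as np

def cayley_transform(lams, h):
    """Sketch of the spectral zoom: C(lam) = (h*lam - i)/(h*lam + i) maps
    real Laplacian eigenvalues onto the complex unit circle; larger h
    spreads the low frequencies apart ("zooms" on them)."""
    return (h * lams - 1j) / (h * lams + 1j)

lams = np.array([0.0, 0.1, 0.5, 2.0, 8.0])   # toy Laplacian eigenvalues
for h in (0.1, 1.0, 10.0):
    z = cayley_transform(lams, h)
    print(h, np.round(np.angle(z), 2))        # angular positions on circle
```

Every transformed eigenvalue has modulus 1; only its angle changes with h, which is exactly the "spectral zoom" behaviour shown in the figure below.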

Filters (spatial domain, top and spectral domain, bottom) learned by CayleyNet (left) and ChebNet (center, right) on the MNIST dataset. Cayley filters are able to realize larger supports for the same order r.

Eigenvalues of the unnormalized Laplacian ∆u of the 15-communities graph, mapped on the complex unit half-circle by means of the Cayley transform with spectral zoom values (left to right) h = 0.1, 1, and 10. The first 15 frequencies, carrying most of the information about the communities, are marked in red. Larger values of h zoom in (right) on the low-frequency band.

Page 46: Geometric Deep Learning

Convolutions for graphs #7
Graph Convolutional Matrix Completion
Rianne van den Berg, Thomas N. Kipf, Max Welling
(Submitted on 7 Jun 2017)
https://arxiv.org/abs/1706.02263

Left: Rating matrix M with entries that correspond to user-item interactions (ratings between 1-5) or missing observations (0). Right: User-item interaction graph with bipartite structure. Edges correspond to interaction events, numbers on edges denote the rating a user has given to a particular item. The matrix completion task (i.e. predictions for unobserved interactions) can be cast as a link prediction problem and modeled using an end-to-end trainable graph auto-encoder.

Schematic of a forward pass through the GC-MC model, which comprises a graph convolutional encoder [U, V] = f(X, M1, . . . , MR) that passes and transforms messages from user to item nodes, and vice versa, followed by a bilinear decoder that predicts entries of the (reconstructed) rating matrix M = g(U, V) based on pairs of user and item embeddings.
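The bilinear decoder can be sketched as follows; the embedding dimension and the tensor Q of per-rating weight matrices are random stand-ins for learned parameters:

```python
import numpy as np

def bilinear_decoder(u, v, Q):
    """Sketch of a bilinear decoder for matrix completion: for user
    embedding u and item embedding v, the score for rating level r is
    u^T Q_r v; a softmax over levels gives p(rating = r)."""
    logits = np.einsum('d,rde,e->r', u, Q, v)
    p = np.exp(logits - logits.max())           # stable softmax
    return p / p.sum()

rng = np.random.default_rng(1)
d, R = 8, 5                                     # embedding dim, rating levels 1..5
u, v = rng.normal(size=d), rng.normal(size=d)
Q = rng.normal(size=(R, d, d)) * 0.1
p = bilinear_decoder(u, v, Q)
print(p, "expected rating:", (np.arange(1, 6) * p).sum())
```

Predicting an unobserved entry then amounts to link prediction on the bipartite user-item graph: run the encoder to get u and v, decode, and take the expected (or most probable) rating level.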

“Our model can be seen as a first step towards modeling recommender systems where the interaction data is integrated into other structured modalities, such as a social network or a knowledge graph. As a next step, it would be interesting to investigate how the differentiable message passing scheme of our encoder model can be extended to such structured data environments. We expect that further approximations, e.g. subsampling of local graph neighborhoods, will be necessary in order to keep requirements in terms of computation and memory in a feasible range.”

Page 47: Geometric Deep Learning

Convolutions for graphs #8
Graph Based Convolutional Neural Network
Michael Edwards, Xianghua Xie
(Submitted on 28 Sep 2016)
https://arxiv.org/abs/1609.08965

Graph based Convolutional Neural Network components. The GCNN is designed from an architecture of graph convolution and pooling operator layers. Convolution layers generate O output feature maps, where O is selected per layer. Graph pooling layers coarsen the current graph and graph signal based on the selected vertex reduction method.

Two levels of graph pooling operation on regular and irregular grid with MNIST signal. From left: Regular grid, AMG level 1, AMG level 2, Irregular grid, AMG level 1, AMG level 2.

Feature maps formed by a feed-forward pass of the regular domain. From left: Original image, Convolution round 1, Pooling round 1, Convolution round 2, Pooling round 2

Feature maps formed by a feed-forward pass of the irregular domain. From left: Original image, Convolution round 1, Pooling round 1, Convolution round 2, Pooling round 2.

This study proposes a novel method of performing deep convolutional learning on the irregular graph by coupling standard graph signal processing techniques and backpropagation based neural network design.

Convolutions are performed in the spectral domain of the graph Laplacian and allow for the learning of spatially localized features whilst handling the nontrivial irregular kernel design. Results are provided on both a regular and an irregular domain classification problem and show the ability to learn localized feature maps across multiple layers of a network. A graph pooling method is provided that agglomerates vertices in the spatial domain to reduce complexity and generalize the features learnt. The GPU implementation improves training and testing speed, although further optimization is needed. Although the results on the regular grid are outperformed by a standard CNN architecture, this is understandable given the standard CNN's direct use of a local kernel in the spatial domain.
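The spectral-domain convolution referred to here can be sketched directly from the graph Fourier transform. This is practical only for small graphs, since it needs a full eigendecomposition, which is precisely the cost the discussion below flags:

```python
import numpy as np

def spectral_graph_conv(L, x, g_hat):
    """Sketch of convolution in the graph spectral domain: forward graph
    Fourier transform (project onto Laplacian eigenvectors), multiply by
    a spectral filter g_hat(lambda), then inverse transform."""
    lams, U = np.linalg.eigh(L)          # graph Fourier basis
    x_hat = U.T @ x                      # forward GFT
    return U @ (g_hat(lams) * x_hat)     # filter, then inverse GFT

A = np.array([[0,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0]], float)
L = np.diag(A.sum(1)) - A
y = spectral_graph_conv(L, np.array([1.0, 2.0, 3.0, 4.0]),
                        g_hat=lambda lam: 1.0 / (1.0 + lam))  # low-pass
print(y)
```

With the identity filter g_hat(λ) = 1 the signal comes back unchanged, which is a useful sanity check on any implementation of the transform pair.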

The major contribution over standard CNNs, the ability to operate on the irregular graph, is not to be underestimated. Graph-based CNNs require costly forward and inverse graph Fourier transforms, and reducing this cost requires further work to enhance usability in the community. Ongoing study into graph construction and reduction techniques is required to encourage uptake by a wider range of problem domains.

Page 48: Geometric Deep Learning

Convolutions for graphs #9
Generalizing CNNs for data structured on locations irregularly spaced out
Jean-Charles Vialatte, Vincent Gripon, Grégoire Mercier
(Submitted on 3 Jun 2016 (v1), last revised 4 Jul 2017 (this version, v3))
https://arxiv.org/abs/1609.08965

In this paper, we have defined a generalized convolution operator that makes it possible to transport the CNN paradigm to irregular domains. It retains the properties of a regular convolution operator: it is linear, locally supported, and uses the same kernel of weights for each local operation. The generalized convolution operator can then naturally be used instead of convolutional layers in a deep learning framework. Typically, the resulting model is well suited for input data with an underlying graph structure.

The definition of this operator is flexible enough to allow adapting its weight-allocation map to any input domain, so that, depending on the case, the kernel weights can be distributed in a way that is natural for that domain. In some cases, however, there is no single natural way but multiple acceptable methods to define the weight allocation. In further work, we plan to study these methods. We also plan to apply the generalized operator to unsupervised learning tasks.

Page 49: Geometric Deep Learning

Convolutions for graphs #10
Robust Spatial Filtering with Graph Convolutional Neural Networks
Felipe Petroski Such, Shagan Sah, Miguel Dominguez, Suhas Pillai, Chao Zhang, Andrew Michael, Nathan Cahill, Raymond Ptucha
(Submitted on 2 Mar 2017 (v1), last revised 14 Jul 2017 (this version, v3))
https://arxiv.org/abs/1703.00792
https://github.com/fps7806/Graph-CNN

Two types of graph datasets. Left: homogeneous datasets. All samples in a homogeneous graph dataset have identical graph structure but different vertex values or “signals”. Right: heterogeneous graph samples, which can vary in the number of vertices, the structure of edge connections, and the vertex values.

General vertex-edge domain Graph-CNN architecture. Convolution and pooling layers are cascaded into a deep network. FC are fully-connected layers for graph classification. V is the vertex set and A the adjacency matrix that define a graph.

Graph convolution and pooling setting. The convolution operation obtains a filtered representation of the graph after a multi-hop vertex filter; likewise, the pooling layer yields a compact representation of the graph.

Page 50: Geometric Deep Learning

Convolutions for graphs #11
A Generalization of Convolutional Neural Networks to Graph-Structured Data
Yotam Hechtlinger, Purvasha Chakravarti, Jining Qin
(Submitted on 26 Apr 2017)
https://arxiv.org/abs/1704.08165
https://github.com/hechtlinger/graph_cnn

Visualization of the graph convolution of size 5. For a given node, the convolution is applied to the node and its 4 closest neighbors selected by the random walk. As the right figure demonstrates, the random walk can expand further into the graph to higher-degree neighbors. The convolution weights are shared according to the neighbors' closeness to the nodes and applied globally to all nodes.

Visualization of a row of Q(k) on the graph generated over the 2-D grid at a node near the center, when connecting each node to its 8 adjacent neighbors. For k = 1, most of the weight is on the node, with smaller weights on the first order neighbors. This corresponds to a standard 3 × 3 convolution. As k increases the number of active neighbors also increases, providing greater weight to neighbors farther away, while still keeping the local information.
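A rough numpy sketch of this random-walk neighborhood selection, approximating Q^(k) as a sum of transition-matrix powers (a simplification of the paper's expected-visits formulation):

```python
import numpy as np

def random_walk_neighbors(A, node, k, p):
    """Sketch of a random-walk convolution support: Q^(k), the expected
    visit weight within k steps, is approximated as a sum of powers of
    the row-stochastic transition matrix P; the convolution support is
    the node plus its p highest-weight neighbors."""
    P = A / A.sum(axis=1, keepdims=True)       # transition matrix
    Q = sum(np.linalg.matrix_power(P, i) for i in range(k + 1))
    w = Q[node].copy()
    w[node] = -np.inf                           # exclude the node itself
    top = np.argsort(w)[::-1][:p]               # p closest neighbors
    return [node] + sorted(top.tolist())

# toy triangle graph: every node is one hop from every other
A = np.array([[0,1,1],[1,0,1],[1,1,0]], float)
print(random_walk_neighbors(A, node=0, k=2, p=2))  # → [0, 1, 2]
```

Sharing one weight vector across all such supports, ordered by closeness, is what gives the operator far fewer parameters than a fully connected layer.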

We propose a generalization of convolutional neural networks from grid-structured data to graph-structured data, a problem that is being actively researched by our community. Our novel contribution is a convolution over a graph that can handle different graph structures as its input. The proposed convolution contains many sought-after attributes; it has a natural and intuitive interpretation, it can be transferred within different domains of knowledge, it is computationally efficient and it is effective.

Furthermore, the convolution can be applied on standard regression or classification problems by learning the graph structure in the data, using the correlation matrix or other methods. Compared to a fully connected layer, the suggested convolution has significantly fewer parameters while providing stable convergence and comparable performance. Our experimental results on the Merck Molecular Activity data set and MNIST data demonstrate the potential of this approach.

Convolutional Neural Networks have already revolutionized the fields of computer vision, speech recognition and language processing. We think an important step forward is to extend it to other problems which have an inherent graph structure.

Page 51: Geometric Deep Learning

Autoencoders for graphs
Variational Graph Auto-Encoders
Thomas N. Kipf, Max Welling
(Submitted on 21 Nov 2016)
https://arxiv.org/abs/1611.07308
https://github.com/tkipf/gae

→ http://tkipf.github.io/graph-convolutional-networks/

Latent space of unsupervised VGAE model trained on Cora citation network dataset. Grey lines denote citation links. Colors denote document class (not provided during training).

Future work will investigate better-suited prior distributions (instead of Gaussian here), more flexible generative models and the application of a stochastic gradient descent algorithm for improved scalability.
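The VGAE decoder itself is just an inner product between latent node embeddings followed by a sigmoid; a minimal sketch:

```python
import numpy as np

def vgae_decode(Z):
    """Sketch of the VGAE decoder: reconstructed edge probability
    A_hat[i, j] = sigmoid(z_i . z_j), where Z holds latent node
    embeddings sampled from the encoder's Gaussian posterior."""
    logits = Z @ Z.T
    return 1.0 / (1.0 + np.exp(-logits))        # elementwise sigmoid

rng = np.random.default_rng(2)
Z = rng.normal(size=(6, 2))      # 6 nodes, 2-d latent space (stand-in sample)
A_hat = vgae_decode(Z)
print(np.round(A_hat, 2))        # symmetric matrix of link probabilities
```

In the full model, Z is sampled from the GCN-parameterized posterior and the reconstruction loss compares A_hat against the observed adjacency, with a KL term against the Gaussian prior.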

Modeling Relational Data with Graph Convolutional Networks
Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling
(Submitted on 17 Mar 2017 (v1), last revised 6 Jun 2017 (this version, v3))
https://arxiv.org/abs/1703.06103

In this work, we introduce relational GCNs (R-GCNs). R-GCNs are specifically designed to deal with highly multi-relational data, characteristic of realistic knowledge bases. Our entity classification model, similarly to Kipf and Welling [see left], uses softmax classifiers at each node in the graph. The classifiers take node representations supplied by an R-GCN and predict the labels. The model, including R-GCN parameters, is learned by optimizing the cross-entropy loss. Our link prediction model can be regarded as an autoencoder consisting of (1) an encoder: an R-GCN producing latent feature representations of entities, and (2) a decoder: a tensor factorization model exploiting these representations to predict labeled edges. Though in principle the decoder can rely on any type of factorization (or generally any scoring function), we use one of the simplest and most effective factorization methods: DistMult [Yang et al. 2014].

(a) R-GCN per-layer update for a single graph node (in light red). Activations from neighboring nodes (dark blue) are gathered and then transformed for each relation type individually (for both incoming and outgoing edges). The resulting representation is accumulated in a (normalized) sum and passed through an activation function (such as the ReLU). This per-node update can be computed in parallel with shared parameters across the whole graph. (b) Depiction of an R-GCN model for entity classification with a per-node loss function. (c) Link prediction model with an R-GCN encoder (interspersed with fully-connected/dense layers) and a DistMult decoder that takes pairs of hidden node representations and produces a score for every (potential) edge in the graph. The loss is evaluated per edge.
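The per-layer R-GCN update in (a) can be sketched as follows; random weights stand in for learned parameters, and the paper's basis decomposition and other regularizations are omitted:

```python
import numpy as np

def rgcn_layer(H, adj_per_rel, W_rel, W_self):
    """Sketch of the per-node R-GCN update: neighbor activations are
    transformed with a relation-specific weight matrix, normalized by the
    per-relation neighbor count c_{i,r}, summed with a self-connection,
    then passed through a ReLU."""
    out = H @ W_self                                           # self-loop
    for A_r, W_r in zip(adj_per_rel, W_rel):
        deg = np.maximum(A_r.sum(axis=1, keepdims=True), 1.0)  # c_{i,r}
        out = out + (A_r / deg) @ H @ W_r
    return np.maximum(out, 0.0)                                # ReLU

rng = np.random.default_rng(3)
n, d = 4, 3
H = rng.normal(size=(n, d))
A1 = np.array([[0,1,0,0],[0,0,1,0],[0,0,0,1],[0,0,0,0]], float)  # one relation
A2 = A1.T                                                        # its inverse
W_rel = [rng.normal(size=(d, d)) for _ in range(2)]
W_self = rng.normal(size=(d, d))
H_out = rgcn_layer(H, [A1, A2], W_rel, W_self)
print(H_out.shape)  # (4, 3)
```

Stacking such layers and attaching softmax classifiers (for entity classification) or a DistMult decoder (for link prediction) gives the two models in (b) and (c).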

Page 52: Geometric Deep Learning

Representation Learning For graphs #1
Inductive Representation Learning on Large Graphs
William L. Hamilton, Rex Ying, Jure Leskovec
(Submitted on 7 Jun 2017)
https://arxiv.org/abs/1706.02216
http://snap.stanford.edu/graphsage/

We propose a general framework, called GraphSAGE (SAmple and aggreGatE), for inductive node embedding. Unlike embedding approaches that are based on matrix factorization, we leverage node features (e.g., text attributes, node profile information, node degrees) in order to learn an embedding function that generalizes to unseen nodes. By incorporating node features in the learning algorithm, we simultaneously learn the topological structure of each node’s neighborhood as well as the distribution of node features in the neighborhood. While we focus on feature-rich graphs (e.g., citation data with text attributes, biological data with functional/molecular markers), our approach can also make use of structural features that are present in all graphs (e.g., node degrees). Thus, our algorithm can also be applied to graphs without node features (i.e. point clouds with only the xyz-coordinates without RGB texture, normals, etc.)

Low-dimensional vector embeddings of nodes in large graphs have proved extremely useful as feature inputs for a wide variety of prediction and graph analysis tasks. The basic idea behind node embedding approaches is to use dimensionality reduction techniques to distill the high-dimensional information about a node’s neighborhood into a dense vector embedding. These node embeddings can then be fed to downstream machine learning systems and aid in tasks such as node classification, clustering, and link prediction (e.g. LINE, see below).

However, previous works have focused on embedding nodes from a single fixed graph, and many real-world applications require embeddings to be quickly generated for unseen nodes, or entirely new (sub)graphs. This inductive capability is essential for high-throughput, production machine learning systems, which operate on evolving graphs and constantly encounter unseen nodes (e.g., posts on Reddit, users and videos on Youtube). An inductive approach to generating node embeddings also facilitates generalization across graphs with the same form of features: for example, one could train an embedding generator on protein-protein interaction graphs derived from a model organism, and then easily produce node embeddings for data collected on new organisms using the trained model.
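One GraphSAGE layer with the mean aggregator can be sketched in a few lines; W is a random stand-in for the learned projection, and neighbor sampling is omitted:

```python
import numpy as np

def graphsage_layer(H, A, W):
    """Sketch of a GraphSAGE layer with a mean aggregator: each node's
    new embedding combines its own features with the mean of its
    neighbors' features; W (2d, d') would be learned."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    neigh = (A @ H) / deg                        # mean over neighbors
    h = np.concatenate([H, neigh], axis=1) @ W   # concat, then project
    h = np.maximum(h, 0.0)                       # ReLU
    norm = np.linalg.norm(h, axis=1, keepdims=True)
    return h / np.maximum(norm, 1e-12)           # L2-normalize embeddings

rng = np.random.default_rng(4)
A = np.array([[0,1,1],[1,0,0],[1,0,0]], float)   # toy 3-node graph
H = rng.normal(size=(3, 4))                      # input node features
Z = graphsage_layer(H, A, rng.normal(size=(8, 5)))
print(Z.shape)  # (3, 5)
```

Because the layer is a function of features and neighborhood structure rather than a lookup table, it can embed nodes never seen during training, which is the inductive property the paragraph above emphasizes.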

LINE: Large-scale Information Network Embedding
Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, Qiaozhu Mei
(Submitted on 12 Mar 2015)
https://arxiv.org/abs/1503.03578
https://github.com/tangjianpku/LINE

Page 53: Geometric Deep Learning

Representation Learning For graphs #2
Skip-graph: Learning graph embeddings with an encoder-decoder model
John Boaz Lee, Xiangnan Kong
04 Nov 2016 (modified: 11 Jan 2017), ICLR 2017 conference submission
https://openreview.net/forum?id=BkSqjHqxg&noteId=BkSqjHqxg

We introduced an unsupervised method, based on the encoder-decoder model, for generating feature representations for graph-structured data. The model was evaluated on the binary classification task on several real-world datasets. The method outperformed several state-of-the-art algorithms on the tested datasets.

There are several interesting directions for future work. For instance, we can try training multiple encoders on random walks generated using very different neighborhood selection strategies. This may allow the different encoders to capture different properties in the graphs. We would also like to test the approach using different neural network architectures. Finally, it would be interesting to test the method on other types of heterogeneous information networks.

Page 54: Geometric Deep Learning

Semi-supervised Learning For graphs
Neural Graph Machines: Learning Neural Networks Using Graphs
Thang D. Bui, Sujith Ravi, Vivek Ramavajjala
University of Cambridge, United Kingdom; Google Research, Mountain View, CA, USA
(Submitted on 14 Mar 2017)
https://arxiv.org/abs/1703.04818

We have revisited graph-augmentation training of neural networks and proposed Neural Graph Machines as a general framework for doing so. Its label propagation (for semi-supervised CNNs see e.g. Tarvainen and Valpola 2017) objective function encourages the neural networks to make accurate node-level predictions, as in vanilla neural network training, as well as constrains the networks to learn similar hidden representations for nodes connected by an edge in the graph. Importantly, the objective can be trained by stochastic gradient descent and scaled to large graphs
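The graph-augmented objective can be sketched as a supervised loss plus an edge regularizer; the toy version below uses a squared-distance penalty between connected nodes' representations as an illustrative distance choice:

```python
import numpy as np

def ngm_objective(logits, labels, H, edges, alpha):
    """Sketch of a graph-augmented training objective: standard
    cross-entropy on labelled nodes plus a regularizer that pulls the
    hidden representations of nodes connected by an edge together."""
    # cross-entropy on labelled nodes (stable softmax)
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p = p / p.sum(axis=1, keepdims=True)
    sup = -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))
    # edge term: squared distance between connected representations
    reg = sum(np.sum((H[u] - H[v]) ** 2) for u, v in edges)
    return sup + alpha * reg

rng = np.random.default_rng(5)
logits = rng.normal(size=(4, 3))     # network outputs for 4 nodes, 3 classes
H = rng.normal(size=(4, 2))          # hidden representations of the nodes
loss = ngm_objective(logits, labels=np.array([0, 2, 1, 0]),
                     H=H, edges=[(0, 1), (2, 3)], alpha=0.1)
print(loss)
```

Both terms are differentiable, so the whole objective can be minimized by stochastic gradient descent over minibatches of labelled nodes and edges, which is what makes it scale to large graphs.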

We validated the efficacy of the graph-augmented objective on various tasks including bloggers' interest, text category and semantic intent classification problems, using a wide range of neural network architectures (FFNNs, CNNs and LSTM RNNs). The experimental results demonstrated that graph-augmented training almost always helps to find better neural networks that outperform other techniques in predictive performance, or even much smaller networks that are faster and easier to train. Additionally, the node-level input features can be combined with graph features as inputs to the neural networks. We showed that a neural network that simply takes the adjacency matrix of a graph and produces node labels can perform better than a recently proposed two-stage approach using sophisticated graph embeddings and a linear classifier. Our framework also excels when the neural network is small, or when there is limited supervision available.

While our objective can be applied to multiple graphs which come from different domains, we have not fully explored this aspect and leave it as future work. We expect the domain-specific networks can interact with the graphs to determine the importance of each domain/graph source in prediction. We also did not explore using graph regularisation for different hidden layers of the neural networks; we expect this is key for the multi-graph transfer setting (Yosinski et al., 2014). Another possible future extension is to use our objective on directed graphs, that is, to control the direction of influence between nodes during training.

Page 55: Geometric Deep Learning

Recurrent Networks for graphs #1
Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks
Federico Monti, Michael M. Bronstein, Xavier Bresson
(Submitted on 22 Apr 2017)
https://arxiv.org/abs/1704.06803

Main contribution. In this work, we treat the matrix completion problem as deep learning on graph-structured data. We introduce a novel neural network architecture that is able to extract local stationary patterns from the high-dimensional spaces of users and items, and use these meaningful representations to infer the non-linear temporal diffusion mechanism of ratings. The spatial patterns are extracted by a new CNN architecture designed to work on multiple graphs. The temporal dynamics of the rating diffusion is produced by a Long Short-Term Memory (LSTM) recurrent neural network (RNN). To our knowledge, our work is the first application of graph-based deep learning to the matrix completion problem.

Recurrent GCNN (RGCNN) architecture using the full matrix completion model and operating simultaneously on the rows and columns of the matrix X. The output of the Multi-Graph CNN (MGCNN) module is a q-dimensional feature vector for each element of the input matrix. The number of parameters to learn is O(1) and the learning complexity is O(mn).

Separable Recurrent GCNN (sRGCNN) architecture using the factorized matrix completion model and operating separately on the rows and columns of the factors W, Hᵀ. The output of the GCNN module is a q-dimensional feature vector for each input row/column, respectively. The number of parameters to learn is O(1) and the learning complexity is O(m + n).

Evolution of the matrix X(t) with our architecture using full matrix completion model RGCNN (top) and factorized matrix completion model sRGCNN (bottom). Numbers indicate the RMS error.

Absolute value of the first 8 spectral filters learnt by our bidimensional convolution. On the left the first filter with the reference axes associated to the row and column graph eigenvalues.

Page 56: Geometric Deep Learning

Recurrent Networks for graphs #2
Learning From Graph Neighborhoods Using LSTMs
Rakshit Agrawal, Luca de Alfaro, Vassilis Polychronopoulos
(Submitted on 21 Nov 2016)
https://arxiv.org/abs/1611.06882
https://sites.google.com/view/ml-on-structures

→ https://github.com/ML-on-structures/blockchain-lstm (Bitcoin blockchain data used in the paper)

“The approach is based on a multi-level architecture built from Long Short-Term Memory neural nets (LSTMs); the LSTMs learn how to summarize the neighborhood from data. We demonstrate the effectiveness of the proposed technique on a synthetic example and on real-world data related to crowdsourced grading, Bitcoin transactions, and Wikipedia edit reversions.”

The blockchain is the public immutable distributed ledger where Bitcoin transactions are recorded [20]. In Bitcoin, coins are held by addresses, which are hash values; these address identifiers are used by their owners to anonymously hold bitcoins, with ownership provable with public key cryptography. A Bitcoin transaction involves a set of source addresses, and a set of destination addresses: all coins in the source addresses are gathered, and they are then sent in various amounts to the destination addresses.

Mining data on the blockchain is challenging [Meiklejohn et al. 2013] due to the anonymity of addresses. We use data from the blockchain to predict whether an address will spend the funds that were deposited to it.

We obtain a dataset of addresses by using a slice of the blockchain. In particular, we consider all the addresses where deposits happened in a short range of 101 blocks, from 200,000 to 200,100 (inclusive). They contain 15,709 unique addresses where deposits took place. Looking at the state of the blockchain 50,000 blocks later (roughly one year, as a block is mined on average every 10 minutes), 3,717 of those addresses still held funds: we call these “hoarding addresses”. The goal is to predict which addresses are hoarding addresses and which spent the funds. We randomly split the 15,709 addresses into a training set of 10,000 and a validation set of 5,709 addresses.

We built a graph with addresses as nodes, and transactions as edges. Each edge was labeled with features of the transaction: its time, amount of funds transmitted, number of recipients, and so forth, for a total of 9 features. We compared two different algorithms:

● Baseline: an informative guess; it guesses a label with a probability equal to its percentage in the training set.

● MLSL of depths 1, 2, 3. The output and memory sizes of the learners for the reported results are K2 = K3 = 3. Increasing these to 5 maintained virtually the same performance while increasing training time; using only one output and memory cell did not improve performance.

Quantitative Analysis of the Full Bitcoin Transaction Graph
Dorit Ron, Adi Shamir
Financial Cryptography 2012
http://doi.org/10.1007/978-3-642-39884-1_2

Page 57: Geometric Deep Learning

Time-series analysis with graphs #1
Spectral Algorithms for Temporal Graph Cuts
Arlei Silva, Ambuj Singh, Ananthram Swami
(Submitted on 15 Feb 2017)
https://arxiv.org/abs/1702.04746

We propose novel formulations and algorithms for computing temporal cuts using spectral graph theory, multiplex graphs, divide-and-conquer and low-rank matrix approximation. Furthermore, we extend our formulation to dynamic graph signals, where cuts also capture node values, as graph wavelets. Experiments show that our solutions are accurate and scalable, enabling the discovery of dynamic communities and the analysis of dynamic graph processes.

This work opens several lines for future investigation:

(i) temporal cuts, as a general framework for solving problems involving dynamic data, can be applied in many scenarios; we are particularly interested in how our method performs in computer vision tasks;

(ii) Perturbation Theory can provide deeper theoretical insights into the properties of temporal cuts [Sole-Ribalta et al. 2013; Taylor et al. 2015]; finally,

(iii) we want to study Cheeger inequalities [Chung 1996] for temporal cuts, as a means to better understand the performance of our algorithms.

Temporal graph cut for a primary school network. The cut, represented as node colors, reflects the network dynamics, capturing major changes in the children’s interactions.

Page 58: Geometric Deep Learning

Active learning on Graphs
Active Learning for Graph Embedding
Hongyun Cai, Vincent W. Zheng, Kevin Chen-Chuan Chang
(Submitted on 15 May 2017)
https://arxiv.org/abs/1705.05085
https://github.com/vwz/AGE

In this paper, we proposed a novel active learning framework for graph embedding named Active Graph Embedding (AGE). Unlike the traditional active learning algorithms, AGE processes the data with structural information and learnt representations (node embeddings), and it is carefully designed to address the challenges brought by these two characteristics.

First, to exploit the graphical information, a graphical centrality based measurement is considered in addition to the popular information entropy based and information density based query criteria.

Second, the active learning and graph embedding processes are run jointly by posing the label query at the end of every epoch of the graph embedding training process.

Moreover, time-sensitive weights are placed on the three active learning query criteria: they focus on graph centrality at the beginning and shift towards the two embedding-based criteria as the training process progresses (i.e., as more accurate embeddings are learnt).
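A toy sketch of such a time-sensitive combination; the linear schedule below is a hypothetical illustration, not the paper's exact weighting:

```python
def age_query_score(entropy, density, centrality, epoch, total_epochs):
    """Sketch of a time-sensitive active-learning criterion mix
    (hypothetical schedule): early epochs weight graph centrality
    heavily; as embeddings improve, weight shifts to information
    entropy and information density."""
    t = epoch / total_epochs
    w_centrality = 1.0 - t            # decays over training
    w_entropy = w_density = t / 2.0   # grow over training
    return (w_entropy * entropy + w_density * density
            + w_centrality * centrality)

scores_early = age_query_score(entropy=0.9, density=0.2, centrality=0.8,
                               epoch=1, total_epochs=100)
scores_late = age_query_score(entropy=0.9, density=0.2, centrality=0.8,
                              epoch=99, total_epochs=100)
print(scores_early, scores_late)  # centrality dominates early, not late
```

Nodes with the highest combined score are queried for labels at the end of each embedding epoch.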

Page 59: Geometric Deep Learning

Transfer learning on Graphs
Intrinsic Geometric Information Transfer Learning on Multiple Graph-Structured Datasets
Jaekoo Lee, Hyunjae Kim, Jongsun Lee, Sungroh Yoon
(Submitted on 15 Nov 2016 (v1), last revised 5 Dec 2016 (this version, v2))
https://arxiv.org/abs/1611.04687

Conventional CNN works on a regular grid domain (top); proposed transfer learning framework for CNN, which can transfer intrinsic geometric information obtained from a source graph domain to a target graph domain (bottom).

Overview of the proposed method.

Conclusion: We have proposed a new transfer learning framework for deep learning on graph-structured data. Our approach can transfer the intrinsic geometric information learned from the graph representation of the source domain to the target domain. We observed that knowledge transfer between task domains is most effective when the source and target domains possess high similarity in their graph representations. We anticipate that adoption of our methodology will help extend the territory of deep learning to data in non-grid structure, as well as to cases with limited quantity and quality of data. To prove this, we are planning to apply our approach to diverse datasets in different domains.

Page 60: Geometric Deep Learning

Transfer learning on Graphs #2
Deep Feature Learning for Graphs
Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed
(Submitted on 28 Apr 2017)
https://arxiv.org/abs/1611.04687

This paper presents a general graph representation learning framework called DeepGL for learning deep node and edge representations from large (attributed) graphs. In particular, DeepGL begins by deriving a set of base features (e.g., graphlet features) and automatically learns a multi-layered hierarchical graph representation where each successive layer leverages the output from the previous layer to learn features of a higher order. Contrary to previous work, DeepGL learns relational functions (each representing a feature) that generalize across networks and are therefore useful for graph-based transfer learning tasks. Moreover, DeepGL naturally supports attributed graphs, learns interpretable features, and is space-efficient (by learning sparse feature vectors).

Thus, features learned by DeepGL are interpretable and naturally generalize to across-network transfer learning tasks, as they can be derived on any arbitrary graph. The framework is flexible with many interchangeable components; it is expressive, interpretable, parallel, and both space- and time-efficient for large graphs, with runtime linear in the number of edges. DeepGL has all of the following desired properties:

● Effective for attributed graphs and across-network transfer learning tasks
● Space-efficient, requiring up to 6× less memory
● Fast, with up to 182× speedup in runtime
● Accurate, with a mean improvement of 20% or more on many applications
● Parallel, with strong scaling results
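The layered composition at the heart of DeepGL can be illustrated with a small NumPy sketch (not the authors' implementation): start from base structural features and repeatedly apply relational operators over neighborhoods, so each layer builds higher-order features from the previous one. Here plain degree stands in for the paper's graphlet features, and the function names are illustrative.

```python
import numpy as np

def base_features(A):
    """Base structural feature per node; the paper derives graphlet
    counts, degree is the simplest stand-in."""
    return A.sum(axis=1)[:, None]            # shape (n, 1)

def compose_layer(A, F, ops=("mean", "sum", "max")):
    """One DeepGL-style layer: relational operators over each node's
    neighborhood, applied to the previous layer's features F."""
    n = A.shape[0]
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)
    out = []
    for op in ops:
        if op == "sum":
            out.append(A @ F)                # neighborhood sum
        elif op == "mean":
            out.append((A @ F) / deg)        # neighborhood mean
        elif op == "max":
            m = np.zeros_like(F)
            for i in range(n):
                nbrs = np.nonzero(A[i])[0]
                if len(nbrs):
                    m[i] = F[nbrs].max(axis=0)
            out.append(m)
    return np.hstack(out)

# 4-node path graph 0-1-2-3
A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], float)
F = base_features(A)
for _ in range(2):                           # two higher-order layers
    F = np.hstack([F, compose_layer(A, F)])
print(F.shape)                               # (4, 16)
```

Because every feature is a function of graph structure only, the same learned composition can be evaluated on a different graph, which is what enables the across-network transfer the slide describes.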

Page 61: Geometric Deep Learning

Learning Graphs: learning the graph itself #1
Learning Graph While Training: An Evolving Graph Convolutional Neural Network
Ruoyu Li, Junzhou Huang
(Submitted on 10 Aug 2017)
https://arxiv.org/abs/1708.04675

“In this paper, we propose a more general and flexible graph convolution network (EGCN) fed by batch of arbitrarily shaped data together with their evolving graph Laplacians trained in supervised fashion. Extensive experiments have been conducted to demonstrate the superior performance in terms of both the acceleration of parameter fitting and the significantly improved prediction accuracy on multiple graph-structured datasets.”

In this paper, we explore our approach primarily on chemical molecular datasets, although the network can be straightforwardly trained on other graph-structured data, such as point cloud, social networks and so on. Our contributions can be summarized as follows:

● A novel spectral graph convolution layer boosted by Laplacian learning (SGC-LL) has been proposed to dynamically update the residual graph Laplacians via metric learning for deep graph learning.

● Re-parametrization on feature domain has been introduced in K-hop spectral graph convolution to enable our proposed deep graph learning and to grant graph CNNs the similar capability of feature extraction on graph data as that in the classical CNNs on grid data.

● An evolving graph convolution network (EGCN) has been designed to be fed by a batch of arbitrarily shaped graph-structured data. The network is able to construct and learn for each data sample the graph structure that best serves the prediction part of network. Extensive experimental results indicate the benefits from the evolving graph structure of data.
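The core of the SGC-LL idea can be sketched in a few lines of NumPy: a learnable matrix parameterizes a Mahalanobis metric, pairwise distances under that metric give a Gaussian-kernel adjacency, and a normalized Laplacian is rebuilt from it. This is a minimal illustration under those assumptions, not the paper's code; in the real network `Wm` is trained end-to-end with the rest of the model.

```python
import numpy as np

def residual_laplacian(X, Wm, sigma=1.0):
    """Build a graph Laplacian from node features X under a learned
    Mahalanobis metric M = Wm @ Wm.T (Gaussian kernel on distances)."""
    M = Wm @ Wm.T
    diff = X[:, None, :] - X[None, :, :]            # (n, n, d)
    d2 = np.einsum('ijk,kl,ijl->ij', diff, M, diff) # squared distances
    A = np.exp(-d2 / (2 * sigma**2))                # kernel adjacency
    np.fill_diagonal(A, 0.0)
    deg = A.sum(axis=1)
    Dmh = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    return np.eye(len(X)) - Dmh @ A @ Dmh           # I - D^-1/2 A D^-1/2

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                         # 5 nodes, 3 features
Wm = rng.normal(size=(3, 3))                        # learnable in the paper
L = residual_laplacian(X, Wm)
```

Because the Laplacian is recomputed from the current features and metric, each data sample in a batch gets its own evolving graph, which is the flexibility the EGCN bullet points describe.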

Page 62: Geometric Deep Learning

Graph structure as the “signal” for prediction
DeepGraph: Graph Structure Predicts Network Growth
Cheng Li, Xiaoxiao Guo, Qiaozhu Mei
(Submitted on 20 Oct 2016)
https://arxiv.org/abs/1610.06251

“Extensive experiments on five large collections of real-world networks demonstrate that the proposed prediction model significantly improves the effectiveness of existing methods, including linear or nonlinear regressors that use hand-crafted features, graph kernels, and competing deep learning methods.”

Graph descriptor vs. adjacency matrix. We have described the process in converting an adjacency matrix into our graph descriptor, which is then passed through a deep neural network for further feature extraction. All computation in this process is to obtain a more effective low-level representation of the topological structure information than the original adjacency matrix.

First, isomorphic graphs can be represented by many different adjacency matrices, while our graph descriptor provides a unique representation for them. This unique representation simplifies the neural network structures needed for network growth prediction.

Second, our graph descriptor provides similar representations for graphs with similar structures. Such similarity is less well preserved in the adjacency matrix representation, and the resulting information loss places a greater burden on deep neural networks in growth prediction tasks.

Third, our graph descriptor is a universal graph structure representation that does not depend on vertex ordering or the number of vertices, whereas the adjacency matrix depends on both.

The motivation for adopting the Heat Kernel Signature (HKS) is its theoretically proven properties in representing graphs: HKS is an intrinsic and informative representation for graphs [31]. Intrinsicness means that isomorphic graphs map to the same HKS representation, and informativeness means that if two graphs have the same HKS representation, then they must be isomorphic.
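The HKS itself is straightforward to compute from the eigendecomposition of the graph Laplacian: HKS(v, t) = Σₖ exp(−λₖ t) φₖ(v)², the diagonal of the heat kernel exp(−tL). A minimal NumPy sketch (not the paper's code) that also demonstrates the intrinsicness property:

```python
import numpy as np

def heat_kernel_signature(A, ts):
    """HKS per node: rows are nodes, columns are diffusion times.
    Uses the combinatorial Laplacian L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    lam, phi = np.linalg.eigh(L)                 # eigenpairs of L
    # entry [v, k] of (exp(-lam*t) * phi**2) is exp(-lam_k t) phi_k(v)^2
    return np.stack([(np.exp(-lam * t) * phi**2).sum(axis=1) for t in ts],
                    axis=1)

# triangle 0-1-2 with a pendant node 3
A = np.array([[0,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0]], float)
H = heat_kernel_signature(A, ts=[0.1, 1.0, 10.0])

# intrinsicness check: relabelling nodes only permutes the rows of H
P = np.eye(4)[[2, 0, 3, 1]]                      # a permutation matrix
H_perm = heat_kernel_signature(P @ A @ P.T, ts=[0.1, 1.0, 10.0])
print(np.allclose(np.sort(H, axis=0), np.sort(H_perm, axis=0)))
```

Sorting the rows before comparison discards the vertex ordering, which is exactly why the descriptor does not depend on node labels.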

A meaningful future direction is to integrate network structure with other types of information, such as the content of information cascades in the network. A joint representation of multi-modal information may maximize the performance of particular prediction tasks.

Page 63: Geometric Deep Learning

GANs for graphs
Can GAN Learn Topological Features of a Graph?
Weiyi Liu, Pin-Yu Chen, Hal Cooper, Min Hwan Oh, Sailung Yeung, Toyotaro Suzumura
(Submitted on 19 Jul 2017)
https://arxiv.org/abs/1707.06197

“This paper is first-line research expanding GANs into graph topology analysis. By leveraging the hierarchical connectivity structure of a graph, we have demonstrated that generative adversarial networks (GANs) can successfully capture topological features of any arbitrary graph, and rank edge sets by different stages according to their contribution to topology reconstruction. Moreover, in addition to acting as an indicator of graph reconstruction, we find that these stages can also preserve important topological features in a graph.”

Workflow for graph topology interpolater (GTI).

Comparison with graph sampling methods on the Facebook subgraphs.

(a) WS Networks

(b) Kron Networks

(c) Wiki-Vote Networks

(d) Facebook Networks

Page 65: Geometric Deep Learning

Brain connectivity Graph-based machine learning analysis #1

Felipe Petroski Such et al. (2017)

Identifying Deep Contrasting Networks from Time Series Data: Application to Brain Network Analysis
John Boaz Lee, Xiangnan Kong, Yihan Bao, Constance Moore
https://doi.org/10.1137/1.9781611974973.61

In functional brain network analysis, the activity of different brain regions can be represented as multiple time series. An important task in the analysis is to identify the latent network from the observed time series data. In this network, the edges (functional connectivity) capture the correlation between different time series (brain regions). The CNN in our model learns a nonlinear edge-weighting function to assign discriminative values to the edges of a network.

A deep-learning approach to translate between brain structure and functional connectivity
Vince D. Calhoun, Md Faijul Amin, Devon Hjelm, Eswar Damaraju, Sergey M. Plis
Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE
https://doi.org/10.1109/ICASSP.2017.7953339

Learning Longitudinal MRI Patterns by SICE and Deep Learning: Assessing the Alzheimer’s Disease Progression
Andrés Ortiz, Jorge Munilla, Francisco J. Martínez-Murcia, Juan M. Górriz, Javier Ramírez, for the Alzheimer’s Disease Neuroimaging Initiative
MIUA 2017: Medical Image Understanding and Analysis, pp 413–424
https://doi.org/10.1007/978-3-319-60964-5_36

Deep Learning with Edge Computing for Localization of Epileptogenicity Using Multimodal rs-fMRI and EEG Big Data
Mohammad-Parsa Hosseini, Tuyen X. Tran, Dario Pompili, Kost Elisevich, Hamid Soltanian-Zadeh
Autonomic Computing (ICAC), 2017 IEEE
https://doi.org/10.1109/ICAC.2017.41

Encoding Multi-Resolution Brain Networks Using Unsupervised Deep Learning
Arash Rahnama, Abdullah Alchihabi, Vijay Gupta, Panos Antsaklis, Fatos T. Yarman Vural
(Submitted on 13 Aug 2017)
https://arxiv.org/abs/1708.04232

Page 66: Geometric Deep Learning

Brain connectivity Graph-based machine learning analysis #2
Distance Metric Learning using Graph Convolutional Networks: Application to Functional Brain Networks
Sofia Ira Ktena, Sarah Parisot, Enzo Ferrante, Martin Rajchl, Matthew Lee, Ben Glocker, Daniel Rueckert
Biomedical Image Analysis Group, Imperial College London, UK
(Submitted on 7 Mar 2017 (v1), last revised 14 Jun 2017 (this version, v2))
https://arxiv.org/abs/1703.02161
https://github.com/sk1712/gcn_metric_learning

Pipeline used for learning to compare functional brain graphs

Box plots showing Euclidean distances after PCA (top) and distances learned with the proposed GCN model (bottom) between matching and non-matching graph pairs of the test set. Differences between the distance distributions of the two classes (matching vs. non-matching) are indicated as significant (*) or non-significant (n.s.) using a permutation test with 10,000 permutations.

The proposed model could benefit from several extensions. The architecture of our network is relatively simple, and further improvement in performance could be obtained by exploring more sophisticated networks. A particularly exciting prospect would be to use autoencoders and adversarial training to learn lower-dimensional representations of connectivity networks that are site-independent. Additionally, exploring the use of generalisable GCNs defined in the graph spatial domain [Monti et al. 2016] would make it possible to train similarity metrics between graphs of different structures.

Page 67: Geometric Deep Learning

Graph classification
Graph Classification via Deep Learning with Virtual Nodes
Trang Pham, Truyen Tran, Hoa Dam, Svetha Venkatesh
(Submitted on 14 Aug 2017)
https://arxiv.org/abs/1708.04357

Virtual Column Network (VCN) = Column Network + Virtual Node. (Left) A graph of 3 nodes augmented with a virtual node e0 connecting to all nodes. (Right) The VCN model for the graph, where x0 is the vector of graph descriptors (if any), x1, x2, x3 are node attributes, and y is the graph label. Boxes are hidden layers.

There is room for further investigation. First, we can use multiple virtual nodes instead of just one. The graph is then embedded into a matrix whose columns are the vector representations of the virtual nodes. This would be beneficial in several ways: for multitask learning, each virtual node can be used for a task while all tasks share the same node representations; for big graphs with tight subgraph structures, each virtual node can target a subgraph.

Second, other node representation architectures besides Column Networks are also applicable for deriving graph representations, including the Gated Graph Sequence Neural Network [Li et al., 2016], the Graph Neural Network [Scarselli et al., 2009] and diffusion-CNN [Atwood and Towsley, 2016].

We have proposed a simple solution for learning representation of a graph: adding a virtual node to the existing graph. The expanded graph can then be passed through any node representation method, and the representation of the virtual node is the graph’s. The virtual node, coupled with a recent node representation method known as Column Network, results in a new graph classification method called Virtual Column Network (VCN).

We demonstrate the power of the VCN on two tasks: (i) classification of the bio-activity of chemical compounds against a given cancer; (ii) detecting software vulnerabilities from source code. Overall, the automatic representation learning is more powerful than state-of-the-art feature engineering.
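The virtual-node trick itself is easy to drop into any graph pipeline; a minimal NumPy sketch (the function name and the zero-initialization of the virtual node's attributes are illustrative choices, not necessarily the paper's):

```python
import numpy as np

def add_virtual_node(A, X):
    """Augment a graph with one virtual node connected to every real node.
    A: (n, n) adjacency, X: (n, d) node attributes. After message passing,
    the representation of the last (virtual) node serves as the graph's."""
    n = A.shape[0]
    A2 = np.zeros((n + 1, n + 1))
    A2[:n, :n] = A
    A2[n, :n] = A2[:n, n] = 1.0          # edges: virtual <-> every node
    X2 = np.vstack([X, np.zeros((1, X.shape[1]))])  # zero attributes
    return A2, X2

# 2-node graph with 2-d attributes
A = np.array([[0, 1], [1, 0]], float)
X = np.array([[1.0, 0.0], [0.0, 1.0]])
A2, X2 = add_virtual_node(A, X)
```

Any node-representation method can then be run on `(A2, X2)` unchanged, which is exactly the plug-and-play property the slide emphasises.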

Page 68: Geometric Deep Learning

Information diffusion
DeepCas: an End-to-end Predictor of Information Cascades
Cheng Li, Jiaqi Ma, Xiaoxiao Guo, Qiaozhu Mei
WWW 2017, April 3–7, 2017, Perth, Australia
http://dx.doi.org/10.1145/3038912.3052643

Inspired by the recent successes of deep learning in multiple data mining tasks, we investigate whether an end-to-end deep learning approach could effectively predict the future size of cascades. Such a method automatically learns the representation of individual cascade graphs in the context of the global network structure, without hand-crafted features or heuristics. We find that node embeddings fall short of predictive power, and it is critical to learn the representation of a cascade graph as a whole. We present algorithms that learn the representation of cascade graphs in an end-to-end manner, which significantly improve the performance of cascade prediction over strong baselines including feature-based methods, node embedding methods, and graph kernel methods. Our results also provide interesting implications for cascade prediction in general.

Most modern social network platforms are designed to facilitate fast diffusion of information. Information cascades are identified as a major factor in almost every plausible or disastrous social network phenomenon, ranging from viral marketing and diffusion of innovation to crowdsourcing, rumor spread, cyber violence, and various types of persuasion campaigns.

Finally, to make our conclusion clean and generalizable, we only utilized the network structure and node identities in the prediction. It is interesting to incorporate DeepCas with other types of information when they are available, e.g., content and time series, to optimize the prediction accuracy on a particular domain.

Page 69: Geometric Deep Learning

Community clustering
Community Detection with Graph Neural Networks
Joan Bruna, Xiang Li
(Submitted on 23 May 2017 (v1), last revised 27 May 2017 (this version, v2))
https://arxiv.org/abs/1705.08415
https://github.com/alexnowakvila/QAP_pt (Lua Torch)

We input an arbitrary signal (which can be random or informative) and output a classification of the nodes. Colour saturation represents the magnitude of the signal, whereas colour encodes the different label classes (red versus blue in this case).

In this work we have studied data-driven approaches to clustering with graph neural networks. Our results confirm that, even when the signal-to-noise ratio is at the lowest detectable regime, it is possible to backpropagate detection errors through a graph neural network that can ‘learn’ to extract the spectrum of an appropriate operator. This is made possible by considering generators that span the appropriate family of graph operators that can operate in sparsely connected graphs.

One word of caution is that obviously our results are inherently non-asymptotic, and further work is needed in order to confirm that learning is still possible as |V | grows. Nevertheless, our results open up interesting questions, namely understanding the energy landscape that our model traverses as a function of signal-to-noise ratio; or whether the network parameters can be interpreted mathematically. This could be useful in the study of computational-to-statistical gaps, where our model could be used to inquire about the form of computationally tractable approximations.

Page 71: Geometric Deep Learning

Image Classification for 360º imaging #1a

Graph-Based Classification of Omnidirectional Images
Renata Khasanova, Pascal Frossard
(Submitted on 26 Jul 2017)
https://arxiv.org/abs/1707.08301

The proposed graph construction method makes the filter response similar regardless of the position of the pattern on an image from an omnidirectional camera.

Our work is inspired by Khasanova and Frossard (2017), where the authors use graphs to create isometry-invariant features of images in Euclidean space. This tackles the first of the aforementioned challenges; however, the same object seen at different positions of an omnidirectional image still appears different from the network's point of view. To mitigate this issue, we propose to incorporate knowledge about the geometry of the omnidirectional camera lens into the signal representation, namely into the structure of the graph. In summary, we propose the following contributions:

● A principled way of graph construction based on geometry of omnidirectional images;

● Graph-based deep learning architecture for the omnidirectional image classification task.

Page 72: Geometric Deep Learning

Image Classification #2

Example of how semantic knowledge about the world aids classification. Here we see an elephant shrew. Humans are able to make the correct classification based on what we know about the elephant shrew and other similar animals.

Graph Search Neural Network diagram. Shows initialization of hidden states, addition of new nodes as graph is expanded and the flow of losses through the output, propagation and importance nets.

The GSNN and the framework we use for vision problems are completely general. Our next steps will be to apply the GSNN to other vision tasks, such as detection, Visual Question Answering, and image captioning. Another interesting direction would be to combine the procedure of this work with a system such as NEIL (Babenko et al. 2014) to create a system that builds knowledge graphs and then prunes them to get a more accurate, useful graph for image tasks.

An evaluation of large-scale methods for image instance and class discovery
Matthijs Douze, Hervé Jégou, Jeff Johnson
(Submitted on 9 Aug 2017)
https://arxiv.org/abs/1708.02898

Page 73: Geometric Deep Learning

Image Classification #3: with situation recognition
Situation Recognition with Graph Neural Networks
Ruiyu Li, Makarand Tapaswi, Renjie Liao, Jiaya Jia, Raquel Urtasun, Sanja Fidler
(Submitted on 14 Aug 2017)
https://arxiv.org/abs/1708.04320

Understanding an image involves more than just predicting the most salient action. We need to know who is performing this action, what tools (s)he may be using, etc. Situation recognition is a structured prediction task that aims to predict the verb and its frame that consists of multiple role-noun pairs. The figure shows a glimpse of our model that uses a graph to model dependencies between the verb and its roles.

Images corresponding to the same verb can be quite different in their content involving verb roles. This makes situation recognition difficult.

The architecture of the fully-connected roles GGNN. The undirected edges between all roles of a verb-frame allow the model to fully capture the dependencies between them.

The architecture of the chain RNN for the verb riding. The time-steps at which different roles are predicted need to be decided manually, and this has an influence on the performance.

2D t-SNE representation of the learned verb embeddings of the verbs belonging to the 11 largest clusters.

Page 74: Geometric Deep Learning

Similarity search #1
Fast Spectral Ranking for Similarity Search
Ahmet Iscen, Yannis Avrithis, Giorgos Tolias, Teddy Furon, Ondrej Chum
(Submitted on 20 Mar 2017 (v1), last revised 22 Mar 2017 (this version, v2))
https://arxiv.org/abs/1703.06935

“Even if a nearest neighbor graph is computed offline, exploring the manifolds online remains expensive. This work introduces an explicit embedding reducing manifold search to Euclidean search followed by dot product similarity search. We show this is equivalent to linear graph filtering of a sparse signal in the frequency domain, and we introduce a scalable offline computation of an approximate Fourier basis of the graph.”
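The exact filter that such diffusion-based manifold search applies can be written in a few lines; a minimal NumPy sketch of linear graph filtering for retrieval, assuming a symmetrically normalized affinity matrix (the paper's contribution is the scalable approximate Fourier basis, which this dense solve deliberately omits):

```python
import numpy as np

def diffusion_scores(W, y, alpha=0.9):
    """Manifold search as linear graph filtering: solve (I - alpha*S) f = y,
    where S = D^-1/2 W D^-1/2 and y is the query indicator. In the spectral
    domain this is the low-pass transfer function 1 / (1 - alpha*lambda)."""
    deg = W.sum(axis=1)
    Dmh = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    S = Dmh @ W @ Dmh
    n = len(W)
    return np.linalg.solve(np.eye(n) - alpha * S, y)

# chain graph 0-1-2-3, query at node 0
W = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], float)
y = np.array([1.0, 0.0, 0.0, 0.0])
f = diffusion_scores(W, y)
```

The scores `f` spread the query's relevance along the data manifold (every node connected to the query gets a positive score), which is what plain Euclidean nearest-neighbour ranking misses.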

Deep Gaussian Embedding of Attributed Graphs: Unsupervised Inductive Learning via Ranking
Aleksandar Bojchevski, Stephan Günnemann
(Submitted on 12 Jul 2017 (v1), last revised 13 Jul 2017 (this version, v2))
https://arxiv.org/abs/1707.03815

Two queries with the lowest AP from Paris6k and the corresponding top-ranked negative images based on the ground-truth. The left-most image contains the query object, and its AP is reported underneath it. Images considered to be negative based on the ground-truth are shown with their rank underneath. Ranks are marked in blue for incorrectly labeled images that are visually relevant (overlapping), and red otherwise.

Attributed graphs are a natural representation of a wide variety of real-life data such as biological networks (gene/protein interaction networks), sensor networks (smart homes/cities), social and information networks (Facebook friendship networks, Amazon/Epinions rating networks), and coauthor and citation networks (DBLP, Arxiv, Citeseer).

The main contributions of our approach are summarized as follows:

a) We embed nodes as Gaussian distributions allowing us to capture uncertainty.

b) We take into account the network structure at multiple scales by exploiting the natural ordering of the nodes via a ranking formulation on the embedding distances.

c) We propose a completely unsupervised general method applicable to different types of graphs (plain, attributed, directed, undirected) that can handle inductive learning scenarios – embedding unseen nodes that were not part of the initial network without additional training.
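To make (a) concrete: with diagonal Gaussians, the KL divergence used as a dissimilarity between node embeddings has a simple closed form. A sketch under that assumption (Graph2Gauss's actual loss wraps such a dissimilarity in the ranking objective of (b)):

```python
import numpy as np

def kl_diag_gauss(mu1, var1, mu2, var2):
    """KL( N(mu1, diag(var1)) || N(mu2, diag(var2)) ), closed form for
    diagonal covariances. Asymmetric, which suits directed graphs."""
    return 0.5 * np.sum(np.log(var2 / var1)
                        + (var1 + (mu1 - mu2) ** 2) / var2
                        - 1.0)

mu, var = np.zeros(2), np.ones(2)
near = kl_diag_gauss(mu, var, mu + 0.1, var)   # close embeddings
far = kl_diag_gauss(mu, var, mu + 2.0, var)    # distant embeddings
```

The learned variances give the uncertainty mentioned in (a): a node whose neighbourhood is ambiguous can be assigned a wide Gaussian rather than a single point.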

2D visualization of the embeddings on the Cora dataset. Color indicates the class label not used during training.

Conclusion We proposed Graph2Gauss – the first unsupervised approach that represents nodes in attributed graphs as Gaussian distributions and is therefore able to capture uncertainty.

Since we exploit the attribute information of the nodes, we can effortlessly generalize to unseen nodes, enabling inductive reasoning. Graph2Gauss leverages the natural ordering of the nodes w.r.t. their neighborhoods via a personalized ranking formulation. The strength of the learned embeddings has been demonstrated on several tasks – specifically, achieving high link prediction performance even in the case of low-dimensional embeddings. As future work we aim to study personalized rankings beyond the ones imposed by hop distance.

Page 75: Geometric Deep Learning

Graphical model discovery
Learning to Discover Sparse Graphical Models
Eugene Belilovsky (CVN, GALEN), Kyle Kastner, Gaël Varoquaux (NEUROSPIN, PARIETAL), Matthew Blaschko
(Submitted on 20 May 2016 (v1), last revised 3 Aug 2017 (this version, v3))
https://arxiv.org/abs/1605.06359

(a) Illustration of nodes and edges “seen” at edge (4,13) in layer 1 and (b) receptive field at layer 1. All entries in grey show the o^0_{i,j} entries of the covariance matrix used to compute o^1_{4,13}. (c) shows the dilation process and receptive field (red) at higher layers. Finally, the equations for each layer output are given, initialized by the covariance entries p_{i,j}.

Diagram of the DeepGraph structure discovery architecture used in this work. The input is first standardized and then the sample covariance matrix is estimated. A neural network consisting of multiple dilated convolutions (Yu & Koltun, 2015) and a final 1 × 1 convolution layer is used to predict edges corresponding to non-zero entries in the precision matrix.

Cancer Genome Data We perform experiments on a gene expression dataset described in Honorio et al. (2012). DeepGraph has far better stability in the genome experiments and is competitive in the fMRI data.

Average test likelihood for COAD and BRCA subject groups in gene data and neuroimaging data using different numbers of selected edges. Each experiment is repeated 50 times for the genetics data, and approximately 1500 times for the fMRI data to obtain significant results due to the high variance in the data. DeepGraph with averaged permutation dominates in all cases for the genetics data, while DeepGraph+Permutation is superior or equal to competing methods on the fMRI data.

We consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often requiring the formulation of priors and sophisticated inference procedures.

We find that on genetics, brain imaging, and simulation data we obtain performance generally superior to analytical methods.

Page 76: Geometric Deep Learning

Mesh processing with graphs
Surface Networks
Ilya Kostrikov, Joan Bruna, Daniele Panozzo, Denis Zorin
(Submitted on 30 May 2017)
https://arxiv.org/abs/1705.10819

Page 78: Geometric Deep Learning

Point Cloud Deep learning Intro NIPS 2016

Deep learning has proven to be a powerful tool for building models for language (one-dimensional) and image (two-dimensional) understanding. Tremendous effort has been devoted to these areas; however, applying deep learning to 3D data is still at an early stage, despite its great research value and broad real-world applications. In particular, existing methods poorly serve the three-dimensional data that drives a broad range of critical applications such as augmented reality, autonomous driving, graphics, robotics, medical imaging, neuroscience, and scientific simulations. These problems have drawn the attention of researchers in different fields such as neuroscience, computer vision, and graphics.

The goal of this workshop is to foster interdisciplinary communication among researchers working on 3D data (Computer Vision and Computer Graphics), so that more attention from the broader community can be drawn to 3D deep learning problems. Through these studies, new ideas and discoveries are expected to emerge, which can inspire advances in related fields.

This workshop is composed of invited talks, oral presentations of outstanding submissions and a poster session to showcase the state-of-the-art results on the topic. In particular, a panel discussion among leading researchers in the field is planned, so as to provide a common playground for inspiring discussions and stimulating debates.

The workshop will be held on Dec 9 at NIPS 2016 in Barcelona, Spain. http://3ddl.cs.princeton.edu/2016/

ORGANIZERS
● Fisher Yu - Princeton University

● Joseph Lim - Stanford University

● Matthew Fisher - Stanford University

● Qixing Huang - University of Texas at Austin

● Jianxiong Xiao - AutoX Inc.

http://cvpr2017.thecvf.com/ In Honolulu, Hawaii

“I am co-organizing the 2nd Workshop on Visual Understanding for Interaction in conjunction with CVPR 2017. Stay tuned for the details!”

“Our workshop on Large-Scale Scene Understanding Challenge is accepted by CVPR 2017.”

http://3ddl.cs.princeton.edu/2016/slides/su.pdf

Page 79: Geometric Deep Learning

Point Cloud Deep learning Intro CVPR 2017 #1
http://cvpr2017.thecvf.com/ In Honolulu, Hawaii

Main topics of interest:
● Shape Representations: Silhouettes, Surfaces, Skeletons, Humans, etc.
● Information Geometry: Fisher-Rao and elastic metrics, Gromov-Wasserstein family, Earth-Mover’s distance, etc.
● Dynamical Systems: Trajectories on manifolds, Rate-invariance, Identification and classification of systems.
● Domain Transfer: Ideas and applications.
● Image/Volume/Trajectory: Spatial and temporal registration & segmentation.
● Manifold-Valued Features: Histograms, Covariance, Symmetric positive-definite matrices, Mixture models.
● Big Data: Dimension reduction using geometric tools.
● Bayesian Inference: Nonlinear domains, Computational solutions using differential geometry, Variational approaches.
● Machine Learning Approaches on Nonlinear Feature Spaces: Kernel methods, Boosting, SVM-type classification, Detection and tracking algorithms.
● Functional Data Analysis: Hilbert manifolds, Visualization.
● Applications: Medical analysis, Biometrics, Biology, Environmetrics, Graphics, Activity recognition, Bioinformatics, Pattern recognition, etc.
● Geometry of articulated bodies: Applications to robotics, biomechanics, and motor control.
● Computational topology and applications.

http://www-rech.telecom-lille.fr/diffcvml2017/index.html

Synced: In-Depth AI Technology & Industry Review
www.syncedreview.com | www.jiqizhixin.com
Aug 7

CVPR 2017: The Fusion of Deep Learning and Computer Vision, What’s Next?

As mentioned, the nature of dimensionality challenges researchers with noisy, low-resolution, and incomplete scan data. Current works are beginning to capture global semantic meaning and match it with local geometry patterns. However, the size of current datasets may no longer support cutting-edge research.

Thus, the next research goal may shift to developing a properly designed dataset for 3D Vision. Other papers like:

Noise Robust Depth From Focus Using a Ring Difference Filter, Learning From Noisy Large-Scale Datasets With Minimal Supervision, Global Hypothesis Generation for 6D Object Pose Estimation, Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation,

are dealing with noisy data and estimation problems. “I’m interested in Geometric Deep Learning; it will be the new trend,” one Ph.D. student said.

3D Vision

Coupling data and model. This will be the future trend. The fundamental problem of current research is that there is no longer enough data for advanced algorithms or models for special applications. So many researchers’ outputs consist of not only algorithms or architectures, but also datasets or methods to amass data.

Page 80: Geometric Deep Learning

Point Cloud Deep learning Intro CVPR 2017 #2
DIY A Multiview Camera System: Panoptic Studio Teardown
Hanbyul Joo, Tomas Simon, Hyun Soo Park, Shohei Nobuhara, Yaser Sheikh

Deep Learning for Objects and Scenes
Bolei Zhou, Prof. Xiaogang Wang, Ross Girshick, Kaiming He

Geometric deep learning on graphs and manifolds
Prof. Michael Bronstein, Prof. Joan Bruna, Prof. Arthur Szlam, Prof. Xavier Bresson, Prof. Yann LeCun

Geometric and Semantic 3D Reconstruction
Christian Häne, Jakob Engel, Srikumar Ramalingam, Sudipta Sinha

Large-Scale Visual Place Recognition and Image-Based Localization
Torsten Sattler, Akihiko Torii, Alex Kendall, Giorgos Tolias

http://nrsfm2017.compute.dtu.dk/program

Bridges to 3D Workshop: How can 3D Vision Help and Be Helped?
July 26th @ CVPR 2017 in Honolulu, Hawaii, USA

https://bridgesto3d.github.io/

Transformation-Grounded Image Generation Network for Novel 3D View Synthesis
Eunbyung Park, Jimei Yang, Ersin Yumer, Duygu Ceylan, Alexander C. Berg

Generating Holistic 3D Scene Abstractions for Text-Based Image Retrieval
Ang Li, Jin Sun, Joe Yue-Hei Ng, Ruichi Yu, Vlad I. Morariu, Larry S. Davis

Learning to Align Semantic Segmentation and 2.5D Maps for Geolocalization
Anil Armagan, Martin Hirzer, Peter M. Roth, Vincent Lepetit

Semantic Multi-View Stereo: Jointly Estimating Objects and Voxels
Ali Osman Ulusoy, Michael J. Black, Andreas Geiger

Page 81: Geometric Deep Learning

Point Cloud Deep learning Intro CVPR 2017 #3
Transformation-Grounded Image Generation Network for Novel 3D View Synthesis
Eunbyung Park, Jimei Yang, Ersin Yumer, Duygu Ceylan, Alexander C. Berg
(Submitted on 8 Mar 2017)
https://arxiv.org/abs/1703.02921

https://www.youtube.com/watch?v=etomDJy-ipY

Dynamic Time-of-Flight
Michael Schober, Amit Adam, Omer Yair, Shai Mazor, Sebastian Nowozin
openaccess.thecvf.com

https://youtu.be/F8dlTD1I20M

Basic idea of the proposed method: (a) an ordinary (static) time-of-flight camera captures a set of infrared (IR) frames using the same active measurement pattern (grey) in each time step and uses the captured information only once, in the current time step, to reconstruct depth; (b) the proposed dynamic time-of-flight camera uses different measurement patterns in each time step (in this case two, shown in orange and purple) and integrates multiple measurements in time (additional arrows) to improve the depth accuracy.

This combination allows us to integrate information over multiple frames while remaining robust to rapid changes.

This report mainly focuses on recent research progress in graphics on the geometry, structure and semantic analysis of 3D indoor data and different modeling techniques for creating plausible and realistic indoor scenes. We first review works on understanding and semantic modeling of scenes from captured 3D data of the real world. Finally, we discuss open problems in indoor scene processing that might be of interest to graphics and all related communities.

In this paper, we present a probabilistic approach that integrates object-level shape priors with image-based 3D reconstruction. Our experiments demonstrate that the proposed shape prior significantly improves reconstruction accuracy, in particular when the number of input images is small. To the best of our knowledge, our approach is the first to simultaneously reconstruct a dense 3D model of the entire scene and a structural representation of the objects in it. Our experiments demonstrate the benefit of this joint inference. Further, we believe such an integrated representation of 3D geometry and semantics is a step towards holistic scene understanding and will benefit applications such as augmented reality and autonomous driving.

Future directions include incorporating parametric shape models to improve generality of our prior. We also believe recent 3D pose estimation methods[Gupta et al. 2015; Kehl et al. 2016; Massa et al. 2016] can be used to improve pose proposals during inference.

Semantic Multi-view Stereo: Jointly Estimating Objects and Voxels
Ali Osman Ulusoy, Michael J. Black, Andreas Geiger
http://www.cvlibs.net/publications/Ulusoy2017CVPR.pdf

Page 82: Geometric Deep Learning

Point Cloud Deep learning PointNet #1A

https://github.com/charlesq34/pointnet
https://arxiv.org/abs/1612.00593

Applications of PointNet. We propose a novel deep net architecture that consumes raw unordered point cloud (set of points) without voxelization or rendering. It is a unified architecture that learns both global and local point features, providing a simple, efficient and effective approach for a number of 3D recognition tasks

Page 83: Geometric Deep Learning

Point Cloud Deep learning PointNet #1B

Our network has three key modules:

1) the max pooling layer as a symmetric function to aggregate information from all the points,

2) a local and global information combination structure,

3) and two joint alignment networks that align both input points and point features.
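The core of module 1 is that max pooling over the point dimension is a symmetric function, so the global feature does not depend on the order of the input points. A minimal NumPy sketch of this idea (a hypothetical toy, not the authors' code; `shared_mlp` stands in for PointNet's shared per-point MLP):

```python
import numpy as np

def shared_mlp(points, W, b):
    # Apply the same weights to every point independently (a "shared MLP"):
    # a per-point feature transform, (N, d_in) -> (N, d_out), with ReLU.
    return np.maximum(points @ W + b, 0.0)

def global_feature(points, W, b):
    # Max pooling across the point dimension is a symmetric function:
    # the result is identical for any permutation of the input points.
    feats = shared_mlp(points, W, b)   # (N, d_out)
    return feats.max(axis=0)           # (d_out,)

rng = np.random.default_rng(0)
pts = rng.normal(size=(128, 3))        # toy point cloud with N = 128 points
W = rng.normal(size=(3, 16))
b = np.zeros(16)

g1 = global_feature(pts, W, b)
g2 = global_feature(pts[rng.permutation(128)], W, b)  # shuffled point order
assert np.allclose(g1, g2)             # permutation invariance holds
```

This is exactly why a raw, unordered point set can be consumed without voxelization: any permutation of the rows of `pts` yields the same global descriptor.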

Page 84: Geometric Deep Learning

Point Cloud Deep learning PointNet #1c

For further details on the “original PointNet”:

Journal club done for Cubicasa with Vid Stojevic

https://www.slideshare.net/PetteriTeikariPhD/pointnet

Page 85: Geometric Deep Learning

Point Cloud Deep learning #2
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas
(Submitted on 7 Jun 2017)
https://arxiv.org/abs/1706.02413

Page 86: Geometric Deep Learning

Point Cloud Deep learning
Optimization Beyond the Convolution: Generalizing Spatial Relations with End-to-End Metric Learning
Philipp Jund, Andreas Eitel, Nichola Abdo, Wolfram Burgard
(Submitted on 4 Jul 2017)
https://arxiv.org/abs/1707.00893
https://youtu.be/d0PSULLu1Vs

To operate intelligently in domestic environments, robots require the ability to understand arbitrary spatial relations between objects and to generalize them to objects of varying sizes and shapes. In this work, we present a novel end-to-end approach utilizing neural networks to generalize spatial relations based on distance metric learning. Our network transforms spatial relations to a feature space that captures their similarities based on 3D point clouds of the objects and without prior semantic knowledge of the relations. It employs gradient-based optimization to compute object poses in order to imitate an arbitrary target relation by reducing the distance to it under the learned metric.

An example of generalizing a relation. The pre-trained convolutional network encodes the transformation into the metric space. By backpropagating the error of the Euclidean distance between the current scene's embeddings and the reference scene's embeddings, we can optimize the 3D translation and rotation of two objects to resemble the reference scene's spatial relations.

Projection layer. Projections to three orthogonal planes. We project each point cloud to the three orthogonal planes defined by y−1 = 0, x = 0, and z−1 = 0. We create a depth image by setting the value of a pixel to the smallest distance of all points that are projected on this pixel multiplied by 100 and shifted by 100. Each projection of the two objects is in a separate channel. For this visualization, we added an all-zero blue channel.
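The projection layer described above can be sketched as follows (a hypothetical NumPy re-implementation for illustration; the image resolution, the per-cloud normalization, and the background value of 0 are my assumptions, while the "100 × distance + 100" encoding follows the caption):

```python
import numpy as np

def depth_projection(points, axis, res=32):
    # Project a point cloud onto the plane orthogonal to `axis` and keep,
    # per pixel, the smallest depth along that axis, encoded as
    # 100 * distance + 100 (as in the caption); empty pixels stay 0.
    other = [a for a in range(3) if a != axis]
    uv = points[:, other]
    lo, hi = uv.min(axis=0), uv.max(axis=0)
    ij = ((uv - lo) / np.maximum(hi - lo, 1e-9) * (res - 1)).astype(int)
    depth = points[:, axis] - points[:, axis].min()
    img = np.zeros((res, res))
    for (i, j), d in zip(ij, depth):
        v = 100.0 * d + 100.0
        if img[i, j] == 0.0 or v < img[i, j]:
            img[i, j] = v                 # keep the closest point per pixel
    return img

pts = np.random.default_rng(1).uniform(size=(500, 3))
imgs = np.stack([depth_projection(pts, a) for a in range(3)])  # three views
```

In the paper each of the two objects is projected into a separate channel; here a single cloud per view is enough to show the encoding.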

Page 87: Geometric Deep Learning

3D shapes via 2D geometry images #1
Deep learning 3D shape surfaces using geometry images
Ayan Sinha, Jing Bai and Karthik Ramani
ECCV 2016: Computer Vision – ECCV 2016 pp 223-240
https://doi.org/10.1007/978-3-319-46466-4_14
https://github.com/sinhayan/learning_geometry_images

Left: Shape representation using geometry images: the original teddy model to the left is reconstructed (right) using the geometry image representation corresponding to the X, Y and Z coordinates (center). Right: Learning 3D shape surfaces using geometry images: our approach to learn shapes using geometry images is applicable to rigid (left) as well as non-rigid objects undergoing isometric transformations (right). The geometry images encode local properties of shape surfaces such as the principal curvatures (C_min, C_max). Topology of a non-zero genus surface is accounted for by using a topological mask (C_top), as in the bookshelf example.

Authalic vs Conformal parametrization: (Left to right) 2500 vertices of the hand mesh are color coded in the first two plots. A 64 × 64 geometry image is created by uniformly sampling a parametrization, and then interpolating the nearby feature values. The authalic geometry image encodes all tip features. Conformal parametrization compresses high curvature points to dense regions [12]. Hence, the finger tips are all mapped to very small regions. The fourth plot shows that the resolution of the geometry image is insufficient to capture the tip feature colors in the conformal parametrization. This is validated by reconstructing the shape from geometry images encoding x, y, z locations for both parameterizations in the final two plots.

Left Left: Harmonic field corresponding to area distortion on sphere displayed on the original mesh. Center: Area restoring flow on the spherical domain mapped onto the original mesh as a quiver plot. Right: Enlarged plot of area restoring flow. Right: Explanation of geometry image construction from a spherical parametrization: The spherical parametrization (A) is mapped onto an octahedron (B) and then cut along edges (4 colored dashed edges in line plot below) to output a flat geometry image (C). The colored edges share the same color coding as the one in the octahedron. Also the half-edges on either side of the midpoint of colored edges correspond to the same edge of the octahedron.

Left: Intrinsic vs. extrinsic properties of shapes. Top left: Original shape. Top right: Reconstructed shape from geometry image with cut edges displayed in red. The middle and bottom rows show the geometry image encoding the y coordinates and HKS, respectively, of two spherical parameterizations (left and right). The two spherical parameterizations are symmetrically rotated by 180 degrees along the Y-axis. The geometry images for the Y-coordinate display an axial as well as intensity flip, whereas the geometry images for HKS only display an axial flip. This is because HKS is an intrinsic shape signature (geodesics are preserved) whereas point coordinates on a shape surface are not. Center: Intrinsic descriptors (here the HKS) are invariant to shape articulations. Right: Padding structure of geometry images: the geometry images for the 3 coordinates are replicated to produce a 3 × 3 grid. The center image in each grid corresponds to the original geometry image. Observe that no discontinuities exist along the grid edges.

“In the future we wish to build upon these insights for generative modeling of 3D shapes using geometry images instead of traditional images using deep learning. We believe that deep learning using geometry images can potentially spark a closer communion between the 3D vision and geometry community.”

Page 88: Geometric Deep Learning

3D shapes via 2D geometry images #2
GWCNN: A Metric Alignment Layer for Deep Shape Analysis
Danielle Ezuz, Justin Solomon, Vladimir G. Kim, Mirela Ben-Chen
Computer Graphics Forum, 36: 49–57. doi: 10.1111/cgf.13244

Our approach maps unstructured geometric data to a regular domain by minimizing the metric distortion of the map using the regularized Gromov–Wasserstein objective.

Our pipeline. We first compute point-wise surface features that are mapped to a stack of 2D functions over a canonical domain via our novel metric alignment layer (see below). This provides a natural input for subsequent CNN layers.

Our metric alignment layer. The inputs are a pairwise distance matrix D, a measure µ, and k per-point descriptors. The layer maps the descriptors to a common domain by learning its metric D0, and generates a mapping matrix Γ. Finally, the input descriptors are multiplied by Γ to generate a stack of 2D images to feed to the following layers.

Our convolutional neural network for classification and shape retrieval.

tSNE embedding [Maaten and Hinton, 2008] of mapped descriptors corresponding to different mapping methods: (left) Geometry images [Sinha et al. 2016], (center) GW mapping to a fixed 2D grid, (right) GW mapping with a learned metric. We show the first 10 classes of SHREC’11 [Lian et al. 2011], where each point represents a shape, and shapes from the same class are in the same color. Note that after metric learning our mapped descriptors are nicely clustered, aiding the classification and retrieval tasks we optimized the embedding for.

Toy example, learning a target metric to optimize consistency. Given input descriptors f, ˜f on two shapes we show embeddings to canonical domain (center) before optimization (top) and after optimization (bottom).

Page 89: Geometric Deep Learning

Metric learning for point clouds

We present a novel method based on distance metric learning to reason about the similarity between pairwise spatial relations. Our approach uses demonstrations of a relation given by non-expert teachers (top row) in order to generalize this relation to new objects (bottom row)

Visualization of the descriptor computation for the spatial relation between objects o_k and o_l. We consider the gravity vector g at the centroid c_k of the reference object o_k. We compute angles θ and ϕ based on direction vectors involving all surface points p_k and p_l on o_k and o_l, respectively.

CONCLUSION

In this paper, we presented a novel approach to the problem of learning pairwise spatial relations and generalizing them to different objects. Our method is based on distance metric learning and enables a robot to reason about the similarity between scenes with respect to the relations they represent. To encode a relation, we introduced a novel descriptor based on the geometries of the objects. By learning a distance metric using this representation, our method is able to reproduce a new relation from a small number of teacher demonstrations by reasoning about its similarity to previously-encountered ones. In this way, our approach allows for a lifelong learning scenario by continuously leveraging its prior knowledge about relations to bootstrap imitating new ones. We evaluated our approach extensively using real-world data we gathered from non-expert teachers.

Our results demonstrate the effectiveness of our approach in reasoning about the similarity between relations and its ability to reproduce arbitrary relations to new objects by learning interactively from a teacher. In the future, we plan to extend the approach to partial observations of point cloud data.

Metric Learning for Generalizing Spatial Relations to New Objects
Oier Mees, Nichola Abdo, Mladen Mazuran, Wolfram Burgard
(Submitted on 6 Mar 2017 (v1), last revised 24 Jul 2017 (this version, v3))
https://arxiv.org/abs/1703.01946

Page 90: Geometric Deep Learning

3D Shape Matching
Efficient Deformable Shape Correspondence via Kernel Matching
Zorah Lähner, Matthias Vestner, Amit Boyarski, Or Litany, Ron Slossberg, Tal Remez, Emanuele Rodolà, Alex Bronstein, Michael Bronstein, Ron Kimmel, Daniel Cremers
(Submitted on 25 Jul 2017 (v1), last revised 1 Aug 2017 (this version, v2))
https://arxiv.org/abs/1707.08991

Schematic illustration of the proposed algorithm for maximizing a convex quadratic objective over a convex polytope, by successively maximizing a linear sub-estimate of it. The hot color map encodes the function values. The jet color map encodes the values of the linear sub-estimate. The point around which the objective is linearized is depicted in red. The global maximum is depicted in blue. The maximum of the linear sub-estimate is depicted in green. Notice that the algorithm travels between extreme but not necessarily adjacent points of the polytope, until it converges to a local maximum.

Conceptual illustration of our multiscale approach: (a) correspondence at a coarse scale is given; (b) vertices on the source shape are grouped into sets (left), and the known correspondence is used to group vertices on the target shape (right); (c) vertices at a finer scale are added (small black spheres), and (d) included in the group they reside in; finally, a correspondence is calculated for each group separately.

Qualitative examples on FAUST models (left), SHREC’16 (middle) and SCAPE (right). In the SHREC experiment, the green parts mark where no correspondence was found. Notice how those areas are close to the parts that are hidden in the other model. The missing matches (marked in black) in the SCAPE experiment are an artifact due to the multiscale approach.

Page 91: Geometric Deep Learning

3D Shape Segmentation for graphs #1
3D shape segmentation via shape fully convolutional networks
Pengyu Wang, Yuan Gan, Panpan Shui, Fenggen Yu, Yan Zhang, Songle Chen, Zhengxing Sun
Computers & Graphics, Available online 26 July 2017
https://doi.org/10.1016/j.cag.2017.07.030

Representation of different forms of data. (a) Image data representation; (b) Shape data representation; (c) Shape data represented as graph structure

Convolution process on a shape. (a) Shape represented as a graph; (b) The neighbourhood nodes of different source nodes searched by a breadth-first search, among which 4, 6, 7 represent source nodes. The areas circled by orange, red and green dotted lines are the neighbourhood nodes searched by each source node. (c) Convolution order of each neighbourhood set. The ellipsis is for other neighbourhood sets not represented.

Example of graph coarsening and pooling. (a) Graph coarsening process. Note: the original graph has 9 arbitrarily ordered vertices. For a pooling of size 4, two coarsenings of size 2 are needed. To ensure that in the coarsening process the balanced binary tree can be constructed, we need to add appropriate fake nodes through calculation, which are identified in red. After coarsening, the node order on G3 is still arbitrary, yet it can be manually set. Then backstepping the coarsening process, we can determine the node order of G2 and G1, and the corresponding relationship between nodes in each layer according to the order of G3. At that point the arrangement of vertices in G1 permits a regular 1D pooling. (b) Pooling order. The nodes in the first layer (blue and red) represent the G1 node order; the second layer (yellow and red) represents G2 node order; the third layer (purple) represents the G3 node order and the corresponding relationship between nodes in each layer. The red nodes are fake nodes, which are set to 0 in the pooling process, as we carry out maximum pooling.
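The fake-node trick above reduces graph pooling to a regular 1D pooling. A toy NumPy sketch (illustrative only; the vertex values, the pooling size of 4, and the padding layout are my assumptions, while setting fake nodes to 0 under max pooling follows the caption):

```python
import numpy as np

def graph_max_pool(signal, fake_mask, size=4):
    # After coarsening, vertices are ordered so that each group of `size`
    # consecutive entries pools to one coarse vertex. Fake nodes (padding
    # added to balance the binary tree) carry 0, so they never win the max
    # over the nonnegative activations.
    x = np.where(fake_mask, 0.0, signal)
    return x.reshape(-1, size).max(axis=1)

# 9 real vertices padded with 7 fake ones to reach 16 = 4 * 4,
# mirroring the caption's example of a size-4 pooling.
sig = np.array([5, 1, 3, 2, 8, 1, 4, 6, 7], dtype=float)
padded = np.concatenate([sig, np.zeros(7)])
fake = np.concatenate([np.zeros(9, dtype=bool), np.ones(7, dtype=bool)])
pooled = graph_max_pool(padded, fake, size=4)
# pooled has 4 entries: the max over each consecutive group of 4 vertices
```

Because the fake nodes are zero, the pooled result for a group made entirely of padding is 0, exactly as the caption describes for maximum pooling.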

Diagram of the generating layer. (a) The pooling order. A red number represents the offset, nodes in the blue circles represent nodes on the graph, nodes in the red circles are fake nodes. (b) Recorded neighbourhood set of each node. The numbers in the table represent the offset of each node after pooling sorting. (c) Data storage of the generating layer proposed in this paper, based on which our method can implement convolution operation by row and pooling operation by column.

Page 92: Geometric Deep Learning

3D Shape Segmentation for graphs #2
SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation
Li Yi, Hao Su, Xingwen Guo, Leonidas Guibas
(Submitted on 2 Dec 2016)
https://arxiv.org/abs/1612.00606
https://github.com/ericyi/SyncSpecCNN
https://youtu.be/ClvLCXQ9Ipw

Our SyncSpecCNN takes a shape graph equipped with vertex functions (i.e. spatial coordinate function) as input and predicts a per-vertex label. The framework is general and not limited to a specific type of output. We show 3D part segmentation and 3D keypoint prediction as example outputs here.

Architecture of our SyncSpecCNN. Spectral convolution is done through first transforming graph vertex functions into their spectral representation and then pointwise modulating it with a set of multipliers. The multiplied signal is transformed back to spatial domain to perform nonlinear operations. We introduce spectral transformer network to synchronize different spectral domains and allow better parameter sharing in spectral convolution. Convolution kernels are parametrized in a dilated fashion for effective multi-scale information aggregation
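The spectral convolution step described above (transform to the spectral domain, modulate pointwise, transform back) can be sketched with a dense eigendecomposition of the graph Laplacian. This is a toy illustration only: SyncSpecCNN additionally learns dilated spectral multipliers and a spectral transformer network, neither of which is shown here.

```python
import numpy as np

def spectral_conv(x, L, multipliers):
    # Graph Fourier transform: project the vertex function x onto the
    # eigenbasis U of the graph Laplacian L, modulate each frequency with
    # a (learnable) multiplier, then transform back to the spatial domain.
    w, U = np.linalg.eigh(L)      # L = U diag(w) U^T
    x_hat = U.T @ x               # forward graph Fourier transform
    y_hat = multipliers * x_hat   # pointwise spectral modulation
    return U @ y_hat              # inverse transform

# toy path graph on 4 vertices
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
x = np.array([1.0, 2.0, 0.0, -1.0])
y = spectral_conv(x, L, np.ones(4))  # identity multipliers recover x
assert np.allclose(y, x)
```

With all-ones multipliers the layer is the identity; a learned multiplier vector acts as a spectral filter on the vertex function.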

Visualization of modulated exponential window function with different dilation parameters in both spectral domain and spatial domain. The same spectral representation could induce spatially different kernel functions, especially when the kernel size is large.

We visualize some segmentation results from our network prediction. The first block shows typical correct segmentations, notice the huge shape variation we can cover. The second to fourth blocks summarize different error patterns we observe in the results.

Page 93: Geometric Deep Learning

Super-resolution and graphs
Light Field Super-Resolution Via Graph-Based Regularization
Mattia Rossi, Pascal Frossard
(Submitted on 9 Jan 2017 (v1), last revised 31 Jul 2017 (this version, v2))
https://arxiv.org/abs/1701.02141

On the other hand, the few super-resolution algorithms explicitly tailored for light field data exhibit significant limitations, such as the need to estimate an explicit disparity map at each view. In this work we propose a new light field super-resolution algorithm meant to address these limitations. We adopt a multi-frame alike super-resolution approach, where the complementary information in the different light field views is used to augment the spatial resolution of the whole light field. We show that coupling the multi-frame approach with a graph regularizer, that enforces the light field structure via nonlocal self similarities, permits to avoid the costly and challenging disparity estimation step for all the views. Extensive experiments show that the new algorithm compares favorably to the other state-of-the-art methods for light field super-resolution, both in terms of PSNR and visual quality.

Also, although the proposed algorithm is meant mainly for light field camera data, where the disparity range is typically small, it is flexible enough to handle light fields with larger disparity ranges too. We also compared our algorithm to a state-of-the-art single-frame super-resolution method based on CNNs [Dong et al. 2014], and showed that taking the light field structure into account allows our algorithm to recover finer details and most importantly avoids the reconstruction of a set of geometrically inconsistent high resolution views

Page 94: Geometric Deep Learning

Inpainting and graphs non-Deep learning methods #1
Morphological PDEs on Graphs for Image Processing on Surfaces and Point Clouds
Abderrahim Elmoataz, François Lozes and Hugues Talbot
ISPRS Int. J. Geo-Inf. 2016, 5(11), 213; doi:10.3390/ijgi5110213

Methods solving Partial Differential Equations (PDEs) directly on point clouds [Liang et al. (2012), Lai et al. (2013)] by using intermediate representations to approximate differential operators on point clouds. Lai et al. (2013) use a local triangulation that requires pre-processing. This pre-processing is needed to estimate differential operators on triangular meshes. This method can then further be categorized as an intrinsic method. Liang et al. (2012) compute a local approximation of the manifold using moving least squares from the k-nearest neighbors. From this local coordinate system, a local metric tensor is computed at each point such that the differentiation on the manifold is simplified. This method can therefore be categorized as an explicit method.

Inpainting of the color on a point cloud. (a) Point cloud to inpaint; (b) result of inpainting with w=1; (c) result of inpainting with w based on the patch defined on point clouds

In this paper, we have adopted the PDE framework, and we have focused on some PDEs-based continuous morphological operators on Euclidean domains: dilation/erosion, mean curvature flows and the Eikonal equation. We have extended their applications for the processing of colored point clouds of geo-informatics 3D data. We briefly presented the construction of a graph from mesh or point cloud and show several examples, such as restoration, denoising, inpainting, object extraction or estimation of the minimal path. We have shown results on geographic data, such as the estimation of the shortest path on a raw point cloud of a 3D city or the segmentation of a LIDAR 3D point cloud based on the colorimetric values. The results demonstrated many potentials of the point cloud approach for the processing of geo-informatics 3D data.
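The dilation/erosion operators mentioned above have a simple discrete analogue on graphs: dilation replaces each vertex value by the maximum over its neighbourhood, erosion by the minimum. A hypothetical sketch (the paper works with PDE-based continuous-scale operators; this is only the one-step discrete counterpart):

```python
import numpy as np

def graph_dilate(f, adj):
    # Discrete morphological dilation on a graph: each vertex takes the
    # maximum of its own value and its neighbours' values. Erosion is the
    # same operation with min instead of max.
    out = f.copy()
    for v, nbrs in enumerate(adj):
        if nbrs:
            out[v] = max(f[v], max(f[u] for u in nbrs))
    return out

adj = [[1], [0, 2], [1, 3], [2]]    # path graph on 4 vertices
f = np.array([0.0, 5.0, 1.0, 2.0])
print(graph_dilate(f, adj))          # -> [5. 5. 5. 2.]
```

Iterating the operator propagates maxima along the graph, which is the behaviour the continuous dilation PDE reproduces in the limit of small steps.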

Nonlocal PDEs on Graphs: From Tug-of-War Games to Unified Interpolation on Images and Point Clouds
Abderrahim Elmoataz, François Lozes, Matthieu Toutain
Journal of Mathematical Imaging and Vision, March 2017, Volume 57, Issue 3, pp 381–401
https://doi.org/10.1007/s10851-016-0683-3

Page 95: Geometric Deep Learning

Inpainting and graphs non-Deep learning methods #2
Nonlocal Inpainting of Manifold-valued Data on Finite Weighted Graphs
Ronny Bergmann, Daniel Tenbrinck
(Submitted on 21 Apr 2017 (v1), last revised 12 Jul 2017 (this version, v2))
https://arxiv.org/abs/1704.06424

Besides an analytic investigation of the convergence of the presented scheme, future work includes further development of numerical algorithms, as well as properties of the ∞-Laplacian for manifold-valued vertex functions on graphs.

With the technological progress in modern data sensors there is an emerging field of processing non-Euclidean data. We concentrate our discussion in the following on manifold-valued data, i.e., each data value lies on a Riemannian manifold. A real example of manifold-valued images is interferometric synthetic aperture radar (InSAR) imaging, where the measured phase-valued data may be noisy and/or incomplete. Sphere-valued data appears, e.g., in directional analysis. Another application is diffusion tensor imaging (DT-MRI), where the diffusion tensors can be represented as 3 × 3 symmetric positive definite matrices, which also constitute a manifold.

Weighted graph Laplacian and image inpainting
Z. Shi, S. Osher, W. Zhu - UCLA CAM Report (16-63), 2016
ftp://ftp.math.ucla.edu/pub/camreport/cam16-61.pdf

Conclusion and Future Work. In this paper, we introduce a novel weighted graph Laplacian method. The numerical results show that the weighted graph Laplacian provides a reliable and efficient method to find a reasonable interpolation on a point cloud in high-dimensional space. On the other hand, it was found that with an extremely low sample rate, the formulation of the harmonic extension may fail [8, 15]. In this case, we are considering using the ∞-Laplacian to compute the interpolation on point clouds.
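The harmonic extension at the heart of such graph Laplacian interpolation schemes can be sketched directly: the unknown values minimize the Dirichlet energy given the known values, which reduces to one linear solve. A minimal NumPy sketch (the weight matrix and toy graph are my own; the paper's method additionally weights the Laplacian differently at sampled points):

```python
import numpy as np

def harmonic_interpolate(W, known_idx, known_vals):
    # Harmonic extension on a weighted graph: unknown values minimize the
    # Dirichlet energy sum_ij w_ij (u_i - u_j)^2 subject to the known
    # values, i.e. solve L_uu u_u = -L_uk u_k for the unknown block.
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    unknown = np.setdiff1d(np.arange(n), known_idx)
    u = np.zeros(n)
    u[known_idx] = known_vals
    A = L[np.ix_(unknown, unknown)]
    b = -L[np.ix_(unknown, known_idx)] @ known_vals
    u[unknown] = np.linalg.solve(A, b)
    return u

# Path graph with unit weights; fixing the endpoints yields a linear ramp.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
u = harmonic_interpolate(W, np.array([0, 3]), np.array([0.0, 3.0]))
assert np.allclose(u, [0.0, 1.0, 2.0, 3.0])
```

On a point cloud, W would be built from k-nearest-neighbour distances, and the known entries are the observed colour or intensity samples being inpainted.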

Page 96: Geometric Deep Learning

Future for 3D shape IMAGE restoration?

Laser Scanner Super-resolution
https://doi.org/10.2312/SPBG/SPBG06/009-015
Cited by 47 articles

Page 97: Geometric Deep Learning

Geometry images and traditional super-resolution techniques?
MemNet: A Persistent Memory Network for Image Restoration
Ying Tai, Jian Yang, Xiaoming Liu, Chunyan Xu
(Submitted on 7 Aug 2017)
https://arxiv.org/abs/1708.02209
https://github.com/tyshiwo/MemNet

The same MemNet structure achieves the state-of-the-art performance in image denoising, super-resolution and JPEG deblocking. Due to the strong learning ability, our MemNet can be trained to handle different levels of corruption even using a single model.

CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training
Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua
(Submitted on 29 Mar 2017)
https://arxiv.org/abs/1703.10155
https://github.com/tatsy/keras-generative

The proposed method can support a wide variety of applications, including image generation, attribute morphing, image inpainting, and data augmentation for training better face recognition models

Page 99: Geometric Deep Learning

Geometric DNNs • implementation options in practice #1: GVNN
ankurhanda/gvnn — gvnn: Geometric Vision with Neural Networks

gvnn is primarily intended for self-supervised learning using low-level vision. It is inspired by the Spatial Transformer Networks (STN) paper that appeared in NIPS in 2015 and its open source code made available by Maxime Oquab. The code is self contained i.e. the original implementation of STN by Maxime is also within the repository.

STNs were mainly limited to applying only 2D transformations to the input. We added a new set of transformations often needed for manipulating data in 3D geometric computer vision. These include the 3D counterparts of those used in the original STN, together with many more new transformations and different M-estimators.

SO3 Layer

Rotations are represented as an so(3) 3-vector. This vector is turned into a rotation matrix via the exponential map. For a more detailed view of the so(3) representation and the exponential map, see Ethan Eade's Lie-algebra tutorial and Tom Drummond's notes on Lie algebras. The reason for choosing the so(3) representation is mainly its appealing properties when linearising rotations (via Taylor series expansion) for iterative image alignment via the classic linearise-solve-update rule. The figure below shows how linearisation for SO(3) amounts to fitting a local plane on the sphere.
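The exponential map from an so(3) 3-vector to a rotation matrix is Rodrigues' formula. A standalone NumPy sketch of the math (gvnn's actual layer is a Torch module; this just shows the computation it performs):

```python
import numpy as np

def so3_exp(w):
    # Rodrigues' formula: map an so(3) 3-vector w to a rotation matrix,
    # R = I + (sin t / t) [w]_x + ((1 - cos t) / t^2) [w]_x^2, t = |w|.
    t = np.linalg.norm(w)
    K = np.array([[0.0, -w[2], w[1]],
                  [w[2], 0.0, -w[0]],
                  [-w[1], w[0], 0.0]])      # skew-symmetric [w]_x
    if t < 1e-8:                            # small-angle: first-order Taylor
        return np.eye(3) + K
    return np.eye(3) + (np.sin(t) / t) * K + ((1 - np.cos(t)) / t**2) * (K @ K)

R = so3_exp(np.array([0.0, 0.0, np.pi / 2]))   # 90° rotation about z
assert np.allclose(R @ np.array([1.0, 0.0, 0.0]), [0.0, 1.0, 0.0])
assert np.allclose(R @ R.T, np.eye(3))          # R is a valid rotation
```

The small-angle branch is what makes the map well behaved for linearisation: near the identity, R ≈ I + [w]_x, which is exactly the local plane fitted on the sphere in the figure.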

Optical Flow

Lens Distortion

Projection Layer