
I am a Senior Research Scientist at Google Research in NYC.

These days, I mainly work on:

  1. Scalable & principled graph machine learning methods.
  2. Methods & tools for understanding unsupervised learning.
  3. Data for Gemini and Gemma model families.

I received my Ph.D. from the University of Bonn, advised by Prof. Emmanuel Müller. I did my master's at Skoltech and my bachelor's at the Higher School of Economics, both in Moscow, Russia. My M.Sc. advisor was Panagiotis Karras. In Russian, my name reads as Антон Цицулин.

For publications, see Google Scholar. For pre-Google code, see GitHub.

Code for papers I write at Google is generally available in our shared GitHub folder.

Feel free to contact me by email: anton at tsitsul dot in.


My interests lie at the intersection of graph algorithms and machine learning. Much of my research is about creating efficient representations of graph-structured objects that can be used in downstream machine learning tasks.

Selected Publications

For the full list, see publications.

Graph Clustering with Graph Neural Networks

JMLR 2023

DMoN bridges graph neural networks and modularity optimization, allowing for efficient clustering of attributed graphs at scale.
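The modularity objective that DMoN relaxes can be illustrated in a few lines. The sketch below computes Newman's hard-assignment modularity of a clustering, which is what DMoN optimizes in a soft, differentiable form inside a GNN; the function names and the toy graph are illustrative, not the paper's code.

```python
import numpy as np

def modularity(adj, labels):
    """Newman modularity Q of a hard clustering of an undirected graph.

    adj: symmetric 0/1 adjacency matrix; labels: cluster id per node.
    Q sums, over same-cluster pairs, the observed edges minus the
    expected edges under a degree-preserving random graph.
    """
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()  # 2 * number of edges
    same_cluster = np.equal.outer(labels, labels)
    expected = np.outer(degrees, degrees) / two_m
    return ((adj - expected) * same_cluster).sum() / two_m

# Two triangles joined by a single edge: separating them scores well.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
q = modularity(adj, np.array([0, 0, 0, 1, 1, 1]))  # ≈ 0.357
```

DMoN replaces the hard labels with a soft cluster-assignment matrix produced by a GNN, so the same quantity can be maximized by gradient descent.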

InstantEmbedding: Efficient Local Node Representations

InstantEmbedding is the first embedding method that can generate the representation of a single node locally, without maintaining one for every node in the graph.

FREDE: Anytime Graph Embeddings

VLDB 2021

To date, no graph embedding work has combined (i) linear space complexity, (ii) a nonlinear transform as its basis, and (iii) nontrivial quality guarantees. In this paper, we introduce FREDE (FREquent Directions Embedding), a graph embedding based on matrix sketching that combines all three desiderata.

The Shape of Data: Intrinsic Distance for Data Distributions

ICLR 2020

The ability to represent and compare machine learning models is crucial in order to quantify subtle model changes, evaluate generative models, and gather insights on neural network architectures. Existing techniques for comparing data distributions focus on global data properties such as mean and covariance. We develop a first-of-its-kind intrinsic and multi-scale method for characterizing and comparing data manifolds, using a lower bound of the spectral variant of the Gromov-Wasserstein inter-manifold distance, which compares all data moments.

Just SLaQ When You Approximate: Accurate Spectral Distances for Web-Scale Graphs

WWW 2020 & Google AI blog

Spectral analysis provides quintessential tools for studying the multi-scale structure of graphs. However, computing the full spectrum of large graphs is computationally prohibitive. We propose SLaQ, an efficient and effective approximation technique for computing spectral distances between graphs with billions of nodes and edges. We derive the corresponding error bounds and demonstrate that accurate computation is possible in time linear in the number of graph edges.
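The stochastic ingredient behind such linear-time spectral approximations is trace estimation from matrix-vector products alone. Below is a minimal sketch of the classic Hutchinson estimator, which SLaQ combines with Lanczos quadrature to estimate traces of matrix functions of the Laplacian; this is the underlying idea, not the paper's implementation, and the names are mine.

```python
import numpy as np

def hutchinson_trace(matvec, n, num_probes=50, rng=None):
    """Estimate tr(M) as the average of v^T M v over random Rademacher
    probes v (entries +/-1). Needs only matvec products with M, so it
    scales to matrices too large to form explicitly."""
    rng = np.random.default_rng(rng)
    estimate = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=n)
        estimate += v @ matvec(v)
    return estimate / num_probes

# For a diagonal matrix every probe hits the trace exactly,
# since v_i^2 = 1 for Rademacher entries.
diag = np.diag([1.0, 2.0, 3.0])
est = hutchinson_trace(lambda v: diag @ v, n=3, rng=0)
```

SLaQ's contribution is doing this for tr(f(L)) with spectral functions f (e.g., the heat kernel) via Lanczos quadrature, with explicit error bounds.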

NetLSD: hearing the shape of a graph

KDD 2018, Audience Appreciation Award runner-up

Comparing graphs is one of the cornerstones of graph analytics. Expressive and efficient measures for graph comparison are rare, and none scaled to both large graphs and large graph collections. We propose the Network Laplacian Spectral Descriptor (NetLSD): the first theoretically grounded and efficiently computable graph representation that allows for straightforward comparison of large graphs.
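NetLSD's descriptor is a heat-trace curve over the normalized Laplacian spectrum. The sketch below computes it by dense eigendecomposition for a toy graph; the paper's contribution is approximating this trace when a full eigendecomposition is infeasible, so treat this as an illustration of the quantity, not the method.

```python
import numpy as np

def heat_trace(adj, times):
    """h(t) = sum_i exp(-t * lambda_i), where lambda_i are eigenvalues
    of the normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    Graphs are compared by comparing these curves across t."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(lap)
    return np.array([np.exp(-t * eigvals).sum() for t in times])

# Triangle graph: normalized Laplacian eigenvalues are 0, 1.5, 1.5.
triangle = np.ones((3, 3)) - np.eye(3)
sig = heat_trace(triangle, [0.0, 0.25, 1.0])  # sig[0] equals the node count
```

Small t emphasizes local structure and large t global structure, which is what makes the signature multi-scale.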

VERSE: Versatile Graph Embeddings from Similarity Measures

WWW 2018

Node representations (aka graph embeddings) are a key tool for modern machine learning on graphs. Past research has addressed the problem of extracting such representations by adapting methods from word embeddings to graphs, without defining a clearly interpretable graph-related objective. We show that the objectives used in past research implicitly utilize similarity measures among graph nodes. We propose VERtex Similarity Embeddings (VERSE), a simple and efficient method that derives graph embeddings explicitly preserving the distribution of a selected vertex-to-vertex similarity measure, such as personalized PageRank.
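The similarity distribution VERSE preserves can itself be sketched simply. Below is a plain power-iteration computation of personalized PageRank for one seed node; VERSE trains embeddings whose softmax similarities match such distributions (via sampling rather than explicit computation), so this illustrates the target, not the training procedure. Function and variable names are mine.

```python
import numpy as np

def personalized_pagerank(adj, source, alpha=0.85, iters=100):
    """PPR distribution of `source`: a random walk that follows edges
    with probability alpha and restarts at `source` otherwise."""
    n = len(adj)
    out_deg = adj.sum(axis=1, keepdims=True)
    transition = np.divide(adj, out_deg,
                           out=np.zeros_like(adj), where=out_deg > 0)
    restart = np.zeros(n)
    restart[source] = 1.0
    ppr = restart.copy()
    for _ in range(iters):
        ppr = (1 - alpha) * restart + alpha * ppr @ transition
    return ppr

# Path graph 0-1-2: probability mass concentrates near the seed node.
path = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
scores = personalized_pagerank(path, source=0)
```

Each row of the result is a probability distribution over nodes, which is why a KL-style objective between it and the embedding softmax is natural.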