Representation Tradeoffs for Hyperbolic Embeddings
11 Dec 2018Introduction

The paper describes a combinatorial approach to embed trees into hyperbolic spaces without performing optimization.

The resulting mechanism is analyzed to obtain dimensionalityprecision tradeoffs.

To embed any metric spaces in the hyperbolic spaces, a hyperbolic generalization of the multidimensional scaling (hMDS) is proposed.
Preliminaries

Hyperbolic Spaces

Have the “tree” like property ie the shortest path between a pair of points is almost the same as the path through the origin.

Generally, Poincare ball model is used given its advantages like conformity to the Euclidean spaces.


Fidelity Measures

Mean Average Precision  MAP
 A local metric that ranks between distances of the immediate neighbors.

Distortion
 A global metric that depends on the underlying distances and not just the local relationship between distances.

Combinatorial Construction for embedding hierarchies into Hyperbolic spaces

Embed the given graph G = (V, E) into a tree T.

Embed the tree T into the poincare ball H_{d} of dimensionality d.
Sarkar’s construction to embed points in a 2d Poincare ball

Consider two points a and b (from the tree) where b is the parent of a.

Assume that a is embedded as f(a) and b is embedded as f(b) and the children of a needs to be embedded.

Reflect f(a) and f(b) across a geodesic such that f(a) is mapped to 0 (origin) while f(b) is mapped to some new point z.

Children of a are placed at points y_{i} which are equally placed around a circle of radius (e^{r}  1) / (e^{r} + 1) and maximally seperated from z, where r is the scaling factor.

Then all the points are reflected back across the geodesic so that all children are at a distance r from f(a).

To embed the tree itself, place the root node at the origin, place its children around it in a circle, then place their children and so on.

In this construct, precision scales logarithmically with the degree of the tree but linearly with the maximum path length.
ddimensional hyperbolic spaces

In the ddimensional space, the points are embedded into hyperspheres (instead of circles).

The number of children node that can be placed for a particular angle grows with the dimension.

Increasing dimension helps with bushy trees (with high node degree).
Hyperbolic multidimensional scaling (hMDS)

Given the pairwise distance from a set of points in the hyperbolic space, how to recover the points?

The corresponding problem in the Euclidean space is solved using MDS.

A variant of MDS called as hMDS is proposed.

MDS makes a centering assumption that points have 0 mean. In hMDS, a new mean (called as the pseudoEuclidean mean) is introduced to enable recovery via matrix factorization.

Instead of the Poincare model, the hyperboloid model is used (though the points can be mapped back and forth).
pseudoEuclidean Mean
 A set of points can always be centered without affecting their pairwise distance by simply finding their mean and sending it to 0 via isometry
Recovery via matrix factorization

Given the pairwise distances, a new matrix Y is constructed by applying cosh on the pairwise distances.

Running PCA on Y recovers X up to rotation.
Dimensionality Reduction with PGA (Principal Geodesic Analysis)

PGA is the counterpart of PCA in the hyperbolic spaces.

First the Karcher mean of the given points is computed.

All points x_{i} are reflected so that their mean is 0 in the Poincare disk model.

Combining that with Euclidean reflection formula and hyperbolic metrics leads to a nonconvex loss function which can be optimized using gradient descent algorithm.
Experiments

Datasets
 Trees: fully balanced and phylogenic trees expressing genetic heritage.
 Treelike hierarchy: WordNet hypernym and graph of Ph.D. advisoradvisee relationships.
 Notree like disease relationships, proteins interactions etc

Results
 Combinatorial construction outperforms approaches based on optimization in terms of both MAP and distortion.
 eg on WordNet, the combinatorial approach achieves a MAP of 0.989 with just 2 dimensions while the previous best was 0.87 with 200 dimensions.