Flow and Language Model based Embedding (FLaME)
Roland Laboulaye

Abstract
Creating easily comparable sentence embeddings requires reducing both the vocabulary and sequential dimensions. Typical approaches to latent-space embeddings for natural language, such as recurrent VAEs, are difficult to train because they rely on iterative generation and on the ELBO approximation to the likelihood. Normalizing flows avoid this approximation by performing density estimation with invertible transformations, allowing direct maximum likelihood estimation. We propose a normalizing flow conditioned on a neural language model that maps conditional probability in latent space to conditional probability in language space. The model learns by transforming sentences into latent embeddings, avoiding slow iterative generation during training. We constrain the model to map each word in a sentence to a similar latent embedding, producing a centered sentence embedding that allows for easy cross-sentence comparison.
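To illustrate why flows permit exact maximum likelihood rather than an ELBO bound, here is a minimal sketch of a single conditional affine flow layer with its change-of-variables log-density. This is an illustration of the general technique, not the paper's architecture: the function name `conditional_affine_log_prob` and the toy `shift`/`log_scale` parameters (which in a conditional flow would be produced by a network reading the language-model context) are assumptions for the example.

```python
import numpy as np

def conditional_affine_log_prob(x, shift, log_scale):
    """Exact log-density of x under an affine flow with a standard-normal base.

    Invertible transform: z = (x - shift) * exp(-log_scale)
    Change of variables:  log p(x) = log N(z; 0, I) + log|det dz/dx|
    For a diagonal affine map, log|det dz/dx| = -sum(log_scale).
    """
    z = (x - shift) * np.exp(-log_scale)
    log_base = -0.5 * np.sum(z ** 2 + np.log(2.0 * np.pi))  # standard-normal log-density
    log_det = -np.sum(log_scale)                            # Jacobian correction term
    return log_base + log_det

# In a conditional flow, shift and log_scale would be emitted by a
# conditioning network; fixed toy values stand in for them here.
x = np.array([0.5, -1.0, 2.0])
shift = np.array([0.0, 0.0, 1.0])
log_scale = np.array([0.0, 0.1, -0.2])
log_likelihood = conditional_affine_log_prob(x, shift, log_scale)
```

Because the transform is invertible and its Jacobian determinant is cheap to compute, the log-likelihood above is exact, so training can minimize the negative log-likelihood directly instead of optimizing a variational lower bound.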