
A Quantum Many-body Wave Function Inspired Language Modeling Approach #22

flrngel commented Sep 17, 2018

Abstract

  • quantum probability theory
  • quantum-inspired LMs have two limitations
    • they do not take the interaction among words with multiple meanings into account
    • they lack a theoretical foundation accounting for their effective training parameters
  • QMWF (Quantum Many-body Wave Function)
    • adopts the tensor product to model the interaction among words (toy example below)
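
A toy illustration of that last bullet (my example with hypothetical 2-dimensional word states, not the paper's setup): the tensor product of two word vectors yields one amplitude per compound meaning, and the squared amplitudes still sum to one.

```python
import numpy as np

# Two toy word states over a 2-dim "meaning" basis, each unit-normalized.
psi1 = np.array([0.6, 0.8])   # word 1: 0.6|e1> + 0.8|e2>
psi2 = np.array([0.8, 0.6])   # word 2: 0.8|e1> + 0.6|e2>

# Tensor (outer) product: one amplitude per compound meaning (e_i, e_j).
joint = np.outer(psi1, psi2)
print(joint)                  # amplitudes for e1e1, e1e2, e2e1, e2e2
print((joint ** 2).sum())     # squared amplitudes still sum to 1.0
```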

1. Introduction

  • two LM approaches
    • classical
      • must increase the number of parameters to be estimated to capture compound dependencies
    • quantum-inspired
      • estimates a density matrix
      • encodes both single words and compound words (QLM)
  • QLM and NNQLM
    • NNQLM wraps QLM in an end-to-end neural network structure
    • neither models the complex interaction among words with multiple meanings
    • neither accounts for its effective training parameters in a principled manner
  • QMWF
    • the quantum many-body wave function can model the interaction among many spinful particles
    • a convolutional neural network architecture can be mathematically derived within this quantum-inspired language modeling approach
    • outperforms its quantum LM counterparts (QLM, NNQLM)

In this paper,

  1. propose a Quantum Many-body Wave Function based Language Modeling approach that can represent the complex interaction among words
  2. show the fundamental connection between the QMWF representation and convolutional neural networks
  3. evaluate the approach on QA datasets

2. Quantum Preliminaries

2.1. Basic notation and concepts

(equation screenshots from the paper omitted; the standard notation is recapped below)
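
For reference, the standard concepts this subsection screenshots (my recap, assuming real amplitudes as quantum-inspired LMs usually do): a state is a unit vector written as a ket, and the probability of a basis event is the squared projection onto that basis vector.

```latex
% a state is a unit vector over basis kets |e_h>
|\psi\rangle = \sum_{h=1}^{M} a_h\,|e_h\rangle, \qquad \sum_{h=1}^{M} a_h^2 = 1
% probability of observing basis state e_h: the squared projection
p(e_h) = |\langle e_h \,|\, \psi \rangle|^2 = a_h^2
```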

2.2. Quantum Many-body Wave Functions

(equation screenshot from the paper omitted; a recap follows below)
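
Recap in my notation (not the paper's exact equation): the joint space of N particles, or words, is the tensor product of the single-particle spaces, so a general many-body state carries a full coefficient tensor with M^N entries; a product state is the rank-1 special case.

```latex
|\Psi\rangle = \sum_{h_1=1}^{M} \cdots \sum_{h_N=1}^{M}
  \mathcal{A}_{h_1 \dots h_N}\,
  |e_{h_1}\rangle \otimes \cdots \otimes |e_{h_N}\rangle
```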

3. Quantum many-body wave function inspired language modeling

3.1. Basic intuitions and architecture

  • the QMWF representation can model the probability distribution over compound meanings (see below)
    • the distribution depends on the choice of basis vectors
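
Concretely (my phrasing of the intuition, with real amplitudes as above): once a basis is fixed, every compound meaning is a basis vector of the joint space, and its probability is the squared amplitude the wave function assigns to it.

```latex
p(e_{h_1}, \dots, e_{h_N}) = \mathcal{A}_{h_1 \dots h_N}^{\,2},
\qquad
\sum_{h_1, \dots, h_N} \mathcal{A}_{h_1 \dots h_N}^{\,2} = 1
```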

3.2. Language representation and projection via many-body wave function

  • Local representation by product state
    (equation screenshots from the paper omitted)
  • Global representation for all possible compound meanings
    (equation screenshots from the paper omitted; the projection of local onto global is sketched after this list)
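
Putting the two representations together (my recap, with the notation assumed above): the sentence is the local product state, the global state carries the full coefficient tensor, and the model scores a sentence by projecting one onto the other.

```latex
% local (sentence) state: product of word states
|\Psi_s\rangle = \bigotimes_{i=1}^{N} |\psi_i\rangle,
\qquad |\psi_i\rangle = \sum_{h=1}^{M} a_{i,h}\,|e_h\rangle
% projection of the local state onto the global state (real amplitudes)
\langle \Psi_s | \Psi \rangle
  = \sum_{h_1, \dots, h_N} \mathcal{A}_{h_1 \dots h_N} \prod_{i=1}^{N} a_{i,h_i}
```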

3.3. Projection realized by convolutional neural network

(equation screenshots from the paper omitted; a NumPy sketch of the CNN correspondence follows)
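
The gist as I understand it (a minimal NumPy sketch with made-up sizes, assuming the coefficient tensor admits a rank-K CP decomposition with one shared filter vector per channel): the projection then factorizes into per-word inner products with K filter vectors (a width-1 convolution), product pooling over positions, and a weighted sum over channels.

```python
import numpy as np

# Made-up sizes: N words, M-dim meaning basis, K channels (CP rank).
N, M, K = 5, 8, 4
rng = np.random.default_rng(0)

# Word amplitudes: each row a unit vector (sum of squares = 1).
X = rng.standard_normal((N, M))
X /= np.linalg.norm(X, axis=1, keepdims=True)

T = rng.standard_normal((K, M))   # per-channel filter vectors t^(k)
a = rng.standard_normal(K)        # channel mixing weights a_k

# <Psi_s|Psi> = sum_k a_k * prod_i <psi_i, t^(k)>
inner = X @ T.T                   # (N, K): width-1 convolution over positions
pooled = np.prod(inner, axis=0)   # (K,):  product pooling over positions
projection = float(a @ pooled)    # weighted sum over channels
print(projection)
```

Product pooling over positions is the piece that differs from an ordinary CNN's max/average pooling; it is what realizes the product over words in the projection.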

6. Experiments

(screenshot of the experimental results omitted)

My notes

  • Can we think of QMWF as a generalized version of the attention mechanism? (rough comparison below)
    • the sum of a_i^2 is 1
    • M = 1?
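
Rough comparison behind that question (my speculation, not from the paper): the squared amplitudes sum to one just as attention weights do, so the amplitudes act like an attention distribution that is normalized by construction.

```latex
% attention: convex weights over values
\mathrm{out} = \sum_i \alpha_i v_i, \qquad \alpha_i \ge 0,\ \sum_i \alpha_i = 1
% QMWF: squared amplitudes form the analogous distribution
|\psi\rangle = \sum_i a_i\,|e_i\rangle, \qquad \sum_i a_i^2 = 1
```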