Despite the buzz surrounding deep neural network (DNN) models for information retrieval, the literature still lacks a systematic, basic investigation of how, in general, we can model the retrieval problem using neural networks.

By modeling the retrieval problem in the context of neural networks, I mean the general way we frame the problem with regard to the essential components of a neural network: what we take as the objective function, which kind of architecture we employ, how we feed the data to the network, and so on.

In this post, I try to present different general architectures that can be considered for modeling the retrieval problem. First, I provide a categorization of different models based on their *objective function*, and then I discuss different approaches with regard to their *inference time*. Note that in the figures I use a fully connected feed-forward neural network, but it can be replaced by more complex or more expressive neural models such as LSTMs or CNNs.

### Categorizing Models by the Type of Objective Function

In terms of the objective function being optimized, the retrieval problem can be formulated in the neural network framework in three general ways: *Retrieval as Regression*, *Retrieval as Ranking*, and *Retrieval as Classification*. I am going to explain these models and discuss their pros and cons.

#### Retrieval as Regression

The first architecture frames the retrieval problem as a **scoring problem**, which can be phrased as a **regression problem**. In the regression model (the leftmost model in the figure above), given a query and a document, we aim to generate a score, which could, for example, be interpreted as the probability that the document is relevant to the query. In this model, the network learns to produce calibrated scores, and these scores are then used to rank documents. This model is also referred to as the point-wise model in the learning-to-rank literature.
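
To make the point-wise setup concrete, here is a minimal sketch in Python with NumPy. It is only an illustration under my own assumptions, not a reference implementation: the query and the document are assumed to be already encoded as fixed-size vectors, the scorer is a one-hidden-layer feed-forward network with a sigmoid output, and the loss is a plain squared error against a relevance label. All names (`init_params`, `score`, `pointwise_loss`, `rank`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(input_dim, hidden_dim):
    """Randomly initialise a one-hidden-layer feed-forward scorer (assumed architecture)."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(input_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.1, size=hidden_dim),
        "b2": 0.0,
    }

def score(params, query_vec, doc_vec):
    """Return a relevance score in (0, 1) for a single (query, document) pair."""
    x = np.concatenate([query_vec, doc_vec])               # feed the pair as one input
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])   # ReLU hidden layer
    logit = h @ params["w2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logit))                    # sigmoid: score read as a probability

def pointwise_loss(params, query_vec, doc_vec, label):
    """Squared error between the predicted score and the relevance label."""
    return (score(params, query_vec, doc_vec) - label) ** 2

def rank(params, query_vec, doc_vecs):
    """Score every document independently, then sort by descending score."""
    scores = np.array([score(params, query_vec, d) for d in doc_vecs])
    order = np.argsort(-scores)
    return order, scores

# Toy usage: one 4-dimensional query and three 4-dimensional documents.
params = init_params(input_dim=8, hidden_dim=4)
query = np.ones(4)
docs = rng.normal(size=(3, 4))
order, scores = rank(params, query, docs)
```

Note that the loss is defined on one query-document pair at a time, and ranking only happens afterwards by sorting the independent scores. That is what makes this a point-wise model.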
