Learning to Transform, Combine, and Reason in Open-Domain Question Answering
Our paper "Learning to Transform, Combine, and Reason in Open-Domain Question Answering", with Hosein Azarbonyad, Jaap Kamps, and Maarten de Rijke, has been accepted as a long paper at the 12th ACM International Conference on Web Search and Data Mining (WSDM 2019). \o/
We have all come to expect direct answers to complex questions from search systems over large open-domain knowledge sources such as the Web. Open-domain question answering is therefore a critical task for building systems that address our complex information needs.
To be precise, open-domain question answering is the task of answering a user's question in the form of a short text, rather than a list of relevant documents, using open and available external sources.
Most open-domain question answering systems described in the literature first retrieve relevant documents or passages, select one or a few of them as the context, and then feed the question and the context to a machine reading comprehension system to extract the answer.
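To make this standard pipeline concrete, here is a minimal sketch in Python of the retrieve-then-read data flow. It is an illustration under loose assumptions, not the method from our paper: the toy corpus, the term-overlap retriever, and the stub `read` function are all hypothetical stand-ins, where a real system would use a full-text search engine for retrieval and a trained machine reading comprehension model to extract the answer span.

```python
from collections import Counter
from typing import List

# Hypothetical toy corpus standing in for an open-domain source like the Web.
CORPUS = [
    "Pablo Picasso was a Spanish painter, sculptor and printmaker.",
    "Picasso co-founded the Cubist movement together with Georges Braque.",
    "The Eiffel Tower is a wrought-iron lattice tower in Paris.",
]

def retrieve(question: str, docs: List[str], k: int = 2) -> List[str]:
    """Toy retriever: rank documents by term overlap with the question."""
    q_terms = Counter(question.lower().split())
    def score(doc: str) -> int:
        return sum(q_terms[t] for t in doc.lower().split() if t in q_terms)
    return sorted(docs, key=score, reverse=True)[:k]

def select_context(ranked_docs: List[str]) -> str:
    """Select one or a few top-ranked documents as the reading context."""
    return " ".join(ranked_docs)

def read(question: str, context: str) -> str:
    """Stub reader: a trained MRC model would extract an answer span here.

    Placeholder: return the context sentence sharing the most question terms.
    """
    sentences = context.split(". ")
    q_terms = set(question.lower().split())
    return max(sentences, key=lambda s: len(q_terms & set(s.lower().split())))

# Usage: retrieve, select a context, then read off an answer.
question = "Who co-founded the Cubist movement?"
context = select_context(retrieve(question, CORPUS))
print(read(question, context))
```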
However, the information needed to answer complex questions is not always contained in a single, directly relevant document that is ranked high. In many cases, there is a need to take a broader context into account, e.g., by considering low-ranked documents that are not immediately relevant, combining information from multiple documents, and reasoning over multiple facts from these documents to infer the answer.
Why should we take a broader context into account?
To better understand why taking a broader context into account can be necessary or useful, let's consider an example. Assume that a user asks this question: "Who is the Spanish artist, sculptor and draughtsman famous for co-founding the Cubist movement?"
We can use a search engine to retrieve the top-k relevant documents. The figure below shows the question along with a couple of retrieved documents.