PhD Thesis

I defended my PhD dissertation “cum laude” (the highest distinction in the Netherlands) on Friday, February 28, 2020, at 10:00 AM in the Agnietenkapel.


My PhD thesis is titled “Learning with Imperfect Supervision for Language Understanding”. You can download it here, or get it from the UvA repository.


Humans learn to solve complex problems and uncover underlying concepts and relations given limited, noisy, or inconsistent observations, and draw successful generalizations based on them. This rests largely on the poverty of the stimulus argument, or what is sometimes called Plato’s problem: “How do we know so much when the evidence available to us is so meagre?”

In contrast, the success of today’s data-driven machine learning models is often strongly correlated with the amount of available high-quality labeled data, and teaching machines using imperfect supervision remains a key challenge. In practice, however, for many applications, large-scale, high-quality training data is not available, which highlights the increasing need for models that can learn complex tasks with imperfect supervision, i.e., where the learning process is based on imperfect training samples.

When designing learning algorithms, purely data-driven learning, which relies only on previous experience, does not seem to be able to learn generalizable solutions. Similar to humans’ innately primed learning, having part of the knowledge encoded in the learning algorithms, in the form of strong or weak biases, can help in learning solutions that generalize better to unseen samples.

In this thesis, we focus on the problem of the poverty of the stimulus for learning algorithms. We argue that even noisy and limited signals can contain a great deal of valid information that can be incorporated, along with prior knowledge and biases encoded into learning algorithms, in order to solve complex problems. We improve the process of learning with imperfect supervision by (i) employing prior knowledge in learning algorithms, (ii) augmenting data and learning to learn how to better use the data, and (iii) introducing inductive biases to learning. These general ideas are, in fact, the key ingredients for building any learning algorithm that can generalize beyond (imperfections in) the observed data.

We concentrate on language understanding and reasoning, which is one of the extraordinary cognitive abilities of humans as well as a pivotal problem in artificial intelligence. We try to improve the learning process in more principled ways than ad hoc, domain-specific, or task-specific tricks for improving the output. We investigate our ideas on a wide range of sequence modeling and language understanding tasks.

Here are the slides of the layman’s talk I gave at the start, and this is a photo from the day 🙂

Mostafa as a Dr!

About the cover

The cover is part of a painting by Reza Sedighian that was presented in the Nuance exhibition at the Emkan gallery from May 4 to 14, 2018.

Nuance (Fr. nuer, “to shade”) means a shade of color or meaning, “a delicate variation”.

We live in a world where subtlety and nuance tend to be overwhelmed by visual, auditory and ideological noise. In this world, we want to escape our own confines.

This is not to find a respite from the noise, but in order to awaken from it.