The BC3-Network Corpus is a directed graph representing the social network of email communication. In this graph, nodes represent participants, and for each email in the dataset, the node of the sender is connected to the node of each recipient of that email. The graph can be used in many types of research, including Expert Finding and Conversation Thread Reconstruction.
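As a rough illustration of the construction described above, the following minimal sketch turns a list of emails into directed sender-to-recipient edges. The `Email` record, its field names, and the sample participants are hypothetical and are not the corpus schema.

```python
from collections import namedtuple

# Hypothetical email record; the field names are illustrative only.
Email = namedtuple("Email", ["sender", "recipients"])


def build_edges(emails):
    """Return one directed (sender, recipient) edge per recipient of each email."""
    edges = []
    for email in emails:
        for recipient in email.recipients:
            edges.append((email.sender, recipient))
    return edges


# Toy data, not taken from the corpus.
emails = [
    Email("alice", ["bob", "carol"]),
    Email("bob", ["alice"]),
    Email("alice", ["bob"]),
]

edges = build_edges(emails)
# Every participant that appears as a sender or recipient becomes a node.
nodes = {participant for edge in edges for participant in edge}
```

Repeated sender-recipient pairs are kept as separate edges here, so the edge list behaves like a directed multigraph.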
The original BC3 corpus contains 160 IDs, so the original BC3-Network has 160 nodes. The social network graph has 424 edges in total; if we treat it as a weighted graph, it has 319 edges, with a maximum weight of 8.
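One plausible reading of the two edge counts is that the total counts every sender-recipient edge, while the weighted graph merges repeated pairs into a single edge whose weight is the number of repetitions. The sketch below shows that reading on toy data; the weighting scheme is an assumption, and the names are illustrative.

```python
from collections import Counter


def to_weighted(edges):
    """Merge repeated (sender, recipient) pairs; the count becomes the edge weight."""
    return Counter(edges)


# Toy edge list, not taken from the corpus.
edges = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "alice"),
    ("alice", "bob"),
]

weighted = to_weighted(edges)
total_edges = sum(weighted.values())   # edges counted with repetition
distinct_edges = len(weighted)         # edges in the weighted graph
max_weight = max(weighted.values())    # heaviest edge
```

Note that direction is preserved: `("alice", "bob")` and `("bob", "alice")` remain distinct weighted edges.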
However, the original BC3 network cannot show all relationships among participants, because BC3 is a subset of W3C. Some applications need to analyze all of the participants' communications, since every communication can be informative. For example, to estimate the closeness of participants using neighborhood overlap, not only direct relationships but also indirect communications matter. Hence, we extract the complete social network of the BC3 participants from the W3C social network and call it the Extended-BC3-Network corpus. In the Extended-BC3-Network, every node either is one of the nodes in the BC3-Network or has a direct link to at least one BC3-Network node. The Extended-BC3-Network graph has 6807 nodes and 21388 edges in total. If we treat the network as a weighted graph, it has 18247 edges.
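The extension step and the neighborhood-overlap example can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to build the corpus: it keeps every edge between retained nodes, treats edges as undirected when computing overlap, and uses the common definition of overlap (shared neighbors divided by all neighbors of either endpoint, excluding the endpoints themselves). All names and data are hypothetical.

```python
def extend_network(edges, seeds):
    """Keep seed nodes plus any node directly linked to a seed (assumed rule)."""
    seeds = set(seeds)
    kept = set(seeds)
    for src, dst in edges:
        if src in seeds or dst in seeds:
            kept.update((src, dst))
    # Assumption: retain every edge whose endpoints are both kept.
    sub_edges = [(s, d) for s, d in edges if s in kept and d in kept]
    return kept, sub_edges


def neighborhood_overlap(edges, a, b):
    """Shared neighbors of a and b over all their neighbors, ignoring direction."""
    neighbors = {}
    for s, d in edges:
        neighbors.setdefault(s, set()).add(d)
        neighbors.setdefault(d, set()).add(s)
    na = neighbors.get(a, set()) - {a, b}
    nb = neighbors.get(b, set()) - {a, b}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Toy "full" network and seed participants, not taken from W3C or BC3.
full_edges = [
    ("alice", "bob"), ("bob", "carol"), ("alice", "dave"),
    ("dave", "bob"), ("dave", "erin"), ("erin", "frank"),
]
seeds = {"alice", "bob", "carol"}

ext_nodes, ext_edges = extend_network(full_edges, seeds)
seed_edges = [(s, d) for s, d in full_edges if s in seeds and d in seeds]

overlap_seed_only = neighborhood_overlap(seed_edges, "alice", "bob")
overlap_extended = neighborhood_overlap(ext_edges, "alice", "bob")
```

In this toy case the overlap between `alice` and `bob` is 0.0 when only seed-to-seed edges are visible and 0.5 in the extended network, because the shared neighbor `dave` lies outside the seed set.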
Citing the BC3-Network Corpus:
When citing or discussing the BC3-Network or Extended-BC3-Network corpora, please reference these papers:
- Mostafa Dehghani, A. Shakery, M. Asadpour, and A. Koushkestani, "A Learning Approach for Email Conversation Thread Reconstruction", Journal of Information Science (JIS), Volume 39, Issue 6, 2013, pp. 846-863. [ACM-DL Link]
- Mostafa Dehghani, M. Asadpour, and A. Shakery, "An Evolutionary-Based Method for Reconstructing Conversation Threads in Email Corpora", in Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM'12), 2012. [ACM-DL Link]
Thanks to Arash Koushkestani for making the preparation of this corpus possible!
The ConThread-BC3 Corpus by Mostafa Dehghani is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at http://www.cs.ubc.ca/labs/lci/bc3.html. Here you can download the BC3-Network and the Extended-BC3-Network in GraphML format.
If you have any questions, ideas or suggestions, please do not hesitate to contact me!