Trust, but Verify: Predicting Contribution Quality for Knowledge Base Construction and Curation

Eugene Agichtein
Evgeniy Gabrilovich
Panagiotis Ipeirotis
Chun How Tan

Venue: Seventh ACM Conference on Web Search and Data Mining (WSDM 2014)
Feb 2014
Status: Refereed
Type: Conference
Acceptance rate: 64/356 (18%)

The largest publicly available knowledge repositories, such as Wikipedia and Freebase, owe their existence and growth to volunteer contributors around the globe. While the majority of contributions are correct, errors can still creep in, due to editors’ carelessness, misunderstanding of the schema, malice, or even lack of accepted ground truth. If left undetected, inaccuracies often degrade the experience of users and the performance of applications that rely on these knowledge repositories. We present a new method, CQUAL, for automatically predicting the quality of contributions submitted to a knowledge base. Significantly expanding upon previous work, our method holistically exploits a variety of signals, including the user’s domains of expertise as reflected in her prior contribution history, and the historical accuracy rates of different types of facts. In a large-scale human evaluation, our method exhibits precision of 91% at 80% recall. Our model verifies whether a contribution is correct immediately after it is submitted, significantly alleviating the need for post-submission human reviewing.
