This repository contains the code for our submission to Kaggle's Quora Question Pairs competition, in which we ranked in the top 25%. A detailed report on the project can be found here.
train.csv contains about 400k question pairs, each with a label (duplicate or not), and
test.csv contains about 2,300k unlabelled question pairs. Both files can be found here.
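As a rough illustration of the data layout, the sketch below reads labelled pairs and reports the share of duplicates. The column names (`question1`, `question2`, `is_duplicate`) are assumed from the Kaggle competition files rather than taken from this repository's code, and `load_pairs` and `duplicate_rate` are hypothetical helper names. It runs on a tiny in-memory sample so it does not need the real files.

```python
import csv
import io


def load_pairs(handle):
    """Read labelled question pairs from an open train.csv-style file."""
    pairs = []
    for row in csv.DictReader(handle):
        # Column names are an assumption based on the Kaggle data files.
        pairs.append((row["question1"], row["question2"], int(row["is_duplicate"])))
    return pairs


def duplicate_rate(pairs):
    """Fraction of pairs labelled as duplicates (0.0 for an empty list)."""
    if not pairs:
        return 0.0
    return sum(label for _, _, label in pairs) / len(pairs)


# Tiny in-memory sample so the sketch runs without the real files.
SAMPLE = io.StringIO(
    "id,qid1,qid2,question1,question2,is_duplicate\n"
    '0,1,2,"How do I learn Python?","What is the best way to learn Python?",1\n'
    '1,3,4,"What is GloVe?","How tall is Mount Everest?",0\n'
)

sample_pairs = load_pairs(SAMPLE)
print(len(sample_pairs), duplicate_rate(sample_pairs))
```

To try it on the real data, pass a file opened with `open(path, newline="", encoding="utf-8")` instead of `SAMPLE`, where `path` points at your copy of train.csv.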
First, place
test.csv (see the Data section above) and the pre-trained GloVe embeddings in the
input folder. You can download the embeddings from here. Then run the bash script:
Install them using pip.