A Python script for locating cross-referenced pieces of information in related documents (e.g. articles from different outlets covering the same news story). The original news dataset is from Kaggle.
Run a demo
python demo.py
The script performs the following:
- Reads the documents and parses them into sentences.
- Uses a naive genetic algorithm (GA) to group sentences into "meaningful" pieces of information (POIs).
- Computes the claim similarity between POIs.
- Creates an undirected graph from the POIs.
- Computes the cliques in the graph, which correspond to cross-referenced pieces of information.
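
The steps above can be illustrated with a minimal, standard-library-only sketch. It is not the code in this repository: the function names (`split_sentences`, `jaccard`, `build_graph`, `maximal_cliques`) are hypothetical, the GA grouping step is skipped so that every sentence counts as its own POI, and a simple token-overlap (Jaccard) score with an arbitrary 0.5 threshold stands in for the claim similarity.

```python
import re
from itertools import combinations


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def jaccard(a, b):
    """Token-overlap score; an assumed stand-in for the claim similarity."""
    ta = set(re.findall(r"\w+", a.lower()))
    tb = set(re.findall(r"\w+", b.lower()))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def build_graph(pois, threshold=0.5):
    """pois is a list of (doc_id, text) pairs.

    Returns an undirected graph as an adjacency dict. Two POIs are linked
    when they come from different documents and score >= threshold.
    """
    adj = {i: set() for i in range(len(pois))}
    for i, j in combinations(range(len(pois)), 2):
        different_docs = pois[i][0] != pois[j][0]
        if different_docs and jaccard(pois[i][1], pois[j][1]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy input: three "outlets" reporting the same event (made-up text).
docs = {
    "a": "The mayor opened the new bridge on Monday. Traffic was light.",
    "b": "The mayor opened the new bridge on Monday. Critics called it too costly.",
    "c": "On Monday the mayor opened the new bridge. Rain fell all day.",
}

# Each sentence is treated as one POI here (the GA grouping is skipped).
pois = [(doc_id, s) for doc_id, text in docs.items() for s in split_sentences(text)]
graph = build_graph(pois)

# Cliques of size >= 2 are the cross-referenced pieces of information.
cross_refs = [c for c in maximal_cliques(graph) if len(c) >= 2]
for clique in cross_refs:
    print([pois[i] for i in clique])
```

On this toy input the only clique with more than one member contains the bridge-opening sentence from each of the three documents. In practice a graph library could replace the hand-written clique enumeration.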
If you use the code, please cite the following publication:
D. Bountouridis, M. Marrero, N. Tintarev, C. Hauff, Explaining Credibility in News Articles using Cross-Referencing, 2018 Workshop on ExplainAble Recommendation and Search (EARS 2018)