woog2ee/NLP-and-Information-Retrieval

NLP-and-Information-Retrieval

CAU 2022-1 NLP and Information Retrieval Project Repository
Check out the presentation at Link

👪 Teammates

  • Team name: Tabloid Discriminator (찌라시 판별기)
  • Seunguk Yu: School of Computer Science & Engineering in CAU
  • Minju Kim: School of Computer Science & Engineering in CAU
  • Hunseok Jeong: School of Computer Science & Engineering in CAU

💡 Prototype

Overall Flow
Image

News Search with Summarization
Image

News Search with Evaluation
Image

Terminal Status
Image

🚂 Pipeline

1. Data Crawling & Preprocessing

Crawling selected categories of Daum News with Selenium
Preprocessing and tokenizing the text with KoNLPy to reflect the characteristics of Korean
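
A minimal sketch of the cleaning and tokenization step, using only the standard library. The cleaning rule and the function name `preprocess` are assumptions made for illustration, and the whitespace split is only a stand-in for the morphological analysis that KoNLPy would provide in the actual project.

```python
import re
from typing import List

# Hypothetical cleaning rule: keep Hangul syllables, ASCII letters, and digits.
_NON_TEXT = re.compile(r"[^0-9A-Za-z\uac00-\ud7a3]+")


def preprocess(text: str) -> List[str]:
    """Replace symbols with spaces, lowercase, and split on whitespace."""
    cleaned = _NON_TEXT.sub(" ", text).lower()
    # Stand-in for a Korean morphological analyzer (e.g. one from KoNLPy).
    return cleaned.split()
```

For example, `preprocess("Daum 뉴스, 2022!")` yields `["daum", "뉴스", "2022"]`.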

2. News Retrieval System

Retrieving news for an input query with scikit-learn
Removing near-duplicate news from the retrieved results
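
The README does not say which ranking model or similarity threshold the project uses. The sketch below assumes TF-IDF weighting with cosine similarity, written in plain Python so that the arithmetic is visible; in practice scikit-learn's `TfidfVectorizer` and `cosine_similarity` could take its place. The names `build_idf`, `vectorize`, `search`, and the `dup_threshold` value are all hypothetical.

```python
import math
from collections import Counter


def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}


def vectorize(tokens, idf):
    """L2-normalised TF-IDF vector as a {term: weight} dict."""
    tf = Counter(t for t in tokens if t in idf)  # ignore unseen terms
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two already-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def search(query, docs, top_k=3, dup_threshold=0.9):
    """Rank tokenised docs against the query, skipping near-duplicates."""
    idf = build_idf(docs)
    doc_vecs = [vectorize(d, idf) for d in docs]
    q_vec = vectorize(query, idf)
    scores = [cosine(q_vec, v) for v in doc_vecs]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in ranked:
        if scores[i] == 0.0 or len(kept) == top_k:
            break
        # Keep a result only if it is not too similar to one already kept.
        if all(cosine(doc_vecs[i], doc_vecs[j]) < dup_threshold for j in kept):
            kept.append(i)
    return kept
```

`search` returns document indices in ranked order. Filtering duplicates after ranking means the highest-scoring copy of a repeated article is the one that survives.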

3. News Evaluation

Comparing the retrieved results with Daum News search results, using Selenium and scikit-learn
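
The evaluation metric is not stated in this README. As one possible reading, the sketch below measures what fraction of the system's results have a close match among reference results (for instance, the items Daum News itself returns for the same query). The Jaccard measure, the `threshold` value, and the function names are assumptions, not the project's confirmed method.

```python
def jaccard(a, b):
    """Jaccard similarity between two token lists."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_ratio(retrieved, reference, threshold=0.5):
    """Fraction of retrieved items that match at least one reference item."""
    if not retrieved:
        return 0.0
    hits = sum(
        1
        for item in retrieved
        if any(jaccard(item, ref) >= threshold for ref in reference)
    )
    return hits / len(retrieved)
```

Both arguments are lists of tokenised titles or articles. A ratio near 1 would suggest the system surfaces much the same news as the reference search.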

4. Service Visualization

Providing summaries of retrieved news with NLTK and a visual interface for the service with PyQt
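
How the summaries are produced is not described here. The sketch below assumes a simple extractive approach that scores each sentence by the average corpus frequency of its words and keeps the top few in their original order. The regex-based sentence and word splitting is a standard-library stand-in for NLTK tokenizers, and `summarize` is a hypothetical name.

```python
import re
from collections import Counter


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in re.findall(r"\w+", s.lower()))

    def score(sentence):
        words = re.findall(r"\w+", sentence.lower())
        # Average word frequency, so long sentences are not favoured by length.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(
        range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True
    )
    return [sentences[i] for i in sorted(ranked[:max_sentences])]
```

The returned list could then be joined and shown next to each search result in the PyQt window.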
