Description

Reads a text file containing one URL per line, scrapes the content of each web page, and extracts key terms using natural language processing. Built with Python.
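
As a rough illustration of the flow described above, the sketch below reads URLs from a text file and pulls the most frequent terms from a block of text. The function names `read_urls` and `extract_key_terms`, the small stopword list, and the plain frequency count are all assumptions made for illustration. They stand in for whatever NLP library the script actually uses, and fetching the pages is left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real NLP library
# would supply a much fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def read_urls(path):
    """Return the non-empty lines of a text file, one URL per line."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def extract_key_terms(text, min_length=3, top_n=5):
    """Return the most frequent non-stopword terms with at least min_length characters.

    This is a simple frequency count, used here as a stand-in for real NLP.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        word for word in words
        if len(word) >= min_length and word not in STOPWORDS
    )
    return [term for term, _ in counts.most_common(top_n)]
```

In this sketch, `min_length` plays the role of the minimum keyword length option: terms shorter than that are dropped before counting.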

Requirements

Instructions

Run the script from the command line. The following options are required.

Required Arguments

  • -i, --input the name of the text file containing the URLs
  • -c, --content the selector for the content region to parse
  • -o, --output the name of the output file. Accepted formats are CSV and JSON.

Optional Arguments

  • -l, --length the minimum length of each keyword returned by the script
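
The options listed above could be declared with Python's standard `argparse` module roughly as follows. This is a minimal sketch, not the script's actual code: the function names `build_parser` and `output_format`, the default minimum length of 3, and the idea of inferring the output format from the file extension are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Extract key terms from the pages listed in a URL file."
    )
    parser.add_argument("-i", "--input", required=True,
                        help="text file containing one URL per line")
    parser.add_argument("-c", "--content", required=True,
                        help="selector for the content region to parse")
    parser.add_argument("-o", "--output", required=True,
                        help="output file name (csv or json)")
    # The default of 3 is an assumption; the README does not state one.
    parser.add_argument("-l", "--length", type=int, default=3,
                        help="minimum length of each returned keyword")
    return parser


def output_format(filename):
    """Infer 'csv' or 'json' from the file extension (assumed behaviour)."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension not in ("csv", "json"):
        raise ValueError("output file must end in .csv or .json")
    return extension
```

For example, `build_parser().parse_args(["-i", "urls.txt", "-c", "article", "-o", "terms.json"])` yields a namespace with `input`, `content`, `output`, and `length` attributes. The file names and the `article` selector in that call are placeholders.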