webcrawler

MVP todo:

simple program to GET an HTML resource & find its links (relative & absolute)
add a 'simple text analysis' feature (tally the words used)
spin up a DB/Elasticsearch, connect to it, insert links & searchable data
keep the webcrawler 'oneshot' per page (it mostly is already; staying this way keeps it simple)
perhaps a separate 'scheduler' service that fetches the latest 'unprocessed' URL from the DB and fires the 'oneshot' crawler on it (processing it).
- exponential backoff on content diff (be the good guy!)
- also log the failure count per URL, and stop after multiple fails (consider the URL broken, regardless of the actual error)
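
A rough sketch of the first two todo items (find links, tally words), not the actual code in crawler.ts. The names `extractLinks` and `tallyWords` are made up for illustration. It assumes a regex is good enough for anchor tags in an MVP (a real HTML parser would be more robust), that only http(s) links are worth keeping, and that words are plain ASCII letters and digits.

```typescript
// Hypothetical helper: collect href values from <a> tags and resolve them
// against the URL of the page they were found on.
function extractLinks(html: string, pageUrl: string): string[] {
  // Assumption: quoted href attributes only; unquoted values are ignored.
  const hrefPattern = /<a\s+(?:[^>]*?\s)?href\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  const links = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const raw = (match[1] ?? match[2] ?? "").trim();
    if (raw === "" || raw.startsWith("#")) continue; // same-page anchors
    try {
      // new URL(relative, base) handles both relative and absolute hrefs.
      const url = new URL(raw, pageUrl);
      if (url.protocol === "http:" || url.protocol === "https:") {
        url.hash = ""; // the fragment does not identify a different resource
        links.add(url.href);
      }
    } catch {
      // Malformed href: skip it rather than abort the whole page.
    }
  }
  return Array.from(links);
}

// Hypothetical helper: case-insensitive word tally over plain text.
// Assumption: the caller has already stripped tags from the HTML.
function tallyWords(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}
```

The `html` argument could come from something like `fetch(url).then(r => r.text())` on Node 18+; that part is left out so the helpers stay pure and easy to test.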

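One possible reading of the scheduler notes above, as a sketch only. It assumes "backoff on content diff" means: double the revisit delay while a page's content stays the same, and reset it when the content changes. Everything here (`UrlRecord`, `CrawlOutcome`, `applyOutcome`, the three constants and their values) is hypothetical and not taken from the repo.

```typescript
// Hypothetical per-URL bookkeeping the scheduler would keep in the DB.
interface UrlRecord {
  url: string;
  contentHash: string | null; // hash of the last fetched body, null if never fetched
  revisitDelayMs: number;     // how long to wait before crawling this URL again
  failureCount: number;
  broken: boolean;            // true once the scheduler should stop trying
}

type CrawlOutcome =
  | { status: "ok"; contentHash: string }
  | { status: "failed" }; // the actual error is deliberately not distinguished

// Assumed values, pick whatever suits the sites being crawled.
const BASE_DELAY_MS = 60 * 60 * 1000;           // 1 hour
const MAX_DELAY_MS = 30 * 24 * 60 * 60 * 1000;  // 30 days
const MAX_FAILURES = 3;

// Returns the updated record after one 'oneshot' crawl of the URL.
function applyOutcome(record: UrlRecord, outcome: CrawlOutcome): UrlRecord {
  if (outcome.status === "failed") {
    const failureCount = record.failureCount + 1;
    return { ...record, failureCount, broken: failureCount >= MAX_FAILURES };
  }
  // Unchanged content: back off exponentially, up to a cap.
  // Changed content: go back to the base delay.
  const unchanged = record.contentHash === outcome.contentHash;
  const revisitDelayMs = unchanged
    ? Math.min(record.revisitDelayMs * 2, MAX_DELAY_MS)
    : BASE_DELAY_MS;
  return {
    ...record,
    contentHash: outcome.contentHash,
    revisitDelayMs,
    failureCount: 0, // assumption: only consecutive failures count
    broken: false,
  };
}
```

Keeping this a pure function over a record means the scheduler only has to load the row, call `applyOutcome`, and write the result back.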