This repository has been archived by the owner on Aug 25, 2023. It is now read-only.

redigaffi/Web-Spider

Web Spider

Crawling the web.


- This is a basic web spider, built for learning purposes and released as a contribution to the open-source community.

How it Works

How it is done.


- It is very simple, and I intend to keep it that way. You provide a start URL (the first site to crawl). The spider finds all the URLs on that page and puts them into an array. It then sanity-checks each URL with some basic regex tests (for example, that the URL starts with http and ends with a TLD). All of this runs in a while loop to collect hundreds of URLs, and a simple check of already-visited URLs keeps the spider from repeating work.
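
The loop described above could be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the function names (`extract_urls`, `is_sane`, `crawl`, `fetch_page`), both regular expressions, and the `max_pages` limit are all assumptions made for the example. The README only states that URLs are collected into an array, checked with basic regexes, and tracked so they are not visited twice.

```python
import re
import urllib.request

# Hypothetical patterns; the README only says a URL should
# "start with http" and "end with a TLD".
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)
URL_RE = re.compile(r'^https?://[\w.-]+\.[a-z]{2,}/?$', re.IGNORECASE)


def extract_urls(html):
    """Return every href value found in the page, unfiltered."""
    return HREF_RE.findall(html)


def is_sane(url):
    """Basic check: starts with http(s) and ends with something TLD-like."""
    return URL_RE.match(url) is not None


def fetch_page(url):
    """Download a page as text (assumed helper, standard library only)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=100):
    """Visit pages starting at start_url and return them in visit order."""
    visited = set()        # fast lookup so no URL is crawled twice
    order = []             # visited URLs, in the order they were crawled
    pending = [start_url]  # URLs waiting to be crawled
    while pending and len(order) < max_pages:
        url = pending.pop(0)
        if url in visited:
            continue
        visited.add(url)
        order.append(url)
        try:
            html = fetch(url)
        except Exception:
            continue       # unreachable page: skip it and keep crawling
        for link in extract_urls(html):
            if is_sane(link) and link not in visited:
                pending.append(link)
    return order
```

Passing `fetch` as a parameter makes it possible to try the loop without network access, for example `crawl("http://a.com", lambda url: fake_pages.get(url, ""))` with a dictionary of canned pages. Note that the sanity regex in this sketch accepts only bare site URLs such as `http://example.com`, so links to deeper pages are dropped. That is one reading of "ends with a TLD" and may differ from the original checks.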

Future features

Features I want to implement.


- Search for keywords in a site.
- Categorize a site according to its keywords.
- And many more.


Email: [email protected]
