Migrated feature: "Option for a recursion limit on walking directories" submitted by nobody on 2005-06-18 #14

Open
GoogleCodeExporter opened this issue Apr 19, 2016 · 2 comments

@GoogleCodeExporter

Original feature listed here:
http://sourceforge.net/tracker/index.php?func=detail&aid=1223364&group_id=137793&atid=739386

Google's recommendation is to submit only HTML pages (as
opposed to .gifs, .jpgs, etc.). So either give an example in
the config of a simple way to reject everything else, say
with a "regexp" filter, for those not familiar with regular
expressions, or add a new switch that makes it easy to pass
ONLY .htm/.html. As it stands, we had to list and filter out
every other possible extension using the wildcard filters.
Simply passing ALL .htm files was not acceptable, because
the logs contained some .htm requests with parameters that
we did NOT want to include.
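
To illustrate the kind of pattern being asked for, here is a minimal Python sketch of a regular expression that accepts only .htm/.html URLs and rejects anything carrying parameters. The names `HTML_ONLY` and `keep_url` and the example.com URLs are made up for illustration; this is not the generator's own filter code, and the pattern would still need to be adapted to however the config's "regexp" filter is matched.

```python
import re

# Hypothetical pattern: the URL must end in .htm or .html, and must not
# contain a query string ("?...") or fragment ("#...") anywhere.
HTML_ONLY = re.compile(r"^[^?#]*\.html?$", re.IGNORECASE)


def keep_url(url: str) -> bool:
    """Return True if the URL should stay in the sitemap."""
    return HTML_ONLY.match(url) is not None


urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/about.htm",
    "http://www.example.com/logo.gif",
    "http://www.example.com/page.htm?id=7",
]

# Only the two plain HTML pages survive the filter.
kept = [u for u in urls if keep_url(u)]
```

A single "pass" rule like this, followed by a catch-all "drop", would replace the long list of per-extension wildcard filters described above, assuming the tool evaluates filters in order.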

It would also be nice to have an option for the number of
directory levels walked. For instance, we wanted our root
and only a PORTION of its subdirectories walked. Since
walking the root apparently walks ALL subdirectories
automatically, the only way we could think of to do this
was to filter the rest out by name. It would be nice to be
able to specify whether walking the root includes ALL
subdirectories or only specified ones.
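
A depth limit of this kind could be sketched in Python roughly as follows. The function name `walk_limited` and the `max_depth` parameter are assumptions made for this example; nothing here is taken from the generator's source, it only shows that pruning `os.walk` is enough to cap the recursion.

```python
import os


def walk_limited(root, max_depth):
    """Yield file paths under root, descending at most max_depth levels.

    max_depth=0 yields only the files directly inside root.
    (Hypothetical helper, not part of the sitemap generator.)
    """
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            # Emptying the list in place stops os.walk from descending further.
            dirnames[:] = []
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

For example, `list(walk_limited("/var/www/docroot", 1))` would return the files in the root and in its immediate subdirectories, but nothing deeper (the path is only a placeholder).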

Original issue reported on code.google.com by [email protected] on 13 Aug 2007 at 7:45

@GoogleCodeExporter

[deleted comment]

@GoogleCodeExporter

I second the idea of adding a recurse=yes/no option to the <directory> tag. I need
to be able to specify which paths to recurse into. Directories off the
web-root that are not anonymously accessible have no business showing up in
sitemap.xml; bots try to access those URLs and get "access denied" type
messages.

So today, once again, I was looking for some way to specify the web-root with 
recurse=no, and then specify each directory off of the web-root that is public, 
and have the generator recurse those directories.
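
The behaviour requested here could look something like the Python sketch below: each directory entry carries a recurse flag, the web-root is listed with recursion off, and each public directory is listed with recursion on. The `recurse` flag and the `collect` function are hypothetical (no such option exists in the tool as far as this thread shows); the code only illustrates the intended semantics.

```python
import os


def collect(entries):
    """Gather files for a list of (path, recurse) pairs.

    recurse=False takes only the files directly inside path;
    recurse=True walks the whole tree below it.
    (Hypothetical model of a recurse=yes/no attribute.)
    """
    found = []
    for path, recurse in entries:
        for dirpath, dirnames, filenames in os.walk(path):
            if not recurse:
                # Stay in this directory only; do not descend.
                dirnames[:] = []
            found.extend(os.path.join(dirpath, f) for f in sorted(filenames))
    return found
```

With entries such as `[(webroot, False), (os.path.join(webroot, "public"), True)]`, anything under a sibling directory that is not listed never reaches the sitemap, without needing a name-based drop filter for it.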

Original comment by [email protected] on 29 Oct 2010 at 6:19
