
-1:CON error on internal sites #26

Open
BuscheIT opened this issue Oct 23, 2024 · 5 comments

Comments

@BuscheIT

#10 describes the same error we are currently trying to resolve.

Our internal network is using a Windows DNS controller - all domain names resolve nicely on both of the test PCs we are using (Win11 and Linux Mint).

On both machines we are getting the -1:CON error and see no requests in the server logs.

Using WIN11 (with 1.0.8 portable) the report has as first line: "Problem with DNS analysis: Crawler\Analysis\DnsAnalyzer::getDnsInfo: nslookup command failed."

Under Linux (Snap) we get the same empty report, just without the nslookup command error.

All EXTERNAL sites can be scanned without problems - there is no proxy internally, all browsers work internally and we are using plain HTTP to avoid probable certificate trouble.

Any ideas?

@janreges
Owner

Hi @BuscheIT,

The information about nslookup failing on Windows is my fault. nslookup is not available in the Windows/Cygwin environment, so this analysis should not even be attempted on Windows. A fix is on the roadmap.

  • Can you please write the specific URL you are trying to crawl? Do you use http:// or https://? A custom port? An internal domain name or an IP address?
  • Can you try adding a fixed record for this domain to your OS? On Linux in /etc/hosts, on Windows in C:\Windows\System32\drivers\etc\hosts, add something like 1.2.3.4 your-domain.xyz (with the proper IP, of course). Does that help?
  • Also, if you ran it on Linux, can you send a screenshot of the "DNS and SSL" part of the audit report?
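For reference, the hosts-file workaround from the second bullet can be sketched like this (1.2.3.4 is a placeholder IP and se.test stands in for the internal domain; substitute your own values):

```shell
# Linux / macOS: append the fixed record to /etc/hosts (needs root).
# Windows: add the same "1.2.3.4 se.test" line to
# C:\Windows\System32\drivers\etc\hosts in an elevated editor.
echo '1.2.3.4 se.test' | sudo tee -a /etc/hosts

# Verify that the OS resolver now returns the fixed record:
getent hosts se.test                   # Linux
dscacheutil -q host -a name se.test    # macOS
```

If the crawler still reports -1:CON after this, the problem is likely not DNS resolution itself but something between the crawler's HTTP client and the server.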

@BuscheIT
Author

BuscheIT commented Oct 24, 2024

Hello,
we are using "http://se.test" without custom port.
Adding the IP/hostname entry to the hosts file under Win11 does not change the crawler's or any browser's behaviour. The report generated with the edited hosts file is at https://crawler.siteone.io/html/2024-10-24/30d/w-8m53kf170vh9-208j5.html

Of course we tried https as well - our .test environment has it; we just went with http to avoid self-signed-certificate trouble, as stated. We also temporarily disabled Windows Defender, since no request seems to even hit the .test server.
That it works on every external site makes us bang our heads over where the problem could be.
We even changed netmasks and made sure all .test traffic goes to the same IP.

Crawler is the first program giving us such trouble - we have never had problems with other network-dependent tools, so we would really love to figure this out.

Linux screenshot coming soon - any ideas for debugging with Wireshark, gdb, or command-line options?
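Since external sites work but internal ones fail, one way to narrow this down (a diagnostic sketch, assuming the se.test domain mentioned above) is to compare what the OS resolver, a direct DNS query, and a raw HTTP request each see:

```shell
# 1. What does the OS resolver (the one browsers use) return?
getent hosts se.test              # Linux; on Windows: ping -n 1 se.test

# 2. What does a direct query to the configured DNS server return?
nslookup se.test

# 3. Does a plain HTTP request reach the server at all?
curl -v http://se.test/ 2>&1 | head -n 20

# 4. Watch for outgoing packets while the crawler runs, to see
#    whether a TCP SYN to the server ever leaves the machine.
sudo tcpdump -i any host se.test
```

If curl succeeds but the crawler still reports -1:CON, the difference is in how the crawler (via its bundled runtime) resolves or connects, not in the network itself.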

PS: We use the standard options after startup, but disable "allow images" to reduce requests a little - that shouldn't matter.

@ovenum

ovenum commented Oct 25, 2024

Just ran into the same error message when trying to crawl a development site on my local machine. This happens on macOS, but the error description looks similar to what I am experiencing.
The site uses a local domain like website-name.customTLD; every domain under customTLD is directed to my machine via dnsmasq and responds to ping requests.
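For context, a wildcard setup like that is typically a one-line dnsmasq rule (a sketch; customTLD stands in for the actual TLD used):

```shell
# dnsmasq.conf (e.g. /usr/local/etc/dnsmasq.conf with Homebrew;
# adjust the path for your installation).
# Resolve every hostname under .customTLD to this machine:
address=/customTLD/127.0.0.1
```

On macOS this is usually paired with a file in /etc/resolver/ so that only the custom TLD is routed to dnsmasq; tools that bypass the system resolver framework will not see these records, which may be relevant here.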

In the generated HTML report, under DNS and SSL, the following message is shown:
(screenshot: DNS and SSL error message)

And in the visited URLs section I get the -1:CON error status:
(screenshot: visited URLs with -1:CON status)

@janreges
Owner

Hi @ovenum,

as part of the work on this issue I have made a number of improvements in the last few commits. If you know how to work with Git, you can run the current version from the main branch, or wait 2-3 weeks until I release version 1.0.9.

  • I added the --resolve parameter (described in README.md), which forces the crawler to use a specific IP address for a given domain and port. You can enter, for example, my.domain.tld:443:192.168.1.10 (the same format as curl --resolve). This can solve situations where, for whatever reason, DNS resolution of locally running projects fails.
  • On Windows (where the crawler runs in the Cygwin environment) I have refactored the DNS and SSL/TLS information discovery to use native functions as a fallback for the unavailable nslookup/dig. You won't get as much information in the report, but the most important parts are there: the DNS resolving tree, IP addresses, and basic information about the SSL/TLS certificate, including its time validity.
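Put together, a run with the new parameter might look like this (a usage sketch; my.domain.tld and 192.168.1.10 are the placeholder values from the first bullet):

```shell
# Force my.domain.tld:443 to resolve to 192.168.1.10, bypassing DNS:
./crawler --url='https://my.domain.tld/' \
          --resolve='my.domain.tld:443:192.168.1.10'

# curl equivalent, useful to verify the server answers on that IP
# before involving the crawler at all:
curl --resolve 'my.domain.tld:443:192.168.1.10' -sI 'https://my.domain.tld/'
```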

If you can, please test the current version from the main branch and let me know whether everything important runs fine.

@ovenum

ovenum commented Jan 21, 2025

Thanks @janreges for looking into this.

Just got the latest version from git and added swoole-cli 6.0.0 macos-arm64.

Running ./crawler --url='http://somethig-local.test' will still give me the -1:CON error.
I'm using dnsmasq on macOS to resolve the .test domain locally and have it point to 127.0.0.1.

Using the --resolve parameter solves the issue:
./crawler --url='http://somethig-local.test' --resolve='somethig-local.test:80:127.0.0.1'

Let me know if you require more information regarding the issue.

Attached is the report of the failed crawl with the -1:CON error:


 ####                ####             #####        
 ####                ####           #######        
 ####      ###       ####         #########        
 ####     ######     ####       ###### ####        
  ######################       #####   ####        
    #######    #######       #####     ####        
    #######    #######         #       ####        
  ######################               ####        
 ####     ######     ####              ####        
 ####       ##       ####              ####        
 ####                ####       ################## 
 ####                ####       ################## 

==================================================
# SiteOne Crawler, v1.0.8.20240824               #
# Author: [email protected]                   #
==================================================


Progress report           | URL                                                                   | Status | Type     | Time   | Size   | Cache  | Access.  | Best pr.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
1/1     | 100% |>>>>>>>>>>| /                                                                     | -1:CON | Other    | 73 ms  | 0 B    | none    |          |         
The analysis has been suspended because no working URL could be found. Please check the URL/domain.
