Examples of Harvest Job Errors
- Could not harvest WAF link https://data.noaa.gov/waf/NOAA/nos/ocm/iso/xml/47968.xml: HTTPSConnectionPool(host='data.noaa.gov', port=443): Read timed out.
The source URL timed out.
- Error loading json content: not enough values to unpack (expected 2, got 0).
- JSONDecodeError loading json. Expecting value: line 2 column 1 (char 1)
The JSON file is malformed.
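
A malformed source can be caught before a harvest job runs. The following is a minimal sketch, not the harvester's own code: `check_json_source` is a hypothetical helper that reports the same position details seen in the error above. One assumed way to hit this error is a server that returns an HTML page in place of `data.json`.

```python
import json


def check_json_source(text):
    """Return None if text parses as JSON, else a short failure description.

    Hypothetical helper for illustration; not part of the harvester.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"{err.msg}: line {err.lineno} column {err.colno} (char {err.pos})"
    return None


# Assumed scenario: an HTML error page with a leading newline
# is served instead of the JSON file.
print(check_json_source("\n<html>Not Found</html>"))
# prints: Expecting value: line 2 column 1 (char 1)
```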
- Error loading json content: not enough values to unpack (expected 2, got 0).
- ProxyError getting json source: HTTPSConnectionPool(host='www.nrc.gov', port=443): Max retries exceeded with url: /data.json (Caused by ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response'))).
Proxy error. Check the egress rules or the remote server.
- Error loading json content: not enough values to unpack (expected 2, got 0).
- ProxyError getting json source: HTTPSConnectionPool(host='www.opendataphilly.org', port=443): Max retries exceeded with url: /data.json (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden'))).
Blocked by egress rules.
- Error loading json content: not enough values to unpack (expected 2, got 0).
-----------
- HTTPError getting json source: 503 Server Error: Service Unavailable for url: https://www.bls.gov/data.json.
- HTTPError getting json source: 404 Client Error: Not Found for url: https://data.baltimorecity.gov/data.json?version=2.
- ConnectionError getting json source: HTTPSConnectionPool(host='opendurham.nc.gov', port=443): Max retries exceeded with url: /data.json (Caused by SSLError(CertificateError("hostname 'opendurham.nc.gov' doesn't match either of '*.durhamnc.gov', 'durhamnc.gov'"))).
- HTTPError getting json source: 403 Client Error: Forbidden for url: http://www.state.gov/data.json.
- HTTPError getting json source: 504 Server Error: Gateway Time-out for url: https://data.usaid.gov/data.json.
- ConnectionError getting json source: HTTPConnectionPool(host='www.ed.gov', port=80): Max retries exceeded with url: /data.json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8883befbb0>: Failed to establish a new connection: [Errno 110] Connection timed out')).
Other connection issues.
- Identifier: some-identifier; Title: some-title; 1 Error(s) Found. ### ERROR #1: 'some-field':some-error-message
Source dataset fails to validate.
- Element not found: some-xml-field
- Transformation to ISO failed
Source dataset fails to validate.
- spatial: Maximum allowed size is 32766. Actual size is #####.
The source dataset's spatial field exceeds the Solr limit.
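
An oversized spatial value can be spotted ahead of time with a size check. This is a rough sketch under stated assumptions: the limit of 32766 comes from the error message, but measuring it in UTF-8 bytes of the serialized value is an assumption, and `spatial_too_large` is a hypothetical helper, not part of the harvester.

```python
import json

SOLR_MAX_FIELD_SIZE = 32766  # limit quoted in the error message above


def spatial_too_large(spatial, limit=SOLR_MAX_FIELD_SIZE):
    """Return True if the serialized spatial value is larger than the limit.

    Assumption: size is counted in UTF-8 bytes of the string form.
    """
    text = spatial if isinstance(spatial, str) else json.dumps(spatial)
    return len(text.encode("utf-8")) > limit


small = {"type": "Polygon",
         "coordinates": [[[-77.1, 38.8], [-77.0, 38.8], [-77.0, 39.0], [-77.1, 38.8]]]}
# A geometry with thousands of vertices easily exceeds the limit.
large = {"type": "Polygon",
         "coordinates": [[[i * 0.001, i * 0.001] for i in range(5000)]]}

print(spatial_too_large(small), spatial_too_large(large))
```

Simplifying the geometry or replacing it with a bounding box at the source is one possible remedy.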
- Duplicate entry ignored for identifier: 'VA-VHA-OPP-001'.
Source dataset fails to validate.
- Element '{http://www.isotc211.org/2005/gmd}LI_Source', attribute 'id': '_none' is not a valid value of the atomic type 'xs:ID'.
Source dataset fails to validate.
- Object some-id already has this guid some-value
Duplicate guid. The guid should be globally unique.
- Element 'some-xml-field': This element is not expected. Expected is one of ( some-xml-fields).
Source dataset fails to validate.
- Parent identifier not found: "some-id"
The data.json source is missing the referenced parent identifier.
- No records to change
The WAF file's timestamp changed but its content did not. This can happen if the WAF server touched the file without modifying it.
- Error parsing bounding box value: could not convert string to float: ''
Wrong values in the dataset.
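
The `could not convert string to float: ''` part of this message is what Python raises when an empty string is passed to `float()`, so an empty coordinate in the bounding box is a likely cause. Here is a minimal sketch of a pre-check: `parse_bbox` is a hypothetical helper, and the comma-separated `west,south,east,north` layout is an assumption, not the harvester's confirmed format.

```python
def parse_bbox(value):
    """Parse a 'west,south,east,north' string into a tuple of four floats.

    Hypothetical helper; the input layout is assumed for illustration.
    """
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 4:
        raise ValueError(f"Expected 4 bounding box values, got {len(parts)}")
    try:
        return tuple(float(part) for part in parts)
    except ValueError as err:
        raise ValueError(f"Error parsing bounding box value: {err}") from err


print(parse_bbox("-77.1, 38.8, -77.0, 39.0"))

try:
    parse_bbox("-77.1,38.8,,39.0")  # third coordinate is empty
except ValueError as err:
    print(err)
```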
- Validation Error: {'Name': 'That URL is already in use.'}
Known GitHub issue https://github.com/GSA/data.gov/issues/4046
- Could not parse XML file: internal error: Huge input lookup, line 185572, column 131 (<string>, line 185572)
Reason unknown. It only happens with certain XML files and could be related to size or validation.
- title: Search. That name cannot be used.
"Search" is a CKAN keyword and can't be used as a title.
- Point extent defined instead of polygon
The dataset uses the wrong value for the spatial field.
- Validation Error: {'Name or id': 'Missing value'}
Source dataset fails to validate.
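
When triaging a long harvest report, the examples above can be turned into a simple lookup. This is only a sketch: the patterns and the `explain` function are illustrative, cover a subset of the messages on this page, and are not part of the harvester. More specific patterns come first because a single message can match more than one.

```python
import re

# (pattern, likely cause) pairs paraphrased from the examples on this page.
KNOWN_ERRORS = [
    (r"Read timed out", "Source URL timed out"),
    (r"JSONDecodeError", "Malformed JSON file"),
    (r"Tunnel connection failed: 403 Forbidden", "Blocked by egress rules"),
    (r"ProxyError", "Proxy error: check egress or the remote server"),
    (r"^(HTTPError|ConnectionError) getting json source", "Other connection issue"),
    (r"Maximum allowed size is 32766", "Spatial field exceeds the Solr limit"),
    (r"already has this guid", "Duplicate guid"),
    (r"Parent identifier not found", "Missing parent identifier"),
    (r"No records to change", "WAF timestamp changed but content did not"),
    (r"Error parsing bounding box value", "Wrong values in the dataset"),
    (r"Point extent defined instead of polygon", "Wrong value for spatial field"),
]


def explain(message):
    """Return the first matching cause, or a fallback for unknown messages."""
    for pattern, cause in KNOWN_ERRORS:
        if re.search(pattern, message):
            return cause
    return "Unrecognized error"


print(explain("HTTPError getting json source: 503 Server Error: "
              "Service Unavailable for url: https://www.bls.gov/data.json."))
```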