Upgrade to version 1.0.2 results in problems loading DTD files #46
Comments
Thanks, I've now got a failing test case.
Caused by the fix for issue #10. At the moment I suspect there may be a feature on the standard JAXP SchemaFactory which allows it to accept DTDs and return an empty schema (which is what the old Xerces SchemaFactory did). Alternatively, we may need to detect up front that the file is a DTD.
Actually this is not a bug, it's the correct behaviour. The default language for validation is XML Schema, and your config is referring to DTDs. The only reason it used to work is an old bug in Xerces 2.9.1 which silently ignored faulty schemas. The correct way to do this is to have a DOCTYPE declaration in your XML document and configure your validation set as follows:
I've raised issue #48 to consider the possibility of doing DTD validation of arbitrary documents (which will be a long time coming) and #47 to improve the documentation.
@rosslamont, does this mean that there is currently no way to validate XML against a DTD that exists only in the repository (not somewhere on the web)? Web links to the DTDs become valid only after a release, so to validate all files before then we used a catalog file to validate against the local DTD files. Does it also mean that release 1.0.1 does not do validation, and that we are under the illusion that the plugin validates?
@rosslamont I am almost sure that DTD validation is possible. Let's check that.
Hi Roman, it's definitely possible to do what you want, provided that you have a DOCTYPE declaration in your XML document and you've set up a catalog. In your case, you have the DOCTYPE declaration and you have the catalog, so you will be fine if you make the config changes I suggested above. To be really clear, I only checked the first failing validationSet in the checkstyle project, so you'll have to check the remaining validationSets to make sure they have the DOCTYPE properly set (and you should double-check your catalogs to make sure they have a mapping to your local build for each DTD). Actually, to help there is a new config called
Hi Jochen, I assure you it's not possible via publicID and systemId in the plugin config. A standard DOCTYPE declaration and a catalog work fine, as I explained in my previous comment to Roman. I've spent hours going over it, including building a non-optimised version of Xerces 2.9.1 and stepping through it, seeing how it silently throws and catches an exception when it tries to turn the DTD into an XML Schema (more recent Xerces and the default Xerces-based JAXP implementation also throw the exception but don't catch it). It then goes on and returns an "EmptySchema" object which is then used to validate the document, which of course works fine as that schema allows any XML. I've also done some local test cases to assure myself that even if a totally nonsense DTD is used (i.e. syntactically incorrect), it is silently ignored. Furthermore, there is documentation to suggest that it's not possible. The JAXP validation documentation states:
There are also some comments on Stack Overflow. One possible solution would be to reinstate Xerces and use Xerces XNI to build a custom parser chain, but given the community's distaste for Xerces and bug #10, I don't think that's a good idea. I'll leave this open if you wish to look into it yourself. I can also check my failing test case into a branch to save you some effort if you want to go down that path. Ross
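For anyone who wants to reproduce the behaviour described above, here is a minimal sketch, assuming the JDK's default JAXP implementation. It is not the plugin's code; the class and method names (`DtdAsSchemaSketch`, `loadsAsXmlSchema`) are made up for illustration. It asks the XML Schema `SchemaFactory` to load some text as a schema and reports whether that succeeded. A DTD is not a well-formed XML Schema document, so loading is expected to fail with a `SAXException` rather than yield an empty schema.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
final class DtdAsSchemaSketch {

    /** Returns true if the JAXP XML Schema factory accepts the text as a schema. */
    static boolean loadsAsXmlSchema(String schemaText) {
        // The factory is created for the W3C XML Schema language, not for DTDs.
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(new StreamSource(new StringReader(schemaText)));
            return true;
        } catch (SAXException e) {
            // Raised when the source cannot be parsed as an XML Schema.
            return false;
        }
    }

    public static void main(String[] args) {
        String dtd = "<!ELEMENT note (to)>\n<!ELEMENT to (#PCDATA)>";
        System.out.println("DTD accepted as XML Schema: " + loadsAsXmlSchema(dtd));
    }
}
```

An implementation that swallowed the exception and returned an empty schema would report `true` here, and every document would then pass validation.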
@rosslamont, thanks a lot for your help. All the items you suggested are required in the configs to make it work, see my commit above. Suggestions:
Hi Roman, documentation for "catalogHandling" is already there (http://www.mojohaus.org/xml-maven-plugin/validation.html). Do you think that needs improvement? I will work on the other documentation issue in issue #47. Yes, and some more examples would be great. Best Regards
The validating flag needs a bit more attention: in my understanding XSD and DTD are kind of the same, so it is not clear why the flag was required.
Hi Roman, the doc in v 1.0.3 should make it clearer when released, but to help you understand: JAXP 1.3 introduced a new validation API (javax.xml.validation), which supports external validation, principally XSD, but theoretically any pluggable validation language such as Relax NG or Schematron. SAX and DOM also have a much simpler validation mechanism, which predates that API and only supports a hard-wired DTD and XSD validator. Unfortunately the new validation API does not support DTD, but it is supposed to be the "preferred" way of doing things. So there is an overlap for XSD, which works with both approaches. This is why the mojo config is quite messy and why we have the validating flag, as that flag drives the old-style SAX validation approach, which is necessary to get DTD validation.
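The old-style approach described above can be sketched in plain JAXP as follows. This is an illustration under stated assumptions, not the plugin's implementation: `DtdValidationSketch`, `SAMPLE_DTD` and `validate` are hypothetical names, and the `resolveEntity` override is a simplified stand-in for catalog resolution that serves an in-memory DTD for every external entity the parser asks for. The parser-level `setValidating(true)` switch is what turns on DTD validation against the document's DOCTYPE.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
final class DtdValidationSketch {

    // Stand-in for a local DTD file that a catalog would map the DOCTYPE to.
    static final String SAMPLE_DTD =
            "<!ELEMENT note (to)>\n<!ELEMENT to (#PCDATA)>";

    /** Parses the document with DTD validation on and returns the validation errors. */
    static List<String> validate(String xml) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // old-style, parser-level DTD validation

        final List<String> errors = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Simplified catalog: always answer with the local DTD text.
                InputSource source = new InputSource(new StringReader(SAMPLE_DTD));
                source.setSystemId(systemId);
                return source;
            }

            @Override
            public void error(SAXParseException e) {
                // Validity violations arrive here; collect them instead of failing fast.
                errors.add(e.getMessage());
            }
        };

        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Malformed XML or parser setup problems are not validity errors.
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String doc = "<!DOCTYPE note SYSTEM \"http://example.invalid/note.dtd\">"
                + "<note><from>x</from></note>";
        validate(doc).forEach(System.out::println);
    }
}
```

Without a DOCTYPE declaration in the document the parser has no DTD to validate against, which is why the DOCTYPE plus a catalog mapping is needed for this approach.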
Thanks a lot. It would be good to see this explanation on the web site, close to the example or the flag. I am pretty sure that most users will never learn of such a nuance.
Found at checkstyle/checkstyle#5713
CI logs: https://travis-ci.org/checkstyle/checkstyle/jobs/365341659#L1322
XSD parsing works fine, but all DTD references result in an exception.
The IT https://github.com/mojohaus/xml-maven-plugin/blob/master/src/it/mojo-1438-validate/pom.xml does not use a DTD, even though one is present on the filesystem: https://github.com/mojohaus/xml-maven-plugin/blob/master/src/it/mojo-1438-validate/src/main/dtd/sample.dtd
steps to reproduce: