Questions on whitespace handling #50
Comments
Hi Mark, (1) The setting from the Author specific page replaces the one from the more generic options. So, the Author mode will look only at the setting from the Author specific page to determine if schema information will be used or not for loading and serialization of the document.
and a schema like
then the schema information will tell oXygen that the whitespace of the p element needs to be preserved, because the string built-in type has the whitespace facet with the value preserve. If this information is used then oXygen will not split the content of p on multiple lines, otherwise the content will be formatted by inserting a new line and indent, resulting in something like
Other whitespace information refers to default attributes like xml:space and to mixed versus element-only content. All of this is important for getting the correct whitespace behavior. Regards,
Hi Mark, (2) If an element is added to the default space list then it is processed as if it will have an
then this will remain as is unless we add x to the default list; in that case the y element will go on a separate line. Regards,
Hi Mark, (3) We can obtain whitespace information from multiple sources, including the document itself, the options, the schema information or the CSS. Each of these provides info that tells us whether the whitespace should be preserved or not, and in the latter case whether we are in mixed content or in element-only content (which means we can completely ignore it). So, we basically end up with 3 levels: ignore, normalize and preserve. When an element is processed, the parent also provides a whitespace processing context, materialized as a similar whitespace processing level. The precedence is not between these different sources of information, but between the whitespace processing levels: the source that sets the higher level wins. For example, if a source says that the whitespace should be preserved then we will preserve it, entering a whitespace-preserving state. This state is exited when we get a specific indication of default whitespace processing, from the options or from the document itself (xml:space="default"). Mixed also wins over element only; for example, even if the schema says that some content is element only, when we detect text inside that element we switch to mixed mode. Best Regards,
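A minimal sketch of the "higher level wins" rule from (3), for readers who find code easier to follow. This is not Oxygen's implementation: the names `WsLevel` and `resolve_level` are hypothetical, and the sketch assumes that an explicit default indication only drops the context inherited from the parent.

```python
from enum import IntEnum
from typing import Iterable, Optional


class WsLevel(IntEnum):
    """Hypothetical names for the three levels described in (3)."""
    IGNORE = 0     # element-only content: whitespace can be ignored
    NORMALIZE = 1  # mixed content: whitespace is normalized
    PRESERVE = 2   # whitespace is kept exactly as written


def resolve_level(parent: WsLevel,
                  source_levels: Iterable[Optional[WsLevel]],
                  explicit_default: bool = False,
                  has_text: bool = False) -> WsLevel:
    """Pick the whitespace level for one element.

    source_levels: one entry per source (document, options, schema, CSS);
        None means the source says nothing about this element.
    explicit_default: an explicit indication of default processing, such as
        xml:space="default". Assumption: it only resets the inherited context.
    has_text: text was detected directly inside the element.
    """
    candidates = [lvl for lvl in source_levels if lvl is not None]
    if not explicit_default:
        # The parent's context takes part in the comparison like any source.
        candidates.append(parent)
    # The highest level reported by any source wins.
    level = max(candidates, default=WsLevel.IGNORE)
    # Mixed wins over element only: detected text forces at least NORMALIZE.
    if has_text and level is WsLevel.IGNORE:
        level = WsLevel.NORMALIZE
    return level
```

For example, `resolve_level(WsLevel.IGNORE, [None, WsLevel.PRESERVE, WsLevel.IGNORE])` yields `WsLevel.PRESERVE`, because one source asked for preservation and no source can lower it.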
Hi Mark, (4) The Indent selection action does not introduce any new lines or other whitespace inside the line; it changes only the indenting at the start of each line. The other actions use "format" in their names because they may change all the whitespace, not only the indenting. Best Regards,
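To illustrate the distinction in (4), here is a rough sketch of an indent-only operation. The function `reindent` and the caller-supplied nesting depths are hypothetical; the only point is that whitespace inside a line is left untouched and no line breaks are added.

```python
from typing import List


def reindent(lines: List[str], depths: List[int], indent_size: int) -> List[str]:
    """Replace only the leading whitespace of each line.

    depths holds the nesting depth of each line (assumed to be known
    already); whitespace inside the line is never modified.
    """
    return [
        " " * (depth * indent_size) + line.lstrip(" \t")
        for line, depth in zip(lines, depths)
    ]
```

A formatting action, by contrast, may also split or join lines and collapse whitespace inside them, which is why it is allowed to change more than this sketch does.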
Hi Mark, (6) Zero indent will perform formatting, but the lines will start immediately at the left margin; no indenting whitespace will be added. Best Regards,
Hi Mark, (7) If Detect indent on open is enabled then oXygen will try to detect the indent by analyzing the input document, and the indent size will be the default if the detection determines that there is not enough information to decide on the indent size. The "Use zero-indent, if detected" option allows oXygen to propose zero as a detected indent; otherwise zero is considered an invalid value, a specific indent value counts as not detected, and the default is used. Best Regards,
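The fallback described in (7) can be sketched as a small decision function. This is only an illustration: `effective_indent` and its parameter names are hypothetical, and the detection step itself is not shown (`detected` is `None` when there was not enough information).

```python
from typing import Optional


def effective_indent(detected: Optional[int],
                     default_indent: int,
                     detect_on_open: bool,
                     use_zero_if_detected: bool) -> int:
    """Return the indent size that ends up being used."""
    if not detect_on_open:
        # Detection is disabled: the configured indent size applies.
        return default_indent
    if detected is None:
        # Not enough information in the document: fall back to the default.
        return default_indent
    if detected == 0 and not use_zero_if_detected:
        # Zero is treated as an invalid detection result unless allowed.
        return default_indent
    return detected
```

With detection enabled and a default of 4, a detected indent of 0 gives 0 only when the zero-indent option is on; otherwise the result is 4.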
Hi Mark, (8) There is currently no such option as "Preferences>Editor>Indent with tabs". Best Regards,
Back to question (5): The Open and Save behaviors are two important parts of the editing session. More details below:
Open and save are opposite operations, one needs to normalize or strip white spaces and the other needs to re-add spaces and line breaks in the XML so that it is easier to read.
The switch to the Author mode does the same thing as opening the XML directly in the Author mode.
The reader might at some point notice that Oxygen adds line breaks inside an XML element which the reader considers to be space-preserving. If so, the reader needs to know where to make modifications (the CSS, the associated schema, the list of preserve space elements in the "Editor / Format / XML" preferences page, or the xml:space attribute) in order to prevent Oxygen from adding line breaks and indenting inside the element.
I would say "stripped".
Ah, okay, that makes sense, and explains why the specific actions described are different.
Right, so I think the real issue that the reader needs to understand is how Oxygen determines what is significant whitespace (which it should not mess with) and what is insignificant whitespace (which it is free to add or remove in order to format and indent the content). What I am trying to do in the new topic I am writing is to describe the rules that Oxygen uses to determine whether whitespace is significant or not. The use case for the reader is to identify when Oxygen is treating whitespace the reader deems significant as insignificant, and to figure out how to tell Oxygen that this whitespace is significant. The other use case that matters to the reader is to maintain consistency in how a document is formatted and to avoid unnecessary format changes, so as to avoid trivial differences between versions of a file in a VCS. The rules that govern this are in the Format preferences. So, the reader needs to know two things:
Knowing the exact steps that Oxygen takes when saving, loading, switching, etc. does not help the user with either of these use cases. What they need to know is that Oxygen formats and indents on these events, and that it follows the rules above when it does so. Agreed?
George, thanks so much for the very detailed responses! Number 3 seems to be the key to understanding how the whole whitespace system works. I'm going to write the topic in those terms.
George, Re 7: I was able to get Oxygen to recognize zero indent, and also to create a file with enough variation that it did not recognize it and defaulted to the specified indent size. However, Oxygen did not report the inability to determine the indent size (at least in any form I could see). It just switched to the default. Was it supposed to warn me? And if so, how?
Hi Mark, No, it will not warn the user in any way, it will just use the default indent. Best Regards,
I agree that the user does not need all the internal details of whitespace handling.
Information on whitespace handling is located in several different places in the interface and in several different places in the documentation. This makes it hard to get a sense of the whole and of how the various parts and settings affect each other. I have been attempting to write a topic that stitches the whole picture together, but I have a number of questions:
There are two different settings that deal with schema-aware whitespace handling: Editor>Format>XML>Schema aware format and indent and Editor>Edit modes>Author>Schema aware>Schema aware normalization, format and indent. The second is obviously specific to Author mode, but the first is general and should presumably apply to any mode. What is the actual difference between them and what do they affect?
For elements listed in the default space list, is the content normalized and left as is, or is it normalized and then formatted and indented?
If there are elements listed in the Preserve space, Default space, and Mixed content lists, and Schema aware format and indent is enabled, which takes precedence: the schema definitions or the content of these lists?
There are three menu items that do indenting:
Why does the second one not include the word "format"? Is the functionality different? More limited?
http://www.oxygenxml.com/doc/ug-editor/#topics/author-whitespace-handling.html lists two sets of formatting and indenting rules, one for when a document is opened in Author mode, and one for when it is saved.
What does zero-size indent mean? Does it simply mean that there is no indenting of the content, that is, that every line starts at the left margin?
http://www.oxygenxml.com/doc/ug-editor/#tasks/how-to-use-zero-size-indent.html says that to use zero indent you should disable Detect indent on open and set the indent to zero, but there is an option called Use zero-indent if detected. Shouldn't the topic say: if Detect indent on open is selected, select "Use zero-indent if detected" and set the indent to zero?
What is the difference between Preferences>Editor>Format>Indent with tabs and Preferences>Editor>Indent with tabs?