I'm just starting to implement a FilterWriter (or a PrintWriter, may switch the base later) to provide specific formatting capabilities, and I need to do both: scan for "new line" occurrences, and write "new lines" explicitly. While reviewing the existing I/O facilities dealing with reading and writing lines I've encountered the following possible discrepancy.
All classes I've looked at write output line terminators using the system line.separator property value. Classes that read lines recognize any of \n, \r\n, or \r as a line terminator. If line.separator is set to anything else, and especially if it contains characters other than \r or \n (for example &=-), it can cause major problems, as I demonstrate below. So my question is:
- Should I strictly use the system line.separator to format lines when performing text output?
My experience is the distinction between \n and \r\n on Windows nowadays is not so important, and \r\n occurrences on *nix cause mostly annoyance to humans.
Java has the System.lineSeparator() method, backed by the line.separator system property:
Returns the system-dependent line separator string. It always returns the same value - the initial value of the system property line.separator.
On UNIX systems, it returns "\n"; on Microsoft Windows systems it returns "\r\n".
It is used by line-oriented output facilities such as (not limited to) PrintWriter.println():
Terminates the current line by writing the line separator string. The line separator is System.lineSeparator() and is not necessarily a single newline character ('\n').
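To check which terminator a given JVM will actually write, the value can be printed with CR and LF escaped. This is only a minimal sketch; the class name SeparatorProbe and the helper escape are made up for illustration and are not part of any library.

```java
final class SeparatorProbe {

    /** Renders CR and LF visibly so the separator fits on one printed line. */
    static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Documented as the initial value of line.separator; never changes afterwards.
        System.out.println("System.lineSeparator():  "
                + escape(System.lineSeparator()));
        // The live property; may differ if System.setProperty was called later.
        System.out.println("line.separator property: "
                + escape(System.getProperty("line.separator")));
    }
}
```

Running it once normally and once with -Dline.separator=" | " shows the value that println() and newLine() will pick up.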
In theory it could be any string. In practice it should be pretty much fixed to one of \n or \r\n, as any other value would cause discrepancies between the generated output and various input requirements, for example:
- Properties.store():
  Writes this property list... in a format suitable for using the load(Reader) method.
- Properties.load():
  Properties are processed in terms of lines... A natural line is defined as a line of characters that is terminated either by a set of line terminator characters (\n or \r or \r\n) or by the end of the stream.
Properties.store() uses the system line.separator, but Properties.load() correctly processes only lines delimited by \n, \r, or \r\n.
The same applies to formatted XML output. Properties.storeToXML() uses line.separator to format the output, but Properties.loadFromXML() can fail if a non-standard line.separator was in effect when the file was written.
Here's a code example exploring the effects:
    import static java.nio.charset.StandardCharsets.UTF_8;

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.OutputStreamWriter;
    import java.io.PrintStream;
    import java.io.StringReader;
    import java.io.StringWriter;
    import java.util.Properties;
    import java.util.function.BiConsumer;

    public class LineSeparatorDemo {

        static final String LF = "\n";
        static final PrintStream out = System.out;

        // Start the JVM with -Dline.separator=" | "
        public static void main(String[] args) throws Exception {
            // println() terminates lines with the system line separator
            out.println("Hello,");
            out.println("World!");
            out.print(LF);

            String configSource;
            String xmlSource;
            {
                // Store properties as text and as XML
                Properties config = new Properties();
                config.setProperty("foo", "bar");
                config.setProperty("baz", "qux");
                String comment = "Comment first line," + LF
                        + "comment second line.";

                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                config.store(new OutputStreamWriter(buf, UTF_8), comment);
                configSource = buf.toString(UTF_8);
                out.print(configSource);
                out.print(LF);

                buf = new ByteArrayOutputStream();
                config.storeToXML(buf, comment);
                xmlSource = buf.toString(UTF_8);
                out.print(xmlSource);
                out.print(LF);
            }
            out.print(LF);
            {
                // Write plain lines with newLine(), then read them back
                StringWriter buf = new StringWriter();
                try (BufferedWriter lineWriter = new BufferedWriter(buf)) {
                    lineWriter.write("foo=bar");
                    lineWriter.newLine();
                    lineWriter.write("baz=qux");
                    lineWriter.newLine();
                }
                configSource = buf.toString();
                new BufferedReader(new StringReader(configSource))
                        .lines().forEach(line -> out.append(line).print(LF));
            }
            out.print(LF);
            {
                // Load the generated text and XML back into Properties
                BiConsumer<Object, Object> printEntry = (k, v) -> out.append("[")
                        .append(String.valueOf(k)).append("]=")
                        .append(String.valueOf(v)).print(LF);

                Properties config = new Properties();
                config.load(new StringReader(configSource));
                config.forEach(printEntry);
                out.print(LF);

                config.clear();
                config.loadFromXML(
                        new ByteArrayInputStream(xmlSource.getBytes(UTF_8)));
                config.forEach(printEntry);
            }
            out.print(LF);
        }
    }

As you may notice, there's an issue even with plain text lines that carry no other semantics:
- BufferedWriter.newLine():
  Writes a line separator. The line separator string is defined by the system property line.separator, and is not necessarily a single newline ('\n') character.
  and the related Files.write(path, lines, ...):
  Each line is a char sequence and is written to the file in sequence with each line terminated by the platform's line separator, as defined by the system property line.separator.
  then
- BufferedReader.readLine():
  Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), a carriage return followed immediately by a line feed, or by reaching the end-of-file (EOF).
  and the related Files.readAllLines():
  This method recognizes the following as line terminators:
  - \u000D followed by \u000A, CARRIAGE RETURN followed by LINE FEED
  - \u000A, LINE FEED
  - \u000D, CARRIAGE RETURN
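The reading side of this mismatch can be reproduced without restarting the JVM, by feeding BufferedReader text that looks like what newLine() would produce under -Dline.separator=" | ". A minimal sketch; ReadLineDemo and splitLines are illustrative names of mine, not JDK API.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

final class ReadLineDemo {

    /** Splits text the way BufferedReader.readLine() does. */
    static List<String> splitLines(String text) {
        return new BufferedReader(new StringReader(text))
                .lines()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // " | " is not a recognized terminator, so everything stays on one line.
        System.out.println(splitLines("foo=bar | baz=qux | "));
        // prints: [foo=bar | baz=qux | ]

        // \n, \r and \r\n are all recognized, whatever line.separator is.
        System.out.println(splitLines("a\nb\rc\r\nd"));
        // prints: [a, b, c, d]
    }
}
```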
Other examples I've explored that exhibit the same issue include writing XML to a StreamResult using a Transformer configured with the indent=yes output property.
All this makes me think: Should I actually use the system line.separator when generating text output? When writing some data format (JSON, XML, etc.) or program source code, I can't think of a case where anything other than \n or \r\n is accepted, and both are (almost?) always equally accepted.
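For the FilterWriter I mentioned, one option would be to not consult line.separator at all and take the terminator as a constructor argument. Below is a minimal sketch of that idea, not a finished implementation: FixedEolWriter, its newLine() method and the normalize helper are hypothetical names, and it works character by character for clarity rather than speed.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical writer that rewrites \n, \r\n and \r to one fixed terminator. */
final class FixedEolWriter extends FilterWriter {
    private final String eol;
    private boolean lastWasCR; // true if the previous char seen was '\r'

    FixedEolWriter(Writer out, String eol) {
        super(out);
        this.eol = eol;
    }

    @Override
    public void write(int c) throws IOException {
        if (c == '\n') {
            // An LF right after a CR is the tail of \r\n, already emitted.
            if (!lastWasCR) out.write(eol);
        } else if (c == '\r') {
            out.write(eol);
        } else {
            out.write(c);
        }
        lastWasCR = (c == '\r');
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) write(buf[i]);
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) write(s.charAt(i));
    }

    /** Explicit "new line" that does not depend on line.separator. */
    void newLine() throws IOException {
        out.write(eol);
        lastWasCR = false;
    }

    /** Convenience helper for trying the writer on a whole string. */
    static String normalize(String text, String eol) {
        StringWriter sink = new StringWriter();
        try (FixedEolWriter w = new FixedEolWriter(sink, eol)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

With this, FixedEolWriter.normalize("a\r\nb\rc\nd", "\n") yields "a\nb\nc\nd". Because the CR state is kept in a field, a \r\n pair split across two write calls is still collapsed into a single terminator.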
A related question I've found: If we're told to use the system property line.separator instead of hard-coding \n, why does my code work on Windows?
Source: https://stackoverflow.com/questions/780 ... for-output