8.4.3 Delimiters
8.4.3.1 Record Delimiter
Each message created in accordance with this part of the RDR standard shall be separated into individual Records, with each Record placed on one line terminated by a line feed (Unicode U+000A) or a carriage return and line feed pair (Unicode U+000D U+000A).
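As an illustration only, a recipient could split a message into Records along the following lines. This is a minimal sketch: the function name `split_records` is hypothetical and not part of the standard, and the message is assumed to have already been decoded into a string.

```python
from typing import List


def split_records(message: str) -> List[str]:
    """Split a message into Records on LF or CRLF terminators."""
    # Normalise CRLF to LF so that both permitted terminators behave alike.
    lines = message.replace("\r\n", "\n").split("\n")
    # A terminated final Record leaves one empty trailing string; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

For instance, `split_records("a\tb\r\nc\td\n")` yields two Records, each still containing its tab-separated Cells.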
8.4.3.2 Primary Delimiter
Cells within a Record are separated by tab characters (Unicode U+0009). The messages created in accordance with this part of the RDR standard are therefore TSV files and have a .tsv file extension.
8.4.3.3 Secondary Delimiter
Should a single Cell contain two or more data elements, these data elements shall be separated by a pipe character (Unicode U+007C).
All data elements in a multi-value Cell shall be of the same primitive data type (see Clause 8.4.4).
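The following sketch shows one way a recipient might split a Record into Cells, or a multi-value Cell into data elements, while respecting the escaping codes defined in Clause 8.4.3.7. The function name `split_on` is hypothetical. The sketch assumes a valid message, in which a Delimiter preceded by a run of backslashes whose length is a multiple of four (including zero) is a genuine separator; the returned parts are still escaped.

```python
from typing import List


def split_on(text: str, delimiter: str) -> List[str]:
    """Split text on unescaped occurrences of a one-character delimiter."""
    parts: List[str] = []
    start = 0
    backslashes = 0  # length of the backslash run immediately before ch
    for i, ch in enumerate(text):
        if ch == "\\":
            backslashes += 1
            continue
        # A run length of 4k means only escaped backslashes precede ch,
        # so ch is a real separator rather than an escaped character.
        if ch == delimiter and backslashes % 4 == 0:
            parts.append(text[start:i])
            start = i + 1
        backslashes = 0
    parts.append(text[start:])
    return parts
```

A Record would first be split with `split_on(record, "\t")`, and each multi-value Cell then with `split_on(cell, "|")`.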
8.4.3.4 Namespace Delimiter
Should a Cell contain a data element whose origin needs to be provided, the data element shall be preceded by a string that provides a "namespace" and two colon characters (Unicode U+003A).
For example, a party identifier can be communicated as ISNI::0000000081266409, indicating that the identifier (0000000081266409) is an International Standard Name Identifier (ISNI).
The sender should ensure that the recipient can, for each specific namespace, ingest data in this form.
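A recipient might separate the namespace from the data element roughly as follows. This is a sketch under stated assumptions: the function name `split_namespace` is hypothetical, and the first occurrence of two colon characters is taken to be the namespace delimiter.

```python
from typing import Optional, Tuple


def split_namespace(element: str) -> Tuple[Optional[str], str]:
    """Return (namespace, value); namespace is None when no '::' is present."""
    namespace, delimiter, value = element.partition("::")
    if not delimiter:
        # No namespace delimiter found: the whole string is the value.
        return None, element
    return namespace, value
```

With this sketch, `split_namespace("ISNI::0000000081266409")` returns `("ISNI", "0000000081266409")`.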
8.4.3.5 Spaces and Delimiters
Delimiters shall not be surrounded by extra space characters.
For example, the writer pair Lennon/McCartney should be communicated as Lennon|McCartney and not as Lennon⎵|⎵McCartney (with ⎵ representing a space character).
8.4.3.6 Received Spaces and Delimiters
If a sender has received data with extra white space characters, they are encouraged to trim them when compiling a message in accordance with this part of the RDR standard. For example, if the sender received data with the writers Lennon as “Lennon⎵” and McCartney as “McCartney⎵”, then the writer pair should be communicated by the sender as Lennon|McCartney.
However, it is also permitted, for a sender that received data with the writers Lennon as “Lennon⎵” and McCartney as “McCartney⎵”, to communicate the writer pair as Lennon⎵|McCartney⎵ if the sender is required to provide data “as received” from third parties.
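Both permitted behaviours can be sketched as a single helper. The function name `join_values` and its `trim` flag are hypothetical, and the sketch assumes the values contain no characters that would need escaping.

```python
from typing import Iterable


def join_values(values: Iterable[str], trim: bool = True) -> str:
    """Join data elements with the Secondary Delimiter, optionally trimming."""
    if trim:
        # Encouraged behaviour: remove surrounding white space as received.
        values = [value.strip() for value in values]
    # With trim=False the values are passed through "as received".
    return "|".join(values)
```

Here `join_values(["Lennon ", "McCartney "])` gives `Lennon|McCartney`, while passing `trim=False` keeps the trailing spaces that were received.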
8.4.3.7 Communicating Delimiters
To communicate a Delimiter in a Cell, such a Cell shall not be enclosed in double quote characters. Instead the Delimiter shall be immediately preceded by an escaping code as follows:
To escape a tab character contained in a text string, the escaping code is the backslash character (Unicode U+005C). Therefore, the string A[TAB]B would have to be communicated as A\[TAB]B (with [TAB] representing the tabulator);
To escape a pipe character contained in a text string, the escaping code is a double backslash, i.e. two backslash characters (each Unicode U+005C). Therefore, the string A|B would have to be communicated as A\\|B; and
To communicate a backslash character, the escaping code is a triple backslash. Therefore, the string A\B would have to be communicated as A\\\\B (the three-backslash escaping code followed by the backslash itself).
These escaping mechanisms must be used for all special characters in all Cells, whether those Cells allow multiple values or not. A non-escaped pipe character in a single-value Cell is, consequently, an error.
For the avoidance of doubt, escaping a character that should not be escaped, or not escaping a character that should have been escaped, will lead to an invalid message.
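The escaping codes above could be applied and reversed along the following lines. This is a minimal sketch rather than a reference implementation: the names `escape_cell` and `unescape_cell` are hypothetical, and the functions operate on a single data element, so any unescaped Delimiter or stray backslash is treated as an error.

```python
BACKSLASH = "\\"
PIPE = "|"
TAB = "\t"


def escape_cell(text: str) -> str:
    """Prefix each special character with its escaping code."""
    out = []
    for ch in text:
        if ch == BACKSLASH:
            out.append(BACKSLASH * 3 + ch)  # triple backslash + backslash
        elif ch == PIPE:
            out.append(BACKSLASH * 2 + ch)  # double backslash + pipe
        elif ch == TAB:
            out.append(BACKSLASH + ch)      # single backslash + tab
        else:
            out.append(ch)
    return "".join(out)


def unescape_cell(text: str) -> str:
    """Reverse escape_cell; raise ValueError on invalid escaping."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith(BACKSLASH * 4, i):
            out.append(BACKSLASH)
            i += 4
        elif text.startswith(BACKSLASH * 2 + PIPE, i):
            out.append(PIPE)
            i += 3
        elif text.startswith(BACKSLASH + TAB, i):
            out.append(TAB)
            i += 2
        elif text[i] in (BACKSLASH, PIPE, TAB):
            # A special character that is not correctly escaped.
            raise ValueError(f"invalid escaping at position {i}")
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Checking for four backslashes before the shorter escaping codes means a run such as six backslashes followed by a pipe decodes as one backslash and then one pipe.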