A Witness represents one document, file, stream, or any other character input that is subject to a BetterDiff process.
For the caller, the content of the Witness, regardless of its origin or form, must always be a sequence of characters (a character string). Every character must represent exactly one symbol, letter, command, or other singular entity. Every such entity is exposed to the caller as an array of bytes, and the whole content as a string. For more technical details, please see the note at the end of this documentation.
Example 1:
The Witness represents a text file with two lines: "line 1" and "line 2". The content of the Witness will therefore be a string of 13 characters: "line 1" as characters 1 to 6, the newline as character 7, and "line 2" as characters 8 to 13.
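The character count in Example 1 can be checked with a few lines of Java. This is only an illustrative sketch: the class name TextWitnessExample and the method content() are hypothetical and not part of BetterDiff, and the sketch assumes that the line break is a single "\n" character (a "\r\n" line break would count as two characters).

```java
final class TextWitnessExample {

    // Hypothetical helper: the content of the two-line text file as one String.
    // Assumption: the line break is the single character '\n'.
    static String content() {
        return "line 1" + "\n" + "line 2";
    }

    public static void main(String[] args) {
        String content = content();
        // 6 characters + 1 newline + 6 characters
        System.out.println(content.length());            // prints 13
        // Character number 7 (index 6 in Java's 0-based indexing) is the newline.
        System.out.println(content.charAt(6) == '\n');   // prints true
    }
}
```

Note that the documentation counts characters from 1, while Java indexes from 0.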
Example 2:
The Witness represents a DNA fragment of a gene. The content of the Witness will therefore be a string of nucleotides, each represented by its nucleobase, e.g. "AGCATATCGG", where every single nucleobase is represented by exactly 1 character.
Example 3:
The Witness represents a binary file without a known internal structure. The content of the Witness will therefore be a string of bytes, e.g. 15, 202, 156, 245, 31, where every single byte is represented by exactly 1 character.
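One possible way to obtain a string in which every byte is exactly 1 character is to decode the bytes as ISO-8859-1, because that charset maps each byte value 0..255 to exactly one char. The sketch below assumes this mapping; it is not necessarily how BetterDiff handles binary input, and the names BinaryWitnessExample, toContent and toBytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BinaryWitnessExample {

    // Assumption: ISO-8859-1 is used so that each byte becomes exactly one char.
    static String toContent(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Reverse mapping: each char (0..255) becomes exactly one byte again.
    static byte[] toBytes(String content) {
        return content.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The bytes from Example 3; values above 127 need a cast in Java.
        byte[] raw = {15, (byte) 202, (byte) 156, (byte) 245, 31};
        String content = toContent(raw);
        System.out.println(content.length());                      // prints 5
        System.out.println((int) content.charAt(1));               // prints 202
        System.out.println(Arrays.equals(raw, toBytes(content)));  // prints true
    }
}
```

The round trip through toBytes shows that no information is lost with this mapping, which would not hold for a multi-byte charset such as UTF-8.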
Example 4:
The Witness represents a thought of a human being. The content of the Witness will therefore be a string representation of the thought, e.g. a constellation of neurons and synaptic activity, where every single neuron and every single synaptic activity is represented by exactly 1 character.
Note. The exact representation of the content is a {@link String}, not an array of chars. This is important to follow, because in Java a char is always a single 16-bit UTF-16 code unit, whereas a String can represent one character with more than one char: characters outside the Basic Multilingual Plane are stored as a surrogate pair of two chars. Likewise, when the content is encoded to bytes, 1 character may occupy more than 1 byte, e.g. 2 or 4 bytes in UTF-16, 4 bytes in UTF-32, 1 byte in ASCII etc.
However, while technically a character is an array of bytes, in the Witness's context a character is always an indivisible entity, regardless of the technical implementation of the String class.
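The difference between chars and characters can be observed directly in Java. The following sketch treats one Unicode code point as one character, which is an assumption made for illustration only; the class name CharacterExample and the method characterCount are hypothetical.

```java
final class CharacterExample {

    // Assumption: one character of the Witness content is one Unicode code point.
    static int characterCount(String content) {
        return content.codePointCount(0, content.length());
    }

    public static void main(String[] args) {
        // 'A' followed by U+1F600, which needs a surrogate pair (two chars).
        String content = "A\uD83D\uDE00";
        System.out.println(content.length());          // prints 3 (chars)
        System.out.println(characterCount(content));   // prints 2 (characters)
    }
}
```

Iterating with content.codePoints() rather than content.toCharArray() keeps each surrogate pair together as one indivisible entity.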
Evidence is a set of Witnesses in a defined order.
While the relation among Witnesses might not be a sequence, but rather a tree structure or a graph, every Witness has its own number. These numbers must be unique and must form a sequence that starts at 1 and has no gaps. A Witness's number is referred to as its ordinal number.
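The rule for ordinal numbers (unique, starting at 1, no gaps) can be expressed as a small check. This is a minimal sketch with hypothetical names (OrdinalExample, isValidOrdinalSequence); it only illustrates the rule and is not taken from the BetterDiff API.

```java
import java.util.Arrays;
import java.util.List;

final class OrdinalExample {

    // Returns true if the ordinals are exactly the numbers 1..n, each used once.
    // If n values all lie in 1..n and none repeats, there can be no gap.
    static boolean isValidOrdinalSequence(List<Integer> ordinals) {
        int n = ordinals.size();
        boolean[] seen = new boolean[n + 1];
        for (int ordinal : ordinals) {
            if (ordinal < 1 || ordinal > n || seen[ordinal]) {
                return false; // out of range (implies a gap) or duplicate
            }
            seen[ordinal] = true;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidOrdinalSequence(Arrays.asList(1, 2, 3))); // prints true
        System.out.println(isValidOrdinalSequence(Arrays.asList(1, 3)));    // prints false (gap at 2)
        System.out.println(isValidOrdinalSequence(Arrays.asList(1, 1)));    // prints false (duplicate)
        System.out.println(isValidOrdinalSequence(Arrays.asList(0, 1)));    // prints false (starts at 0)
    }
}
```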
Collecting Evidence is one of the core phases. During this phase, Witnesses are identified and collected into an Evidence.
Generally, this phase consists of the following steps:
Collected Evidence should not be modified after the phase has finished.
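One way to guard against later modification is to take a defensive copy of the collected Witnesses and expose it only as an unmodifiable list. The sketch below assumes that Witness contents are plain Strings and uses the hypothetical class EvidenceSketch; the actual Evidence type may be designed differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class EvidenceSketch {

    private final List<String> witnesses;

    // Copies the collected Witnesses so that later changes to the
    // caller's list do not affect this Evidence.
    EvidenceSketch(List<String> collected) {
        this.witnesses = Collections.unmodifiableList(new ArrayList<>(collected));
    }

    // Ordinal numbers start at 1, list indexes at 0.
    String witness(int ordinal) {
        return witnesses.get(ordinal - 1);
    }

    int size() {
        return witnesses.size();
    }

    // The returned view rejects add, remove and set.
    List<String> witnesses() {
        return witnesses;
    }

    public static void main(String[] args) {
        List<String> collected = new ArrayList<>(Arrays.asList("first", "second"));
        EvidenceSketch evidence = new EvidenceSketch(collected);
        collected.add("third");                    // does not reach the Evidence
        System.out.println(evidence.size());       // prints 2
        System.out.println(evidence.witness(1));   // prints first
    }
}
```

Calling evidence.witnesses().add(...) throws an UnsupportedOperationException, so accidental modification after the collecting phase fails fast.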