|Compilers' Toolbox™||- provided by Danish Food Informatics|
Presentation of values
|Significant digits - significant figures|
Numerical values in food data banks are most often measured (analytical) values, i.e. values measured with a certain uncertainty. The uncertainty of a measured result depends on many different parameters and is determined by the (analytical) method and the circumstances under which the result was measured.
Significant digits should not be confused with the number of decimal places in a value; it is the number of significant digits that matters. For example, the numbers 123, 12.3, 1.23, 0.123, and 0.0123 all have three significant digits and are all expressed with comparable relative uncertainty. “Rounding” these numbers to, e.g., 2 decimal places changes the implied uncertainty: the first two numbers (123.00 and 12.30, respectively) are expressed with more significant digits and thus a lower uncertainty (higher precision) than was measured, the third number (1.23) keeps its precision, and the last two numbers (0.12 and 0.01, respectively) are expressed with a higher uncertainty.
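The effect described above can be checked with a short sketch. It is only an illustration: the helper names `significant_digits` and `round_to_decimals` are made up for this example, the round-half-up mode is an assumption, and the counting rule treats every digit written after the first non-zero digit as significant (so a value written as "1200" would count as four digits).

```python
from decimal import Decimal, ROUND_HALF_UP


def significant_digits(text: str) -> int:
    """Count the significant digits of a decimal number given as text.

    Decimal keeps trailing zeros ("12.30" has 4 digits) and ignores
    leading zeros ("0.0123" has 3 digits).
    """
    return len(Decimal(text).as_tuple().digits)


def round_to_decimals(text: str, places: int) -> Decimal:
    """Round to a fixed number of decimal places (half-up, by assumption)."""
    quantum = Decimal(1).scaleb(-places)  # e.g. places=2 gives 0.01
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)


values = ["123", "12.3", "1.23", "0.123", "0.0123"]
for text in values:
    rounded = round_to_decimals(text, 2)
    # original value, its significant digits, rounded value, its significant digits
    print(text, significant_digits(text), rounded, significant_digits(str(rounded)))
```

All five inputs start with three significant digits; after rounding to two decimal places they carry five, four, three, two, and one, respectively.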
|Value uncertainty and significant digits|
As numerical results can easily span several orders of magnitude, this has an important impact on how they are expressed:
“The numerical values of the result and its uncertainty should not be given with an excessive number of digits. Whether expanded uncertainty U or a standard uncertainty u is given, it is seldom necessary to give more than two significant digits for the uncertainty. Results should be rounded to be consistent with the uncertainty given” (Eurachem/CITAC).
Greenfield & Southgate indicate for food composition data that
“the last digit cited in the value should reflect the precision of the analysis and values should not be cited in such a way as to give a false impression of the precision with which a constituent can be measured. Because foods vary in composition, it is also fundamentally incorrect to cite values that imply that the composition is defined to a higher level than its natural variation”.
Greenfield and Southgate further emphasize that the number of significant digits for food composition data can never be higher than 3 and that the uncertainty of the data very often only allows for 2.
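The Eurachem/CITAC recommendation quoted above (at most two significant digits for the uncertainty, and a result rounded to match) can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name `round_with_uncertainty` is hypothetical, and round-half-up is an assumed rounding mode.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_with_uncertainty(value: str, uncertainty: str, sig: int = 2):
    """Round an uncertainty to `sig` significant digits and round the
    value to the same decimal position.

    Hypothetical helper; round-half-up is an assumption.
    """
    v = Decimal(value)
    u = Decimal(uncertainty)
    if u <= 0:
        raise ValueError("uncertainty must be positive")
    # adjusted() is the exponent of the leading digit, so this is the
    # exponent of the last digit that is kept.
    exponent = u.adjusted() - (sig - 1)
    quantum = Decimal(1).scaleb(exponent)
    return (
        v.quantize(quantum, rounding=ROUND_HALF_UP),
        u.quantize(quantum, rounding=ROUND_HALF_UP),
    )


result, unc = round_with_uncertainty("12.3456", "0.1234")
print(result, "±", unc)
```

With these inputs the uncertainty is reduced to 0.12 and the result is rounded to the same position, 12.35.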
|Numerical values in database systems|
It is important that numerical values that are results of measurements always carry, in their expression, the uncertainty with which they were determined. They must keep the number of significant digits with which they were entered all the way through the system, and this must also be preserved in any subsequent data interchange.
However, database systems normally have no built-in way of controlling the number of significant digits, only the number of decimals, as in spreadsheets. Under normal circumstances, measured values therefore cannot simply be stored in the built-in numerical representation of these systems, i.e. as a number (single or double precision), without losing this information.
Special precautions may be necessary in the programming of the databanks. The simplest approach is to store data in a text data type, with conversion routines that convert the text when the numerical value is needed. The text data type preserves the value exactly as entered. Another approach is to store the data as a number together with information about the original number of decimals (or significant digits). In some rare cases this method may alter a value slightly, but it avoids conversion from text to a numerical data type.
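Both storage strategies can be illustrated with a small sketch. It is an assumption-laden example, not the implementation of any particular databank: `to_stored` and `from_stored` are hypothetical helper names, and the second strategy is shown with the number of decimals rather than significant digits.

```python
from decimal import Decimal

entered = "12.30"  # value exactly as typed in by the compiler

# A native floating-point number loses the trailing zero.
print(str(float(entered)))    # 12.3

# Strategy 1: keep the text and convert only when arithmetic is needed.
print(str(Decimal(entered)))  # 12.30


# Strategy 2: store a number plus the original number of decimals.
def to_stored(text: str) -> tuple[float, int]:
    """Split the entered text into (number, number of decimals).

    Assumes a finite decimal value.
    """
    d = Decimal(text)
    decimals = max(0, -d.as_tuple().exponent)
    return float(d), decimals


def from_stored(number: float, decimals: int) -> str:
    """Rebuild the original expression from the stored pair."""
    return f"{number:.{decimals}f}"


number, decimals = to_stored(entered)
print(from_stored(number, decimals))  # 12.30
```

The text-based strategy round-trips by construction; the second relies on the stored number formatting back to the same digits, which is where the occasional slight alteration mentioned above can arise.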
|Numerical values in data interchange systems|
Just as it is important to preserve, throughout data processing, the original uncertainty with which the numerical values were determined, it is also important to maintain the original numerical expression of results during and after interchange of data, i.e. to preserve the original number of significant digits. In the EuroFIR XML data interchange schemes, specific measures have been taken to accommodate precise data preservation in the definition of the XML schemata, the formal specification for XML files using the EuroFIR Food Data Transport Package.
The specific data types include a string data type (decimal-as-string) for the representation of numerical values, instead of XML’s standard data type decimal. The data type, decimal-as-string, has been defined for the following reasons:
The EuroFIR XML schema definition of decimal-as-string is
Similarly, due to the specifications in the draft EuroFIR Standard’s Technical Annex, it has been necessary to define specific data types for dates, allowing for the convention that some dates (e.g. in the Component Value) can be, e.g., ‘Before 1992’ or ‘1992-02’. In most cases, however, the basic XML date data type has been used. Such standard data types are referred to by the prefix ‘xsd’, e.g. xsd:date.
For the language attributes, too, it has been necessary to make a specific EuroFIR definition, string-language, instead of using xml:lang. The reason is that xml:lang was not supported by any validator tool tested by EuroFIR, which means that in practice any string would pass validation. The string-language data type can at least check the format of the string, although not whether it is an existing language defined in ISO 639.
The current GS1 GDSN Food and Beverage Extension uses the general GS1 rules for data types as specified in chapter 8 of the GS1 Technical User Guide to GS1 XML 3.x, and does not include a similar restriction.
In numerical food databanks as well as in food data interchange schemes, it is important to keep the original expression of results and their uncertainty. Therefore, specific measures must be taken to preserve the original number of significant digits the results were expressed with.