Data and Information

latest update 2022-10-12  

Introduction

This topic discusses the difference between data and information, their relationship, and the way they are handled in ISO 15926-7/8.

Discussion

The following definition can be found in Wikipedia:

Data is any sequence of one or more symbols. Datum is a single symbol of data.
Data requires interpretation to become information.

Keys in data provide the context for values. Regardless of the structure of data, there is always a key component present. Data keys in data and data-structures are essential for giving meaning to data values. Without a key that is directly or indirectly associated with a value, or collection of values in a structure, the values become meaningless and cease to be data. That is to say, there has to be at least a key component linked to a value component in order for it to be considered data.

This is one definition among many, but it captures the essence for this topic, in which only structured data will be dealt with (unstructured data, such as the text of a document, can be converted to structured data, with some limitations).
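The role of keys can be illustrated with a minimal Python sketch. The field names (`tag`, `property`, `value`, `unit`) and the `describe` helper are hypothetical, chosen only for this illustration; they are not ISO 15926 terms.

```python
# A bare value carries no meaning on its own: 37.9 of what, and for which object?
bare_value = 37.9

# Keys that are associated with the value supply the context
# that is needed to interpret it.
record = {
    "tag": "P-101",
    "property": "capacity",
    "value": 37.9,
    "unit": "m3/h",
}


def describe(rec: dict) -> str:
    """Interpret a keyed record as a human-readable statement."""
    return f"{rec['tag']} has a {rec['property']} of {rec['value']} {rec['unit']}"


print(describe(record))  # P-101 has a capacity of 37.9 m3/h
```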

(Parts of) sets of data may be required by different parties in different combinations ('views').

So far this is common knowledge: none of it requires ISO 15926, and a relational database with SQL is sufficient.

ISO 15926

In ISO 15926-7/8 information is represented by creating a "template" that combines the required data for that representation.

The semantics of a template are defined using a graph with interrelated ISO 15926 Part 2 entity types.

The scope of ISO 15926 is multi-layered, thus fulfilling the requirements listed below. These requirements have shaped ISO 15926 Parts 7 and 8 as follows:

Internationally applicable

This excludes any commercial system, since no such system will ever dominate the entire global IT world.
That said, ISO 15926 is not meant to be used for transactional systems. It is meant to serve as a neutral information integration platform in the background.
It is therefore complementary to such systems and enables data sharing between all applications and systems that are used during the entire life cycle of a facility.

Integration of all facility information

All information representations in ISO 15926 are derived from one and the same Upper Ontology, a generic data model as defined in ISO 15926-2. That standard provides the definitions of the fundamental concepts and most of their fundamental relationships (additional relationships can be generated in an ISO 15926-compliant way).

The next extension of that Upper Ontology is a Reference Data Library (RDL), which defines generic real-world specializations of ISO 15926-2 entity types. For the process industries that RDL can be found here; its contents are being subjected to ISO ballots in order to formalize them as the next version of ISO 15926-4.

For ISO 15926, an International Standard that focuses on global semantic interoperability, it is crucial that the data components that together define information are themselves defined as information, to a level that is well understood on a global scale, ultimately in the form of a natural language. As a consequence, all information is defined by those globally accessible and understandable reference things and/or one or more literals.

Integration of life-cycle information

This layer sets ISO 15926-7/8 apart from the rest, because each of its 'templates' is a self-contained representation of some elementary information with full reference to declared objects and concepts in the RDL or company extensions thereof. For example: "Pump P-101 has a capacity of 37.9 m3/h" is represented as:

# declaration of P-101
:f949a224-a9bb-4d4f-8abe-548b0d6071d5  rdf:type  lci:InanimatePhysicalObject,
    dm:WholeLifeIndividual, lci:NonActualIndividual, rdl:RDS327239 ; # PUMP
    rdfs:label  "P-101" ;
    meta:valEffectiveDate  "2021-07-15T13:24:00Z"^^xsd:dateTime .

:296f4c41-47f9-4120-a3d1-b0cbf661ecf7  rdf:type  tpl:IndividualHasPropertyWithValue ;
    tpl:hasPropertyPossessor  :f949a224-a9bb-4d4f-8abe-548b0d6071d5 ; # the UUID of pump P-101
    tpl:hasPropertyType rdl:RDS7354248 ; # CAPACITY (volume flow rate)
    tpl:valPropertyValue  "37.9"^^xsd:decimal ;
    tpl:hasScale rdl:RDS1321064 ; # m3/h
    meta:valEffectiveDate  "2021-11-23T14:47:00Z"^^xsd:dateTime .

Each template instance can carry as many metadata items as required. One of them is mandatory: valEffectiveDate, the dateTime at which the information represented by the template instance became effective (valid).

This valEffectiveDate plays a crucial role in the storage and retrieval of that life-cycle information.
Assume that another impeller is installed in P-101, resulting in a new capacity of 46.2 m3/h.
This is recorded as:

:b84d262f-e5b1-4a72-aa48-78456198f31c  rdf:type  tpl:IndividualHasPropertyWithValue ;
    tpl:hasPropertyPossessor  :f949a224-a9bb-4d4f-8abe-548b0d6071d5 ; # the UUID of pump P-101
    tpl:hasPropertyType rdl:RDS7354248 ; # CAPACITY (volume flow rate)
    tpl:valPropertyValue  "46.2"^^xsd:decimal ;
    tpl:hasScale rdl:RDS1321064 ; # m3/h
    meta:valEffectiveDate  "2022-08-23T09:33:00Z"^^xsd:dateTime .

When we want to know what the capacity of P-101 was at 2022-05-22T00:00:00Z, the query looks for the latest valEffectiveDate at or before that dateTime and finds 37.9.
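The lookup described above, taking the latest effective date that is not later than the moment of interest, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are held in a plain in-memory list rather than in a triple store, the function names `parse_utc` and `value_as_of` are hypothetical, and the sample dates and values are made up for the sketch and are not the P-101 records.

```python
from datetime import datetime, timezone
from typing import List, Optional, Tuple


def parse_utc(stamp: str) -> datetime:
    """Parse a UTC dateTime string such as '2019-01-01T08:00:00Z'."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# Each entry is (effective date, value). Entries are only ever added,
# never overwritten or deleted. The values are illustrative.
records: List[Tuple[datetime, float]] = [
    (parse_utc("2019-01-01T08:00:00Z"), 10.0),
    (parse_utc("2020-06-01T08:00:00Z"), 12.5),
]


def value_as_of(history: List[Tuple[datetime, float]],
                moment: datetime) -> Optional[float]:
    """Return the value with the latest effective date not after `moment`."""
    valid = [(effective, value) for effective, value in history if effective <= moment]
    if not valid:
        return None  # nothing was on record yet at that moment
    return max(valid, key=lambda pair: pair[0])[1]


print(value_as_of(records, parse_utc("2019-12-31T00:00:00Z")))  # 10.0
print(value_as_of(records, parse_utc("2021-01-01T00:00:00Z")))  # 12.5
print(value_as_of(records, parse_utc("2018-01-01T00:00:00Z")))  # None
```

Because older entries are never removed, the same function answers the question for any moment in the past without any special handling of superseded values.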

All information stays on record and "sediments", as it were, like clay in a river or snow in Antarctica. We can always find the information that was valid at any point in the past, keeping in mind that the present is one nanosecond after the past.
NOTE - a dateTime in the future, for example a planned date, can also be represented in ISO 15926, see here.

What about relational databases?

A study conducted by a large EPC contractor revealed that the above is practically impossible to achieve with a relational database, simply because the number of tables required would be unmanageable.

One approach could be to convert the table rows, each with a 'permalink', to RDF and add a valEffectiveDate to each row instance. That dateTime then applies to all data in that row instance. To recreate the history, such row instances could be saved, each with its valEffectiveDate, as a kind of time slice.

But we would still have the rigid table structures that prevent easy extensions of the data to related domains.
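The row-based approach can be sketched as follows. This is a rough Python illustration and not an RDF conversion: rows are kept as plain dictionaries, and the names `record_row` and `row_as_of`, the permalink value, and the column names are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional

# Append-only store of time slices; earlier slices are never modified.
time_slices: List[dict] = []


def record_row(permalink: str, row: Dict[str, object], effective: datetime) -> None:
    """Save a full copy of the row as a time slice with one effective date."""
    time_slices.append(
        {"permalink": permalink, "valEffectiveDate": effective, "row": dict(row)}
    )


def row_as_of(permalink: str, moment: datetime) -> Optional[Dict[str, object]]:
    """Return the row slice that was effective for `permalink` at `moment`."""
    matching = [
        s for s in time_slices
        if s["permalink"] == permalink and s["valEffectiveDate"] <= moment
    ]
    if not matching:
        return None
    return max(matching, key=lambda s: s["valEffectiveDate"])["row"]


# Hypothetical permalink and columns, for illustration only.
link = "https://example.org/pumps/row/1"
record_row(link, {"tag": "P-1", "capacity": 10.0, "unit": "m3/h"},
           datetime(2019, 1, 1, tzinfo=timezone.utc))
record_row(link, {"tag": "P-1", "capacity": 12.5, "unit": "m3/h"},
           datetime(2020, 6, 1, tzinfo=timezone.utc))

print(row_as_of(link, datetime(2019, 12, 31, tzinfo=timezone.utc))["capacity"])  # 10.0
```

Because the effective date applies to the whole row, a change to a single column stores a new copy of every column, and the set of columns remains fixed by the table definition.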