© Northern Lighthouse Ltd - Last updated 29 Aug 2011
The reader is referred to WMO official documentation as the source of definitions of terms related to BUFR and CREX. The information about BUFR and CREX in this file is based on our interpretation of WMO documents, and in case of discrepancies, WMO documentation should be considered the valid one.
This glossary contains frequently used terms and their definitions. Some are related to table-driven codes, others are general computing terms, and others are terms used in Cipher products. To indicate the scope of each term we use a letter (in parentheses) after each term:
API = Application Programming Interface, i.e. the rules about how to call the routines of a given library.
Cipher SoftBUFR and T-CREX class libraries provide a simple and clean API, thanks to OOP.
Using operator descriptor
CREX coding does not support associated fields.
BUFR = Binary Universal Form for Representation of meteorological data, defined by WMO in its publication No. 306, Manual on Codes (Part B - Binary Codes).
Another recommended source of information is the WMO page Guide to WMO Binary Code Forms.
Cipher SoftBUFR is a class library created by Northern Lighthouse Ltd to handle BUFR coding.
Section 0 of BUFR is an
The
There are differences between section 1 in BUFR edition 3 and section 1 in BUFR edition 4.
In both editions, generating centres are allowed to extend section 1 for local use.
By default, Cipher SoftBUFR creates messages with a standard-length section 1, but it provides tools to create or access a longer section 1.
This is the
By default, Cipher SoftBUFR creates BUFR messages without section 2, but it does provide tools to create or access section 2.
Section 3 is the
Normally these descriptors need to be expanded to find out the individual descriptors.
Section 4 is the
This is the
The BUFR coding system is table-driven. The main BUFR tables are:
The same as Type.
A character coding standard (Comité Consultatif International Télégraphique et Téléphonique, International Alphabet No. 5), functionally equivalent to ASCII. All character data within BUFR and CREX messages are coded according to this standard.
The code number of the originating / generating centre is stored in Section 1 of a BUFR message. CREX messages do not carry a centre value in their metadata.
Originating centres can define local extensions to BUFR and CREX.
The Cipher series is a set of software products created by Northern Lighthouse Ltd to handle BUFR and CREX coding.
A software library providing solutions for a given problem field, created using OOP.
Some numeric values in BUFR and CREX messages are not pure numeric values but indices (or
codes) to non-numeric values (e.g. descriptor
A compressed BUFR message is a special case of a multisubset BUFR message. When several observation reports share exactly the same structure (in particular, no delayed replication), it is possible to encode them using a special compression method and produce a single BUFR message that is even shorter than the uncompressed multisubset BUFR message. This reduction in size makes more efficient use of storage space and communication lines.
CREX messages cannot be compressed.
Cipher SoftBUFR is able to decode a set of observations which have been encoded as a single compressed message much faster than if the observations had been encoded as single messages or as a single uncompressed multisubset message.
A compressed BUFR message is built by first storing the values of the first element descriptor for all observation reports, then the values of the second element descriptor, and so on.
For each element descriptor, the minimum value is computed and that value is stored into a bitfield as described by that element descriptor. Then follows a six-bit field that tells how many bits are required to store the delta values. Then the delta values follow.
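The per-element scheme described above can be sketched as follows. This is a simplified illustration, not the Cipher SoftBUFR implementation: the function names are hypothetical, the values are assumed to be already-scaled non-negative integers, and both the actual bit packing and the handling of missing values are left out.

```python
def compress_element(values):
    """Compress the raw integer values of ONE element descriptor
    taken across all subsets (simplified sketch).

    Returns (minimum, nbits, deltas):
      minimum - the smallest raw value, stored once
      nbits   - bits needed per delta (this number goes into the six-bit field)
      deltas  - each value expressed as an offset from the minimum
    """
    minimum = min(values)
    deltas = [v - minimum for v in values]
    nbits = max(deltas).bit_length()  # 0 when all values are identical
    if nbits > 63:
        raise ValueError("delta width does not fit into a six-bit field")
    return minimum, nbits, deltas


def decompress_element(minimum, deltas):
    """Rebuild the original raw values from the minimum and the deltas."""
    return [minimum + d for d in deltas]


# Four subsets reporting temperature as raw integers (tenths of a kelvin).
minimum, nbits, deltas = compress_element([2731, 2735, 2729, 2740])
print(minimum, nbits, deltas)
print(decompress_element(minimum, deltas))
```

The saving comes from the deltas: here each one needs 4 bits instead of the full width that the element descriptor would otherwise require for every subset.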
Coordinate descriptors are element descriptors of the form
Two adjacent coordinate descriptors define a range.
For instance,
Descriptor
Cipher libraries contain methods to focus on the coordinates or coordinate intervals selected by the user.
A character-based, table-driven coding method defined by WMO as an alternative to the binary code BUFR for areas where BUFR is not yet feasible. CREX stands for Character Representation for EXchange of data. CREX is as flexible as BUFR and it can be read by trained humans.
WMO has decided to stop developing new character-based message types and enhancing old ones. In the future all new report types will be coded in BUFR and/or CREX.
CREX entered operational phase in May 2000.
Cipher T-CREX is a class library created by Northern Lighthouse Ltd to handle CREX coding.
Characters "CREX", terminated by end-of-section mark "++".
In CREX edition 1, section 1 consists of a T-group, an A-group, message descriptors, and an optional letter E, terminated by the end-of-section mark "++".
A T-group has the form
An A-group has the form Annn, where
Letter E, if present, indicates that each data value in section 2 is preceded by a check digit.
In CREX edition 2 T-group and A-group have been extended and new groups
In edition 2 T-group has the form
In edition 2 A-group has the form Annnmmm, where
Groups Poooooppp Uuu Ssss Yyyyymmdd Hhhnn contain the following information:
Section 2 contains data values corresponding to the descriptors given in section 1, and is terminated by the end-of-section mark "++". If section 1 ends with letter "E", then each data value is preceded by a check digit.
Optional. When present, it starts with the characters "SUPP", then the local data follow, and it ends with the end-of-section mark "++".
The object-oriented programming language in which the Cipher SoftBUFR and T-CREX class libraries have been implemented.
BUFR edition 3 introduced several new quality control operator descriptors that rely on bit maps to indicate the data values to which quality control or statistical information is applied.
Abbreviation for delayed descriptor replication count.
Interpreting a BUFR or a CREX message back to its original values.
Some messages are built from similar blocks of descriptors, repeated several times. For instance, upper air soundings contain similar data from several levels. The number of such levels differs from observation to observation, but the observations share a common structure described in BUFR section 3 (CREX section 1).
Delayed replication is a concept useful in such cases. The block structure is described in section 3, while the number of blocks repeated in a particular observation is given in BUFR section 4 (in CREX section 2).
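As a rough sketch of how a decoder might resolve replication, the function below expands a BUFR descriptor list. It assumes the usual BUFR convention that a replication descriptor has the form 1 XX YYY (repeat the next XX descriptors YYY times), and that YYY = 0 marks delayed replication, in which case a class 31 descriptor follows and its value is read from the data section. The function name, the tuple representation and the descriptor numbers in the example are illustrative choices, not part of any Cipher API.

```python
def expand_replication(descriptors, counts):
    """Expand replication descriptors in a list of (F, X, Y) tuples.

    counts supplies the delayed replication counts in the order in which
    a decoder would read them from the data section.
    """
    counts = iter(counts)  # shared by nested calls, so counts are consumed in order
    out = []
    i = 0
    while i < len(descriptors):
        f, x, y = descriptors[i]
        if f != 1:
            out.append(descriptors[i])
            i += 1
            continue
        i += 1  # step over the replication descriptor itself
        if y == 0:
            # Delayed replication: keep the count descriptor (class 31),
            # and take the actual count from the data section.
            out.append(descriptors[i])
            i += 1
            repeat = next(counts)
        else:
            repeat = y
        block = descriptors[i:i + x]
        for _ in range(repeat):
            out.extend(expand_replication(block, counts))
        i += x
    return out


# One station descriptor followed by a delayed replication of a
# two-descriptor block (for example, one pressure/temperature pair per level).
template = [(0, 1, 1), (1, 2, 0), (0, 31, 1), (0, 7, 4), (0, 12, 1)]
for d in expand_replication(template, counts=[3]):
    print(d)
```

The same template expands to a different length for each observation, depending on the count found in the data, which is why the structure can be shared while the number of levels varies.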
Descriptors, stored in section 3 of a BUFR message (in section 1 of a CREX message), describe the structure of the observation report. They are the link between the observation values and the bitstream of data in section 4 (the groups of data in section 2 in CREX).
All descriptors have a unique name of the form
The value of
Digits
Although in BUFR literature descriptors are often written as 5-digit integers, in Cipher SoftBUFR documentation descriptors are written using character
WARNING: Do not use e.g.
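To illustrate how the parts of a descriptor relate to its stored form, the sketch below assumes the standard BUFR layout in which a descriptor F XX YYY occupies 16 bits in section 3: 2 bits for F, 6 bits for XX and 8 bits for YYY. The function names are hypothetical and are not taken from the Cipher libraries.

```python
def pack_descriptor(f, x, y):
    """Pack F (2 bits), X (6 bits) and Y (8 bits) into one 16-bit integer."""
    if not (0 <= f <= 3 and 0 <= x <= 63 and 0 <= y <= 255):
        raise ValueError("descriptor parts out of range")
    return (f << 14) | (x << 8) | y


def unpack_descriptor(word):
    """Split a 16-bit integer back into its (F, X, Y) parts."""
    return (word >> 14) & 0x3, (word >> 8) & 0x3F, word & 0xFF


def format_descriptor(f, x, y):
    """Render the parts in the F XX YYY form, keeping leading zeros."""
    return f"{f} {x:02d} {y:03d}"


word = pack_descriptor(0, 12, 1)
print(word, unpack_descriptor(word), format_descriptor(0, 12, 1))
```

Keeping F, X and Y as separate fields, or as a formatted string, avoids any ambiguity about leading zeros when descriptors are written in source code.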
BUFR and CREX are table-driven. WMO has defined BUFR and CREX master tables for parameters in the discipline of meteorology.
Section 1 in BUFR and CREX messages contains a code number for the discipline, making it possible to assign unique code numbers to other scientific disciplines interested in using BUFR or CREX to code their own data, using their own tables.
Oceanography has already received its own discipline number (10).
Edition 1 of BUFR was approved in 1988. Since then the BUFR specifications have changed on several occasions. Changes in edition imply changes in the handling software. BUFR edition 4 has been operational since 2 Nov 2005, and until 2012 both editions 3 and 4 will be operational in parallel. In BUFR messages the edition number is stored in section 0.
CREX edition 2 has been operational since 2 Nov 2005, and until 2012 both editions 1 and 2 will be operational in parallel. In CREX messages the edition number is stored in section 1.
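For BUFR, a decoder can read the edition number before interpreting anything else in the message. The sketch below assumes the 8-octet section 0 layout used by BUFR editions 2 and later: the four characters "BUFR", a 3-octet total message length, and a 1-octet edition number. The function name is hypothetical.

```python
def parse_section0(data: bytes):
    """Return (total_length, edition) from the first 8 octets of a BUFR message."""
    if len(data) < 8 or data[:4] != b"BUFR":
        raise ValueError("not the start of a BUFR message")
    total_length = int.from_bytes(data[4:7], "big")  # 3 octets, big-endian
    edition = data[7]
    return total_length, edition


# A hand-built section 0 announcing a 100-octet, edition 4 message.
header = b"BUFR" + (100).to_bytes(3, "big") + bytes([4])
print(parse_section0(header))
```

Software that supports several editions can branch on the returned edition number, for example to choose between the edition 3 and edition 4 layouts of section 1.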
Element descriptors are those descriptors that describe the parameters encoded in section 4 of a BUFR message (section 2 for CREX messages), their units and how the data values are encoded. Element descriptors are defined in Table B.
Building a new BUFR or CREX message.
A method in an operating system to pass information. Programs may check values of environment variables and behave in one way or another depending on those values.
Cipher SoftBUFR and T-CREX class libraries use several environment variables. You can use the utility program bufrtool (or crextool for CREX) with the parameter environ:
bufrtool environ
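As a generic illustration of a program changing its behaviour according to an environment variable, the sketch below checks VBUFR_LOCAL_UNITS, a variable named elsewhere in this glossary. Treating the mere presence of the variable as "set" is an assumption made for the example; the function name is hypothetical and the Cipher libraries may interpret the value differently.

```python
import os


def local_units_enabled(env=None):
    """Return True when VBUFR_LOCAL_UNITS is present in the environment.

    env defaults to the real process environment; passing a plain dict
    makes the check easy to try out without changing any settings.
    """
    env = os.environ if env is None else env
    return "VBUFR_LOCAL_UNITS" in env


print(local_units_enabled({"VBUFR_LOCAL_UNITS": "1"}))
print(local_units_enabled({}))
```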
Exceptions are abnormal situations that disrupt the execution of a software system. Exception handling is a mechanism for dealing with those situations.
Often descriptors stored in BUFR section 3 (CREX section 1) contain sequence descriptors and replication descriptors that need to be resolved into element descriptors to find out the detailed structure of the observation report(s) in the message.
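A minimal sketch of resolving sequence descriptors is shown below. It assumes that descriptors with F = 3 are looked up in Table D and replaced by their contents, recursively. The small dictionary stands in for a real Table D: the entries for 3 01 011 and 3 01 012 are meant to mirror the WMO date and time sequences and should be checked against the official tables, while 3 63 255 is a purely hypothetical local sequence added to show nesting. Replication descriptors are not handled here.

```python
# Illustrative excerpt only; a real decoder loads Table D from table files.
TABLE_D = {
    (3, 1, 11): [(0, 4, 1), (0, 4, 2), (0, 4, 3)],  # year, month, day
    (3, 1, 12): [(0, 4, 4), (0, 4, 5)],             # hour, minute
    (3, 63, 255): [(3, 1, 11), (3, 1, 12)],         # hypothetical local sequence
}


def expand_sequences(descriptors, table_d=TABLE_D):
    """Replace every sequence descriptor (F = 3) by its Table D contents."""
    out = []
    for d in descriptors:
        if d[0] == 3:
            # A sequence may itself contain sequences, so recurse.
            out.extend(expand_sequences(table_d[d], table_d))
        else:
            out.append(d)
    return out


for d in expand_sequences([(3, 63, 255), (0, 12, 1)]):
    print(d)
```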
In a BUFR message, bits that are used to represent on / off flags.
The data values associated with some element descriptors are built from several flag bits. The Code & Flag Table tells us how to interpret those values.
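The sketch below shows one way to turn a decoded flag value into the list of bit numbers that are set. It assumes the BUFR flag table convention that bit 1 is the most significant bit of the field and that a field with all bits set means "missing". The function name is hypothetical.

```python
def set_flag_bits(value, width):
    """Return the flag table bit numbers that are set in value.

    Bit 1 is the leftmost (most significant) bit of a field of the given
    width. Returns None when every bit is set, i.e. the value is missing.
    """
    if value == (1 << width) - 1:
        return None
    return [bit for bit in range(1, width + 1) if value & (1 << (width - bit))]


print(set_flag_bits(0b1010, 4))
print(set_flag_bits(0b1111, 4))
```

Each bit number returned can then be looked up in the corresponding flag table to obtain its meaning.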
FM 94 BUFR is the full name of BUFR, a code defined by WMO. The main reference about BUFR is the Manual on Codes, WMO-No. 306, Volume I.2, Part B - Binary Codes.
FM 95 CREX is the full name of CREX, a code defined by WMO. The main reference about CREX is the Manual on Codes, WMO-No. 306, Volume I.2, Part B.
The Cipher series has been developed in such a way that you can write generic source code that works with either BUFR or CREX data.
From the same source code you can produce two executables simply by using different options at compile/link time: one to handle BUFR messages (e.g. for use in NWP) and one to handle CREX messages (e.g. for forecasters).
A free, open-source Unix-like operating system, named after the creator of its kernel, Linus Torvalds from Finland.
All Cipher products can run on Linux systems.
Centres can define their own section of metadata as BUFR section 2 (or CREX section 3) and they can extend BUFR section 1. For instance, a centre may want to include metadata relevant for its archive system.
Cipher SoftBUFR and T-CREX can recognise local extensions if they exist, and can also access their contents. As local extensions are not standard, Cipher libraries cannot interpret them, but they provide tools to help user programs to do it. Similarly, Cipher libraries can help encoding messages with local extensions, but the user program needs to take care of the coding logic of the non-standard part.
If your centre has defined local descriptors you need to create local tables.
The descriptors defined by WMO cater for a large range of parameters. However, a centre may need to encode data for which WMO has not yet defined suitable descriptors. Centres can define their own descriptors for their own needs. Local element descriptors are stored in a local table B and local sequence descriptors in a local table D.
Within discipline 0 (meteorology), element and sequence descriptor classes
If a centre distributes messages containing local descriptors externally, recipients also need access to the local tables in order to decode the message. For this reason, the use of local descriptors should be avoided when messages are intended for external distribution, unless there is no standard alternative. For those cases, the generating centre can use a skip-local-descriptor mechanism, which allows recipients that do not have access to the local tables to skip the local data and decode the rest of the message.
When centres use local descriptors to create BUFR or CREX messages, the definition of these local descriptors must be stored in local tables. It is not possible to fully decode a BUFR or CREX message that contains local descriptors without the local tables used to create the message.
All parameters in WMO BUFR definitions are given in SI units.
Cipher SoftBUFR provides a mechanism that allows users to specify other units instead of SI units when printing values from a BUFR message. By default,
The display task in the BUFRtool utility program uses localised units when the environment variable VBUFR_LOCAL_UNITS has been set.
WMO defines and publishes BUFR and CREX master tables for discipline meteorology.
Generating centres can define local tables for internal usage.
Logically, a BUFR or a CREX message consists of one or more observation reports, encoded into BUFR or CREX.
Physically, a BUFR message consists of octets from section 0 ("BUFR") to section 5 ("
A program can produce messages during its execution. Cipher SoftBUFR and T-CREX produce runtime progress messages into standard error output.
Cipher SoftBUFR class library provides two Message Template databases: one for standard WMO message templates, and another one for local templates. The Local Message Template Database (actually it is just a file) can be useful for users who need to produce their own custom BUFR messages. The idea is that users who need to produce BUFR files can create their own entries to facilitate production.
Any BUFR message structure stored into the message template database can be used by any application simply by using message type and local subtype as keys to the template database.
Interactive encode-wmo and encode_local use Message Template Databases to build the BUFR message structure.
Metadata are data used for describing data.
BUFR and CREX messages contain not only meteorological (or oceanographic or hydrological) data. They also contain metadata that describe those data, such as type and subtype, or who has created the message (centre and subcentre).
Descriptors are also metadata. The descriptor list in section 3 of BUFR or in section 1 of CREX contains the information needed (together with tables) to interpret the data coded in the data section, i.e. section 4 of BUFR or section 2 of CREX.
Most of the data in section 1 and section 3 of BUFR and in section 1 of CREX are metadata. CREX messages contain less metadata than BUFR messages.
If several observation reports share the same structure, i.e. if they have the same descriptors in BUFR section 3 (or CREX section 1), then it is possible to encode them into a single message, with the descriptors included only once, and with the data section containing several observation reports (subsets of the message).
When several observation reports share the same expanded descriptors without delayed replication, it is possible to encode them into a single, compressed, multisubset BUFR message. Compression is not available in CREX.
When using Cipher SoftBUFR or T-CREX class libraries to decode messages, the multisubset structure is transparent to the user, i.e. application programs do not need to find out whether a message contains one or several observation reports. When encoding data, the libraries provide tools that allow users to select the structure that is more convenient for their application. By default, each observation report is encoded as a single subset message.
C++ compilers add extra characters to internal function names (depending on the types of function's parameters and return value) to allow function overloading. Sometimes this is called
If you are compiling plain C functions with a C++ compiler you may have to define your functions with extern "C" declarations to avoid problems due to (in this case unnecessary) name mangling.
Northern Lighthouse Ltd is the company that created the Cipher series.
To find out more about Northern Lighthouse Ltd or about any of the Cipher products, visit the Northern Lighthouse home page.
Within Cipher SoftBUFR and T-CREX class libraries, the term
Note that in
An observation report is composed of several observed parameter values. Some examples of traditional meteorological observation reports are SYNOP, METAR, TEMP and PILOT.
Several similar observation reports can be stored as subsets into a single BUFR or CREX message.
A BUFR message can contain e.g. pseudo upper air soundings derived from numerical forecast fields. Such reports do not contain observed data.
Section 3 of BUFR contains a flagbit that indicates whether the data in the message are observed or not.
CREX messages do not have this metadata.
Although the BUFR message structure is bit-oriented, it is based on consecutive octets. Each octet is 8 bits.
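Because data fields rarely have widths that are multiples of 8, a decoder has to read bitfields that start and end in the middle of an octet. The sketch below is a generic bit reader that treats the first bit of each octet as the most significant one; the class name is hypothetical and this is not how any particular library is implemented.

```python
class BitReader:
    """Read unsigned integers of arbitrary bit width from a run of octets."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits from the start

    def read(self, nbits):
        """Return the next nbits bits as an unsigned integer."""
        if self.pos + nbits > len(self.data) * 8:
            raise EOFError("not enough bits left")
        value = 0
        for _ in range(nbits):
            octet = self.data[self.pos // 8]
            bit = (octet >> (7 - self.pos % 8)) & 1  # most significant bit first
            value = (value << 1) | bit
            self.pos += 1
        return value


reader = BitReader(bytes([0b10101100, 0b11110000]))
print(reader.read(3), reader.read(7), reader.read(6))
```

In the example the second field spans the boundary between the two octets, which is the normal situation in a BUFR data section.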
Object Oriented Programming currently provides the best tools for building large and complicated systems and for making these systems easier to maintain.
One fundamental concept of OOP is data hiding, which makes class library APIs much simpler than traditional Fortran or C interfaces.
The basic layer of software in a computer. It performs some key basic tasks, such as the control of peripheral devices and of other programs. A multitasking operating system allows several programs to run simultaneously. Some popular operating systems are Unix and Microsoft Windows.
Operator descriptors
Cipher SoftBUFR and T-CREX class libraries print progress messages to standard error (cerr). Messages are grouped according to their severity. Lower level messages can be blocked by setting a specific environment variable.
Replication descriptor
If
BUFR and CREX messages are made of several parts called sections.
The sections in a BUFR message are:
The sections in a CREX message are:
Frequently used sequences of descriptors are stored in Table D and they are called sequence descriptors. They help to shorten the description section of BUFR (or that of CREX).
As an example, a single sequence descriptor
Le Système International d'unités (the International System of Units, 1960) is a standardised metric system.
All BUFR parameters defined by WMO in element descriptors use SI units.
Cipher SoftBUFR class library provides methods that allow the user to use local units when printing out values decoded from a BUFR message.
CREX parameters use either SI units, or
When producing BUFR or CREX messages, there are some essential metadata that the encoding software needs to know, such as the discipline, the identifier of the originating / generating centre and subcentre, or the version of the tables that will be used.
Cipher software uses a site configuration file to store these metadata. If some of them change (for instance when new tables are introduced) user applications do not need to be modified and recompiled. It is enough to update the site configuration file.
When a centre uses local descriptors in a BUFR message, a complete decoding of the message is not possible for others unless they have access to the local tables used to create it. However, using a 'skip local descriptor' descriptor allows others to skip the value associated to the local descriptor and successfully decode the standard part of the message.
This is done by using operator descriptor
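The idea can be sketched as follows, assuming the BUFR operator 2 06 YYY, which states that the immediately following local descriptor occupies YYY bits in the data section. A decoder that lacks the local tables can then still work out how many bits to step over. The function name and the example local descriptor 0 12 192 are hypothetical, and the sketch only handles plain element descriptors.

```python
def plan_fields(descriptors, known_widths):
    """Return a list of (descriptor, width_in_bits, can_decode) entries.

    known_widths maps element descriptors found in the available tables
    to their data widths. A descriptor preceded by 2 06 YYY takes its
    width from YYY, so it can be skipped even when it is unknown.
    """
    plan = []
    i = 0
    while i < len(descriptors):
        d = descriptors[i]
        if d[0] == 2 and d[1] == 6:
            if i + 1 >= len(descriptors):
                raise ValueError("operator 2 06 YYY must be followed by a descriptor")
            local = descriptors[i + 1]
            plan.append((local, d[2], local in known_widths))
            i += 2
        else:
            if d not in known_widths:
                raise KeyError(f"no table entry for descriptor {d}")
            plan.append((d, known_widths[d], True))
            i += 1
    return plan


widths = {(0, 7, 4): 14, (0, 12, 1): 12}  # illustrative widths
message = [(0, 7, 4), (2, 6, 10), (0, 12, 192), (0, 12, 1)]
for entry in plan_fields(message, widths):
    print(entry)
```

A recipient without the local tables would skip the 10 bits belonging to the local descriptor and continue decoding the standard descriptors on either side of it.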
Computer programs (from simple programs to operating systems), as opposed to hardware (the physical computer itself).
The same as Subtype.
A centre may define subcentres and allocate numbers that identify them. The centre informs WMO of the number(s) assigned to its subcentre(s), so that the identifier(s) can be included in WMO publications.
In a BUFR message the subcentre number is stored in section 1.
BUFR or CREX messages can contain one single observation report or several reports which share the same structure. A subset is a set of data values corresponding to one report.
All the subsets in a BUFR message are described by the same set of descriptors stored in section 3. All the subsets in a CREX message are described by the same set of descriptors stored in section 1.
In BUFR edition 3, generating centres can subdivide BUFR standard types into locally defined subtypes.
In BUFR edition 4, standard types are subdivided into international subtypes, defined by WMO. The use of local subtypes is maintained for backwards-compatibility. Therefore, both international subtype and local subtype can be used in the same message.
Subtypes were not used in CREX edition 1. International subtypes have been introduced with CREX edition 2.
The flexibility and expandability of BUFR and CREX codes are mainly due to the fact that an important part of the information needed is stored in external tables, so it does not need to be hardcoded in the coding software. Updating external tables is much easier than modifying software.
When software encodes or decodes messages it uses BUFR / CREX tables to find out how to do the job.
Within SoftBUFR and T-CREX class libraries, all tables are written in HTML format, so they can also be read by humans (preferably with the help of a web browser).
Table A contains standard message types.
Table B contains element descriptors. It is the most important table, as it contains information on all available parameters within one discipline. Table B provides the connection between element descriptors in BUFR section 3 (CREX section 1) and the data coded in BUFR section 4 (respectively CREX section 2).
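To show how a Table B entry connects a descriptor to the stored data, the sketch below assumes the usual BUFR rule that a physical value is recovered as (raw + reference) / 10^scale, and that a field with all bits set means "missing". The class and function names are hypothetical, and the two entries are illustrative values that should be checked against the official Table B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableBEntry:
    name: str
    unit: str
    scale: int      # power of ten applied before storing
    reference: int  # offset subtracted so that stored values are non-negative
    width: int      # data width in bits


def decode_value(entry, raw):
    """Convert a raw integer from the data section into a physical value."""
    if raw == (1 << entry.width) - 1:
        return None  # all bits set: missing value
    return (raw + entry.reference) / 10 ** entry.scale


def encode_value(entry, value):
    """Convert a physical value into the raw integer to be stored."""
    raw = round(value * 10 ** entry.scale) - entry.reference
    if not 0 <= raw < (1 << entry.width) - 1:
        raise ValueError("value does not fit into the data width")
    return raw


temperature = TableBEntry("Temperature", "K", scale=1, reference=0, width=12)
latitude = TableBEntry("Latitude", "degree", scale=2, reference=-9000, width=15)

print(decode_value(temperature, 2731))
print(decode_value(latitude, 15000))
print(encode_value(latitude, 60.0))
```

The reference value is what allows negative quantities such as southern latitudes to be stored as non-negative integers.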
This is not an actual table, i.e. it is not available in the table directory. Instead it is a collection of rules on how to handle operator descriptors. These rules are hard-coded into BUFR or CREX coding libraries.
If these rules are added to or changed, the BUFR or CREX edition number is increased (as such changes imply changes in the handling software).
Table D contains sequence descriptors, i.e. information on how to expand descriptors stored in section 3 of BUFR or section 1 of CREX.
The tables used by Cipher SoftBUFR (or BUFRtool) are all stored in the same directory, called the BUFR table directory.
The tables used by Cipher T-CREX (or CREXtool) are all stored in the same directory, called the CREX table directory.
If users need to define local tables, these should be located in the table directory as well.
TDCF stands for Table Driven Code Form. Both BUFR and CREX are TDCFs.
A message template is a predefined sequence of descriptors that defines the contents of a specific observation type.
Message Types are defined by WMO (for meteorology) and the type of each BUFR or CREX message is stored in Section 1. Message types are found in Table A.
Update Sequence Number is stored in Section 1 of a BUFR message. In the original message it is 0 and any program that updates the message (adds or changes the content, e.g. quality control information) should increment the value.
The update sequence number is not used in CREX edition 1, but has been introduced with CREX edition 2.
Master and local table files carry a version number. When creating a BUFR or a CREX message, the version number of the tables used is stored in Section 1 of the message, so that the correct version of the tables is used later when decoding the message.
BUFR master tables version 1 to 7 are backwards compatible. BUFR master tables version 8 onwards and CREX master tables are mostly backwards compatible, with a few exceptions.
World Meteorological Organization, a specialized agency of the United Nations, and the organization that defines the BUFR and CREX standards.
BUFR and CREX coding is explained in detail in the Manual on Codes, WMO-No. 306, Volume I.2, Part B - Binary Codes.
World Meteorological Organization has defined standard BUFR and CREX message templates. In