4. Startup Configuration Files

*.xtable files cannot be loaded directly into the XObjectExplorer (XOE). Instead, you reference your *.xtable files in an *.xstartup file. This lets you control which XTables are loaded for analysis, how they are related to each other, and several other settings.

Startup files

The *.xstartup file is a JSON-encoded configuration file. Its main skeleton is shown in the example below. The most important entries are:

  • "tableFileName": <String> Name of the *.xtable file to be loaded

  • "objectName": <String> Specifies how the object should be (re)named. If this argument is omitted then the name will be the one that is stored in the *.xtable file.

  • "loadToMemory": <Boolean> If this flag is set to true, the content of the respective XTable is kept in main memory (compressed or uncompressed). Otherwise, the data packages may be fetched from disk each time they are accessed, which increases access time but reduces memory consumption. (If omitted, the flag is assumed to be false.)

  • "uncompress": <Boolean> If this flag is set to true, the data packages are held uncompressed in main memory. This may reduce computation time but consumes more main memory. (If omitted, the flag is assumed to be false.)

  • "dataLoadMethod": <String> This is either "LOAD_VIA_MAPPED_BYTE_BUFFERS" or "LOAD_VIA_READ_METHOD". The default is "LOAD_VIA_MAPPED_BYTE_BUFFERS".

  • "allocateGlobalIntKeys": <Boolean> If this object has a primary key (i.e., sub-objects are or can be attached to it) and this key is of categorial type, the flag allocateGlobalIntKeys defines whether “global integer keys” are allocated for this object. If so, a mapping from the categorial keys to integer keys is built in order to improve query performance. The new keys are termed “global” because the mapping is built at the end of the startup process and is consistent across all sub-objects attached during startup (note that sub-objects may contain keys that do not exist on the level of this object). If additional sub-objects are added after the first startup process and they introduce keys that are not yet part of the allocated global key mapping, this results in an exception. Additional sub-objects may also be attached by executing a different startup configuration at a later point in time after server startup (with sub-objects that differ from those of the original startup). Therefore, use global integer keys with great care. (If omitted, the flag is assumed to be true.)

  • "globalAttributes": <Object> A mapping of the dimensions to the respective attributes to be loaded, see below. These attributes are global in the sense that they are shared by all users/sessions.

  • "sessionAttributes": <Object> A mapping of the dimensions to the respective attributes to be loaded, see below. These attributes are private/session attributes in the sense that a separate attribute is created for each user/session.

  • "autoGenerateAttributes": <Boolean> Turn on/off automatic generation of a default attribute for each dimension. (If omitted, auto-generation is disabled.)

  • "globalAutoAttributes": <Boolean> If this flag is set to true, then automatically generated attributes are created as global attributes (shared by all sessions). Otherwise, they are attached as session attributes (one private copy per session).

  • "autoAttributeExclusionPattern": <String[]> This entry allows you to exclude certain dimensions from automatic attribute generation. The dimensions are specified by name patterns.

  • "renamings": <Object> This entry is used to define how attribute states shall be renamed, see below.

  • "children": <Object[]> The list of child objects. This might contain (recursively) the same kind of entries as the .xstartup file on the top level. In particular, it might contain again a list of (grand)children.

Example:

{
        "tableFileName": "patients.xtable",
        "objectName": "PatientObject",
        "uncompress": true,
        "loadToMemory": true,

        "globalAttributes": {
        },

        "sessionAttributes": {
        },

        "renamings": {
        },

        "children": [
                {
                        "tableFileName": "prescriptions.xtable",
                        "uncompress": false,
                        "loadToMemory": false,
                        "objectName": "PrescriptionObject",
                        "globalAttributes": {  },
                        "sessionAttributes": {  },
                        "renamings": {  }
                },
                {
                        "tableFileName": "prescriptions-sample.xtable",
                        "uncompress": true,
                        "loadToMemory": true,
                        "objectName": "PrescriptionSampleObject",
                        "globalAttributes": {  },
                        "sessionAttributes": {  },
                        "renamings": {  }
                }
        ]
}
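The defaults stated in the entry list above can be made explicit before a configuration is reviewed. The following Python sketch fills in only those defaults that the list documents; the helper `with_defaults` is hypothetical and not part of XOE, which applies its defaults internally.

```python
import json

# Defaults as documented in the entry list above. Entries without a stated
# default (for example "objectName" or "globalAutoAttributes") are left alone.
DOCUMENTED_DEFAULTS = {
    "loadToMemory": False,
    "uncompress": False,
    "dataLoadMethod": "LOAD_VIA_MAPPED_BYTE_BUFFERS",
    "allocateGlobalIntKeys": True,
    "autoGenerateAttributes": False,
}


def with_defaults(entry):
    """Return a copy of an object entry with the documented defaults filled in.

    Hypothetical helper for illustration only.
    """
    result = {**DOCUMENTED_DEFAULTS, **entry}
    if "children" in entry:
        # Children may contain the same entries as the top level.
        result["children"] = [with_defaults(child) for child in entry["children"]]
    return result


startup = json.loads("""
{
    "tableFileName": "patients.xtable",
    "loadToMemory": true,
    "children": [ { "tableFileName": "prescriptions.xtable" } ]
}
""")

explicit = with_defaults(startup)
print(json.dumps(explicit, indent=8))
```

Values given in the file always take precedence over the defaults, so the root object above keeps "loadToMemory": true while its child falls back to false.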

globalAttributes and sessionAttributes section

The entries "globalAttributes" and "sessionAttributes" define the attributes to be attached to the dimensions of the respective object. "globalAttributes" describes attributes that are shared by all users (more precisely, all sessions), whereas "sessionAttributes" describes attributes that are private to each session (i.e., every session that defines such an attribute gets its own copy). Both entries take a map from dimension names to attribute definitions, where an attribute definition is one of the following:

  • "autoAttribute",

  • "load" / "loadAttributes",

  • "timeAttribute" / "timeAttributes",

  • "ageAttribute" / "ageAttributes",

  • "intervalHierarchyAttribute" / "intervalHierarchyAttributes", and

  • "stringHierarchyAttribute".

(Note that the plural forms are arrays for multiple attributes.)

{
        "globalAttributes": {
                "DOB": {
                        "timeAttribute": {
                                "name": "Month of birth",
                                "from": "1920-01",
                                "levels": ["years", "months"]
                        }
                },
                "Sex": { "autoAttribute": "Sex" }
        },


        "sessionAttributes": {
                "Year of birth": {
                        "ageAttribute": {
                                "name": "Age group",
                                "bins": [
                                        ["0", "10", "20", "30", "40", "50", "60", "70", "80", "90", "100"],
                                        [  "0",  "2",  "4",  "6",  "8", "10", "12", "14", "16", "18",
                                          "20", "22", "24", "26", "28", "30", "32", "34", "36", "38",
                                          "40", "42", "44", "46", "48", "50", "52", "54", "56", "58",
                                          "60", "62", "64", "66", "68", "70", "72", "74", "76", "78",
                                          "80", "82", "84", "86", "88", "90", "92", "94", "96", "98",
                                         "100"
                                        ]
                                ],
                                "intervalNameScheme": "FROM_TO"
                        }
                }
        }
}

The autoAttribute entry

The "autoAttribute" entry is the simplest way to define an attribute for a dimension. The system constructs a default attribute according to the data type of the dimension. Hence, the only arguments that have to be provided are the name of the dimension and the name of the attribute to be constructed.

The load or loadAttributes entry

The "load" or "loadAttributes" entry provides an option to load attributes from files. Besides the name of the respective dimension, it is necessary to provide the filenames as arguments, and optionally the name of the attribute (under which it will be available in the object tree):

{
        "Product": {
                "load" : [
                        {
                                "fileName" : "Manufacturer.xattribute",
                                "attributeName" : "Manufactured by"
                        },
                        {
                                "fileName" : "Origin.xattribute",
                                "attributeName" : "Built in"
                        }
                ]
        }
}

"loadAttributes" is an older form that does not allow you to specify the attribute name:

{
        "Product": {
                "loadAttributes": [
                        "classification.xattribute",
                        "substance.xattribute",
                        "producer.xattribute"
                ]
        }
}

In this case, the attribute name is the default name stored with the persisted attribute. "load" and "loadAttributes" may be used in combination.
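Since "loadAttributes" carries file names only, each of its items corresponds to a "load" item without an "attributeName". The following Python sketch merges both forms into the newer structure; the helper `to_load_entries` is hypothetical, only restructures the configuration, and does not read any *.xattribute file.

```python
def to_load_entries(dimension_entry):
    """Merge "loadAttributes" file names into "load"-style entries.

    Hypothetical helper for illustration; not part of XOE.
    """
    entries = list(dimension_entry.get("load", []))
    for file_name in dimension_entry.get("loadAttributes", []):
        # No "attributeName" is set, so the name persisted in the file applies.
        entries.append({"fileName": file_name})
    return entries


product = {
    "load": [
        {"fileName": "Manufacturer.xattribute", "attributeName": "Manufactured by"}
    ],
    "loadAttributes": ["classification.xattribute", "substance.xattribute"],
}

merged = to_load_entries(product)
for entry in merged:
    print(entry["fileName"], "->", entry.get("attributeName", "(default name from file)"))
```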

The timeAttribute(s) entry

The "timeAttribute" entry constructs a whole time hierarchy on top of a dimension holding the time information. In addition to the name of the dimension, it is possible to apply the following parameters:

  • "name": <String> The name of the new attribute.

  • "from": <String> A string that specifies the beginning of the time period for the hierarchy.

  • "to": <String> A string that specifies the end of the time period for the hierarchy. If this entry is not provided, the month of the current date is assumed.

  • "recentMonth": <Integer> An integer that specifies the number of recent months (calculated backwards in time from the current date) to determine the beginning of the time period for the hierarchy. This property can be used only if there is no "from" entry (otherwise it will be ignored).

  • "levels": <String[]> An array of strings containing a subset of the keywords "centuries", "decades", "years", "halfyears", "quarters", "months", "weeks", "days", and "hours". At the moment "minutes", "seconds" and smaller units are not yet supported. The unit keywords are not case-sensitive. It is also possible to use the singular forms ("year", "month", and so on).

  • "timeZone": <String> The time zone to be used for the attribute (for conversion between epoch milliseconds and the date/time strings). If no time zone is given, the system default will be used.

  • "baseTimeUnit": <XTimeUnit> A time unit that refers to the underlying measure. At the moment, this is restricted to "MILLISECOND", "DAY", and "MONTH". If no base time unit is given, the default "MILLISECOND" is assumed. (The intention is that space may be saved if we count epoch days instead of epoch milliseconds in the case where we only have to store dates instead of whole timestamps. In this case, the base number class can be changed from "Long" to "Integer".)

{
        "Date": {
                "timeAttribute": {
                        "name": "Month",
                        "from": "2000-01",
                        "levels": ["years", "quarters", "months"],
                        "timeZone": "Europe/Berlin"
                }
        }
}

The plural form "timeAttributes" is expected to be an array of entries of the singular form above.
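The rules for the "levels" keywords (a fixed set of units, not case-sensitive, singular forms accepted) can be checked before a file is deployed. The following Python sketch normalizes a "levels" array to the plural keywords. The function name is hypothetical, and the singular spellings other than "year" and "month" are assumptions derived from the plural keywords.

```python
# Supported level keywords in plural form, as listed above.
SUPPORTED_LEVELS = [
    "centuries", "decades", "years", "halfyears", "quarters",
    "months", "weeks", "days", "hours",
]

# Assumption: singular spellings are derived from the plural keywords.
SINGULAR_TO_PLURAL = {
    "century": "centuries", "decade": "decades", "year": "years",
    "halfyear": "halfyears", "quarter": "quarters", "month": "months",
    "week": "weeks", "day": "days", "hour": "hours",
}


def normalize_levels(levels):
    """Return the levels as lower-case plural keywords or raise ValueError."""
    normalized = []
    for level in levels:
        key = level.lower()                      # keywords are not case-sensitive
        key = SINGULAR_TO_PLURAL.get(key, key)   # accept singular forms
        if key not in SUPPORTED_LEVELS:
            raise ValueError(f"unsupported time level: {level!r}")
        normalized.append(key)
    return normalized


print(normalize_levels(["Years", "quarter", "MONTHS"]))
```

Unsupported units such as "minutes" raise a ValueError in this sketch.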

The ageAttribute(s) entry

The "ageAttribute" entry constructs a hierarchy on top of a dimension holding year-of-birth or date-of-birth information. In addition to the name of the dimension, it is possible to apply the following parameters:

  • "name": <String> The name of the new attribute

  • "bins": <Integer[][]> The boundaries of the age intervals (one array for each level of the hierarchy). Those boundaries must be consistent in the sense that they must be sorted in increasing order on each level and an interval on a lower level must be contained completely within an interval of the levels above (i.e., the partition of each level is a refinement of the partition on the level above).

  • "intervalNameScheme": <String> One of the keywords "HALFOPEN_INTERVALS" and "FROM_TO". If this option is omitted, the default "HALFOPEN_INTERVALS" is used. This corresponds to interval names that are constructed as half-open intervals using square brackets. Here [a,b[ means “a is included, b is not included in the interval”. And ]a,b] means "a is excluded, b is included". The option "FROM_TO" creates the interval names as a-b instead of [a,b[ or ]a,b].

  • "dateBase": <String> One of the keywords "YEAR" and "DATE_IN_MS". If this option is omitted, the default "YEAR" is used. If "dateBase" is set to "YEAR", then it is expected that the underlying dimension provides the year-of-birth as an int value. In case of "DATE_IN_MS", it is expected that the underlying dimension provides the date-of-birth as a long value representing the number of milliseconds after January 1st, 1970 0:00h (GMT/UTC).

{
        "Year of birth": {
                "ageAttribute": {
                        "name": "Age group",
                        "bins": [
                                ["0", "10", "20", "30", "40", "50", "60", "70", "80", "90", "100"],
                                [  "0",  "2",  "4",  "6",  "8", "10", "12", "14", "16", "18",
                                  "20", "22", "24", "26", "28", "30", "32", "34", "36", "38",
                                  "40", "42", "44", "46", "48", "50", "52", "54", "56", "58",
                                  "60", "62", "64", "66", "68", "70", "72", "74", "76", "78",
                                  "80", "82", "84", "86", "88", "90", "92", "94", "96", "98",
                                 "100"
                                ]
                        ],
                        "intervalNameScheme": "FROM_TO",
                        "dateBase": "DATE_IN_MS"
                }
        }
}

The plural form "ageAttributes" is expected to be an array of entries of the singular form above.
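The consistency rules for "bins" (increasing order on each level, each lower-level interval contained in an interval of the level above) can be verified with a few lines of code. The following Python sketch is one possible check; the function name is hypothetical, and it assumes that the first array is the coarsest level, as in the examples.

```python
def check_bins(bins):
    """Raise ValueError if the bin boundaries violate the rules described above.

    `bins` holds one list of boundaries per level. Assumption: the coarsest
    level comes first, as in the examples of this section.
    """
    for depth, level in enumerate(bins):
        if any(a >= b for a, b in zip(level, level[1:])):
            raise ValueError(f"level {depth} is not sorted in increasing order")

    for depth in range(1, len(bins)):
        upper = list(zip(bins[depth - 1], bins[depth - 1][1:]))
        lower = list(zip(bins[depth], bins[depth][1:]))
        for low, high in lower:
            # A lower-level interval must lie inside one upper-level interval.
            if not any(c <= low and high <= d for c, d in upper):
                raise ValueError(
                    f"interval {low}-{high} on level {depth} is not contained "
                    f"in any interval of level {depth - 1}"
                )


# Same boundaries as in the example above, written as integers.
age_bins = [list(range(0, 101, 10)), list(range(0, 101, 2))]
check_bins(age_bins)
print("bins are consistent")
```

A second level such as [0, 4, 8, 12, 20] below [0, 10, 20] would be rejected, because the interval 8-12 crosses the boundary 10.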

The intervalHierarchyAttribute(s) entry

The "intervalHierarchyAttribute" entry constructs an interval-based hierarchy on top of a dimension whose values are comparable with each other. In addition to the name of the dimension, it is possible to apply the following parameters:

  • "name": <String> The name of the new attribute

  • "levelNames": <String[]> The names to be used for the levels of the state hierarchy

  • "bins": <Number[][]> The boundaries of the intervals (one array for each level of the hierarchy). Those boundaries must be consistent in the sense that they must be sorted in increasing order on each level and an interval on a lower level must be contained completely within an interval of the levels above (i.e., the partition of each level is a refinement of the partition on the level above).

  • "binNames": <String[][]> An array of arrays containing the state names for the different intervals. These identifiers must be unique within the whole state hierarchy of this attribute.

  • "binDisplayNames": <String[][]> An array of arrays containing the state names to be displayed in the frontend. In contrast to the names/identifiers in "binNames", these names do not have to be unique.

  • "upperBinBoundaryIncluded": <Boolean> This flag decides whether the intervals shall include the lower boundary and exclude the upper boundary (false) or the other way round (true). In mathematical terms, the intervals are either left-closed and right-open ([a,b[) in the case of false or left-open and right-closed (]a,b]) in the case of true. If this option is omitted then the false setting is assumed.

  • "intervalNameScheme": <String> One of the keywords "HALFOPEN_INTERVALS", "FROM_TO", "FROM_TO_PLUS", or "ONLY_FROM". If this option is omitted, the default "HALFOPEN_INTERVALS" is used. This corresponds to interval names that are constructed as half-open intervals using square brackets. Here [a,b[ means "a is included, b is not included in the interval". And ]a,b] means "a is excluded, b is included". The option "FROM_TO" can be applied to dimensions representing discrete values where the interval names are created as a-b instead of [a,b[ or ]a,b].

  • "type": <String> This is a deprecated entry. It should be left out. (It declares the base type of the states by one of the keywords "Long", "Integer", "Short", "Byte", "Double", "Float". If this entry is omitted, then the type is determined automatically based on type of the dimension.)

{
        "Units Dim": {
                "intervalHierarchyAttribute": {
                        "name": "Units Attr",
                        "levelNames": ["20-level", "10-level", "2-level"],
                        "bins": [
                                ["0", "20", "40", "60", "80", "100"],
                                ["0", "10", "20", "30", "40", "50", "60", "70", "80", "90", "100"],
                                [  "0",  "2",  "4",  "6",  "8",
                                  "10", "12", "14", "16", "18",
                                  "20", "22", "24", "26", "28",
                                  "30", "32", "34", "36", "38",
                                  "40", "42", "44", "46", "48",
                                  "50", "52", "54", "56", "58",
                                  "60", "62", "64", "66", "68",
                                  "70", "72", "74", "76", "78",
                                  "80", "82", "84", "86", "88",
                                  "90", "92", "94", "96", "98",
                                 "100"
                                ]
                        ],
                        "intervalNameScheme": "FROM_TO"
                }
        }
}

The plural form "intervalHierarchyAttributes" is expected to be an array of entries of the singular form above.
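To illustrate how "intervalNameScheme" and "upperBinBoundaryIncluded" interact, the following Python sketch derives interval names from one level of boundaries. It is an illustration, not the XOE implementation: the function name is hypothetical, the exact spacing of the generated names is an assumption, and "FROM_TO_PLUS" and "ONLY_FROM" are left out because their format is not described for this entry.

```python
def interval_names(boundaries, scheme="HALFOPEN_INTERVALS", upper_included=False):
    """Build one name per interval between consecutive boundaries."""
    names = []
    for low, high in zip(boundaries, boundaries[1:]):
        if scheme == "HALFOPEN_INTERVALS":
            # ]a,b] if the upper boundary is included, otherwise [a,b[
            names.append(f"]{low},{high}]" if upper_included else f"[{low},{high}[")
        elif scheme == "FROM_TO":
            names.append(f"{low}-{high}")
        else:
            raise NotImplementedError(f"scheme {scheme!r} is not covered by this sketch")
    return names


level = [0, 20, 40, 60]
print(interval_names(level))                        # ['[0,20[', '[20,40[', '[40,60[']
print(interval_names(level, upper_included=True))   # [']0,20]', ']20,40]', ']40,60]']
print(interval_names(level, scheme="FROM_TO"))      # ['0-20', '20-40', '40-60']
```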

The stringHierarchyAttribute entry (experimental)

The "stringHierarchyAttribute" entry constructs an interval-based hierarchy on top of a dimension whose values are strings. In addition to the name of the dimension, it is possible to apply the following parameters:

  • "name": <String> The name of the new attribute

  • "levelNames": <String[]> The names to be used for the levels of the state hierarchy

  • "bins": <String[][]> The boundaries of the intervals (one array for each level of the hierarchy). Those boundaries must be consistent in the sense that they must be sorted in increasing lexicographical order on each level and an interval on a lower level must be contained completely within an interval of the levels above (i.e., the partition of each level is a refinement of the partition on the level above).

  • "binNames": <String[][]> An array of arrays containing the state names for the different intervals. These identifiers must be unique within the whole state hierarchy of this attribute.

  • "binDisplayNames": <String[][]> An array of arrays containing the state names to be displayed in the frontend. In contrast to the names/identifiers in "binNames", these names do not have to be unique.

  • "upperBinBoundaryIncluded": <Boolean> This flag decides whether the intervals shall include the lower boundary and exclude the upper boundary (false) or the other way round (true). In mathematical terms, the intervals are either left-closed and right-open ([a,b[) in the case of false or left-open and right-closed (]a,b]) in the case of true. Due to the nature of how strings are normally treated in dictionaries and similar catalogs, this parameter must almost always be set to false. Therefore this is an optional parameter and the default is false.

  • "intervalNameScheme": <String> One of the keywords "HALFOPEN_INTERVALS", "FROM_TO", "FROM_TO_PLUS", or "ONLY_FROM". If this option is omitted, the default "ONLY_FROM" is used. This corresponds to interval names that are constructed just from the start string of each interval. The alternative option "HALFOPEN_INTERVALS" would correspond to interval names that are constructed as half-open intervals using square brackets. Here [a,b[ means "a is included, b is not included in the interval". And ]a,b] means "a is excluded, b is included". For the option "FROM_TO", the interval names are created as a-b instead of [a,b[ or ]a,b].
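As a rough illustration of how these parameters might fit together, the following Python sketch builds a "stringHierarchyAttribute" definition and prints it as JSON. The dimension name, attribute name, level names, and boundaries are hypothetical. The sketch assumes that the boundaries are given as strings, since they are compared lexicographically, and it does not show how values outside the outermost boundaries are treated.

```python
import json

# Hypothetical example: all names and boundaries are made up for illustration.
# Only the keys are taken from the parameter list above.
string_hierarchy = {
    "Last name": {
        "stringHierarchyAttribute": {
            "name": "Name range",
            "levelNames": ["Letter group", "Letter range"],
            "bins": [
                ["A", "H", "O", "V"],
                ["A", "D", "H", "K", "O", "R", "V"],
            ],
            "upperBinBoundaryIncluded": False,   # the documented default
            "intervalNameScheme": "ONLY_FROM",   # the documented default
        }
    }
}

definition = string_hierarchy["Last name"]["stringHierarchyAttribute"]

# Each level must be sorted in increasing lexicographical order.
levels_sorted = all(level == sorted(level) for level in definition["bins"])
print("levels sorted:", levels_sorted)

print(json.dumps(string_hierarchy, indent=8))
```

Every boundary of the first level reappears on the second level, so each finer interval lies within exactly one coarser interval.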

children section

An object model consists of a root object (e.g., the patients) and its children (e.g., the treatments and the prescriptions). In more complex applications, the children themselves might have children again. In principle, this could be applied recursively up to an arbitrary depth of the object hierarchy. To specify the information about the descendants of an object, the children entry can be filled with an array of object definitions each of which may contain the same entries as the root/parent object. Of course, there is one obvious restriction: it is not allowed to have two children with the same name. (Actually, the system is even more restrictive than necessary: for technical reasons, object names must be unique within the whole object hierarchy.)

{
        "children": [
                {
                        "tableFileName": "prescriptions.xtable",
                        "uncompress": false,
                        "loadToMemory": false,
                        "objectName": "Prescriptions",
                        "globalAttributes": {
                                "Product": {
                                        "loadAttributes": [
                                                "classification.xattribute",
                                                "substance.xattribute",
                                                "producer.xattribute"
                                        ]
                                },
                                "Specialist": {
                                        "loadAttributes": [ "specialist.xattribute" ]
                                }
                        },
                        "sessionAttributes": {
                                "Date": {
                                        "timeAttribute": {
                                                "name": "Month",
                                                "from": "2000-01",
                                                "levels": ["years", "quarters", "months"]
                                        }
                                }
                        },
                        "renamings": {  }
                },
                {
                        "tableFileName": "treatments.xtable",
                        "uncompress": true,
                        "loadToMemory": true,
                        "objectName": "Treatments",
                        "globalAttributes": {
                                "Procedure": {
                                        "loadAttributes": [ "proc-classification.xattribute" ]
                                },
                                "Specialist": {
                                        "loadAttributes": [ "specialist.xattribute" ]
                                }
                        },
                        "sessionAttributes": {
                                "Date": {
                                        "timeAttribute": {
                                                "name": "Month",
                                                "from": "2000-01",
                                                "levels": ["years", "months"]
                                        }
                                }
                        },
                        "renamings": {  }
                }
        ]
}
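Because object names must be unique within the whole object hierarchy, it can be helpful to scan a configuration for duplicates before startup. The following Python sketch walks the "children" entries recursively; the function names are hypothetical. It can only see names given via "objectName", so objects that take their name from the *.xtable file are not covered.

```python
from collections import Counter


def collect_object_names(entry):
    """Yield the "objectName" of an entry and of all its descendants."""
    if "objectName" in entry:
        yield entry["objectName"]
    for child in entry.get("children", []):
        yield from collect_object_names(child)


def duplicate_object_names(root):
    """Return the sorted list of object names used more than once."""
    counts = Counter(collect_object_names(root))
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical configuration with a name clash between a child and a grandchild.
startup = {
    "tableFileName": "patients.xtable",
    "objectName": "Patients",
    "children": [
        {"tableFileName": "prescriptions.xtable", "objectName": "Prescriptions"},
        {
            "tableFileName": "treatments.xtable",
            "objectName": "Treatments",
            "children": [
                {"tableFileName": "prescriptions-sample.xtable",
                 "objectName": "Prescriptions"}
            ],
        },
    ],
}

print(duplicate_object_names(startup))  # ['Prescriptions']
```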

renamings section

The entry "renamings" allows you to redefine the names of the states contained in an attribute hierarchy. It must contain a map from dimension names to another map, which in turn maps attribute names to the names of the Excel files in which the renamings for the respective attribute are stored.

{
        "renamings": {
                "Dimension A": {
                        "Attribute A1": "Renaming-A1.xlsx",
                        "Attribute A2": "Renaming-A2.xlsx"
                },
                "Dimension B": { "Attribute B1": "Renaming-B1.xlsx" }
        }
}

clientInfos section (optional)

The clientInfos section enables you to define formatting rules for the displayed data and to name an analysis that is executed automatically on startup. The following excerpt shows a valid clientInfos section.

{
        "clientInfos": {

                "defaultFormatting": {
                        "comma": ",",
                        "thousandSep": ".",
                        "prefix": null,
                        "postfix": "",
                        "decimalPlaces": 2
                },

                "attributeFormattings": [
                        {
                                "columns": [
                                        "SUM(Turnover)",
                                        "AVG(Turnover)"
                                ],
                                "formatOptions": {
                                        "comma": ",",
                                        "thousandSep": ".",
                                        "prefix": null,
                                        "postfix": " &euro;",
                                        "decimalPlaces": 2
                                }
                        }
                ],

                "autostartAnalysis": "mystart.xanalysis"
        }
}

The clientInfos section consists of the following sub-objects:

defaultFormatting (optional)

The defaultFormatting object enables you to define how data in a window within the XObjectExplorer is being formatted. If you do not specify this section, the following default values will be used:

{
        "defaultFormatting": {
                "comma": ",",
                "thousandSep": ".",
                "prefix": null,
                "postfix": "",
                "decimalPlaces": 2
        }
}

The individual options are:

  • "comma": The character used as the decimal separator.

  • "thousandSep": The character used as the thousands separator.

  • "prefix": The prefix placed before depicted values.

  • "postfix": The postfix placed after depicted values.

  • "decimalPlaces": The number of decimal places shown if the depicted number is not an integer.

With the default settings, numbers are depicted as follows:

  Number       Depicted as
  ---------    -----------
  123          123
  1234         1.234
  1234,5678    1.234,56
  1234,5       1.234,50
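The table above can be reproduced with a small formatting routine. The following Python sketch is an illustration, not the XOE implementation: the function name is hypothetical, and the decision to truncate rather than round surplus decimal places is inferred solely from the row that depicts 1234,5678 as 1.234,56.

```python
from decimal import Decimal, ROUND_DOWN


def format_value(value, comma=",", thousand_sep=".", prefix=None,
                 postfix="", decimal_places=2):
    """Format a number according to the defaultFormatting options."""
    number = Decimal(str(value))
    if number == number.to_integral_value():
        # Integers are shown without decimal places.
        text = f"{int(number):,}"
    else:
        # Assumption: surplus digits are cut off, as the table above suggests.
        step = Decimal(10) ** -decimal_places
        text = f"{number.quantize(step, rounding=ROUND_DOWN):,f}"

    # Python groups with "," and uses "." as decimal point; swap in the
    # configured characters after splitting the two parts.
    integer_part, _, fraction = text.partition(".")
    body = integer_part.replace(",", thousand_sep)
    if fraction:
        body += comma + fraction
    return f"{prefix or ''}{body}{postfix}"


for sample in ["123", "1234", "1234.5678", "1234.5"]:
    print(sample, "->", format_value(sample))
```

Passing postfix=" &euro;" to the same call yields the column-specific variant used in the attributeFormattings example.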

attributeFormattings (optional)

The formatting rules defined in the defaultFormatting section are applied to all depicted data. The attributeFormattings section enables you to define how specific columns are formatted instead. It is an array of formatting objects. A valid example looks like this:

{
        "attributeFormattings": [
                {
                        "columns": [
                                "SUM(Turnover)",
                                "AVG(Turnover)"
                        ],
                        "formatOptions": {
                                "comma": ",",
                                "thousandSep": ".",
                                "prefix": null,
                                "postfix": " &euro;",
                                "decimalPlaces": 2
                        }
                }
        ]
}

This example defines one formatting rule that applies to two columns: the options given in "formatOptions" are applied to every column named either SUM(Turnover) or AVG(Turnover).
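The relationship between column-specific rules and the default formatting can be pictured as a simple lookup. The following Python sketch shows one plausible resolution order; the function name is hypothetical, and both the exact-name match and the "first matching rule wins" behavior are assumptions rather than documented XOE behavior.

```python
# Built-in defaults as documented in the defaultFormatting section.
BUILT_IN_DEFAULTS = {
    "comma": ",", "thousandSep": ".", "prefix": None,
    "postfix": "", "decimalPlaces": 2,
}


def format_options_for(client_infos, column):
    """Return the format options that apply to the given column name."""
    for rule in client_infos.get("attributeFormattings", []):
        if column in rule["columns"]:        # assumption: exact name match
            return rule["formatOptions"]     # assumption: first match wins
    return client_infos.get("defaultFormatting", BUILT_IN_DEFAULTS)


client_infos = {
    "attributeFormattings": [
        {
            "columns": ["SUM(Turnover)", "AVG(Turnover)"],
            "formatOptions": {
                "comma": ",", "thousandSep": ".", "prefix": None,
                "postfix": " &euro;", "decimalPlaces": 2,
            },
        }
    ]
}

print(format_options_for(client_infos, "SUM(Turnover)")["postfix"])
print(repr(format_options_for(client_infos, "COUNT(Patients)")["postfix"]))
```

Here COUNT(Patients) is a made-up column name that matches no rule and therefore falls back to the defaults.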

autostartAnalysis

This optional parameter allows you to load and run an analysis immediately after loading this xstartup configuration.