INI file
Latest revision as of 02:27, 16 January 2014
Crawler's INI files are based on a loosely defined de-facto standard; more information can be found in the Wikipedia article on INI files (http://en.wikipedia.org/wiki/INI_file).
Basic properties
The Crawler INI files have the following properties:
- Section and entry names are case-insensitive by default (but Crawler has built-in support for case-sensitive INI files should the need arise).
- Comment lines are supported. Prefixing a line with a '#' or a ';' makes it a comment line. In-line comments are not supported: a line is either entirely a comment or not a comment at all, so a comment cannot follow data on the same line. For example:
    # This is a comment line
    entry = test # test
  means to set entry to "test # test". The trailing # test is not seen as a comment.
- Blank lines are allowed (and ignored)
- If an INI file has entries that are not preceded by a section line, then those entries are assumed to be in a default section [main]
- Duplicate entry names are allowed, and provide an 'override' mechanism. If an entry appears twice, the second appearance will 'win'.
- Entry values can be enclosed between double quotes (") in which case backslashes are used as an escape character as defined in JavaScript. If no double quotes are present, backslashes are not interpreted as escapes. When no double quotes are present, leading and trailing spaces are removed. The following entries are all equivalent:
    data = my data
    data = "my data"
    data=my data
    data="my\x20data"
    data="my\u0020data"
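The basic properties above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not Crawler's actual parser: the function names are hypothetical, only a subset of the JavaScript escapes is handled, leading whitespace before a comment marker is assumed to be allowed, and the '+=', '++' and '?selector' enhancements are not handled here.

```python
import re

# Subset of JavaScript escapes handled by this sketch (assumption: enough for illustration).
_SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "0": "\0"}
_ESCAPE = re.compile(r"\\(x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|.)")


def _unescape(match):
    body = match.group(1)
    if len(body) > 1:  # \xHH or \uHHHH
        return chr(int(body[1:], 16))
    # \" and \\ fall through to the bare character
    return _SIMPLE_ESCAPES.get(body, body)


def parse_value(raw):
    """Decode one entry value: escapes apply only inside double quotes."""
    text = raw.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return _ESCAPE.sub(_unescape, text[1:-1])
    return text  # unquoted: backslashes are kept literally


def parse_ini(text):
    """Parse INI text into {section: {entry: value}} with lower-cased names."""
    sections = {}
    current = "main"  # entries before any section line belong to [main]
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue  # blank line or comment line
        if stripped.startswith("[") and stripped.endswith("]"):
            current = stripped[1:-1].strip().lower()
            continue
        name, sep, raw = stripped.partition("=")
        if not sep:
            continue  # not an entry; ignored in this sketch
        # A later duplicate simply overwrites the earlier value.
        sections.setdefault(current, {})[name.strip().lower()] = parse_value(raw)
    return sections
```

Lower-casing the names is one simple way to get case-insensitive lookups; a case-sensitive variant would just skip the `lower()` calls.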
Enhancements
Crawler INI files have a few Crawler-specific enhancements.
Parent-child files
In a number of Crawler personalities, INI files are arranged in a parent-child relationship.
- It is possible to derive a new personality from an existing personality. This is achieved by adding a special section [parent] with a single entry path to the child INI. This entry has the path to the parent INI. The path can be absolute or relative. Relative paths are interpreted relative to the folder that contains the child INI. Forward slashes are allowed on Windows and are considered equivalent to backward slashes.
    [parent]

    #
    # Parent personality: This personality is the same as XHTML but with added/overridden stuff
    #

    path = "../XHTML/config.ini"
- Some personalities use a nested folder structure where INI files in the 'inner' folders implicitly use the INI files in the outer folders as parent files. When two INI files have a parent-child relation, the child file 'inherits' all the contents of the parent INI file. The child INI can then either:
  - override certain entries in the parent INI (by repeating the same entry name and section name, and providing a different value), or
  - perform string concatenation
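As a rough sketch of the parent-child rules above, the following Python resolves a parent path relative to the child INI's folder and layers the child's entries over the parent's. The function names and the dictionary layout are assumptions made for illustration; Crawler's own implementation is not shown in this article.

```python
import os


def resolve_parent_path(child_ini_path, parent_path):
    """Return the parent INI path named by the 'path' entry of [parent]."""
    if os.path.isabs(parent_path):
        return os.path.normpath(parent_path)
    # Relative paths are interpreted relative to the folder holding the child INI.
    child_dir = os.path.dirname(os.path.abspath(child_ini_path))
    # On Windows, normpath also turns forward slashes into backslashes.
    return os.path.normpath(os.path.join(child_dir, parent_path))


def inherit(parent_sections, child_sections):
    """Start from the parent's contents; entries repeated in the child win."""
    merged = {name: dict(entries) for name, entries in parent_sections.items()}
    for name, entries in child_sections.items():
        merged.setdefault(name, {}).update(entries)
    return merged
```

The merge copies the parent's sections first so that neither input dictionary is modified.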
String concatenation
When an entry occurs multiple times in the same INI file or in a parent-child INI file arrangement, '+=' can be used instead of '=' to append to the existing value rather than override it. The values are joined into a comma-separated string.
    dataEntry = "some data"
    ...
    dataEntry += "some more data, some more more data"
This will set the entry dataEntry to "some data, some more data, some more more data". Commas are inserted between the concatenated values.
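The concatenation rule can be sketched as follows. The function name is hypothetical, quote handling is reduced to stripping the surrounding double quotes, and the behaviour of '+=' on an entry that has no earlier value is an assumption (here it simply sets the value).

```python
def apply_entries(lines, inherited=None):
    """Apply '=' (override) and '+=' (comma-separated append) lines in order."""
    entries = dict(inherited or {})  # e.g. values inherited from a parent INI
    for line in lines:
        name, sep, raw = line.partition("=")
        if not sep:
            continue  # not an entry line
        name = name.strip()
        value = raw.strip().strip('"')
        if name.endswith("+"):
            name = name[:-1].strip()
            if entries.get(name):
                entries[name] = entries[name] + ", " + value
            else:
                entries[name] = value  # assumption: nothing to append to yet
        else:
            entries[name] = value
    return entries
```

Passing the parent's values as `inherited` models the parent-child case, where the child appends to a value it did not define itself.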
Auto-increment
INI files are not well suited for managing tabular, repetitive data.
The Crawler INI file format offers an enhancement to make repetitive (record-based) data entry easier. A line with just a ++ is interpreted as 'the first/next record follows'. The advantage is that it becomes easy to reorder complete data records in the INI file without needing to manually renumber individual entries.
The following two sections are equivalent:
    [tableData]
    name1=Kris
    hours1=120
    name2=John
    hours2=112
    extras2=12
    name3=Will
    hours3=99

    [tableData]
    ++
    name=Kris
    hours=120
    ++
    name=John
    hours=112
    extras=12
    ++
    name=Will
    hours=99
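One way to picture the auto-increment rule is as a rewrite from the '++' form to the numbered form. The sketch below does exactly that on a list of lines; the function name is hypothetical, and restarting the record counter at each section line is an assumption not stated in the text.

```python
def expand_records(lines):
    """Rewrite '++' record markers into explicitly numbered entry names."""
    expanded = []
    record = 0  # 0 means no '++' has been seen in the current section
    for line in lines:
        stripped = line.strip()
        if stripped == "++":
            record += 1  # the first/next record follows
            continue
        if stripped.startswith("[") and stripped.endswith("]"):
            record = 0  # assumption: numbering restarts per section
            expanded.append(stripped)
            continue
        name, sep, value = stripped.partition("=")
        if sep and record > 0:
            expanded.append(f"{name.strip()}{record}={value.strip()}")
        else:
            expanded.append(stripped)
    return expanded
```

Because the numbers are derived from the position of each '++', reordering whole records in the file renumbers them automatically.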
Conditional entries
Crawler-based INI files support conditional entries. This is done by means of a special, predefined section called '[conditionals]'.
In the conditionals section, there is a single predefined entry selectors. This entry is a list of comma-separated strings.
Each of these strings is called a selector. Their presence or absence drives the conditional entries.
Conditional entries have an entry name, followed by a question mark and a selector. These are only taken into account if the selector is present.
    [conditionals]
    selectors = xhtml, flow

    [main]
    personalityConfig= "./Personalities/default.ini"
    personalityConfig?xhtml = "./Personalities/XHTML/config.ini"
    personalityConfig?text = "./Personalities/Text/config.ini"
    personalityConfig?hyperlinks = "./Personalities/Hyperlinks/config.ini"
This example sets the selectors entry to two separate selectors: xhtml and flow. The personalityConfig entry will then be set to "./Personalities/XHTML/config.ini" because the selectors text and hyperlinks are not set, so the entry with the selector xhtml 'wins'. It will override the initial 'default' entry for personalityConfig.
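The selection logic in this example can be sketched as below. The function name and the (name, value) pair layout are hypothetical, and comparing selectors case-insensitively is an assumption based on names being case-insensitive by default.

```python
def resolve_conditionals(entries, selectors):
    """Resolve (name, value) pairs in file order against a selectors string."""
    active = {s.strip().lower() for s in selectors.split(",") if s.strip()}
    resolved = {}
    for name, value in entries:
        base, sep, selector = name.partition("?")
        if sep and selector.strip().lower() not in active:
            continue  # conditional entry whose selector is not present
        resolved[base.strip()] = value  # later matching entries override earlier ones
    return resolved


entries = [
    ("personalityConfig", "./Personalities/default.ini"),
    ("personalityConfig?xhtml", "./Personalities/XHTML/config.ini"),
    ("personalityConfig?text", "./Personalities/Text/config.ini"),
    ("personalityConfig?hyperlinks", "./Personalities/Hyperlinks/config.ini"),
]
config = resolve_conditionals(entries, "xhtml, flow")
```

With the selectors xhtml and flow, only the unconditional entry and the ?xhtml entry are applied, so the ?xhtml value ends up overriding the default.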
Pre-defined Selectors
There are a number of system-defined selectors. On Mac systems, the selector Mac is defined. On Windows systems, the selector Win is defined.
This allows expressions like:
    ...
    FILEPATH?Mac = ~/Desktop/output.txt
    FILEPATH?Win = C:\tmp\output.txt
    ...
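How Crawler itself detects the platform is not described here. As a hedged illustration only, a host program could derive these two selectors from Python's `sys.platform`; the function name is hypothetical, and the mapping of platform strings to Mac and Win is an assumption.

```python
import sys


def system_selectors(platform=None):
    """Return the pre-defined selectors for a platform string such as sys.platform."""
    platform = sys.platform if platform is None else platform
    if platform == "darwin":  # macOS
        return {"Mac"}
    if platform.startswith("win"):  # e.g. "win32"
        return {"Win"}
    return set()  # no system selector is assumed for other platforms
```

The resulting set would be combined with the selectors listed in the [conditionals] section before conditional entries are resolved.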