Difference between revisions of "Processor"
Latest revision as of 19:52, 29 December 2013
A processor is an atomic adapter which can substitute certain granules with different granules.
Inside the processor, programming logic selects particular types of input granule and returns a different granule in place of each one.
The original granules are normally dropped from the data flow and replaced by the newly created 'processed' granules.
In the Crawler system, once created, a granule is never modified. A granule is a 'constant' data entity.
So a processor cannot change a granule in place; it can only substitute newly created granules for some of the input granules.
An example: a processor could be set up to substitute every word granule with an all-uppercase word granule.
Consider the following data flow which originated somewhere up-flow. This is the input to the example processor:
Word: This
Word: is
Word: a
Word: paragraph
Para: This is a paragraph
Word: This
Word: is
Word: another
Word: paragraph
Para: This is another paragraph
TextFrame: pos (10, 20), width 20, height 80
The output could look like this:
Word: THIS
Word: IS
Word: A
Word: PARAGRAPH
Para: This is a paragraph
Word: THIS
Word: IS
Word: ANOTHER
Word: PARAGRAPH
Para: This is another paragraph
TextFrame: pos (10, 20), width 20, height 80
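The uppercase example can be sketched in a few lines of Python. This is only an illustration and not the Crawler system's actual API: the `Granule` class, its `kind` and `value` fields, and the `uppercase_words` function are all hypothetical names chosen for this sketch. It assumes a granule can be modelled as an immutable record and a data flow as a sequence of such records.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)  # frozen: once created, a granule cannot be modified
class Granule:
    kind: str   # hypothetical type tag, e.g. "Word", "Para", "TextFrame"
    value: str  # hypothetical payload


def uppercase_words(flow: Iterable[Granule]) -> Iterator[Granule]:
    """Substitute each Word granule with a new all-uppercase Word granule."""
    for granule in flow:
        if granule.kind == "Word":
            # The original is dropped; a newly created granule takes its place.
            yield Granule("Word", granule.value.upper())
        else:
            # Granules the processor does not select pass through unchanged.
            yield granule


flow = [
    Granule("Word", "This"),
    Granule("Word", "is"),
    Granule("Word", "a"),
    Granule("Word", "paragraph"),
    Granule("Para", "This is a paragraph"),
]

result = list(uppercase_words(flow))
for g in result:
    print(f"{g.kind}: {g.value}")
```

Because the sketch's `Granule` is frozen, the processor has no way to uppercase a word in place. It has to build a new granule, which mirrors the 'constant' data entity rule described above. The input list is left untouched, and the `Para` granule in the output is the very same object that came in.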