MixDEM components - Transformers

Transformers are graph components that transform data from one or more input ports to an output port. Most transformers use a transform function, a key, or some other method that defines the particular data transformation.


UNIQUE

Component type : UNIQUE.
This module removes items that contain duplicate strings. You select the element to filter on, and Unique removes the duplicates. For example, if the original feed has five items with the same title, you can configure Unique so that only one of these items is included in the output feed.

Attribute   Description                                          Default
id          component node identification
type        component type                                       UNIQUE
key         field whose duplicate values are removed
skiprows    specifies how many records/rows should be skipped    0
maxrows     specifies how many records/rows should be read

Example:
<Node id="Unique" type="UNIQUE" key="Description" />
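
The deduplication described above can be illustrated outside MixDEM with a minimal Python sketch. It assumes records are plain dictionaries and that the first record seen for each key value is the one kept; the text only says that one item survives, so which one is an assumption. The function name unique_by_key is hypothetical and is not part of MixDEM.

```python
def unique_by_key(records, key):
    """Keep one record per distinct value of `key`.

    Assumption: the first record seen for each value is the one kept.
    """
    seen = set()
    result = []
    for record in records:
        value = record.get(key)
        if value in seen:
            continue  # duplicate value for this key: drop the record
        seen.add(value)
        result.append(record)
    return result


# Hypothetical feed: two items share the same Description.
feed = [
    {"title": "A", "Description": "same"},
    {"title": "B", "Description": "same"},
    {"title": "C", "Description": "other"},
]
deduped = unique_by_key(feed, "Description")
```

With this input, deduped holds the items titled A and C; the second item with the repeated Description is dropped.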


SORT

Component type : SORT.
This module sorts a feed by any item element, such as title or description. You can sort items in either ascending or descending order.

Attribute   Description                                          Default
id          component node identification
type        component type                                       SORT
key         field on which the incoming records are sorted
order       sort order, ASC (ascending) or DESC (descending)     ASC
skiprows    specifies how many records/rows should be skipped    0
maxrows     specifies how many records/rows should be read

Example:
<Node id="Sort" type="SORT" key="Brand" order="DESC" />
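
As a rough illustration of the key and order attributes, here is a minimal Python sketch. It assumes records are dictionaries and that values of the key field compare with ordinary Python ordering; how MixDEM itself compares values (case, locale, numeric versus text) is not stated in this document. The name sort_records is hypothetical.

```python
def sort_records(records, key, order="ASC"):
    """Return a new list sorted by `key`; `order` is "ASC" (default) or "DESC"."""
    if order not in ("ASC", "DESC"):
        raise ValueError("order must be 'ASC' or 'DESC'")
    return sorted(records, key=lambda r: r[key], reverse=(order == "DESC"))


# Hypothetical feed, sorted the same way as the SORT example above.
feed = [{"Brand": "Mono"}, {"Brand": "Acme"}, {"Brand": "Zenith"}]
by_brand_desc = sort_records(feed, "Brand", order="DESC")
by_brand_asc = sort_records(feed, "Brand")
```

The input list is left untouched; each call returns a new sorted list.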


FILTER

Component type : FILTER.
The Filter module lets you include or exclude items from a feed. With Filter you create rules that compare feed elements to values you specify.

A single Filter module can contain multiple rules. You can choose whether those rules will Permit or Block items that match them. Finally, you can choose whether an item must match all the rules or only needs to match any one of them.

Rules can be stored in <Attrib> elements (node attributes).

Attribute   Description                      Default
id          component node identification
type        component type                   FILTER

<Attrib> XML attributes:
Attribute   Description              Default
name        expression name (id)
value       filter (rule) value

Example:
<Node id="FILTER Matching" type="FILTER">
    <Attrib name="exp_1"> @{category} == "Top Stories" || @{category} == "World" </Attrib> 
</Node>
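
The Permit/Block and all/any choices described above can be sketched in Python as follows. This is an illustration of the logic only: rules are modelled as plain Python predicates rather than MixDEM's @{field} expressions, and the names filter_records, mode and match are hypothetical, since this document does not show how those options are written in the XML.

```python
def filter_records(records, rules, mode="permit", match="any"):
    """Keep or drop records depending on whether they match the rules.

    mode:  "permit" keeps matching records, "block" drops them.
    match: "all" requires every rule to match, "any" requires at least one.
    """
    if mode not in ("permit", "block"):
        raise ValueError("mode must be 'permit' or 'block'")
    if match not in ("all", "any"):
        raise ValueError("match must be 'all' or 'any'")
    combine = all if match == "all" else any
    kept = []
    for record in records:
        matched = combine(rule(record) for rule in rules)
        # Permit keeps matches; Block keeps everything that does not match.
        if matched == (mode == "permit"):
            kept.append(record)
    return kept


# Two rules equivalent to the expression in the FILTER example above.
rules = [
    lambda r: r.get("category") == "Top Stories",
    lambda r: r.get("category") == "World",
]
feed = [{"category": "World"}, {"category": "Sports"}, {"category": "Top Stories"}]
permitted = filter_records(feed, rules, mode="permit", match="any")
blocked = filter_records(feed, rules, mode="block", match="any")
```

Here permitted contains the World and Top Stories items, while blocked contains only the Sports item.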


REGEX

Component type : REGEX.
The Regex module modifies fields in a data feed using regular expressions, a powerful type of pattern matching. Think of it as search-and-replace on steroids.

You can define multiple Regex rules. Each has the general format: "In [field] replace [regex pattern] with [text]". Entire books have been written about regular expressions, so we'll only discuss the basics here. You can read about them in more depth at http://en.wikipedia.org/wiki/Regular_expressions.

Rules can be stored in <Attrib> elements (node attributes).

Attribute   Description                                          Default
id          component node identification
type        component type                                       REGEX
skiprows    specifies how many records/rows should be skipped    0
maxrows     specifies how many records/rows should be read

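<Attrib> XML attributes:
Attribute     Description                              Default
name          rule name (id)
field         field the rule is applied to
pattern       regular expression pattern to match
replacement   text that replaces the matched pattern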

Example:
<Node id="Regex Pattern Matching" type="REGEX">
    <Attrib name="title 1"  field="title"   pattern="/^M/"        replacement=""    />
    <Attrib name="title 2"  field="title"   pattern="/(\d\.\d),/" replacement="$1 :" />
    <Attrib name="pubDate"  field="pubDate" pattern="/ GMT/"      replacement=""    />
</Node>
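
To make the "In [field] replace [regex pattern] with [text]" format concrete, the three rules above can be approximated in Python. This is a sketch under several assumptions: the surrounding slashes are treated as delimiters and stripped, $1 is translated to Python's group-reference syntax, and every match in a field is replaced (whether MixDEM replaces all matches or only the first is not stated here). The function names and the sample record are hypothetical.

```python
import re


def strip_delimiters(pattern):
    """Turn "/^M/" into "^M" (assumes slashes are only delimiters)."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        return pattern[1:-1]
    return pattern


def to_python_replacement(replacement):
    """Translate $1-style group references into Python's \\g<1> form."""
    escaped = replacement.replace("\\", "\\\\")  # keep literal backslashes literal
    return re.sub(r"\$(\d+)", r"\\g<\1>", escaped)


def apply_regex_rules(record, rules):
    """Apply (field, pattern, replacement) rules in order to a copy of record."""
    result = dict(record)
    for field, pattern, replacement in rules:
        if field in result:
            result[field] = re.sub(
                strip_delimiters(pattern),
                to_python_replacement(replacement),
                result[field],
            )
    return result


# The same three rules as in the REGEX example above.
rules = [
    ("title", r"/^M/", ""),
    ("title", r"/(\d\.\d),/", "$1 :"),
    ("pubDate", r"/ GMT/", ""),
]
record = {"title": "M 4.7, Offshore region", "pubDate": "Mon, 01 Jan 2024 10:00:00 GMT"}
cleaned = apply_regex_rules(record, rules)
```

Rules run in the order listed, so the second title rule sees the output of the first.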
