Using Advanced Triggers

The following section provides information on how to use advanced triggers to extract program names in CIMCO DNC-Max. The extraction of comments in NC-Base is done in the same way.

This section contains information only relevant to advanced users. If you do not need to use or understand advanced triggers, you can skip this section.

NC-Base advanced triggers are a modified version of what is known as Regular Expressions. Regular expressions are a powerful method for searching text strings.

The following example shows how advanced triggers can be used. Suppose you have an ISO NC program with line numbers in the format N2010 at the beginning of each line, but the post processor used to generate the file has inserted a number of comment lines at the beginning of the file without block numbers. If you want to make sure that these lines are not sent to the CNC machine, you should specify the following Start trigger:

^N[0-9]{1,4}

This trigger consists of the following elements:

^        The following trigger must be found at the beginning of a line
N        Look for the character N
[0-9]    Any character in the range from 0 to 9
{1,4}    Match 1 to 4 occurrences of the previous expression (a digit from 0 to 9)

This means: Start transfer from the first line that has N followed by 1 to 4 digits at the beginning of the line.
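
The effect of this Start trigger can be sketched outside DNC-Max. The snippet below is only an illustration: it uses Python's re module as a stand-in for the advanced trigger engine, and the function name strip_leading_comments, the sample program, and the choice to return an empty string when nothing matches are all assumptions made for the example.

```python
import re

# Python's re syntax stands in for the advanced trigger syntax here.
# re.MULTILINE makes '^' match at the beginning of every line.
START_TRIGGER = re.compile(r"^N[0-9]{1,4}", re.MULTILINE)

def strip_leading_comments(program: str) -> str:
    """Return the program text starting at the first line the trigger matches.

    Returning "" when no line matches is an assumption of this sketch.
    """
    match = START_TRIGGER.search(program)
    return program[match.start():] if match else ""

# Hypothetical NC program with two unnumbered comment lines at the top.
sample = "(POST HEADER)\n(TOOL LIST)\nN10 G21\nN20 G0 X0 Y0\n"
print(strip_leading_comments(sample))
```

With the sample above, the two comment lines are dropped and the output starts at the line N10 G21.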

A more advanced example is provided at the end of this section.

List of Symbols Recognized by Advanced Triggers

.    Match any single character
*    0 or more of previous expression
+    1 or more of previous expression
-    Range
^    Negate set (inside set delimiters [])
{    Start interval
}    End interval
[    Begin set
]    End set
?    Previous expression is optional
|    Previous expression OR next expression
^    Anchor to beginning of line
$    Anchor to end of line
(    Start of sub expression
)    End of sub expression
<    Start extraction
>    End extraction

To use a special symbol as part of the text to be found, precede it with a backslash character '\'.

Example: To find a '\' at the beginning of a line, specify ^\\.

Sets (Bounds)

Sets are specified with the '[' and ']' symbols.

Example: [abc] will find an occurrence of any one of the characters 'a', 'b' or 'c'.

You can negate a set by specifying '^' as the first character in the set.

Example: [^abc] will match any character that is not 'a', 'b' or 'c'.
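
As a rough illustration of sets and negated sets, the following sketch runs the two expressions above through Python's re module, which uses the same bracket syntax for these simple cases. The test string and variable names are made up for the example, and Python is only a stand-in for the trigger engine.

```python
import re

any_of_abc = re.compile(r"[abc]")     # one of 'a', 'b' or 'c'
none_of_abc = re.compile(r"[^abc]")   # any character except 'a', 'b' or 'c'

text = "xyzb"  # hypothetical input

first_in_set = any_of_abc.search(text)       # finds 'b' at index 3
first_not_in_set = none_of_abc.search(text)  # finds 'x' at index 0

print(first_in_set.group(), first_in_set.start())
print(first_not_in_set.group(), first_not_in_set.start())
```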

Ranges

Ranges are specified with the '-' symbol.

Example: [a-z][0-9] will find any character from 'a' to 'z' followed by any digit from '0' to '9'.
Example: [a-zA-Z0-9] will find any letter or digit.
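
The two range examples can be tried in the same way. This is a minimal sketch using Python's re module as a stand-in for the trigger engine; the input strings are invented for the illustration.

```python
import re

letter_then_digit = re.compile(r"[a-z][0-9]")   # lowercase letter followed by a digit
letter_or_digit = re.compile(r"[a-zA-Z0-9]")    # any single letter or digit

# 'G0' is skipped because 'G' is uppercase; the first hit is 'x5'.
found = letter_then_digit.search("G01 x5 Y10")

# Parentheses and the space are not letters or digits, so they are left out.
characters = letter_or_digit.findall("N10 (T1)")

print(found.group())
print(characters)
```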

Interval Expressions

Interval expressions are specified with the symbols '{' and '}'.

Example: [0-9]{1,4} will find 1-4 digits.
Example: [0-9]{3,} will find 3 or more digits.
Example: [0-9]{4} will find exactly 4 digits.
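
The sketch below applies the three interval expressions to a run of six digits, again using Python's re module as a stand-in for the trigger engine. One point it is meant to show: {4} matches 4 digits, but it does not by itself prevent further digits from following the match.

```python
import re

one_to_four = re.compile(r"[0-9]{1,4}")
three_or_more = re.compile(r"[0-9]{3,}")
exactly_four = re.compile(r"[0-9]{4}")

digits = "123456"  # hypothetical input

up_to_four = one_to_four.match(digits).group()    # takes at most 4 digits
all_digits = three_or_more.match(digits).group()  # no upper limit, takes all 6
too_short = three_or_more.match("12")             # fewer than 3 digits, no match
four = exactly_four.match(digits).group()         # 4 digits, '56' is left over

print(up_to_four, all_digits, too_short, four)
```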

Extracting Sub Expressions

To extract part of the expression, enclose the sub expression in '<' and '>'.

Example: To extract the program number 1234 from the string PRG=1234, specify PRG=<[0-9]{4}>.
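
Python's re module has no '<' and '>' extraction markers, so the sketch below translates them into an ordinary capturing group to show the idea. The translation, the function name extract_program_number, and the choice to return None when nothing matches are assumptions of this illustration, not part of the product.

```python
import re

# Trigger:           PRG=<[0-9]{4}>
# Python equivalent: the '<' ... '>' pair becomes a capturing group '(' ... ')'.
PROGRAM_NUMBER = re.compile(r"PRG=([0-9]{4})")

def extract_program_number(line):
    """Return the 4 digits following 'PRG=', or None if the line does not match."""
    match = PROGRAM_NUMBER.search(line)
    return match.group(1) if match else None

print(extract_program_number("PRG=1234"))
print(extract_program_number("PRG=12"))
```

Only the part inside the group is returned, so the text PRG= is used to locate the number but is not part of the extracted result.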

Regular Expressions

A regular expression (RE) is one or more non-empty branches separated by '|'. It matches anything that matches one of the branches.

A branch is one or more pieces concatenated. It matches a match for the first, followed by a match for the second, etc.

A piece is an atom possibly followed by a single '*', '+', '?' or bound. An atom followed by '*' matches a sequence of 0 or more matches of the atom. An atom followed by '+' matches a sequence of 1 or more matches of the atom. An atom followed by '?' matches a sequence of 0 or 1 matches of the atom.

A bound is '{' followed by an unsigned decimal integer, possibly followed by ',' possibly followed by another unsigned decimal integer, always followed by '}'. The integers must lie between 0 and 255 inclusive, and if there are two of them, the first may not exceed the second. An atom followed by a bound containing one integer i and no comma matches a sequence of exactly i matches of the atom. An atom followed by a bound containing one integer i and a comma matches a sequence of i or more matches of the atom. An atom followed by a bound containing two integers i and j matches a sequence of i through j (inclusive) matches of the atom.

An atom is a regular expression enclosed in '()' (matching a match for the regular expression), an empty set of '()' (matching the null string), a bracket expression (see below), '.' (matching any single character), '^' (matching the null string at the beginning of a line), '$' (matching the null string at the end of a line), a '\' followed by one of the characters ^.[$()|*+?{\ (matching that character taken as an ordinary character), a '\' followed by any other character (matching that character taken as an ordinary character, as if the '\' had not been present), or a single character with no other significance (matching that character). A '{' followed by a character other than a digit is an ordinary character, not the beginning of a bound. It is illegal to end a regular expression with '\'.

A bracket expression is a list of characters enclosed in '[]'. It normally matches any single character from the list (but see below). If the list begins with '^', it matches any single character (but see below) not from the rest of the list. If two characters in the list are separated by '-', this is shorthand for the full range of characters between those two (inclusive) in the collating sequence, e.g. '[0-9]' in ASCII matches any decimal digit. It is illegal for two ranges to share an endpoint, e.g. 'a-c-e'.

To include a literal ']' in the list, make it the first character (following a possible '^'). To include a literal '-', make it the first or last character, or the second endpoint of a range. To use a literal '-' as the first endpoint of a range, enclose it in '[.' and '.]' to make it a collating element (see below). With the exception of these and some combinations using '[' (see next paragraphs), all other special characters, including '\', lose their special significance within a bracket expression.

Within a bracket expression, a collating element (a character, a multi-character sequence that collates as if it were a single character, or a collating sequence name for either) enclosed in '[.' and '.]' stands for the sequence of characters of that collating element. The sequence is a single element of the bracket expression's list. A bracket expression containing a multi-character collating element can thus match more than one character, e.g. if the collating sequence includes a 'ch' collating element, then the regular expression '[[.ch.]]*c' matches the first five characters of 'chchcc'.

Within a bracket expression, a collating element enclosed in '[=' and '=]' is an equivalence class, standing for the sequences of characters of all collating elements equivalent to that one, including itself. (If there are no other equivalent collating elements, the treatment is as if the enclosing delimiters were '[.' and '.]'.) For example, if o and ô are the members of an equivalence class, then '[[=o=]]', '[[=ô=]]', and '[oô]' are all synonymous. An equivalence class may not be an endpoint of a range.

In the event that a regular expression could match more than one substring of a given string, the RE matches the one starting earliest in the string. If the RE could match more than one substring starting at that point, it matches the longest. Subexpressions also match the longest possible substrings, subject to the constraint that the whole match be as long as possible, with subexpressions starting earlier in the regular expression taking priority over ones starting later. Note that higher-level subexpressions thus take priority over their lower-level component subexpressions.

Match lengths are measured in characters, not collating elements. A null string is considered longer than no match at all. For example, 'bb*' matches the three middle characters of 'abbbc', '(wee|week)(knights|nights)' matches all ten characters of 'weeknights', when '(.*).*' is matched against 'abc' the parenthesized subexpression matches all three characters, and when '(a*)*' is matched against 'bc' both the whole RE and the parenthesized subexpression match the null string.
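
Some of the examples above can be checked with Python's re module, with one important caveat: Python is not the trigger engine and does not follow the longest-match rule described here. It takes the first alternative that succeeds, so the results agree for the first three expressions below but differ for the last one, which is included only to show the difference. The helper name matched is made up for this sketch.

```python
import re

def matched(pattern, text):
    """Return the text matched by the whole pattern, or None if there is no match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# These agree with the results described in the text.
middle = matched(r"bb*", "abbbc")                               # the three b's
whole = matched(r"(wee|week)(knights|nights)", "weeknights")    # all ten characters
inner = re.search(r"(.*).*", "abc").group(1)                    # the group takes 'abc'

# Here Python differs from the longest-match rule: it stops at the
# first alternative that matches instead of choosing the longer 'ab'.
first_alt = matched(r"a|ab", "ab")

print(middle, whole, inner, first_alt)
```

When testing a trigger outside the product, a result that depends on which alternative is chosen should therefore be verified in the product itself.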

Advanced Trigger Example

The advanced triggers can also be used to look for program numbers, path information etc. The following example can be used in the standard protocol auto receive setup to identify the program name for the received file.

Suppose the program name is stored in the NC program as O2123 (where 2123 is the program number), but we only want to look for the program number on a line if the previous line starts with the character %. In this case, we should specify the following trigger:

^%.*\LF.*O<[0-9]{4}>([^0-9]+|$)

This trigger consists of the following elements:

^        The following trigger must be found at the beginning of a line
%        Look for the character %
.        Match any character
*        Match zero or more of the previous character, in this case any character
\LF      Match a line feed
.        Match any character
*        Match zero or more of the previous character, in this case any character
O        Look for the character O
<        Start of program name
[0-9]    Any character in the range from 0 to 9
{4}      Match 4 of the previous character, in this case any character in the range from 0 to 9
>        End of program name
(        Start of sub-expression
[^0-9]   Any character outside the range from 0 to 9
+        Match one or more of the previous character, in this case, any character outside the range from 0 to 9
|        Match the expression to the left or to the right of the '|'
$        Must be at end of line
)        End of sub-expression

This is translated into something like this:

Look for a line that begins with %, then accept all characters until the end of the line. On the next line, accept all characters until O followed by 4 digits. These 4 digits must be followed by either one or more characters that are not digits, or must be at the end of the line.

The last part ensures that the program number has exactly 4 digits. A string such as O12345 is not matched, because the 4 digits after O are followed by another digit (5) rather than by a non-digit character or the end of the line.

The '<' and '>' are the delimiters for the part of the expression that should be extracted to get the program number. In this case, 4 digits.
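
The behaviour described in this example can be approximated in Python for testing purposes. The sketch below is a translation made under several assumptions: \LF is written as \n, the '<' and '>' extraction markers become a capturing group, and re.MULTILINE is used so that '^' and '$' apply to each line. The function name find_program_number and the sample inputs are hypothetical.

```python
import re

# Trigger:            ^%.*\LF.*O<[0-9]{4}>([^0-9]+|$)
# Python translation: \LF -> \n, <...> -> capturing group 1.
PROGRAM_NAME = re.compile(r"^%.*\n.*O([0-9]{4})(?:[^0-9]+|$)", re.MULTILINE)

def find_program_number(received: str):
    """Return the 4-digit program number on the line after a '%' line, or None."""
    match = PROGRAM_NAME.search(received)
    return match.group(1) if match else None

print(find_program_number("%\nO2123\nN10 G21\n"))   # '%' line followed by O2123
print(find_program_number("%\nO12345\nN10 G21\n"))  # 5 digits after O, rejected
print(find_program_number("O2123\nN10 G21\n"))      # no preceding '%' line, rejected
```

The second call returns no match for the reason given above: after O1234 comes the digit 5, which is neither a non-digit character nor the end of the line.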