AttributeClassifier

From fmepedia

AttributeClassifier is a Workbench Transformer.

Description

Tests if the contents of the source attribute are entirely of a particular character classification, and routes the feature accordingly.

Empty strings are not considered part of any class and so will always fail.

The following character classes are available:

  • Alphanumeric -- Any Unicode alphabet or digit character.
  • Alphabetic -- Any Unicode alphabet character.
  • ASCII -- Any character with a value less than \u0080 (those that are in the 7-bit ASCII range).
  • Boolean -- The value matches '0', '1', 'false', 'true', 'no', or 'yes'.
  • Control -- Any Unicode control character.
  • Date -- Any date in the YYYYMMDD format.
  • Digit -- Any Unicode digit character. Note that this includes characters outside of the [0-9] range.
  • Double -- A double-precision number, made up of the following fields in order: white space; a sign; a sequence of digits; a decimal point; a sequence of digits; the letter "e"; and a signed decimal exponent. Any of the fields may be omitted, except that digits must be present either before or after the decimal point, and if the "e" is present it must be followed by the exponent number.
  • False -- The value matches '0', 'false', or 'no'.
  • Graphical -- Any Unicode printing character, except space.
  • Hexdigit -- Any hexadecimal digit character ([0-9A-Fa-f]).
  • Integer -- An integer number, defined as a sequence of integer digits, optionally signed and optionally preceded by white space. If the first two characters of the string are "0x", the string is expected to be in hexadecimal form; otherwise, if the first character is "0", the string is expected to be in octal form; otherwise, the string is expected to be in decimal form.
  • Lowercase -- Any Unicode lowercase alphabet character.
  • Not a Number -- A floating point number equal to the special "not a number" value.
  • Printable -- Any Unicode printing character, including space.
  • Punctuation -- Any Unicode punctuation character.
  • Space -- Any Unicode space character.
  • True -- The value matches '1', 'true', or 'yes'.
  • Uppercase -- Any Unicode uppercase alphabet character.
  • Wordchar -- Any Unicode word character. That is, any alphanumeric character or any Unicode connector punctuation character (e.g., underscore).
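Several of the classes above (Boolean, True, False, Date, Integer) test the value as a whole rather than character by character. The following Python sketch illustrates those descriptions; it is not FME's implementation. The function names are hypothetical, and exact case-sensitive matching and calendar validation of dates are assumptions that the descriptions do not confirm.

```python
import re
from datetime import datetime

# Literal values taken from the True and False class descriptions.
# Assumption: matching is exact and case-sensitive.
TRUE_VALUES = {"1", "true", "yes"}
FALSE_VALUES = {"0", "false", "no"}

def is_true(value: str) -> bool:
    return value in TRUE_VALUES

def is_false(value: str) -> bool:
    return value in FALSE_VALUES

def is_boolean(value: str) -> bool:
    # Boolean is the union of the True and False value lists.
    return is_true(value) or is_false(value)

def is_date(value: str) -> bool:
    # Assumption: exactly eight ASCII digits that form a real calendar date.
    if re.fullmatch(r"[0-9]{8}", value) is None:
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True

# Optional white space, optional sign, then hexadecimal ("0x" prefix),
# octal (leading "0"), or decimal digits, as the Integer description reads.
INTEGER_PATTERN = re.compile(r"\s*[+-]?(0x[0-9A-Fa-f]+|0[0-7]*|[1-9][0-9]*)")

def is_integer(value: str) -> bool:
    return INTEGER_PATTERN.fullmatch(value) is not None
```

Read literally, the Integer description means a value such as 08901 would not pass, because the leading 0 marks it as octal and 8 and 9 are not octal digits.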

For an attribute to pass, all of its contents must belong to the specified classification.
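The two rules stated here (empty strings always fail, and every character must belong to the class) can be sketched in Python for a few of the character classes. This is an approximation for illustration only: the `classify` function and the `CHAR_TESTS` table are hypothetical names, and Python's Unicode predicates may not agree with FME's on every character.

```python
import unicodedata

# Per-character predicates approximating some of the classes listed above.
CHAR_TESTS = {
    "Alphabetic": str.isalpha,
    "Alphanumeric": str.isalnum,
    "ASCII": lambda ch: ord(ch) < 0x80,
    # Unicode decimal digits, which includes characters outside [0-9].
    "Digit": lambda ch: unicodedata.category(ch) == "Nd",
    "Hexdigit": lambda ch: ch in "0123456789abcdefABCDEF",
    "Lowercase": str.islower,
    "Uppercase": str.isupper,
    "Space": str.isspace,
}

def classify(value: str, char_class: str) -> bool:
    """Return True only if value is non-empty and every character passes."""
    if not value:
        # Empty strings are not part of any class, so they always fail.
        return False
    test = CHAR_TESTS[char_class]
    return all(test(ch) for ch in value)
```

For example, `classify("abc123", "Alphanumeric")` passes, while `classify("abc 123", "Alphanumeric")` fails because of the single space.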



Not a Number

This test is often misunderstood. It checks for the presence of a NaN marker; it does not check whether the value itself is numeric.


Correct Use

Here's an example of correct use.

You overlay points onto a raster elevation model. Some points fall outside of the model and receive an elevation of NaN. You can now use the AttributeClassifier to test which points are NaN.


Incorrect Use

Here's an example of incorrect use.

You have an attribute that represents a US zipcode, and you want to filter out values such as ABCDE that are not proper numeric values. You cannot use the Not a Number test for this. It only identifies a feature whose attribute value is specifically "NaN"; a value such as ABCDE is not recognized as NaN.
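The difference between the two cases can be illustrated with a small Python analogue. It is only a sketch of the idea, not of how FME evaluates the test, and `is_nan` is a hypothetical helper name.

```python
import math

def is_nan(value: str) -> bool:
    """True only when the value parses to the floating point NaN marker."""
    try:
        return math.isnan(float(value))
    except ValueError:
        # Text such as "ABCDE" is not a number, but it is not NaN either.
        return False

# Elevations sampled from a raster; the second point fell outside the model.
elevations = ["12.5", "NaN", "98.1"]
outside_model = [e for e in elevations if is_nan(e)]
```

Here `is_nan("ABCDE")` and `is_nan("12345")` give the same answer, which is why a NaN test cannot separate numeric zipcodes from non-numeric ones.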


What should you do?

The zipcode case above is easy: use the AttributeClassifier to check whether the value is an Integer instead. Other cases are harder, and you may need a StringSearcher to check for alphabetic characters.
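As a rough illustration of the pattern-based alternative, the Python sketch below shows the kind of test one might express as a regular expression. The function names and both patterns are assumptions made for this example; they are not taken from FME. A pattern test may also be worth considering for zipcodes that begin with 0, since the Integer description treats a leading 0 as the start of an octal number.

```python
import re

def has_alphabetic(value: str) -> bool:
    """True if the value contains at least one ASCII letter."""
    return re.search(r"[A-Za-z]", value) is not None

def looks_like_zipcode(value: str) -> bool:
    """True if the value is exactly five ASCII digits (assumed zipcode form)."""
    return re.fullmatch(r"[0-9]{5}", value) is not None
```

With these helpers, ABCDE is caught by `has_alphabetic`, and 08901 is accepted by `looks_like_zipcode`.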
