Pyfme Attribute Type Handling
From fmepedia
Overview
This technote describes how FME Feature attributes are accessed via the PyFME API.
Background
PyFME is first and foremost a wrapper for the FME Objects API. As such, most PyFME methods produce exactly the same results as their counterparts in the original C++ FME Objects API. An additional mandate for the PyFME module is to include certain API 'enhancements' that take advantage of the Python language.
FME Attribute Types
Internally, each FME attribute has a 'native' type, e.g. String, Integer, Float, etc. The FME Objects API supports retrieving attribute values via access methods that explicitly dictate the type of the retrieved value. For example, getStringAttribute will always retrieve a string value. The attribute value getter methods will never fail due to a type mismatch; FME will implicitly cast attribute values to match the requested type.
Recent versions of FME provide an FMEFeature method for introspecting the type of each attribute.
String Encoding
Internally, FME supports two types of string attributes: "string" and "encoded string". "string" attribute values are always encoded using the system encoding, whereas "encoded string" attributes can be encoded with an arbitrary string encoding. Each "encoded string" attribute knows exactly what its string encoding is. Encoded string attributes are a recent (FME 2006) addition to FME, and are not fully supported by all readers and writers.
FME Objects supports two pairs of methods for accessing the values of string attributes: get/setStringAttribute and get/setEncodedAttribute. The key difference between them is that get/setStringAttribute always uses the system encoding. If you attempt to access an encoded string attribute using get/setStringAttribute, the attribute value will be implicitly transcoded to the default system encoding.
Implicit transcoding facilitates backwards compatibility with previous users of the FME Objects API, but has the potential to cause problems when being used to access attribute values that use a much richer string encoding, such as Unicode.
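The kind of loss that implicit transcoding can cause is easy to reproduce in plain Python, without any FME calls. The following is a minimal sketch, assuming a system encoding of cp1252 (chosen only for illustration) and using Python 3 string handling; it is not a description of how FME performs the conversion.

```python
# Pure-Python illustration of lossy transcoding; no FME calls are involved.
original = "Z\u00fcrich \u6771\u4eac"  # a Latin name followed by two CJK characters

# Assumption for this example only: the "system encoding" is cp1252.
system_encoding = "cp1252"

# Transcoding to the narrow encoding: characters it cannot represent
# are replaced with '?' and cannot be recovered afterwards.
narrow_bytes = original.encode(system_encoding, errors="replace")
round_tripped = narrow_bytes.decode(system_encoding)

# A Unicode encoding such as UTF-8 preserves the full value.
utf8_round_tripped = original.encode("utf-8").decode("utf-8")

print(round_tripped == original)       # False: the CJK characters were lost
print(utf8_round_tripped == original)  # True
```

Only the characters that the narrow encoding can represent survive the round trip, which is why the choice of access method matters for Unicode data.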
get/setEncodedAttribute, on the other hand, permits users to specify exactly what encoding to use when accessing attribute values.
Generally speaking, get/setEncodedAttribute should be used whenever there is any possibility of accessing encoded string attribute values. Many FME Readers and Writers now use encoded string attributes by default, making this scenario increasingly likely.
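As a rough mental model, an encoded string attribute can be pictured as a byte sequence stored together with the name of its encoding, so that a caller can ask for the value in whatever encoding it needs. The sketch below uses a hypothetical EncodedValue class and as_encoding method invented for this illustration; they are not part of FME Objects or PyFME, and the real get/setEncodedAttribute signatures are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodedValue:
    """Hypothetical stand-in for an "encoded string" attribute value."""

    raw: bytes     # the stored bytes
    encoding: str  # the encoding those bytes are written in

    def as_encoding(self, target: str) -> bytes:
        """Return the value transcoded to the encoding the caller asks for."""
        return self.raw.decode(self.encoding).encode(target)


# The value is stored as UTF-16 but requested as UTF-8.
name = EncodedValue("Qu\u00e9bec".encode("utf-16-le"), "utf-16-le")
utf8_bytes = name.as_encoding("utf-8")
```

Because the value carries its own encoding, the caller decides the target encoding explicitly rather than inheriting the system default.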
For deeper background regarding string encoding, Joel Spolsky's article on string encoding (http://www.joelonsoftware.com/articles/Unicode.html) is highly recommended.
Python Type System
Python is a dynamic programming language with a dynamic type system; it does not enforce the strict static typing required by languages such as C++.
The PyFME API uses four native Python types for accessing FME attribute values: Integer, Float, String, and UnicodeString.
More information regarding Python types can be found here (http://docs.python.org/lib/types.html).
Attribute Access with PyFME
PyFME supports access to Feature attributes with either explicit or implicit typing.
Depending on the nature of the application, both methods of attribute access are useful.
Explicit Typing
Explicit typing guarantees that getter methods will return a specific type, and setter methods will be passed a Python object with a specific type. This has two very important implications:
- Getter methods will perform type conversion as necessary.
- Setter methods will throw an exception if there is a type mismatch, e.g. if you pass an Integer object to the setStringAttribute method.
The following methods provide explicit typing:
- getStringAttribute: returns a Python String object
- setStringAttribute: requires a Python String object. System string encoding is assumed.
- getEncodedStringAttribute: returns a Python String object with the specified encoding
- setEncodedStringAttribute: requires a Python String object whose encoding matches the specified encoding
- getIntegerAttribute: returns a Python Integer object
- setIntegerAttribute: requires a Python Integer object
- getFloatAttribute: returns a Python Float object
- setFloatAttribute: requires a Python Float object
- getUnicodeString: returns a Python UnicodeString object
- setUnicodeString: requires a Python UnicodeString object. An FME encoded string attribute value will be created.
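The two implications of explicit typing can be sketched with a small stand-in class. SketchFeature below is hypothetical and is not the PyFME feature class; it only borrows a few of the method names listed above. The use of TypeError and of Python's own int/float/str conversions are assumptions made for this example, since this technote does not state which exception PyFME raises or how FME casts values internally.

```python
class SketchFeature:
    """Hypothetical stand-in for an FME feature; not the PyFME class."""

    def __init__(self):
        self._attrs = {}

    # Setters: reject a Python object of the wrong type.
    def setStringAttribute(self, name, value):
        if not isinstance(value, str):
            raise TypeError("setStringAttribute requires a string")
        self._attrs[name] = value

    def setIntegerAttribute(self, name, value):
        # bool is a subclass of int in Python; reject it to keep the check strict.
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError("setIntegerAttribute requires an integer")
        self._attrs[name] = value

    # Getters: always return the requested type, converting as necessary.
    # The conversions below follow Python's rules, not FME's.
    def getStringAttribute(self, name):
        return str(self._attrs[name])

    def getIntegerAttribute(self, name):
        return int(self._attrs[name])

    def getFloatAttribute(self, name):
        return float(self._attrs[name])


feature = SketchFeature()
feature.setIntegerAttribute("lanes", 4)

as_string = feature.getStringAttribute("lanes")  # converted to a string
as_float = feature.getFloatAttribute("lanes")    # converted to a float

try:
    feature.setStringAttribute("name", 42)       # wrong type for this setter
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True
```

The getter chosen by the caller determines the returned type regardless of how the value was stored, while the setter refuses a mismatched object instead of converting it.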
Implicit Typing
Implicit typing takes advantage of the type introspection functions provided by the Python API and the FME Objects API. Simply put, implicit attribute access provides direct access to attribute values in their 'native' types. Two attribute access methods provide implicit typing: getAttribute and setAttribute.
getAttribute introspects the 'native' type of the specified FME attribute, and returns a Python object whose type most closely matches the type of the original attribute value.
setAttribute introspects the type of the provided Python object, and creates an FME attribute value with a type that most closely matches it.
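Implicit typing can be pictured as a lookup in both directions between Python types and FME 'native' types. The sketch below uses a hypothetical ImplicitSketchFeature class and a made-up mapping table; the actual set of native types and the matching rules used by PyFME are not specified in this technote, and Python 3's str stands in for the string types here.

```python
# Assumed mapping for illustration only; PyFME's real rules may differ.
PYTHON_TO_NATIVE = {int: "Integer", float: "Float", str: "String"}
NATIVE_TO_PYTHON = {"Integer": int, "Float": float, "String": str}


class ImplicitSketchFeature:
    """Hypothetical stand-in for an FME feature; not the PyFME class."""

    def __init__(self):
        self._attrs = {}  # attribute name -> (native type label, value)

    def setAttribute(self, name, value):
        # Introspect the Python object's type to pick the 'native' type.
        native = PYTHON_TO_NATIVE.get(type(value))
        if native is None:
            raise TypeError("no matching native type for %r" % type(value))
        self._attrs[name] = (native, value)

    def getAttribute(self, name):
        # Introspect the stored 'native' type to pick the Python type.
        native, value = self._attrs[name]
        return NATIVE_TO_PYTHON[native](value)


feature = ImplicitSketchFeature()
feature.setAttribute("lanes", 4)
feature.setAttribute("width", 3.5)
feature.setAttribute("name", "Main St")

lanes = feature.getAttribute("lanes")
width = feature.getAttribute("width")
street = feature.getAttribute("name")
```

The caller never names a type: each value comes back in the Python type that corresponds to how it was stored.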
