XML Schema: Understanding Datatypes



    For example, 100.0, 200.0, and so on are values in the value space of datatype float. The value 100.0 can be represented using multiple literals such as 10.0E+1, 1.0E2, 1.0E+2, and so on. Similarly, the value 200.0 can be represented using multiple literals such as 2.0E2, 2.0E+2, and so on. All such literals for every value in the value space of float belong to the lexical space of datatype float. (See Figure 1.)

    Figure 1: A value in the value space can map to many literals in the lexical space.

    Canonical Lexical Representation

    A canonical lexical representation is a set of literals from among the valid set of literals for a datatype such that there is a one-to-one mapping between literals in the canonical lexical representation and values in the value space. (See Figures 2 and 3.)

    Figure 2: Many literals in the lexical space map to exactly one literal in the canonical lexical representation.

    XML Schema: Understanding Datatypes http://www.oracle.com/technetwork/articles/srivastava-datatypes-087961.html?printOnly=1


    Figure 3: There is always a one-to-one mapping from the value space to the canonical lexical representation.

    Canonical representations serve no purpose within XML Schema itself, but they are useful in other specifications that use XML Schema datatypes. For example, the XQuery/XPath data model uses XML Schema types and relies on the canonical lexical representation to serialize a value. Therefore, when serializing a value such as 100.0, the corresponding canonical lexical representation is used: in this case, 1.0E2.

    Datatypes in XML Schema

    Now that we understand the fundamental concepts behind datatypes in general, let's explore the datatypes available in XML Schema. Broadly speaking, the datatypes in XML Schema can be categorized as ur-Type, built-in, and user-derived (see Table 1 below) and are related to each other as shown in Figure 4.

    ur-Type              anyType, anySimpleType
    Built-in (Atomic)    Primitive, Derived
    User-Derived         Restriction, List, Union

    Table 1: XML Schema Datatype Classification


    Figure 4: Relationships between datatypes supported by XML Schema

    Now, let's examine the major classifications (ur-Type, built-in, and user-derived) more closely.

    ur-Type

    An ur-Type is a classification that says there exists a base or root of the entire type system hierarchy in XML Schema datatypes. Any and every datatype in XML Schema has the ur-Type as its parent or ancestor. The ur-Type has a role similar to that of java.lang.Object in Java, which is the base class of all built-in and user-defined classes in that language. Similarly, the ur-Type is the base of all datatypes in XML Schema. anyType and anySimpleType are the two ur-Types available in XML Schema.

    anyType

    The anyType datatype is a concrete ur-Type that can serve either as a complex type (non-scalar data, that is, child elements) or as a simple type (scalar data), depending on the context. For example, here is an XML Schema using the anyType datatype:

    Here is the corresponding valid instance using scalar data:

    USD


    And here is the corresponding valid instance using non-scalar data:

    100
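
    The anyType declaration and the two instances described in this section might look like the following. This is a minimal sketch rather than the article's own listing: the element name Currency is borrowed from later in the article, while the xsd prefix and the child element amount are assumptions made for illustration.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <!-- Currency may hold character data, child elements, or both -->
      <xsd:element name="Currency" type="xsd:anyType"/>

      <!-- Valid (scalar):     <Currency>USD</Currency> -->
      <!-- Valid (non-scalar): <Currency><amount>100</amount></Currency> -->

    </xsd:schema>
    ```

    The non-scalar instance is accepted even though amount is not declared anywhere, because the content of anyType is validated laxly.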

    anySimpleType

    The anySimpleType datatype is also a concrete ur-Type, and is the parent of all built-in datatypes and ancestor of all user-derived scalar datatypes. It differs from anyType in the sense that it can hold only scalar data corresponding to any scalar datatype, whereas anyType can hold scalar as well as non-scalar data. For example, here is an XML Schema using the anySimpleType datatype:

    Here is the corresponding valid instance using scalar data:

    USD

    And here is the corresponding invalid instance using non-scalar data:

    100
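
    A comparable sketch for anySimpleType is shown below. As before, the element name Currency comes from later in the article, and the xsd prefix and the child element amount are hypothetical names chosen for illustration.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <!-- Currency may hold any scalar value, but no child elements -->
      <xsd:element name="Currency" type="xsd:anySimpleType"/>

      <!-- Valid (scalar):       <Currency>USD</Currency> -->
      <!-- Invalid (non-scalar): <Currency><amount>100</amount></Currency> -->

    </xsd:schema>
    ```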


    In fact, if you don't specify any type for an element declaration, its type defaults to anyType, and if you don't specify any type for an attribute declaration, its type defaults to anySimpleType. In the example below, the type of element Currency defaults to anyType and the type of attribute MoreCurrency defaults to anySimpleType.
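
    Declarations that rely on these defaults could be sketched as follows. The names Currency and MoreCurrency come from the text; declaring both at the top level of the schema and using the xsd prefix are assumptions.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <!-- No type attribute: the element's type defaults to anyType -->
      <xsd:element name="Currency"/>

      <!-- No type attribute: the attribute's type defaults to anySimpleType -->
      <xsd:attribute name="MoreCurrency"/>

    </xsd:schema>
    ```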

    Built-in Datatypes

    Built-in datatypes, which are defined in the W3C XML Schema Datatypes Specification, must be supported by all W3C XML Schema-compliant parsers. There are two classifications of built-in datatypes: primitive and derived. The differences between the two have little relevance for the user, but we will examine them here anyway to demonstrate the mechanics and utility of datatype generation. (See the built-in datatype inheritance diagram in the W3C specification.)

    Built-in Primitive Datatypes

    Primitive datatypes are indivisible. They are not defined in terms of other datatypes; they exist independently. For example, decimal is a well-defined mathematical concept that cannot be defined in terms of any other datatype. The XML Schema Datatypes Specification supports the following 19 built-in primitive datatypes:

    string
    boolean
    decimal
    float
    double
    duration
    dateTime
    time
    date
    gYearMonth
    gYear
    gMonthDay
    gDay
    gMonth
    hexBinary
    base64Binary
    anyURI
    QName
    NOTATION


    "value space." Constraining the "value space" consequently constrains the "lexical space." Remember, the value space of a datatype can only be restricted, not extended. The XML Schema restriction construct is used to create user-derived datatypes by restricting an existing datatype with the allowed constraining facets. For example, a string of length 3 can be expressed as:

    In the above example, an anonymous user-derived datatype (with string as its base datatype) is defined along with the constraining facet length. The same example can be written using a named user-derived datatype for reusability:
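
    Both forms of the length-3 string restriction can be sketched in one schema. This is only an illustration under stated assumptions: the element names CurrencyCode and OtherCode, the type name String3, and the xsd prefix are all hypothetical.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <!-- Anonymous form: the restricted type is declared inline -->
      <xsd:element name="CurrencyCode">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:length value="3"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>

      <!-- Named form: the same restriction, declared once and reusable -->
      <xsd:simpleType name="String3">
        <xsd:restriction base="xsd:string">
          <xsd:length value="3"/>
        </xsd:restriction>
      </xsd:simpleType>
      <xsd:element name="OtherCode" type="String3"/>

    </xsd:schema>
    ```

    With either form, a value such as USD is valid, while US or USDX is not.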

    Following are the 12 constraining facets in XML Schema, which can be used to create a user-derived datatype from other available built-in datatypes. Which facets are applicable, however, depends on the base datatype:

    length
    minLength
    maxLength
    pattern
    enumeration
    whiteSpace
    maxInclusive
    maxExclusive
    minExclusive
    minInclusive
    totalDigits
    fractionDigits

    User-Derived List Datatype

    In XML Schema, a list is a sequence of homogeneous items separated by white space (spaces, tabs, carriage returns, or newlines), where all the items in the list have the same datatype. It is similar to an array in Java, which is self-describing.

    The XML Schema list construct is used to create a list datatype. For example, a list of float can be created as follows:
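
    A list of float might be declared as in the sketch below. The element name Rates and the xsd prefix are assumptions made for illustration.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <xsd:element name="Rates">
        <xsd:simpleType>
          <!-- Each whitespace-separated item must be a valid float -->
          <xsd:list itemType="xsd:float"/>
        </xsd:simpleType>
      </xsd:element>

      <!-- Valid: <Rates>1.5 2.0E1 3</Rates> -->

    </xsd:schema>
    ```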

    A list need not always be of a built-in datatype; it can also be a list of a user-derived datatype. For example, a list of a user-derived datatype based on float, where the value is restricted to the range 10.0 to 20.0, can be expressed as:
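
    One way to sketch such a list is to nest an anonymous restriction of float inside the list declaration. The element name Rates and the xsd prefix are hypothetical, and the bounds are treated as inclusive, as the article states further on.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <xsd:element name="Rates">
        <xsd:simpleType>
          <xsd:list>
            <!-- Item type: a float restricted to 10.0 through 20.0 -->
            <xsd:simpleType>
              <xsd:restriction base="xsd:float">
                <xsd:minInclusive value="10.0"/>
                <xsd:maxInclusive value="20.0"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:list>
        </xsd:simpleType>
      </xsd:element>

    </xsd:schema>
    ```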


    To re-use the above defined list datatype, we must name the list datatype as follows:
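
    A named version could look like the following sketch, in which the type name RateList, the element name Rates, and the xsd prefix are assumptions made for illustration.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <!-- Named list type, reusable by any number of declarations -->
      <xsd:simpleType name="RateList">
        <xsd:list>
          <xsd:simpleType>
            <xsd:restriction base="xsd:float">
              <xsd:minInclusive value="10.0"/>
              <xsd:maxInclusive value="20.0"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:list>
      </xsd:simpleType>

      <xsd:element name="Rates" type="RateList"/>

    </xsd:schema>
    ```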

    A valid instance adhering to the above schema can hold a list of float values in the range 10.0 to 20.0, both inclusive:

    10.0 12.4 15.0

    In the above example, the items in the list are restricted to values from 10.0 to 20.0, but there is no restriction on the number of items in the list. If we want to restrict the number of items in the list to, say, 3, we can do that as follows:
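
    A sketch of a list limited to exactly three items is shown below. It derives a new type from a named list type by restriction; the names Rate, RateList, ThreeRates, and Rates, as well as the xsd prefix, are hypothetical.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <!-- Item type: a float restricted to 10.0 through 20.0 -->
      <xsd:simpleType name="Rate">
        <xsd:restriction base="xsd:float">
          <xsd:minInclusive value="10.0"/>
          <xsd:maxInclusive value="20.0"/>
        </xsd:restriction>
      </xsd:simpleType>

      <!-- List of any number of Rate items -->
      <xsd:simpleType name="RateList">
        <xsd:list itemType="Rate"/>
      </xsd:simpleType>

      <!-- Restriction of the list: length counts items, not characters -->
      <xsd:simpleType name="ThreeRates">
        <xsd:restriction base="RateList">
          <xsd:length value="3"/>
        </xsd:restriction>
      </xsd:simpleType>

      <xsd:element name="Rates" type="ThreeRates"/>

    </xsd:schema>
    ```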


    In the above example, we used the length facet to restrict the number of items in the list. For datatypes derived from a list datatype, regardless of the itemType of the list, only the following facets are allowed:

    length
    minLength
    maxLength
    pattern
    enumeration
    whiteSpace

    User-Derived Union Datatype

    A union datatype is created by taking a union of one or more other datatypes. The XML Schema union construct is used to create union datatypes. For example, a union of int and float datatypes can be expressed as:
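
    A union of int and float might be declared as in the following sketch. The element name currency follows the article's later reference to it, and the xsd prefix is an assumption.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <xsd:element name="currency">
        <xsd:simpleType>
          <!-- Member types are listed in the order they are tried -->
          <xsd:union memberTypes="xsd:int xsd:float"/>
        </xsd:simpleType>
      </xsd:element>

    </xsd:schema>
    ```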


    When validating the value of currency in the instance, it is first matched against datatype int. If it is not a valid int, then it is matched against datatype float. If it is not a valid float either, then an error is raised. As you can see, the order in which memberTypes are declared is indeed significant, but only from a datatype validator's perspective. From the user's perspective, the order of memberTypes is not significant at all.

    Similar to list, a union can be of primitive datatypes as well as user-derived datatypes. For example, a union of user-derived datatypes from int and float can be expressed as follows:
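
    Such a union can be sketched with two anonymous member types nested inside the union declaration. The element name currency and the xsd prefix are assumptions; the ranges are the inclusive bounds the article gives for this example.

    ```xml
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <xsd:element name="currency">
        <xsd:simpleType>
          <xsd:union>
            <!-- First member: an int from 10 through 20 -->
            <xsd:simpleType>
              <xsd:restriction base="xsd:int">
                <xsd:minInclusive value="10"/>
                <xsd:maxInclusive value="20"/>
              </xsd:restriction>
            </xsd:simpleType>
            <!-- Second member: a float from 30.0 through 40.0 -->
            <xsd:simpleType>
              <xsd:restriction base="xsd:float">
                <xsd:minInclusive value="30.0"/>
                <xsd:maxInclusive value="40.0"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:union>
        </xsd:simpleType>
      </xsd:element>

    </xsd:schema>
    ```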

    A valid instance adhering to the above schema can hold either a single int in the range 10 to 20 or a single float in the range 30.0 to 40.0, both inclusive:

    35.0


    Conclusion

    Now that you understand datatypes in XML Schema and their usage, moving to other constructs of XML Schema, which define complex element content, should be much easier.

    Rahul Srivastava is a senior member of the Oracle Application Server development team at Oracle and is presently working in the EAI space. He has contributed to the development of the Apache open-source Xerces2-J W3C-compliant validating XML parser, primarily in the area of W3C XML Schema. Rahul was also a contributor to JAXP and JSR-173 when working with Sun Microsystems as part of the Web services team.
