ims 4212: data modeling—attributes and domains 1 dr. lawrence west, management dept., university...
TRANSCRIPT
IMS 4212: Data Modeling—Attributes and Domains
1Dr. Lawrence West, Management Dept., University of Central [email protected]
Attributes and Domains
• Attribute Data Types
– Numeric Data Types
– Text Data Types
– Date Data Types
– Other Data Types
• Domain Constraints
IMS 4212: Data Modeling—Attributes and Domains
2Dr. Lawrence West, Management Dept., University of Central [email protected]
Non-key Attributes
• Previous discussion has focused on different kinds of attributes
– Primary keys
– Foreign keys
– Alternate keys (candidate keys)
• We have implied and sometimes illustrated other attributes
– Attributes that just contain information we want to record about an entity
– Do not serve as any sort of key
IMS 4212: Data Modeling—Attributes and Domains
3Dr. Lawrence West, Management Dept., University of Central [email protected]
Non-key Attributes (cont.)
• Four major issues with non-key attributes
– Identifying descriptive attributes
– Placing attributes in the correct entity
• Rules of normalization
• Appropriate functional dependency
– Identifying the appropriate data type for the attribute (field)
– Specifying appropriate domain business rules for the attribute (field)
IMS 4212: Data Modeling—Attributes and Domains
4Dr. Lawrence West, Management Dept., University of Central [email protected]
Attribute Data Types
• When attributes are instantiated in a database they become fields
• Fields must have a data type
– Similar to variable type in programming
– (Variants are not allowed)
• Data type specifies:
– Class of data (numeric, text, date, binary, other)
– Size of data within the class
IMS 4212: Data Modeling—Attributes and Domains
5Dr. Lawrence West, Management Dept., University of Central [email protected]
Attribute Data Types (cont.)
• A field’s type supports processing and storage efficiency
– Storage
• The number 123 can be represented with one byte
• The text “123” requires three bytes
– Processing
• “123” must be converted to a number before it can be used in an arithmetic calculation
• The number 123 can be used directly in an arithmetic calculation
IMS 4212: Data Modeling—Attributes and Domains
6Dr. Lawrence West, Management Dept., University of Central [email protected]
Attribute Data Types (cont.)
• Many DB operations require comparisons of values
• Computers perform numeric operations (comparisons and arithmetic) much more efficiently than textual comparisons or text operations
• But a tremendous amount of useful business data is textual
• Date values are stored and compared as numbers but displayed as conventional dates
• Choose the appropriate data type for your attribute
IMS 4212: Data Modeling—Attributes and Domains
7Dr. Lawrence West, Management Dept., University of Central [email protected]
Attribute Data Types (cont.)
• Some special cases
– Leading zeros in numbers
• ‘Numeric’ fields such as zip codes and social security numbers can start with zeros
• Storing them as numbers will strip off the leading zeros
• Store as text to retain leading zeros
• (And you will rarely perform arithmetic on them)
– International implications
• Many countries use letters in their postal (zip) codes
IMS 4212: Data Modeling—Attributes and Domains
8Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Data Types in SQL Server 2000
• Integers
– bigint Integers from -2^63 to 2^63 – 1 (8 bytes)
– int Integers from -2^31 to 2^31 – 1 (4 bytes)
– smallint Integers from -2^15 to 2^15 – 1 (2 bytes)
– tinyint Integers from 0 to 255 (1 byte)
• Note: only positive for tinyint
– bit 0 or 1 only
• Used for boolean values
• No True/False data type in SQL Server
IMS 4212: Data Modeling—Attributes and Domains
9Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Data Types in SQL Server 2000 (cont.)
• Decimal numeric values
– Money and SmallMoney are the workhorses of business programming
– money
• ± 2^63 (~ ± 922 trillion)
• decimal point fixed to four digits (1/10,000)
• no monetary formatting ($ or commas)
– smallmoney +/- 214,748.3648 fixed to four places
IMS 4212: Data Modeling—Attributes and Domains
10Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Data Types in SQL Server 2000 (cont.)
• Decimal numeric values– Decimal or Numeric types are equivalent– Allow specifying large values of decimal precision– approximate range +/- 10^38 – decimal point can be anywhere in the value– Decimal(p, s)
• p=1-38: Number of digits represented• s<p: Digits right of decimal place• p affects storage size—examples:
– p<=9 5 Bytes– 29 <= p <=38 17 Bytes
IMS 4212: Data Modeling—Attributes and Domains
11Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Data Types in SQL Server 2000 (cont.)
• Decimal numeric values (cont.)
– Float and Real
• Values stored in scientific notation
• Float(n) where n=number of digits in coefficient
• These are very imprecise
• Not commonly used in business
IMS 4212: Data Modeling—Attributes and Domains
12Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Precision
• The type of numeric field also limits the size of calculated results on that field
• You might choose tinyint for a quantity field in a retail application
• If all quantities are summed and the result exceeds 255 the result will be limited to 255
• There is a hierarchy of precedence for data type conversions
– E.g., Money_value * SmallMoney_value gives a Money_value
– TinyInt + SmallInt gives a SmallInt
IMS 4212: Data Modeling—Attributes and Domains
13Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Precision (cont.)
• Date Precedence
– DateTime
– SmallDateTime
• Character Precedence
– nText
– Text
– nVarChar
– VarChar
– Char
• Numeric Precedence
– Float
– Real
– Decimal
– Money
– SmallMoney
– BigInt
– Int
– SmallInt
– TinyInt
– Bit
IMS 4212: Data Modeling—Attributes and Domains
14Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Precision (cont.)
Uncomment each pair of SET/SELECT statements and see results
DECLARE @SmallIntOne SmallInt,@SmallIntTwo SmallInt,@SmallIntResultOne SmallInt,@IntResultOne Int
SET @SmallIntOne = 1000SET @SmallIntTwo = 1000
--SET @SmallIntResultOne = @SmallIntOne * @SmallIntTwo--SELECT @SmallIntResultOne
--SET @IntResultOne = @SmallIntOne * @SmallIntTwo--SELECT @IntResultOne
SET @IntResultOne = CAST(@SmallIntOne AS int) * @SmallIntTwoSELECT @IntResultOne
IMS 4212: Data Modeling—Attributes and Domains
15Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Data Types in SQL Server 2000 (cont.)
• Identity attributes
– int and bigint attributes can be identity attributes
– The field’s value for the first record will be automatically set to the identity seed value
• I usually set to 1,000 (int) or 10,000 (bigint)
– Each new record will get a new value incremented by the identity increment
• Almost always one
– Widely used for PKs
– Foreign Keys will be the same data type but will not be an identity attribute
IMS 4212: Data Modeling—Attributes and Domains
16Dr. Lawrence West, Management Dept., University of Central [email protected]
Numeric Data Types (cont.)
• BigInt ± 9,223,372,036,854,775,808
• Int ± 2,147,483,648
• SmallInt ± 32,768
• TinyInt + 0-255 (positive only)
• Decimal ± 10^38
• Money ± 922,337,203,685,477.5807
• SmallMoney +/- 214,748.3648
IMS 4212: Data Modeling—Attributes and Domains
17Dr. Lawrence West, Management Dept., University of Central [email protected]
Date Data Types in SQL Server 2000
• datetime
– January 1, 1753 to December 31, 9999
– Accurate to 3 hundredths of a second
• smalldatetime
– January 1, 1900 to June 6, 2079
– Accurate to one minute
– Remember the year 2000 problem
• Required time precision (rather than maximum date allowed) may determine datatype chosen
• Use “___Date” or “___Time” in naming to indicate stored precision
IMS 4212: Data Modeling—Attributes and Domains
18Dr. Lawrence West, Management Dept., University of Central [email protected]
Text Data Types in SQL Server 2000
• Unicode vs. nonUnicode
– Unicode allows the extended character set needed to represent many foreign, especially Oriental, languages
– Unicode uses two bytes per character
– nonUnicode limited to 255 characters and supports most European languages
• Text/character fields are specified with their maximum length
IMS 4212: Data Modeling—Attributes and Domains
19Dr. Lawrence West, Management Dept., University of Central [email protected]
Text Data Types in SQL Server 2000
• char(n) up to 8,000 characters
– Fixed length text field n characters long
– Shorter values will be padded with spaces
– Wasted space but efficient processing
– May require trimming when retrieving
• varchar(n) up to 8,000 characters
– variable length text field n characters long
• text(n) up to 2,147,483,647 characters
– Variable length text field allows for large text fields
– May not be searched with SQL comparisons
IMS 4212: Data Modeling—Attributes and Domains
20Dr. Lawrence West, Management Dept., University of Central [email protected]
Text Data Types in SQL Server 2000 (cont.)
• nchar(n), nvarchar, ntext are Unicode versions of char, varchar, and text data types
• Don’t be stingy declaring text fields but don’t be wasteful, either
– char data types can blow up database size quickly
– varchar and text fields can require additional processing
• Avoid searching within long text values unless necessary
• Full text indexing is available in SQL Server and other databases
IMS 4212: Data Modeling—Attributes and Domains
21Dr. Lawrence West, Management Dept., University of Central [email protected]
Other Data Types in SQL Server 2000
• binary, varbinary, and image are three different data types that store binary values
– images
– formatted documents such as Excel or Word documents
– Avoid storing binary data in the database
• uniqueidentifier is a field that automatically creates a value guaranteed to be unique anywhere in the world
• See text for info on cursor, rowversion, and sql_variant
– Don’t even think about using sql_variant for normal business data design
IMS 4212: Data Modeling—Attributes and Domains
22Dr. Lawrence West, Management Dept., University of Central [email protected]
Data Type Considerations
• The attribute data types are also the same datatypes available for use in stored procedures
– Parameters
– Variables
• Ensure thatvariable/parameterdata type is consistent with its use
– Will hold required value
– Is consistent with field datatypes with which it is compared
DECLARE @SmallIntOne SmallInt,@SmallIntTwo SmallInt,@SmallIntResultOne SmallInt,@IntResultOne Int
IMS 4212: Data Modeling—Attributes and Domains
23Dr. Lawrence West, Management Dept., University of Central [email protected]
Domain Constraints
• Domain constraints are critical limitations on the values that attributes can take on
– Is a value required?
• Always for PK attributes
• What about other attributes?
– Is value numeric, textual, boolean, date?
– Are there ranges the value must be in or must not be in?
– Are there limited choices for the value (e.g., subtype identifier codes, state abbreviations)
– Other examples
IMS 4212: Data Modeling—Attributes and Domains
24Dr. Lawrence West, Management Dept., University of Central [email protected]
Domain Constraints (cont.)
• Some domain constraints are simple
– Last names may not be null and must be alphabetic characters between two and 30 characters in length
• Some domain constraints can be more complex
– Social security numbers will be recorded as character strings with three numeric characters, a dash, two numeric characters, a dash, and four numeric characters and will be exactly eleven characters long
IMS 4212: Data Modeling—Attributes and Domains
25Dr. Lawrence West, Management Dept., University of Central [email protected]
Domain Constraints (cont.)
• Many domain constraints contain ranges
– Minimum or maximum values
• Numbers
• Dates
– Price must be greater than zero and less than 10,000
– Membership date must be after 6/6/55
• Domain constraints may specify a default value
– Quantity defaults to 1
– Purchase date defaults to current date
IMS 4212: Data Modeling—Attributes and Domains
26Dr. Lawrence West, Management Dept., University of Central [email protected]
Domain Constraints (cont.)
• Some domain constraints can be expressed in terms of calculated values when calculations refer to system values
– Birth date must be more than 21 years earlier than today
• More complex calculations to enforce business rules are implemented as operational constraints (triggers)
– Inventory quantity on hand is the sum of all items received less the sum of all items sold less the sum of all inventory adjustments
– Covered in the next lesson
IMS 4212: Data Modeling—Attributes and Domains
27Dr. Lawrence West, Management Dept., University of Central [email protected]
Domain Constraints
• Identifier Attributes (Primary Keys) will have special domain constraints
– Cannot be null
– Single attribute identifiers must be unique
– Composite identifiers must be unique (though each component attribute will not be)
• Some other attributes or sets of attributes may also be set to be unique
– SSN and driver's license number in the same entity
– Use an Index required to be unique to implement
IMS 4212: Data Modeling—Attributes and Domains
28Dr. Lawrence West, Management Dept., University of Central [email protected]
Domain Constraints (cont.)
• Two ways to implement domain constraints
– Specify each aspect of the domain for each attribute
– Specify named domains (rules)
• A "Last Name" domain may have all the rules for last name attributes
• Different entities with last name attributes would use the named domain for those attributesE.g., Salesrep, Employee, Customer, Supplier
• DBMS must support named domains (called rules in SQL Server)
• Bind rule to a column