Object Oriented Database Design - A Case Study Cliff Frazier CS457/657 December 6, 2002


Page 1

Object Oriented Database Design - A Case Study

Cliff Frazier
CS457/657

December 6, 2002

Page 2

Motivations

• Permanent access to internet-published Linux kernel programming information

• Explore object oriented database design

• Learn Enhanced Entity Relationship (EER) modeling

Page 3

Internet Sources of Programming Information

Page 4

DB Design Approach

• Data requirements

• Functional requirements

• Develop data model that represents our “miniworld” - EER modeling

• Convert data model to physical model

Page 5

Data Requirements

• Organize information on Linux kernel development

• Classify data by subject

• Query / View / Report capability

• Display mailing list and newsgroup data as threads

• Provide annotation capability
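The threading requirement can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each stored post keeps the Message-ID and In-Reply-To values found in mail headers, and the names `Post`, `build_threads`, and `render` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    message_id: str
    subject: str
    in_reply_to: Optional[str] = None  # None marks the start of a thread


def build_threads(posts):
    """Group posts under their parent; return (roots, children-by-parent-id)."""
    known = {p.message_id for p in posts}
    children = defaultdict(list)
    roots = []
    for post in posts:
        if post.in_reply_to in known:
            children[post.in_reply_to].append(post)
        else:
            # No parent in the database: treat the post as a thread root.
            roots.append(post)
    return roots, children


def render(roots, children, depth=0):
    """Return the threads as indented subject lines, depth-first."""
    lines = []
    for post in roots:
        lines.append("  " * depth + post.subject)
        lines.extend(render(children[post.message_id], children, depth + 1))
    return lines
```

A reply whose parent was never imported is shown as its own thread root, which keeps the display usable when the archive is incomplete.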

Page 6

LKPDB Functional Diagram

[Diagram: data sources (Mailing List, Newsgroup, FAQs, Tutorial, How To, ...) pass through Data Parse/Import into the LKPDB. Functions shown for the LKPDB: Classify, View, Report, Annotate, Query. Legend: Manual Function, Automated Function.]

Page 7

Data Parse/Import

• Use source specific data parsing rules where possible
  – Mailing lists & newsgroups
  – Specific rule set for each mailing list & each newsgroup
  – Automated data import

• Use generic data parsing rules otherwise
  – One rule set for each data source type
  – Manual assistance required for import
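The choice between the two kinds of rule set might look like the sketch below. The registry contents, the rule-set names, and the function `select_rules` are all hypothetical; the slides do not say how rules are stored.

```python
# Hypothetical rule registry: the names and rule contents are illustrative only.
SOURCE_RULES = {
    ("mailing_list", "linux-kernel"): "lkml-rules",
}
GENERIC_RULES = {
    "mailing_list": "generic-mailing-list-rules",
    "newsgroup": "generic-newsgroup-rules",
    "faq": "generic-faq-rules",
    "howto": "generic-howto-rules",
}


def select_rules(source_type, source_name):
    """Return (rule_set, automated) for one data source."""
    specific = SOURCE_RULES.get((source_type, source_name))
    if specific is not None:
        return specific, True  # source-specific rules: automated import
    return GENERIC_RULES[source_type], False  # generic rules: manual assistance
```

The boolean in the result records whether the import can run unattended or needs a person to assist, mirroring the two bullets above.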

Page 8

Mailing List Parsing - Header

Received: from vmg.prodigy.net by vmg with SMTP; Thu, 5 Dec 2002 12:17:39 -0500
X-Originating-IP: [209.116.70.75]
. . .
Date: Thu, 5 Dec 2002 09:03:03 -0800 (PST)
From: Linus Torvalds <[email protected]>
To: george anzinger <[email protected]>
cc: Jim Houston <[email protected]>, Stephen Rothwell <[email protected]>,
    LKML <[email protected]>, <[email protected]>,
    "David S. Miller" <[email protected]>, <[email protected]>,
    <[email protected]>, <[email protected]>, <[email protected]>,
    <[email protected]>
Subject: Re: [PATCH] compatibility syscall layer (lets try again)
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: [email protected]: bulk
X-Mailing-List: [email protected]
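A header like this one can be parsed with Python's standard `email` package. The sketch below is only an illustration under assumptions: the sample message is made up (placeholder addresses and IDs), and the output field names are hypothetical rather than taken from the LKPDB design.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Made-up sample in the same shape as the header above; addresses are placeholders.
SAMPLE = """\
Date: Thu, 5 Dec 2002 09:03:03 -0800 (PST)
From: Example Author <author@example.org>
To: Example Reader <reader@example.org>
Subject: Re: [PATCH] compatibility syscall layer (lets try again)
In-Reply-To: <parent-id@example.org>
Message-ID: <child-id@example.org>
X-Mailing-List: list@example.org

Body text starts after the first blank line.
"""


def parse_post(raw):
    """Extract the header fields a list-entry record would need."""
    msg = message_from_string(raw)
    author_name, author_addr = parseaddr(msg["From"])
    return {
        "author": author_name,
        "author_address": author_addr,
        "subject": msg["Subject"],
        "date_stamp": parsedate_to_datetime(msg["Date"]),
        "message_id": msg["Message-ID"],
        "in_reply_to": msg["In-Reply-To"],  # None for the first post of a thread
        "list": msg["X-Mailing-List"],
        "body": msg.get_payload(),
    }
```

The In-Reply-To and Message-ID values are what make thread reconstruction possible, and X-Mailing-List identifies which source-specific rule set applies.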

Page 9

Mailing List Parsing - Body

On Thu, 5 Dec 2002, george anzinger wrote:
>
> I think this covers all the bases. It builds boots and
> runs. I haven't tested nano_sleep to see if it does the
> right thing yet...

Well, it definitely doesn't, since at least this test is the wrong way
around (as well as being against the coding style whitespace rules ;-p):

+ if ( ! current_thread_info()->restart_block.fun){
+     return current_thread_info()->restart_block.fun(&parm);

Also, I would suggest against having a NULL pointer, and instead just
initializing it with a function that sets it to an error return (don't use
ENOSYS, since the system call _does_ exist, and ENOSYS is what old kernels
would return if you do it by hand by mistake. I'd suggest -EINTR, since
that will "DoTheRightThing(tm)" if we somehow get confused).

Linus

Page 10

Mailing List Parsing - Postscript

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected] majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
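One way to separate the three parts of a post body (new text, quoted text, and the list postscript) is sketched below. It assumes that quoted lines start with ">" and that the postscript begins at the "To unsubscribe from this list:" line, optionally preceded by a lone "-" line. The function name `split_body` is hypothetical.

```python
FOOTER_MARKER = "To unsubscribe from this list:"  # taken from the postscript shown above


def split_body(body):
    """Split a post body into (new_text, quoted_text, postscript) line lists."""
    lines = body.splitlines()
    # Everything from the footer marker onward is list boilerplate.
    footer_start = len(lines)
    for i, line in enumerate(lines):
        if line.lstrip("-").strip().startswith(FOOTER_MARKER):
            footer_start = i
            break
    # A lone "-" separator line just above the marker belongs to the footer too.
    if 0 < footer_start < len(lines) and lines[footer_start - 1].strip() == "-":
        footer_start -= 1
    content, postscript = lines[:footer_start], lines[footer_start:]
    quoted = [line for line in content if line.startswith(">")]
    new_text = [line for line in content if not line.startswith(">")]
    return new_text, quoted, postscript
```

Patch lines that start with "+" stay in the new text, so code fragments in a reply remain searchable while the repeated postscript can be dropped at import time.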

Page 11

Parsing for Linux Threads FAQ

What kinds of things should be threaded or multitasked?

If you are a programmer and would like to take advantage of multithreading, the natural question is what parts of the program should/should not be threaded. Here are a few rules of thumb (if you say "yes" to these, have fun!):

• Are there groups of lengthy operations that don't necessarily depend on other processing (like painting a window, printing a document, responding to a mouse-click, calculating a spreadsheet column, signal handling, etc.)?

• Will there be few locks on data (the amount of shared data is identifiable and "small")?

• Are you prepared to worry about locking (mutually excluding data regions from other threads), deadlocks (a condition where two COEs have locked data that other is trying to get) and race conditions (a nasty, intractable problem where data is not locked properly and gets corrupted through threaded reads & writes)?

• Could the task be broken into various "responsibilities"? E.g. Could one thread handle the signals, another handle GUI stuff, etc.?
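A generic FAQ rule set could split text into question and answer pairs along the lines of the sketch below. The rule used here is an assumption made for illustration: a question is a single-line paragraph ending in "?", and the paragraphs up to the next such line form its answer. The function name `split_faq` is hypothetical.

```python
def split_faq(text):
    """Split FAQ text into (question, answer) pairs.

    Assumption: a question is a single-line paragraph ending in "?",
    and the paragraphs up to the next such line form its answer.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    entries = []
    question, answer = None, []
    for para in paragraphs:
        is_question = "\n" not in para and para.endswith("?")
        if is_question:
            if question is not None:
                entries.append((question, "\n\n".join(answer)))
            question, answer = para, []
        elif question is not None:
            answer.append(para)
    if question is not None:
        entries.append((question, "\n\n".join(answer)))
    return entries
```

A rule this simple misfires on answers that themselves contain one-line questions, as the rules of thumb in this FAQ excerpt do, which is the kind of case where manual assistance during import is needed.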

Page 12

Classification

• Both automatic & manual modes

• Each entry classified based on keywords

• Multiple categories allowed

• Categories:
  – Scheduler
  – Virtual memory management
  – File system

Page 13

Classification Categories (cont)

  – Interprocess communication
  – Modules
  – Networking
  – Architecture related
  – Symmetric multiprocessing
  – Device drivers
  – Compiling
  – Debugging
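Keyword-based classification with multiple categories per entry can be sketched as below. The category names come from the slides; the keyword lists and the function `classify` are illustrative guesses, not part of the original design.

```python
import re

# Category names come from the slides; the keyword lists are illustrative guesses.
CATEGORY_KEYWORDS = {
    "Scheduler": {"scheduler", "sched", "runqueue"},
    "Virtual memory management": {"vm", "paging", "swap", "mmap"},
    "File system": {"filesystem", "vfs", "inode", "ext3"},
    "Symmetric multiprocessing": {"smp", "spinlock"},
    "Device drivers": {"driver", "irq"},
}


def classify(text, manual=()):
    """Return the sorted categories whose keywords appear in text, plus manual ones."""
    words = set(re.findall(r"[a-z0-9_]+", text.lower()))
    automatic = {cat for cat, keys in CATEGORY_KEYWORDS.items() if words & keys}
    return sorted(automatic | set(manual))
```

The `manual` argument stands in for the manual mode: a person can add categories that keyword matching missed, and the two sets are merged.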

Page 14

Query Operations

• SQL based

• Queries used for Views, Reports, Annotations, and Classification

• Primary use: SELECTs to search for and view or print certain data

• Also include keyword search capability
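The two kinds of query described above might look like this with Python's built-in `sqlite3` module. The table, its columns, and the sample rows are hypothetical, and a single `category` column is used only for brevity; allowing multiple categories per entry would need a separate link table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; column names loosely follow the attributes listed in the slides.
conn.execute(
    "CREATE TABLE list_entry ("
    " post_sn INTEGER PRIMARY KEY, author TEXT, subject TEXT,"
    " date_stamp TEXT, category TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO list_entry VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "A. Author", "[PATCH] scheduler fix", "2002-12-04", "Scheduler", "Fixes the runqueue lock."),
        (2, "B. Author", "Re: [PATCH] scheduler fix", "2002-12-05", "Scheduler", "Looks good."),
        (3, "C. Author", "ext3 question", "2002-12-05", "File system", "How does journaling work?"),
    ],
)


def by_category(category):
    """SELECT used for a view or report: all posts in one category, oldest first."""
    rows = conn.execute(
        "SELECT post_sn, subject FROM list_entry"
        " WHERE category = ? ORDER BY date_stamp, post_sn",
        (category,),
    )
    return rows.fetchall()


def keyword_search(keyword):
    """Keyword search over subject and body using LIKE."""
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT post_sn FROM list_entry"
        " WHERE subject LIKE ? OR body LIKE ? ORDER BY post_sn",
        (pattern, pattern),
    )
    return [sn for (sn,) in rows]
```

Parameter placeholders (`?`) keep user-entered keywords out of the SQL text itself.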

Page 15

Annotation Example from the Kernel HowTo

. . .

7. Now, give the make command -

The gcc compiler distributed with RedHat 7.0 will not compile the kernel correctly. They do supply a kernel compatible compiler as well, which is invoked with kgcc. On RH 7.0 distributions of Linux, change all occurrences of gcc to kgcc in the root level Makefile before giving the make command.
___________________________________________________________
    bash# cd /usr/src/linux
    bash# man nohup
    bash# nohup make bzImage &
    bash# man tail
    bash# tail -f nohup.out (.... to monitor the progress)
    This will put the kernel in /usr/src/linux/arch/i386/boot/bzImage
___________________________________________________________

. . .
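An annotation like the one in this example could be stored apart from the document and spliced in for display. The sketch below assumes an annotation is anchored by a character offset into the document text and that its size is the length of the note; that reading of the Offset and Size labels in the deck's EER diagram is a guess, and the names `Annotation` and `render_annotated` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    offset: int  # character position in the document text where the note attaches
    text: str    # the annotation itself

    @property
    def size(self):
        """Length of the annotation text in characters."""
        return len(self.text)


def render_annotated(doc_text, annotations, open_mark="[NOTE: ", close_mark="]"):
    """Return doc_text with each annotation spliced in at its offset."""
    out, cursor = [], 0
    for note in sorted(annotations, key=lambda a: a.offset):
        out.append(doc_text[cursor:note.offset])
        out.append(open_mark + note.text + close_mark)
        cursor = note.offset
    out.append(doc_text[cursor:])
    return "".join(out)
```

Keeping the original text untouched means the source document can be re-imported or displayed without annotations at any time.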

Page 16

Data Model

• Enhanced Entity Relationship (EER) modeling

• Enhanced = object oriented concepts

• Initial design: list entity types and their attributes

• Refinement: some attributes converted to relationships

Page 17

Mailing List Entity Attributes

• MAILING_LIST_POST
  – Name e.g. Linux-Kernel M.L. *
  – Serial number
  – Author
  – Subject
  – Date/time stamp
  – Header
  – Body of post *

• * Converted to relationships
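The refinement step, where the two starred attributes become relationships, can be illustrated with plain Python classes. This is a sketch under assumptions: the class names are hypothetical, and pairing Name with an INFO_SOURCE entity and Body with a DOC_TEXT entity is inferred from the labels in the deck's EER diagram rather than stated on this slide.

```python
from dataclasses import dataclass


@dataclass
class MailingListPostInitial:
    """Initial design: every item on the slide is a plain attribute."""
    name: str  # the mailing list's name
    serial_number: int
    author: str
    subject: str
    date_time_stamp: str
    header: str
    body: str


@dataclass
class InfoSource:
    name: str
    url: str


@dataclass
class DocText:
    text: str


@dataclass
class MailingListPostRefined:
    """Refinement: the two starred attributes become references to other entities."""
    source: InfoSource  # was the Name attribute
    serial_number: int
    author: str
    subject: str
    date_time_stamp: str
    header: str
    body: DocText  # was the Body of post attribute


def refine(post, sources):
    """Convert an initial-design post, sharing one InfoSource per list name."""
    source = sources.setdefault(post.name, InfoSource(post.name, url=""))
    return MailingListPostRefined(
        source, post.serial_number, post.author, post.subject,
        post.date_time_stamp, post.header, DocText(post.body),
    )
```

After refinement, every post from the same list points at one shared source object instead of repeating the list name as a string.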

Page 18

EER Diagrams

• Rectangle - entity

• Oval - attribute

• Diamond - relationship

• Structural constraints
  – Participation
  – Cardinality ratio

• Added types besides mailing lists

Page 19

[EER diagram; only its labels survive.]

Entity types: INFO_SOURCE, LIST, LIST_ENTRY, DOCUMENT, FAQ *, BOOK *, UNSTR. *, DOC_TEXT, A_TEXT

Relationships: INCLUDES, CONTAINS, Parent / Child

Specializations: two disjoint (d) specializations

Attributes: Name, URL, Type, Date Pub., Post_SN, Thread_SN, Date_Stamp, Author, Subject, Chap., Quest., Subs., Ans., Keyword, Category, Size, Offset

Other labels: TEXT, Annotated

Cardinality labels, in extraction order: N, 1, 1, 1, 1, N

* Same relationships to DOC_TEXT and A_TEXT as LIST_ENTRY

Page 20

Conclusion

• An OODB for Linux kernel programming information was designed using EER

• Attributes vs. relationship roles change during design process

• The design methodology influences the content of the DB

• Next project - Implement DB