Advanced RDBMS
1.0 Introduction
In an object-oriented database, information is represented in the form of
objects, as used in object-oriented programming.
When database capabilities are combined with object programming language
capabilities, the result is an object database management system (ODBMS). An
ODBMS makes database objects appear as programming language objects in one or
more object programming languages. An ODBMS supports the programming language
with transparently persistent data, concurrency control, data recovery,
associative queries, and other capabilities.
Object database
management systems grew out of research during the early to mid-1980s into
having intrinsic database management support for graph-structured objects. The
term "object-oriented database system" first appeared around 1985.
Object database management
systems added the concept of persistence to object programming languages. The
early commercial products were integrated with various languages: GemStone
(Smalltalk), Gbase (Lisp), and Vbase (COP). COP was the C Object Processor, a
proprietary language based on C that pre-dated C++. For much of the 1990s, C++
dominated the commercial object database management market. Vendors added Java
in the late 1990s and more recently, C#.
Starting in 2004, object
databases saw a second growth period when open source object databases
emerged that were widely affordable and easy to use, because they were written
entirely in OOP languages such as Java or C#. Examples include db4objects and
Perst (McObject).
Benchmarks between ODBMSs and
relational DBMSs have shown that ODBMS can be clearly superior for certain
kinds of tasks. The main reason for this is that many operations are performed
using navigational rather than declarative interfaces, and navigational access
to data is usually implemented very efficiently by following pointers.
Critics of navigational
database technologies such as ODBMSs suggest that pointer-based techniques
are optimized for very specific "search routes" or viewpoints. For
general-purpose queries on the same information, however, pointer-based
techniques tend to be slower and more difficult to formulate than relational
ones. Thus, the navigational approach appears to simplify specific known uses
at the expense of general, unforeseen, and varied future uses.
Other factors that work against
ODBMSs include the lack of interoperability with a great number of
tools and features that are taken for granted in the SQL world, including but
not limited to industry-standard connectivity, reporting tools, OLAP tools, and
backup and recovery standards. Additionally, object databases lack a formal
mathematical foundation, unlike the relational model, and this in turn leads to
weaknesses in their query support. However, this objection is offset by the
fact that some ODBMSs fully support SQL in addition to navigational access,
e.g. Objectivity/SQL++ and Matisse. Effective use may require compromises to
keep both paradigms in sync.
In fact there is an intrinsic
tension between the notion of encapsulation, which hides data and makes it
available only through a published set of interface methods, and the assumption
underlying much database technology, which is that data should be accessible to
queries based on data content rather than predefined access paths. Database-centric
thinking tends to view the world through a declarative and attribute-driven
viewpoint, while OOP tends to view the world through a behavioral viewpoint.
This is one of the many impedance mismatch issues surrounding OOP and
databases.
Although some commentators have
written off object database technology as a failure, the essential arguments in
its favor remain valid, and attempts to integrate database functionality more
closely into object programming languages continue in both the research and the
industrial communities.
1.1 Objectives
The objective
of this lesson is to learn object-oriented database concepts: object identity,
object structure, object database standards, languages and design, and an
overview of CORBA.
1.2 Content
1.2.1 Concepts for Object-Oriented Databases
A database is a
logical term for a collection of organized and related information. In a
business, for example, the collected information about customers, products,
prices and so on forms a database. Data is just data until it is organized in a
meaningful way, at which point it becomes information.
Through a
Database Management System one can insert, update, delete and view the records
in an existing file.
• Traditional data models: hierarchical, network (since the mid-60’s),
relational (since 1970 and commercially since 1982).
• Object-oriented (OO) data models since the mid-90’s.
• Reasons for the creation of object-oriented databases:
– Need for more complex applications
– Need for additional data modeling features
– Increased use of object-oriented programming languages
• Commercial OO database products: several in the 1990’s, but they did not
make much impact on mainstream data management.
• Languages: Simula (1960’s), Smalltalk (1970’s), C++ (late 1980’s), Java
(1990’s).
• Experimental systems: Orion at MCC, IRIS at H-P Labs, Open-OODB at T.I., ODE
at AT&T Bell Labs, Postgres - Montage - Illustra at UC/B, Encore/Observer at
Brown.
• Commercial OO database products: Ontos, Gemstone, O2 (-> Ardent),
Objectivity, Objectstore (-> Excelon), Versant, Poet, Jasmine (Fujitsu –
GM).
1.2.2 Overview of Object Oriented Concepts.
• MAIN CLAIM: OO databases try to maintain a direct correspondence between
real-world and database objects, so that objects do not lose their integrity
and identity and can easily be identified and operated upon.
• Object: has two components, state (value) and behavior (operations). It is
similar to a program variable in a programming language, except that it
typically has a complex data structure as well as specific operations defined
by the programmer.
• In OO databases, objects may have an object structure of arbitrary
complexity in order to contain all of the necessary information that describes
the object.
• In contrast, in traditional database systems, information about a complex
object is often scattered over many relations or records, leading to loss of
direct correspondence between a real-world object and its database
representation.
• The internal structure of an object in OOPLs includes the specification of
instance variables, which hold the values that define the internal state of the
object.
• An instance variable is similar to the concept of an attribute, except that
instance variables may be encapsulated within the object and thus are not
necessarily visible to external users.
• Some OO models insist that all operations a user can apply to an object must
be predefined. This forces a complete encapsulation of objects.
• To encourage encapsulation, an operation is defined in two parts:
– The signature or interface of the operation specifies the operation name and
arguments (or parameters).
– The method or body specifies the implementation of the operation.
• Operations can be invoked by passing a message to an object, which includes
the operation name and the parameters. The object then executes the method for
that operation.
• This encapsulation permits modification of the internal structure of an
object, as well as the implementation of its operations, without the need to
disturb the external programs that invoke these operations.
• Some OO systems provide capabilities for dealing with multiple versions of
the same object (a feature that is essential in design and engineering
applications).
• For example, an old version of an object that represents a tested and
verified design should be retained until the new version is tested and
verified. This is crucial for designs in manufacturing process control,
architecture, and software systems.
• Operator polymorphism: an operation’s ability to be applied to different
types of objects. In such a situation, an operation name may refer to several
distinct implementations, depending on the type of objects it is applied to.
This feature is also called operator overloading.
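The ideas of hidden instance variables, an operation split into signature and method, message passing, and one operation name bound to several implementations can be sketched in Python. This is only an illustration: the Employee and Department classes, the send helper, and the attribute names are assumptions made for the example, not part of any particular OODBMS.

```python
class Employee:
    """Hypothetical class; the leading underscore marks hidden instance variables."""

    def __init__(self, name, salary):
        self._name = name
        self._salary = salary

    # Signature: raise_salary(percent) returns the new salary.
    # The body below is the method, which callers never need to see.
    def raise_salary(self, percent):
        self._salary = round(self._salary * (1 + percent / 100), 2)
        return self._salary

    def describe(self):
        return f"Employee {self._name}"


class Department:
    """Hypothetical class with its own implementation of describe."""

    def __init__(self, dname):
        self._dname = dname

    def describe(self):
        return f"Department {self._dname}"


def send(obj, operation, *args):
    """Pass a message (operation name plus parameters) to an object."""
    return getattr(obj, operation)(*args)


emp = Employee("Smith", 30000)
dept = Department("Research")

new_salary = send(emp, "raise_salary", 10)
# The same operation name resolves to a different method per object type.
descriptions = [send(o, "describe") for o in (emp, dept)]
```

Because callers only use the operation names, the internal representation of either class could change without affecting the code that sends the messages.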
1.2.3 Object identity, Object Structure and Type constructors
• Unique identity: An OO database system provides a unique identity to each
independent object stored in the database. This unique identity is typically
implemented via a unique, system-generated object identifier, or OID.
• The main property required of an OID is that it be immutable; that is, the
OID value of a particular object should not change. This preserves the identity
of the real-world object being represented.
• Type constructors: In OO databases, the state (current value) of a complex
object may be constructed from other objects (or other values) by using certain
type constructors.
– The three most basic constructors are atom, tuple, and set. Other commonly
used constructors include list, bag, and array. The atom constructor is used to
represent all basic atomic values, such as integers, real numbers, character
strings, Booleans, and any other basic data types that the system supports
directly.
Example 1: a complex object, based on one possible relational database state
corresponding to the COMPANY schema.
We use i1, i2, i3, . . .
to stand for unique system-generated object identifiers. Consider the following
objects:
o1 = (i1, atom, ‘Houston’)
o2 = (i2, atom, ‘Bellaire’)
o3 = (i3, atom, ‘Sugarland’)
o4 = (i4, atom, 5)
o5 = (i5, atom, ‘Research’)
o6 = (i6, atom, ‘1988-05-22’)
o7 = (i7, set, {i1, i2, i3})
o8 = (i8, tuple, <dname:i5, dnumber:i4, mgr:i9, locations:i7, employees:i10,
projects:i11>)
o9 = (i9, tuple, <manager:i12, manager_start_date:i6>)
o10 = (i10, set, {i12, i13, i14})
o11 = (i11, set, {i15, i16, i17})
o12 = (i12, tuple, <fname:i18, minit:i19, lname:i20, ssn:i21, . . .,
salary:i26, supervisor:i27, dept:i8>)
The first six objects listed in this example
represent atomic values. Object 7 is
a set-valued object that represents the set of locations for department
5; the set refers to the atomic objects with values {‘Houston’, ‘Bellaire’,
‘Sugarland’}. Object 8 is a tuple-valued
object that represents department 5 itself, and has the attributes DNAME,
DNUMBER, MGR, LOCATIONS, and so on.
Example 2: This
example illustrates the difference between the two definitions for comparing
object states for equality. Two objects have identical states when their states
match exactly, including the OIDs they reference; they have equal states when
their states match at the atomic level, even if the values are reached through
different OIDs.
o1 = (i1, tuple, <a1:i4, a2:i6>)
o2 = (i2, tuple, <a1:i5, a2:i6>)
o3 = (i3, tuple, <a1:i4, a2:i6>)
o4 = (i4, atom, 10)
o5 = (i5, atom, 10)
o6 = (i6, atom, 20)
In this example, the objects o1 and o2
have equal states, since their states at the atomic level are the same,
but the values are reached through distinct objects o4 and o5.
However, the states of objects o1 and o3
are identical, even though the objects themselves are not, because they
have distinct OIDs. Similarly, although
the states of o4 and o5 are identical, the actual objects o4
and o5 are equal but not identical, because they have distinct OIDs.
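The two comparisons can be made concrete with a small Python sketch. It stores the six objects from the example in a dictionary keyed by OID; the dictionary layout and the function names identical_states, expand, and equal_states are assumptions made for illustration only.

```python
# OID -> (type constructor, state). Tuple states hold OIDs, atom states hold values.
db = {
    "i1": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i2": ("tuple", {"a1": "i5", "a2": "i6"}),
    "i3": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i4": ("atom", 10),
    "i5": ("atom", 10),
    "i6": ("atom", 20),
}


def identical_states(x, y):
    """Shallow comparison: same constructor and same state, referenced OIDs included."""
    return db[x] == db[y]


def expand(oid):
    """Replace every OID reference by the value it leads to, down to the atoms."""
    kind, state = db[oid]
    if kind == "atom":
        return state
    if kind == "tuple":
        return {attr: expand(ref) for attr, ref in state.items()}
    raise ValueError(f"unsupported constructor: {kind}")


def equal_states(x, y):
    """Deep comparison: the states match at the atomic level."""
    return expand(x) == expand(y)


o1_o2_identical = identical_states("i1", "i2")   # they reference i4 versus i5
o1_o2_equal = equal_states("i1", "i2")           # both expand to a1=10, a2=20
o1_o3_identical = identical_states("i1", "i3")   # same OIDs referenced
```

Object identity itself is simply OID comparison: "i1" and "i3" are different keys even though their states are identical.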
1.2.4 Encapsulation of Operations, Methods and Persistence
• Encapsulation is one of the main characteristics of OO languages and
systems.
• It is related to the concepts of abstract data types and information hiding
in programming languages.
• Specifying object behavior via class operations:
– The main idea is to define the behavior of a type of object based on the
operations that can be externally applied to objects of that type.
– In general, the implementation of an operation can be specified in a
general-purpose programming language that provides flexibility and power in
defining the operations.
– For database applications, the requirement that all objects be completely
encapsulated is too stringent.
– One way of relaxing this requirement is to divide the structure of an object
into visible and hidden attributes (instance variables).
– Example: adding operations to the definitions of Employee and Department.
• Specifying object persistence via naming and reachability:
– Naming mechanism: Assign an object a unique persistent name through which it
can be retrieved by this and other programs.
– Reachability mechanism: Make the object reachable from some persistent
object.
– An object B is said to be reachable from an object A if a sequence of
references in the object graph leads from object A to object B.
– In traditional database models such as the relational model or EER model, all
objects are assumed to be persistent.
– In the OO approach, a class declaration specifies only the type and
operations for a class of objects. The user must separately define a persistent
object of type set (DepartmentSet) or list (DepartmentList) whose value is the
collection of references to all persistent DEPARTMENT objects.
Creating persistent objects by naming and reachability:
define class DepartmentSet:
    type set(Department);
    operations
        add_dept(d: Department): Boolean;
            (* adds a department to the DepartmentSet object *)
        remove_dept(d: Department): Boolean;
            (* removes a department from the DepartmentSet object *)
        create_dept_set: DepartmentSet;
        destroy_dept_set: Boolean;
end DepartmentSet;
…….
persistent name AllDepartments: DepartmentSet;
(* AllDepartments is a persistent named object of type DepartmentSet *)
…..
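Persistence by reachability can be pictured as a graph walk that starts at the named persistent objects. The Python sketch below is a simplified model under stated assumptions: the Obj class, its refs list, and the persistent_labels function are hypothetical names, and a real OODBMS would perform this walk itself at commit time.

```python
class Obj:
    """Hypothetical in-memory object that holds references to other objects."""

    def __init__(self, label):
        self.label = label
        self.refs = []


def persistent_labels(named_roots):
    """Labels of every object reachable from the persistent named objects."""
    seen = {}
    stack = list(named_roots.values())
    while stack:
        obj = stack.pop()
        if id(obj) not in seen:
            seen[id(obj)] = obj
            stack.extend(obj.refs)
    return sorted(o.label for o in seen.values())


all_departments = Obj("AllDepartments")   # the named persistent object
research = Obj("Research")
smith = Obj("Smith")
scratch = Obj("Scratch")                  # never linked to a persistent object

all_departments.refs.append(research)     # plays the role of add_dept(research)
research.refs.append(smith)               # the department references an employee

stored = persistent_labels({"AllDepartments": all_departments})
```

Only the objects on some reference path from AllDepartments end up in stored; the scratch object stays transient.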
1.2.5 Type Hierarchies and Inheritance
Type (class) hierarchy:
A type in its simplest form can be defined by
giving it a type name and then listing the names of its visible (public)
functions.
When specifying a type in this section, we
use the following format, which does not specify arguments of functions, to
simplify the discussion:
TYPE_NAME: function, function, . . . , function
Example: PERSON: Name, Address, Birthdate, Age, SSN
Subtype: used when the designer or user must create a new type that is similar
but not identical to an already defined type.
Supertype: the already defined type; the subtype inherits all of its functions.
Example (1):
EMPLOYEE: Name, Address, Birthdate, Age, SSN, Salary, HireDate, Seniority
STUDENT: Name, Address, Birthdate, Age, SSN, Major, GPA
OR:
EMPLOYEE subtype-of PERSON: Salary, HireDate, Seniority
STUDENT subtype-of PERSON: Major, GPA
Example (2): Consider a
type that describes objects in plane geometry, which may be defined as follows:
GEOMETRY_OBJECT: Shape, Area, ReferencePoint
Now
suppose that we want to define a number of subtypes for the GEOMETRY_OBJECT
type, as follows:
RECTANGLE subtype-of GEOMETRY_OBJECT: Width, Height
TRIANGLE subtype-of GEOMETRY_OBJECT: Side1, Side2, Angle
CIRCLE subtype-of GEOMETRY_OBJECT: Radius
An alternative way of declaring these three
subtypes is to specify the value of the Shape attribute as a condition that
must be satisfied for objects of each subtype:
RECTANGLE subtype-of GEOMETRY_OBJECT (Shape=‘rectangle’): Width, Height
TRIANGLE subtype-of GEOMETRY_OBJECT (Shape=‘triangle’): Side1, Side2, Angle
CIRCLE subtype-of GEOMETRY_OBJECT (Shape=‘circle’): Radius
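The GEOMETRY_OBJECT hierarchy maps naturally onto class inheritance. The following Python sketch is one possible rendering, not a prescribed one: the class and attribute names mirror the type definitions above, while the area formulas, the radian unit for Angle, and the default reference point are assumptions added for the example.

```python
import math


class GeometryObject:
    """Supertype: every geometry object has a shape and a reference point."""

    def __init__(self, shape, reference_point=(0.0, 0.0)):
        self.shape = shape
        self.reference_point = reference_point

    def area(self):
        raise NotImplementedError("each subtype supplies its own area")


class Rectangle(GeometryObject):
    def __init__(self, width, height):
        super().__init__("rectangle")   # the Shape condition for this subtype
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Triangle(GeometryObject):
    def __init__(self, side1, side2, angle):
        super().__init__("triangle")
        self.side1 = side1
        self.side2 = side2
        self.angle = angle              # assumed to be in radians

    def area(self):
        return 0.5 * self.side1 * self.side2 * math.sin(self.angle)


class Circle(GeometryObject):
    def __init__(self, radius):
        super().__init__("circle")
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


shapes = [Rectangle(3, 4), Triangle(3, 4, math.pi / 2), Circle(1)]
areas = [s.area() for s in shapes]
```

Each subtype inherits Shape and ReferencePoint from the supertype and adds only its own functions, which is what the subtype-of declarations express.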
• Extents: In most OO databases, the collection of objects in an extent has
the same type or class. This is not a necessary condition, however. Since the
majority of OO databases support types, we assume that extents are collections
of objects of the same type for the remainder of this section.
• Persistent collection: It holds a collection of objects that is stored
permanently in the database and hence can be accessed and shared by multiple
programs.
• Transient collection: It exists temporarily during the execution of a
program but is not kept when the program terminates.
1.2.6 Complex Objects
• Unstructured complex object: It is provided by a DBMS and permits the
storage and retrieval of large objects that are needed by the database
application.
• Typical examples of such objects are bitmap images and long text strings
(documents); they are also known as binary large objects, or BLOBs for short.
• This has been the standard way by which relational DBMSs have dealt with
supporting complex objects, leaving the operations on those objects outside the
RDBMS.
• Structured complex object: It differs from an unstructured complex object in
that the object’s structure is defined by repeated application of the type
constructors provided by the OODBMS. Hence, the object structure is defined and
known to the OODBMS. The OODBMS also defines methods or operations on it.
1.2.7 Other Object-Oriented Concepts
Object Database Standards
Why is a standard needed? A standard for an object model addresses the
following aspects:
• Portability: the ability to execute an application program on different
systems with minimal modifications to the program.
• Interoperability
The ODMG standard covers the object model, the object definition language
(ODL), the object query language (OQL), and bindings to object-oriented
programming languages.
The object model explains the data model upon which ODL and OQL are based. It
also provides data types and type constructors, much as the SQL report
describes a standard data model for relational databases.
Relation between an object and a literal: a literal has only a value but no
object identifier. An object has four characteristics:
• Identifier
• Name
• Lifetime (persistent or not)
• Structure (how to construct)
Object Database Language
a. Object Definition Language (ODL)
The Object Definition Language is designed to support the semantic constructs
of the ODMG data model. It is independent of any programming language. It is
used to create object specifications such as classes and interfaces, and to
specify a database schema.
b. Object Query Language (OQL)
The Object Query Language is:
• Embedded into one of the programming languages that have an ODMG binding
• Able to return objects that match the type system of that language
• Similar to SQL, with additional features (object identity, complex objects,
operations, inheritance, polymorphism, relationships)
c. OQL Entry Points and Iterator Variables
An entry point is a named persistent object (for many queries, it is the name
of the extent of a class). An iterator variable is used when a collection is
referenced in an OQL query.
d. OQL Query Results and Path Expressions
Any named persistent object is itself a query, whose result is a reference to
that persistent object. A path expression is used to specify a path to related
attributes and objects once an entry point is specified.
e. OQL Collection Operators
OQL collection operators include aggregate operators such as min, max, count,
sum, and avg.
Object Database Conceptual Design
Conceptual design differs between object databases (ODB) and relational
databases (RDB) chiefly in how relationships are handled:
ODB: relationships are handled by OID
references to the related objects.
RDB: relationships among tuples are
specified by attributes with matching values (value references).
ORDBMS: enhancing the capabilities of
RDBMS with some of the features in ODBMS.
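The difference between value references and OID references can be shown side by side. In the Python sketch below, plain dictionaries stand in for relational tuples and ordinary Python object references stand in for OID references; the sample data, attribute names, and the department_of_row function are all assumptions made for illustration.

```python
# RDB style: the relationship is a matching attribute value (dno = dnumber).
employee_rows = [
    {"ssn": "111", "lname": "Smith", "dno": 5},
    {"ssn": "222", "lname": "Wong", "dno": 4},
]
department_rows = [
    {"dnumber": 5, "dname": "Research"},
    {"dnumber": 4, "dname": "Administration"},
]


def department_of_row(ssn):
    """Find the department name by matching values, as a join would."""
    emp = next(e for e in employee_rows if e["ssn"] == ssn)
    dept = next(d for d in department_rows if d["dnumber"] == emp["dno"])
    return dept["dname"]


# ODB style: the relationship is a direct reference to the related object.
class Department:
    def __init__(self, dname):
        self.dname = dname
        self.employees = []        # inverse side of the relationship


class Employee:
    def __init__(self, lname, dept):
        self.lname = lname
        self.dept = dept           # reference to the Department object itself
        dept.employees.append(self)


research = Department("Research")
smith = Employee("Smith", research)

by_value = department_of_row("111")
by_reference = smith.dept.dname
```

Both lookups return the same department name, but the first searches for a matching value while the second simply follows a reference.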
Other Concepts of Object Databases
• Polymorphism (operator overloading): This concept allows the same operator
name or symbol to be bound to two or more different implementations of the
operator, depending on the type of objects to which the operator is applied.
• Multiple inheritance and selective inheritance: Multiple inheritance in a
type hierarchy occurs when a certain subtype T is a subtype of two (or more)
types and hence inherits the functions (attributes and methods) of both
supertypes. For example, we may create a subtype ENGINEERING_MANAGER that is a
subtype of both MANAGER and ENGINEER. This leads to the creation of a type
lattice rather than a type hierarchy.
• Versions and configurations: Many database applications that use OO systems
require the existence of several versions of the same object. There may be more
than two versions of an object.
• Configuration: A configuration of the complex object is a collection
consisting of one version of each module, arranged in such a way that the
module versions in the configuration are compatible and together form a valid
version of the complex object.
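The ENGINEERING_MANAGER example of multiple inheritance can be sketched in Python, which supports multiple inheritance directly. The common supertype Employee and the three method names are assumptions added so that the lattice has a shared root; only the MANAGER, ENGINEER, and ENGINEERING_MANAGER types come from the text.

```python
class Employee:
    def salary(self):
        return "Salary"


class Manager(Employee):
    def managed_department(self):
        return "Department managed"


class Engineer(Employee):
    def engineering_type(self):
        return "Engineering type"


class EngineeringManager(Manager, Engineer):
    """Subtype of both Manager and Engineer; adds nothing of its own here."""


em = EngineeringManager()

# Functions inherited from both supertypes and from their shared supertype.
inherited = [em.managed_department(), em.engineering_type(), em.salary()]

# Python linearizes the lattice into a method resolution order.
lattice = [cls.__name__ for cls in EngineeringManager.__mro__]
```

Employee is reachable through two paths, which is exactly why the structure is a lattice rather than a tree; Python visits it once when resolving a function name.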
ODMG (Object Data Management Group)
Release 2.0 of the
ODMG Standard differs from Release 1.2 in a number of ways. With the wide
acceptance of Java, a Java persistence standard was added alongside the
existing Smalltalk and C++ ones. The ODMG object model was made much more
comprehensive, a meta object interface was added, an object interchange
format was defined, and the programming language bindings were made consistent
with the common model. Changes were made throughout the specification based on
several years of experience implementing the standard in object database
products.
As with Release
1.2, future work was expected to be backward compatible with Release 2.0.
Although a few changes were expected, for example to the Java binding, the
Standard was considered reasonably stable.
The major
components of ODMG 2.0 are:
Object Model. The ODMG used
the OMG Object Model as the basis for its model. The OMG core model was
designed to be a common denominator for object request brokers, object database
systems, object programming languages, and other applications. In keeping with
the OMG architecture, the ODMG designed an ODBMS profile for the model, adding
components (relationships) to the OMG core object model to support its needs.
Release 2.0 introduces a meta model.
The Object Data Management Group
(ODMG) was a consortium of object database and object-relational mapping
vendors, members of the academic community, and interested parties. Its goal
was to create a set of specifications that would allow for portable
applications that store objects in database management systems. It published
several versions of its specification. The last release was ODMG 3.0. By 2001,
most of the major object database and object-relational mapping vendors claimed
conformance to the ODMG Java Language Binding. Compliance with the other
components of the specification was mixed. In 2001, the ODMG Java Language
Binding was submitted to the Java Community Process as a basis for the Java
Data Objects specification. The ODMG member companies then decided to
concentrate their efforts on the Java Data Objects specification. As a result,
the ODMG disbanded in 2001.
Many object database ideas were
also absorbed into SQL:1999 and have been implemented in varying degrees in
object-relational database products.
In 2005 Cook, Rai, and
Rosenberger proposed dropping all standardization efforts to introduce
additional object-oriented query APIs and instead using the OO programming
language itself, i.e., Java and .NET, to express queries. As a result, Native
Queries emerged. Similarly, Microsoft announced Language Integrated Query
(LINQ) and DLINQ, an implementation of LINQ, in September 2005, to provide
close, language-integrated database query capabilities with its programming
languages C# and VB.NET 9.
In February 2006, the Object
Management Group (OMG) announced that they had been granted the right to
develop new specifications based on the ODMG 3.0 specification and the
formation of the Object Database Technology Working Group (ODBT WG). The ODBT
WG plans to create a set of standards that incorporates advances in object
database technology (e.g., replication), data management (e.g., spatial
indexing), and data formats (e.g., XML) and to include new features into these
standards that support domains in real-time systems where object databases are
being adopted.
Object Definition Language (ODL)
Let's take a look at something that comes closer to bearing a
relationship to our everyday programming. Whether you generate your
applications or code them, somehow you need a way to describe your object
model. The goal of this Object Definition Language (ODL) is to capture enough
information to be able to generate the majority of most SMB web apps directly
from a set of statements in the language . . .
Here is a rough cut of ODL along with comments. This is
very much a work in progress. Now that I have a meta-grammar and a concrete
syntax for describing languages, I can start to write the languages I have been
playing with. I will then build up to those languages in the framework so that
the framework can consume metadata that can be transformed automatically from
ODL, allowing for the automatic generation of most of my code. Expect to see
BIG changes in this grammar as I combine “top down” and “bottom up”
programming, write some real world applications and see where everything meets
in the middle!
Most importantly, we have objects that are comprised of
1..n attributes and that may or may not have relationships. This is the high
level UML model kind of stuff. Note that ODL is describing functional metadata,
so an object would be “Article” – not “ArticleService” or “ArticleDAO” which
are implementation decisions and would be generated from the Article metadata
automatically.
Object Query Language (OQL)
Before looking at queries, we digress into the built-in functions supported in
the OQL dialect used to query Java heap dumps. The built-in functions in OQL
fall into the following categories:
• Functions that operate on individual Java objects
1. sizeof(o) -- returns the size of the Java object in bytes
2. objectid(o) -- returns the unique id of the Java object
3. classof(o) -- returns the Class object for the given Java object
4. identical(o1, o2) -- returns (boolean) whether the two given objects are
identical or not (essentially objectid(o1) == objectid(o2); do not use simple
JavaScript reference comparison for Java objects!)
5. referrers(o) -- returns an array of objects referring to the given Java
object
6. referees(o) -- returns an array of objects referred to by the given Java
object
7. reachables(o) -- returns an array of objects directly or indirectly referred
to from the given Java object (the transitive closure of the referees of the
given object)
• Functions that operate on arrays
1. contains(array, expr) -- returns whether the array contains an element that
satisfies the given expression. The expression can refer to the built-in
variable 'it', which is the object currently being iterated over
2. count(array, [expr]) -- returns the number of elements satisfying the given
expression
3. filter(array, expr) -- returns a new array containing the elements that
satisfy the given expression
4. map(array, expr) -- returns a new array that contains the results of
applying the given expression to each element of the input array
5. sort(array, [expr]) -- sorts the given array. It optionally accepts a
comparison expression to use; if none is given, sort uses numerical comparison
6. sum(array) -- sums all elements of the array
As you can see, most array-operating functions accept a boolean
expression, and the expression can refer to the current object through the it
variable. This allows operating on arrays without loops: the built-in functions
loop through the array and 'apply' the expression to each element.
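The loop-free style of these array functions can be imitated in Python to show what they compute. This is only an analogy under stated assumptions: the OQL built-ins take expression strings that refer to it, whereas the sketch below takes Python callables whose parameter is named it, and the trailing underscores in filter_ and map_ are there only to avoid shadowing Python built-ins.

```python
def contains(array, expr):
    """True if some element satisfies the expression."""
    return any(expr(it) for it in array)


def count(array, expr=None):
    """Number of elements, or number of elements satisfying the expression."""
    if expr is None:
        return len(array)
    return sum(1 for it in array if expr(it))


def filter_(array, expr):
    """New list holding the elements that satisfy the expression."""
    return [it for it in array if expr(it)]


def map_(array, expr):
    """New list holding the result of the expression for each element."""
    return [expr(it) for it in array]


sizes = [16, 24, 8, 40]                       # made-up object sizes in bytes

big = filter_(sizes, lambda it: it > 10)
doubled_total = sum(map_(sizes, lambda it: it * 2))
how_many_big = count(sizes, lambda it: it > 10)
has_eight = contains(sizes, lambda it: it == 8)
```

Each helper hides the loop and applies the expression to one element at a time, which is the behavior the text describes for the built-ins.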
There is also a
built-in object called heap, which provides various useful methods.
Now, let us see some interesting queries.
Select all objects referred by a SoftReference:
select f.referent from java.lang.ref.SoftReference f
where f.referent != null
referent is a private field of the
java.lang.ref.SoftReference class (actually a field inherited from
java.lang.ref.Reference; you may use javap -p to find these). We filter out
the SoftReferences that have been cleared (i.e., whose referent is null).
Show referents
that are not referred to by any other object, i.e., the referent is reachable
only through that soft reference:
select f.referent from java.lang.ref.SoftReference f
where f.referent != null && referrers(f.referent).length == 1
Note the use of the referrers built-in function to
find the referrers of a given object. Because referrers returns an array, the
result supports the length property.
Let us refine
the above query. We want to find all objects that are referred to only by soft
references, but we don't care how many soft references refer to them; i.e., we
allow more than one soft reference to refer to an object.
select f.referent from java.lang.ref.SoftReference f
where f.referent != null && filter(referrers(f.referent),
"classof(it).name != 'java.lang.ref.SoftReference'").length== 0
Note that the filter function filters the referrers array using a
boolean expression. In the filter condition we check that the class name of the
referrer is not java.lang.ref.SoftReference. If the filtered array contained at
least one element, we would know that f.referent is referred to by some object
that is not of type java.lang.ref.SoftReference; requiring its length to be 0
excludes such referents.
Find all
finalizable objects (i.e., objects of some class that overrides the
'java.lang.Object.finalize()' method):
select f.referent from java.lang.ref.Finalizer f
where f.referent != null
How does this work? When an instance of a class that
overrides the finalize() method is created (a potentially finalizable object),
the JVM registers the object by creating an instance of
java.lang.ref.Finalizer. The referent field of that Finalizer object refers to
the newly created "to be finalized" object. (This depends on an implementation
detail!)
Find all
finalizable objects and the approximate size of the heap retained because of
them:
select { obj: f.referent, size: sum(map(reachables(f.referent),
"sizeof(it)")) }
from java.lang.ref.Finalizer f
where f.referent != null
This certainly
looks complex, but it is actually simple. A JavaScript object
literal is used to select multiple values in the select expression (the obj and
size properties). reachables finds the objects reachable from the given object.
map creates a new array from the input array by applying the given expression
to each element. The map call in this query creates an array of the sizes of
each reachable object. The sum built-in adds all elements of the array. So we
get the total size of the objects reachable from the given object (f.referent
in this case). Why do I say approximate size? The HPROF binary heap dump format
does not account for the actual bytes used in a live JVM. Instead, sizes just
large enough to hold the data are used. For example, a JVM may align smaller
data types such as 'char', using 4 bytes instead of 2 bytes. Also, JVMs tend to
use one or two header words with each object. None of this is accounted for in
the HPROF dump file. HPROF uses the minimal size needed to hold the data, for
example 2 bytes for a char, 1 byte for a boolean, and so on.
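The retained-size computation, sum(map(reachables(...), "sizeof(it)")), amounts to a transitive closure followed by a sum. The Python sketch below reproduces that logic on a made-up object graph. The referees and sizes dictionaries are invented data, and leaving the starting object out of the result is an assumption of this sketch, not a documented property of the OQL built-in.

```python
def reachables(start, referees):
    """Objects directly or indirectly referred to from start (start excluded)."""
    seen = set()
    stack = list(referees.get(start, []))
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(referees.get(obj, []))
    seen.discard(start)
    return sorted(seen)


# Made-up heap: object name -> names of the objects it refers to.
referees = {
    "referent": ["a", "b"],
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],               # a cycle, which the seen set handles
}
sizes = {"referent": 32, "a": 16, "b": 24, "c": 8}   # made-up sizes in bytes

reached = reachables("referent", referees)
retained = sum(sizes[obj] for obj in reached)
```

Because c is shared by a and b, it is counted once; the seen set is what keeps shared and cyclic references from inflating the total.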
1.2.8 Overview of C++ Language Binding
The C++
binding to ODBMSs includes a version of the ODL that uses C++ syntax, a
mechanism to invoke OQL, and procedures for
operations on databases and transactions.
The Object
Definition Language (ODL) is the declarative portion of C++ ODL/OML. The C++
binding of ODL is expressed as a library that provides classes and functions to
implement the concepts defined in the ODMG object model. OML is a language used
for retrieving objects from the database and modifying them. The C++ OML syntax
and semantics are those of standard C++ in the context of the standard class
library.
ODL/OML
specifies only the logical characteristics of objects and the operations used
to manipulate them. It does not discuss the physical storage of objects. It does
not address the clustering or memory management issues associated with the
stored physical representation of objects or access structures. In an ideal
world, these would be transparent to the programmer. In the real world, they
are not. An additional set of constructs called "physical pragmas" is
defined to give the programmer some direct control over these issues, or at
least to enable a programmer to provide "hints" to the storage
management subsystem provided as part of the ODBMS run time. Physical pragmas
exist within the ODL and OML. They are added to object type definitions
specified in ODL, expressed as OML operations, or shown as optional arguments
to operations defined within OML.
These pragmas
are not in any sense stand-alone languages, but rather a set of constructs
added to ODL/OML to address implementation issues.
The
programming-language-specific bindings for ODL/OML are based on one basic
principle -- that the programmer feels that there is one language, not two
separate languages with arbitrary boundaries between them.
The ODMG
Smalltalk binding is based upon two principles -- it should bind to Smalltalk
in a natural way that is consistent with the principles of the language, and it
should support language interoperability consistent with ODL specification and
semantics. We believe that organizations specifying their objects in ODL will
insist that the Smalltalk binding honor those specifications. These principles
have several implications that are evident in the design of the binding:
• There is a unified type system that is shared by Smalltalk and the ODBMS.
• This type system is ODL as mapped into Smalltalk by the Smalltalk binding.
• The binding respects the Smalltalk syntax, meaning the Smalltalk language
will not have to be modified to accommodate this binding.
• ODL concepts will be represented using normal Smalltalk coding conventions.
• The binding respects the fact that Smalltalk is dynamically typed. Arbitrary
Smalltalk objects may be stored persistently, including ODL-specified objects,
which will obey the ODL typing semantics.
• The binding respects the dynamic memory-management semantics of Smalltalk.
Objects will become persistent when they are referenced by other persistent
objects in the database, and will be removed when they are no longer reachable
in this manner.
As with other
language bindings, the ODMG Java binding is based on one fundamental principle --
the programmer should perceive the binding as a single language for expressing
both database and programming operations, not two separate languages with
arbitrary boundaries between them. This principle has several corollaries:
· There is a single, unified type system shared by the Java language and the
object database; individual instances of these common types can be
persistent or transient.
· The binding respects the Java language syntax, meaning that the Java language
will not have to be modified to accommodate this binding.
· The binding respects the automatic storage management semantics of Java.
Objects will become persistent when they are referenced by other persistent
objects in the database, and will be removed when they are no longer reachable
in this manner.
The Java
binding provides persistence by reachability, like the ODMG Smalltalk binding
(this has also been called "transitive persistence"). On database
commit, all objects reachable from database root objects are stored in the
database.
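Persistence by reachability can be illustrated with a small sketch. The code below is not the ODMG Java API: `Node`, `refersTo`, and `reachableFrom` are hypothetical names chosen for illustration. It only shows the set that a commit would have to compute, namely every object that can be reached by following references from the root objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a persistence-capable object (not an ODMG class).
class Node {
    final String name;
    final List<Node> references = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }

    // Adds a reference from this object to another one.
    Node refersTo(Node other) {
        references.add(other);
        return this;
    }
}

// Hypothetical helper that mimics what a commit would have to work out.
class ReachabilityDemo {
    // Returns every object reachable from the given roots, roots included.
    static Set<Node> reachableFrom(List<Node> roots) {
        // Identity-based set: objects are distinguished by identity, not by value.
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (seen.add(current)) {
                pending.addAll(current.references);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node department = new Node("department");
        Node employee = new Node("employee");
        Node address = new Node("address");
        Node scratch = new Node("scratch"); // never linked from a root

        department.refersTo(employee);
        employee.refersTo(address);
        address.refersTo(department); // the cycle is handled by the 'seen' set

        Set<Node> toStore = reachableFrom(Arrays.asList(department));
        System.out.println(toStore.size());            // 3
        System.out.println(toStore.contains(scratch)); // false
    }
}
```

In this sketch the `scratch` object stays transient because nothing reachable from a root refers to it. A real ODBMS would perform the equivalent traversal internally and would also remove stored objects that are no longer reachable.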
The Java
binding provides two ways to declare persistence-capable Java classes:
· Existing Java classes can be made persistence capable.
· Java class declarations (as well as a database schema) may automatically be
generated by a preprocessor for ODMG ODL.
One possible
ODMG implementation that supports these capabilities would be a postprocessor
that takes as input the Java .class file (bytecodes) produced by the Java
compiler, then produces new modified bytecodes that support persistence.
A second implementation would be a preprocessor that modifies Java source before
it goes to the Java compiler. A third would be a modified Java
interpreter.
We want a
binding that allows all of these possible implementations. Because Java does
not have all the hooks we might desire, and the Java binding must use standard
Java syntax, it is necessary to distinguish special classes understood by the
database system. These classes are called persistence-capable classes. They can
have both persistent and transient instances. Only instances of these classes
can be made persistent. The current version of the standard does not define how
a Java class becomes a persistence-capable class.
Object Database Conceptual Design
· Traditional data models: hierarchical, network (since the mid-60’s),
relational (since 1970 and commercially since 1982)
· Object-oriented (OO) data models: since the mid-90’s
· Reasons for the creation of object-oriented databases:
  – Need for more complex applications
  – Need for additional data modeling features
  – Increased use of object-oriented programming languages
· Commercial OO database products: several appeared in the 1990’s, but they did
not make much impact on mainstream data management.
· Main claim: OO databases try to maintain a direct correspondence between
real-world and database objects so that objects do not lose their integrity and
identity and can easily be identified and operated upon.
· Object: an object has two components, state (value) and behavior
(operations). It is similar to a program variable in a programming language,
except that it will typically have a complex data structure as well as specific
operations defined by the programmer.
· In OO databases, objects may have an object structure of arbitrary complexity
in order to contain all of the necessary information that describes the object.
· In contrast, in traditional database systems, information about a complex
object is often scattered over many relations or records, leading to loss of
direct correspondence between a real-world object and its database
representation.
· The internal structure of an object in object-oriented programming languages
(OOPLs) includes the specification of instance variables, which hold the values
that define the internal state of the object.
· An instance variable is similar to the concept of an attribute, except that
instance variables may be encapsulated within the object and thus are not
necessarily visible to external users.
· Some OO models insist that all operations a user can apply to an object must
be predefined. This forces a complete encapsulation of objects.
· To encourage encapsulation, an operation is defined in two parts:
  1. The signature, or interface, of the operation specifies the operation name
  and arguments (or parameters).
  2. The method, or body, specifies the implementation of the operation.
· Operations can be invoked by passing a message to an object; the message
includes the operation name and the parameters. The object then executes the
method for that operation.
· This encapsulation permits modification of the internal structure of an
object, as well as the implementation of its operations, without the need to
disturb the external programs that invoke these operations.
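The split between signature and method can be sketched in Java, where an interface plays the role of the signature and a class supplies the method bodies. The names below (`Account`, `RunningTotalAccount`, `LedgerAccount`, `payInTwice`) are hypothetical and chosen only for illustration. The two classes keep different hidden instance variables, yet the calling code is identical for both.

```java
import java.util.ArrayList;
import java.util.List;

// Signature (interface): operation names and parameters only.
interface Account {
    void deposit(long cents);
    long balance();
}

// Method (body), version 1: the hidden state is a single running total.
class RunningTotalAccount implements Account {
    private long total; // instance variable, not visible to callers

    @Override
    public void deposit(long cents) {
        total += cents;
    }

    @Override
    public long balance() {
        return total;
    }
}

// Method (body), version 2: the hidden state is a list of individual entries.
class LedgerAccount implements Account {
    private final List<Long> entries = new ArrayList<>();

    @Override
    public void deposit(long cents) {
        entries.add(cents);
    }

    @Override
    public long balance() {
        long sum = 0;
        for (long entry : entries) {
            sum += entry;
        }
        return sum;
    }
}

class EncapsulationDemo {
    // External program: it depends only on the signature, so it does not
    // change when the internal structure of the object changes.
    static long payInTwice(Account account, long cents) {
        account.deposit(cents); // "message": operation name plus parameter
        account.deposit(cents);
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(payInTwice(new RunningTotalAccount(), 250)); // 500
        System.out.println(payInTwice(new LedgerAccount(), 250));       // 500
    }
}
```

Swapping one implementation for the other changes how the state is stored but leaves `payInTwice` untouched, which is the property the encapsulation bullets describe.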
· Some OO systems provide capabilities for dealing with multiple versions of
the same object, a feature that is essential in design and engineering
applications.
  – For example, an old version of an object that represents a tested and
  verified design should be retained until the new version is tested and
  verified.
  – This is crucial for designs in manufacturing process control, architecture,
  and software systems.
· Operator polymorphism: this refers to an operation’s ability to be applied to
different types of objects; in such a situation, an operation name may refer to
several distinct implementations, depending on the type of object it is applied
to.
· This feature is also called operator overloading.
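A minimal Java sketch of one operation name bound to several implementations follows. In Java this is normally expressed through method overriding with dynamic dispatch rather than through overloaded operators, so treat it as an analogy to the concept above. `Shape`, `Rectangle`, `Circle`, and `totalArea` are hypothetical names.

```java
// One operation name, 'area', declared once.
interface Shape {
    double area();
}

class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public double area() {
        return width * height;
    }
}

class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

class PolymorphismDemo {
    // The caller uses the single name 'area'; the implementation that runs
    // depends on the type of each object.
    static double totalArea(Shape... shapes) {
        double total = 0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(new Rectangle(2, 3).area()); // 6.0
        System.out.println(totalArea(new Rectangle(2, 3), new Circle(1)));
    }
}
```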
· Unique identity: an OO database system provides a unique identity to each
independent object stored in the database. This unique identity is typically
implemented via a unique, system-generated object identifier, or OID.
· The main property required of an OID is that it be immutable; that is, the
OID value of a particular object should not change. This preserves the identity
of the real-world object being represented.
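The following sketch shows one way the OID idea could look in Java. It is an assumption-laden illustration, not how any particular ODBMS works: `ObjectStore`, `StoredObject`, `create`, and `lookup` are hypothetical names, and the OID is a simple counter. The point is that the OID is generated by the system, is fixed for the object's lifetime, and is independent of the object's state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stored object: immutable identity, mutable state.
class StoredObject {
    private final long oid; // 'final' makes the identifier immutable
    private String state;

    StoredObject(long oid, String state) {
        this.oid = oid;
        this.state = state;
    }

    long oid() {
        return oid;
    }

    String state() {
        return state;
    }

    void setState(String state) {
        this.state = state;
    }
}

// Hypothetical store that generates OIDs; callers never choose them.
class ObjectStore {
    private final AtomicLong nextOid = new AtomicLong(1);
    private final Map<Long, StoredObject> byOid = new HashMap<>();

    StoredObject create(String state) {
        StoredObject object = new StoredObject(nextOid.getAndIncrement(), state);
        byOid.put(object.oid(), object);
        return object;
    }

    StoredObject lookup(long oid) {
        return byOid.get(oid);
    }
}

class OidDemo {
    public static void main(String[] args) {
        ObjectStore store = new ObjectStore();
        StoredObject first = store.create("draft");
        StoredObject second = store.create("draft"); // same state, distinct identity

        System.out.println(first.oid() != second.oid()); // true

        long oidBefore = first.oid();
        first.setState("approved"); // the state changes, the OID does not
        System.out.println(first.oid() == oidBefore);             // true
        System.out.println(store.lookup(oidBefore).state());      // approved
    }
}
```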
· Type constructors: in OO databases, the state (current value) of a complex
object may be constructed from other objects (or other values) by using certain
type constructors.
· The three most basic constructors are atom, tuple, and set. Other commonly
used constructors include list, bag, and array. The atom constructor is used to
represent all basic atomic values, such as integers, real numbers, character
strings, Booleans, and any other basic data types that the system supports
directly.
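As a rough illustration, the three basic constructors can be mimicked with ordinary Java collections: atoms as boxed values and strings, tuples as maps from attribute names to values, and sets as `Set` instances. The helper names `tuple`, `set`, and `sampleDepartment`, and the sample department data, are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class TypeConstructorDemo {
    // tuple constructor: builds a value from named attributes, given as
    // alternating name/value arguments.
    static Map<String, Object> tuple(Object... nameValuePairs) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            result.put((String) nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return result;
    }

    // set constructor: an unordered collection without duplicates.
    static Set<Object> set(Object... elements) {
        return new LinkedHashSet<>(Arrays.asList(elements));
    }

    // A complex state built by nesting the constructors.
    static Map<String, Object> sampleDepartment() {
        return tuple(
            "name", "Research",                                   // atom (string)
            "number", 5,                                          // atom (integer)
            "locations", set("Houston", "Bellaire", "Houston"),   // set of atoms
            "manager", tuple("name", "Smith", "startYear", 1988)  // nested tuple
        );
    }

    public static void main(String[] args) {
        Map<String, Object> department = sampleDepartment();
        Set<?> locations = (Set<?>) department.get("locations");
        System.out.println(locations.size()); // 2, the duplicate "Houston" collapses
        System.out.println(department.get("number")); // 5
    }
}
```

A list constructor would map naturally to `java.util.List` and an array constructor to a Java array; a bag, which allows duplicates but has no order, has no direct standard-library counterpart and is often approximated with a map from element to count.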