Please check out Realm of Shadows !

Member Discussions

terms



[Previous] [Next] [Post] [Reply] [Topics] [Summary] [Search]


Pages: 1 | 2 | 3

1. Object Databases vs Relational Databases (vs flat files) Thu Jun 9, 2005 [8:31 AM]
shasarak
Email not supplied
member since: Dec 10, 2004
Reply
The following are a couple of extracts from another thread, written by lindahlb (and, I should add, taken violently out of context):

I suggest looking at an object-oriented database backend that provides consistency. I've been working on one for a while tailored specifically for MUDs, one that supports effecient database-wide sequence points (sort of like a 'commit-all').

And also:

WRT queries, if you use an object-oriented database, you probably will never need to be bothered with a query - just use the normal data access paths for the programming language and the database will fetch as needed.
[snip]
In-memory caching is what a database system actually does. We're delving into database implementation now, but suffice to say, I've found that impressive optimizations can be made for heavy-update applications (MUDs) when you assume a long running single-level transaction (one or several event execution loops). This is what I've been working on over the past 2 years or so. No more objects gone missing or duplicated due to inopportune crashes, and complete persistence as well. And, I wouldn't be surprised at all to see a significant improvement in performance over the load-update-save cycles of current MUD codebases which will almost guarantee poor disk locality. I hope to see it revolutionize how modern MUDs deal with persistence - at least in C++ (the only interface I'm bothering to support).


Perhaps surprisingly, given that I programme in Smalltalk for a living, I've never actually had much to do with object databases. I brushed very briefly with an elderly system called VOSS many years ago, and I'm aware of the existence of ObjectStore and Gemstone, but I can't claim to know much about them. I think it would be interesting to dedicate a whole discussion thread to the comparative merits of relational and object-oriented databases when it comes to providing persistent storage for MUDs.

I guess comparisons between either method and the flat-file approach that is (I believe) typically employed by Diku-derivatives might also be interesting.


Some initial questions:


1) Commercial object DB packages seem to be designed to be accessed natively from inside a fairly small set of specific OO languages - C++, VisualAge Smalltalk, VisualWorks Smalltalk, Java, etc.

Will an object database therefore be difficult to use efficiently from inside a custom-made scripting language? Will it also be tricky to interface with an object DB from C? (The latter might be useful if, for example, you wanted to use an object DB to add persistance to an LP MUD by allowing LPC objects to save and load via efuns).


2) How powerful are object DBs when it comes to actually running queries, as opposed to providing object persistence? Suppose (to take a very simple example) we have a coordinate-based space system (rather than room-based), with each object holding a reference to a point in 3D space. With a relational DB one can obtain a list of (say) all objects within r metres of a player at point (px, py, pz) by a query like this:

select * from mud_objects as mo where (((mo.x - px) * (mo.x - px)) + ((mo.y - py) * (mo.y - py)) + ((mo.z - pz) * (mo.z - pz))) < (r * r)

(There are obviously lots of ways in which the above query could be optimised). This sort of approach means that you don't have to constantly keep track of the objects that are 'close' to the player - they can be determined fairly efficiently on-the-fly whenever you need to know what they are. I have a feeling that quickly being able to obtain a list of objects satisfying particular conditions without having to track them in advance will prove useful.

Are object databases likely to be as efficient for this sort of thing as relational DBs?


3) What non-commercial object database servers are available/suitable?


4) Lindahlb, if you're reading this, would you like to favour us with a bit more detail concerning your own project?



Please do not feed the troll.


2. RE: Object Databases vs Relational Databases (vs flat files) Thu Jun 9, 2005 [12:11 PM]
eiz
Email not supplied
member since: Dec 24, 2002
In Reply To
Reply
How powerful are object DBs when it comes to actually running queries, as opposed to providing object persistence?

I believe some commercial databases support a query language called OQL. No personal experience with it, but it seems alright. Query support has traditionally been weak in OODBMSes.

4) Lindahlb, if you're reading this, would you like to favour us with a bit more detail concerning your own project?

There's a pretty extensive description here.

(Comment added by eiz on Thu Jun 9 13:14:33 2005)

3) What non-commercial object database servers are available/suitable?

The main one I know of is GOODS. ColdStore and ColdC/Genesis might also qualify.

(Comment added by eiz on Thu Jun 9 13:21:38 2005)

Oops, that page is outdated. New link.

(Comment added by eiz on Thu Jun 9 13:51:19 2005)

My personal preference is actually for flat files, due to their sheer usability. They're just simpler from a management perspective: you can use existing tools to merge/version them, there's no import/export to deal with, no need for external libraries or servers.

I'd like to experiment with a system that dynamically builds indexes on top of a text based format. I believe XQuery implementations are doing this.


3. RE: Object Databases vs Relational Databases (vs flat files) Thu Jun 9, 2005 [1:46 PM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
I guess comparisons between either method and the flat-file approach that is (I believe) typically employed by Diku-derivatives might also be interesting.

The relational method is pretty much identical to the flat-file approach. You retrieve and store MUD objects to and from the database explicitly. There are three major differences.

1) effeciency - you should see a minor speedup if you load and unload objects frequently enough
2) query capability - this one is obvious
3) reliability - if your MUD crashes in the middle of saving a file, you'll be left with an inconsistent and perhaps incomplete object (dangerous). A relational database will give you transactions to work safely with respect to storing objects.

An object-oriented database system offers a completely different interface for MUDs. They allow you to completely ignore keeping your own in-memory copies of objects and allow you to interface with the database directly for each function. Really, an embedded database is the only way to go with this interface, given the update rate of MUDs - so you'll need to find one that can bypass the network transmissions (Eiz's link to GOODS will take you to a listing of a bunch of ideal suspects). I think my description of my project at Eiz's link explains how the interface should ideally work. Most object-oriented database systems offer similiar, if not identical (nearly), interfaces.

You can also (instead) maintain a MUD object cache with object-oriented databases, and you'll generally get better performance for retrieving chains of objects than you would with relational or flat-file system.

These are the main surface differences.

1) The tight coupling of interface to programming language is derived from the idea that object-oriented databases are more or less a persistent object storage system with support for transactions and recoverability. Designing a seperate interface can be done if you can create wrapper functions for the database interface - requiring cross-language communication, which could be designed for LPC by using interface wrappers translating LPC to C and then from C to C++. It shouldn't be too difficult, but you do lose the syntax simplicity of smart pointers (in the case of C++). Some object-oriented database systems provide functional interfaces, others provide a hidden interface, lying somewhere between the compiled and language layers. For the latter, interface wrappers will generally be impossible. I believe ObjectStore uses this form of low-level interface. You'll generally see this form only in commercial object-oriented database systems, whose cost is prohibitive - for MUD enthusiasts.

2) Query languages are quite common in the commercial database systems which want to steal some of the relational customer base. However, in small home-grown projects that will offer free object-oriented database systems, it's doubtful you'll find any sort of strong querying capability. Suffice to say, it's not really needed. Just generate algorithms and data structures to give you such querying capability through navigation. Depending on what you want to do, it'll be either faster or slower than a relational database query. This gets very database specific, so I don't want to go any further. Either way, a general query like you mentioned wouldn't execute fast enough no matter what database system you're hiding behind. Queries weren't designed with requirements for retrieving high-update or immediate data. It sounds like you want such a querying capability in order to avoid the complexity of effecient nearest-neighbor algorithms - it's not going to happen, at least in this lifetime.

3) Eiz's links provided you with a good start. I would have pointed you to the same link.

4) Again, Eiz's link should answer more questions you may have - but feel free to come back if you need more information. I like discussing my projects, just like the rest of us.


4. RE: Object Databases vs Relational Databases (vs flat files) Fri Jun 10, 2005 [6:02 AM]
shasarak
Email not supplied
member since: Dec 10, 2004
In Reply To
Reply
Query languages are quite common in the commercial database systems which want to steal some of the relational customer base. However, in small home-grown projects that will offer free object-oriented database systems, it's doubtful you'll find any sort of strong querying capability. Suffice to say, it's not really needed. Just generate algorithms and data structures to give you such querying capability through navigation. Depending on what you want to do, it'll be either faster or slower than a relational database query. This gets very database specific, so I don't want to go any further. Either way, a general query like you mentioned wouldn't execute fast enough no matter what database system you're hiding behind. Queries weren't designed with requirements for retrieving high-update or immediate data. It sounds like you want such a querying capability in order to avoid the complexity of effecient nearest-neighbor algorithms - it's not going to happen, at least in this lifetime.


We-ell... it's all very well saying 'do everything by object navigation', but there are two possible problems with that....


1) You need to have set up the necessary navigation beforehand. Suppose (somewhat implausibly) that you suddenly want to get hold of every level 12 monster in the game and transport them all into the same room. To obtain all level 12 monsters in a relational setup you do something like:

select * from monster where level = 12

In order to achieve this by navigating an object tree, you need to have already set up a collection of all level-12 monsters that is dynamically maintained at all times. But what if you haven't? And, even if you have, you could potentially waste a lot of processing resources updating collections and trees that are very rarely actually used.


2) There's a performance and memory-usage issue. In principle there's no difference between doing select * from monster where level = 12 and iterating across a collection of monster objects, selecting all the ones with the correct attributes. A bit of Smalltalk code might say Monster allDBInstances select: [:z| z level = 12]

However, there's a (potential) difference in implementation. What you want to do is:

- iterate across all monster records (cached or otherwise);
- for those not cached fetch only the level value;
- flag those monsters that are level 12;
- for flagged monsters only, fetch in all the other fields so you have a completely instantiated object.

Now, I would hope that an object DB would be capable of the same thing - but I would be concerned that what it might actually be doing is fully instantiating every monster before checking its level, which is obviously much less efficient.

Are object DBs capable of caching field-by-field as well as object-by-object, or is this too inefficient in data access terms?

Please do not feed the troll.


5. RE: Object Databases vs Relational Databases (vs flat files) Fri Jun 10, 2005 [9:35 AM]
muir
Email not supplied
member since: Sep 14, 2003
In Reply To
Reply
Another option is to use object-relational mapping (or ORM). Traditional RDBMS is used for storage but the mapping between it and your objects is automated to varying degree depending on your framework.

.


6. RE: Object Databases vs Relational Databases (vs flat files) Fri Jun 10, 2005 [10:15 AM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
You're liable to be bitten by join times unless you are doing your own object caching or have a flat object heirarchy.


7. RE: Object Databases vs Relational Databases (vs flat files) Fri Jun 10, 2005 [10:40 AM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
In order to achieve this by navigating an object tree, you need to have already set up a collection of all level-12 monsters that is dynamically maintained at all times. But what if you haven't? And, even if you have, you could potentially waste a lot of processing resources updating collections and trees that are very rarely actually used

This is no different from a relational database. The only difference is that for relational databases, you instruct at a higher level (DBA) for the database to generate an index for monster levels, as opposed to creating the index structure yourself - and all object databases come with ready-to-use index structures. The ineffeciencies of wasted resources and extra processing time for building and maintaining the index are required for both database types.

However, for both cases, if you don't use an index structure, your query IS going to be ineffecient, no matter what you do - as you have to scan the entire type set. This is the tradeoff you get with object databases vs. relational databases.

In relational databases, all objects of a single type are stored in a single file, therefore you'll have mostly sequential access during the scan. With object databases, you'll generally have your objects spread throughout the database so there will be less sequential reading (although you can keep types constrainted to database partitions, significantly speeding this up).

However, on the object database side, if you have all objects of a single type stored in a single file, and you have a very structured heirarchy of types (as is typical with object-oriented programming), to access a single object, you'll need to access several files, severely breaking locality and killing your access speed.

So.. the question comes down to.. do you want slightly more effecient ad-hoc queries? Or extremely more efficient single-object access times? My preference is obvious, what is yours?

- iterate across all monster records (cached or otherwise);
- for those not cached fetch only the level value;
- flag those monsters that are level 12;
- for flagged monsters only, fetch in all the other fields so you have a completely instantiated object.


See above. The only difference of effeciency between an object and a relational database in queries, is that for relational databases, all the values for a particular type (table) are located in the same file, in sequential order, so the disk seek times are negligable - compared to scanning object types in the object database, which are generally maintained as an outside index since the focus of object databases is NOT ad-hoc querying.

Are object DBs capable of caching field-by-field as well as object-by-object, or is this too inefficient in data access terms?

Relational databases can only do this if you set up an index and have the level cache'd in the index - which an object database can do as well. Otherwise, it'll be scanning rows in the same manner that objects would be scanned. The only difference between the two implementations would be locality - which you certainly can enforce in an object database. If you need this type of ad-hoc querying in object databases, then it's best to enforce locality by generating objects of the type in a particular partition - but keep in mind that it may hurt clustering (which is no different than a relational table, really).

It's a tradeoff. And for MUDs with any form of inheritance trees, the tradeoff is greatly in favor of an object database - it's really that simple. You want you optimize single-object access and storing, this is what you're going to be doing 99% of the time. You, however, can afford to take a hit for ad-hoc querying - its not like you need the data immediately, as it's only for administrative purposes.


8. RE: Object Databases vs Relational Databases (vs flat files) Fri Jun 10, 2005 [12:16 PM]
Tyche
Email not supplied
member since: Apr 4, 2000
In Reply To
Reply
This is no different from a relational database. The only difference is that for relational databases, you instruct at a higher level (DBA) for the database to generate an index for monster levels, as opposed to creating the index structure yourself - and all object databases come with ready-to-use index structures. The ineffeciencies of wasted resources and extra processing time for building and maintaining the index are required for both database types.

In this case it would probably not be more efficient to define an index for such a column. I'm guessing the cardinality of this column would likely be to high. Every additional index increases insert and update time.

However, for both cases, if you don't use an index structure, your query IS going to be ineffecient, no matter what you do - as you have to scan the entire type set. This is the tradeoff you get with object databases vs. relational databases.

Baloney.. Adhoc queries in OODMSs suffer the same or greater penalties as RDMSs. There is no tradeoff here.

In relational databases, all objects of a single type are stored in a single file

The file formats, architectures, and access methods vary immensely between RDBMSs.

therefore you'll have mostly sequential access during the scan.

Even when they stored in a single file this is also not true. Many RDBMS store index and leaf nodes together inside the same file.

With object databases, you'll generally have your objects spread throughout the database so there will be less sequential reading (although you can keep types constrainted to database partitions, significantly speeding this up).

As it happens sequential prefetch of pages is much much faster than random reads, and RDMSs excel at determining when to use it and when not to. BTW commercial OODBMSs also use sequential prefetch for adhoc queries as well (that is IF they can do them at all).

Most modern RDBMSs can collect statistics and can be reorganized and optimized automatically based on access patterns.
The Sourcery - http://sourcery.dyndns.org
TeensyMud - http://teensymud.kicks-ass.org
"A man can receive nothing, except it be given him from heaven."


9. RE: Object Databases vs Relational Databases (vs flat files) Fri Jun 10, 2005 [1:21 PM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
Lots of damage control here.. lol.. I mispoke a few too many times.

In this case it would probably not be more efficient to define an index for such a column. I'm guessing the cardinality of this column would likely be to high.

Correct. The number of times the index would be used, and the required response time, is not worth the overhead of maintaining it.

However, for both cases, if you don't use an index structure, your query IS going to be ineffecient, no matter what you do - as you have to scan the entire type set. This is the tradeoff you get with object databases vs. relational databases.

Baloney.. Adhoc queries in OODMSs suffer the same or greater penalties as RDMSs. There is no tradeoff here.


The 'tradeoff' comment was misplaced and should have been located lower in my post.

The trade-off is between single object access times (navigational) and ad-hoc query times. And yes, there are all sorts of special cases, but this is generally what it comes down to. Object databases were designed because, with deep inheritence heirarchies, relational databases required time-ineffecient table joins or a space-ineffecient flat inheritence model. The complexity of the type system made it ineffecient to consider a model based on tables and rows, and performing relational joins.

The file formats, architectures, and access methods vary immensely between RDBMSs.

Correct, what I meant was that the data (rows) in each table are clustered as much as possible. Given a single type (table), most of it's instances (rows) will be clustered as much as possible, next to each other - this is what gives you better ad-hoc query (non-indexed) times than object databases - clustering based on type (table). I spoke nothing of access methods. Object databases generally make no effort to cluster based on type since their main concern was single object access times - although partitioning does allow this for certain databases (as I mentioned in my earlier post).

Even when they stored in a single file this is also not true. Many RDBMS store index and leaf nodes together inside the same file.

In typical relational database instances, tables are stored as a series of rows and indexes are built off to the side. Sometimes values other than the index are stored in the leaves to avoid going to the row to retrieve the queried data - I believe this is what you are talking about. Other times the index just points to the row's physical location, acting as a logical pointer. There are other kinds of indices (bitmap indices), but these are the most common kinds.

Very few relational databases implement a table as an index with the leaf nodes containing rows - it's not what relational databases were designed for, they were designed for requirements of ad-hoc queries under a consistent logical model, not for a simple key-value lookup architecture (use BerkelyDB for that).

Relational rows are meant to be identified by value, not by key. Keys are only supposed to be used for relationships between tables and really shouldn't be seen by the user of the database - this is the relational theory that RDBMS are grounded on. People selecting a row by unique keys are misusing the relational model. Yes, it works, but it's not what the model is based on, it is only optimized for such access by happen-stance (optimization for joins).

As it happens sequential prefetch of pages is much much faster than random reads, and RDMSs excel at determining when to use it and when not to.

They can prefetch because they know where objects of each type are - in the next section of the tablespace (usually a single file). This is exactly what I was talking about with respect to sequential scanning. And they always use this method with ad-hoc queries (non-indexed). It's used to varying degrees, however, based on the load. In addition, indices can be used to generate a map of what rows need to be retrieved - using this map, one can fetch the rows (or pages) in sequential order.

BTW commercial OODBMSs also use sequential prefetch for adhoc queries as well (that is IF they can do them at all).

Like I mentioned, they can if partioning is used, or if they have an index to generate the map of which objects need to be retrieved, similiar to the above situation.

Most modern RDBMSs can collect statistics and can be reorganized and optimized automatically based on access patterns.

They will cache and generate query plans, analyze costs for different orders of index usage, etc. - but the results are generally only noticed for complex queries when lots of indices are involved (not useful for MUDs). It's up to the DBA to do the rest of it - which involves optimizing SQL statements, tweaking automatic optimizations, choosing what should be indexed, etc.

Nothing prevents object databases from doing the same, of course - they're just less mature and their focus isn't on such querying capabilities.

(Comment added by lindahlb on Fri Jun 10 14:26:04 2005)

correction:
... People selecting a row by unique keys are misusing the relational model...

Note that it isn't necessarily bad if you temporarily hold onto the key for relationship lookup, however the relational model wasn't contrived for this sort of usage. You should generally only be selecting on values (not keys).

(Comment added by lindahlb on Fri Jun 10 18:59:34 2005)

more corrections:
...Very few relational databases implement a table as an index with the leaf nodes containing rows...

What I mean is that most relational databases allow for this, but it is rarely a default. It's usually not a desirable architecture for relational databases.

Most modern RDBMSs can collect statistics and can be reorganized and optimized automatically based on access patterns.

I wanted to add more to my response to this comment. All the RDBMS's I know will not reorganize themselves without instruction from the DBA - they do however collect statistics to provide the DBA with information to make better such decisions. Clustering and similiar reorganizations are *very* rarely done automatically.


10. RE: Object Databases vs Relational Databases (vs flat files) Fri Jun 10, 2005 [6:26 PM]
Tyche
Email not supplied
member since: Apr 4, 2000
In Reply To
Reply

However, for both cases, if you don't use an index structure, your query IS going to be ineffecient, no matter what you do - as you have to scan the entire type set. This is the tradeoff you get with object databases vs. relational databases.


Still no trade-off yet. In order to make his query you have to access the entire set of objects in an OODBMS as well.

The trade-off is between single object access times (navigational) and ad-hoc query times.

An RDBMS generally has data mapping/translation overhead. Many OODBMS do not and map directly into the OOP of choice. That's the major difference in single object access times. Navigationally, modern ORDBMSs now support contextual prefetch as an optimization. A contextual prefetch can fetch relationship rows in addition to the row requested, a navigational optimization. ORDBMSs now also support restricted ADTs and algorithms (sets, tries, hashes, dictionaries, etc.). Also see the GIS and spatial constructs of DB2, Oracle and Postgres.

Meesa:
Even when they stored in a single file this is also not true. Many RDBMS store index and leaf nodes together inside the same file.

Yoosa:
In typical relational database instances, tables are stored as a series of rows and indexes are built off to the side. Sometimes values other than the index are stored in the leaves to avoid going to the row to retrieve the queried data - I believe this is what you are talking about. Other times the index just points to the row's physical location, acting as a logical pointer. There are other kinds of indices (bitmap indices), but these are the most common kinds.


No. Once again there is NO TYPICAL STORAGE FORMAT for an RDBMS. There is a great deal of differences between Btrieve, GDBM, Berkely DB, InnoDB, Hash Tables, B-Trees, RRDS, XML, and VSAM, no? And those are just a handful of data storage architectures that RDBMSs use. I won't even begin to describe the complex proprietary formats that often use multiple formats depending on key type and ADT specializations.

If you wish to make a specific claim about how data is stored and accessed in an RDBMS, then you need to specify the exact RDBMS you are talking about and which optimizations, modes and features you turned on for the tablespace.

Very few relational databases implement a table as an index with the leaf nodes containing rows
- it's not what relational databases were designed for, they were designed for requirements of ad-hoc queries under a consistent logical model, not for a simple key-value lookup architecture (use BerkelyDB for that).


Your are hopelessly confusing and entangling implementation and the relational model to no useful purpose whatsover. More importantly see Codd's rules 8 and 9. What RDBMSs are you referring to? Most do combine the index and data.

Relational rows are meant to be identified by value, not by key. Keys are only supposed to be used for relationships between tables and really shouldn't be seen by the user of the database - this is the relational theory that RDBMS are grounded on.

People selecting a row by unique keys are misusing the relational model. Yes, it works, but it's not what the model is based on, it is only optimized for such access by happen-stance (optimization for joins).
..........
(Comment added by lindahlb on Fri Jun 10 14:26:04 2005)
correction:
... People selecting a row by unique keys are misusing the relational model...

Note that it isn't necessarily bad if you temporarily hold onto the key for relationship lookup, however the relational model wasn't contrived for this sort of usage. You should generally only be selecting on values (not keys).


Citations and references to the above information please. None of the above sounds even remotely familiar or sensible to me. While most RDBMSs allow you to ignore it, relational calculus presumes there is a unique primary key on a table. If there isn't then you have trouble with the 'general access rule'.

Most object-relational mapping strategies assume that key to be a unique object id, but objects have nothing to do with the relational model.
The Sourcery - http://sourcery.dyndns.org
TeensyMud - http://teensymud.kicks-ass.org
"A man can receive nothing, except it be given him from heaven."


11. RE: Object Databases vs Relational Databases (vs flat files) Fri Jun 10, 2005 [8:06 PM]
Tyche
Email not supplied
member since: Apr 4, 2000
In Reply To
Reply
What I mean is that most relational databases allow for this, but it is rarely a default. It's usually not a desirable architecture for relational databases.

Nonsense.
The Sourcery - http://sourcery.dyndns.org
TeensyMud - http://teensymud.kicks-ass.org
"A man can receive nothing, except it be given him from heaven."


12. RE: Object Databases vs Relational Databases (vs flat files) Sat Jun 11, 2005 [11:05 AM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
We're getting sidetracked here. Reguardless of whether or not an index is in the tablespace or not, relational databases cluster rows (tuples - the term had escaped me previously) for the table, and this is an advantage for ad-hoc queries over object databases which do not cluster by type (some allow you to explicitly do such - the partition technique, I mentioned).

While most RDBMSs allow you to ignore it, relational calculus presumes there is a unique primary key on a table. If there isn't then you have trouble with the 'general access rule'.

Is this primary key a value or an arbitrary integral key (object id)? If it's a value, then it fits into the relational model, if it's an arbitrary number then the programmer should not be exposing himself to it, as it is only used to perform joins. In other words, using an arbitrary unique object id that has no real value is a rough hack. You even hint at it yourself:

Most object-relational mapping strategies assume that key to be a unique object id, but objects have nothing to do with the relational model.

Both objects AND unique object ids have nothing to do with the relational model. The relational model identifies unique tuples by value, not by an arbitrary id number. I wish I did have references for this, but it's all accumulated knowledge from research on object oriented databases and relational databases during development of MudDB - it doesn't come from any one source that I can identify for you.

Anyhow, this argument is a sidetrack as well, and isn't important for the thread. So it's probably best to drop it as well.

Still no trade-off yet. In order to make his query you have to access the entire set of objects in an OODBMS as well.

Yes, but the locality of the scan is very important. For relational databases, it is almost always better, unless you explicitly set it up for the object database (using partitions) - but if you do this, you're making the query access path more efficient than the navigational access path, and you have to be aware of that (since object databases tend to cluster for the navigational path).

An RDBMS generally has data mapping/translation overhead. Many OODBMS do not and map directly into the OOP of choice. That's the major difference in single object access times.

Right, the most ineffecient part of data mapping and translation is mapping the inheritence heirarchy, because it involves ineffecient joins OR space wastage (I've covered this earlier). The parsing and translation does present overhead, but this pales in comparison to joins OR the ineffeciencies of wasted space (difficult to measure).

(Comment added by lindahlb on Sat Jun 11 15:20:19 2005)

with respect to the last point:

Relational databases were designed mostly for business solutions which, by nature, have a flat inheritence heirarchy. There is almost never any mapping or translation done at all for such applications. The fact that mapping and translation must be done is a direct implication that the tools you are using are inadequate for the problem space - or that you've designed your problem space poorly to fit the tools you have at your disposal.


13. RE: Object Databases vs Relational Databases (vs flat files) Mon Jun 13, 2005 [4:45 AM]
shasarak
Email not supplied
member since: Dec 10, 2004
In Reply To
Reply
4) Again, Eiz's link should answer more questions you may have - but feel free to come back if you need more information. I like discussing my projects, just like the rest of us.

One basic but very useful idea to come out of that (and I find myself wondering why on Earth I never thought of it before) is the idea of having the code and the data associated with MUD objects controlled by two separate processes, using shared memory.

If I ever do get around to starting my own project I might still opt for writing my own simple object cache that sits on top of (and, generally, masks) a relational DB. But, either way, there's clearly a lot to be said for the object server running as a separate process from the MUD code and doing its own object-level transactions.

That way, even if the entire MUD process crashes and GPFs, you'll still end up with the actual state of every MUD object preserved perfectly - even down to data being cached in memory - and guaranteed to revert to the state it was in before the event cycle that caused the crash kicked off. And you also automatically solve the 'transparent reboot' problem - a newly kicked-off MUD process will be referencing precisely the same values in the same memory locations as the old one was. Very neat.

Please do not feed the troll.


14. RE: Object Databases vs Relational Databases (vs flat files) Mon Jun 13, 2005 [8:01 AM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
That way, even if the entire MUD process crashes and GPFs, you'll still end up with the actual state of every MUD object preserved perfectly - even down to data being cached in memory - and guaranteed to revert to the state it was in before the event cycle that caused the crash kicked off.

Not necessarily, you still have to worry about consistency unless you have a solid rollback feature. For instance, if you were in the middle of executing an 'exchange' event between two players, and the game crashes with only one side getting what he or she wanted, then when the MUD comes back, you've lost the information that would complete the event (unless you store your events as well). So you'll need some form of recovery to roll you back. But yes, the transparent reboot is a nice feature.


15. RE: Object Databases vs Relational Databases (vs flat files) Mon Jun 13, 2005 [12:07 PM]
Tyche
Email not supplied
member since: Apr 4, 2000
In Reply To
Reply
We're getting sidetracked here. Reguardless of whether or not an index is in the tablespace or not, relational databases cluster rows (tuples - the term had escaped me previously) for the table, and this is an advantage for ad-hoc queries over object databases which do not cluster by type (some allow you to explicitly do such - the partition technique, I mentioned).

RDBMSs will often not even include columns from the same rows on the same data segments/pages in disk storage. That is why... 'select * from foo;' is generally a bad idea unless you really mean it. The conditions for which columns are subject to these optimizations depend on the RDBMS. For example, DB2 may not store BLOB, CLOB, DBLOB, VARCHAR or VARGRAPH columns with the on the same pages. Tablespace partitioning can also done in RDBMSs for performance.

> While most RDBMSs allow you to ignore it, relational calculus presumes there is a unique primary key on a table. If there isn't then you have trouble with the 'general access rule'.

Is this primary key a value or an arbitrary integral key (object id)? If it's a value, then it fits into the relational model, if it's an arbitrary number then the programmer should not be exposing himself to it, as it is only used to perform joins. In other words, using an arbitrary unique object id that has no real value is a rough hack. You even hint at it yourself:


The logical model can and will differ from the physical model. That's why we have views and stored procedures. The only relational requirement is that the primary key is unique, not null, and that it is stable. The physical model may implement a surrogate key. In practice the application domain already has arbitrary keys, often poorly chosen ones. If arbitrary integrals are used in the application domain then they are often used in the logical model (i.e. employee badge number, invoice number, etc.).

See...
http://www.dbpd.com/vault/9805xtra.htm
http://www.devx.com/ibm/Article/20702/1763

> Most object-relational mapping strategies assume that key to be a unique object id, but objects have nothing to do with the relational model.

Both objects AND unique object ids have nothing to do with the relational model. The relational model identifies unique tuples by value, not by an arbitrary id number. I wish I did have references for this, but it's all accumulated knowledge from research on object oriented databases and relational databases during development of MudDB - it doesn't come from any one source that I can identify for you.


No but they can be implemented in a relational model. Most ORMs dealt with SQL-92. Most ORMs do not recognize that many applications will 'share' a data model. However SQL-99 has extended the relational model to have 'row types' where each tuple has a row type T which can be referenced with REF(T). For example...


create row type FooType ( public name char(40) unique, private desc char(50) );
create row type BarType ( name char(40) unique, desc char(50) );
create row type FooBarType ( foo ref(FooType), bar ref(BarType) );
create table Foos OF TYPE FooType;
create table Bars OF TYPE BarType;
create table FooBars OF TYPE FooBarType;
select foo->name from FooBars where bar->name = 'mybar' ;


And rows can hold self-referential row types...

create row type FooType ( name char(40) unique, desc char(50), FooID ref(FooType));
create table Foos of type FooType values for FooID are system generated ;


RDBMSs have implemented an inheritence model through subtables, SQL-99 implements user-defined ADTs, functions and encapsulation, and also allows polymorphism with function overloading on subtables.

Yes, but the locality of the scan is very important. For relational databases, it is almost always better, unless you explicitly set it up for the object database (using partitions) - but if you do this, you're making the query access path more efficient than the navigational access path, and you have to be aware of that (since object databases tend to cluster for the navigational path).

There is still not much meaningful one can say about the physical implementaion of RDBMSs without specifying what you are talking about. Everything you've said about the physical implementation thus far appears to apply only to mySQL's myISAM.

> An RDBMS generally has data mapping/translation overhead. Many OODBMS do not and map directly into the OOP of choice. That's the major difference in single object access times.

Right, the most ineffecient part of data mapping and translation is mapping the inheritence heirarchy, because it involves ineffecient joins OR space wastage (I've covered this earlier). The parsing and translation does present overhead, but this pales in comparison to joins OR the ineffeciencies of wasted space (difficult to measure).


I was talking about the mapping overhead of types, not objects nor rows but value types. Most RDBMs do not store data as it's represented in C serialization at all.

Yes, the fastest RDBMSs Oracle and DB2 will create tons of wasted space on tables for performance and data integrity reasons. 'Space wastage' is done intentionally. Correctness is the number one design criteria in RDBMSs. No DBMS has all three speed, memory, and correctness and all at the same time. With an RDBMS you can choose at the table level. You can even choose to violate correctness if you want (see mySQL).

Relational databases were designed mostly for business solutions which, by nature, have a flat inheritence heirarchy.

I wish they were. While video store and university catalogs seem to be the common examples they are much simpler than business applications, insurance, AR, AP, and banking. I'd love to share the inheritance model of the policy rating system I have to maintain but unfortunately I can't. It's well over 200 tables, and just one of 30 main business systems that have to interface with each other. Just this one system happens to 10x as complex of any mud I've seen. Mud implementors have the luxury of building a purely fictive abstract reality created in their own head rather than one that must model reality.

(Comment added by Tyche on Mon Jun 13 15:07:36 2005)

BTW...
"Most RDBMs do not store data as it's represented in C serialization at all."
...is an incomplete thought. See previous sentence.
The Sourcery - http://sourcery.dyndns.org
TeensyMud - http://teensymud.kicks-ass.org
"A man can receive nothing, except it be given him from heaven."


16. RE: Object Databases vs Relational Databases (vs flat files) Mon Jun 13, 2005 [12:32 PM]
eiz
Email not supplied
member since: Dec 24, 2002
In Reply To
Reply
RDBMSs will often not even include columns from the same rows on the same data segments/pages in disk storage.

I just want to state again that you cannot say anything at all about how "RDBMSs" in general store data. For example, kdb+ (which is used mainly for time-series analysis) stores columns not rows, rendering this entire line of discussion kind of irrelevant.

BTW, are there any free databases that support SQL99's ORDBMS features?


17. RE: Object Databases vs Relational Databases (vs flat files) Mon Jun 13, 2005 [5:02 PM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
I was under the assumption we were talking about MUD systems here, not any possible hypothetical system.

RDBMSs will often not even include columns from the same rows on the same data segments/pages in disk storage.

Obviously they don't have to. They can cluster the columns based on the queries used on each table - but what will you be using for a MUD? When using an RDBMS as a backend for a MUD, you'll be loading objects to and from the database in their entire form and caching them locally - this means that to split columns up would be an insane proposal.

Reguardless, this is an pointless comment, as I'm talking about how clustering by type (what RDBMs generally do) will improve query time over an object oriented database that doesn't cluster by type. You certainly aren't going to partition up columns for a commonly used query across the disk, so I fail to see how this is a counterpoint. You're still going to be clustering to improve locality of queries - something that object databases are generally lacking (this was my original point). You're getting caught up in the semantics of what I'm saying and the examples I'm using, and missing the concept.

Tablespace partitioning can also done in RDBMSs for performance

I assume you mean clustering subclass table data along with the parent table data. However, when you have a complex inheritance graph, this obviously becomes stupid idea - a complete waste of space which will kill your locality just as much as a join (a waste of space that Oracle admits to - the link is on my laptop).

The fact of the matter is that you either cluster for queries, or you cluster for navigational access (single object too). You can't cluster for both because each has a completely different locality model - unless you add redundancy, which presents a whole new plethora of performance problems.

There is still not much meaningful one can say about the physical implementaion of RDBMSs without specifying what you are talking about. Everything you've said about the physical implementation thus far appears to apply only to mySQL's myISAM.

The peril of RDBMS comes from the normalization process - the physical models I've been explaining only illustrate some problems. I've used these models in an attempt to explain how you can't cluster for both queries and navigational access. This is what I was trying to explain to the original poster. Focus on my real argument, not the validity of my exmaples. I may not be an RDBMS expert, but I certainly understand the limitations of clustering and physical orientation.

However SQL-99 has extended the relational model to have 'row types' where each tuple has a row type T which can be referenced with REF(T).

And how mature are the implementations of SQL-99? (which, IMO, attempts to hack solutions for polymorphism in the relational model)

I was talking about the mapping overhead of types, not objects nor rows but value types. Most RDBMs do not store data as it's represented in C serialization at all.

Of course not. They store it in a normalized form, which means expensive joins.

Yes, the fastest RDBMSs Oracle and DB2 will create tons of wasted space on tables for performance and data integrity reasons. 'Space wastage' is done intentionally.

You missed my point. In order to accomodate complex inheritance graphs in RDBMS, you either sacrifice lots of speed for joins or lots of space plus some time (locality reasons). This extra time lost due to poor locality because of space wasteage is very hard to measure, but I'm sure it trumps the time spent mapping objects - the time spent mapping is minimal compared to either of these (join time or poor locality from space wastage by type clustering). The reason? To improve query performance.

Relational databases were designed mostly for business solutions which, by nature, have a flat inheritence heirarchy.
I wish they were. While video store and university catalogs seem to be the common examples they are much simpler than business applications, insurance, AR, AP, and banking.


Excuse me, I meant: relational databases were designed mostly for business solutions of the era (70-80s) in which solutions were developed with flat inheritence heirarchies. As such, they carry baggage that has given them inane performance and complexity for solutions using object-based modeling - complexity which you've seen yourself it appears. For the history of relational databases click here.

(Comment added by lindahlb on Mon Jun 13 18:04:41 2005)

You're getting caught up in the semantics of what I'm saying and the examples I'm using, and missing the concept.

This is my fault as much as it is yours (I'm not necessarily assigning blame).


18. RE: Object Databases vs Relational Databases (vs flat files) Mon Jun 13, 2005 [10:17 PM]
eiz
Email not supplied
member since: Dec 24, 2002
In Reply To
Reply
I'm not even sure what this thread is supposed to be about anymore (aside from a pissing match between the two of you), but some things you should know:

1. In regards to your SQL-99 question, there are plenty of perfectly good ORDBMSs. PostgreSQL has direct support for inheritance, various commercial databases have ORDBMS support, some have SQL-99 support. Aside from that there are fast non-SQL ORDBMSs like GigaBASE (from the author of GOODS). As a matter of fact if you had come to me before starting work on MudDB I would have probably said you were insane and to use gigabase instead. =)

2. World of Warcraft uses Oracle for storage. Somehow, they get by.

3. RDBMSs have wildly varying implementations. So you shouldn't go and make generalizations about how they "work." This is like the fourth time this has been brought up, and you're still doing it.


19. RE: Object Databases vs Relational Databases (vs flat files) Tue Jun 14, 2005 [11:06 AM]
Drey
Email not supplied
member since: Mar 19, 2000
In Reply To
Reply
"Keys are only supposed to be used for relationships between tables and really shouldn't be seen by the user of the database - this is the relational theory that RDBMS are grounded on."

As a professional database administrator, I'd just like to say that I and my users use the key *all the time*. A key column that's only used for joining tables together is a waste of space (but sometimes necessary). I don't know about the theory you're referring to but I do know there are a lot of Ivory Tower academics who put more stock in theory then usefulness; I'm not saying you're one of them, but I am saying that maybe you've been unduly influenced by one.


20. RE: Object Databases vs Relational Databases (vs flat files) Tue Jun 14, 2005 [11:09 AM]
Drey
Email not supplied
member since: Mar 19, 2000
In Reply To
Reply
"Reguardless of whether or not an index is in the tablespace or not, relational databases cluster rows (tuples - the term had escaped me previously) for the table, and this is an advantage for ad-hoc queries over object databases which do not cluster by type (some allow you to explicitly do such - the partition technique, I mentioned)."

Not always and not necessarily usefully. Microsoft SQL Server, for instance, let's you decide whether or not to cluster the rows and even then you have to decide which index you're going to cluster it on. As Tyche has said repeatedly, you can't make blanket statements about RDBMSes because they're all different.


21. RE: Object Databases vs Relational Databases (vs flat files) Tue Jun 14, 2005 [11:40 AM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
World of Warcraft uses Oracle for storage. Somehow, they get by.

I'm not surprised, for three reasons:
1) The costs of commercial object databases are insane, at least the ones you'd want to trust your data with and have the necessary support. You don't want to be hiring a contractor for object database support - you'll most likely have to go to the developer which means, guess what, more insane costs. The cost of an Oracle DBA is much, much less.
2) I'd be surprised if their inheritence graphs were complex for the data being stored in the database (can't confirm this one way or another, of course).
3) They'll be doing lots of reporting and data collection on the information contained in the databases, so effecient query support seems only natural (relational databases have maturity here).

I'm not at all suprised of the limited use object databases have in the commercial world - the commercial world uses reports and data collection often. There's a reason why object databases are used much more often in embedded devices - where queries and reports are minimal. Some of it is the low overhead, but mostly the superior access and update times allows for cheaper hardware and don't need the maturity relational databases have in query support. The same holds true with any MUD - they don't need to generate complex queries to provide to management - there is no management. Tweaking the game is done by analyzing players as opposed to statistics.

As a matter of fact if you had come to me before starting work on MudDB I would have probably said you were insane and to use gigabase instead.

You've always thought of me as insane for creating it.. nothing new there. Looking over the specifications, I'm glad I stuck to MudDB - there's a lot of places where I'd still cut corners (big time), but the interface to the type system is similiar at least. Allocations are probably similiar too. It's definately a good choice if you want to go the object relational route. However, I wouldn't really call it relational, as it has object references which imply a direction. Relations do not have direction. It is no more relational than other object databases with extended query support.

RDBMSs have wildly varying implementations. So you shouldn't go and make generalizations about how they "work." This is like the fourth time this has been brought up, and you're still doing it.

I'm not saying the storage architecture is similiar. I'm saying how they optimize the storage architecture is similiar. There is a very obvious reason why object databases were created - because of the ineffeciencies of relational databases when modeling complex inheritence graphs, and this should NOT be ignored. I don't see whats so hard to understand about this. I made some generalizations at first to help explain this concept - unfortunately I was nitpicked for my examples because I was too general (I didn't realize the folly of my examples).

Why do object databases (not necessarily embedded) have update rates in the thousands while relational databases struggle to even reach one thousand? It's not because of mapping or anything silly like that. It's because of physical structure - a structure that is too wrapped up in clustering for queries in a flat inheritence graph as opposed to clustering for navigation and direct referential access. The relational model does allow for improvement in physical structure for this sort of activity, but, in practice, it's not done because it becomes unneccessarily complex when you add in the baggage associated with today's implementations of the relational model. This has everything to do with how RDBMS work, in general.

Sure, in theory, RDBMS don't work in any particular manner, but lets talk about reality here - they offer piss-poor performance when it comes to complex object inheritence graphs. Certainly don't take my word for it, look at the numbers, yourself.

On a side note, I don't think RDBMS will ever be capable of reaching the same performance threshold as object databases. Why? I attribute this to the relational model - one that requires relations to ignore direction. This means you can't optimize away inverse references (as GIGAbase calls them). This definately isn't the real reason why the performance is so different today, but once RDBMS become better at organizing complex object graphs, this will be one problem that cannot be solved by altering implementation - it requires abandonment of the relational model.

(Comment added by lindahlb on Tue Jun 14 13:18:36 2005)

The relational model does allow for improvement in physical structure for this sort of activity

To avoid another nitpick: The relational model allows for anything physically.

(Comment added by lindahlb on Tue Jun 14 13:20:47 2005)

On a side note, I don't think RDBMS will ever be capable of reaching the same performance threshold as object databases.

Another: with respect to navigational access and single object access in complex object graphs.

(Comment added by lindahlb on Tue Jun 14 22:22:11 2005)

this will be one problem that cannot be solved by altering implementation - it requires abandonment of the relational model.

A bit strong for a side comment with no real evidence, so reduce the tone of this statement a little when you read it.


22. RE: Object Databases vs Relational Databases (vs flat files) Tue Jun 14, 2005 [12:03 PM]
lindahlb
Email not supplied
member since: Mar 2, 2001
In Reply To
Reply
"As a professional database administrator, I'd just like to say that I and my users use the key *all the time*. A key column that's only used for joining tables together is a waste of space (but sometimes necessary). I don't know about the theory you're referring to but I do know there are a lot of Ivory Tower academics who put more stock in theory then usefulness; I'm not saying you're one of them, but I am saying that maybe you've been unduly influenced by one."

I already addressed this statement in a subsequent, but perhaps you missed it. I had meant non-value keys. Do your keys represent a value or is it an arbitrary number? If they are an arbitrary number, their usage does not fit the relational model because they're being used as a direct reference. Keys such as employee numbers are actually values that help you identify the tuples because you cannot realistically hold enough information about a human to uniquely identify him or her (unless you want to reveal SSNs). However, part numbers are often used incorrectly and break the relational model - as parts should be identified uniquely by name (save for when you have duplicate names for some reason). Same problem comes with using 'identifiers' to perform key-value access and storage as a persistence engine - you're violating the relational model (you may not care, however). The main point here, is that you're using keys to provide referential access.

"Not always and not necessarily usefully. Microsoft SQL Server, for instance, let's you decide whether or not to cluster the rows and even then you have to decide which index you're going to cluster it on. As Tyche has said repeatedly, you can't make blanket statements about RDBMSes because they're all different."

Correct, I've addressed this in my above post by admitting that I used wrongly generalizations as examples to get my point across. Unfortunately, I don't know how each particular RDBMS works, I only know that they are ineffecient for complex object graphs and am giving possible examples as to why using information I've gathered over a period of time while doing research on (countless) database systems.

(Comment added by lindahlb on Tue Jun 14 13:24:45 2005)

The main point here, is that you're using keys to provide referential access.

The maint point here, is that you're using keys that represent no real value but only to provide referential access. The relational database should be doing this for you if it's for performance reasons - and shouldn't be exposing it to the user (just administrators for integrity reasons only).


23. RE: Object Databases vs Relational Databases (vs flat files) Tue Jun 14, 2005 [10:57 PM]
eiz
Email not supplied
member since: Dec 24, 2002
In Reply To
Reply
The same holds true with any MUD - they don't need to generate complex queries to provide to management - there is no management. Tweaking the game is done by analyzing players as opposed to statistics.

This is another silly generalization. I happen to like reports and statistics, and thus a good query language is important to me.

I'm not saying the storage architecture is similiar. I'm saying how they optimize the storage architecture is similiar. There is a very obvious reason why object databases were created - because of the ineffeciencies of relational databases when modeling complex inheritence graphs, and this should NOT be ignored. I don't see whats so hard to understand about this. I made some generalizations at first to help explain this concept - unfortunately I was nitpicked for my examples because I was too general (I didn't realize the folly of my examples).

Why it's so hard is that you haven't convinced me that it's true: in particular, that a MUD needs "complex inheritance graphs," - we certainly don't have any - that RDBMSs can't or don't solve this problem, and that they all optimize storage in similar ways. Why doesn't Oracle's CREATE CLUSTER, for example, solve the problem of costly subclass joins? What sort of "baggage" is preventing these systems from being fast?

Certainly don't take my word for it, look at the numbers, yourself.

Since you're the one making claims about "piss poor" performance, I think it's your responsibility to provide some numbers yourself.

Just to be clear, I agree that an object database (possibly object-relational) is probably a better choice for a MUD, but you have made some fairly expansive claims about RDBMS that I don't think have been fully justified.


24. RE: Object Databases vs Relational Databases (vs flat files) Wed Jun 15, 2005 [2:33 AM]
Tyche
Email not supplied
member since: Apr 4, 2000
In Reply To
Reply
I was under the assumption we were talking about MUD systems here, not any possible hypothetical system.

Well muds are hypothetical applications unless you've got some specific mud design in mind.

> RDBMSs will often not even include columns from the same rows on the same data segments/pages in disk storage.

Obviously they don't have to. They can cluster the columns based on the queries used on each table - but what will you be using for a MUD? When using an RDBMS as a backend for a MUD, you'll be loading objects to and from the database in their entire form and caching them locally - this means that to split columns up would be an insane proposal.


My statement was not a proposal, it's a fact. As I said, DB2 is an example of an RDBMS which may split certain column value types into different storage segments. Another different and user controlled optimization is MDC or multi-dimensional clustering. Physical segments are divided into blocks, cubes, or more dimensions based on columns. Select predicates which match MDC table columns will invoke blockscans instead of tablescans. Again I only mention it because these notions about 'clustering' and 'locality' don't fit into your view of how RDBMSs work.

Now IRT to your proposed mud design above, I don't think that's obvious at all. Why would you cache objects twice? And why are you loading data you don't need (entire objects)?

Reguardless, this is an pointless comment, as I'm talking about how clustering by type (what RDBMs generally do) will improve query time over an object oriented database that doesn't cluster by type.


You ain't been listening. They don't generally do that. If anything one will often want that the table be usually clustered by the primary key index, unclustered can be a performance winner too. That's why RDBMSs can and do outperform OODBMSs at navigation as well. Part of that is there is only one way to navigate an OODBMS, while a well-design RDBMS has many ways, bi-directional, and with some paths shorter and faster than OO model navigation.

> Tablespace partitioning can also done in RDBMSs for performance

I assume you mean clustering subclass table data along with the parent table data.


No I do not mean that. The statement is frankly unambiguous.

However, when you have a complex inheritance graph, this obviously becomes stupid idea

Define 'complex inheritance graph'. Are we talking DikuMud, TMI mudlib, ColdCore... just what do you consider complex inheritance graph. Since ORDBMSs do multiple inheritance and yours does single inheritance I guess they do a better job then?

The fact of the matter is that you either cluster for queries, or you cluster for navigational access (single object too). You can't cluster for both because each has a completely different locality model - unless you add redundancy, which presents a whole new plethora of performance problems.

Nonsense.

The peril of RDBMS comes from the normalization process - the physical models I've been explaining only illustrate some problems.

The peril of OO is there is no logical/physical model separation. Tight coupling is a maintenance nightmare. You change your object model you have to migrate the whole damn object store, while in an RDBMS you change a few tables. OODBMSs have zero portability. If you develop a new application the database from your previous application is not reusable. RDBMSs logical data models have reusability across applications. What a paradox, OO supposedly creates reusable software, yet creates non-reusable databases.

I've used these models in an attempt to explain how you can't cluster for both queries and navigational access.

You are wrong. Mostly because the statement is nonsensical.

And how mature are the implementations of SQL-99? (which, IMO, attempts to hack solutions for polymorphism in the relational model)

Quite mature. Oracle, DB2, Sybase, Informix do. I'm certain there are others. How many OODBMSs support ODMG? How mature are they? Does you ODBMS support ODMG 2.0?

> Yes, the fastest RDBMSs Oracle and DB2 will create tons of wasted space on tables for performance and data integrity reasons. 'Space wastage' is done intentionally.

You missed my point.


If you check what the above was in response to, you'll find I was again disputing your notions about physical storage, clustering and locality. Your myopic view of only one aspect of a problem prevents you from seeing the big picture. This might be stunning news, but many applications performance is dependent on efficient updates, deletes, insertions, and multi user access. RDBMSs physical storage is/can be optimized for that too. And of course there's the most important RDBMS requirement ACID. ('Space wastage').
The Sourcery - http://sourcery.dyndns.org
TeensyMud - http://teensymud.kicks-ass.org
"A man can receive nothing, except it be given him from heaven."


25. RE: Object Databases vs Relational Databases (vs flat files) Wed Jun 15, 2005 [2:37 AM]
Tyche
Email not supplied
member since: Apr 4, 2000
In Reply To
Reply
World of Warcraft uses Oracle for storage. Somehow, they get by.

The Sims Online does as well. EA claims they can do 30K SQL tps per hardware cluster.
The Sourcery - http://sourcery.dyndns.org
TeensyMud - http://teensymud.kicks-ass.org
"A man can receive nothing, except it be given him from heaven."


Pages: 1 | 2 | 3



[Previous] [Next] [Post] [Reply] [Topics] [Summary] [Search]