IIUG Conference 2018

Looking forward to meeting in Washington DC next October!

Tuesday, October 26, 2010

My next new feature request

I've thought of my next feature request.  Speaking with other users, especially those in high-volume, near-24x7 environments, I often hear the lament that after a server restart it can take many minutes, and often hours, before the server reaches its normal steady state with active rows in the cache and all data dictionary entries, data distributions, stored procedures, common statements, and other cached objects loaded up.  During this ramp-up period performance is significantly reduced, since most queries require a physical IO to read in the pages that are needed and possibly fill in other cache entries.

Based on this lament, my feature request is that in the background, after every checkpoint, the engine dumps out the list of cached data and index page addresses so that during a restart the engine can - optionally - read in the list and repopulate the cache.  If this goes through the normal page processing algorithms I expect that the data dictionary and distribution caches can be filled in as well.  The statements cache and stored procedure caches may have to be handled separately, though the procedure cache should be simple enough if the procnames are saved along with the page addresses at checkpoint time.  The statement cache may be the costly one since the entire statement texts will have to be saved and restored to memory, but this is also doable.  At least from an outsider's point of view.

I ran this by some of the Informix development team last night here at IOD and no one thought that this one was difficult to pull off.  The only thing we lack is a strong use case.  What do YOU think?  Is this a useful feature?

Monday, October 18, 2010

Fear the Panther

So, after much waiting and absolutely no pre-event hoopla, Panther is finally here.  You will see much in the press and in the IBM official announcements about the features of Panther (Informix Dynamic Server version 11.70) that IBM thinks are important to the marketplace.  Jerry Keesee is calling Panther "The Last of the Big Cats", hinting that IBM will be finding another theme for the next release.

I don't want to repeat all of what you will hear from IBM, but I do want to provide my own announcement of those features that I feel will be the most important in the marketplace over the next 12 months.  This is a tremendous release as far as features are concerned.  Without further ado, here's my list, categorized as IBM has done theirs:

OLTP Data Security

What do I mean by OLTP Data Security?  Well, what is the greatest fear in an OLTP environment?  To me it is loss of transactional data.  If a user has completed a transaction and thinks it is secured in your database, you cannot afford to lose it.  If that transaction was a $20 ATM withdrawal, your company is on the hook for the money.  If it was a multi-million dollar electronic funds transfer, even more so.

If a user is in the midst of a transaction and your primary server crashes, do you tell the users "Sorry.  System problem.  Please try again later."  You will lose a customer, or several hundred customers.

"OK.", you ask, "How can Panther help?"  Panther has a new feature that IBM is calling "transaction survival".  I call it "uninterruptible transactions".  If your primary MACH11 server goes down any transactions begun by a user connected to any secondary server (HDR, SDS, or RSS) will not be lost but will be picked up by the surviving SDS secondary server which is promoted to be the new primary server and the transaction will continue unharmed.  There is no application change or ONCONFIG parameter needed to enable this feature and make it work.  Your existing applications don't have to be modified to reconnect to the new primary server.  This is all automatic and effort-free.  No other RDBMS server in the industry has this feature.  Oracle does have a similar feature but it's not real and it's not automatic.  You have to enable it in your application because it is actually implemented in the front-end libraries which fake survival by recording the transaction locally on the client in a flat file and replay the transaction from scratch after reconnecting to the new server.  The feature is so lame that Oracle doesn't even talk about it.

To me transaction survival is a biggie!  The biggest feature in Panther.

OLTP Performance

First, ADTC's testing shows that Panther is about 11% faster than 11.50.xC7 without taking advantage of any of the performance enhancing features mentioned below.  However, Panther has several features to further improve performance of OLTP systems.

One is the removal of the requirement for a foreign key constraint to have an index on the foreign key.  The supporting index can now be optionally disabled at constraint create time (or later) and its storage will be released.  The selectivity of these indexes on the foreign keys pointing to lookup tables with few rows is low and the utility of requiring the index has always been questionable.  The engine does not need these indexes to enforce the constraints unless cascading deletes are enabled.  You likely don't need them for searching because most systems have composite keys containing those foreign key columns that are used for searching and filtering.  It is rare in an OLTP style query for the lookup tables to be selected as the primary query tables requiring an index on the dependent table to support the join.  Removing this requirement will reduce the overhead of inserting and modifying data in tables with many code columns without having any impact on query performance.

Another one is the new Forest of Trees indexes.  This is a new indexing method that combines many of the advantages of a hash index with those of a traditional Informix B+tree index.  For indexes whose first column(s) have low selectivity but are still important to the correct processing of many OLTP queries (think geo, country, region of an index supporting a company's geographic reporting structure as an example), the B+tree indexes can become rather deep with relatively few unique elements on a level.  What Forest of Trees indexes do is let the database architect specify one or more of the leading columns to be used to create a hash key.  Then a separate B+tree is created for each hash value containing only the remaining key parts following the hash columns.  This results in several flatter B+tree indexes.  You can't perform range scans on the hash columns (for that you will also need a pure B+tree index) but you can do so on the remaining columns in the index key.
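The idea can be sketched in a few lines of Python.  This is strictly a toy model, not IBM's implementation - all names here are made up: hash the leading, low-selectivity column(s) to pick a subtree, and keep the remaining key parts ordered within it.

```python
import bisect

class ForestOfTrees:
    """Toy model of a Forest of Trees index: hash the leading,
    low-selectivity column(s) to choose a subtree, then keep the
    remaining key parts ordered (a sorted list stands in for each
    flatter B+tree).  Not IBM's implementation - just the idea."""

    def __init__(self, n_buckets=16):
        self.n_buckets = n_buckets
        self.trees = [[] for _ in range(n_buckets)]  # one subtree per hash value

    def _bucket(self, lead_key):
        return hash(lead_key) % self.n_buckets

    def insert(self, lead_key, rest_key, rowid):
        # Entries carry lead_key too, so hash collisions stay separable.
        bisect.insort(self.trees[self._bucket(lead_key)],
                      (rest_key, lead_key, rowid))

    def lookup_range(self, lead_key, lo, hi):
        """Equality on the hashed column(s), range scan on the rest."""
        tree = self.trees[self._bucket(lead_key)]
        i = bisect.bisect_left(tree, (lo,))
        out = []
        while i < len(tree) and tree[i][0] <= hi:
            if tree[i][1] == lead_key:
                out.append(tree[i][2])
            i += 1
        return out
```

The point of the sketch is only that each subtree is shallower because the low-selectivity leading columns are gone from its key; you pay for that with equality-only access on the hashed columns.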

Multiple index scans in the optimizer are another big win for OLTP.  Often you have a complex join between three or four tables, using the filtered rows from the independent tables to act as filters on the dependent table, plus several filters on multiple columns in the dependent table.  The optimizer in earlier releases of IDS had to select the "best index", often resulting in the engine having to read many rows of data and perform the final filtering using the actual row data even though indexes on the filter columns were available.  The old XPS optimizer could use more than one index on each table in a query.  Now Panther can as well.  This can reduce the number of long-key compound indexes, which can also improve insert, delete, and update performance.  For databases that are delivered as part of a third party application, where you can't control the schema, this one may be a HUGE win, since the composite indexes that could make such queries reasonably efficient may never have existed in the first place.
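The shape of such a plan can be illustrated in Python (a toy table with hypothetical column names, not the engine's actual data structures): each single-column index yields a set of rowids, and the sets are intersected before any data row is read.

```python
# Toy table: rowid -> row.  Column names here are made up for illustration.
rows = {
    1: {"state": "NJ", "status": "open"},
    2: {"state": "NJ", "status": "closed"},
    3: {"state": "NY", "status": "open"},
}

def build_index(rows, col):
    """Single-column index: value -> set of rowids."""
    idx = {}
    for rowid, row in rows.items():
        idx.setdefault(row[col], set()).add(rowid)
    return idx

state_idx = build_index(rows, "state")
status_idx = build_index(rows, "status")

# Multi-index scan: intersect the rowid sets from both indexes,
# then fetch only the surviving rows from the table.
hits = state_idx.get("NJ", set()) & status_idx.get("open", set())
result = [rows[r] for r in sorted(hits)]
```

The win is that neither filter needs a compound index covering both columns; the engine touches only the rows surviving the intersection.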

Data Warehouse Performance

New Star Schema support in the optimizer.  What does this one mean?  This is another offshoot of XPS and the multi-index support above.  In Panther, when you have many dimension tables, you can create a single compound index on all of the dimension table keys in the fact table.  The optimizer will recognize this and use the filters on the dimension tables to generate a list of valid combinations of the dimension tables' keys in a temp table, then join that temp table to the fact table using the compound index to quickly locate the rows that satisfy the criteria on the dimension tables.  Whoosh!  At ADTC we've tested this one quite a bit and the results are impressive.
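In rough Python terms the plan looks like this (toy data and hypothetical key names, not the optimizer's real internals): filter each dimension, build the cross product of surviving keys - the optimizer's "temp table" - and probe the composite fact-table index with each combination.

```python
from itertools import product

dim_region = {1: "East", 2: "West"}          # region_key -> name
dim_year   = {2009: "FY09", 2010: "FY10"}    # year_key -> label

# Composite index on the fact table: (region_key, year_key) -> fact rowids.
fact_index = {
    (1, 2009): [10],
    (1, 2010): [11, 12],
    (2, 2010): [13],
}

# Apply the dimension filters first...
region_keys = [k for k, name in dim_region.items() if name == "East"]
year_keys   = [k for k in dim_year if k >= 2010]

# ...then join the "temp table" of valid key combinations to the fact
# table via the composite index, touching only matching fact rows.
hits = []
for combo in product(region_keys, year_keys):
    hits.extend(fact_index.get(combo, []))
```

The fact table is probed only for combinations that survived every dimension filter, which is the whole trick.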

Administration

I put the new Heterogeneous Server Grid feature under this heading because I can't think of a better one.  IBM is actually making quite a bit of noise about this one, and rightfully so.  If you have lots of old hardware around, don't throw it out.  Don't sell it to a junker or to a used equipment reseller.  Configure it as an IDS grid node and add to your company server farm's net computing power at near zero cost (since the boxes are already paid for and amortized).  I'm waiting for confirmation about whether this means you can have an HDR or other MACH11 secondary that's built on different hardware or a different OS.  I suspect not, but who knows what the future may bring.  Heterogeneous grid was a big request from a number of very large Informix customers, so if heterogeneous MACH11 members are what you need, speak up.

Zero downtime rolling upgrades.  Speaks for itself.  Jarrod down in New Zealand is jumping for joy over this one, I'm sure.  He and his one assistant have to upgrade a very large number of servers during the few weeks a year that his company's servers can afford to be offline even for a few hours.  Being able to configure their server grid to roll in a fixpack that removes a critical bug at any time will be a godsend for him and many of us out there.

The IIUG will soon be posting its new Features Survey which we will then hand to IBM to help them design the next release of Informix Dynamic Server.  If you don't think that you have a voice in how IBM determines features for Informix, think again.  Every one of the features mentioned here was requested by users like you/me/us either directly or through an expressed need that had to be filled.  Several are features that were asked for and received high marks on the previous Features Surveys.  So, when you see the Insider or email announcement of the Survey, fill it out. 


In the next release I'd like to see:

  • Transaction survival when the server your session is connected to crashes!
  • Bitmap indexes with multiple bitmaps used to generate a net bitmap of satisfying rows.

What would you like to see?  Comments welcome.

Wednesday, July 28, 2010

New Journaled Filesystem Rant

There have been questions from multiple posters on the Informix Forums lately asking about Journaled File Systems (JFSes) like EXT3, EXT4, and ZFS among others.  Bottom line?  JFSes should NEVER be used for storing data for a database system.  ANY database system, whether it is Berkeley DB, Oracle, Sybase, DB2, MySQL, PostgreSQL, MS SQL Server, Informix, whatever.  "But", you protest, "the journaling makes the filesystem safer.  It speeds recovery.  It is a 'good thing'!"  No.  Not for databases.  Flat out - no!

First, your database is already performing its own logging (read: journaling, for the DB neophyte).  That is sufficient to permit proper and secure recovery.  It is also fast - if it weren't, the database product would have gone the way of DBase2 and DBase3 long ago.  The filesystem's journal is redundant at best, and at worst will actually slow recovery (versus using RAW or COOKED - non-filesystem - space for storage) by requiring two sets of recovery operations to happen sequentially.  Note that all properly designed database systems use O_SYNC or O_DIRECT mode write operations to ensure that their data is safely on disk.  However, it has come to my attention that many journaling filesystems do not obey these directives when it comes to metadata changes.  On these filesystems metadata is ALWAYS cached.  Therefore there is neither a safety nor a recovery speed gain from using JFSes for database storage.
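For illustration, here is what a synchronous write looks like from user space, sketched in Python (O_DIRECT is omitted because it adds buffer-alignment requirements; this is a sketch of the technique on a Unix-like system, not how Informix itself is coded):

```python
import os

def synced_write(path, data):
    """Open with O_SYNC so that write() does not return until the data
    is physically on disk.  Note the point made above: on some
    journaled filesystems this guarantee does not extend to all
    metadata changes."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
```

The cost of that guarantee is exactly the point of the rant: every such write bypasses the cache coalescing a normal buffered write would enjoy.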

Most JFSes use metadata only journaling.  Here is some insight into that process, and why JFSes should not be used for database storage:
  • This (logical metadata only journaling) is the method used by EXT3, EXT4, JFS2, and ZFS
  • All of these except AIX's JFS2 use block relocation instead of physical block journaling (JFS2 - and the Open Source JFS filesystem derived from it - does not journal or relocate data blocks, so it is safe).  This means that on write a block is always written to a new location rather than overwriting the existing block on disk.  A properly designed JFS will commit the new version of the disk block before updating the metadata or the logical journal (that's the problem with EXT4 - and EXT3 with write-back enabled - they write the metadata first, then the journal entry, before actually committing the physical change to disk).  Once the write and journal are completed, the FS metadata is updated and the write is acknowledged.  This means that, in a proper JFS, on a crash there are three possibilities:
    • The new block version was partially or completely written but the journal entry was not written.
    • The new block version and journal entry were written and committed.
    • The new block version, journal, and metadata were written and committed.
In the first case, after recovery, the file remains unchanged and the changes are lost.  In the second case, after recovery, the FS makes the missed metadata entries, the file is modified during recovery, and the original block version is freed for reuse.  In the third case all was well before the crash and the original version of the block was released for reuse.
The problem with EXT4 (and EXT3 with write-back enabled) is that the application (meaning in this case Informix or another database system) thinks everything is hunky dory, since the FS acknowledged the change as committed.  However, immediately after the acknowledgment the physically modified block is still ONLY in cache, and only the metadata and journal entry have been saved to disk.  At this point, if there is a crash, the file is actually unrecoverable!  The metadata and the journal entry say the block has been moved to a new location and rewritten, but the new location has garbage in it from some previous block.  This one made Linus Torvalds absolutely livid, and he tore the EXT4 designers a new one over the design.  You can Google his rants on the subject yourself.  Last I heard you could not disable the write-back behavior of EXT4 - Linus was pushing to have that fixed, but I don't know if it ever was.  I use EXT3 in its default mode for filesystems and EXT2 (the original non-journaled Linux FS) for database storage that I care about.

JFS2 and the Open Source JFS filesystem have no serious problems.  EXT3 in default mode and ZFS at least are safe, but the problem with them is the very fact of the block relocations.  There is the performance problem of rewriting a whole block every time the database changes a single page within the block, negating much of the gains of caching, and there is the bigger problem that the file is no longer even as contiguous as a non-journaled filesystem would have it be.

Standard UNIX filesystems (EXT2 and UFS as examples) allocate blocks of contiguous space and try to leave free space that is contiguous with those allocated blocks unused when allocating space for other files, so that as a file grows it remains mostly contiguous in multi-block chunks.  This fragments the free space in an FS, making it difficult to write very large files (like Informix chunks) that are contiguous, but if you keep the chunks on an FS that's dedicated to Informix chunks that has not been a real problem until recently, since Informix did not extend existing chunks over time prior to the recent release of Informix v11.70.  Informix 11.70 can, optionally, extend the size of an existing chunk.  JFSes break that rule, keeping the contiguity of a file down at the individual block level.  Even if a chunk were allocated as contiguous initially, over time the JFS will cause the file to become internally fragmented.  Two logically contiguous blocks that were originally also physically contiguous can become spread out within the file's allocated space over time when they are rewritten.  If you make the FS block size smaller to alleviate the costs of multiple block rewrites, you make the file fragmentation worse.

These problems don't affect filesystems and normal files as much as databases because the nature of the IO to files is different than IO to databases.  When you write to a flat file, you write mostly sequentially, you rarely rewrite a portion of the file (unless you rewrite the entire file) and you never sync the file to disk before you close the file.  That means that the cache will coalesce all writes until an entire block has been written out before the FS and OS cause a flush and sync of the cache to disk.  That means that the FS has the ability to try to keep the rewritten blocks contiguous by allocating the replacement blocks contiguously.  Essentially the file is relocated whole if it is rewritten. 

Databases don't work that way.  Informix, for example, writes every block to a COOKED device or filesystem chunk under either O_SYNC or O_DIRECT control, both of which force the single write operation (and Informix only ever writes a single page or eight contiguous pages at a time) to be physically written and committed before the write() call returns.  That means that the coalescing features of the FS and OS cache management are bypassed in favor of data safety.  So, if the engine performs what it thinks is a sequential scan, it is actually performing a random read of the file, swinging the read/write heads back and forth across the disk.  If the physical structure is shared with other applications and even other machines (can you say massive SAN?) it will also be competing with those other storage clients for head positioning.  In normal sequential scanning (i.e., RAW or COOKED devices or non-JFS files) the disk, controller, filesystem, and database read-ahead processing reduce the performance impact of this head contention somewhat.  In a JFS that uses block relocation, read-ahead cannot help at all.

All of this having been said, I guess I have to change my mantra:

NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!!   NO JFS, NO RAID5!!! 

Oh!  Also, PLEASE:  NO RAID6!!!!!!!!!!!  Yuck.

Wednesday, June 16, 2010

RAID5 Rant

I used to post to the Informix NewsGroup about once a year.  I haven't done so in a long time, and now I have a different forum to do that in.  This BLOG.  So, for all of your reading pleasure here is my analysis of why RAID5 is Unsafe at Any Speed:


RAID5 versus RAID10 (or even RAID3 or RAID4)

What is RAID5?

OK, here is the deal: RAID5 uses ONLY ONE parity drive per stripe, and many RAID5 arrays are 5 drives (if your counts are different, adjust the calculations appropriately) - 4 data and 1 parity (though it is not a single drive that is holding all of the parity, as in RAID3 & RAID4, but read on).  If you have 10 drives of say 20GB each for 200GB, RAID5 will use 20% for parity, so you will have 160GB of storage.  Now since RAID10, like mirroring (RAID1), uses 1 (or more) mirror drive for each primary drive, you are using 50% for redundancy, so to get the same 160GB of storage you will need 8 pairs, or sixteen 20GB drives - which is why RAID5 is so popular.  This intro is just to put things into perspective.
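That arithmetic can be captured in a couple of Python helpers (assuming the 5-drive stripes used in the example above):

```python
def raid5_usable(n_drives, drive_gb, stripe_width=5):
    """One block per stripe holds parity, so each stripe_width-drive
    group gives up one drive's worth of space."""
    return n_drives * drive_gb * (stripe_width - 1) // stripe_width

def raid10_usable(n_drives, drive_gb):
    """Every drive is fully mirrored: half the raw space is usable."""
    return (n_drives // 2) * drive_gb

# The numbers from the text: ten 20GB drives under RAID5 vs. sixteen
# 20GB drives (8 mirrored pairs) under RAID10 both yield 160GB usable.
assert raid5_usable(10, 20) == 160
assert raid10_usable(16, 20) == 160
```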

RAID5 is physically a stripe set like RAID0, but with data recovery included.  RAID5 reserves one disk block out of each stripe block for parity data.  The parity block contains an error correction code which can correct any error in the RAID5 block; in effect it is used in combination with the remaining data blocks to recreate any single missing block, gone missing because a drive has failed.  The innovation of RAID5 over RAID3 & RAID4 is that the parity is distributed on a round-robin basis so that there can be independent reading of different blocks from the several drives.  This is why RAID5 became more popular than RAID3 & RAID4, which must synchronously read the same block from all drives together.  So, if Drive2 fails, blocks 1, 2, 4, 5, 6 & 7 are data blocks on this drive and blocks 3 and 8 are parity blocks on this drive.  That means that the parity on Drive5 will be used to recreate the data block from Drive2 if block 1 is requested before a new drive replaces Drive2, or during the rebuilding of the new Drive2 replacement.  Likewise the parity on Drive1 will be used to repair block 2 and the parity on Drive3 will repair block 4, etc.  For block 2 all the data is safely on the remaining drives, but during the rebuilding of Drive2's replacement a new parity block will be calculated from the block 2 data and will be written to Drive2.
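The recovery described here is plain XOR arithmetic, which a few lines of Python can demonstrate (toy 2-byte blocks standing in for disk blocks):

```python
from functools import reduce

def xor_blocks(*blocks):
    """XOR equal-sized byte blocks together - RAID5's parity primitive."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Four data blocks in one stripe, plus their parity block.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x0f\xf0"]
parity = xor_blocks(*data)

# "Drive 2" fails: its block is recreated from the survivors plus parity.
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(parity, *survivors)
```

XOR-ing the parity with every surviving data block yields exactly the missing block, which is why every read of a lost block costs one physical read per surviving drive.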

Now, when a disk block is read from the array, the RAID software/firmware calculates which RAID block contains the disk block, which drive the disk block is on, and which drive contains the parity block for that RAID block, and reads ONLY the data drive.  It returns the data block.  If you later modify the data block, it recalculates the parity by subtracting the old block and adding in the new version, then in two separate operations it writes the data block followed by the new parity block.  To do this it must first read the parity block from whichever drive contains the parity for that stripe block and reread the unmodified data for the updated block from the original drive.  This read-read-write-write is known as the RAID5 write penalty; since these two writes are sequential and synchronous, the write system call cannot return until the reread and both writes complete, for safety, so writing to RAID5 is up to 50% slower than RAID0 for an array of the same capacity.
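The "subtract the old block, add the new version" step is again XOR, so the incremental parity update can be sketched in Python (toy byte blocks):

```python
def updated_parity(old_parity, old_data, new_data):
    """Incremental RAID5 parity update: with XOR, subtracting the old
    block and adding the new one are the same operation.  The two
    reads (old data, old parity) happen first; the two dependent
    writes (new data, new parity) follow - the write penalty."""
    return bytes(p ^ o ^ n
                 for p, o, n in zip(old_parity, old_data, new_data))
```

The payoff is that only one data drive and one parity drive are touched per write, instead of re-reading the whole stripe; the cost is the four synchronous IOs.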

Now, what is RAID10?

RAID10 is one of the possible combinations of RAID1 (mirroring) and RAID0 (striping).  There used to be confusion about what RAID01 or RAID10 meant, and different RAID vendors defined them differently.  Several years ago I proposed the following standard language, which seems to have taken hold.  When N mirrored pairs are striped together this is called RAID10, because the mirroring (RAID1) is applied before striping (RAID0).  The other option is to create two stripe sets and mirror them one to the other; this is known as RAID01 (because the RAID0 is applied first).  In either a RAID01 or RAID10 system each and every disk block is completely duplicated on its drive's mirror.  Performance-wise both RAID01 and RAID10 are functionally equivalent.  The difference comes in during recovery, where RAID01 suffers from some of the same problems I will describe as affecting RAID5, while RAID10 does not.

OK, so what?

Now, if a drive in the RAID5 array dies, is removed, or is shut off, data is returned by reading the blocks from the remaining drives and calculating the missing data using the parity, assuming the defunct drive is not the parity block drive for that RAID block.  Note that it takes 4 physical reads to replace the missing disk block (for a 5-drive array) for four out of every five disk blocks, leading to a 64% performance degradation until the problem is discovered and a new drive can be mapped in to begin recovery.

If a drive in the RAID10 array dies, data is returned from its mirror drive in a single read, with only a minor (6.25% on average) performance reduction when two non-contiguous blocks are needed from the damaged pair, and none otherwise.

One begins to get an inkling of what is going on and why I dislike RAID5, but, as they say on late night infomercials: wait, there's more.

What's wrong besides a bit of performance I don't know I'm missing?

OK, so that brings us to the final question of the day, which is: what is the problem with RAID5?  It does recover a failed drive, right?  So writes are slower; I don't do enough writing to worry about it, and the cache helps a lot also - I've got LOTS of cache!  The problem is that despite the improved reliability of modern drives and the improved error correction codes on most drives, and even despite the additional 8 bytes of error correction that EMC puts on every Clariion drive disk block (if you are lucky enough to use EMC systems), it is more than a little possible that a drive will become flaky and begin to return garbage.  This is known as partial media failure.  Now, SCSI controllers reserve several hundred disk blocks to be remapped to replace fading sectors with unused ones, but if the drive is going, these will not last very long before they run out - and SCSI does NOT report correctable errors back to the OS!  Therefore you will not know the drive is becoming unstable until it is too late, when there are no more replacement sectors and the drive begins to return garbage.  [Note that the recently popular ATA drives do not (TMK) include bad sector remapping in their hardware, so garbage is returned that much sooner.]  When a drive returns garbage - since RAID5 does not EVER check parity on read (RAID3 & RAID4 do, BTW, and both perform better for databases than RAID5 to boot) - then when you write the garbage sector back, garbage parity will be calculated and your RAID5 integrity is lost!  Similarly, if a drive fails and one of the remaining drives is flaky, the replacement will be rebuilt with garbage also.

Need more? 

During recovery, read performance for a RAID5 array is degraded by as much as 80%.  Some advanced arrays let you configure the preference more toward recovery or toward performance.  However, favoring performance will increase recovery time, and increase the likelihood of losing a second drive in the array before recovery completes, resulting in catastrophic data loss.  RAID10, on the other hand, will only be recovering one drive out of 4 or more pairs, with ONLY the performance of reads from the recovering pair degraded, making the performance hit to the array overall only about 20%!  Plus there is no parity calculation time used during recovery - it's a straight data copy.

What about that thing about losing a second drive? 

Well, with RAID10 there is no danger unless the one mirror that is recovering also fails, and that's 80% or more less likely than the chance that any other drive in a RAID5 array will fail!  And since most multiple drive failures are caused by undetected manufacturing defects, you can make even this possibility vanishingly small by making sure to mirror every drive with one from a different manufacturer's lot number.

"Oh!", I can hear you say, "This schenario does not seem likely!" 
Unfortunately it is all too likely.  It happened to me and I have heard
from several other DBAs and SAs who have similar experiences.  My former
employer lost 50 drives over two weeks when a batch of 200 IBM OEM drives
began to fail.  IBM discovered that that single lot of drives would have
their spindle bearings freeze after so many hours of operation.  Fortunately
due in part to RAID10 and in part to a herculean effort by DG techs and our
own people over 2 weeks no data was lost. HOWEVER, one RAID5 filesystem was
a total loss after a second drive failed during recover.  Fortunately
everything was on tape and the restore succeeded.  However, that filesystem
was down for several hours causing 1500 developers to twiddle their thumbs
for most of a day.  That one internal service outage of only a few hours
cost more in lost productivity than the extra cost of using RAID10 for all
of those filesystem arrays! 

Conclusion?  For safety and performance, favor RAID10 first, RAID3 second, RAID4 third, and RAID5 last!  The original reason for the RAID2-5 specs was that the high cost of disks was making RAID1, mirroring, impractical.  That is no longer the case!  Drives are commodity priced; even the biggest, fastest drives are cheaper in absolute dollars than drives were then, and cost per MB is a tiny fraction of what it was.  Does RAID5 make ANY sense anymore?  Obviously I think not.

To put things into perspective: if a drive costs $1000US (and most are far less expensive than that), then switching from a 4-pair RAID10 array to a 5-drive RAID5 array will save 3 drives, or $3000US.  What is the cost of overtime, and of wear and tear on the technicians, DBAs, managers, and customers, of even a recovery scare?  What is the cost of reduced performance and possibly reduced customer satisfaction?  Finally, what is the cost of lost business if data is unrecoverable?  I maintain that the drives are FAR cheaper!  Hence my mantra:

NO RAID5!  NO RAID5!  NO RAID5!  NO RAID5!  NO RAID5!  NO RAID5!  NO RAID5!  NO RAID5!  NO RAID5!  NO RAID5!

Sunday, May 2, 2010

Selling Informix

OK, I'm a tech weenie.  A nerd.  A geek.  But it wasn't always so.  I grew up working with my family in retail hobbies and sporting goods, and I started my undergraduate career as a Mathematics major.  After a break of a couple of years - caused by a night of never-to-be-repeated seizures and a month of heavy duty downers before I could get an appointment with a Neurologist who could tell me I didn't need to take them - I finished up with a degree in Management and Marketing.  I also spent a couple of years in sales after that before deciding that my first love was programming, which I first learned in high school.

Why do I mention this?  As background to today's post.  I want to promote my own view of how IBM SHOULD be presenting their database portfolio to the world.  The personal history is here to indicate that I'm not JUST a tech weenie.  When it comes to sales and marketing I probably have as much training and experience as more than a few IBM sales and marketing weenies.  I have been promoting this idea to several members of the Informix user community and to any IBMer who will give me 5 minutes to badger them - and that includes Rob, Arvind, Jerry, Gary P., Cindy F., and others.  I prepared a PowerPoint slide for Rob Thomas outlining my ideas when he met with the IIUG BOD this winter (and BTW, one reason that I am, as I said in my first post, cautiously optimistic about IBM's new attitude is the effort that Rob made to drive all the way up directly from the airport in Philadelphia in the aftermath of a snow storm, after flying many hours back from South America, and before even going home to his family).  I also displayed it on the big screens at the Conference for two hours or so on Wednesday.  I'll try to attach it as a JPEG here.

So, what's the point?  IBM keeps saying that they have had trouble over the years, even, as they admit, when they have tried which they have not always done, positioning Informix Dynamic Server in the market place and especially against or alongside IBM DB2.  Note that I spelled out IDS because IBM - the quintessential purveyor of acronyms - has decided that outside of our community "IDS" has no meaning and we should just say "Informix" from now on.  I agree with the sentiment, but prefer to use the full product name rather than the brand name.

What the presentation above shows is that IBM should not be trying to position Informix Dynamic Server and DB2 at all.  IBM has a thing to be envied throughout the database software community - a database server product line.  No one else has or can possibly dream of having such a complete package to present to customers.  Trying to say to a customer that one of these products (and I'm not excluding DB2 here) is the one database server that the customer should purchase is the wrong approach.  Trying to develop a strategy under which salespeople are scripted to present one server to customers using or considering Oracle and another to customers using or considering MS SQL Server is simply misguided thinking.  I like to explain this with an analogy taken from one of the bastions of Informix Dynamic Server use - The Home Depot.

If you go into THD's Tool Hut looking for a hammer, do you expect to find a sales associate standing there who will look you over, evaluate your needs at a glance, and recommend the perfect hammer for your needs?  Will the sales associate be running through a script in his or her head that says "If the customer is a woman, recommend the lightweight, inexpensive, wooden hammer with the smallish head.  If the customer is wearing overalls and a tank top, recommend the Kevlar and graphite fiber handled hammer with the titanium composite head," etc.?  No!

You will find a wall rack displaying THD's hammer product line.  Many hammers.  Wooden, metal, fiberglass, and graphite handles.  Small heads and medium heads and massive heads.  Claw hammers and ball peen hammers.  Hammers from a local forge and hammers from national brands.  When you look at that impressive and possibly confusing display, one of two things will happen.  Either you will look upon that vast collection, evaluate size, construction materials, weight, and price, and pick one that you think meets your needs on your own, or, having stared starry-eyed for several minutes, you will be approached by a sales associate who will say something on the order of "Hi.  You look lost.  Looking for a hammer?"  "Yes", you will respond, "but there are so many to choose from...".  The associate will cheerily reply "No problem.  Somewhere up there is the perfect hammer for any particular job.  Tell me what you need it for and together we'll find the one best hammer or perhaps two hammers for you and the jobs you have."  Being retail hardware geeks and not sophisticated software salespeople, they won't be quite that eloquent, but you get the idea.

IBM is trying to be that nonexistent hammer pusher.  Instead they need to be that THD sales associate.  If they do that, no other RDBMS vendor can compete.  Oracle might be able to come in and offer MySQL Enterprise, Oracle Database 11g, and Oracle with TimesTen. Sybase can offer an embedded ISAM database and Sybase Adaptive Server Enterprise.  Microsoft can offer only Access and SQL Server.  There are HUGE gaps in those product lines.

Only IBM can offer Informix Standard Engine, Informix Online 5.20, Informix Dynamic Server Workgroup Edition, Informix Dynamic Server, IBM DB2, and Informix Dynamic Server or IBM DB2 with the SolidDB Cache.  Unlike Oracle's "Editions" the five distinct products IBM has in its tool corral represent four distinct database engines with different characteristics and capabilities.  Standard Engine is the database that created Informix.  It started out as the embedded database engine delivered with the 4GL development language.  It was so good for real business database applications that it had to become a separate product. It is still a great database engine for smallish OLTP databases and while it does not scale to massive amounts of data (though it can certainly handle that physically) it does scale to large numbers of users and it is FAST for the simple queries these users require.  Great engine for students, startups and small businesses.  Great intro engine for VARs and ISVs.

OnLine was the first RDBMS to permit taking an archive while the engine was actively being accessed and updated, hence the name OnLine.  This is still an ideal database engine for medium sized OLTP databases and still the same fast engine that put Informix on the map as a competitor to Oracle.  It scales to many users and to fairly large databases, being able to handle up to about 200TB of data total and single tables up to 32GB.

That brings us to the Informix Dynamic Server "Editions".  Workgroup Edition and Enterprise Edition are the same codebase, but WE has throttles to limit the amount of resources you can access and some features are reserved for licensees of EE.  None of the competing databases (DB2 is NOT competition, it's part of the strategy - but the following holds for DB2 as well) can handle more data, process data faster, offer more flexibility, use fewer resources to get the job done, or require less maintenance and monitoring (as much as the latter is one of my own focuses) than Informix Dynamic Server.  For the medium sized business, for data marts, and for small data warehouses Informix Dynamic Server Workgroup Edition is ideal.  For massive OLTP databases and even medium sized data warehouses Informix Dynamic Server Enterprise Edition is unparalleled.  If you want to embed business logic into the engine itself there really is only one choice, because only Informix Dynamic Server, of all of the players, can make that work efficiently and seamlessly.

All of the Informix branded database servers are embeddable to an extent that others are not.  Sybase had to go back to the drawing board to develop that capability, which has been purpose-built into every Informix brand database server from the beginning.

During the Q&A on Wednesday morning, Inhi Cho dismissed MySQL because "No business is putting MySQL into real production service," or something very close to that (I wasn't recording).  Those of us out in the field know that this is just not true; even for big business, as Gumby points out, MySQL is a real part of their data infrastructure strategy.  And for startups and small businesses it is becoming a strategic database server.  Now, I do agree with her other comment that "MySQL cannot scale to handle a business as it grows" from a startup to a serious enterprise.  However, neither Informix Dynamic Server nor IBM DB2 can fill that need in the critical years when a business is too small to pay for an Enterprise Class database server.  But IBM has products that can.  Also, let me state this emphatically and with no fear of being contradicted: any application written for Informix Standard Engine will run unmodified on OnLine or Dynamic Server and, if it was compiled with a recent enough SDK, without even recompiling!  (OK, one exception is if the app embedded an SE style database name/path - and that should be fixed in future releases of SE in my opinion.)  Absolutely every application written to run on Informix OnLine will run unmodified against Dynamic Server without exception.  If it was compiled with an SDK released after 1994 it will even run without recompilation/relinking.  Forward compatibility has always been built into Informix branded database servers and their SDKs.  How's that for an upgrade path!

The chart above shows that for any market segment, there is an IBM RDBMS that fills the need.  The Orange represents database access types or patterns: OLTP versus Decision Support versus Data Mart versus Data Warehouse.  The Pink represents organization size or type: student developers versus business startups versus established small businesses versus medium sized businesses versus large businesses versus ISVs and VARs.  The Green represents the data store type: Relational versus Object-Relational.

Opposite all this is the Blue section at the top showing the Informix portion of the IBM DBMS product line and my suggestions for price-point positioning.  I do believe that DB2 LUW and SolidDB Cache (as well as the SolidDB embedded database) are part of this same single product line strategy; I did not include them only because I produced this to answer Rob's notion of an "Informix Product Strategy", so I concentrated on that.

That's it.  Comments from the field or from IBM are welcome.  Let's start a discussion around this.

Thursday, April 29, 2010

After the IIUG Conference

Just a quick word:

The International Informix Users' Group Conference for 2010 is over. The members of the IIUG Board of Directors are winding down and catching our breath. I'm posting my first-ever blog post to this, my new blog. It's weird.

Anyway, I'm posting during a break in the action at the IBM Customer Advisory Council meeting here in Overland Park KS. The conference was GREAT! Attendance was up noticeably from last year, with attendees from 22 or 24 countries (depending on who was counting and whether we count Kansas as a separate country). There were users here from as far away as Malaysia and Venezuela, and from all four corners and borders of the US.

IBM execs, support people, and technical folk have been incredibly receptive to and active in gathering our needs and requirements for product features, support, marketing, etc. It has been a very gratifying experience.

Overall, the atmosphere has been very positive which is a change from last year. Altogether a good conference. If you were not here this year, plan to attend next year!