IIUG Conference 2018

Looking forward to meeting in Washington DC next October!

Wednesday, December 14, 2011

Solution to a problem

I received an email this morning with a link to a LinkedIn post that pointed to someone's blog.  The blogger was proposing a solution to a problem someone had presented to him at work; he thought it rather elegant and posted it for the consumption and edification of the LinkedIn community.  The problem: in a single SQL statement, register an attendee for an event unless the attendee is already registered for an overlapping event.  The solution was, to me, overly complex and absolutely non-portable, since it takes advantage of Oracle's conditional INSERT statement.  So I said to myself: "How would I solve this better, and in a fairly portable fashion?"  Presented here is my solution:

Setup:
-- Table attendee( attendee_id serial, name char(40) );
-- Table event( event_id serial, title char(80), start_time datetime year to minute, end_time datetime year to minute, location char(10) );
-- Table attendance( attendee_id int, event_id int, register_time datetime year to fraction(3), attended boolean );

Assume all tables have the appropriate RI constraints and indexes in place.

I propose that a simple INSERT trigger (yes, an UPDATE trigger would also be appropriate, but I leave that to the reader as an exercise) is the correct, simple, and portable solution:

CREATE PROCEDURE attendee_check() REFERENCING NEW AS neu FOR attendance;

DEFINE clash INT;

       SELECT count(*)
       INTO clash
       FROM attendee AS a, event AS e1, attendance AS ae, event AS e2
       WHERE a.attendee_id = neu.attendee_id
         AND e1.event_id = neu.event_id
         AND ae.attendee_id = a.attendee_id
         AND e2.event_id = ae.event_id
         AND e1.start_time <= e2.end_time
         AND e1.end_time >= e2.start_time
         AND e2.event_id != e1.event_id;

       IF (clash > 0) THEN
             RAISE EXCEPTION -746, 0, "Overlapping Event Rejected!";
       END IF;

       RETURN;

END PROCEDURE;


CREATE TRIGGER attend_trig_i INSERT ON attendance
       REFERENCING NEW as neu
       FOR EACH ROW
       ( EXECUTE PROCEDURE attendee_check( ) WITH TRIGGER REFERENCES );
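
The heart of the check is the standard interval-intersection predicate in the WHERE clause: two events overlap exactly when each one starts at or before the other ends. A minimal sketch of that logic in Python (purely illustrative; the trigger above is the actual implementation):

```python
from datetime import datetime

def overlaps(start1, end1, start2, end2):
    """Two intervals intersect iff each starts at or before the other ends."""
    return start1 <= end2 and end1 >= start2

# Event 1 (09:00-10:00) and Event 3 (09:30-10:30) overlap;
# Event 1 and Event 4 (10:30-11:30) do not.
e1 = (datetime(2011, 12, 28, 9, 0),  datetime(2011, 12, 28, 10, 0))
e3 = (datetime(2011, 12, 28, 9, 30), datetime(2011, 12, 28, 10, 30))
e4 = (datetime(2011, 12, 28, 10, 30), datetime(2011, 12, 28, 11, 30))

print(overlaps(*e1, *e3))  # True
print(overlaps(*e1, *e4))  # False
```

Note that with <= and >=, as in the trigger, two events that merely share a boundary minute (one ends at 10:30, the other starts at 10:30) count as overlapping; use strict comparisons if back-to-back scheduling should be allowed.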



-- Attendees
insert into attendee values( 0, 'Art' );
insert into attendee values( 0, 'Fred' );
insert into attendee values( 0, 'Barney' );

-- Events
insert into event values( 0, 'Event 1',  '2011-12-28 09:00', '2011-12-28 10:00', 'Here');
insert into event values( 0, 'Event 2',  '2011-12-28 09:00', '2011-12-28 10:00', 'There');
insert into event values( 0, 'Event 3',  '2011-12-28 09:30', '2011-12-28 10:30', 'Elsewhere');
insert into event values( 0, 'Event 4',  '2011-12-28 10:30', '2011-12-28 11:30','Elsewhen');

-- Test:
> insert into attendance (attendee_id, event_id, register_time, attended ) values ( 1, 1, current, 'f' );

1 row(s) inserted.

> insert into attendance (attendee_id, event_id, register_time, attended ) values ( 1, 4, current, 'f' );
1 row(s) inserted.

> insert into attendance (attendee_id, event_id, register_time, attended ) values ( 1, 1, current, 'f' );
  268: Unique constraint (informix.attendance_pk) violated.

  100: ISAM error:  duplicate value for a record with unique key.
Error in line 1
Near character position 51
> insert into attendance (attendee_id,event_id,  register_time, attended ) values ( 1, 2, current, 'f' );

  746: Overlapping Event Rejected!
Error in line 1
Near character position 1
> insert into attendance ( attendee_id, event_id, register_time, attended ) values ( 1, 3, current, 'f' );

  746: Overlapping Event Rejected!
Error in line 1
Near character position 1
>
select * from attendance;

attendee_id    event_id register_time           attended

          1           1 2011-12-14 10:48:11.221        f
          1           4 2011-12-14 10:49:21.361        f

2 row(s) retrieved.

>

Simple, elegant, and, except for the details of the stored procedure syntax (and in some cases the trigger syntax), portable: it can be implemented in most RDBMSes, unlike the original solution.  The final advantage is that this solution will prevent overlapping registrations no matter who writes the application doing the insert, which is now a simple insert.  The application developer is not required to know in advance that he has to meet this requirement (as obvious as it should be ;-).




Tuesday, August 9, 2011

The best database for every purpose!

As a consultant I often am asked "What is the best database for everything we do?" Unfortunately, there is no easy answer for that, and I know this is not the response you all expect from me. Bear with me on this and I will explain.

The best database, and to be clear by 'database' I mean Database Management System, is not MySQL, PostgreSQL, Ingres, Oracle, Sybase, IBM DB2 branded products, IBM's Informix branded products, VoltDB, OpenSQL, Berkeley DB, Progress, Unify, MS SQL Server, or any of the literally hundreds of other RDBMS products on the market, whether free or for-cost. No single database product can serve all of the needs of any large organization, nor of most smaller organizations. That's a fact! While I think that IBM Informix Dynamic Server Ultimate Edition is the cat's meow of RDBMS products, the performance, feature, and innovation leader in my book, it is not best for every purpose in every organization.

So, the question remains out there: "What is the best database for ...?" What do I recommend? Because if my clients have to use a different database server product for each group of applications that it fits best, or even groupings where it is sometimes second best, they are still going to have to maintain, manage, and tune several products from several companies, won't they? The list of skill sets their DBAs will need will be very broad, and they will have to hire more DBAs than they would need if they could use just one product.

I've been thinking about this for a while, and I keep coming back to the marketing presentation I made to IBM two years ago, and to how appropriate it was to this question, even more so than to the question of how IBM should market its database products. If my clients could get the "right" database, or close enough, for every situation from a single vendor, and if that vendor could make the skills needed to manage all of the database products it markets sufficiently portable, that would reduce my clients' cost of ownership substantially, making this single-vendor solution even more attractive.

OK, so what product lines are out there? Oracle effectively has MySQL for very small projects and its Enterprise Class Oracle server in its various "editions". Sybase has an Enterprise class server and a mobile server. Ingres has only an Enterprise class server, but not much third party software support left. Progress is a good SMB server only. Same with Unify. PostgreSQL has many nice enterprise class features, but not all, and many organizations won't bet the ranch on an open source product that doesn't have corporate support behind it.  Microsoft has only SQL Server in its two editions. That brings us to IBM which, as I pointed out two years ago, has a plethora of database management systems appropriate for any size organization and any task:
  • C-ISAM and IMS for application level databases.
  • IBM Informix Standard Edition for small business.  Clean simple OLTP performance for simple schema databases serving up to several hundred users with zero maintenance.  This is the original Informix database server and it is still a viable product for the right organization and project.
  • IBM Informix OnLine for small to medium business.  Clean simple OLTP performance for simple schema databases up to several TB in size serving up to several thousand users with extremely low maintenance and outstanding reliability and uptime.  This is the RDBMS that revolutionized the industry by finally allowing one to archive a database without having to take it offline or block transactions.
  • IBM Informix Dynamic Server (Innovator, Growth/Choice, and Ultimate Editions) for:
    • Medium to large business
    • OLTP performance and better than five-9s reliability
    • Medium to large Data Warehouse
    • Large Data Mart and DSS applications
    • Object Relational features
    • Industry leading expandability and extendability
    • Industry leading uptime and reliability with near zero unplanned outage and very low planned outage requirements
    • Industry leading TCO requiring 1/2 to 1/4 the DBA support of other RDBMS products
    • Databases up to 128PB serving many thousands of concurrent users
    • Industry leading replication and clustering technology that scales better than the competition
    • Unmatched embeddability with configurable footprint
    • Heterogeneous clusters of hundreds of servers on different hardware and OS platforms without special backbones or infrastructure
  • IBM DB2 Mainframe for mainframe based applications
  • IBM DB2 LUW which fills most of the same niches as Informix Dynamic Server 
    • Strength in 3rd party application support
    • Very large data warehouses.
    • Distributed server
    • DB2 Pure Scale highly scalable homogeneous clustering
  • IBM Informix Ultimate Warehouse Edition with the Informix Warehouse Accelerator for accelerating complex data warehouse style queries by up to 300X without losing Informix's OLTP performance and features and without requiring expensive special purpose hardware.
  • Netezza provides performance even beyond Informix Ultimate Warehouse for dedicated DW applications that need ultimate speed.
So is IBM missing anything?  Yes!  It is missing a unified structure of management and maintenance tools that can cross and span the entire range of database products so that an organization's DBA skills can also span the product line.  Put that in place and two things happen:
  1. I have the answer to my clients' questions and
  2. IBM can approach the market with this now unified product line that no one can touch
Instead of producing a white paper entitled "Six reasons to move from Oracle to IBM", which should really have been entitled "Six reasons to move from Oracle to IBM DB2", they can produce a white paper entitled "Sixty reasons to move from Oracle to IBM" that highlights all of the benefits of using a unified database product line within your organization.  This makes IBM a one-stop-shop and makes my life as a consultant a whole lot easier.

Thursday, June 30, 2011

IBM Finally Wakes Up

It looks like IBM took my test and passed! On June 8, 2011, IBM, in cooperation with SmartGridNews.com, ran a webinar presentation to over 450 Energy Industry executives and technologists. The main thrust of the presentation was avoiding the pitfalls of implementing Smart Meter systems; however, more than 50% of the time was spent explaining how Informix is the only product that can handle the volume of data that smart meters produce. (See: http://www.smartgridnews.com/artman/publish/Video-Webinars/). Kudos to IBM's Richard Wozniak (an old Informixer, BTW) for a clear presentation.

Thursday, June 16, 2011

Dumb and Dumber

This is a quiz. Just like the SATs/GREs/ACTs, first we have the story, then the questions to see if you are paying attention:

Let's say I have a product that has a feature that a large segment of the market needs. Let's add to this scenario that there is no other product that can fill the particular need as well. There are competing offerings, but none will work as well, all will cost more to purchase than I will sell mine for, and all will cost the customer more to maintain and manage than using my product. Being a poor historian, I remember the bit of wisdom erroneously attributed to Ralph Waldo Emerson: "If you build a better mousetrap the world will beat a path to your door." (It was actually coined by Elbert Hubbard seven years after Emerson's death, as something Mr. Emerson might have said if he had thought of it.) Taking that sentiment to heart, I post a note in my blog, announce the product on my product's Facebook Wall and LinkedIn page, and wait for the sales to roll in.

When that doesn't work I hire some new sales people, move others from other products, and shuffle a successful sales leader from another product line over to head up a "SWAT Team" to go talk to all of my existing customers and try to get them to start using this peachy feature so I can generate some PR success stories about them that might prompt new customers to take a look at my product and this nifty feature.

Question 1:
The most likely outcome of this scenario is:
a. I will sell thousands of new licenses for my product in just weeks.
b. I may get a few existing customers to begin using this neat feature, but it will generate no new net licenses.
c. Nothing.

Answer: [ ]

Question 2:
The best word to describe me as a business man is:
a. Entrepreneur
b. Genius
c. Market leader.
d. Idiot

Answer: [ ]

Question 3:
The response from existing customers to my efforts to get them to use the nifty feature will likely be:
a. "Hey, this is really cool and fast and saves storage too! Thanks! When can I shoot my PR video?"
b. "Dude! We don't do that kind of stuff here, so I have no need for this nifty feature. Sorry."
c. "I know all about this feature, have for years. Don't need it. Go away!"

Answer: [ ]

Thursday, May 5, 2011

AUS -versus- dostats

Do you like the idea of using the task manager to take care of update statistics tasks, but prefer dostats to AUS (Auto Update Statistics) in 11.xx?  No problem: for several releases now, dostats_ng.ec has supported creating an AUS-like refresh function.

For a while now I have noticed that clients that are trying to use AUS to manage their statistics on larger databases are experiencing problems with the system catalogs and some tables having data distributions that are out-of-date or insufficiently detailed.  Many are going back to using dostats with our help.

In this entry I am going to show you how to completely replace AUS with dostats.  The steps are simple:
  1. Get utils2_ak and build and install dostats (by default it is now built from the next generation dostats_ng.ec source).
    1. Get utils2_ak: This one is easy.  Go to the IIUG Software Repository (www.iiug.org/software) and download the package.
    2. Build the package: Usually easy; on Linux, and on any platform with GNU Make and ESQL/C configured to use GCC, just type 'make'.  The file BUILDING and the notes in the makefile (Makefile.nognumake) explain what to do if you have to use UNIX make and the native cc compiler on various platforms.  This is also easy; it is usually a 5 minute task to modify the makefile for your platform - don't forget to modify the makefile in myschema.d if you want to build myschema as well.  If you are on AIX, you will also have to get my portable ar utility "ar2" and compile and use that to extract the source for myschema from its System V/POSIX format ar archive file.  If you have trouble - email me.
    3. Install:  There is an 'install' target in the makefile, so you just have to edit the target path into which the files will be installed and "make install".
  2. Disable the AUS tasks in the task manager.
    1. Easy peasy.  You can do this in OAT, or just go into dbaccess or OAT's SQL editor and run:
      UPDATE sysadmin:ph_task set tk_enable = 'f' where tk_name matches 'Auto *';
  3. Install new tasks.
    1. There are two tasks, the refresh and the evaluator.  The refresh is VERY easy because dostats takes care of that for you.  Configuring the evaluator is a bit trickier.  Fortunately, I've scripted the whole shebang for you.  The setup_stats.ksh script below will take care of everything.
    2. There are three scripts below: setup_stats.ksh, evaluate_stats.ksh, and initial_stats.ksh.  Just install all three scripts in your path and make two edits to customize them for your environment:
      1.  In setup_stats.ksh, change start_time to the time you want the refresh task to kick off, and estart_time to the time you want the next day's evaluator task to run.  Normally you want the evaluator to run after the end of a normal day's processing cycles but before the refresh task.  The refresh task should probably run before large end-of-day processes to optimize their runs.  The evaluator takes about two minutes to process a 2000 table database.
      2. Change the location of the environment setup script you use in all three scripts.
      • There is a line in each file, below the comment:
      •     # Environment "dot" script
      • Change the line that follows to execute your own environment file, which must give access to the instance and must also include the bin directory containing dostats and the scripts themselves in its PATH.
    3. All three scripts take two arguments: the name of the database you are scheduling, and the path to the directory into which log files for the runs should be written.  The setup_stats.ksh script installs a stored procedure in the target database that runs the evaluate_stats.ksh script at the appropriate time; creates a table in sysadmin, dostats_stmts, to hold the set of commands for the refresh to run; creates the SPL procedure that executes those refresh commands and installs it as a task to run daily at the appropriate times; and creates an initial set of commands for the first run.
    4. As configured the evaluator script will use whatever thresholds for aging and browsing you have already set in OAT for AUS to use.  You can modify those settings in OAT or by editing the values stored in the tables in sysadmin, or you can edit the evaluate_stats.ksh script and replace --aus-thresholds with the -a -A [days] and -b -B [pct_chng] parameters.
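
The evaluator's decision rule has the same shape as AUS's thresholds: a table's distributions are rebuilt when they are older than the aging threshold or when enough of the table's rows have changed. A hypothetical sketch of that rule in Python (names and default thresholds here are illustrative, not dostats internals):

```python
def needs_refresh(age_days, pct_rows_changed,
                  max_age_days=30, max_pct_change=10.0):
    """Flag a table whose distributions are stale by age or by churn.

    age_days         -- days since the distributions were last built
    pct_rows_changed -- inserts+updates+deletes since then, as a percent
                        of the table's row count
    The default thresholds are placeholders; with --aus-thresholds,
    dostats reads the real ones from the AUS settings in sysadmin.
    """
    return age_days > max_age_days or pct_rows_changed > max_pct_change

print(needs_refresh(45, 2.0))   # stale by age
print(needs_refresh(5, 25.0))   # stale by churn
print(needs_refresh(5, 2.0))    # fresh; skip this table
```

Either condition alone is enough to queue the table, which is why priming every table to the same starting point (see initial_stats.ksh below) keeps the aging threshold meaningful.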


  • The third script, initial_stats.ksh, is for unconditionally wiping your existing distributions clean and creating a complete new set using dostats.  This will "prime the pump" for the aging threshold so that every table has the same initial starting point for its stats.  Run it once initially during a quiet time on your system, and again, say, monthly or quarterly, just to get everything back on an even keel periodically.
    #! /usr/bin/ksh
    ##################################################################
    # setup_stats.ksh - Install a procedure to generate updated data
    #     distributions for the named database.
    #
    # Written by: Art S. Kagel, Advanced DataTools Corp.
    #
    ##################################################################
    dbs=$1
    logdir=$2 

    export start_time='03:00:00'
    export estart_time='00:00:00'
    # Environment "dot" script
    . ~informix/ADTC/aliases.art
    #
    # Install the initial evaluator procedure and populate the dostats_stmts table with
    # an initial set of commands.  Also schedules the refresh job to run at 3h
    #
    outfile=$logdir/setup_stats.$dbs.$(date +%Y.%m.%d.%H.%M.%S).out
    {
        dostats -d $dbs -Q 100 -w 10 --aus-thresholds --isolation d -m --procedure dostats_refresh_$dbs --schedule --frequency d --first-time $start_time 2>&1
    #
    # Install evaluator task to run at midnight daily
    #
    dbaccess -e $dbs - <<EOF

    create procedure evaluate_${dbs}_stats();
    define cmd char(200);
    let cmd = '/home/informix/ADTC/bin/evaluate_stats.ksh $dbs $logdir &';
    system cmd;
    end procedure;
    database sysadmin;
    delete from ph_task where tk_name = 'dostats_evaluator_${dbs}_task';
    INSERT INTO ph_task( tk_id, tk_name, tk_type, tk_group, tk_description, tk_execute,
        tk_start_time, tk_stop_time, tk_frequency, tk_Sunday, tk_Monday,
        tk_Tuesday, tk_Wednesday, tk_Thursday, tk_Friday, tk_Saturday ) VALUES
    ( 0, 'dostats_evaluator_${dbs}_task', 'TASK', 'DOSTATS', 'dostats run for evaluate_${dbs}_stats',
    'EXECUTE PROCEDURE $dbs:evaluate_${dbs}_stats();', '$estart_time', '10:00:05', ' 1 00:00:00', 't', 't', 't', 't', 't', 't', 't' );
    EOF
    }  | tee $outfile
    echo "Report also written to: $outfile"


    ================================================================================
    #! /usr/bin/ksh
    ##################################################################
    # initial_stats.ksh - unconditionally generate a full set of data
    #     distributions for the named database.
    #
    # Written by: Art S. Kagel, Advanced DataTools Corp.
    #  
    ##################################################################
    dbs=$1
    logdir=$2
    # Environment "dot" script
    . ~informix/ADTC/aliases.art
    outfile=$logdir/initial_stats.$dbs.$(date +%Y.%m.%d.%H.%M.%S).out
    {
        dostats -d $dbs -E -Q 100 -w 10 -n 100 --aus-thresholds --time-display --drop-distributions --isolation d -m  2>&1
    }  | tee $outfile
    echo "Report also written to: $outfile"
    ============================================================================
    #! /usr/bin/ksh
    ##################################################################
    # evaluate_stats.ksh - Install a procedure to generate updated data
    #     distributions for the named database.
    #
    # Written by: Art S. Kagel, Advanced DataTools Corp.
    #
    ##################################################################
    dbs=$1
    logdir=$2
    # Environment "dot" script
    . ~informix/ADTC/aliases.art
    {
        dostats -d $dbs -Q 100 -w 10 --isolation d -m --procedure dostats_refresh_$dbs  2>&1
    }  | tee $logdir/evaluate_stats.$dbs.$(date +%Y.%m.%d.%H.%M.%S).out