Cloud computing and database artices

This has been a very exciting year for cloud computing - especially as relates to database technologies.  Amazon announced SimpleDB, Microsoft announced SQL Server Data Services and Google unveiled their big-table based Google App Engine.

For the first time in a long time, it looks like database technologies are a hot topic again. 

I've written some articles on cloud computing for Database trends and applications.  One on Amazon AWS,  one on Google App Engine and a two part article on next generation database technologies.  The first part of the article discusses the rise of the cloud database and the second part looks at Mike Stonbraker et al's proposals for next generation DBMS.

Read More

Uploading Oracle data to Google App Engine

Over the past 6 months, we've seen 3 significant cloud computing database announcements:

  • SimpleDB from Amazon
  • BigTable inside  Google App Engine
  • Microsoft SQL Server Data Services (SSDS)

These cloud databases provide non-relational simple but scalable data services to applications that want to live in a public cloud.  While they currently exist in a rapidly expanding hype-bubble, it may well be that cloud databases represent the biggest paradigm shift in DBMS since the relational revolution in the 80s.

Anyway, I was lucky enough to get a Google App Engine account and have been playing with the data access API.  I wanted to load up some data I had in Oracle tables and after a bit of mucking around decided I'd write a general purpose loader which others might fine useful.

Read More

Using _spin_count to reduce latch contention in 11g

Back in 2003, I published the results of experimenting with various values of _spin_count to improve throughput on latch congested systems. You can find the original paper here.

The study involved varying spin count on an Oracle 8.1.6 database suffering from heavy cache buffer chains latch contention.  The database was running on a Windows 2000 server.  The results of this study confirmed that _spin_count can be very effective in improving the throughput of latch contention-bound systems.  In the test, throughput almost doubled when spin_count was raised from the default of 2000 to 10000.

I recently repeated these tests for Oracle 11g.  This time, the database was experiencing shared pool and library cache latch contention and was running on RHEL 5.0.  The results will be published soon in my contention series at ToadWorld.   To summarize however,  increasing spin_count in 11g was equally effective in reducing latch contention.  As _spin_count increased, latch  waits reduced and throughput increased, up until CPU saturated, at which point no further improvements were achieved.

Read More

Accessing Oracle from Powershell

I've used perl extensively - and other scripting languages from time to time - to write little unitities or simulations for Oracle.  Perl is available on windows, but it doesn't provide easy access to all the Windows API and .NET utilities (such as accessing WMI counters).  For that PowerShell would be the best choice.

It's not entirely obvious how to connect to oracle from Powershell.  Here's how I did it.

Read More

D.I.Y. MySQL 5.1 monitoring

I wrote recently about using events and the new processlist table in MySQL 5.1 to keep track of the number of connected processes.  Although having the PROCESSLIST available as an INFORMATION SCHEMA table is usefull, it seemed to me that having SHOW GLOBAL STATUS exposed in a similar fashion would be far more useful.  So at the MySQL UC last year, I asked Brian Aker if that would be possible.  I know how many enhancement requests MySQL has to deal with, so I was really happy to see that table appear in the latest 5.1 build (5.1.14 beta). 

This table, together with the EVENT scheduler, lets us keep track of the values of status variables over time without having to have any external deamon running.  This won't come anywhere near to matching what MySQL have made avalable in Merlin, but still could be fairly useful.  So lets build a simple system using events to keep track of "interesting" status variables....

Read More

Unit testing stored procedures with Junit

Anyone who has used an automated unit testing framework such as Junit knows just how life-changing an automated test suite can be.   Once you've  experienced validating that recent changes have not broken old code, or discovering subtle bugs via junit that would otherwise have remained undetected , you naturally want to have this capability in all your programming environments.

Guisseppe Maxia has written a few stored procedure snippets to assist with automated unit testing of MySQL routines.  Unfortunately, the MySQL stored procedure language itself does not have the necessary abilities to fully implement the sort of unit testing we would like.  In particular, the inability for a stored procedure to capture the result sets generated by another stored procedure prevents a stored procedure from fully unit testing another. 

So I decided that - for me - Junit offered the best solution.  I created an extension to the Junit class that contains some assertions useful when unit testing stored procedure and extended my PlainSql JDBC wrapper classes to allow a stored procedure to return a single object that contains all its result sets and the values of OUT or INOUT parameters.   This object can be made the target of the various assertions.

If you're not familiar with Java and/or, you might feel that this solution is not for you.  However, the amount of java programming you need to do is very minimal and GUI environments such as Eclipse make it very easy to set up.  The rest of this article is available here;  it contains instructions, my Java classes and examples for setting up Junit test cases for MySQL stored procedures.

Read More

MySQL stored procedures with Ruby

Ruby's getting an incredible amount of attention recently, largely as the result of Ruby on Rails.  I've played a little with Ruby on Rails and it certainly is the easiest way I've seen so far to develop  web interfaces to a back-end database.

At the same time,  I've been shifting from perl to Java as my language of choice for any serious database utility development.  But I still feel the need for something dynamic and hyper-productive when I'm writing something one-off or for my own use.  I've been playing with Python, but if Ruby has the upper ground as a web platform then maybe I should try Ruby. 

So seeing as how I've just finished the MySQL stored procedure book, first thing is to see if I can use Ruby for MySQL stored procedures.

Database - and MySQL - support for Ruby is kind of all over the place.  There's a DBI option (similar to perl) which provides a consistent interface and there's also native drivers.  For MySQL there are pure-ruby native drivers and drivers written in C.  Since the DBI is based on the native driver, I thought I'd try the native driver first.  The pure-ruby driver gave me some problems so I started with the C driver on Linux (RHAS4). 

Retrieving multiple result sets

The main trick with stored procedures is that they might return multiple result sets. OUT or INOUT parameters can be an issue too, but you can always work around that using session variables. 

If you try to call a stored procedure that returns a result set, you'll at first get a "procedure foo() can't return a result set in the given context error".  This is because the CLIENT_MULTI_RESULTS flag is not set by default when the connection is created.  Luckily we can set that in our own code:

dbh=Mysql.init
dbh.real_connect("127.0.0.1", "root", "secret", "prod",3306,nil,Mysql::CLIENT_MULTI_RESULTS)

The "query" method returns a result set as soon as it is called, but I found it easier to retrieve each result set manually, so i set the query_with_result attribute to false:

dbh.query_with_result=false

The next_result and more_results methods are implemented in the Ruby MySql driver, but there's some weird things about the more_results C API call that causes problems in python and PHP.  In Ruby, the more_results call returns true whether or not there is an additional result.   The only reliable way I found to determine if there is another result set is to try and grab the results and bail out if an exception fires (the exception doesn't generate an error code, btw);
      
    dbh.query("CALL foo()")
    begin
      rs=dbh.use_result
    rescue Mysql::Error => e 
      no_more_results=true
    end

.
We can then call more_results at the end of each rowset loop.  So here's a method that dumps all the result sets from a stored procedure call as XML using this approach (I'm know the Ruby is probably crap, it's like my 3rd Ruby program):

def procXML(dbh,sql)
  connect(dbh)
  no_more_results=false
  dbh.query(sql)
  printf("<?xml version='1.0'?>\n");
  printf("<proc sql=\"%s\">\n",sql)
  result_no=0
  until no_more_results
    begin
      rs=dbh.use_result
    rescue Mysql::Error => e 
      no_more_results=true
    end 
     if no_more_results==false
      result_no+=1
      colcount=rs.fetch_fields.size
      rowno=0
      printf("\t<resultset id=%d columns=%s>\n",result_no,colcount)
      rs.each do |row|
        rowno+=1
        printf "\t\t<row no=%d>\n",rowno
        rs.fetch_fields.each_with_index do |col,i|
          printf("\t\t\t<colvalue column=\"%s\">%s</colvalue>\n",col.name,row[i])
        end
        printf("\t\t</row>\n")
      end
      printf("\t</resultset>\n");
      rs.free
      dbh.next_result
    end
  end
  printf("</proc>\n")
end

No C programming required!

Whew!  No need to hack into the C code.  So you can use MySQL stored procedures in Ruby with the existing native C driver. The problem is that the C driver is not yet available as a binary on Windows yet and trying to compile it turns out to be beyond my old brain (and yes, I used minGW and all the other "right" things).   Hopefully a copy of the MySQL binary driver it will be available in the one-click installer Ruby installer eventually.

The above code doesn't work using the pure-Ruby driver on windows by the way -  there's an "out of sequence" error when trying to execute the stored proc.  I might hack around on that later (at the moment I'm 35,000 ft with 15 minutes of battery left on the way to the MySQL UC).  For now if you want to use MySQL stored procedures in a ruby program on windows I can't help.

Note that ruby seems to hit a bug that causes MySQL to go away if there are two calls to the same stored proc in the same session and the stored proc is created using server-side prepared statements.  Fixed soon hopefully, but for now if you get a "MySQL server has gone away error" you might be hitting the same problem.   Wez posted on this problem here.

I suppose the end of this investigation will probably be to see if there's any way to use stored procedure calls to maintain a Rails AcitveRecord object.  Not that I think you'd necessarily want to, but it would probably be a good learning exercise.

Read More

Plain Old SQL Statements in Java

Lot's of Java developers want to avoid writing SQL or even avoid directly accessing with a relational data store.  I, on the other hand, want to use SQL in my Java programs (which are mostly database utilities) but I want it to be as easy to use SQL in Java as it is in Python, Perl or PHP. 

Recently, I decided to use Java rather than Perl as my language of choice for database utilities.  Perl is quick to write, but tends to be easier to write than to read.  I played with Python for a while, but in the end decided that by using Java I would have an easier time distributing by stuff and it could more easily be re-used inside of my company, which has Java developers but not many Perl people.  So developed a set of JDBC wrappers that would let me use SQL easily within my utilities.   

My Design :

  • Require no configuration files (no XML mappings for instance).  All the data access logic should be right there in the code.  Although I did end up allowing SQL statement to be held in an XML file just to avoid the whole messy string handling involved in really long SQL statements.   
  • Easy processing of result sets, leveraging Java collections (which were not around when JDBC was first speced).
  • Make it easiest to follow best practices.  For instance, make it very easy to use bind variables, re-use cursors, etc.
  • Allow interoperability with the underlying JDBC objects so that I could use these classes without worrying that I would run into a brick wall.
  • Work well with dynamic SQL.
  • Where appropriate, avoid some of the tedium involved with DML and DDL.
  • Cache prepared statements so that you don't have to worry about which prepared statements to keep open and when to close.
  • Be RDBMS neutral. 

This was a bit of a learning experience - had not programmed in Java for quite a while and I tried to follow best Java practice such as creating Junit tests, JavaDoc and ant builds.  If anyone's interested,  here's the latest versions:

I don't really expect anyone else to use this, posting it was I guess one of the ways to enforce a bit of discipline on myself as regards quality. However, it will be embedded in most of my database utilities so I guess it will see some use in our internal benchmarking routines and the like.

Read More

MySQL 5.1 events

I finally got around to working with the 5.1 scheduler.  I wanted to have a simple but non-trivial example, and when I saw Brian Akers post on the new processlist table, I thought of a useful little application:  I would submit an event that would summarize the users and their statuses at regular intervals so I could track user trends.

First off,  I needed to enable the scheduler by adding the following line to my configuration file:

event_scheduler=1

Now the scheduler is ready for action.  So I created a table to hold my process list details:

CREATE TABLE processhistory (h_timestamp DATETIME,
                             processcount INTEGER,
                             activecount INTEGER,
                             lockedcount INTEGER)$$

And then the DML to create an event:

CREATE EVENT evt_process_history
     ON SCHEDULE EVERY 2 MINUTE
     DO
BEGIN
    INSERT INTO processhistory (h_timestamp,processcount,
            activecount,lockedcount)
     SELECT NOW() AS h_timestamp,COUNT(*) AS processcount,
            SUM(active) AS activecount  ,
            SUM(locked) AS lockedcount
       FROM (SELECT CASE command WHEN 'Sleep' THEN 0 ELSE 1
                     END AS active ,
                    CASE state WHEN 'Locked' THEN 1 ELSE 0
                     END AS locked
               FROM information_schema.`PROCESSLIST` P) Q;
END$$

Every two minutes, the event summarizes the status of the sessions currently connected and stores them to the table.  I could use various tools to analze this data, but for convenience I used Excel with the ODBC driver to create a chart of activity:

Chart_1

Cool! Now I can keep track of active sessions over time, which could be useful. On the test database, there is a little ruby program that locks up a table needed by my java TP simulation, so we see those spikes of lock activity. I'm hoping that MySQL expose the SHOW STATUS command as a table as well, since we can't get at the contents of SHOW STATUS from within the stored program language.

Read More

Building ruby with Oracle and MySQL support on windows

If you did the setup neccessary to compile perl with MySQL and Oracle support (), you are well setup to do the same for ruby.  Why this should be so hard I don't know:  python produces very easy to install windows binaries, but if you want anything beyond the basics in perl and ruby you need to try and turn windows into Unix first. Sigh.

http://www.rubygarden.org/ruby?HowToBuildOnWindows explains the ruby build procedure.   I'm really just adding instructions for getting the ruby dbi modules for mysql and oracle.

Make sure mingw and msys are first in your path.

Enter the mingw shell:  sh

sh-2.04$ ./configure --prefix=/c/tools/myruby

sh-2.04$ make

sh-2.04$ make test

sh-2.04$ make install

Now, lets do ruby gems:

sh-2.04$ cd /tmp/rubygems
sh: cd: /tmp/rubygems: No such file or directory
sh-2.04$ cd /c/tmp
sh-2.04$ cd rubygems
sh-2.04$ export PATH=/c/tools/myruby/bin:$PATH
sh-2.04$ which ruby.exe
/c/tools/myruby/bin/ruby.exe
sh-2.04$ ls
rubygems-0.8.11
sh-2.04$ cd rubygems-0.8.11

sh-2.0.4$ unset RUBYOPT  #If you have cygwin this might be set
sh-2.04$ ruby ./setup.rb
c:\tools\myruby\bin\ruby.exe: no such file to load -- ubygems (LoadError)

Read More

Compiling DBD::mysql and DBD::Oracle on windows

Last week my laptop crashed and while installing the new one I decided to update my perl versions. I mainly use the DBD::mysql and DBD::Oracle modules and although I'm confortable building them on Linux/Unix, like most people I use the Activestate binaries on windows.

However it turns out that Oracle licensing changes now prevent Activestate from distributing an Oracle binary, so I was forced to build them from source. It wasn't easy, but now both the Oracle and MySQL modules are working. Here's the procedure in case it helps anyone.

Install Pxperl

Firstly, you probably want to move to the pxperl windows binaries. Pxperl support the familiar CPAN system for updates. Get Pxperl at www.pxperl.com. The installation should be straight forward.

I installed into c:\tools\pxperl

Install MinGW

You'll need a C compiler capable of building native windows binaries. I used the MinGW system. You can't use cygwin, although I believe that Cygwin might be capable of installing MinGW. Anyway, I got the MinGW system from http://www.mingw.org/. I couldn't use the auto-installer for firewall reasons, so I did a manual download and install.

Firstly, I unpacked the following .gz files into c:\tools\mingw:

  • gcc-java-3.4.2-20040916-1.tar.gz
  • gcc-objc-3.4.2-20040916-1.tar.gz
  • mingw-runtime-3.9.tar.gz w32api-3.5.tar.gz
  • binutils-2.15.91-20040904-1.tar.gz
  • mingw-utils-0.3.tar.gz gcc-core-3.4.2-20040916-1.tar.gz
  • gcc-g++-3.4.2-20040916-1.tar.gz

You probably don't need all of these, and of course the version numbers might be different by the time you read this.

Then I ran the following two executables

  • MSYS-1.0.10.exe
  • msysDTK-1.0.1.exe

...installing both into c:\tools\msys. You must make sure you provide the correct location for MinGW when prompted. Finally, MinGW installs it's own version of perl, so I removed that as well as the make.exe which is inferior.

I added both the bin directories to my path, which now starts something like this:  c:\mysql;c:\tools\msys\1.0\bin; c:\tools\mingw\bin; C:\tools\PXPerl\parrot\bin; C:\tools\PXPerl\bin

Installing DBD::Oracle

Now you can go into cpan (just type CPAN at the command line) and run "Install DBI".  That worked OK for me.

Then I ran "install DBD::Oracle".  That failed.  I can't remember the exact error, but it turns out that a trailing backslash in the include directory for the DBI doesn't work on Windows.  To fix that, run "configure_pxperl" and add an include for that directory in the "Include Directories" section.  For me, the directory was /tools/PXPerl/site/lib/auto/DBI , since I installed pxperl into the tools directory.

Installing DBD::Mysql

For some reason I thought this would be the easy part.  But it actually was really difficult.

In the end, it turns out you need to create your own version of mysqlclient.lib and manually link to that. Check out MySQL Bugs: #8906, for some more details.  Here's the steps that worked for me:

  1. run "install DBD::mysql" from the CPAN prompt
  2. You will get a whole lot of undefined symbol errors which will include the names of the normal mysql client API calls, suffixed with '@4' , '@0' , etc. Make a list of all of these.
  3. Add the missing symbols to the file include/libmysql.def.   
  4. Build your own libmysqlclient library with the following commands (from the directory just above your include directory):
  5. dlltool --input-def include/libmySQL.def --dllname lib/libmySQL.dll --output-lib lib/libmysqlclient2.a -k
  6. Go to the CPAN build area for the DBD-mysql,  for me that was: cd \tools\PXPerl\.cpan\build\DBD-mysql-3.0002
  7.   nmake realclean 
  8. perl Makefile.PL --libs="-L/mysql/lib -lmysqlclient2 -lz -lm -lcrypt -lnsl"
  9. nmake install

And - voila! - you should be OK. The only think you might need to do now is add the top level MySQL directory to your path.  DBD-Mysql wants to find "lib/mysql.dll" so you need to add the directory above that to your path.  I moved all the libraries to c:\mysql\lib and include files to c:\mysql\include, so I added to my path like this:

set PATH=c:\mysql;%PATH%

All done!

Seems to be working OK now for both Oracle and MySQL.  Much more difficult than installing the Activestate binaries but at least now that I'm working from source I can potentially fix bugs although having done it on Linux it's not for the faint hearted (or the incompentent in C++!)

Hopefully pxperl will gain in popularity and as it matures things will work as easily as on Linux.  That would be great.

Read More