More on the database flash cache

I'm eagerly awaiting my high-performance flash SSD (an Intel X25-E), but in the meantime I've played a bit more with the database flash cache on the cheap hardware setup described in my last post. Sometimes it can be useful to test new features on slow hardware, since you see phenomena that don't occur when everything is running at full speed.

I originally, naively, imagined that blocks would be copied into the flash cache by the Oracle server process - e.g., that when I read a block from disk, I deposit it in both the buffer cache and the flash cache. However, upon investigation it appears that blocks are moved from the buffer cache to the flash cache by the DBWR as they are about to be flushed from the buffer cache.

This is, of course, a much better approach. The DBWR can write to the flash cache asynchronously, so user sessions get the benefit - less time reading from magnetic disk - without having to wait while blocks are inserted into the flash cache.
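
If you want to watch this happening, the flash cache statistics in V$SYSSTAT give a rough picture of the DBWR's flash cache activity. Here's a minimal sketch, run on my 11.2.0.1 test instance (statistic names may differ between versions):

SELECT name, value
  FROM v$sysstat
 WHERE name LIKE 'flash cache%'
 ORDER BY name;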

 

So the lifecycle of a block looks something like this:

Read More

Using the Oracle 11GR2 database flash cache

Oracle just released a patch which allows you to use the database flash cache on Oracle Enterprise Linux even if you don't have Exadata storage. The patch is obscurely named:

  •    8974084:META BUG FOR FLASH CACHE 11.2PL BUGS TO BACKPORT TO 11.2.0.1 OEL

Once you install the patch you can use any old flash device as a database flash cache. Below I've documented some initial dabbling on a very old server and a cheap USB flash device. The results are not representative of the performance you'd get on quality hardware, but they are still interesting, I think.
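
For reference, enabling the cache is just a matter of pointing the two flash cache parameters at the device and restarting. A minimal sketch - the file location and size below are invented, and DB_FLASH_CACHE_FILE can't be changed without a restart:

ALTER SYSTEM SET db_flash_cache_file = '/mnt/usbflash/flash_cache.dat' SCOPE=SPFILE;
ALTER SYSTEM SET db_flash_cache_size = 4G SCOPE=SPFILE;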

Read More

Performant Oracle programming: perl

In the last installment, we looked at the fundamentals of efficient Oracle programming with C#. In particular, we saw how to use bind variables and how to perform array processing. Now let's take a look at perl.

I have a strong affection for perl:  I started using perl somewhere around 1992 and did some early hacking to allow the Oracle interface of the day - oraperl - to use stored procedures.  My very first true performance monitoring tools were all written using perl.  Perl has always had good support for Oracle and it's possible to write very efficient and powerful Oracle database utilities in perl.

You can get the perl program that implements the examples in this posting here

Read More

Oracle performance programming: .NET

In Oracle SQL High Performance Tuning, I included an appendix in which I outlined how to achieve good cursor management, bind variables and array processing in the major languages of the day. I had intended to do the same in Oracle Performance Survival Guide, but I ran out of time and space in the book. So the examples in the book are generally in Java or PL/SQL only.

I wanted to get up to date on the various languages, some of which (like Python) I haven't used for a while and others (Ruby for instance) I've never used with Oracle.  So I thought I'd kill two birds with one stone by writing a series of blog posts on how to program efficiently in the various languages.

There are lots of best practices in each language, but I think most of us would agree that you at least need to know how to do the following:

  1. Use bind variables appropriately.
  2. Ensure you are performing array fetch when retrieving more than one row
  3. Ensure that you use array insert when doing bulk inserts 

The ways of doing this are different in each language.  In Chapter 6 of Oracle Performance Survival Guide,  I describe these techniques and their performance implications, using Java and sometimes PL/SQL examples.  In this series I'll go through the techniques for other languages. 
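
As a PL/SQL refresher before we get into the client languages, here's a minimal sketch of all three practices against hypothetical CUSTOMERS and CUSTOMERS_ARCHIVE tables: PL/SQL variables become bind variables automatically, BULK COLLECT with LIMIT performs the array fetch, and FORALL performs the array insert.

DECLARE
   TYPE cust_tab_t IS TABLE OF customers%ROWTYPE;
   l_rows   cust_tab_t;
   l_region customers.region_id%TYPE := 10;   -- bound automatically in the cursor below
   CURSOR cust_csr IS
      SELECT * FROM customers
       WHERE region_id = l_region;
BEGIN
   OPEN cust_csr;
   LOOP
      FETCH cust_csr BULK COLLECT INTO l_rows LIMIT 100;   -- array fetch, 100 rows at a time
      EXIT WHEN l_rows.COUNT = 0;
      FORALL i IN 1 .. l_rows.COUNT                        -- array insert
         INSERT INTO customers_archive VALUES l_rows (i);
   END LOOP;
   CLOSE cust_csr;
   COMMIT;
END;
/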

Let's start with ODP.NET, which is Oracle's ADO.NET driver for .NET languages such as C#, VB.NET and PowerShell (see here for a posting on using Oracle with PowerShell).

Read More

Oracle OpenWorld again!

Along with tens of thousands of other Oracle-types,  I'll shortly be in San Francisco for Oracle Open World 2009.  This will be my 12th OOW!    

We'll be giving away signed copies of Oracle Performance Survival Guide at the Quest booth (#335) on Wednesday October 14th at 1:30pm. I'll also give away a few copies at my two Oracle Develop presentations:

- Session ID S308361: High Performance PL/SQL  Oct. 12, 5:30 p.m., Hilton Hotel, Golden Gate 6/7

- Session ID S308362: Oracle Performance by Design  Oct. 13, 4 p.m., Hilton Hotel, Franciscan A/B

The rest of the time I'll be attending sessions and catching up with folk.  If you'd like to try and catch me, either direct message me at @guyharrison or email me at guy.a.harrison@gmail.com

 

Oracle Performance Survival Guide available as PDF

My new book Oracle Performance Survival Guide just became available for PDF download at Informit.com!

The print version is available for pre-order at Informit or Amazon and elsewhere, and is due in book stores on October 14th. There will be copies available at the Oracle Open World book store next week, however, and we'll be having a book signing with giveaways at the Quest Software booth, probably on Wednesday. I haven't seen a print copy myself yet, but I was probably the first to buy a digital copy :-). It will also soon be available on Kindle and on the Safari on-line bookshelf.

This is the most challenging book I've written, and the most extensive in terms of scope.  I tried to write a book that would be accessible across a broad range of expertise, and which would systematically address most major aspects of Oracle RDBMS performance tuning.   Overall, I'm satisfied that it's a worthwhile contribution to Oracle performance literature.   

You can see the full table of contents at the book's Informit webpage, but here's the short chapter list to give you an idea:

 

PART I: METHODS, CONCEPTS, AND TOOLS
1 Oracle Performance Tuning: A Methodical Approach
2 Oracle Architecture and Concepts  
3 Tools of the Trade

PART II: APPLICATION AND DATABASE DESIGN
4 Logical and Physical Database Design
5 Indexing and Clustering
6 Application Design and Implementation

PART III: SQL AND PL/SQL TUNING
7 Optimizing the Optimizer
8 Execution Plan Management
9 Tuning Table Access
10 Joins and Subqueries
11 Sorting, Grouping, and Set Operations
12 Using and Tuning PL/SQL
13 Parallel SQL
14 DML Tuning

PART IV: MINIMIZING CONTENTION
15 Lock Contention  
16 Latch and Mutex Contention  
17 Shared Memory Contention

PART V: OPTIMIZING MEMORY
18 Buffer Cache Tuning  
19 Optimizing PGA Memory
20 Other Memory Management Topics

PART VI: IO TUNING AND CLUSTERING
21 Disk IO Tuning Fundamentals
22 Advanced IO Techniques
23 Optimizing RAC

Scripts, examples and packages are available for download here

MTS + AMM + BULK COLLECT = Trouble!

Most databases don't run with shared servers - a.k.a. Multi-Threaded Servers or MTS - nowadays. While reducing the number of server processes can reduce overall memory demand and sometimes certain forms of contention, the drawbacks - particularly delays caused when all servers are busy - are generally regarded as greater than the advantages.

MTS becomes downright dangerous when Automatic Shared Memory Management (ASMM) or Automatic Memory Management (AMM) is in place.  When you use MTS and AMM together, PL/SQL programs that try to create large collections can effectively consume all available memory with disastrous consequences. 
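
The problematic pattern is easy to reproduce. Under dedicated servers, a PL/SQL collection lives in the session's private PGA; under shared servers, session (UGA) memory is allocated from the SGA (the large pool, or the shared pool if there is no large pool), so a big collection competes directly with the buffer cache and shared pool. A sketch of the dangerous pattern, against a hypothetical SALES_HISTORY table:

DECLARE
   TYPE sales_tab_t IS TABLE OF sales_history%ROWTYPE;
   l_sales sales_tab_t;
BEGIN
   -- Unbounded BULK COLLECT: the entire table is loaded into a PL/SQL collection.
   -- Under shared servers this memory comes out of the SGA.
   SELECT *
     BULK COLLECT INTO l_sales
     FROM sales_history;
   -- ... process l_sales ...
END;
/

Fetching with a LIMIT clause inside a loop keeps the collection to a bounded size and avoids the worst of the problem.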

Read More

More on the parallel execution queue

In the last post, I looked at how the 11GR2 parallel execution queue worked.  Essentially the holder of the JX enqueue is the next in line for parallel execution and others in the queue are waiting on the JX enqueue.  
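
Before we get to the full script below, a quick way to see who is queued is simply to look for sessions waiting on the JX enqueue. A minimal sketch (the exact event name may vary between versions):

SELECT sid, sql_id, event, seconds_in_wait
  FROM v$session
 WHERE event LIKE 'enq: JX%';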

I thought it would be useful to write a SQL statement to list the parallel (PQO) statements running, and those waiting. The script is here: it joins V$SESSION, V$WAIT_CHAINS, V$PX_SESSION and V$SQL to show the executing parallel SQLs and the SQLs holding or waiting on the JX enqueue.

Here's some sample output:

Read More

The parallel_degree_policy parameter in 11gR2

Oracle 11G release 2 introduced a number of significant tweaks to parallel SQL. One of the most notable was the new parameter PARALLEL_DEGREE_POLICY.

The PARALLEL_DEGREE_POLICY default setting of MANUAL  results in  Oracle 11G release 1 behavior.  A setting of AUTO results in the following new behaviors:

 

  • The Degree of Parallelism (DOP) may be calculated based on the types of operations in the SQL statement and the sizes of the tables.  The DOP for a sort of a massive table might be set higher than that of a small table, for instance. 
  • If the requested or required DOP is currently not possible because parallel servers are busy, then Oracle will delay statement execution rather than downgrading or serializing the SQL. 
  • Oracle parallel slaves may use buffered IO rather than direct IO:  Oracle calls this (misleadingly I think) "in-memory parallel execution". 

There's a mid-range setting of LIMITED that enables automatic DOP only. 

In this post I'm going to look at the last two changes: deferred execution of parallel SQL and buffered parallel IO.
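
For reference, the parameter can be set at either the system or the session level; a quick sketch:

ALTER SYSTEM  SET parallel_degree_policy = AUTO;     -- auto DOP, statement queuing, in-memory PX
ALTER SESSION SET parallel_degree_policy = LIMITED;  -- auto DOP only
ALTER SESSION SET parallel_degree_policy = MANUAL;   -- 11gR1 behavior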

Read More

Columnar compression in 11gR2

We often think of compression as a trade-off between performance and storage: compression reduces the amount of storage required, but the overhead of compressing and decompressing makes things slower. However, while there is always some CPU overhead involved in compression, the effect on table scan IO can be favourable, since if a table is reduced in size it will require fewer IO operations to read it.

Prior to 11g, table compression could only be achieved when the table was created, rebuilt or when using direct load operations.  However, in 11g, the Advanced Compression option allows data to be compressed when manipulated by standard DML.   Compression becomes even more attractive in 11gR2 because we can use columnar compression to get much higher compression levels than were previously possible. 
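
The syntax itself is simple; here's a sketch using a hypothetical SALES_ARCHIVE table. (As I understand it, the columnar options require Exadata storage, while COMPRESS FOR OLTP requires the Advanced Compression option.)

CREATE TABLE sales_archive
   COMPRESS FOR QUERY HIGH            -- columnar compression, optimized for scan-heavy workloads
   AS SELECT * FROM sales;

-- Other 11gR2 settings include:
--   COMPRESS FOR OLTP                -- row compression maintained during conventional DML
--   COMPRESS FOR ARCHIVE HIGH        -- maximum columnar compression, for rarely accessed data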

Read More

Using hints with join methods

Jonathan Lewis often says that hints are generally unadvisable on production systems,  and - with some reservations - I agree.  The most significant problem with hints is that they can go disastrously wrong when schema or other changes to the database occur.  They can also prevent the optimizer from exploiting possible improvements that might otherwise be obtained when new indexes or other changes are made.  If you use hints, you are increasing the amount of care you need to take when making schema changes - especially when changing indexing.

Despite that warning, hints clearly do have a place when all else fails, and are definitely useful in test environments to compare the costs of various plan options. In Spotlight on Oracle, we issue SQL that has very stringent performance requirements against tables (X$ views usually) where we are unable to create histograms and where most other optimization methods are unavailable. Consequently we fairly routinely hint our production SQL. We might be a special case, but the world is made up of special cases!
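
For readers who haven't used them, join-method hints are usually combined with a join-order hint; a sketch with invented table and index names:

SELECT /*+ LEADING(c) USE_NL(s) INDEX(s sales_cust_idx) */
       c.cust_name, s.amount_sold
  FROM customers c
  JOIN sales s ON (s.cust_id = c.cust_id)
 WHERE c.region = 'ANZ';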

In this post I want to provide an example of where a simple hint can go disastrously wrong, and at the same time be the only practical way of achieving a certain plan. 

Read More

Optimizing GROUP and ORDER BY

   

Starting with Oracle 10.2,  you may notice a significant degradation in performance when you combine a GROUP BY with an ORDER BY on the same columns.

Oracle introduced a hash-based GROUP BY in 10.2. Previously, a GROUP BY involved sorting the data on the GROUP BY columns, then accumulating the aggregate results. The hash method involves passing through the data without sorting it and accumulating the aggregates during that single pass. Unfortunately, the presence of an ORDER BY causes Oracle to revert to the older sort-based GROUP BY, with a corresponding drop in performance. However, you can reword your SQL in such a way as to avoid this degradation.
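
The rewording itself is covered in the full post; the general idea is to let the GROUP BY use the hash algorithm inside an inline view and then sort the (much smaller) aggregated result separately. A sketch of that idea - not necessarily the exact query from the post - against a hypothetical SALES table:

SELECT *
  FROM (SELECT /*+ NO_MERGE */
               prod_id, SUM(amount_sold) total_sold
          FROM sales
         GROUP BY prod_id)
 ORDER BY prod_id;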

Read More

Pivot performance in 11g

"Pivoting" a result set - mapping values in a particular row to columns - is a commonplace activity when analyzing data in spreadsheets, but has always been a bit awkward in SQL.  The PIVOT operator - introduced in Oracle 11g - makes it a somewhat easier and - as we'll see - more efficient.

Prior to PIVOT, we would typically use CASE or DECODE statements to create columns that contained data only if the pivot column had the appropriate value.  For instance, to produce a report of product sales by year, with each year shown as a separate column, we might use a query like this:
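
The full query is in the complete post; a generic sketch of the pre-11g CASE-based approach, using the SH sample schema's SALES and TIMES tables, looks something like this:

SELECT prod_id,
       SUM(CASE WHEN t.calendar_year = 2000 THEN s.amount_sold END) sales_2000,
       SUM(CASE WHEN t.calendar_year = 2001 THEN s.amount_sold END) sales_2001,
       SUM(CASE WHEN t.calendar_year = 2002 THEN s.amount_sold END) sales_2002
  FROM sales s JOIN times t USING (time_id)
 GROUP BY prod_id;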

Read More

SIGNAL and RESIGNAL in MySQL 5.4 and 6.0

One of the most glaring omissions in the MySQL stored procedure implementation was the lack of the ANSI-standard SIGNAL and RESIGNAL clauses. These allow a stored procedure to conditionally return an error to the calling program.
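
For those who haven't seen the syntax, here's a minimal sketch of SIGNAL as it eventually appeared in MySQL (the procedure and message are invented):

CREATE PROCEDURE withdraw_funds (IN p_amount DECIMAL(10,2))
BEGIN
   IF p_amount <= 0 THEN
      -- Return a user-defined error condition to the caller
      SIGNAL SQLSTATE '45000'
         SET MESSAGE_TEXT = 'Withdrawal amount must be positive';
   END IF;
   -- ... perform the withdrawal ...
END;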

When Steven and I wrote MySQL Stored Procedure Programming we lamented this omission and proposed an admittedly clumsy workaround. Our workaround involved creating a procedure that used dynamic SQL in which the error message was embedded in the name of a non-existent table. When the procedure was executed, the non-existent table name at least allowed the user to see the error. So, for instance, here is the my_signal procedure:

Read More

Oracle disk IO latency on EC2

 

A colleague pointed me to this blog post, in which one attendee at a cloud computing panel described disk IO in the cloud (presumably AWS) as "punishingly slow".

We use EC2 for testing purposes here in the Spotlight on Oracle team, and generally we’ve been pretty happy – at least with throughput. Latency is a bit of a mixed bag however. Here’s a typical db file sequential read histogram from one of our bigger EC2 instances (from Spotlight on Oracle):
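
If you want to look at the same data on your own system, the raw histogram is available in V$EVENT_HISTOGRAM; a minimal sketch:

SELECT wait_time_milli, wait_count
  FROM v$event_histogram
 WHERE event = 'db file sequential read'
 ORDER BY wait_time_milli;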

Read More

Joining V$PQ_TQSTAT to PLAN_TABLE output

When optimizing parallel execution, we usually leverage a number of sources of information, most importantly:

 

  • The parallel execution plan as revealed by EXPLAIN PLAN and DBMS_XPLAN
  • The data flows between each set of parallel processes as revealed by V$PQ_TQSTAT

 


The execution plan allows us to identify any serial bottlenecks in an otherwise parallelized plan, usually revealed by PARALLEL_FROM_SERIAL or S->P values in either the OTHER_TAG of the PLAN_TABLE or in the IN-OUT column of DBMS_XPLAN.
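
For reference, the standard way to get that plan output is EXPLAIN PLAN followed by DBMS_XPLAN.DISPLAY; a minimal sketch against a hypothetical SALES table:

EXPLAIN PLAN FOR
   SELECT /*+ PARALLEL(s 4) */ prod_id, SUM(amount_sold)
     FROM sales s
    GROUP BY prod_id;

SELECT * FROM TABLE(dbms_xplan.display());   -- the IN-OUT column shows S->P, P->P, P->S steps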


V$PQ_TQSTAT shows us the number of rows sent between each set of parallel slaves. This helps determine whether the work is being evenly divided between parallel slaves. Skew in the data can result in some slaves being overworked while others are underutilized. The result is that the query doesn't scale well as parallel slaves are added.


Unfortunately, V$PQ_TQSTAT output is only visible from within the session which issued the parallel query and only for the most recent query executed. This limits its usefulness in a production environment, but it is still invaluable when tuning parallel queries.
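
A minimal query against the view itself - run in the same session, immediately after the parallel statement - looks like this:

SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
  FROM v$pq_tqstat
 ORDER BY dfo_number, tq_id, server_type, process;
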
It's quite hard to correlate the output of DBMS_XPLAN and V$PQ_TQSTAT, so I wrote a script that attempts to join the two. It generates a plan for the last SQL executed by the session, and then joins that to the V$PQ_TQSTAT data. This is a little tricky, but I think the resulting script will generate useful output in a reasonably wide range of scenarios.


You can get the script here: tq_plan.sql. Your session will need access

Read More

PARTITION BY versus GROUP BY

 

Here's some more research that I did for the book that didn't make the final content. It's a bit esoteric but interesting.

In some circumstances you can use PARTITION BY to avoid doing a self-join to a GROUP BY subquery. However, although the PARTITION BY avoids duplicate reads of the table – usually a good thing – it won’t always lead to better performance.

For instance, if I want to create a report that compares each sale value to the average sale value for the product, I might join the sales data to a subquery with a GROUP BY product_id:

WITH /*+ gather_plan_statistics gh_pby1*/
     totals AS (SELECT prod_id,
                       AVG(amount_sold) avg_product_sold
                  FROM sales
                 GROUP BY prod_id)
SELECT prod_id, prod_name, cust_id, time_id, amount_sold,
       ROUND(amount_sold * 100 / avg_product_sold,2) pct_of_avg_prod
  FROM sales JOIN products USING (prod_id)
  JOIN totals USING (prod_id);

 

Of course, that approach requires two full scans of the SALES table (I used an expanded non-partitioned copy of the SH.SALES table from the Oracle sample schema). If we use the PARTITION BY analytic function we can avoid that second scan:

SELECT /*+ gather_plan_statistics gh_pby2*/
       prod_id, prod_name, cust_id, time_id, amount_sold,
       ROUND(amount_sold * 100 /
             AVG(amount_sold) OVER (PARTITION BY prod_name)
            ,2) pct_of_avg_prod
  FROM sales JOIN products USING (prod_id);

You’d expect that avoiding two scans of SALES would improve performance. However, in most circumstances the PARTITION BY version takes about twice as long as the GROUP BY version.

Read More

Flashback Data Archive performance (pt 1)

 

I did some research on Flashback Data Archive (FBDA) for my upcoming Oracle performance book, but decided not to include it in the book due to space limitations and because not many people are using FBDA just yet.

FBDA – also called Total Recall - is described by Oracle like this:

 

Overview

Flashback Data Archive provides the ability to automatically track and store all transactional changes to a record for the duration of its lifetime. This feature also provides seamless access to historical data with "as of" queries. It provides flashback functionality for longer time periods than your undo data. You can use Flashback Data Archive for compliance reporting, audit reports, data analysis and decision support.


http://www.oracle.com/technology/obe/11gr1_db/security/flada/flada.htm

Sounds good, but from my brief tests you pay a very high price for this functionality. Also, it looks like the background processing of 11.0.6 has been changed to foreground processing in 11.0.7: in 11.0.6 it's the FBDA background process that populates the FBDA tables, but in 11.0.7 this is done by the session that issues the DML.

Let’s review FBDA set up first. Create a tablespace and a data archive:

 

CREATE TABLESPACE fb_arc_ts1 DATAFILE
   '/oradata/g11r22a/fb_arc1.dbf' SIZE 1024 M AUTOEXTEND OFF;

DROP FLASHBACK ARCHIVE fb_arc1;

/* Create the flashback archive */

CREATE FLASHBACK ARCHIVE DEFAULT fb_arc1
   TABLESPACE fb_arc_ts1
   QUOTA 1024 M
   RETENTION 1 DAY;

Now we can mark a table for flashback archive:

ALTER TABLE fba_test_data FLASHBACK ARCHIVE;
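
You can confirm which tables are being tracked, and where their history is stored, via the dictionary views; a minimal sketch:

SELECT owner_name, table_name, flashback_archive_name, archive_table_name
  FROM dba_flashback_archive_tables;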

DML statements run in about the same time on an FBDA table, but COMMIT times go through the roof:


SQL> UPDATE fba_test_data
  2     SET datetime = datetime + .01;

999999 rows updated.

Elapsed: 00:01:13.43
SQL>
SQL> COMMIT;

Commit complete.

Elapsed: 00:24:10.29

That’s right – 1 minute update, 24 minute commit time!! I’ve seen the same performance from DELETEs and INSERTs.

When the COMMIT occurs, Oracle runs recursive SQL to update the data archive. Here’s an example of one of the SQLs that runs (shown in Spotlight on Oracle's trace file viewer):

 

The statement above is the last in a sequence of at least four that are executed for every transaction.

In 11.0.6, there’s a bug in the FBDA SQL. The following SQL has a hint missing the “+” sign. Consequently the intended direct path inserts do not occur and - in some cases - free buffer waits can result:

 

The free buffer waits are a consequence of creating lots of new blocks for the DBWR to write to disk while at the same time pulling lots of blocks into the buffer cache from the source table. If the DBWR can't write out to disk as fast as you are creating new blocks, then free buffer waits result.

 

FBDA with large transactions is therefore not snappy. For short transactions, the effect (at COMMIT time on 11.0.7) is less noticeable:

 

SQL> INSERT INTO fba_test_data d(id, datetime, data)
  2  SELECT ROWNUM id,
  3         SYSDATE - 1000 + DBMS_RANDOM.VALUE(1, 1000) datetime,
  4         RPAD(SYSDATE - 1000 + DBMS_RANDOM.VALUE(1, 1000), 900,
  5              'x')
  6         data
  7    FROM DUAL
  8  CONNECT BY ROWNUM < 10;

9 rows created.

Elapsed: 00:00:00.48
SQL>
SQL> COMMIT;

Commit complete.

Elapsed: 00:00:00.53

 

The SQLs that FBDA generates are interesting case studies for SQL tuning - the use of hints in particular is notable, since it limits Oracle's ability to respond to different volumes of changes with different SQL plans. I'll post an analysis of the various SQLs issued in a future post.

For now, the clear lesson is to be very cautious when implementing FBDA – it definitely has some performance implications!

 

New Book, New blog

 

I'm not a very good blogger at the best of times. I'm just unable to get beyond the first draft, second draft, review process that works OK for articles and books, but which is not what you want when blogging. Over the last six months I've been even worse, since all my spare energy has been going into writing the Oracle Performance Survival Guide. This book has been by far the most challenging yet rewarding writing effort of my career. I set out to write a book that could stand alone as a single guide to Oracle performance, covering all aspects of Oracle performance management, suitable for all levels of expertise and across all relevant disciplines. So there's equal weight given to application and database design, SQL and PL/SQL tuning, contention management, memory optimization and disk tuning.

Anyway,  it's been quite an effort and I learned a lot while writing it.  But it's taken over my life!

The book's going to be available probably around Oracle Open World in October. You can pre-order it from Amazon.

Now that I'm over the biggest hump in getting the book together,  I thought I'd try to get back into occasional blogging.  I'd heard good things about SquareSpace and once I took a look I was hooked.  This thing is way ahead of what I was using and is great for total website management, not just blog publishing.  Therefore I'm shutting down my old blog at http://guyharrison.typepad.com/ and running everything from here.  Hopefully,  I'll be motivated to blog more often than in the past!

 

 

Oracle on Amazon AWS

Ever since Oracle announced support for Amazon AWS at OOW I’ve been eager to play around with it.  I finally found the time and here are my notes in case it helps anyone get going.

Using Amazon AWS in general is a big topic.  Getting started in a nutshell:

  1. Get an AWS account from www.amazon.com/aws (you need your credit card)
  2. Download and install ElasticFox (http://sourceforge.net/projects/elasticfox/).  I’ve used the Amazon command line utilities in the past, but ElasticFox leaves that for dead. 
  3. Install PuTTY (www.putty.org), a free SSH client
  4. Enter your AWS credentials in ElasticFox
  5. Create a keypair in the keypairs tab of ElasticFox

The only hiccup I had with that was that on Windows you can't use a keypair created in ElasticFox with the PuTTY SSH client. You have to create a PuTTY-compatible key: import your .PEM file into the PUTTYGEN program that comes with PuTTY and generate a .PPK key, which you then use when connecting via PuTTY.

Now, what I set out to do was create an Oracle database in the cloud that was on persistent storage.  That means two things:

1. Create the database on Elastic Block Storage (EBS) devices
2. Create a template of my AMI that I can use to start up a new image if I terminate my initial image

Port configuration

Read More