Tuesday, January 28, 2014

8 Bulk Update Methods Compared

Reference - http://www.orafaq.com/node/2450

What I love about writing SQL Tuning articles is that I very rarely end up publishing the findings I set out to achieve. With this one, I set out to demonstrate the advantages of PARALLEL DML, didn't find what I thought I would, and ended up testing 8 different techniques to find out how they differed. And guess what? I still didn't get the results I expected. Hey, at least I learned something.
As an ETL designer, I hate updates. They are just plain nasty. I spend an inordinate proportion of design time of an ETL system worrying about the relative proportion of rows inserted vs updated. I worry about how ETL tools apply updates (did you know DataStage applys updates singly, but batches inserts in arrays?), how I might cluster rows together that are subject to updates, and what I might do if I just get too many updates to handle.
It would be fair to say I obsess about them. A little bit.
The two most common forms of Bulk Updates are:

  1. Update (almost) every row in the table. This is common when applying data patches and adding new columns.
  2. Updating a small proportion of rows in a very large table.
Case 1 is uninteresting. The fastest way to update every row in the table is to rebuild the table from scratch. All of these methods below will perform worse.
Case 2 is common in Data Warehouses and overnight batch jobs. We have a table containing years worth of data, most of which is static; we are updating selected rows that were recently inserted and are still volatile. This case is the subject of our test. For the purposes of the test, we will assume that the target table of the update is arbitrarily large, and we want to avoid things like full-scans and index rebuilds.

And the nominees are...

The methods covered include both PL/SQL and SQL approaches. I want to test on a level playing field and remove special factors that unfairly favour one method, so there are some rules:
  • Accumulating data for the update can be arbitrarily complex. SQL updates can have joins with grouping and sub-queries and what-not; PL/SQL can have cursor loops with nested calls to other procedures. I'm not testing the relative merits of how to accumulate the data, so each test will use pre-preared update data residing in a Global Temporary Table.
  • Some methods - such as MERGE - allow the data source to be joined to the update target using SQL. Other methods don't have this capability and must use Primary Key lookups on the update target. To make these methods comparable, the "joinable" techniques will use a Nested Loops join to most closely mimic the Primary Key lookup of the other methods. Even though a Hash join may be faster than Nested Loops for some distributions of data, that is not always the case and - once again - we're assuming an arbitraily large target table, so a full scan is not necessarily feasible.
  • Having said that we're not comparing factors outside of the UPDATE itself, some of the methods do have differences unrelated to the UPDATE. I have included these deliberately because they are reasonably common and have different performance profiles; I wouldn't want anyone to think that because their statements were *similar* to those shown here that they have the same performance profile.
The 8 methods I am benchmarking here are as follows (in rough order of complexity):

  1. Explicit Cursor Loop
  2. Implicit Cursor Loop
  3. UPDATE with nested SET subquery
  4. BULK COLLECT / FORALL UPDATE
  5. Updateable Join View
  6. MERGE
  7. Parallel DML MERGE
  8. Parallel PL/SQL
For all of the tests, the following table structures will be used:
TEST{n} (Update Source) - 100K rows             TEST (Update target) - 10M rows

Name                           Type             Name                           Type             
------------------------------ ------------     ------------------------------ ------------
PK                             NUMBER           PK                             NUMBER
FK                             NUMBER           FK                             NUMBER
FILL                           VARCHAR2(40)     FILL                           VARCHAR2(40)
The data has the following characteristics:

  • TEST.PK will contain values 0 .. 9999999, but not in that order.
  • TEST.PK is poorly clustered. It is generated by reversing the digits in LPAD(ROWNUM, '0', 7). PK values of 1,2, and 3 are adjacent in the primary key index but one-million rows apart in the table.
  • TEST.FK will contain values 0 .. 99, unevenly distributed to favour lower numbers.
  • For the first round of testing, the column FK will be indexed with a simple b-tree index.

Method 1: Explicit Cursor Loop

Not many people code this way, but there are some Pro*C programmers out there who are used to Explicit Cursor Loops (OPEN, FETCH and CLOSE commands) and translate these techniques directly to PL/SQL. The UPDATE portion of the code works in an identical fashion to the Implicit Cursor Loop, so this is not really a separate "UPDATE" method as such. The interesting thing about this method is that it performs a context-switch between PL/SQL and SQL for every FETCH; this is less efficient. I include it here because it allows us to compare the cost of context-switches to the cost of updates.
DECLARE
    CURSOR c1 IS
        SELECT *
        FROM test6;

    rec_cur c1%rowtype;
BEGIN
    OPEN c1;
    LOOP
        FETCH c1 INTO rec_cur;
        EXIT WHEN c1%notfound;

        UPDATE test
        SET    fk = rec_cur.fk
        ,      fill = rec_cur.fill
        WHERE  pk = rec_cur.pk;
    END LOOP;
    CLOSE C1;
END;
/

Method 2: Implicit Cursor Loop

This is the simplest PL/SQL method and very common in hand-coded PL/SQL applications. Update-wise, it looks as though it should perform the same as the Explicit Cursor Loop. The difference is that the Implicit Cursor internally performs bulk fetches, which should be faster than the Explicit Cursor because of the reduced context switches.
BEGIN
    FOR rec_cur IN (
        SELECT *
        FROM test3
    ) LOOP
        UPDATE test
        SET    fk = rec_cur.fk
        ,      fill = rec_cur.fill
        WHERE  pk = rec_cur.pk;
    END LOOP;
END;
/

Method 3: UPDATE with nested SET subquery

This method is pretty common. I generally recommend against it for high-volume updates because the SET sub-query is nested, meaning it is performed once for each row updated. To support this method, I needed to create an index on TEST8.PK.
UPDATE test
SET    (fk, fill) = (
           SELECT test8.fk, test8.fill
           FROM   test8
           WHERE  pk = test.pk
)
WHERE  pk IN (
           SELECT pk
           FROM   test8
);

Method 4: BULK COLLECT / FORALL UPDATE

This one is gaining in popularity. Using BULK COLLECT and FORALL statements is the new de-facto standard for PL/SQL programmers concerned about performance because it reduces context switching overheads between the PL/SQL and SQL engines.
The biggest drawback to this method is readability. Since Oracle does not yet provide support for record collections in FORALL, we need to use scalar collections, making for long declarations, INTO clauses, and SET clauses.
DECLARE
    CURSOR rec_cur IS
    SELECT *
    FROM test4;

    TYPE num_tab_t IS TABLE OF NUMBER(38);
    TYPE vc2_tab_t IS TABLE OF VARCHAR2(4000);

    pk_tab NUM_TAB_T;
    fk_tab NUM_TAB_T;
    fill_tab VC2_TAB_T;
BEGIN
    OPEN rec_cur;
    LOOP
        FETCH rec_cur BULK COLLECT INTO pk_tab, fk_tab, fill_tab LIMIT 1000;
        EXIT WHEN pk_tab.COUNT() = 0;

        FORALL i IN pk_tab.FIRST .. pk_tab.LAST
           UPDATE test
            SET    fk = fk_tab(i)
            ,      fill = fill_tab(i)
            WHERE  pk = pk_tab(i);
    END LOOP;
    CLOSE rec_cur;
END;
/

Method 5: Updateable Join View

This is really a deprecated pre-9i method; the modern equivalent is the MERGE statement. This needs a unique index on TEST1.PK in order to enforce key preservation.
UPDATE (
        SELECT /*+ ordered use_nl(old)*/ new.pk as new_pk
        ,       new.fk as new_fk
        ,       new.fill as new_fill
        ,       old.*
        FROM test1 new
        JOIN test old ON (old.pk = new.pk)
)
SET fk = new_fk
,   fill = new_fill
/

Method 6: MERGE

The modern equivalent of the Updateable Join View. Gaining in popularity due to its combination of brevity and performance, it is primarily used to INSERT and UPDATE in a single statement. We are using the update-only version here. Note that I have included a FIRST_ROWS hint to force an indexed nested loops plan. This is to keep the playing field level when comparing to the other methods, which also perform primary key lookups on the target table. A Hash join may or may not be faster, that's not the point - I could increase the size of the target TEST table to 500M rows and Hash would be slower for sure.
MERGE /*+ FIRST_ROWS*/ INTO test
USING test2 new ON (test.pk = new.pk)
WHEN MATCHED THEN UPDATE SET
        fk = new.fk
,       fill = new.fill
/
Here is the Explain Plan
-------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                |         |   130K|  9921K|   258K  (1)|
|   1 |  MERGE                         | TEST    |       |       |            |
|   2 |   VIEW                         |         |       |       |            |
|   3 |    NESTED LOOPS                |         |   130K|    11M|   258K  (1)|
|   4 |     TABLE ACCESS FULL          | TEST2   |   128K|  6032K|   172   (5)|
|   5 |     TABLE ACCESS BY INDEX ROWID| TEST    |     1 |    48 |     2   (0)|
|   6 |      INDEX UNIQUE SCAN         | TEST_PK |     1 |       |     1   (0)|
-------------------------------------------------------------------------------

Method 7: Parallel DML MERGE

Now we're getting clever... This is the MERGE example on steroids. It uses Oracle's Parallel DML capability to spread the load over multiple slave threads.
ALTER SESSION ENABLE PARALLEL DML;

MERGE /*+ first_rows parallel(test) parallel(test2) */ INTO test
USING test5 new ON (test.pk = new.pk)
WHEN MATCHED THEN UPDATE SET
        fk = new.fk
,       fill = new.fill
/
Note the differences in the Explain Plan.
--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Rows  | Bytes | Cost (%CPU)|    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                       |          |   109K|  8325K|  1880   (1)|        |      |            |
|   1 |  PX COORDINATOR                       |          |       |       |            |        |      |            |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10002 |   109K|    10M|  1880   (1)|  Q1,02 | P->S | QC (RAND)  |
|   3 |    INDEX MAINTENANCE                  | TEST     |       |       |            |  Q1,02 | PCWP |            |
|   4 |     PX RECEIVE                        |          |   109K|    10M|  1880   (1)|  Q1,02 | PCWP |            |
|   5 |      PX SEND RANGE                    | :TQ10001 |   109K|    10M|  1880   (1)|  Q1,01 | P->P | RANGE      |
|   6 |       MERGE                           | TEST     |       |       |            |  Q1,01 | PCWP |            |
|   7 |        PX RECEIVE                     |          |   109K|    10M|  1880   (1)|  Q1,01 | PCWP |            |
|   8 |         PX SEND HYBRID (ROWID PKEY)   | :TQ10000 |   109K|    10M|  1880   (1)|  Q1,00 | P->P | HYBRID (ROW|
|   9 |          VIEW                         |          |       |       |            |  Q1,00 | PCWP |            |
|  10 |           NESTED LOOPS                |          |   109K|    10M|  1880   (1)|  Q1,00 | PCWP |            |
|  11 |            PX BLOCK ITERATOR          |          |       |       |            |  Q1,00 | PCWC |            |
|  12 |             TABLE ACCESS FULL         | TEST2    |   107K|  5062K|     2   (0)|  Q1,00 | PCWP |            |
|  13 |            TABLE ACCESS BY INDEX ROWID| TEST     |     1 |    48 |     0   (0)|  Q1,00 | PCWP |            |
|  14 |             INDEX UNIQUE SCAN         | TEST_PK  |     1 |       |     0   (0)|  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------

Method 8: Parallel PL/SQL

This is much easier to do with DataStage than with native PL/SQL. The goal is to have several separate sessions applying UPDATE statements at once, rather than using the sometimes restrictive PARALLEL DML alternative. It's a bit of a kludge, but we can do this in PL/SQL using a Parallel Enable Table Function. Here's the function:
CREATE OR REPLACE FUNCTION test_parallel_update (
 test_cur IN SYS_REFCURSOR
) 
RETURN test_num_arr
PARALLEL_ENABLE (PARTITION test_cur BY ANY)
PIPELINED
IS
 PRAGMA AUTONOMOUS_TRANSACTION;

 test_rec TEST%ROWTYPE;
 TYPE num_tab_t IS TABLE OF NUMBER(38);
 TYPE vc2_tab_t IS TABLE OF VARCHAR2(4000);

 pk_tab NUM_TAB_T;
 fk_tab NUM_TAB_T;
 fill_tab VC2_TAB_T;

 cnt INTEGER := 0;
BEGIN
 LOOP
  FETCH test_cur BULK COLLECT INTO pk_tab, fk_tab, fill_tab LIMIT 1000;
  EXIT WHEN pk_tab.COUNT() = 0;

  FORALL i IN pk_tab.FIRST .. pk_tab.LAST
   UPDATE test
   SET    fk = fk_tab(i)
   ,      fill = fill_tab(i)
   WHERE  pk = pk_tab(i);

  cnt := cnt + pk_tab.COUNT;
 END LOOP;

        CLOSE test_cur;

 COMMIT;
 PIPE ROW(cnt);
 RETURN;
END;
/
Note that it receives its data via a Ref Cursor parameter. This is a feature of Oracle's parallel-enabled functions; they will apportion the rows of a single Ref Cursor amongst many parallel slaves, with each slave running over a different subset of the input data set.
Here is the statement that calls the Parallel Enabled Table Function:
SELECT sum(column_value)
FROM   TABLE(test_parallel_update(CURSOR(SELECT * FROM test7)));
Note that we are using a SELECT statement to call a function that performs an UPDATE. Yeah, I know, it's nasty. You need to make the function an AUTONOMOUS TRANSACTION to stop it from throwing an error. But just bear with me, it is the closest PL/SQL equivalent I can make to a third-party ETL Tool such as DataStage with native parallelism.