Friday, February 12, 2016

Level up your PL/SQL code

Introduction

The time has come to rejuvenate this blog. And what better start is there than immediately talk about PL/SQL?

Object oriented programming vs. PL/SQL


Comparing them


Years before Java and .NET made their entrance, PL/SQL already had a way of encapsulating data en giving the right responsibilities to your program units using packages. Unfortunately, I often stumble upon code where people do not fully understand the need of a package specification and a package body. They basically put everything in the specification and the body.

Basically, you need to remember 2 things:

Specification: Globally accessible by everyone
Body: Accessible only by the package itself

A rule often used in object oriented programming is that an object has attributes and methods. To access these attributes an external function needs to use a method to retrieve it. PL/SQL provides a way to logically structure your code in a same manner. Comparing OO class components with PL/SQL components in a package:
  • public attribute: global package spec variable
  • private attribute: global package body variable
  • public method: function/procedure declared in spec and body
  • private method: function declared only in the body

"But I am not programming object oriented, I am programming PL/SQL"


That is a contradiction! You ARE accessing objects all the time. Within the database everything is an object: A table, a sequence, a trigger... Think of collection types for instance. The type definition is a class and the declaration of it is an object. A record type even has a constructor.You are working with objects. On top of that, many principles in PL/SQL exist before object oriented programming, so why not use their power at its fullest?

Real life experiences


Something I often see is variables declared in the specification. There are exceptions like cursors and user-defined exceptions, or packages without a body used as a global structure in your session, etc... but my general rule is: "don't make it a habit". The problem is that the variables are globally accessible which impairs control of it.

A second thing are packages where all procedures and functions exist in both the spec and the body. If a function or procedure is not used outside the package, you should only put it in the body.

Here is an example of an alternative way on how a global variable inside a package could be accessed:

CREATE OR REPLACE PACKAGE dennis_test
IS   
   FUNCTION get_my_global
   RETURN NUMBER;
   
END dennis_test;
/

CREATE OR REPLACE PACKAGE BODY dennis_test
IS   
   g_my_global NUMBER;
   
   FUNCTION get_my_global
   RETURN NUMBER
   IS
   BEGIN
      RETURN g_my_global;
   END get_my_global;

END dennis_test;
/

Rather than putting it in the specification we use a function to return the body variable. The body variable will have session state just like specification variables. In this example it may look tedious, but this actually makes your codemore transparent  as it progresses and easier to maintain. After all, we will always rely on this package if one wants to acquire the value of the variable, plus we can seperate business logic in the function and calling application or procedure.

Working with context switches


A second topic I would like to talk about are context swtiches. If you have just started to learn PL/SQL chances are high you are unfamiliar with this concept. A context switch is when the parser switches his context mode from PL/SQL to SQL and visa versa. For example, when you open a cursor or when you invoke a DML or DDL (EXECUTE IMMEDIATE) statement. It is important to understand that many repeated context switching attacks the performance of your code. Here is an example you can test yourself.


CREATE TABLE my_logging
  (
     log_date DATE, 
     message VARCHAR2(4000), 
     backtrace VARCHAR2(4000)
  )
;


This testing table is just for logging. It should quickly insert as there are no constraints nor indexes.
Following procedure loops through a cursor returning 100000 rows, and does some intensive logging when looping. + 2 additional logging statements at the beginning and the end of the procedure.


CREATE OR REPLACE PROCEDURE some_procedure
IS
   CURSOR c_cur
   IS
      SELECT 1
      FROM DUAL  
      CONNECT BY ROWNUM <= 100000
   ;
   
   v_logging_enabled BOOLEAN := TRUE;   
   v_cur             c_cur%ROWTYPE;        

   PROCEDURE log_msg (p_msg IN VARCHAR2)
   IS
      PRAGMA AUTONOMOUS_TRANSACTION;
   BEGIN
   
      IF v_logging_enabled THEN
         INSERT INTO my_logging(message, backtrace, log_date)
         VALUES (p_msg, dbms_utility.format_error_backtrace, sysdate);
         
         COMMIT;                  
      END IF;
      
   END log_msg;
    
BEGIN
   
   log_msg('START');
   
   FOR v_rec IN c_cur
   LOOP      
      log_msg('..10: do intersting stuff');
      
      -- ... interesting stuff happens here
      
      log_msg('..20: another set of interesting stuff');
      
      -- ... interesting stuff happens here
      
      log_msg('..30: end of loop');
   END LOOP;   

   log_msg('END');

END some_procedure;


Our procedure has a big problem. It has 300002 (yes, three hundred thousand and two) context switches. Each time the log_msg procedure is called. This heavily impacts its performance. We can test this:


declare
   v_time NUMBER;
begin
   v_time := dbms_utility.get_time;
   some_procedure;
   v_time := (dbms_utility.get_time - v_time) / 100;
   DBMS_OUTPUT.PUT_LINE ( 'log time without bulk (300002 context switches) takes [' || v_time || '] seconds.' );
end;


And here is my output:

log time without bulk (300002 context switches) takes [28.74] seconds.

28.74 seconds is quite a while, but then again it needs to swtich its context from PL/SQL to SQL 300002 times!

We can write exactly the same procedure using a BULK insert. We stock our messages in a collection and persist the elements only at the end. This way, only the persistence itself has a context switch...


CREATE OR REPLACE PROCEDURE some_procedure_2
IS
   CURSOR c_cur
   IS
      SELECT 1
      FROM DUAL  
      CONNECT BY ROWNUM <= 100000
   ;

   TYPE v_log_tab_type IS TABLE OF my_logging%ROWTYPE INDEX BY PLS_INTEGER;
   
   v_logging_enabled BOOLEAN := TRUE;   
   v_cur             c_cur%ROWTYPE;
   v_i               PLS_INTEGER := 1;
   v_log_tab         v_log_tab_type;

   PROCEDURE log_msg (p_msg IN VARCHAR2)
   IS
   BEGIN
   
      IF v_logging_enabled THEN
         v_log_tab(v_i).message := p_msg;
         v_log_tab(v_i).log_date := sysdate;
         v_i := v_i + 1; 
                  
      END IF;
      
   END log_msg;
   
   PROCEDURE persist_msgs
   IS
      PRAGMA AUTONOMOUS_TRANSACTION;
   BEGIN
   
      IF v_logging_enabled THEN
         FORALL i IN 1 .. v_log_tab.COUNT
            INSERT INTO my_logging(message, backtrace, log_date)
            VALUES (v_log_tab(i).message,dbms_utility.format_error_backtrace, v_log_tab(i).log_date);
             
         v_log_tab.DELETE;
         v_i := 1;
      END IF;
      
      COMMIT;     
   END persist_msgs;   
BEGIN
   v_i := 1;
   
   log_msg('START');
   
   FOR v_rec IN c_cur
   LOOP      
      log_msg('..10: do intersting stuff');
      
      -- ... interesting stuff happens here
      
      log_msg('..20: another set of interesting stuff');
      
      -- ... interesting stuff happens here
      
      log_msg('..30: end of loop');
   END LOOP;   

   log_msg('END');
   persist_msgs;

END some_procedure_2;


Again, we can test this code:


declare
   v_time NUMBER;
begin
   v_time := dbms_utility.get_time;
   some_procedure_2;
   v_time := (dbms_utility.get_time - v_time) / 100;
   DBMS_OUTPUT.PUT_LINE ( 'log time with bulk (1 context switch) takes [' || v_time || '] seconds.' );   
end;


And here is my output:

log time with bulk (1 context switch) takes [5.71] seconds.

This procedure only takes 5.71 seconds instead of 28.74 seconds, and the result is the same!

We can conclude that BULK operations are a very interesting and useful way to optimize your PL/SQL code. There are exceptions but it is always good practice to make the least possible context switches.  

No comments:

Post a Comment

Search This Blog