Let’s talk about deterministic algorithms and the functions you put them in. From Wikipedia:
A deterministic algorithm is an algorithm that, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states. Deterministic algorithms are by far the most studied and familiar kind of algorithm, and one of the most practical since they can be run on real machines efficiently.
OK, I confess: I am not really sure what is meant by “real machines” in that last sentence, but I am sure it is true.
Let’s now convert that definition into PL/SQL-speak. There are, roughly speaking, two “containers” (a.k.a., subprograms) for code in PL/SQL: a procedure and a function. Each can be defined with zero, one, or more parameters.
A PL/SQL subprogram is deterministic if and only if when you pass in the same values for its parameters, it always does the same thing (for a procedure takes the same action(s), for a function returns the same value).
I hope that right away, you can see that lots, perhaps most, of the code you write in PL/SQL will most decidedly not be deterministic. That’s because any subprogram that contains a SQL statement is non-deterministic – that’s an external dependency that can affect the behavior of the subprogram.
But you might be surprised to find out just how much of your logic is deterministic. And it’s worth paying attention to this characteristic because deterministic subprograms are easier to test and maintain.
Plus, if your function is deterministic, you can declare it as such right in the function definition, which will allow you to use that function in a:
- function-based index
- virtual column definitions that use PL/SQL functions, and
- materialized views that have query-rewrite enabled require special function properties
Let’s look at a simple example of a function-based index, which is probably the most common use of deterministic functions. As you are likely aware, if I apply a function (even a built-in function) to a column in a where clause, any index that references that column is not used. In the query below, even if I have defined an index on last_name, it will be ignored.
select employee_id from employees where upper (last_name) like 'FEU%'
One way to fix that is to create an index like this:
create index employees_upper_last_name_fbi on upper (last_name)
The index will now be used in that query – and you can call your own functions in such an index definition – if that function is declared as deterministic. Here’s a very simple example:
create table determ_demo (a_name varchar2(100)) / begin insert into determ_demo values ('Hammer'); insert into determ_demo values ('Jamming'); end; / create or replace function betwnstr ( string_in in varchar2, start_in in integer, end_in in integer ) return varchar2 authid definer is begin return (substr (string_in, start_in, end_in - start_in + 1)); end; / select a_name from determ_demo where betwnstr (a_name, 3, 4) = 'mm' /
create index determ_demo_a_name_fbi on determ_demo (substr (betwnstr (a_name, 3, 4), 1, 100)) / ORA-30553: The function is not deterministic create or replace function betwnstr ( string_in in varchar2, start_in in integer, end_in in integer) return varchar2 deterministic authid definer is begin return (substr (string_in, start_in, end_in - start_in + 1)); end; / create index determ_demo_a_name_fbi on determ_demo (substr (betwnstr (a_name, 3, 4), 1, 100)) / Index DETERM_DEMO_A_NAME_FBI created.
(Note: you can also achieve case-insensitive searching using the collation feature in Oracle Database 12c and higher. Again, Tim Hall and ORACLE-BASE explains all about it.)
Why is determinism important in the situations listed above? Well, if you are trying to speed up the performance of your query by applying an index, you sure do want that index to be efficient. But how efficient is it going to be if it keeps on calling that function over and over again?
So…the big deal about deterministic functions from a performance standpoint is (to quote the documentation):
When the database encounters a deterministic function, it tries to use previously calculated results when possible rather than re-executing the function.
And it can do that because previously calculated results with a specific set of argument values will also be valid for any futurely (yes, that’s right, I just made up a new word) calculated results – precisely because the function is deterministic.
And Oracle Database is sophisticated enough to build a little cache for any queries that call a deterministic function (kind of like a cheap and scope-limited version of the function result cache). So as you execute your deterministic function in that query, the database keeps track of the argument values and the return value. If you call the function with the same input values again in that SQL statement, it will not execute the function and instead simply return the value.
The Demonstration of deterministic function caching in PL/SQL LiveSQL script shows you precisely that.
You can also declare a procedure to be deterministic, but it won’t change program behavior (in terms of caching or allowing you to do things that would be impossible in a non-deterministic procedure). So should you even bother marking a procedure as deterministic? Yes! And I will explain below in the benefits section.
The benefits of declaring determinism
Remember how I mentioned earlier that deterministic functions are easier to test? That’s because you don’t have to worry about (and write code to set up and tear down) external dependencies, like the contents of tables.
Testing deterministic programs are so much easier that it allowed me to build and offer on Oracle LiveSQL, two scripts that will generate test code or at least template code for testing your deterministic functions:
Generate regression test template for deterministic functions
Unit Test Template for Deterministic Functions
I hope you’ll find them useful or perhaps inspiring to do a variation for your own code.
Even if adding the deterministic keyword to a procedure or function does not immediately have an impact, I strongly suggest that you add that keyword – when the subprogram really does have no external dependencies or side effects.
By declaring a subprogram as deterministic, you are (a) documenting expected behavior for future developers and (b) giving more information to the PL/SQL engine.
Documenting expected behavior for future developers: if you have designed the subprogram to avoid side effects, it is important to tell anyone who may be maintaining the program in the future. Of course, you could do that in a comment, but it is so, so much better to make it a part of the subprogram definition.
Giving more information the PL/SQL engine: even if PL/SQL can’t or doesn’t use that keyword (and its implications) today, it might in the future. So give the compiler all the information you possibly can, thereby leaving your application in the best possible state for future optimal performance and behavior.
More generally, I suggest that you segregate out all deterministic logic into their own subprograms and then call those from your non-deterministic “parent” procedures and functions.