For this tip, we tried something a bit different. I posted some code (below) that I believed could use some refactoring. During the session we discussed the code (I did most of the talking). I then showed my rewrite.
I also pointed everyone to this excellent companion post by Samuel Nitsche, which would make it easier for you to build a regression test script for a variation of this code. An excellent reminder that…..
It’s hard to refactor successfully, if you cannot test for regressions.
Here’s the piece of code I dug out of my archives. The developer needs to write data from various log tables into files. This was his first pass at a “proof of concept” of the program, which he asked me to look over before proceeding to apply that approach to the rest of the files.
Here’s the recording of my Feuertip
declare type T_LINES is table of varchar2(4000); k_file1_name constant varchar2(20) := 'abcd'; k_file2_name constant varchar2(20) := 'arty'; k_file3_name constant varchar2(20) := 'rett'; k_file4_name constant varchar2(20) := 'qsur'; k_file5_name constant varchar2(20) := 'wede'; k_file6_name constant varchar2(20) := 'cfrt'; k_file7_name constant varchar2(20) := 'ored'; l_file1_lines T_LINES; l_file2_lines T_LINES; l_file3_lines T_LINES; l_file4_lines T_LINES; l_file6_lines T_LINES; l_file5_lines T_LINES; l_file7_lines T_LINES; l_file1_lines_cnt number := 0; l_file2_lines_cnt number := 0; l_file3_lines_cnt number := 0; l_file4_lines_cnt number := 0; l_file6_lines_cnt number := 0; l_file5_lines_cnt number := 0; l_file7_lines_cnt number := 0; l_file1_line varchar2(4000); l_file2_line varchar2(4000); l_file3_line varchar2(4000); l_file4_line varchar2(4000); l_file6_line varchar2(4000); l_file5_line varchar2(4000); l_file7_line varchar2(4000); l_file1_name varchar2(100) := k_file1_name; l_file2_name varchar2(100) := k_file2_name; l_file3_name varchar2(100) := k_file3_name; l_file4_name varchar2(100) := k_file4_name; l_file6_name varchar2(100) := k_file5_name; l_file5_name varchar2(100) := k_file6_name; l_file7_name varchar2(100) := k_file7_name; l_current_key number default 0; l_first_time number default 0; procedure init as begin l_file1_lines := T_LINES(); l_file2_lines := T_LINES(); l_file3_lines := T_LINES(); l_file4_lines := T_LINES(); l_file6_lines := T_LINES(); l_file5_lines := T_LINES(); l_file7_lines := T_LINES(); end init; procedure print_lines ( p_file_name in varchar2, p_lines in T_LINES ) as l_file utl_file.file_type; begin l_file := utl_file.fopen('TEMP', p_file_name, 'w'); for i in p_lines.first..p_lines.last loop utl_file.put_line(l_file, p_lines(i)); end loop; utl_file.fclose (l_file); end print_lines; procedure add_line ( p_file_name in varchar2, p_line in varchar2 ) as begin case p_file_name when k_file1_name then l_file1_lines.extend(); l_file1_lines(l_file1_lines.count) := p_line; l_file1_lines_cnt := l_file1_lines_cnt + 1; when k_file2_name then l_file2_lines.extend(); l_file2_lines(l_file2_lines.count) := p_line; l_file2_lines_cnt := l_file2_lines_cnt + 1; when k_file3_name then l_file3_lines.extend(); l_file3_lines(l_file3_lines.count) := p_line; l_file3_lines_cnt := l_file3_lines_cnt + 1; when k_file4_name then l_file4_lines.extend(); l_file4_lines(l_file4_lines.count) := p_line; l_file4_lines_cnt := l_file4_lines_cnt + 1; when k_file5_name then l_file5_lines.extend(); l_file5_lines(l_file5_lines.count) := p_line; l_file5_lines_cnt := l_file5_lines_cnt + 1; when k_file6_name then l_file6_lines.extend(); l_file6_lines(l_file6_lines.count) := p_line; l_file6_lines_cnt := l_file6_lines_cnt + 1; when k_file7_name then l_file7_lines.extend(); l_file7_lines(l_file7_lines.count) := p_line; l_file7_lines_cnt := l_file7_lines_cnt + 1; end case; end add_line; begin init; for crs in ( select * from dual ) loop case when l_current_key != crs.key_id then l_first_time := 1; l_current_key := crs.key_id; else l_first_time := 2; end case; if l_first_time = 1 then for crs2 in ( select * from dual ) loop l_file1_line := crs.key_id || '|' || crs.val1 || '|' || crs.val2 || '|' || crs.val3; add_line( p_file_name => k_file1_name, p_line => l_file1_line ); end loop; end if; end loop; print_lines(l_file1_name, l_file1_lines); /* Next! */ for crs in ( select * from dual /* a different query */) loop case when l_current_key != crs.key_id then l_first_time := 1; l_current_key := crs.key_id; else l_first_time := 2; end case; if l_first_time = 1 then for crs2 in ( select * from dual /* a different query */) loop l_file2_line := crs.key_id || '|' || crs.val1 || '|' || crs.val2 || '|' || crs.val4 || '|' || crs.val5 || '|' || crs.val6; add_line( p_file_name => k_file2_name, p_line => l_file2_line ); end loop; end if; end loop; print_lines(l_file2_name, l_file2_lines); /* And so on.... */ end;
Thoughts On This Code
I hope that you had the same fundamental reaction to this code: SO MUCH REPETITION!
Whenever I see patterns like this in code….
l_file1_lines T_LINES; l_file2_lines T_LINES; l_file3_lines T_LINES; l_file4_lines T_LINES; l_file6_lines T_LINES; l_file5_lines T_LINES; l_file7_lines T_LINES;
….I am simultaneously alarmed and delighted.
Alarmed that it was written in the first place. Delighted that I can (probably) find a way to get rid of redundant code, and come up with something that is clear and easy to maintain.
A Refactored Version
Below you will find a refactored version of the anonymous block. Here’s an overview of the changes:
1. Move to a stored program unit, in this case a schema-level procedure. Generally you do not want to build and run anonymous blocks. You want to compile program units and then run those. It is much more likely you will be able to reuse code that way. The compiler can also offer more assistance, such as with compile-time warnings.
2. Switched from integer to Boolean to determine if running loop body for the first time. That’s a small thing, but the PL/SQL team went to the bother of implementing a Boolean type. The least we can do is use it.
3. The big change: use a string indexed collection of lines of text, in which the index values are the names of the files. This one step allowed me to remove all that redundant code! You see this more clearly in the add_line procedure. That massive case statement is now replaced by this single assignment:
l_file_contents(p_file_name)(l_file_contents(p_file_name).count + 1) := p_line;
By taking this approach, all the lines for a given file are automatically, elegantly stored within a nested collection identified by the file name.
If you are not yet comfortable working with string-indexed collections, I hope this motivates you to do some studying and write some code!
create or replace procedure push_to_destination_files is k_file1_name constant varchar2(20) := 'abcd'; k_file2_name constant varchar2(20) := 'arty'; k_file3_name constant varchar2(20) := 'rett'; k_file4_name constant varchar2(20) := 'qsur'; k_file5_name constant varchar2(20) := 'wede'; k_file6_name constant varchar2(20) := 'cfrt'; k_file7_name constant varchar2(20) := 'ored'; /* string indexed collection containing all lines for all files */ type t_lines is table of varchar2(4000) index by pls_integer; type t_files is table of t_lines index by varchar2(100); l_file_contents t_files; procedure add_line ( p_file_name in varchar2, p_line in varchar2 ) as l_scope logger_logs.scope%type := 'add_line'; l_params logger.tab_param; begin l_file_contents(p_file_name)(l_file_contents(p_file_name).count + 1) := p_line; end add_line; procedure load_file1_lines is l_one_line varchar2(4000); l_current_spl number default 0; l_first_time boolean default true; l_current_key integer; begin for crs in ( select 1 key_id from dual ) loop case when l_current_key != crs.key_id then l_first_time := true; l_current_key := crs.key_id; else l_first_time := false; end case; if l_first_time then for crs2 in ( select 1 val1, 2 val2, 3 val3 from dual ) loop l_one_line := crs.key_id || '|' || crs2.val1 || '|' || crs2.val2 || '|' || crs2.val3; add_line( p_file_name => k_file1_name, p_line => l_one_line ); end loop; end if; end loop; end; procedure write_lines_to_files is l_file utl_file.file_type; l_file_name varchar2(1000); procedure fill_one_file ( p_file_name in varchar2, p_lines in t_lines ) as l_file utl_file.file_type; begin l_file := utl_file.fopen('TEMP', p_file_name, 'w'); for i in p_lines.first..p_lines.last loop utl_file.put_line(l_file, p_lines(i)); end loop; utl_file.fclose(l_file); end; begin /* Use names of files in the array index to drive loop */ l_file_name := l_file_contents.first; while l_file_name is not null loop if l_file_contents(l_file_name).count > 0 then fill_one_file(l_file_name, l_file_contents(l_file_name)); end if; l_file_name := l_file_contents.next(l_file_name); end loop; end; begin load_file1_lines; write_lines_to_files; end;
[…] Feuertip #13: Let’s refactor! – Insum […]
Hello Steven,
at refactoring the code for selecting the data and writing it into a logfile into nested procedures like LOAD_FILE1_LINES() you have made a mistake by not initializing the variable L_CURRENT_KEY which will always set L_FIRST_TIME to false as “l_current_key != crs.key_id” will evaluate to NULL and so always the ELSE branch of the CASE will be executed.
And this little variable L_CURRENT_KEY was also a candidate to make problems in your original code (assuming that the SELECT’s for the first and the second log file haven’t been the same): you haven’t re-initialized the variable between the first and second outer loop. So if the last CRS.KEY_ID from the first loop is equal to the first CRS.KEY_ID of the second loop, the first time of the second outer loop wouldn’t have been catched with L_FIRST_TIME equal to 1.
Kind regards, Niels
Thanks for pointing out these flaws, Niels!