Thursday, March 05, 2015

The mwrynn PL/SQL Table API

The mwrynn Table API is a PL/SQL object type I’ve written for Oracle. It wraps up some handy functions for generating SQL queries, and performing common operations such as “disable all indexes” that Oracle doesn’t provide conveniently out of the box. (I made it an object type instead of a package so it can be extended for a particular table, to the user's liking.) 

My initial inspiration was the the many times I’ve wanted the following construct to be available in SQL: Imagine you had a 50 column table, and you wanted to select col1 through 48, but not 49 and 50. I want to do this: SELECT *-(col49, col50) FROM mytable.

Typing it all out is Tedium City: SELECT col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12 [snip — we’re just 25% there]

At the same time, “SELECT *” can be a bad practice to use in your code. Generally fine for running ad-hoc queries, but I wouldn’t want to do that in a PL/SQL package that I plan on using in prod. So, my Table API would provide the entire select list (col1, col2, etc. etc.) for you, which you could just copy and paste. 

It kind of snowballed from there. I’ve got a number of functions, for example I built one that matches column names between two tables. This could be useful for merge/update statements — your external table linked to a flat file probably has column names matching the destination you want to copy to. So, this function would generate a.col1=b.col1, a.col2=b.col2, etc.

Another neat function generates random data. It automatically reads each column’s data type, and generates the appropriate type of data. I’ve found this useful for testing purposes, particularly although it has its limits of course (completely random values tend to have a high cardinality for example — maybe I’ll add optional cardinality constraints in the future!) I’ve found in the case of dates and timestamps it’s useful to have a date range limit, so I've implemented that. For numbers and strings well, there are no constraints at the moment. Foreign keys are generated by selecting a random PK value from the parent table.

Enough talk. Let’s see some action. Let’s run this PL/SQL block. Comments explain what I’m doing at each step. But first, we need a little DDL

 
 CREATE TABLE state (id VARCHAR2(2) PRIMARY KEY, name VARCHAR2(30));  
 INSERT INTO state SELECT 'NY', 'New York' FROM dual;  
 INSERT INTO state SELECT 'CA', 'California' FROM dual;  
 INSERT INTO state SELECT 'FL', 'Florida' FROM dual;  
 INSERT INTO state SELECT 'CO', 'Colorado' FROM dual;  

 CREATE TABLE person (id INT PRIMARY KEY, first_name VARCHAR2(10), last_name VARCHAR2(10), address VARCHAR2(20), city VARCHAR2(10), state VARCHAR2(2) REFERENCES state(id), zip VARCHAR2(5))   
 CREATE INDEX person_city_idx ON person(city);  
 CREATE INDEX person_zip_idx ON person(zip);   

 CREATE TABLE person_ext (id INT PRIMARY KEY, first_name VARCHAR2(10), last_name VARCHAR2(10), address VARCHAR2(20), city VARCHAR2(10), state VARCHAR2(2), zip VARCHAR2(5))   
 DECLARE  
  person_table_obj table_obj;  
  person_ext_table_obj table_obj;  
  cnt INTEGER;  
BEGIN
  --create table objects for mwrynn.person mwrynn.person_ext; third param null means local table, i.e. no dblink is used in qualifying table name
  person_table_obj := table_obj('PERSON', 'MWRYNN', null);
  person_ext_table_obj := table_obj('PERSON_EXT', 'MWRYNN', null);

  --display all the columns
  dbms_output.put_line(person_table_obj.all_cols);

  --disable all indexes and display number of successful disables
  dbms_output.put_line('Disabled ' || person_table_obj.disable_indexes || ' indexes.');

  --re-enable the indexes, and display number of successful renables
  dbms_output.put_line('Enabled ' || person_table_obj.enable_indexes || ' indexes.');
  
  --get list of columns to join on
  dbms_output.put_line(person_table_obj.join_list(person_ext_table_obj, 'a', 'b'));
  
  --get the list of matching columns, i.e. a.col1=b.col1, a.col2=b.col2, etc.
  dbms_output.put_line(person_table_obj.matched_update_list(person_ext_table_obj, 'a', 'b'));
  
  --generate random data, just 5 rows for our purposes. The foreign key column, state, will be looked up in the parent lookup table 
  cnt := person_table_obj.insert_random_rows(5, null, null); 
END;
 /  
 SELECT * FROM person;  

The output:

 anonymous block completed  
 ID,FIRST_NAME,LAST_NAME,ADDRESS,CITY,STATE,ZIP  
 Disabled 2 indexes.  
 Enabled 2 indexes.  
 a.ID=b.ID  
 a.FIRST_NAME=b.FIRST_NAME,a.LAST_NAME=b.LAST_NAME,a.ADDRESS=b.ADDRESS,a.CITY=b.CITY,a.STATE=b.STATE,a.ZIP=b.ZIP  
     ID FIRST_NAME LAST_NAME ADDRESS       CITY    STATE ZIP   
 ---------- ---------- ---------- -------------------- ---------- ----- -----  
 5041759599 DrUeedsMza cmUybFRdok ZyLBSrpNarfAsnNbHxzK xTkXtWdngA CA  bWjJs   
  885481864 VXNsOpNbwb POwZITUopy sqFQHHWcxFfcVNCLJyzq ysCdZKLKEQ FL  UcfkU   
 2357440692 vtDRoSaFXI JThrcRBQbL YrEyCAEAxXYErDInLciw SVdmHxtCNw CO  KxuIS   
 7331548629 pyKOLSGsTL NTlFxgpxAQ wsEZwiBlLaZAbsLenkeb IHWzBljRVL FL  LlhLA   
 4568868701 sftvDZyZpc HxEWPyenqR uPcMxkmJEMCgPYXMgUYq OrOZidpfKB CA  kpPgE   

First we've disabled and re-enabled the two indexes. (The primary key's index is never disabled.) Then we can see the join condition. Next is the output of matched_update_list -- the list of matching columns for the purpose of UPDATEs and MERGEs.

Finally we see the randomly generated data. As you can see, the values for the state column were pulled from the parent state table. The rest of the data is hideous but that's all it is -- random nonsense. If I had a lookup table of city/state/zip combinations, it could draw from that one. But I don't. :)

That's the mwrynn Table API for now. With such few columns this might not look so amazing, but it can really save time. I once had to work with about a dozen tables, each with about 20-40 columns, and write out my insert queries and such, plus generate test data. There are GUI tools that may provide some of the generation functionality, but I always found them cumbersome to use. All in all, this object type saved me quite a number of hours!

Note: I've added this API to Github! https://github.com/mwrynn/plsql-table

No comments: