Lantana Systems: Understanding the problem with batched SQL reports with date parameters, and a new solution using SQL/Input

If you create an SQL report and edit it so that it runs in a batch sequence using passed date parameters, you may have found that the report runs extremely slow. This blog posting explains why this happens and provides a solution that solves this problem, including a new solution that works for SQL reports that are run using the Pronto 'Report Distribution Management' module.

When you run an SQL report, the SQL system has to work out the best way to collect the data you want, including what indexes to use when it reads data from an object. For example, if you are extracting data from the sales-order-archive file and you specify a date range for the field so-processing-date, the SQL system will normally decide that the best index to use is index 2 that uses the fields so-processing-date and so-order-no.

This works fine if you explicitly specify the date, for example:
where so-processing-date = 21-DEC-2009

However, if you use any programming function to specify a field, the SQL system ignores the field when it is working out what index to use. For example, if you use:
where so-processing-date = today() - 1
then the SQL system will NOT choose the date index and instead it will use another index, normally index 1 that uses the field so-invoice-no. Consequently, it reads a huge amount of data to find yesterday's orders and it takes a l-o-n-g time.

You can test this yourself by creating two SQL reports with where clauses like those above and adding the reports to a batch sequence. If you check the mail you get when the batch job finishes, you will find the first will say something like:
234 records extracted (from 234 records tested)
and the second will say something like:
234 records extracted (from 30146253 records tested)
It is easy to see why one takes the extra time to run.

If you setup an SQL report to run in a batch sequence and use the 'Date-Parameter-Overlay' option to set the date when the batch runs, you need to handle the passed parameter in your SQL using something like:
where so-processing-date = date-to-julian($1)
This is using a function and the date will not be used to work out the index to use.

I worked out a solution to this problem back in 1994 when I created my program 'SQL/Input' which acts as front-end to the Pronto SQL system. Its main feature is that it allows you to set up meaningful defaults, including date defaults, when you prompt the user for input values to use in an SQL report. However, it also allows you to run an SQL report in a batch sequence where the default parameters will be used. The program rewrites the SQL report into a temporary file and then runs this temporary SQL as a batch job. The result is that the SQL runs as if the date had been explicitly specified, and the correct date index is used by the SQL system to extract the data.

SQL/Input has been available as a free download from my website for many years.

My report management program 'Rep-Runner' was able to handle SQL reports that used 'SQL/Input' for the date parameter substitution, so such reports could be used as part of regular overnight processing and the output automatically distributed to one or more users.

'SQL/Input' could not be used with the standard Pronto 'Report Distribution Management' module because it loaded the substituted SQL as an extra job on the batch queue and the output was not captured and distributed to users. I did not have a problem with this as it worked fine with 'Rep-Runner'.

I have now decided to retire 'Rep-Runner' from Pronto version 670 onwards, as Pronto have changed the way that batch job sequences are stored and submitted and I do not want to rewrite the 'Rep-Runner' program. My 'Rep-Runner' software was initially released in 1995 when Pronto lacked the functionality that 'Rep-Runner' provided, but Pronto now has this functionality so it does not seem worthwhile to modify 'Rep-Runner' for the batch sequence changes.

However, that leaves the problem of how to run an SQL with date parameters using the correct indexes in a batch sequence that is run using the 'Report Distribution System' module, so I have now come up with a brand new solution.

I have modified the 'SQL/Input' program so that it can be called in a new mode with the name of a text file that contains a list of SQL reports to process. The program then reads each of the listed SQL reports, calculates the default parameters, and rewrites the SQL reports into new files in another directory. The new files have all parameters replaced with their calculated values so that they can be processed by the SQL system as if the values were explicitly specified. These new SQL report files can be correctly run in batch mode using the correct indexes by the 'Report Distribution System' and the output files will be correctly captured and distributed.

The solution is very easy to implement as it requires only one extra batch job to be run either prior to the 'Report Distribution System' or as part of an included sequence. The only requirement is that this extra call to 'SQL/Input' is made prior to any of the individual SQL reports in the sequence.

The free version of 'SQL/Input' that is available on my website does not include this new functionality. The enhanced version is available at no charge to existing users of my 'Rep-Runner' software, or for $300 (+GST) to other users.

Lantana Systems

Monday, December 28, 2009

Understanding the problem with batched SQL reports with date parameters, and a new solution using SQL/Input

No comments:

Post a Comment

Blog Archive

About Me