Christoph's 2 Cents

A Backup for My Brain!

Oracle DevelopementOracle PerformanceOracle Tools

An AWKward look at tkprof.

Reading trough a tkprof file can be a long, arduous process. How do you find the things you’re looking for, without having to dredge through endless lines of trace data?
Here are some awk one-liners that can help with that.
What do you really care about in a trace file? Usually you are looking for the SQL that either consumed the most time, or the most logical I/Os.
Here is a on-liner that will show you the top-5 SQL statements that took the most elapsed time: (in my examples the tkprof output file name is tk.txt)

awk '/^call/' tk.txt | head -n 1;  awk '/^total/' tk.txt | sort -k 4 -nr | head -n 5

The result looks something like this:
call     count       cpu    elapsed       disk      query    current        rows
total        1      0.07     332.15         13         26         51           1
total        1      0.07     332.15         13         26         51           1
total   698231     48.27      72.56        478      57114    5254875     2260618
total       15     14.42      17.73         15       9249    2226312      139033
total       12     12.77      16.87         23      10983    2084232      107561
You can see right away, that the SQL statement that took the longest, took 332.15 seconds to execute.
Explanation: The first awk statements simply displays the column headings. The second looks at all the totals, sorts them numerically descending, and displays only the first five records.
Notice that it sorts the fourth column, which is the elapsed time. To sort on a different column simply change the integer after the –k argument. The following columns are available:
call     count       cpu    elapsed       disk      query    current        rows
You can ingore the first column (call), which just tells you that we are looking at the totals for a particular SQL statement.
The meanings of the following columns are explained in the header of each tkprof output file:
count    = number of times OCI procedure was executed
cpu      = cpu time in seconds executing
elapsed  = elapsed time in seconds executing
disk     = number of physical reads of buffers from disk
query    = number of buffers gotten for consistent read
current  = number of buffers gotten in current mode (usually for update)
rows     = number of rows processed by the fetch or execute call
To find which SQL statement caused the particular elapsed time, simply search the tkprof output for that value.

Another usefull one-liner condenses the tkprof file to where you only see the SQL statement and its execution totals. This keeps you from having to look a all the nitty gritty details:

awk 'tolower($1)~ /(^select)|(^insert)|(^update)|(^delete)|(^declare)|(^begin)/,/^$/ { print NR "\t" $0 } /^call/ { print NR "\t" $0 } /^total/ { print NR "\t" $0 "\n"}' tk.txt | less

The output looks something like this:
10628   DELETE FROM SOME_TABLE
10629   WHERE
10630    STATUS = 'Q'
10631
10633   call     count       cpu    elapsed       disk      query    current        rows
10639   total        2      0.00       0.00          0          8          0           0
10646   BEGIN some_package.some_procedure; END;
10647
10649   call     count       cpu    elapsed       disk      query    current        rows
10655   total        1      0.07     332.15         13         26         51           1
10662   BEGIN
10663     MY_UTILITIES.PRC1;
10664   EXCEPTION
10665     WHEN OTHERS THEN
10666       LOG_ERROR(SYS_CONTEXT('USERENV'
10667                                      ,'SESSIONID')
10668                          ,PRC1);
10669   END;
10670
10671   call     count       cpu    elapsed       disk      query    current        rows
10677   total        2      0.00       0.00          0          0          0           1
Explanation: This command looks for all the lines that begin with an SQL statement, then starts printing until it comes across an empty line. Then it looks for a line beginning with the word call and prints the column headings. Finally it skips down to the total line and prints the values, followed by a blank line. The output is then piped to less. You could also pipe it to another text file if you wish. The nice thing about less, however, is that you can search the output right there. So if you were looking for the SQL statement that caused the 332.15 seconds of elapsed time (see above), simple search the output with the forward slash:
:/332.15
The output will skip to the line with that occurance.
Conclusion: These two one-liners can help you quickly identify the SQL statements that take a long time to execute, or cause a lot of reads in your database. What they don’t do, is tell you why. To do that, you have to dive into the details of the tkprof output, or use a third party tool like the Hotsos Profiler. The Hotsos Profiler will actually tell you where the time went during the execution of an SQL statement, and will give you a definitive starting point for your performance optimization.