Showing only those rows where data changed.
Let’s say you have a table where the values of a particular column only changes slowly over time, perhaps a parameter of some sort. Now you want to write a query that shows only those rows of the table of when the value in that column changed.
Take the following table for example: It has 21 rows. The first column (DT) shows a date, the second column (VAL) shows a value that slowly changes over time:
DT VAL 01-Sep-11 10 02-Sep-11 10 03-Sep-11 10 04-Sep-11 10 05-Sep-11 20 06-Sep-11 20 07-Sep-11 20 08-Sep-11 20 09-Sep-11 10 10-Sep-11 10 11-Sep-11 10 12-Sep-11 30 13-Sep-11 30 14-Sep-11 30 15-Sep-11 30 16-Sep-11 30 17-Sep-11 30 18-Sep-11 10 19-Sep-11 10 20-Sep-11 10 21-Sep-11 10
Now write a query that only returns the rows for those dates where the value changes, i.e. 9/1 (the first row), 9/5, 9/9, 9/12, and 9/18.
To do this we could try to use the LAG function in the WHERE clause and only return those rows where VAL is different from VAL of the previous row. But since we cannot use a windowing function in a WHERE clause, we’ll have to put it into a factored subquery first:
[sourcecode language=”sql”]
WITH myqry AS
(SELECT dt
,val
,lag(val, 1, 0) over(ORDER BY dt) pre_val
FROM t1)
SELECT dt,val
FROM myqry
WHERE val != pre_val;
[/sourcecode]
DT VAL 01-Sep-11 10 05-Sep-11 20 09-Sep-11 10 12-Sep-11 30 18-Sep-11 10