Detailed example of using Rownum paging in Oracle

  • 2021-12-13 17:35:46
  • OfStack

In MySQL, We usually use limit to complete the paging operation of dataset retrieval. However, in Oracle database, there is no convenient method to realize paging like limit1, so we usually complete paging directly in SQL statement, so we need to use rownum pseudo column or row_number () function. This paper will show the specific use methods of using rownum pseudo column and row_number () analysis function to complete Oracle data paging operation, and analyze and compare their performance.

1. Initialize test data

First, I selected 70,000 pieces of data from the data dictionary all_objects table. The creation steps are as follows:


-- To facilitate validation of result sets and avoid unnecessary sorting , I used it directly here rownum To produce an orderly OBJECT_ID Column
SQL> create table my_objects as
2 select rownum as OBJECT_ID,OBJECT_NAME,OBJECT_TYPE
3 from all_objects where rownum < 70001; Table created. -- Right OJBECT_ID Column to establish primary key
SQL> alter table my_objects add primary key (object_id); Table altered. SQL> select count(*) from my_objects; COUNT(*)
----------
70000 -- Analyze the table
SQL> exec dbms_stats.gather_table_stats(user,'my_objects',cascade => TRUE); PL/SQL procedure successfully completed.

2. Paged data acquisition

To complete the paging, we need to obtain 10 records in the table for articles 59991-60000, which we do using rownum and rown_number (), respectively


-- Method 1 , rownum Pseudo column mode
SQL> select t.* from (select d.*,rownum num from my_objects d where rownum<=60000) t where t.num>=59991; OBJECT_ID OBJECT_NAME OBJECT_TYPE NUM
---------- ------------------------------ ------------------- ----------
59991 /585bb929_DicomRepos24 JAVA CLASS 59991
59992 /13a1874f_DicomRepos25 JAVA CLASS 59992
59993 /2322ccf0_DicomRepos26 JAVA CLASS 59993
59994 /6c82abc6_DicomRepos27 JAVA CLASS 59994
59995 /34be1a57_DicomRepos28 JAVA CLASS 59995
59996 /b7ee0c7f_DicomRepos29 JAVA CLASS 59996
59997 /bb1d935c_DicomRepos30 JAVA CLASS 59997
59998 /deb95b4f_DicomRepos31 JAVA CLASS 59998
59999 /9b5f55c0_DicomRepos32 JAVA CLASS 59999
60000 /572f1657_DicomRepos33 JAVA CLASS 60000 10 rows selected. -- Method 2 , row_number Analysis function mode
SQL> select * from
2 (select t.*,row_number() over (order by t.OBJECT_ID) as num
3 from my_objects t)
4 where num between 59991 and 60000; OBJECT_ID OBJECT_NAME OBJECT_TYPE NUM
---------- ------------------------------ ------------------- ----------
59991 /585bb929_DicomRepos24 JAVA CLASS 59991
59992 /13a1874f_DicomRepos25 JAVA CLASS 59992
59993 /2322ccf0_DicomRepos26 JAVA CLASS 59993
59994 /6c82abc6_DicomRepos27 JAVA CLASS 59994
59995 /34be1a57_DicomRepos28 JAVA CLASS 59995
59996 /b7ee0c7f_DicomRepos29 JAVA CLASS 59996
59997 /bb1d935c_DicomRepos30 JAVA CLASS 59997
59998 /deb95b4f_DicomRepos31 JAVA CLASS 59998
59999 /9b5f55c0_DicomRepos32 JAVA CLASS 59999
60000 /572f1657_DicomRepos33 JAVA CLASS 60000 10 rows selected.

You can see that both methods return the correct result set; In the rownum method, since the rownum pseudo column cannot be directly used to perform the "greater than" comparison operation, here we first use rownum from the subquery to obtain the first 60,000 pieces of data, and then use the greater than operation in the outer query to remove unnecessary rows. For the row_number () method, the row_number () analysis function is sorted by OBJECT_ID and generates a unique identity for it, and then the interval data is obtained by between, which is easy to understand. Is the actual execution like this? Let's briefly analyze the implementation details of the two.

3. Paging Performance Analysis

First of all, look at their implementation plan:


SQL> set autotrace traceonly
SQL> set linesize 200 -- rownum Execution plan of pseudo-column paging
SQL> select t.* from (select d.*,rownum num from my_objects d where rownum<=60000) t where t.num>=59991; 10 rows selected. Execution Plan
----------------------------------------------------------
Plan hash value: 341064162 ----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 60000 | 3164K| 103 (0)| 00:00:02 |
|* 1 | VIEW | | 60000 | 3164K| 103 (0)| 00:00:02 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | TABLE ACCESS FULL| MY_OBJECTS | 60000 | 2226K| 103 (0)| 00:00:02 |
---------------------------------------------------------------------------------- Predicate Information (identified by operation id):
--------------------------------------------------- 1 - filter("T"."NUM">=59991)
2 - filter(ROWNUM<=60000) Statistics
----------------------------------------------------------
163 recursive calls
0 db block gets
399 consistent gets
0 physical reads
0 redo size
1030 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
10 rows processed -- row_number() Paged execution plan
SQL> select * from
2 (select t.*,row_number() over (order by t.OBJECT_ID) as num
3 from my_objects t)
4 where num between 59991 and 60000; 10 rows selected. Execution Plan
----------------------------------------------------------
Plan hash value: 2942654422 ----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 70000 | 3691K| 565 (1)| 00:00:07 |
|* 1 | VIEW | | 70000 | 3691K| 565 (1)| 00:00:07 |
|* 2 | WINDOW NOSORT STOPKEY | | 70000 | 2597K| 565 (1)| 00:00:07 |
| 3 | TABLE ACCESS BY INDEX ROWID| MY_OBJECTS | 70000 | 2597K| 565 (1)| 00:00:07 |
| 4 | INDEX FULL SCAN | SYS_C0011057 | 70000 | | 146 (0)| 00:00:02 |
---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id):
--------------------------------------------------- 1 - filter("NUM">=59991 AND "NUM"<=60000)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "T"."OBJECT_ID")<=60000) Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
490 consistent gets
0 physical reads
0 redo size
1030 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed

As we can see from the execution plan above, the rownum method uses a full table scan to get the first 60,000 rows in the table, and then uses the predicate condition "T". "NUM" > = 59991 to filter out unnecessary rows; Although the row_number () method uses the primary key index to omit the sorting operation of window generated by the analysis function itself, However, it first obtains all 70,000 rows of data in the table, and then uses the between keyword to filter the data rows. Many resources of this operation are consumed on data reading, so in the above example, the rownum pseudo-column method obtains better performance, but in fact, in most cases, the first rownum method obtains better performance.

One might wonder, since the row_number () method spends so much resources on data reading, why not just let it scan the full table, so let's take a look at using the full table scan:


-- Disable primary key directly
SQL> alter table my_objects disable primary key; Table altered. SQL> select * from
2 (select t.*,row_number() over (order by t.OBJECT_ID) as num
3 from my_objects t)
4 where num between 59991 and 60000; 10 rows selected. Execution Plan
----------------------------------------------------------
Plan hash value: 2855691782 -----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 70000 | 3691K| | 812 (1)| 00:00:10 |
|* 1 | VIEW | | 70000 | 3691K| | 812 (1)| 00:00:10 |
|* 2 | WINDOW SORT PUSHED RANK| | 70000 | 2597K| 3304K| 812 (1)| 00:00:10 |
| 3 | TABLE ACCESS FULL | MY_OBJECTS | 70000 | 2597K| | 120 (1)| 00:00:02 |
----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id):
--------------------------------------------------- 1 - filter("NUM">=59991 AND "NUM"<=60000)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "T"."OBJECT_ID")<=60000) Statistics
----------------------------------------------------------
190 recursive calls
0 db block gets
450 consistent gets
0 physical reads
0 redo size
1030 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
10 rows processed

It can be seen that the WINDOW SORT PUSHED RANK method occurs in this full table scan situation, which means that cpu resources will be spent on sorting object_id again, although object_id has been ordered in this example, and its performance is also inferior to rownum method.

Therefore, in the process of writing the program, I still prefer to use the following rownum to complete the paging operation of Oracle, which is usually written as follows:


-- Return to the 20 Page data, per page 10 Row
SQL> define pagenum=20
SQL> define pagerecord=10
SQL> select t.* from (select d.*,rownum num from my_objects d
2 where rownum<=&pagerecord*&pagenum) t
3 where t.num>=(&pagenum-1)*&pagerecord +1; old 2: where rownum<=&pagerecord*&pagenum) t
new 2: where rownum<=10*20) t
old 3: where t.num>=(&pagenum-1)*&pagerecord +1
new 3: where t.num>=(20-1)*10 +1 OBJECT_ID OBJECT_NAME OBJECT_TYPE NUM
---------- ------------------------------ ------------------- ----------
191 SQLOBJ$DATA_PKEY INDEX 191
192 SQLOBJ$AUXDATA TABLE 192
193 I_SQLOBJ$AUXDATA_PKEY INDEX 193
194 I_SQLOBJ$AUXDATA_TASK INDEX 194
195 OBJECT_USAGE TABLE 195
196 I_STATS_OBJ# INDEX 196
197 PROCEDURE$ TABLE 197
198 PROCEDUREINFO$ TABLE 198
199 ARGUMENT$ TABLE 199
200 SOURCE$ TABLE 200 10 rows selected.

Remarks:

When writing the program, in order to facilitate understanding, some people will use between to limit data rows in rownum method, which is written as follows:


select t.* from (select rownum num, d.* from my_objects d) t where t.num between 59991 and 60000;

In their view, writing the returned data row in this way and the first rownum method is 1. Oracle will push the predicate between part into the subquery without affecting performance, but this idea is completely wrong. Let's look at its specific execution plan:


SQL> select t.* from (select rownum num, d.* from my_objects d) t where t.num between 59991 and 60000; 10 rows selected. Execution Plan
----------------------------------------------------------
Plan hash value: 1665864874 ----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 70000 | 3691K| 120 (1)| 00:00:02 |
|* 1 | VIEW | | 70000 | 3691K| 120 (1)| 00:00:02 |
| 2 | COUNT | | | | | |
| 3 | TABLE ACCESS FULL| MY_OBJECTS | 70000 | 2597K| 120 (1)| 00:00:02 |
---------------------------------------------------------------------------------- Predicate Information (identified by operation id):
--------------------------------------------------- 1 - filter("T"."NUM"<=60000 AND "T"."NUM">=59991) Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
423 consistent gets
0 physical reads
0 redo size
1030 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed

It can be seen very conspicuously that 70,000 rows of full table scanning occurred in this query first, instead of the expected 60,000 rows. The reason is rownum. Using rownum in subquery directly disables the predicate push-forward function in query conversion stage, so the above query can only obtain all the data first and then apply between to filter. You can refer to my article "CBO-Query Transformation Exploration".

Having said that, in fact, it is the three SQL statements of Oracle paging. For the paging problem with very large amount of data, it will not be efficient to do so simply. Therefore, it is necessary to rely on one other technology, such as anti-normal design, pre-calculation or establishing an appropriate cache mechanism in the application layer.


Related articles: