A method for forcing index use on function-modified columns in an Oracle database

  • 2020-06-15 10:24:46
  • OfStack

When the WHERE clause applies a function to a column, the Oracle optimizer cannot use an index on that column unless you use a simple technique to force it.

It is often impossible to express a specific condition on a column without a function such as UPPER, REPLACE, or SUBSTR in the WHERE clause. Using these functions has a cost, however: they prevent the Oracle optimizer from using an index on the column, so the query takes longer than it would with the index. Fortunately, if the columns these functions are applied to contain character data, you can modify the query so that the index is used and the query runs more efficiently. This article describes the technique and shows how to apply it in two typical cases.

Before discussing how to force the use of an index when a function modifies the contents of a column, let's first look at why the Oracle optimizer cannot use the index in this case. Suppose we want to search mixed-case data, such as the NAME column of the ADDRESS table in Table 1. Because the data is entered by users, we cannot rely on consistent capitalization. To find every address for the name john, we use a query that contains the UPPER function:

    SQL> select address from address where upper(name) like 'JOHN';

If we run the command "set autotrace on" before running this query, we get the following results, including the execution plan:

    ADDRESS
    cleveland

    1 row selected.

    Execution Plan
    SELECT STATEMENT
      TABLE ACCESS FULL ADDRESS

In this case, the Oracle optimizer performs a full scan of the ADDRESS table instead of using the index on the NAME column. The index is built on the actual values stored in the column, and UPPER converts those values to uppercase before the comparison, so the query cannot use the index on that column. The optimizer cannot compare "JOHN" with an index entry.
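To make the reasoning concrete, here is a minimal Python sketch of the idea. It is a toy model, not Oracle: a sorted list stands in for the index on NAME, and the rows, `NAME_INDEX`, `index_lookup`, and `full_scan_upper` are hypothetical names invented for illustration.

```python
import bisect

# Hypothetical rows of the ADDRESS table: (name, address).
ROWS = [
    ("alice", "akron"),
    ("Bob", "boston"),
    ("john", "cleveland"),
    ("Mary", "dayton"),
]

# Toy stand-in for the index on NAME: stored values, sorted, with row positions.
NAME_INDEX = sorted((name, pos) for pos, (name, _) in enumerate(ROWS))


def index_lookup(value):
    """Binary-search the sorted index for an exact stored value."""
    keys = [key for key, _ in NAME_INDEX]
    lo = bisect.bisect_left(keys, value)
    hi = bisect.bisect_right(keys, value)
    return [ROWS[pos][1] for _, pos in NAME_INDEX[lo:hi]]


def full_scan_upper(value):
    """Apply upper() to the name in every row, as a full table scan would."""
    return [address for name, address in ROWS if name.upper() == value]


print(index_lookup("JOHN"))     # the index stores "john", so nothing is found
print(full_scan_upper("JOHN"))  # correct result, but every row is visited
```

The index is ordered by the stored spelling, so a search key produced by a function has no position in it; only a comparison against the raw column value can be answered by a binary search.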
The index holds no entry for "JOHN", only for "john". Fortunately, there is an easy way to force the index in this case: add one or more specific conditions to the WHERE clause that test the indexed value directly and reduce the number of rows to be scanned, without changing the conditions in the original SQL. Take the following query as an example:

    SQL> select address from address where upper(name) like 'JO%' AND (name like 'J%' or name like 'j%');

With AUTOTRACE set, this query produces the following results:

    ADDRESS
    cleveland

    1 row selected.

    Execution Plan
    SELECT STATEMENT
      CONCATENATION
        TABLE ACCESS BY INDEX ROWID ADDRESS
          INDEX RANGE SCAN ADDRESS_I
        TABLE ACCESS BY INDEX ROWID ADDRESS
          INDEX RANGE SCAN ADDRESS_I

Now the optimizer performs an index range scan for each of the two conditions in the added predicate. That predicate is joined to the original condition by AND, and because it does not reference a function, it can use the index. After both ranges are scanned, the results are merged.

In this example, if the table has hundreds or thousands of rows, you can expand the WHERE clause and narrow the scan further:

    SQL> select address from address where upper(name) like 'JOHN' AND (name like 'JO%' or name like 'Jo%' or name like 'jO%' or name like 'jo%');

The results are the same as before, but the execution plan below shows four range scans.

    Execution Plan
    SELECT STATEMENT
      CONCATENATION
        TABLE ACCESS BY INDEX ROWID ADDRESS
          INDEX RANGE SCAN ADDRESS_I
        TABLE ACCESS BY INDEX ROWID ADDRESS
          INDEX RANGE SCAN ADDRESS_I
        TABLE ACCESS BY INDEX ROWID ADDRESS
          INDEX RANGE SCAN ADDRESS_I
        TABLE ACCESS BY INDEX ROWID ADDRESS
          INDEX RANGE SCAN ADDRESS_I

To speed up the query further, we could specify three or more characters in the "name like" conditions. However, doing so makes the WHERE clause very bulky, because it needs every combination of uppercase and lowercase characters: joh, Joh, jOh, joH, and so on. In practice, specifying one or two characters is enough to speed up the query.

Now let's see how to use this basic technique with a different function.
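Before moving on, the trade-off between prefix length and clause size in the UPPER case can be sketched in Python. This is only an illustrative model under stated assumptions: a sorted list plays the role of the index ADDRESS_I, and the sample names and the helpers `case_prefixes`, `range_scan`, and `search` are hypothetical.

```python
import bisect
from itertools import product

# Hypothetical NAME values; the sorted list stands in for the index ADDRESS_I.
NAMES = ["Joan", "john", "JOHN", "Jon", "mary", "jake", "Zed"]
INDEX = sorted(NAMES)


def case_prefixes(prefix):
    """All upper/lowercase spellings of a prefix, e.g. 'jo' -> JO, Jo, jO, jo."""
    choices = [sorted({ch.upper(), ch.lower()}) for ch in prefix]
    return ["".join(combo) for combo in product(*choices)]


def range_scan(prefix):
    """Return the index entries that start with prefix (one 'name like' condition)."""
    lo = bisect.bisect_left(INDEX, prefix)
    hi = lo
    while hi < len(INDEX) and INDEX[hi].startswith(prefix):
        hi += 1
    return INDEX[lo:hi]


def search(name_upper, prefix_len):
    """Range-scan once per prefix spelling, then apply the upper() test to the candidates."""
    candidates = []
    for prefix in case_prefixes(name_upper[:prefix_len]):
        candidates.extend(range_scan(prefix))
    hits = sorted(n for n in candidates if n.upper() == name_upper)
    return hits, len(candidates)


print(search("JOHN", 1))          # two range scans, more candidates to filter
print(search("JOHN", 2))          # four range scans, fewer candidates
print(len(case_prefixes("joh")))  # conditions needed for a three-character prefix
```

Each extra prefix character doubles the number of LIKE conditions while shrinking the candidate set, which is why one or two characters is usually a reasonable stopping point.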
Consider REPLACE. Just as names are not always entered in uppercase, telephone numbers appear in many formats: 123-456-7890, 123 456 7890, (123)456-7890, and so on. If you are searching for this number in a column named PHONE_NUMBER, you may need the REPLACE function to make the format uniform. If the PHONE_NUMBER column contains only spaces, hyphens, and digits, the WHERE clause could look like this:

    WHERE replace(replace(phone_number, '-'), ' ') = '1234567890'

This WHERE clause uses REPLACE twice to remove hyphens and spaces, so that the phone number is compared as a plain string of digits. However, the function prevents the optimizer from using an index on the column. Therefore, we modify the WHERE clause as follows to force the index:

    WHERE replace(replace(phone_number, '-'), ' ') = '1234567890'
    AND phone_number like '123%'

If we know that the data may contain parentheses, the WHERE clause is a little more complicated. We add REPLACE calls for the parentheses, so that parentheses, hyphens, and spaces are all removed, and extend the added condition as follows:

    WHERE replace(replace(replace(replace(phone_number, '-'), ' '), '('), ')') = '1234567890'
    AND (phone_number like '123%' or phone_number like '(123%')

This example highlights the importance of choosing the added WHERE clause conditions carefully so that they do not change the result of the query. Your choice should be based on a full understanding of the kind of data stored in the column. In this case, we need to know which formats occur in the PHONE_NUMBER data so that we can modify the WHERE clause without affecting the query result.

When you next encounter a WHERE clause that applies a function to a column of character data, consider how to force the optimizer to use an index by adding one or two specific conditions.
Choosing the added conditions carefully reduces the number of rows scanned, and forcing the index does not affect the query results; it only speeds up query execution.
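As a rough check of the phone-number case, the following Python sketch shows why the added condition must cover every stored format. It is an assumption-laden illustration rather than Oracle code: the sample values in `PHONES` and the helpers `normalize` and `query` are hypothetical, with `normalize` imitating the nested REPLACE calls and `prefixes` imitating the added LIKE conditions.

```python
# Hypothetical PHONE_NUMBER values in the formats the article mentions.
PHONES = [
    "123-456-7890",
    "123 456 7890",
    "(123)456-7890",
    "(999)456-7890",
    "1234567890",
]


def normalize(phone, strip_chars):
    """Imitate nested REPLACE calls: delete each listed character."""
    for ch in strip_chars:
        phone = phone.replace(ch, "")
    return phone


def query(target, strip_chars, prefixes=None):
    """Rows whose normalized number equals target, optionally ANDed with prefix tests."""
    return [
        p for p in PHONES
        if normalize(p, strip_chars) == target
        and (prefixes is None or p.startswith(tuple(prefixes)))
    ]


original = query("1234567890", "- ()")
too_narrow = query("1234567890", "- ()", prefixes=["123"])
safe = query("1234567890", "- ()", prefixes=["123", "(123"])

print(safe == original)        # the two-prefix condition keeps the result unchanged
print(too_narrow == original)  # a single prefix drops the parenthesized format
```

If the column could also hold other leading characters, such as a country code, the prefix list would need to grow accordingly; the added condition is only safe when it matches every format the data actually contains.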
