A simple performance test of Oracle with Python

  • 2021-08-17 00:16:54
  • OfStack

1. Overview

There are two questions a DBA can hardly avoid at work: how much does SQL performance improve when bind variables are used, and how much does enabling the database audit function hurt database performance? I ran into both recently, so I decided to do an experiment:

  • Performance impact of SQL that uses bind variables
  • Performance impact of enabling the database audit function

The experimental method is simple: read a CSV file with Python, import it into the database, and measure how long the program takes to finish.
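The timing itself is just a wall-clock measurement with time.perf_counter around the work being measured. As a minimal, database-free sketch of that pattern (the timed helper and the sum example are illustrations, not part of the original script):

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds, 2 decimals)."""
    begin = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = round(time.perf_counter() - begin, 2)
    return result, elapsed


# example: time a cheap computation in place of a database import
total, secs = timed(sum, range(1_000_000))
print(total, secs)
```

The import script below follows the same shape, with the database work between the two perf_counter calls.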

2. Prepare the script

Python script dataimporttest.py:


# author: yangbao
# function: test database performance by importing a CSV

import cx_Oracle
import time


# database connection string
DATABASE_URL = 'user/password@ip:1521/servicename'


class CsvDataImport:

    def __init__(self, use_bind):
        self.csv_name = 'test.csv'
        self.use_bind = use_bind
        if use_bind == 1:
            # insert statement that uses bind variables
            self.insert_sql = "insert into testtb values(:0, " \
                              "to_date(:1,'yyyy-mm-dd hh24:mi:ss'), " \
                              "to_date(:2,'yyyy-mm-dd hh24:mi:ss'), " \
                              ":3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14, " \
                              ":15, :16, :17, :18, :19, :20, :21)"
        else:
            # insert statement that concatenates literal values (no bind variables)
            self.insert_sql = "insert into testtb values({0}, " \
                              "to_date('{1}','yyyy-mm-dd hh24:mi:ss'), " \
                              "to_date('{2}','yyyy-mm-dd hh24:mi:ss'), " \
                              "{3}, {4}, '{5}', {6}, '{7}', {8}, {9}, {10}, {11}, {12}, {13}, {14}, " \
                              "{15}, {16}, {17}, {18}, {19}, {20}, {21})"

    def data_import(self):
        begin_time = time.perf_counter()
        try:
            conn = cx_Oracle.connect(DATABASE_URL)
            curs = conn.cursor()

            with open(self.csv_name) as f:
                csv_contents = f.readlines()

            import_rows = 0
            print('{} start to import'.format(self.csv_name))

            for csv_content in csv_contents[1:]:  # skip the header line
                data = csv_content.split(',')
                if self.use_bind == 1:
                    # empty fields become None, which cx_Oracle binds as NULL
                    data = [None if x == '' else x for x in data]
                else:
                    # empty fields become the literal string null in the SQL text
                    data = ['null' if x == '' else x for x in data]
                data[-1] = data[-1].replace('\n', '')  # strip the trailing newline

                if self.use_bind == 1:
                    curs.execute(self.insert_sql, data)  # insert via bind variables
                else:
                    curs.execute(self.insert_sql.format(*data))  # insert via literals
                import_rows += 1
                if import_rows % 10000 == 0:
                    conn.commit()  # commit every 10,000 rows
                    print('{} has imported {} lines'.format(self.csv_name, import_rows))

            conn.commit()
            curs.close()
            conn.close()

            elapsed = round(time.perf_counter() - begin_time, 2)
            print('{}, import rows: {}, use_bind: {}, elapsed: {}'.format(
                self.csv_name, import_rows, self.use_bind, elapsed))
        except Exception as e:
            print('{} import failed, reason: {}'.format(self.csv_name, str(e)))


if __name__ == '__main__':
    CsvDataImport(use_bind=1).data_import()

CSV file: test.csv (content omitted)
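The per-row preprocessing in the script (split on commas, map empty fields to NULL, strip the trailing newline) can be pulled out into a small helper and checked without a database. A sketch; clean_row is an extraction for illustration, not part of the original script:

```python
def clean_row(csv_line, use_bind=1):
    """Split one CSV line into a list of values ready for insertion.

    With use_bind=1, empty fields become None (bound as NULL by cx_Oracle);
    otherwise they become the literal string 'null' for direct SQL text.
    """
    empty = None if use_bind == 1 else 'null'
    data = [empty if x == '' else x for x in csv_line.split(',')]
    if data[-1] is not None:
        data[-1] = data[-1].replace('\n', '')  # strip the trailing newline
    return data


print(clean_row('1,,2020-01-01 00:00:00\n'))
print(clean_row('1,,2020-01-01 00:00:00\n', use_bind=0))
```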

3. Test the performance impact of SQL using bind variables

a. Using bind variables

Restart the instance to clear all of the database's caches, so that cached state does not interfere with the results:


SQL> startup force;
SQL> drop table yang.testtb purge;
SQL> create table yang.testtb as select * from yang.test where 1=0;

Run the script: python dataimporttest.py

Result: test.csv, import rows: 227795, use_bind: 1, elapsed: 260.31

b. Without bind variables

Restart the instance again:


SQL> startup force;
SQL> drop table yang.testtb purge;
SQL> create table yang.testtb as select * from yang.test where 1=0;

Change the last line of the script from CsvDataImport(use_bind=1).data_import() to CsvDataImport(use_bind=0).data_import()

Run the script: python dataimporttest.py

Result: test.csv, import rows: 227795, use_bind: 0, elapsed: 662.82

It can be seen that, under otherwise identical conditions, the run without bind variables takes about 2.55 times as long as the run with bind variables (662.82 s vs. 260.31 s).
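Most of that gap comes from parsing: with literals, every row yields a distinct SQL text that Oracle must hard-parse, while with bind variables the statement text is identical for every row and can be parsed once and reused. A database-free sketch of the difference (two-column statements for brevity; the real script uses 22 columns):

```python
literal_sql = "insert into testtb values({0}, '{1}')"
bind_sql = "insert into testtb values(:0, :1)"

rows = [(1, 'a'), (2, 'b'), (3, 'c')]

# literal SQL: one distinct statement text per row -> a hard parse for each
literal_texts = {literal_sql.format(*row) for row in rows}
print(len(literal_texts))

# bind SQL: the statement text never changes -> parsed once, then reused
bind_texts = {bind_sql for _ in rows}
print(len(bind_texts))
```

In practice, cx_Oracle's cursor.executemany() would cut round trips further, but the script above deliberately executes row by row so that the comparison isolates the cost of parsing rather than batching.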

4. Test the performance impact of enabling the database audit function

First check whether the audit function is turned on:


SQL> show parameter audit 
NAME   TYPE  VALUE
-------------- ----------- ----------
audit_trail string  NONE

And count the rows in sys.aud$:


SQL> select count(*) from sys.aud$;

 COUNT(*)
----------
   0

Since auditing is off and sys.aud$ is empty, the result of step 3a (using bind variables) can be taken directly as the program's run time without the audit function.

Enable the audit function on the database and restart it:


SQL> alter system set audit_trail=db,extended scope=spfile;
-- with plain audit_trail=db, the sqltext column in sys.aud$ stays empty, so the
-- users' SQL statements cannot be seen and the audit is meaningless
SQL> startup force;
SQL> drop table yang.testtb purge;
SQL> create table yang.testtb as select * from yang.test where 1=0;
SQL> audit insert table by yang;  -- audit insert operations by user yang

Change the last line of the script from CsvDataImport(use_bind=0).data_import() back to CsvDataImport(use_bind=1).data_import()

Run the script: python dataimporttest.py

Result: test.csv, import rows: 227795, use_bind: 1, elapsed: 604.23

Compared with the earlier run that also used bind variables but had no auditing, the run with the audit function enabled takes about 2.32 times as long (604.23 s vs. 260.31 s).

Now look at sys.aud$ again:


SQL> select count(*) from sys.aud$;

 COUNT(*)
----------
 227798

Since both sqltext and sqlbind in sys.aud$ are CLOB columns, the following SQL is needed to count the space occupied by the table together with its LOB segments:


SQL> select sum(bytes) from dba_extents where segment_name in (
select distinct name from (select table_name, segment_name from dba_lobs where table_name='AUD$') 
unpivot(name for i in(table_name, segment_name)));

SUM(BYTES)
----------
 369229824

For comparison, the space occupied by the testtb table:


SQL> select sum(bytes) from dba_extents where segment_name in ('TESTTB');

SUM(BYTES)
----------
 37748736

So importing a CSV of roughly 220,000 rows left the audit table occupying a striking 352 MB or so, while the testtb table itself takes only about 36 MB.

From the experiments above: enabling the database audit function seriously slows down the database and consumes a large amount of space in the SYSTEM tablespace, where sys.aud$ lives by default.

5. Summary

Use bind variables in your code whenever possible. It is also best not to enable database auditing; audit user operations through a bastion host instead (PS: bastion host vendor recommendations are welcome, which is the real purpose of this article).

This experiment is not rigorous, and the comparative figures are for reference only.


