A detailed explanation of the alternative approach to MySQL injection

  • 2021-01-02 22:00:39
  • OfStack

This article illustrates the alternative approach of MySQL to SQL injection. To share for your reference, the details are as follows:

Problem reading

I think the cost of this problem could be in the hundreds of millions of dollars a year. In this article, let's say we have the following SQL template statement:

select * from T where f1 = '{value1}' and f2 = {value2}

Now we need to populate the statement based on the user input values:


We get the following SQL statement, which we then submit to the database:

select * from T where f1='hello' and f2=5

The problem is that an attacker can construct the following user input values:

value1=anything' or 1=1 or f1='whatever

The resulting statement then becomes:

select * from T where f1='anything' or 1=1 or f1='whatever' and f2=5

The attacker succeeded in changing the semantics of the template statement. This problem is not unique to SQL, but to any language where templates are commonly used, such as HTML and shell scripts.

Description of the general solution

SQL is an axiom with arbitrariness and 1 tropism, and token and derivation rules form its axiomatic basis. One word to note is "arbitrariness". The axioms equivalent to SQL are numerous. For this arbitrary equivalent representation, every valid statement can be mapped exactly to the valid statement in SQL, as can any other language. In this arbitrary equivalence representation, if a statement is illegal, it is also illegal in SQL. It is impossible for an attacker to construct any possible, arbitrary rule equivalent to SQL.

Strategy 1: Extend the template statement with a different syntax depending on the different derivation rules

Example 1: Prefix language

SQL uses infix notation note 1. The infix notation is equivalent to the ES44en-style prefix notation note 2. Infix and prefix:

a OP1 b OP2 c <=> (OP1 a (OP2 b c))

a, b, c are identifiers or values, OP1, OP2 are operators or functions.

Example statements for prefix notation:

(select * T (and (= f1 '{value1}') (= f2 {value2})))

This statement is equivalent. They are semantically denoted. It is not a problem to automatically convert the infix representation of SQL to prefix or other representations. However, an attacker's injection is illegal with respect to prefix syntax:

(select * T (and (= f1  'anything' or 1=1 or a='whatever') (= f2 5)))

Grammatical errors. What the attacker wants is:

(select * T (or (= f1 'anything') (or (=1 1) (and (= a 'whatever') (= f2 5)))))

It's different. An attacker's injection cannot output a valid prefix language.

Example 2: Euler notation

Another alternative would be to use Euler's notation. From infix notation to Euler:

a OP1 b OP2 c <=> OP1(a,OP2(b,c))

The statement in the example:

select( *,T,and(=(f1,'{value1}'),=(f2,{value2})))

The injected statement will have a syntax error:

select( *,T,and(=(f1,'anything' or 1=1 or a='whatever'),=(f2,5)))

The attacker had intended to write:

select( *,T,or(=(f1,'anything',or(=(1,1),and(=(a,'whatever'),=(f2,5))))))

The attacker is doing the wrong thing. His injection did not pay any attention at all to the arbitrary notation chosen.

Example 3: Object tagging (object notation)

There's another alternative, object tagging. From prefix representation to object:

a OP1 b OP2 c <=> a.OP1(b).OP2(c)

Example code:



Inject one more fold into the syntax:



I will no longer provide the correct answer, as an exercise for the reader to see what the attacker should write.

Strategy 2: Select any other token for SQL

keyword is often any token in a language. What matters is their place in the derivation rules, not their arbitrary representation. You can always replace the existing keyword with another keyword and convert back and forth. For example, we can convert keyword in the following SQL statement to what we will call "arbitrary brainfuck" :

{"select":"iph0ohKi", "*":"ieZoh4xa", "from":"aeZi5uja", "where":"OoJ4aX4n", "=":"eeQu2Zad", "(":"eiD5aera",")":"Soo2uach", "or":"Ocaig5Es"}

For argument's sake, we will map operands to semi-arbitrary structured sequences:



phai1Oa6 is an arbitrarily selected sequence of characters. For the current situation, examples:

select * from T where f1 = '{value1}' and f2 = {value2}


iph0ohKi ieZoh4xa aeZi5uja @phai1Oa6@T@ OoJ4aX4n @phai1Oa6@f1@ eeQu2Zad '{value1}' @phai1Oa6@and@ @phai1Oa6@f2@ eeQu2Zad {value2}

This is legal and arbitrary brainfuck language. After injection, we get:

iph0ohKi ieZoh4xa aeZi5uja @phai1Oa6@T@ OoJ4aX4n @phai1Oa6@f1@ eeQu2Zad 'anything' or 1=1 or a='whatever' @phai1Oa6@and@ @phai1Oa6@f2@ eeQu2Zad 5

As you can see, the token it contains has 'or' and '=', which are illegal in any brainfuck language. Our grammar says that you must use:



These token are not operands either, as they will only be treated as:



In other words, the injected statement becomes illegal and unusable.

Strategy 3: Validate invariants

How many token in the template statement example below do you count?

[1] select [2] * [3] from [4] T [5] where [6] f1 [7] = [8] '{value1}' [9] and [10] f2 [11] = [12] {value2}
12. After the template is filled in, the total must still be 12, but we see the result of the attacker:

[1] select [2] * [3] from [4] T [5] where [6] f1 [7] = [8] 'anything' [9] or [10] 1 [11] = [12] 1 [13] or [14] a [15] = [16] 'whatever' [17] and [18] f2 [19] = [20] 5

There are now 20 token. By violating this invariant, you expose the problem. The same applies to the representation of the same statement, except for any brainfuck language. The padding of the template is unlikely to cause a change in the number of token at all.

In fact, you can try using other invariants and verify them after they are populated. The attacker must maintain a 1 with them.


Some people advocate that programmers should be more careful when populating SQL templates. With SQL injection, you just need to be careful with your programming. Clearly, this approach is not a solution. People still make mistakes in validating user input values and end up accepting malicious user input values. In other words, this kind of problem cannot be fundamentally solved by all of us working harder.

The real solution lies in the arbitrariness of the SQL statement itself and the requirement that all existing invariants conform to the rules of any equivalent structure. This can be done automatically without programmer intervention.

An attacker has to conform to an unknown, arbitrary rule of the brainfuck syntax. Trying to conform to a set of unknown rules is going to be a difficult problem to solve. As a result, attackers often fail.

For more information about MySQL, please refer to MySQL Log Operation Techniques, MySQL Transaction Operation Techniques, MySQL Stored Procedure Techniques, MySQL Database Locking Techniques and MySQL Common Functions.

I hope this article has been helpful to you with the MySQL database.

Related articles: