10 SQL mistakes that programmers are prone to making

  • 2020-06-01 09:48:03
  • OfStack

Java programmers mix object-oriented thinking with imperative thinking, and how well they combine the two depends on their:

  • Skill (anyone can learn to program imperatively)
  • Patterns (some people use the "pattern-pattern", that is, applying patterns everywhere and sorting everything into a named pattern)
  • State of mind (writing a good object-oriented program is much harder than writing an imperative one, at least at first, and it takes some effort)

But when a Java programmer writes an SQL statement, everything is different. SQL is a declarative language, not an object-oriented or imperative one. It is easy to write a query in SQL. Expressing the same logic in Java is not easy, because the programmer has to think about algorithms as well as programming paradigms.

Here are some common mistakes Java programmers make when writing SQL (in no particular order):

1. Forgetting about NULL

Misunderstanding NULL is probably the biggest mistake a Java programmer can make when writing SQL. One reason (though not the only one) may be that NULL is also called UNKNOWN. If it were only ever called UNKNOWN, it would be easier to understand. Another reason is that JDBC maps SQL NULL to Java null when you fetch data or bind variables. This leads people to believe that NULL = NULL (SQL) behaves the same way as null == null (Java).

One of the biggest misconceptions about NULL arises when NULL predicates are used with row value expressions.

Another misunderstanding concerns how NULL behaves in NOT IN anti-joins.
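To make the NOT IN pitfall concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module in place of JDBC only so that it runs without a database server. The author and book tables are hypothetical; the SQL is the part that matters.

```python
import sqlite3


def not_in_with_null():
    """Show how a single NULL changes NULL = NULL and a NOT IN anti-join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER);
        INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO book VALUES (10, 1), (11, NULL);
    """)
    # In SQL, NULL = NULL is neither true nor false; the driver returns None.
    null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
    # The subquery yields (1, NULL). For Bob, "2 NOT IN (1, NULL)" is
    # UNKNOWN rather than TRUE, so no author is returned at all.
    naive = conn.execute(
        "SELECT name FROM author "
        "WHERE id NOT IN (SELECT author_id FROM book)"
    ).fetchall()
    # NOT EXISTS only asks whether a matching row exists, so the NULL
    # author_id does not interfere and Bob is found.
    safe = conn.execute(
        "SELECT name FROM author a "
        "WHERE NOT EXISTS (SELECT 1 FROM book b WHERE b.author_id = a.id)"
    ).fetchall()
    conn.close()
    return null_equals_null, naive, safe
```

Calling not_in_with_null() returns the comparison result, the rows from the NOT IN query, and the rows from the NOT EXISTS query, which makes the difference between the two anti-joins easy to inspect.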


Train yourself. Every time you write SQL, think about NULL:

  • Is this predicate correct with respect to NULL?
  • Does NULL affect the result?

2. Processing data in Java memory

Very few Java developers know SQL very well. The occasional JOIN and the odd UNION, fine. But window functions? Grouping sets? Many Java developers load the SQL data into memory, convert it into some suitable collection type, and then use verbose loop structures (at least before the collection improvements in Java 8) to perform tedious math on those collections.

But some SQL databases support advanced OLAP features (which are part of the SQL standard!) that perform better and are much more convenient to write. One (non-standard) example is Oracle's excellent MODEL clause. Just let the database do the processing and bring only the results into Java memory. After all, some very smart people have optimized these expensive products. So by moving OLAP into the database, you gain two benefits:

  • Convenience. Writing the logic correctly in SQL is probably easier than writing it correctly in Java.
  • Performance. The database will probably be faster than your algorithm. More importantly, you don't have to transfer millions of records.

The solution:

Every time you implement a data-centric algorithm using Java, ask yourself: is there a way for the database to do the dirty work for me instead?
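As a small illustration of letting the database do the work, the sketch below computes the same per-customer totals twice: once with GROUP BY in the database and once with a loop over every row in application memory. Python's sqlite3 is used here instead of JDBC purely so the example is self-contained, and the orders table and function names are made up for illustration.

```python
import sqlite3


def demo_orders():
    """Create a tiny in-memory orders table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("ann", 10), ("bob", 5), ("ann", 7)],
    )
    return conn


def revenue_in_database(conn):
    """Let the database aggregate; only one row per customer is transferred."""
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()


def revenue_in_memory(conn):
    """Fetch every row and aggregate by hand in the application."""
    totals = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        totals[customer] = totals.get(customer, 0) + amount
    return sorted(totals.items())
```

Both functions return the same totals, but the in-memory variant has to transfer every order row first, which is the cost the text warns about once the table holds millions of records.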

3. Using UNION instead of UNION ALL

It is a shame that UNION ALL needs an extra keyword compared to UNION. It would have been better if the SQL standard had defined:

  • UNION (allowing duplicates)
  • UNION DISTINCT (removing duplicates)

Not only is removing duplicate rows rarely necessary (and sometimes even wrong), it can also be quite slow for large result sets, because the two subselects need to be sorted and each tuple compared with its subsequent tuple.

Note that even though the SQL standard specifies INTERSECT ALL and EXCEPT ALL, few databases implement these less useful set operators.


Every time you write a UNION, consider whether you actually meant to write UNION ALL.
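The difference is easy to see on a toy data set. The following sketch assumes two hypothetical tables, online_sale and store_sale, and runs on Python's built-in sqlite3 rather than through JDBC so that it needs no setup.

```python
import sqlite3


def union_vs_union_all():
    """Return the row counts produced by UNION and by UNION ALL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE online_sale (item TEXT);
        CREATE TABLE store_sale (item TEXT);
        INSERT INTO online_sale VALUES ('pen'), ('ink');
        INSERT INTO store_sale VALUES ('pen'), ('cap');
    """)
    # UNION removes the duplicate 'pen', which costs extra work
    # and silently loses the fact that it was sold twice.
    deduplicated = conn.execute(
        "SELECT item FROM online_sale UNION SELECT item FROM store_sale"
    ).fetchall()
    # UNION ALL simply concatenates both result sets.
    everything = conn.execute(
        "SELECT item FROM online_sale UNION ALL SELECT item FROM store_sale"
    ).fetchall()
    conn.close()
    return len(deduplicated), len(everything)
```

If the goal were to count sales, the UNION variant would give the wrong answer here, which is the "sometimes even wrong" case.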

4. Using JDBC paging to page through large result sets

Most databases support some kind of paging clause, such as LIMIT .. OFFSET, TOP .. START AT, or OFFSET .. FETCH. Even where a database supports none of these clauses, filtering on ROWNUM (Oracle) or ROW_NUMBER() OVER() (DB2, SQL Server 2008, and others) is still faster than paging in memory. This is especially true when dealing with large amounts of data.


Just use these clauses, or use a tool such as jOOQ that can simulate them for you.
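A minimal sketch of paging in the database with LIMIT .. OFFSET follows. The item table and the fetch_page helper are hypothetical, and Python's sqlite3 is used in place of JDBC only to keep the example runnable without a server. Other databases would use their own paging clause.

```python
import sqlite3


def demo_items(count=25):
    """Create an in-memory item table with ids 1..count (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO item VALUES (?)", [(i,) for i in range(1, count + 1)]
    )
    return conn


def fetch_page(conn, page, page_size):
    """Fetch one page (1-based) and let the database skip the other rows."""
    offset = (page - 1) * page_size
    rows = conn.execute(
        # A stable ORDER BY is needed, otherwise pages may overlap.
        "SELECT id FROM item ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]
```

Only the rows of the requested page ever reach the application, for example fetch_page(conn, 2, 10) on the demo table.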

5. Joining data in Java memory

Since the early days of SQL, some developers have felt uneasy about writing JOINs in SQL. There is an inherent fear that JOIN is slow. This can be true if a cost-based optimizer chooses a nested loop and possibly loads complete tables into database memory before creating a joined table source. But that happens rarely. With appropriate predicates, constraints, and indexes, merge joins and hash joins are very fast. It is all about proper metadata (I cannot cite Tom Kyte often enough here). Nevertheless, there are probably still quite a few Java developers who load two tables with separate queries into maps and join them in Java memory one way or another.


If you are selecting from various tables in several steps, think again about whether you can express the query in a single statement.
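The single-statement alternative can be sketched as follows. The author and book tables are assumptions made for this example, and the code runs on Python's sqlite3 instead of JDBC so that it is self-contained.

```python
import sqlite3


def demo_library():
    """Create hypothetical author and book tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (title TEXT, author_id INTEGER REFERENCES author(id));
        INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO book VALUES ('SQL Basics', 1), ('Joins', 2), ('Windows', 1);
    """)
    return conn


def books_with_authors(conn):
    """One JOIN in the database instead of two queries and a map lookup."""
    return conn.execute("""
        SELECT b.title, a.name
        FROM book b
        JOIN author a ON a.id = b.author_id
        ORDER BY b.title
    """).fetchall()
```

The application receives rows that are already combined, so there is no need to load both tables and match them by key in memory.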

6. Using DISTINCT or UNION to remove duplicates from an accidental cartesian product

With complex joins, one can lose track of all the relationships that play a key role in an SQL statement. In particular, when multi-column foreign key relationships are involved, it is easy to forget to add the relevant predicates to the JOIN .. ON clause. This can lead to duplicate records, but perhaps only in exceptional circumstances. Some developers then choose DISTINCT to remove these duplicates. This is wrong in three ways:

  • It (maybe) cures the symptoms but not the problem. In edge cases it may not even cure the symptoms.
  • It is slow for large result sets with many columns. DISTINCT typically performs an ORDER BY-like operation to remove duplicates.
  • It is slow for large cartesian products, which still load a lot of data into memory.


As a rule of thumb, if you get unwanted duplicate records, check your JOIN predicates. There is probably a hard-to-spot cartesian product in there somewhere.
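The following sketch shows the kind of accidental cartesian product described above, using a hypothetical two-column key (order_id, line_no) shared by order_line and shipment. It runs on Python's sqlite3 rather than JDBC for convenience. Note that in this data set DISTINCT does not even hide the problem.

```python
import sqlite3


def join_row_counts():
    """Compare an incomplete join, the same join with DISTINCT, and the fix."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE order_line (order_id INTEGER, line_no INTEGER, product TEXT,
                                 PRIMARY KEY (order_id, line_no));
        CREATE TABLE shipment (order_id INTEGER, line_no INTEGER, carrier TEXT);
        INSERT INTO order_line VALUES (1, 1, 'pen'), (1, 2, 'ink');
        INSERT INTO shipment VALUES (1, 1, 'DHL'), (1, 2, 'UPS');
    """)
    # The line_no predicate is missing: every line of order 1 is paired
    # with every shipment of order 1.
    incomplete = conn.execute("""
        SELECT l.product, s.carrier
        FROM order_line l JOIN shipment s ON s.order_id = l.order_id
    """).fetchall()
    # DISTINCT cannot repair this: the wrong pairs are all different.
    with_distinct = conn.execute("""
        SELECT DISTINCT l.product, s.carrier
        FROM order_line l JOIN shipment s ON s.order_id = l.order_id
    """).fetchall()
    # Joining on the complete key gives one shipment per order line.
    correct = conn.execute("""
        SELECT l.product, s.carrier
        FROM order_line l
        JOIN shipment s ON s.order_id = l.order_id AND s.line_no = l.line_no
    """).fetchall()
    conn.close()
    return len(incomplete), len(with_distinct), len(correct)
```

Only the join on the complete key returns the expected two rows; the other two queries return pairs such as a pen shipped by the carrier of the ink.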

7. Not using the MERGE statement

This is not really a mistake, but it may reflect a lack of knowledge of, or confidence in, the powerful MERGE statement. Some databases know other forms of UPSERT statements, such as MySQL's ON DUPLICATE KEY UPDATE clause. But MERGE is really powerful, most importantly in databases that heavily extend the SQL standard, such as SQL Server.

The solution:

If you are upserting by chaining INSERT and UPDATE, or by chaining SELECT .. FOR UPDATE and then INSERT or UPDATE, think again. Apart from avoiding the risk of race conditions, you may be able to write a simpler MERGE statement.
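The single-statement idea can be sketched as below, with one important caveat: the sketch runs on SQLite through Python's sqlite3, and SQLite has no MERGE statement. It therefore substitutes SQLite's INSERT .. ON CONFLICT .. DO UPDATE upsert (available in SQLite 3.24 and later) as a stand-in. On a database with MERGE, the statement text would differ. The setting table and helper names are hypothetical.

```python
import sqlite3


def demo_settings():
    """Create a hypothetical key/value table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE setting (name TEXT PRIMARY KEY, val TEXT)")
    return conn


def upsert_setting(conn, name, val):
    """Insert the row, or update it if the name already exists.

    This is one statement, so there is no gap between a "does it exist?"
    check and the write in which another session could interfere.
    """
    conn.execute(
        "INSERT INTO setting (name, val) VALUES (?, ?) "
        "ON CONFLICT(name) DO UPDATE SET val = excluded.val",
        (name, val),
    )


def all_settings(conn):
    return conn.execute("SELECT name, val FROM setting ORDER BY name").fetchall()
```

Calling upsert_setting twice with the same name leaves a single row holding the second value, without a separate SELECT or a fallback UPDATE in application code.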

8. Using aggregate functions instead of window functions

Before window functions were introduced, the only way to aggregate data in SQL was to use a GROUP BY clause together with aggregate functions. This works well in many situations, and when the aggregated data has to be combined with regular data, the grouped query can be pushed down into a joined subquery.

However, SQL:2003 defined window functions, which many major databases implement. Window functions aggregate data over a result set without grouping it. In fact, each window function supports its own independent PARTITION BY clause, which makes them a great tool for reporting.

Using window functions will:

  • Make SQL more readable (fewer dedicated GROUP BY clauses in subqueries)
  • Improve performance, since a relational database management system is likely to optimize window functions more easily


When you write a GROUP BY clause in a subquery, think again about whether a window function could do the job instead.
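To compare the two styles, the sketch below computes each employee's department total once with a GROUP BY subquery joined back to the table and once with a window function. The employee table is hypothetical. The code uses Python's sqlite3 in place of JDBC and assumes SQLite 3.25 or later, the first version with window functions.

```python
import sqlite3


def demo_employees():
    """Create a hypothetical employee table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER);
        INSERT INTO employee VALUES
            ('Ann', 'dev', 100), ('Bob', 'dev', 80), ('Cid', 'ops', 70);
    """)
    return conn


def totals_with_group_by(conn):
    """Classic approach: aggregate in a subquery, then join it back."""
    return conn.execute("""
        SELECT e.name, e.dept, e.salary, t.dept_total
        FROM employee e
        JOIN (SELECT dept, SUM(salary) AS dept_total
              FROM employee GROUP BY dept) t ON t.dept = e.dept
        ORDER BY e.name
    """).fetchall()


def totals_with_window(conn):
    """Window function: the rows stay ungrouped, the sum is per partition."""
    return conn.execute("""
        SELECT name, dept, salary,
               SUM(salary) OVER (PARTITION BY dept) AS dept_total
        FROM employee
        ORDER BY name
    """).fetchall()
```

Both queries return the same rows, but the window version needs neither the subquery nor the join.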

9. Using in-memory sorting for indirect sorts

The SQL ORDER BY clause supports many types of expressions, including CASE expressions, which are very useful for indirect sorting. You should probably never sort data again in Java memory just because you think that:

  • SQL sorting is too slow
  • SQL sorting cannot do it


If you are sorting any SQL data in memory, think twice about whether you really cannot sort it in the database. This also goes very well with paging in the database.
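An indirect sort with a CASE expression might look like the sketch below, where tickets are ordered by a priority label that does not sort correctly as plain text. The ticket table and its priority values are invented for this example, and Python's sqlite3 stands in for JDBC.

```python
import sqlite3


def tickets_by_priority():
    """Order rows by a custom priority ranking expressed with CASE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ticket (title TEXT, priority TEXT);
        INSERT INTO ticket VALUES
            ('fix typo', 'low'), ('db down', 'high'),
            ('slow page', 'medium'), ('add index', 'high');
    """)
    rows = conn.execute("""
        SELECT title
        FROM ticket
        ORDER BY CASE priority
                     WHEN 'high' THEN 1
                     WHEN 'medium' THEN 2
                     ELSE 3
                 END,
                 title
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Because the ordering happens in the database, it can be combined directly with a paging clause instead of fetching all rows and sorting them in the application.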

10. Inserting large numbers of records one by one

JDBC knows batching, and you should not forget it. Do not INSERT thousands of records one by one, creating a new PreparedStatement object each time. If all of your records go into the same table, create a batch INSERT with a single SQL statement and multiple sets of bind values. Depending on your database and its configuration, you may need to commit after a certain number of inserted records to keep the UNDO log small.

Always use batch processing to insert large amounts of data.
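In JDBC, batching is done with PreparedStatement.addBatch() and executeBatch(). The sketch below shows the same idea in Python's sqlite3, chosen only so that it runs without a database server: one SQL statement, many sets of bind values, and a commit after each batch. The measurement table, the helper names, and the batch size are assumptions for illustration.

```python
import sqlite3
from itertools import islice


def demo_measurements():
    """Create a hypothetical measurement table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurement (sensor TEXT, reading REAL)")
    return conn


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows in batches with one statement, committing per batch."""
    iterator = iter(rows)
    inserted = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        # One statement text, many sets of bind values.
        conn.executemany(
            "INSERT INTO measurement (sensor, reading) VALUES (?, ?)", batch
        )
        # Committing per batch keeps each transaction bounded in size.
        conn.commit()
        inserted += len(batch)
    return inserted
```

A suitable batch size depends on the database and its configuration, so the default of 1000 here is only a placeholder.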
