A method to solve the problem of fuzzy retrieval in MySQL database
- 2020-05-06 11:45:44
- OfStack
Under MySQL, when conducting Chinese fuzzy retrieval, it often returns some irrelevant records. For example, when searching "%a%" , it may return Chinese characters, but no a characters. I have encountered similar problems before. After reading Manual of MySQL in detail, I found that there is a way to solve them conveniently and get satisfactory results.
You want to search the news library by "title". The keywords may be in English and Chinese, as shown in the SQL statement:
The following is a quote:
Code:
select id,title,name from achech_com.news where title like '%a%'
Some of the title fields are identified with the "a" keyword, while others are only in Chinese, but are also returned in the search results.
The solution is to use the BINARY property for retrieval, such as
The following is a quote:
Code:
select id,title,name from achech_com.news where binary title like '%a%'
The result is more correct than before, but the English letters are case sensitive, so sometimes the result is different when retrieving "Achech" and "achech". Know the use BINARY attributes can solve this problem, in front of the look at MySQL support UCASE and CONCAT functions, including UCASE is all in English to uppercase, and CONCAT function is used to connect of characters, the following is our completely resolved SQL statement:
Code:
select id,title,name from achech_com.news
where ucase(title) like concat('%',ucase('a'),'%')
The retrieval step is to specify the attribute as BINARY in order to accurately retrieve the result. However, the content of title like title may contain upper and lower case letters. Therefore, ucase function is used to convert all the field contents into uppercase letters, and then like operation is performed. While the operation of like USES the fuzzy method, the advantage of using concat is that the keyword passed in can be direct, do not need to take "%" universal character, "a'" directly into your variable, in any language everything is free. of course you could also write
Code:
select id,title,name from achech_com.news where binary ucase(title) like ucase('%a%')
The result of the retrieval is satisfactory, but the speed may be reduced by N milliseconds.
You want to search the news library by "title". The keywords may be in English and Chinese, as shown in the SQL statement:
The following is a quote:
Code:
select id,title,name from achech_com.news where title like '%a%'
Some of the title fields are identified with the "a" keyword, while others are only in Chinese, but are also returned in the search results.
The solution is to use the BINARY property for retrieval, such as
The following is a quote:
Code:
select id,title,name from achech_com.news where binary title like '%a%'
The result is more correct than before, but the English letters are case sensitive, so sometimes the result is different when retrieving "Achech" and "achech". Know the use BINARY attributes can solve this problem, in front of the look at MySQL support UCASE and CONCAT functions, including UCASE is all in English to uppercase, and CONCAT function is used to connect of characters, the following is our completely resolved SQL statement:
Code:
select id,title,name from achech_com.news
where ucase(title) like concat('%',ucase('a'),'%')
The retrieval step is to specify the attribute as BINARY in order to accurately retrieve the result. However, the content of title like title may contain upper and lower case letters. Therefore, ucase function is used to convert all the field contents into uppercase letters, and then like operation is performed. While the operation of like USES the fuzzy method, the advantage of using concat is that the keyword passed in can be direct, do not need to take "%" universal character, "a'" directly into your variable, in any language everything is free. of course you could also write
Code:
select id,title,name from achech_com.news where binary ucase(title) like ucase('%a%')
The result of the retrieval is satisfactory, but the speed may be reduced by N milliseconds.