Chapter 4 Data Processing: PHP Regular Expressions (continued)

  • 2020-05-09 18:18:58
  • OfStack

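1. The basics of regular expressions
Meaning: a string pattern consisting of ordinary characters (such as a-z) and some special characters.
Functions: validate input.
Replace text.
Extract a substring from a string.
Classification: POSIX style and Perl-compatible style.
The POSIX style is easier to master but is not binary-safe, whereas the Perl-compatible style is more complex. Note that PHP's POSIX regex functions (ereg(), eregi(), ereg_replace(), split()) were deprecated in PHP 5.3 and removed in PHP 7.0, so current code must use the Perl-compatible preg functions.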
2. POSIX-style regular expressions
1. Write regular expressions
Table 4.3 POSIX regular expression syntax

Metacharacter

Description

\

Escape character, used to escape special characters. For example, '.' matches any single character, while '\.' matches a literal dot; '\-' matches the hyphen '-', and '\\' matches the backslash '\'

^

Matches the starting position of the input string. For example, '^he' represents a string beginning with 'he'

$

Matches the end of the input string. For example, 'ok$' represents a string ending in 'ok'

*

Matches the preceding subexpression zero or more times. For example, 'zo*' matches' z 'and' zoo '. * is equivalent to {0,}

+

Matches the previous subexpression 1 or more times. For example, 'zo+' matches' zo 'and' zoo ', but not 'z'. Plus is the same thing as {1,}

?

Matches the previous subexpression zero or one times. For example, 'do (es)? 'can match' do 'or' do 'in' does '. '? 'is equivalent to {0,1}

{n}

n is a non-negative integer. Match the determined n times. For example, 'o{2}' cannot match 'o' in 'Bob', but can match 'o' in 'food'

{n,}

n is a non-negative integer. Match n at least once. For example, 'o{2,}' does not match 'o' in 'Bob', but it matches all 'o' in 'foooood'. 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'

{n,m}

Both m and n are non-negative integers, where n≤m. Match n at least and m at most. For example, "o{1,3}" will match the first three 'o' in "fooooood". 'o{0,1}' is equivalent to 'o? '. Note that there can be no Spaces between the comma and the two Numbers

?

When the character is followed by any of 1 other qualifiers (*, +,? , {n}, {n,}, {n,m}), the matching pattern is non-greedy. The non-greedy pattern matches as few strings as possible, while the default greedy pattern matches as many strings as possible. For example, for the string "oooo", 'o+? 'will match a single' o 'and 'o+' will match all 'o'

.

Matches any single character other than '\n'. To match any character including '\n', use a pattern such as '(.|\n)'; with the Perl-compatible functions, the s modifier also makes '.' match newlines

(pattern)

Matches pattern and captures the match. Captured matches are stored in the corresponding array. To match the parenthesis characters themselves, use '\(' or '\)'

(?:pattern)

Matches pattern but does not capture the match, that is, it is a non-capturing match and is not stored. This is useful when using the "or" character '|' to combine parts of a pattern. For example, 'industr(?:y|ies)' is a shorter expression than 'industry|industries'

(?=pattern)

Positive lookahead. Matches at a position where the text that follows matches pattern. This is a non-capturing match, that is, the match is not stored for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows " in "Windows 2000", but not in "Windows 3.1". A lookahead does not consume characters: the search for the next match begins immediately after the last match, not after the characters that make up the lookahead

(?!pattern)

Negative lookahead. Matches at a position where the text that follows does not match pattern. This is a non-capturing match, that is, the match is not stored for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows " in "Windows 3.1", but not in "Windows 2000". A lookahead does not consume characters: the search for the next match begins immediately after the last match, not after the characters that make up the lookahead

x|y

Matches x or y. For example, 'z|food' matches 'z' or 'food', and '(z|f)ood' matches 'zood' or 'food'

[xyz]

A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in 'plain'

[^xyz]

A negated character set. Matches any character that is not enclosed. For example, '[^abc]' matches the 'p' in 'plain'

[a-z]

Character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from 'a' to 'z'

[^a-z]

Negated character range. Matches any character that is not within the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' to 'z'
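
The greedy, non-greedy, lookahead, and non-capturing entries in the table can be tried out with a short script. This is only a minimal sketch: it assumes the Perl-compatible preg_match() function (so each pattern is wrapped in '/' delimiters), and the variable names are made up for the illustration.

```php
<?php
// Greedy vs. non-greedy quantifiers on the string "oooo".
preg_match('/o+/', 'oooo', $greedy);  // $greedy[0] is "oooo"
preg_match('/o+?/', 'oooo', $lazy);   // $lazy[0] is "o"

// Positive lookahead: "Windows " matches only when a listed version follows.
$followedByVersion = preg_match('/Windows (?=95|98|NT|2000)/', 'Windows 2000'); // 1
$notFollowed       = preg_match('/Windows (?=95|98|NT|2000)/', 'Windows 3.1');  // 0

// Non-capturing group: the alternatives are grouped but not stored,
// so $m holds only the full match and has no $m[1].
preg_match('/industr(?:y|ies)/', 'heavy industries', $m);

echo $greedy[0], "\n";   // oooo
echo $lazy[0], "\n";     // o
echo count($m), "\n";    // 1
```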

Here are some simple examples of regular expressions:
●'[A-Za-z0-9]': matches any single uppercase letter, lowercase letter, or digit from 0 to 9.
●'^hello': matches a string starting with hello.
●'world$': matches a string ending in world.
●'.at': matches any single character other than "\n" followed by "at", such as "cat" or "nat".
●'^[a-zA-Z]': matches a string beginning with a letter.
●'hi{2}': matches the letter h followed by two i's, that is, "hii".
●'(go)+': matches a string containing 'go' one or more times in a row, such as 'gogo'.
An ID card number generally consists of 18 digits, or 17 digits followed by the letter X or Y. To match an ID card number, write:
^[0-9]{17}([0-9]|X|Y)$
A regular expression for an email address can be written as:
^[a-zA-Z0-9\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$
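
As a rough check of the two patterns above, the sketch below runs them through preg_match(), with '/' delimiters added. The helper matchesPattern() and all sample values are hypothetical and exist only for this illustration.

```php
<?php
// Patterns from the text, wrapped in "/" delimiters for the preg functions.
$idPattern    = '/^[0-9]{17}([0-9]|X|Y)$/';
$emailPattern = '/^[a-zA-Z0-9\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/';

// Hypothetical helper: true only when the whole value matches the pattern.
function matchesPattern($pattern, $value)
{
    return preg_match($pattern, $value) === 1;
}

var_dump(matchesPattern($idPattern, '12345678901234567X'));       // bool(true)
var_dump(matchesPattern($idPattern, '1234567890'));               // bool(false): too short
var_dump(matchesPattern($emailPattern, 'user@example.com'));      // bool(true)
var_dump(matchesPattern($emailPattern, 'user.name@example.com')); // bool(false): "." is not allowed before "@"
```

The last case shows that this email pattern is deliberately simple: it rejects some addresses that are valid in practice.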
2. String matching
ereg() and eregi() functions
The ereg() function searches a string for a match to a regular expression. If a match is found and $regs is supplied, it returns the length of the matched string and stores the matched text in $regs; it returns false if there is no match. eregi() works the same way but ignores case. Both functions were removed in PHP 7.0. The syntax is as follows:
int ereg(string $pattern , string $string [, array &$regs ])
 
<?php
/* This example checks whether the string is an ISO-formatted date (YYYY-MM-DD) */
$date = "1988-08-09";
$len = ereg('([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})', $date, $regs); // Date format is YYYY-MM-DD
if ($len)
{
    echo "$regs[3].$regs[2].$regs[1]" . "<br>"; // Outputs "09.08.1988"
    echo $regs[0] . "<br>"; // Outputs "1988-08-09"
    echo $len; // Outputs 10
}
else
{
    echo "Incorrect date format: $date";
}
?>
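
Because ereg() is not available in PHP 7 and later, the same date check can be sketched with preg_match(). This is one possible equivalent rather than the original listing: the pattern is unchanged apart from the added '/' delimiters, and strlen() stands in for the length that ereg() returned. The variable names $formatted and $length are made up for the illustration.

```php
<?php
$date = "1988-08-09";

// Same pattern as the ereg() example, wrapped in "/" delimiters.
if (preg_match('/([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})/', $date, $regs)) {
    $formatted = "$regs[3].$regs[2].$regs[1]"; // day.month.year
    $length    = strlen($regs[0]);             // length of the full match
    echo $formatted, "\n";  // 09.08.1988
    echo $regs[0], "\n";    // 1988-08-09
    echo $length, "\n";     // 10
} else {
    echo "Incorrect date format: $date\n";
}
```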

3. String substitution
The syntax of the ereg_replace() function is as follows:
string ereg_replace(string $pattern , string $replacement , string $string)
Note: the function replaces every part of $string that matches $pattern with $replacement and returns the resulting string. If no match is found, $string is returned unchanged.
 
<?php
$str = "hello world";
echo ereg_replace('[aeo]', 'x', $str) . "<br>"; // Outputs 'hxllx wxrld'
// Inside single quotes, double quotes need no escaping
$res = '<a href="hello.php">hello</a>';
echo ereg_replace('hello', $res, $str); // Replaces 'hello' with a hyperlink
?>

4. Splitting a string into an array

The split() function works like the explode() function, except that it splits the string at matches of the given regular expression rather than at a fixed delimiter, and returns the pieces as an array. The syntax is as follows:

array split(string $pattern , string $string [, int $limit ])

5. Generate regular expressions

3. Perl-compatible regular expressions

1. Write regular expressions

Table 4.4 Extended syntax for Perl-compatible regular expressions

Metacharacter

Description

\b

Match a word boundary, that is, the position between the word and the space. For example, 'er\b' can match 'er' in 'never', but cannot match 'er' in 'verb'.

\B

Matches nonword boundaries. 'er\B' can match 'er' in 'verb', but cannot match 'er' in 'never'.

\cx

Matches the control character specified by x. For example, '\cM' matches a Control-M or carriage return character. The value of x must be in the range A-Z or a-z; otherwise, 'c' is treated as a literal 'c' character

\d

Matches a digit character. Equivalent to '[0-9]'

\D

Matches a non-digit character. Equivalent to '[^0-9]'

\f

Matches a form feed character. Equivalent to '\x0c' and '\cL'

\n

Matches a newline character. Equivalent to '\x0a' and '\cJ'

\r

Matches a carriage return. Equivalent to '\x0d' and '\cM'

\s

Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to '[ \f\n\r\t\v]'

\S

Matches any non-whitespace character. Equivalent to '[^ \f\n\r\t\v]'

\t

Matches a tab character. Equivalent to '\x09' and '\cI'

\v

Matches a vertical tab character. Equivalent to '\x0b' and '\cK'

\w

Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'

\W

Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'

\xn

Matches n, where n is a hexadecimal escape value. The hexadecimal escape value must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions

\num

Matches num, where num is a positive integer: a backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters

\n

Identifies either an octal escape value or a backreference. If at least n captured subexpressions precede \n, n is a backreference. Otherwise, if n is an octal digit (0 to 7), n is an octal escape value

\nm

Identifies either an octal escape value or a backreference. If at least nm captured subexpressions precede \nm, nm is a backreference. If at least n captured subexpressions precede \nm, n is a backreference followed by the literal m. If neither condition holds and n and m are both octal digits (0 to 7), \nm matches the octal escape value nm

\nml

If n is an octal digit (0 to 3), and m and l are both octal digits (0 to 7), matches the octal escape value nml

\un

Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, '\u00A9' matches the copyright symbol (©). Note that PCRE, and therefore PHP's preg functions, does not support '\u'; in PHP, write '\x{00A9}' together with the u modifier instead
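
A few of the entries in Table 4.4 can be checked with the short sketch below. It uses only preg_match() and preg_replace(); the sample strings and variable names are made up for the illustration.

```php
<?php
// \b: "er" followed by a word boundary is found in "never" but not in "verb".
$boundaryNever = preg_match('/er\b/', 'never'); // 1
$boundaryVerb  = preg_match('/er\b/', 'verb');  // 0

// \B: "er" NOT followed by a word boundary is found in "verb".
$nonBoundaryVerb = preg_match('/er\B/', 'verb'); // 1

// \D: remove every non-digit character.
$digitsOnly = preg_replace('/\D/', '', 'Tel: 555-0100'); // "5550100"

// \1: backreference to the first captured group (two identical characters in a row).
preg_match('/(.)\1/', 'hello', $doubled); // $doubled[0] is "ll"

// \x41: hexadecimal escape for "A".
$hexA = preg_match('/\x41/', 'A'); // 1

echo $digitsOnly, "\n";  // 5550100
echo $doubled[0], "\n";  // ll
```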

2. String matching
The preg_match() function performs a string lookup, with the syntax as follows:
int preg_match(string $pattern , string $subject [, array $matches [, int $flags ]])
Note: this function is used in a similar way to the ereg() function: it searches the $subject string for text that matches the regular expression given by $pattern. Unlike ereg(), the pattern must be enclosed in delimiters, for example '/pattern/'.
The preg_match() function returns the number of times $pattern was matched. This is either 0 (no match) or 1, because preg_match() stops searching after the first match; it returns false if an error occurs.
The related function preg_match_all() continues searching from the end of each match until the entire string has been searched, and returns the total number of matches.
The $flags parameter of preg_match_all() can take the following three values:
●PREG_PATTERN_ORDER. The default. $matches[0] is an array of all full pattern matches, $matches[1] is an array of the strings matched by the first parenthesized subpattern, and so on.
●PREG_SET_ORDER. If this flag is set, $matches[0] is the array for the first set of matches, $matches[1] is the array for the second set of matches, and so on.
●PREG_OFFSET_CAPTURE. This flag can be combined with either of the other two flags. If it is set, the string offset of each match is returned along with the match.
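
The difference between the three flags is easiest to see on a small input. The following is a minimal sketch; the subject string, the pattern, and the variable names are made up for the illustration.

```php
<?php
$subject = 'a1 b2 c3';
$pattern = '/([a-z])(\d)/';

// PREG_PATTERN_ORDER (the default): results grouped by subpattern.
$count = preg_match_all($pattern, $subject, $byPattern, PREG_PATTERN_ORDER);
// $count is 3
// $byPattern[0] is ['a1', 'b2', 'c3'], $byPattern[1] is ['a', 'b', 'c'],
// $byPattern[2] is ['1', '2', '3']

// PREG_SET_ORDER: results grouped by match.
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);
// $bySet[0] is ['a1', 'a', '1'], $bySet[1] is ['b2', 'b', '2'], and so on

// PREG_OFFSET_CAPTURE combined with PREG_PATTERN_ORDER:
// every entry becomes a [text, offset] pair.
preg_match_all($pattern, $subject, $withOffsets, PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);
// $withOffsets[0][1] is ['b2', 3]

print_r($bySet[1]);
```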
3. String substitution
The preg_replace() function works like the ereg_replace() function: it searches a string for substrings that match the pattern and replaces them with the specified string.
The syntax is as follows:
mixed preg_replace(mixed $pattern , mixed $replacement , mixed $subject [, int $limit ])
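
As a small illustration, the sketch below repeats the earlier vowel replacement with preg_replace() and shows the optional $limit argument. The last call uses $1-style references to captured groups in the replacement string, a preg_replace() feature that the text above does not cover; the variable names are made up for the illustration.

```php
<?php
$str = "hello world";

// Replace every a, e, or o with x.
$allReplaced = preg_replace('/[aeo]/', 'x', $str);      // "hxllx wxrld"

// $limit = 1: only the first match is replaced.
$firstOnly = preg_replace('/[aeo]/', 'x', $str, 1);     // "hxllo world"

// $1, $2, $3 in the replacement refer to the captured groups.
$reordered = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3.$2.$1', '1988-08-09'); // "09.08.1988"

echo $allReplaced, "\n", $firstOnly, "\n", $reordered, "\n";
```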
4. String segmentation
The preg_split() function uses a regular expression as the delimiter to split a string and returns the resulting substrings as an array, similar to the split() function.
The syntax is as follows:
array preg_split(string $pattern , string $subject [, int $limit [, int $flags ]])
Description: matching is case-sensitive unless the pattern uses the i modifier. The function returns an array containing the substrings of $subject split along the boundaries matched by $pattern.
$limit is an optional parameter. If specified, at most $limit substrings are returned; if it is omitted or set to -1, there is no limit.
The value of $flags can be any combination of the following:
●PREG_SPLIT_NO_EMPTY. If this flag is set, the function returns only non-empty substrings.
●PREG_SPLIT_DELIM_CAPTURE. If this flag is set, text matched by parenthesized expressions in the delimiter pattern is also captured and returned.
●PREG_SPLIT_OFFSET_CAPTURE. If this flag is set, the string offset of each returned substring is also returned.
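
The effect of $limit and of each flag can be sketched as follows. The input strings and variable names are made up for the illustration.

```php
<?php
$subject = 'a, b,,c';

// No flags: the empty string between the two adjacent commas is kept.
$plain = preg_split('/,\s*/', $subject);                             // ['a', 'b', '', 'c']

// PREG_SPLIT_NO_EMPTY: empty pieces are dropped (-1 means no limit).
$noEmpty = preg_split('/,\s*/', $subject, -1, PREG_SPLIT_NO_EMPTY);  // ['a', 'b', 'c']

// $limit = 2: at most two pieces; the second holds the rest of the string.
$limited = preg_split('/,\s*/', $subject, 2);                        // ['a', 'b,,c']

// PREG_SPLIT_DELIM_CAPTURE: the parenthesized delimiter is returned too.
$withDelims = preg_split('/([,;])/', 'a,b;c', -1, PREG_SPLIT_DELIM_CAPTURE);
// ['a', ',', 'b', ';', 'c']

// PREG_SPLIT_OFFSET_CAPTURE: each piece becomes a [text, offset] pair.
$withOffsets = preg_split('/[,;]/', 'a,b;c', -1, PREG_SPLIT_OFFSET_CAPTURE);
// [['a', 0], ['b', 2], ['c', 4]]

print_r($noEmpty);
```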
4.3 Example: validating form content
Regular expressions are used to verify that the form content entered by the user meets the format requirements.
Create a new EX4_4_Hpage.php file containing an HTML form that submits the fields ID, PWD, PHONE, and EMAIL using the POST method. (The source of the form is not included here.)

Create a new EX4_4_Ppage.php file and enter the following code:
 
<?php
include 'EX4_4_Hpage.php'; // Include the form page EX4_4_Hpage.php
// Read the submitted fields; use an empty string if a field is missing
$id    = isset($_POST['ID'])    ? $_POST['ID']    : '';
$pwd   = isset($_POST['PWD'])   ? $_POST['PWD']   : '';
$phone = isset($_POST['PHONE']) ? $_POST['PHONE'] : '';
$Email = isset($_POST['EMAIL']) ? $_POST['EMAIL'] : '';
$checkid    = preg_match('/^\w{1,10}$/', $id);    // 1 to 10 word characters
$checkpwd   = preg_match('/^\d{4,14}$/', $pwd);   // 4 to 14 digits
$checkphone = preg_match('/^1\d{10}$/', $phone);  // 11 digits, starting with 1
// Check the validity of the email address
$checkEmail = preg_match('/^[a-zA-Z0-9_\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/', $Email);
if ($checkid && $checkpwd && $checkphone && $checkEmail) { // All four checks returned 1
    echo "Registration successful!";
} else {
    echo "Registration failed: the format is incorrect.";
}
?>
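
To try the four validation rules without a form, they can be wrapped in a function and called with sample data. This is only a sketch: validateRegistration() and the sample values are hypothetical and not part of the original example, while the patterns are the ones used above.

```php
<?php
// Hypothetical helper: returns the names of the fields that fail validation.
function validateRegistration(array $fields)
{
    $rules = array(
        'ID'    => '/^\w{1,10}$/',   // 1 to 10 word characters
        'PWD'   => '/^\d{4,14}$/',   // 4 to 14 digits
        'PHONE' => '/^1\d{10}$/',    // 11 digits, starting with 1
        'EMAIL' => '/^[a-zA-Z0-9_\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/',
    );
    $failed = array();
    foreach ($rules as $name => $pattern) {
        // A missing field is treated as an empty string, which fails every rule.
        $value = isset($fields[$name]) ? (string) $fields[$name] : '';
        if (preg_match($pattern, $value) !== 1) {
            $failed[] = $name;
        }
    }
    return $failed;
}

// Made-up sample input: all four fields are well formed.
$good = array('ID' => 'user_01', 'PWD' => '123456',
              'PHONE' => '13800000000', 'EMAIL' => 'user_01@example.com');
// Made-up sample input: the password is not numeric and the phone does not start with 1.
$bad = array('ID' => 'user_01', 'PWD' => 'abc',
             'PHONE' => '23800000000', 'EMAIL' => 'user_01@example.com');

print_r(validateRegistration($good)); // empty array
print_r(validateRegistration($bad));  // PWD and PHONE
```

Returning the list of failing fields, rather than a single true/false value, makes it possible to tell the user which input to correct.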
