php正则子模式贪婪,php关于正则表达式贪婪模式与非贪婪

322 阅读 0 评论 213 点赞

我是靠谱客的博主清脆小丸子，这篇文章主要介绍php正则子模式贪婪,php关于正则表达式贪婪模式与非贪婪，现在分享给大家，希望可以做个参考。

工作中，我们经常要用到正则表达式去匹配到我们想要的数据，甚至还会把匹配到的数据替换成我们需要的数据。这一切，似乎很难做到，但是如果你会熟练使用正则表达式，这些，就不是个菜了。

一、贪婪与非贪婪

贪婪模式：可以这样认为，就是在整个表达式匹配成功的前提下，尽可能多的匹配，也就是所谓的“贪婪”，通俗点讲，就是看到想要的，有多少就捡多少，除非再也没有想要的了。

非贪婪模式：可以这样认为，就是在整个表达式匹配成功的前提下，尽可能少的匹配，也就是所谓的“非贪婪”，通俗点讲，就是找到一个想要的捡起来就行了，至于还有没有没捡的就不管了。

什么叫贪婪，比如说要从字符串中

香肠月饼吃东西，本来你只可以吃香肠，可是你贪心，于是就把第一个到最后一个里面的两个吃的取出来了，你想多吃点，非贪婪也就是你不贪吃了，就只吃香肠。

我们来看看正则里面是怎么贪婪的：

$str = '

香肠月饼';

preg_match('/

(.*)/',$str,$rs);

print_r($rs);

这样输出的结果是：

Array ( [0] => 香肠月饼 [1] => 香肠月饼 )

怎么来限制贪婪？

在修饰匹配次数的特殊符号后再加上一个 "?" 号，则可以使匹配次数不定的表达式尽可能少的匹配。

$str = '

香肠月饼';

preg_match('/

(.*?)/',$str,$rs);

print_r($rs);

这样输出的结果是：

Array ( [0] => 香肠 [1] => 香肠 )

在PHP中还可以通过模式修饰符来实现，大写"U"：

$str = '

香肠月饼';

preg_match('/

(.*)/U',$str,$rs);

print_r($rs);

这样输出的结果是与上面一样的！

二、预搜索

预搜索是一个非获取匹配，不进行存储供以后使用。

1、正向预搜索 "(?=xxxxx)"，"(?!xxxxx)"

"(?=xxxxx)”:所在缝隙的右侧，必须能够匹配上 xxxxx 这部分的表达式，

$str = 'windows NT windows 2003 windows xp';

preg_match('/windows (?=xp)/',$str,$res);

print_r($res);

结果：

Array

(

[0] => windows

)

这个是xp前面的windows，不会取NT和2003前面的。

格式："(?!xxxxx)"，所在缝隙的右侧，必须不能匹配 xxxxx 这部分表达式

$str = 'windows NT windows 2003 windows xp';

preg_match_all('/windows (?!xp)/',$str,$res);

print_r($res);

结果：

Array

(

[0] => Array

(

[0] => windows 这个是nt前面的

[1] => windows 这个是2003前面的

)

从这里可以看出，预搜索不进行存储供以后使用。与会存储的对比下。

$str = 'windows NT windows 2003 windows xp';

preg_match_all('/windows ([^xp])/',$str,$res);

print_r($res);

结果：

Array

(

[0] => Array 全部模式匹配的数组

(

[0] => windows N

[1] => windows 2

)

[1] => Array 子模式所匹配的字符串组成的数组，通过存储取得。

(

[0] => N

[1] => 2

)

2、反向预搜索

"(?<=xxxxx)"，"(?

"(?<=xxxxx)" :所在缝隙的 "左侧”能够匹配xxxxx部分。

$str = '1234567890123456';

preg_match('/(?<=d{4})d+(?=d{4})/',$str,$res);

print_r($res);

?>结果：

Array

(

[0] => 56789012

)

匹配除了前4个数字和后4个数字之外的中间8个数字

"(?

$str = '我1234567890123456';

preg_match('/(?

print_r($res);

结果：

Array

(

[0] => 234567890123456

)

@下面的例子其实主要讲了三个函数

preg_replace();

preg_match_all();preg_match()的基本用法。有兴趣的就看下吧！

例1：

我们把需要替换的放在parrerns数组里面，把换成后的数据放在replacement数组里面：

$string为一个字符串，定义$patterns为一个基于索引的数组：

函数讲解：preg_replace();preg_replace ( $pattern ,$replacement ,$subject [,int$limit= -1 [,int&$count ]] )

搜索subject中匹配pattern的部分，以replacement进行替换。

date_default_timezone_set("Asia/Shanghai");

$patterns = array();

$patterns[0] = '/2013-12-13 09:00:09/';

$patterns[1] = '/weobo/';

$patterns[2] = '/lail/';

$replacements = array();

$replacements[2] = date('Y-m-d H:i:s',time());

$replacements[1] = 'tengx';

$replacements[0] = 'buyao';

echo preg_replace($patterns, $replacements, $string);

以上的例子会输出：

以上例子我们可以把我们需要替换的值用索引数组的形式表达出来，然后一个一个的去替换，是不是很方便呢？

例2：

下面么再来看一个，我们需要把字符串里面的日期转当前的日期或者为我们需要的东西

date_default_timezone_set("Asia/Shanghai");

$pattern = '/d{4}-d{2}-d{2}s+d{2}:d{2}:d{2}/';

$replacement = date('Y-m-d H:i:s',time());

echo preg_replace($pattern, $replacement, $string);

以上的例子会输出：

字符串里面的日期被换成今天的日期了。

例3：

正则表达式如何匹配出现第一次出现下划线以前的内容：

函数讲解：preg_match_all();intpreg_match_all (string $pattern ,string$subject [,array&$matches [,int$flags=PREG_PATTERN_ORDER [,int$offset= 0 ]]] )

搜索subject中所有匹配pattern给定正则表达式的匹配结果并且将它们以flag指定顺序输出到matches中.

在第一个匹配找到后, 子序列继续从最后一次匹配位置搜索.

date_default_timezone_set("Asia/Shanghai");

$content="正则表达式如何匹配第一次出现下划线以前的内容_第二次_第三次_";

preg_match_all('/([^_]*?)_/i',$content, $matches);

if(count($matches[1])>0)

{

echo $matches[1][0];

}

var_dump($matches);

以上的例子会输出：

正则表达式如何匹配第一次出现下划线以前的内容

上面的数组打印出来的结果为：

array(2) { [0]=> array(3) { [0]=> string(45) "正则表达式如何匹配第一次出现下划线以前的内容_" [1]=> string(7) "第二次_" [2]=> string(7) "第三次_" } [1]=> array(3) { [0]=> string(44) "正则表达式如何匹配第一次出现下划线以前的内容" [1]=> string(6) "第二次" [2]=> string(6) "第三次" } }

例4：

函数讲解：preg_match()搜索subject与pattern给定的正则表达式的一个匹配.

preg_match()返回 pattern 的匹配次数。它的值将是0次(不匹配)或1次，因为 preg_match()在第一次匹配后将会停止搜索。 subject 直到到达结尾。如果发生错误 preg_match()返回 FALSE。

date_default_timezone_set("Asia/Shanghai");

$pattern="/d{4}(W)d{2}\1d{2}s+d{2}(W)d{2}\2d{2}s+(?:am|pm)/"; //正则表达式模式

$string="today is 2013/12/14 10:35:28 am..."; //需要和上面模式字符串进行匹配的变量字符串

if(preg_match($pattern, $string, $arr)){

echo "正则表达式 {$pattern} 和字符串 {$string} 匹配成功
";

echo '

';

print_r($arr);

echo '

}else{

echo "正则表达式{$pattern} 和字符串 {$string} 匹配失败";

}

以上的例子会输出：

正则表达式 /d{4}(W)d{2}1d{2}s+d{2}(W)d{2}2d{2}s+(?:am|pm)/ 和字符串 today is 2013/12/14 10:35:28 am... 匹配成功

Array

(

[0] => 2013/12/14 10:35:28 am

[1] => /

[2] => :

)

匹配字符串里面的时间匹配成功。

例5：

匹配字符串中的url

date_default_timezone_set("Asia/Shanghai");

$str="这是一个正则表https://www.sina.com达式的匹配函数";

$url="/(https?|ftps?)://((www|mail|news).([^./]+).(com|org|net|cn))/i";

if(preg_match($url, $str, $arr)){

echo "字符串中有正确的URL信息
";

echo '

';

print_r($arr);

echo '

}else{

echo "字符串中不包括URL";

}

以上的例子会输出：

字符串中有正确的URL信息

Array

(

[0] => https://www.sina.com

[1] => https

[2] => www.sina.com

[3] => www

[4] => sina

[5] => com

)

@其实我们也可以封装一个函数去完成匹配URL的功能：

date_default_timezone_set("Asia/Shanghai");

$str="这是一个正则表https://www.baidu.com达式的匹配函数

这是一个正则表http://www.baidu1.com达式的匹配函数

这是一个正则表https://mail.baidu2.com达式的匹配函数

这是一个正则表https://news.baidu3.com达式的匹配函数

这是一个正则表https://www.baidu4.org达式的匹配函数

这是一个正则表https://www.baidu5.net达式的匹配函数

这是一个正则表ftps://www.baidu6.com达式的匹配函数

这是一个正则表ftp://www.google7.com达式的匹配函数

这是一个正则表https://www.baidu7.net达式的匹配函数";

function setUrl($str) {

$url="/(https?|ftps?)://((www|mail|news).([^./]+).(com|org|net|cn))/i";

preg_match_all($url, $str, $arr,PREG_PATTERN_ORDER );

foreach($arr[0] as $url){

$str=str_replace($url, ''.$url.'', $str);

}

return $str;

}

echo setUrl($str);

以上的例子会输出：

这是一个正则表https://www.baidu.com达式的匹配函数这是一个正则表http://www.baidu1.com达式的匹配函数这是一个正则表https://mail.baidu2.com达式的匹配函数这是一个正则表https://news.baidu3.com达式的匹配函数这是一个正则表https://www.baidu4.org达式的匹配函数这是一个正则表https://www.baidu5.net达式的匹配函数这是一个正则表ftps://www.baidu6.com达式的匹配函数这是一个正则表ftp://www.google7.com达式的匹配函数这是一个正则表https://www.baidu7.net达式的匹配函数