Regex to find all matches



2

I need a regular expression that finds all matches of my pattern.

The text looks like this:

"someother text !style_delete [company code : 43ev4] between text !style_delete [organiztion : 0asj9] end of line text"

I want to find every match of this pattern:

!style_delete [.*]

I tried this:

Pattern pattern = Pattern.compile("!style_delete\\s*\\[.*\\]");

With this, the whole span comes back as a single match:

!style_delete [company code : 43ev4] between text !style_delete [organiztion : 0asj9]

But I expected the following:

match 1 : !style_delete [company code : 43ev4] 
match 2 : !style_delete [organiztion : 0asj9]

Please help me with a Java regular expression that produces the output above.


3 Answers

  • 1

    You need to use non-greedy matching:

    start.*?end
    

    In your case, the pattern is:

    !style_delete\\s\\[(.*?)\\] (even simpler to understand than the first version :))
    

    Proof (Java 7):

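    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // The statements below go inside a method body.
    String string = "someother text !style_delete [company code : 43ev4] between text !style_delete [organiztion : 0asj9] end of line text";
    Pattern pattern = Pattern.compile("!style_delete\\s\\[(.*?)\\]");
    Matcher matcher = pattern.matcher(string);
    // find() advances to the next match on each call, so the loop visits every match.
    while (matcher.find()) {
        System.out.println(matcher.group());
    }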
    

    Link to the proof: http://ideone.com/Qtymb3
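
To make the difference visible, the following sketch runs the greedy pattern from the question and the non-greedy one side by side on the same text, and also reads capturing group 1 to get only the bracket contents. The class name GreedyVsLazy and the helper findAll are illustrative names, not part of any answer here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Sample text from the question.
    static final String INPUT = "someother text !style_delete [company code : 43ev4] "
            + "between text !style_delete [organiztion : 0asj9] end of line text";

    // Hypothetical helper: collects the given group of every match of regex in text.
    static List<String> findAll(String regex, String text, int group) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        // Greedy: .* runs to the LAST ']' in the text, so there is a single long match.
        System.out.println(findAll("!style_delete\\s*\\[.*\\]", INPUT, 0).size());   // 1

        // Non-greedy: .*? stops at the FIRST ']' after each opening bracket.
        System.out.println(findAll("!style_delete\\s*\\[(.*?)\\]", INPUT, 0));
        // [!style_delete [company code : 43ev4], !style_delete [organiztion : 0asj9]]

        // Group 1 holds only what the parentheses captured, i.e. the bracket contents.
        System.out.println(findAll("!style_delete\\s*\\[(.*?)\\]", INPUT, 1));
        // [company code : 43ev4, organiztion : 0asj9]
    }
}
```

group() (or group(0)) is the whole match, while group(1) is useful when only the value inside the brackets is needed.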


  • 3
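    import static org.junit.Assert.assertEquals;

    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import org.junit.Test;

    // Assumes JUnit 4; the class name is only a placeholder.
    public class StyleDeleteRegexTest {
        @Test
        public void test() {
            final String input = "someother text !style_delete [company code : 43ev4] between text !style_delete [organiztion : 0asj9] end of line text";
            // my regexp:
            // final String regex = "(!style_delete\\s\\[[a-zA-Z0-9\\s:]*\\])";
            // regexp from Trinimon:
            final String regex = "(!style_delete\\s*\\[[^\\]]*\\])";

            final Matcher m = Pattern.compile(regex).matcher(input);

            // Collect every match; group(0) is the full matched text.
            final List<String> matches = new ArrayList<>();
            while (m.find()) {
                matches.add(m.group(0));
            }

            // JUnit 4 argument order: message, expected, actual.
            assertEquals(2, matches.size());
            assertEquals("match 1: ", "!style_delete [company code : 43ev4]", matches.get(0));
            assertEquals("match 2: ", "!style_delete [organiztion : 0asj9]", matches.get(1));
        }
    }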
    

    Edit

    Trinimon's pattern is perhaps a bit more elegant, so I have updated my code to use that regex.


  • 3

    This happens because .* is greedy. Use this instead:

    "!style_delete\\s*\\[[^\\]]*\\]"
    

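    This means: match everything inside the brackets, up to but not including the closing ].

    Alternatively, make the content between [ and ] non-greedy: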

    "!style_delete\\s*\\[.*?\\]"
    
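
The two patterns in the last answer give the same two matches on the sample text, but they are not interchangeable in general: in Java, . does not match line terminators by default, whereas [^\]] does. The sketch below illustrates this under the assumption of a made-up input in which one bracketed value spans a line break; the class name BracketMatchers, the helper findAll, and the input string are all illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketMatchers {
    // Negated character class: anything except ']' (including newlines).
    static final Pattern NEGATED = Pattern.compile("!style_delete\\s*\\[[^\\]]*\\]");
    // Non-greedy dot: '.' skips line terminators unless DOTALL is set.
    static final Pattern LAZY = Pattern.compile("!style_delete\\s*\\[.*?\\]");
    // Same non-greedy pattern, with DOTALL so '.' also matches line terminators.
    static final Pattern LAZY_DOTALL =
            Pattern.compile("!style_delete\\s*\\[.*?\\]", Pattern.DOTALL);

    // Hypothetical helper: returns the full text of every match of p in text.
    static List<String> findAll(Pattern p, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up input: the first bracketed value contains a newline.
        String multiline = "a !style_delete [company code :\n43ev4] b "
                + "!style_delete [organiztion : 0asj9] c";

        System.out.println(findAll(NEGATED, multiline).size());     // 2
        System.out.println(findAll(LAZY, multiline).size());        // 1 (first block is skipped)
        System.out.println(findAll(LAZY_DOTALL, multiline).size()); // 2
    }
}
```

If the bracketed values never contain line breaks, either pattern works. The negated class also avoids backtracking over the bracket contents, which is one reason it is often preferred.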