
Java - Explain this regular expression (",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1)


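I am splitting the string foo,bar,c;qual="baz,blurb",d;junk="quux,syzygy" on commas, but I want to keep the commas that are inside quotes. This was answered in another question, but that answer did not fully explain how the poster came up with this code: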

line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);
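For reference, here is a minimal sketch of what that call does to the sample string. The class name `QuoteAwareSplit` and the method `splitOutsideQuotes` are made up for illustration; only `String.split` and the regex itself come from the question.

```java
class QuoteAwareSplit {
    // Split on every comma that is followed by an even number of double quotes,
    // which is how the regex decides a comma is outside a quoted section.
    static String[] splitOutsideQuotes(String line) {
        return line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);
    }

    public static void main(String[] args) {
        String line = "foo,bar,c;qual=\"baz,blurb\",d;junk=\"quux,syzygy\"";
        // Prints one token per line:
        // foo
        // bar
        // c;qual="baz,blurb"
        // d;junk="quux,syzygy"
        for (String token : splitOutsideQuotes(line)) {
            System.out.println(token);
        }
    }
}
```

The commas inside "baz,blurb" and "quux,syzygy" are left alone because an odd number of quotes follows each of them.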

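OK, so I do understand some of what is going on, but one part confuses me. I know the first comma is the character being matched.

Then

(?=

is a lookahead.

Then the first part is grouped:

([^\"]*\"[^\"]*\").

This is where I get confused. So the first part

[^\"]*

means that the beginning of any line with quotes separates tokens zero or more times.

Then comes \". Is this like opening a quote in the string, or is it saying to match this quote?

Then it repeats the exact same piece of code. Why?

([^\"]*\"[^\"]*\")

In the second part the same code is added again to say that it must finish with a quote.

Can someone explain the part I am not getting?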

3 Answers

  • 3

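    I think the answer there explains it well:

    [^\"] matches any character other than a quote. \" is a quote.

    So this part, ([^\"]*\"[^\"]*\"), reads as:

    • [^\"]* matches anything except a quote, 0 or more times

    • \" matches a quote; yes, this is the opening quote

    • [^\"]* matches anything except a quote, 0 or more times

    • \" matches a quote, the closing quote

    They only need the first [^\"]* because their fields do not start with a quote; their example input looks like a="abc",b="d,ef". If you were parsing "abc","d,ef", you would not need it.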
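Put together, the lookahead succeeds only when the rest of the string is made of complete quote pairs, which means an even number of quotes after the comma. A small sketch to check this on the a="abc",b="d,ef" input follows. The class `LookaheadParity` and its helpers are hypothetical names, and the sketch uses `Matcher.matches()` on the remainder of the string in place of the trailing `$`.

```java
import java.util.regex.Pattern;

class LookaheadParity {
    // Body of the lookahead: zero or more quote pairs, then no more quotes.
    static final Pattern REST = Pattern.compile("([^\"]*\"[^\"]*\")*[^\"]*");

    // True if everything after the comma at commaIndex matches the lookahead body.
    static boolean lookaheadSucceeds(String s, int commaIndex) {
        return REST.matcher(s.substring(commaIndex + 1)).matches();
    }

    // Number of double quotes after the given index.
    static int quotesAfter(String s, int index) {
        int count = 0;
        for (int i = index + 1; i < s.length(); i++) {
            if (s.charAt(i) == '"') count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a=\"abc\",b=\"d,ef\"";
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ',') {
                System.out.println("comma at " + i
                        + ": quotes after = " + quotesAfter(s, i)
                        + ", split here = " + lookaheadSucceeds(s, i));
            }
        }
    }
}
```

The comma between the two fields has two quotes after it, so it is a split point; the comma inside "d,ef" has one quote after it, so it is not.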

  • 1

    Here is your string: /,(?=([^\"]*\"[^\"]*\")*[^\"]*$)/

    Here is the readout from https://regex101.com/

    , matches the character , literally
    (?=([^\"]*\"[^\"]*\")*[^\"]*$) Positive Lookahead - Assert that the regex below can be matched
      1st Capturing group ([^\"]*\"[^\"]*\")*
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        Note: A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
        [^\"]* match a single character not present in the list below
          Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
          \" matches the character " literally
        \" matches the character " literally
        [^\"]* match a single character not present in the list below
          Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
          \" matches the character " literally
        \" matches the character " literally
      [^\"]* match a single character not present in the list below
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        \" matches the character " literally
      $ assert position at end of the string
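The note in the readout about repeated capturing groups suggests a non-capturing group when the captured text is not needed, which is the case for `split`. A sketch of that variant, assuming the only change is `(` to `(?:` (the class and constant names are invented for this example):

```java
import java.util.Arrays;

class NonCapturingSplit {
    // Regex from the question, with a capturing group.
    static final String CAPTURING = ",(?=([^\"]*\"[^\"]*\")*[^\"]*$)";
    // Same regex with the group made non-capturing.
    static final String NON_CAPTURING = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    public static void main(String[] args) {
        String line = "foo,bar,c;qual=\"baz,blurb\",d;junk=\"quux,syzygy\"";
        String[] a = line.split(CAPTURING, -1);
        String[] b = line.split(NON_CAPTURING, -1);
        System.out.println(Arrays.equals(a, b)); // true
    }
}
```

Java's `String.split` never adds captured groups to the result, so both forms yield the same tokens; the non-capturing form just states the intent more directly.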
    
  • 0

    [^\"] is any character that is not a ". \" matches a ". So basically ([^\"]*\"[^\"]*\") matches a string that contains exactly 2 " characters and whose last character is ".
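That description of the group can be checked by matching the group alone against a few inputs. This is only an illustrative sketch; `QuotedPairGroup` and `isOnePair` are hypothetical names.

```java
import java.util.regex.Pattern;

class QuotedPairGroup {
    // The repeated group from the question, taken on its own.
    static final Pattern GROUP = Pattern.compile("[^\"]*\"[^\"]*\"");

    // True if the whole input is: non-quotes, a quote, non-quotes, a quote.
    static boolean isOnePair(String input) {
        return GROUP.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isOnePair("qual=\"baz,blurb\"")); // true: two quotes, ends with a quote
        System.out.println(isOnePair("\"\""));               // true: both [^\"]* parts may be empty
        System.out.println(isOnePair("qual=\"baz"));         // false: only one quote
        System.out.println(isOnePair("\"abc\"x"));           // false: last character is not a quote
    }
}
```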
