
Regex that matches all valid formats of IPv6 addresses


At first glance, I concede that this question looks like a duplicate of this question and any other related to it:

Regular expression that matches valid IPv6 addresses

In fact, the answers to that question almost answer mine, but not fully.

The code I am having trouble with, which is also my most successful attempt so far, is shown below:

private string RemoveIPv6(string sInput)
{
    string pattern = @"(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))";
    //That is one looooong regex! From: https://stackoverflow.com/a/17871737/3472690
    //if (IsCompressedIPv6(sInput))
      //  sInput = UncompressIPv6(sInput);
    string output = Regex.Replace(sInput, pattern, "");
    if (output.Contains("Addresses"))
        output = output.Substring(0, "Addresses: ".Length);

    return output;
}

The problem I have with the regex pattern provided in David M. Syzdek's answer is that it does not fully match the IPv6 addresses I throw at it.

I am using the regex pattern mainly to replace IPv6 addresses in strings with a blank or an empty value.

For example,

Addresses:  2404:6800:4003:c02::8a

As well as...

Addresses:  2404:6800:4003:804::200e

And finally...

Addresses:  2001:4998:c:a06::2:4008

None of these is fully matched by the regex.

After the replacement, the remainder of each string looks like this:

Addresses:  8a

Addresses:  200e

Addresses:  2:4008

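As can be seen, the regex leaves behind remnants of the IPv6 addresses, and because the remnants vary in format they are hard to detect and remove. Here is the regex pattern on its own, for easier analysis: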

(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))

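So my question is: how can this regex pattern be corrected so that it matches, and therefore allows the complete removal of, any IPv6 address in a string that contains more than just the IPv6 address itself?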

Alternatively, how can the code snippet I provided above be corrected to give the required result?

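For those who may be wondering, I get the strings from the StandardOutput of the nslookup command, and the IPv6 addresses will always differ. For the examples above, I got those IPv6 addresses from "google.com" and "yahoo.com".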

I have a good reason for not using the built-in functions to resolve DNS entries, which I don't think matters for now, so I am using nslookup.

As for the code that calls the function, in case it is needed, here it is (it is itself another function/method, or rather part of one):

string output = "";
string garbagecan = "";
string tempRead = "";
string lastRead = "";
using (StreamReader reader = nslookup.StandardOutput)
{
     while (reader.Peek() != -1)
     {
         if (LinesRead > 3)
         {
             tempRead = reader.ReadLine();
             tempRead = RemoveIPv6(tempRead);

             if (tempRead.Contains("Addresses"))
                 output += tempRead;
             else if (lastRead.Contains("Addresses"))
                 output += tempRead.Trim() + Environment.NewLine;
             else
                 output += tempRead + Environment.NewLine;
             lastRead = tempRead;
         }
         else
             garbagecan = reader.ReadLine();
         LinesRead++;
     }
 }
 return output;

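The corrected regex should remove only IPv6 addresses and leave IPv4 addresses untouched. The strings passed to the regex will not contain an IPv6 address on its own and will almost always contain other details, so the index at which the address appears is unpredictable. It should also be noted that, for some reason, the regex skips every IPv6 address after the first one it encounters.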

Apologies if any details are missing; I will do my best to add them when alerted. I would also prefer code samples if possible, as I have almost no knowledge of regex.

1 Answer

  • 8
    (?:^|(?<=\s))(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))(?=\s|$)
    

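    Using lookarounds, you can enforce a complete match rather than a partial one. Without them, the second alternative, ([0-9a-fA-F]{1,4}:){1,7}:, succeeds as soon as it reaches the "::", so the engine stops there and leaves the trailing groups behind. The lookahead (?=\s|$) rejects that early match and makes the engine try the later alternatives. See the demo: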

    https://regex101.com/r/cT0hV4/5
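Since the question asked for a code sample, here is a minimal C# sketch of how the lookaround-wrapped pattern could be dropped into the asker's method. The class name `Ipv6Stripper` and the helper `RemoveIPv6Unanchored` are hypothetical names introduced only for illustration; the core pattern is copied unchanged from the question, and the sample strings are the ones quoted there plus an arbitrary IPv4 address.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original question or answer.
public static class Ipv6Stripper
{
    // The pattern from the question, unchanged.
    private const string Core = @"(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))";

    // Same pattern wrapped in the answer's lookarounds: a match must start
    // at the beginning of the input or right after whitespace, and must be
    // followed by whitespace or the end of the input.
    private static readonly Regex Anchored =
        new Regex(@"(?:^|(?<=\s))" + Core + @"(?=\s|$)");

    // Removes every whitespace-delimited IPv6 address from the input.
    public static string RemoveIPv6(string input)
    {
        return Anchored.Replace(input, "");
    }

    // The question's behaviour, kept only for comparison: without the
    // lookarounds an earlier alternative can win with a partial match.
    public static string RemoveIPv6Unanchored(string input)
    {
        return Regex.Replace(input, Core, "");
    }
}
```

With this sketch, `Ipv6Stripper.RemoveIPv6("Addresses:  2404:6800:4003:c02::8a")` should leave only the `Addresses:  ` label, whereas `RemoveIPv6Unanchored` reproduces the `8a` remnant described in the question. One trade-off of the lookarounds is that an address directly followed by punctuation (a comma or a closing bracket, say) is no longer matched; that looks acceptable for nslookup output, where addresses are separated by whitespace, but it is an assumption worth checking against the real input.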
