How to use sed to replace only the first occurrence in a file?

169

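I want to update a large number of C source files with an extra include directive before any existing #includes. For this sort of task, I normally use a small bash script with sed to rewrite the file.

How do I get sed to replace just the first occurrence of a string in a file rather than replacing every occurrence?

If I use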

sed s/#include/#include "newfile.h"\n#include/

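it replaces all #includes.

Alternative suggestions to achieve the same thing are also welcome.

19 answers

As an alternative suggestion you may want to look at the ed command.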

man 1 ed

teststr='
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
'

# for in-place file editing use "ed -s file" and replace ",p" with "w"
# cf. http://wiki.bash-hackers.org/howto/edit-ed
cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' | ed -s <(echo "$teststr")
   H
   /# *include/i
   #include "newfile.h"
   .
   ,p
   q
EOF

Answered on 2024-04-29T09:12:35+08:00

2
Possible solution:
```
/#include/!{p;d;}
i\
#include "newfile.h"
:a
n
ba
```
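Explanation:
- read lines until we find the #include, print these lines, then start a new cycle
- insert the new include line
- enter a loop that just reads lines (by default sed will also print these lines); we won't get back to the first part of the script from here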
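As a rough sketch, the script can also be run straight from the command line by splitting it across `-e` options. The function name `insert_before_first_include`, the loop label `a` (GNU sed rejects an empty label), and the sample input are illustrative assumptions, not part of the answer.

```shell
# Hypothetical wrapper: reads C source on stdin, writes the result to stdout.
# Lines before the first #include are printed once (p) and the cycle restarts (d).
# At the first #include the new line is inserted, then the n loop passes
# every remaining line through unchanged.
insert_before_first_include() {
  sed -e '/#include/!{p;d;}' \
      -e 'i\
#include "newfile.h"' \
      -e ':a' -e 'n' -e 'ba'
}

printf '%s\n' 'int x;' '#include <stdio.h>' '#include <stdlib.h>' |
  insert_before_first_include
```

With this sample input, the new include line should appear once, directly above `#include <stdio.h>`.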
Answered on 2024-04-29T09:12:35+08:00

I would do this with an awk script:

BEGIN {i=0}
(i==0) && /#include/ {print "#include \"newfile.h\""; i=1}
{print $0}    
END {}

then run it with awk:

awk -f awkscript headerfile.h > headerfilenew.h

May be sloppy, I'm new to this.

Answered on 2024-04-29T09:12:35+08:00

# sed script to change "foo" to "bar" only on the first occurrence
 1{x;s/^/first/;x;}
 1,/foo/{x;/first/s///;x;s/foo/bar/;}
 #---end of script---

Or, if you prefer (editor's note: works with GNU sed only):

sed '0,/RE/s//to_that/' file

Source

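Answered on 2024-04-29T09:12:35+08:00

2
Write a sed script that will only replace the first occurrence of "Apple" by "Banana".

Example input and output:
```
Input       Output
Apple       Banana
Orange      Orange
Apple       Apple
```
Here's a simple script (editor's note: works with GNU sed only):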
```
sed '0,/Apple/{s/Apple/Banana/}' filename
```
Answered on 2024-04-29T09:12:35+08:00

sed '0,/pattern/s/pattern/replacement/' filename

This worked for me.

Example

sed '0,/<Menu>/s/<Menu>/<Menu><Menu>Sub menu<\/Menu>/' try.txt > abc.txt

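Editor's note: both work with GNU sed only.

Answered on 2024-04-29T09:12:35+08:00

7
An overview of the many helpful existing answers, complemented with explanations:

The examples here use a simplified use case: replace the word 'foo' with 'bar' in the first matching line only. Because ANSI C-quoted strings ($'...') are used to provide the sample input lines, bash, ksh, or zsh is assumed as the shell.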

GNU sed only:

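Ben Hoffstein's answer shows us that GNU provides an extension to the POSIX specification for sed that allows the following 2-address form: 0,/re/ (re represents an arbitrary regular expression here).

0,/re/ allows the regex to match on the very first line also. In other words: such an address will create a range from the 1st line up to and including the line that matches re, whether re occurs on the 1st line or on any subsequent line.
- Contrast this with the POSIX-compliant form 1,/re/, which creates a range that runs from the 1st line up to and including the line that matches re on subsequent lines; in other words: this will not detect the first occurrence of an re match if it happens to occur on the 1st line, and it also prevents the use of the shorthand // for reusing the most recently used regex (see next point). [1]

If you combine a 0,/re/ address with an s/.../.../ (substitution) call that uses the same regular expression, your command will effectively perform the substitution only on the first line that matches re.

sed provides a convenient shortcut for reusing the most recently applied regular expression: an empty delimiter pair, //.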
```
$ sed '0,/foo/ s//bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo' 
1st bar         # only 1st match of 'foo' replaced
Unrelated
2nd foo
3rd foo
```
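A POSIX-features-only sed such as BSD (macOS) sed (will also work with GNU sed):

Since 0,/re/ cannot be used and the form 1,/re/ will not detect re if it happens to occur on the very first line (see above), special handling for the 1st line is required.

MikhailVS's answer mentions the technique, put into a concrete example here: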
```
$ sed -e '1 s/foo/bar/; t' -e '1,// s//bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar         # only 1st match of 'foo' replaced
Unrelated
2nd foo
3rd foo
```
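Notes:
- The empty regex // shortcut is employed twice here: once for the endpoint of the range, and once in the s call; in both cases, regex foo is implicitly reused, so it does not have to be duplicated, which makes for both shorter and more maintainable code.
- POSIX sed needs actual newlines after certain functions, such as after the name of a label or even its omission, as is the case with t here; strategically splitting the script into multiple -e options is an alternative to using actual newlines: end each -e script chunk where a newline would normally need to go.

1 s/foo/bar/ replaces foo on the 1st line only, if found there. If so, t branches to the end of the script (skipping the remaining commands for that line). (The t function branches to a label only if the most recent s call performed an actual substitution; in the absence of a label, as is the case here, it branches to the end of the script.)

When that happens, range address 1,//, which normally finds the first occurrence starting from line 2, will not match, and the range will not be processed, because the address is evaluated when the current line is already 2.

Conversely, if there's no match on the 1st line, 1,// will be entered, and will find the true first match.

The net effect is the same as with GNU sed's 0,/re/: only the first occurrence is replaced, whether it occurs on the 1st line or any other.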
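To complement the case where the match sits on line 1, here is a small sketch of the same POSIX-style command applied to input whose first match is on line 2. The function name `replace_first_foo` and the sample lines are assumptions made for illustration.

```shell
# Hypothetical helper wrapping the two-part command discussed above.
# '1 s/foo/bar/; t' handles a match on line 1 and skips the rest of the script;
# '1,// s//bar/' handles the first match on any later line, reusing regex foo.
replace_first_foo() {
  sed -e '1 s/foo/bar/; t' -e '1,// s//bar/'
}

printf '%s\n' 'Unrelated' '1st foo' '2nd foo' | replace_first_foo
```

Here only `1st foo` should change to `1st bar`; the range closes on that line, so `2nd foo` is left alone.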

NON-range approaches

potong's answer demonstrates loop techniques that bypass the need for a range; since he uses GNU sed syntax, here are the POSIX-compliant equivalents:

Loop technique 1: On the first match, perform the substitution, then enter a loop that simply prints the remaining lines as-is:
```
$ sed -e '/foo/ {s//bar/; ' -e ':a' -e '$!{n;ba' -e '};}' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar
Unrelated
2nd foo
3rd foo
```
Loop technique 2, for smallish files only: read the entire input into memory, then perform a single substitution on it.
```
$ sed -e ':a' -e '$!{N;ba' -e '}; s/foo/bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar
Unrelated
2nd foo
3rd foo
```
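[1] 1.61803 provides examples of what happens with 1,/re/, with and without a subsequent s//:
- sed '1,/foo/ s/foo/bar/' <<<$'1foo\n2foo' yields $'1bar\n2bar'; i.e., both lines were updated, because line number 1 matches the 1st line, and regex /foo/ (the end of the range) is then only looked for starting on the next line. Therefore, both lines are selected in this case, and the s/foo/bar/ substitution is performed on both of them.
- sed '1,/foo/ s//bar/' <<<$'1foo\n2foo\n3foo' fails: with "sed: first RE may not be empty" (BSD/macOS) and "sed: -e expression #1, char 0: no previous regular expression" (GNU), because, at the time the 1st line is being processed (due to line number 1 starting the range), no regex has been applied yet, so // doesn't refer to anything. With the exception of GNU sed's special 0,/re/ syntax, any range that starts with a line number effectively precludes the use of //.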
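The first pitfall from the footnote can be reproduced without bash-specific quoting. This is only a sketch; the function name `range_from_line_one` is made up for the example.

```shell
# The range 1,/foo/ opens on line 1 and only starts looking for the closing
# /foo/ on line 2, so both lines fall inside the range and both get replaced.
range_from_line_one() {
  sed '1,/foo/ s/foo/bar/'
}

printf '%s\n' '1foo' '2foo' | range_from_line_one
```

Both input lines come back with `foo` replaced by `bar`, which is exactly the behaviour the footnote warns about.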
Answered on 2024-04-29T09:12:35+08:00
105
You could use awk to do something similar..
```
awk '/#include/ && !done { print "#include \"newfile.h\""; done=1;}; 1;' file.c
```
Explanation:
```
/#include/ && !done
```
Runs the action statement between {} when the line matches "#include" and we haven't already processed it.
```
{print "#include \"newfile.h\""; done=1;}
```
This prints #include "newfile.h"; we need to escape the quotes. Then we set the done variable to 1, so we don't add more includes.
```
1;
```
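This means "print out the line": an empty action defaults to print $0, which prints out the whole line. A one-liner, and easier to understand than sed IMO :-)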
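For a quick check of the one-liner without touching real files, something like the following could be used. The wrapper name `add_include` and the sample input are assumptions for illustration; with no file arguments awk reads standard input.

```shell
# Hypothetical wrapper around the awk one-liner above.
# 'done' is an ordinary awk variable that starts out empty (false).
add_include() {
  awk '/#include/ && !done { print "#include \"newfile.h\""; done = 1 } 1' "$@"
}

printf '%s\n' '/* demo */' '#include <stdio.h>' '#include <stdlib.h>' | add_include
```

The new include line is printed once, just before the first `#include`, and every original line is passed through by the trailing `1`.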
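Answered on 2024-04-29T09:12:35+08:00
222
A quite comprehensive collection of answers is on the linuxtopia sed FAQ. It also highlights that some answers people provided won't work with non-GNU versions of sed, e.g.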
```
sed '0,/RE/s//to_that/' file
```
in non-GNU versions will have to be
```
sed -e '1s/RE/to_that/;t' -e '1,/RE/s//to_that/'
```
However, this version won't work with GNU sed.

Here's a version that works with both:
```
-e '/RE/{s//to_that/;:a' -e '$!N;$!ba' -e '}'
```
For example:
```
sed -e '/Apple/{s//Banana/;:a' -e '$!N;$!ba' -e '}' filename
```
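A small sketch of that portable command reading from standard input instead of a file; the function name `replace_first_apple` is hypothetical. Note that after the first match the `N` loop collects the rest of the input in the pattern space, so very large files may be a concern.

```shell
# Hypothetical wrapper: replace only the first "Apple" with "Banana".
# After the substitution, the $!N / $!ba loop appends all remaining lines
# to the pattern space, which is then printed once at the end.
replace_first_apple() {
  sed -e '/Apple/{s//Banana/;:a' -e '$!N;$!ba' -e '}'
}

printf '%s\n' 'Apple' 'Orange' 'Apple' | replace_first_apple
```

Only the first line should change; the second `Apple` is left as it is.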
Answered on 2024-04-29T09:12:35+08:00
49
Just add the number of occurrences at the end:
```
sed s/#include/#include "newfile.h"\n#include/1
```
Answered on 2024-04-29T09:12:35+08:00
15
```
#!/bin/sed -f
1,/^#include/ {
    /^#include/i\
#include "newfile.h"
}
```
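How this script works: for lines between 1 and the first #include (after line 1), if the line starts with #include, then prepend the specified line.

However, if the first #include is on line 1, then both line 1 and the next subsequent #include will have the line prepended. If you are using GNU sed, it has an extension where 0,/^#include/ (instead of 1,) will do the right thing.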
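The caveat about line 1 can be seen with a three-line input. This is a sketch only; the function name `prepend_include` and the header names `a.h` and `b.h` are invented for the demonstration.

```shell
# Hypothetical wrapper running the script above via -e chunks.
# The first #include sits on line 1, so the range stays open until the
# NEXT #include and the line is inserted twice.
prepend_include() {
  sed -e '1,/^#include/{' \
      -e '/^#include/i\
#include "newfile.h"' \
      -e '}'
}

printf '%s\n' '#include <a.h>' 'int x;' '#include <b.h>' | prepend_include
```

The output contains `#include "newfile.h"` before both `#include <a.h>` and `#include <b.h>`, which is the double insertion described above.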
Answered on 2024-04-29T09:12:35+08:00
2
I finally got this to work in a Bash script used to insert a unique timestamp in each item in an RSS feed:
```
sed "1,/====RSSpermalink====/s/====RSSpermalink====/${nowms}/" \
            production-feed2.xml.tmp2 > production-feed2.xml.tmp.$counter
```
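It changes the first occurrence only.

${nowms} is the time in milliseconds set by a Perl script, $counter is a counter used for loop control within the script, and \ allows the command to be continued on the next line.

The file is read in and stdout is redirected to a work file.

The way I understand it, 1,/====RSSpermalink====/ tells sed when to stop by setting a range limitation, and then s/====RSSpermalink====/${nowms}/ is the familiar sed command to replace the first string with the second.

In my case I put the command in double quotation marks because I am using it in a Bash script with variables.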
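A minimal sketch of the double-quoting point, assuming a fixed stand-in value for `nowms` and a made-up three-line feed fragment; the function name `stamp_first_permalink` is also hypothetical. The double quotes let the shell expand `${nowms}` before sed sees the script; a value containing `/` or `&` would need escaping first.

```shell
# Hypothetical stand-in for the value the Perl script would provide.
nowms=1714353155000

# Double quotes so that ${nowms} is expanded by the shell, not passed literally.
stamp_first_permalink() {
  sed "1,/====RSSpermalink====/s/====RSSpermalink====/${nowms}/"
}

printf '%s\n' '<item>' '====RSSpermalink====' '====RSSpermalink====' |
  stamp_first_permalink
```

Only the first placeholder is replaced here because it is not on line 1; as other answers in this thread point out, a 1,/re/ range behaves differently when the match is on the first line.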
Answered on 2024-04-29T09:12:35+08:00

Use FreeBSD ed and avoid ed's "no match" error in case there is no include statement in a file to be processed:

teststr='
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
'

# using FreeBSD ed
# to avoid ed's "no match" error, see
# http://codesnippets.joyent.com/posts/show/11917 
cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' | ed -s <(echo "$teststr")
   H
   ,g/# *include/u\
   u\
   i\
   #include "newfile.h"\
   .
   ,p
   q
EOF

Answered on 2024-04-29T09:12:35+08:00

This might work for you (GNU sed):

sed -si '/#include/{s//& "newfile.h"\n&/;:a;$!{n;ba}}' file1 file2 file....

or if memory is not a problem:

sed -si ':a;$!{N;ba};s/#include/& "newfile.h"\n&/' file1 file2 file...

Answered on 2024-04-29T09:12:35+08:00

0
I know this is an old post but I had a solution that I used to use:
```
grep -E -m 1 -n 'old' file | sed 's/:.*$//' - | sed 's/$/s\/old\/new\//' - | sed -f - file
```
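Basically, use grep to find the first occurrence and stop there. It also prints the line number, e.g. 5:line. Pipe that into sed to remove the : and everything after it, so that only the line number is left. Pipe that into another sed, which appends s/old/new/ after the line number; the result is a 1-line script that is piped to the last sed, which runs it as a script on the file.

So if regex = #include and replace = blah, and the first occurrence grep finds is on line 5, then the data piped to the last sed would be 5s/#include/blah/.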
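To make the intermediate step visible, the pipeline can be split so that the generated one-line script is printed before it is applied. This is a sketch under assumptions: the function name `first_match_script`, the temporary file, and its contents are invented, and the script is passed with `-e` rather than `-f -`.

```shell
# Hypothetical helper: emit a one-line sed script for the first line of
# file $1 that contains "old", e.g. "2s/old/new/".
first_match_script() {
  grep -E -m 1 -n 'old' "$1" | sed 's/:.*$//' | sed 's/$/s\/old\/new\//'
}

demo=$(mktemp)
printf '%s\n' 'keep' 'old one' 'old two' > "$demo"

first_match_script "$demo"                       # show the generated script
sed -e "$(first_match_script "$demo")" "$demo"   # apply it to the file
rm -f "$demo"
```

For this sample file the generated script addresses line 2, so only `old one` is rewritten.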
Answered on 2024-04-29T09:12:35+08:00
0
If anyone came here to replace the first occurrence of a character on every line (like myself), use this:
```
sed '/old/s/old/new/1' file

-bash-4.2$ cat file
123a456a789a
12a34a56
a12
-bash-4.2$ sed '/a/s/a/b/1' file
123b456a789a
12b34a56
b12
```
By changing 1 to 2, for example, you can replace only the second a on each line instead.
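A short sketch of the numeric flag set to 2, using the same three sample lines as above; the function name `replace_second_a` is made up for the example.

```shell
# The numeric flag selects which match on each line is replaced:
# here the 2nd "a" of every line, if the line has one.
replace_second_a() {
  sed 's/a/b/2'
}

printf '%s\n' '123a456a789a' '12a34a56' 'a12' | replace_second_a
```

Lines with fewer than two matches, such as `a12`, are printed unchanged.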
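Answered on 2024-04-29T09:12:35+08:00

The following command removes the first occurrence of a string within a file. It removes the empty line too. It is presented on an xml file, but it would work with any file.

Useful if you work with xml files and you want to remove a tag. In this example it removes the first occurrence of the "isTag" tag.

Command: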

sed -e 0,/'<isTag>false<\/isTag>'/{s/'<isTag>false<\/isTag>'//}  -e 's/ *$//' -e  '/^$/d'  source.txt > output.txt

Source file (source.txt)

<xml>
    <testdata>
        <canUseUpdate>true</canUseUpdate>
        <isTag>false</isTag>
        <moduleLocations>
            <module>esa_jee6</module>
            <isTag>false</isTag>
        </moduleLocations>
        <node>
            <isTag>false</isTag>
        </node>
    </testdata>
</xml>

Result file (output.txt)

<xml>
    <testdata>
        <canUseUpdate>true</canUseUpdate>
        <moduleLocations>
            <module>esa_jee6</module>
            <isTag>false</isTag>
        </moduleLocations>
        <node>
            <isTag>false</isTag>
        </node>
    </testdata>
</xml>

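ps: it didn't work for me on Solaris SunOS 5.10 (quite old), but it works on Linux 2.6, sed version 4.1.5

Answered on 2024-04-29T09:12:35+08:00

Nothing new, but perhaps a somewhat more concrete answer: sed -rn '0,/foo(bar).*/ s%%\1%p'

Example: xwininfo -name unity-launcher produces output like: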

xwininfo: Window id: 0x2200003 "unity-launcher"

  Absolute upper-left X:  -2980
  Absolute upper-left Y:  -198
  Relative upper-left X:  0
  Relative upper-left Y:  0
  Width: 2880
  Height: 98
  Depth: 24
  Visual: 0x21
  Visual Class: TrueColor
  Border width: 0
  Class: InputOutput
  Colormap: 0x20 (installed)
  Bit Gravity State: ForgetGravity
  Window Gravity State: NorthWestGravity
  Backing Store State: NotUseful
  Save Under State: no
  Map State: IsViewable
  Override Redirect State: no
  Corners:  +-2980+-198  -2980+-198  -2980-1900  +-2980-1900
  -geometry 2880x98+-2980+-198

Extracting the window ID with xwininfo -name unity-launcher|sed -rn '0,/^xwininfo: Window id: (0x[0-9a-fA-F]+).*/ s%%\1%p' produces:

0x2200003

Answered on 2024-04-29T09:12:35+08:00

POSIXly (also valid in sed), only one regex used, needs memory for only one line (as usual):

sed '/\(#include\).*/!b;//{h;s//\1 "newfile.h"/;G};:1;n;b1'

Explained:

sed '
/\(#include\).*/!b          # Only one regex used. On lines not matching
                            # the text  `#include` **yet**,
                            # branch to end, cause the default print. Re-start.
//{                         # On first line matching previous regex.
    h                       # hold the line.
    s//\1 "newfile.h"/      # replace the whole line with `#include "newfile.h"`.
    G                       # append a newline plus the held original line.
  }                         # end of replacement.
:1                          # Once **one** replacement got done (the first match)
n                           # Loop continually reading a line each time
b1                          # and printing it by default.
'                           # end of sed script.
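As a sketch, the one-line form can be tried on a few sample lines. The function name `include_first` and the input are illustrative assumptions, and the command is assumed to run under a sed that accepts `;` after labels and `}` on one line, as GNU sed does; other implementations may need the multi-line form shown above.

```shell
# Hypothetical wrapper around the one-liner from this answer.
# Lines before the first #include fall through via !b and are printed as-is;
# at the first #include the new line is emitted ahead of the original (h, s, G),
# then the :1 / n / b1 loop prints everything else untouched.
include_first() {
  sed '/\(#include\).*/!b;//{h;s//\1 "newfile.h"/;G};:1;n;b1'
}

printf '%s\n' 'int x;' '#include <stdio.h>' '#include <stdlib.h>' | include_first
```

The new include should appear exactly once, immediately above `#include <stdio.h>`.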

Answered on 2024-04-29T09:12:35+08:00
