Looping through the content of a file in Bash

1024

How do I iterate through each line of a text file with Bash?

With this script:

```
echo "Start!"
for p in (peptides.txt)
do
    echo "${p}"
done
```

I get this output on the screen:

```
Start!
./runPep.sh: line 3: syntax error near unexpected token `('
./runPep.sh: line 3: `for p in (peptides.txt)'
```

(Later I want to do something more complicated with $p than just output it to the screen.)

The environment variable SHELL is (from env):

```
SHELL=/bin/bash
```

/bin/bash --version output:

```
GNU bash, version 3.1.17(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.
```

cat /proc/version output:

```
Linux version 2.6.18.2-34-default (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
```

The file peptides.txt contains:

```
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL
```

11 Answers

65
A few more things not covered by the other answers:

Reading from a delimited file
```
# ':' is the delimiter here, and there are three fields on each line in the file
# IFS set below is restricted to the context of `read`, it doesn't affect any other code
while IFS=: read -r field1 field2 field3; do
  # process the fields
  # if the line has less than three fields, the missing fields will be set to an empty string
  # if the line has more than three fields, `field3` will get all the values, including the third field plus the delimiter(s)
done < input.txt
```
Reading from the output of another command, using process substitution
```
while read -r line; do
  # process the line
done < <(command ...)
```
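This approach is better than `command ... | while read -r line; do ...` because here the while loop runs in the current shell, whereas in the piped form it runs in a subshell. See the related post "A variable modified inside a while loop is not remembered".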
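The difference is easiest to see with a counter. The sketch below assumes default Bash settings (in particular, `lastpipe` is not enabled); the variable names are illustrative.

```shell
#!/bin/bash
# Count lines two ways; only the second count survives the loop.
piped=0
printf 'a\nb\nc\n' | while read -r line; do
  piped=$((piped + 1))               # runs in a subshell, so the change is lost
done

substituted=0
while read -r line; do
  substituted=$((substituted + 1))   # runs in the current shell
done < <(printf 'a\nb\nc\n')

echo "pipe: $piped, process substitution: $substituted"
```

With the pipe, `piped` is still 0 after the loop; with process substitution, `substituted` ends up as 3.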

Reading from null-delimited input, for example find ... -print0
```
while IFS= read -r -d '' line; do
  # logic
  # use a second 'read ... <<< "$line"' if we need to tokenize the line
done < <(find /path/to/dir -print0)
```
Related reading: BashFAQ/020 - How can I find and safely handle file names containing newlines, spaces or both?
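As a rough illustration of why the null delimiter matters, the sketch below creates a scratch directory containing a file name with a space and one with an embedded newline, then counts what the loop sees. The directory and file names are hypothetical, and `-print0` is assumed to be available (it is in GNU and BSD find).

```shell
#!/bin/bash
# Hypothetical scratch directory with awkward file names.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/"$'with\nnewline.txt'

count=0
while IFS= read -r -d '' path; do
  # Each iteration receives one complete path, even if it contains a newline.
  count=$((count + 1))
done < <(find "$dir" -type f -print0)

rm -rf "$dir"
echo "found $count files"
```

A newline-delimited loop over the same directory would report four entries, because the third name would be split in two.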

Reading from more than one file at a time
```
while read -u 3 -r line1 && read -u 4 -r line2; do
  # process the lines
  # note that the loop will end when we reach EOF on either of the files, because of the `&&`
done 3< input1.txt 4< input2.txt
```
Based on @chepner's answer here:

-u is a Bash extension. For POSIX compatibility, each call would look something like read -r X <&3.
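A possible POSIX-style spelling of the two-file loop, using `<&3` and `<&4` instead of `-u`, might look like this. The temporary files and variable names are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical inputs of different lengths.
left=$(mktemp); right=$(mktemp)
printf '1\n2\n3\n' > "$left"
printf 'a\nb\n' > "$right"

pairs=""
# read -r X <&3 is the portable spelling of read -u 3 -r X
while read -r l <&3 && read -r r <&4; do
  pairs="$pairs$l$r "
done 3< "$left" 4< "$right"

rm -f "$left" "$right"
echo "$pairs"
```

Because of the `&&`, the loop stops after two pairs: the third line of the longer file is read but never paired.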

Reading a whole file into an array (Bash versions earlier than 4)
```
while read -r line; do
    my_array+=("$line")
done < my_file
```
If the file ends with an incomplete line (the newline is missing at the end), then:
```
while read -r line || [[ $line ]]; do
    my_array+=("$line")
done < my_file
```
Reading a whole file into an array (Bash version 4 and later)
```
readarray -t my_array < my_file
```
or
```
mapfile -t my_array < my_file
```
And then
```
for line in "${my_array[@]}"; do
  # process the lines
done
```
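Once the lines are in an array, the element count and the indices are available too. A small sketch, assuming Bash 4 or later and a made-up three-line input with an empty middle line:

```shell
#!/bin/bash
# Hypothetical three-line input; mapfile requires Bash 4 or later.
input=$(mktemp)
printf 'first line\n\nthird line\n' > "$input"

mapfile -t lines < "$input"
rm -f "$input"

echo "count: ${#lines[@]}"
for i in "${!lines[@]}"; do
  # ${!lines[@]} expands to the array indices, starting at 0
  printf '%d: [%s]\n' "$i" "${lines[$i]}"
done
```

The empty line is kept as its own (empty) element, so the count is 3.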
Answered on 2024-04-26T19:56:31+08:00

315

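Here is my real-life example of how to loop over the lines of another program's output, check for substrings, drop double quotes from a variable, and use that variable outside of the loop. I guess many people ask these questions sooner or later.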

```
## Parse FPS from the first video stream, drop quotes from the fps variable
## streams.stream.0.codec_type="video"
## streams.stream.0.r_frame_rate="24000/1001"
## streams.stream.0.avg_frame_rate="24000/1001"
FPS=unknown
while read -r line; do
  if [[ $FPS == "unknown" ]] && [[ $line == *".codec_type=\"video\""* ]]; then
    echo "ParseFPS $line"
    FPS=parse
  fi
  if [[ $FPS == "parse" ]] && [[ $line == *".r_frame_rate="* ]]; then
    echo "ParseFPS $line"
    FPS=${line##*=}
    FPS="${FPS%\"}"
    FPS="${FPS#\"}"
  fi
done <<< "$(ffprobe -v quiet -print_format flat -show_format -show_streams -i "$input")"
if [[ $FPS == "unknown" || $FPS == "parse" ]]; then
  echo "ParseFPS Unknown frame rate"
fi
echo "Found $FPS"
```

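Declaring the variable outside of the loop, setting its value inside, and using it after the loop requires the done <<< "$(...)" syntax, so that the loop runs in the context of the current shell. The quotes around the command substitution preserve the newlines of the output stream.

The loop matches substrings, then reads the name=value pair, takes the part to the right of the last = character, drops the leading quote, drops the trailing quote, and we have a clean value to use elsewhere.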
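The three parameter expansions can be tried in isolation, without ffprobe. The sketch below feeds them one sample line in the flat format quoted in the script's comments; the variable name `value` is illustrative.

```shell
#!/bin/bash
# One sample line in the flat name="value" format.
line='streams.stream.0.r_frame_rate="24000/1001"'

value=${line##*=}     # remove the longest prefix ending in '=' (quotes still attached)
value=${value%\"}     # remove one trailing double quote
value=${value#\"}     # remove one leading double quote

echo "$value"
```

This prints 24000/1001.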

Answered on 2024-04-26T19:56:31+08:00

3
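This is no better than the other answers, but it is one more way to get the job done for a file without spaces (see comments). I find that I often need one-liners to dig through lists in text files without the extra step of using separate script files.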
```
for word in $(cat peptides.txt); do echo $word; done
```
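This format allows me to put it all on one command line. Change the "echo $word" portion to whatever you want, and you can issue multiple commands separated by semicolons. The following example uses the file's contents as arguments to two other scripts you may have written.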
```
for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done
```
Or, if you intend to use this like a stream editor (learn sed), you can dump the output to another file as follows.
```
for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done > outfile.txt
```
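I've used these as written above because I work with text files that I created with one word per line. (See comments.) If you have spaces that you don't want splitting your words/lines, it gets a little uglier, but the same command still works as follows: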
```
OLDIFS=$IFS; IFS=$'\n'; for line in $(cat peptides.txt); do cmd_a.sh $line; cmd_b.py $line; done > outfile.txt; IFS=$OLDIFS
```
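This just tells the shell to split on newlines only, not spaces, and then returns the environment to what it was previously. At this point, you may want to consider putting it all into a shell script rather than squeezing it all into a single line.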
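One caveat with the newline-only `IFS` approach: unquoted `$(cat ...)` is still subject to pathname expansion, so a line consisting of `*` would be replaced by file names. A hedged variant that also disables globbing, and keeps both changes local by running the function body in a subshell, could look like this (the function name `each_line` and the sample file are assumptions):

```shell
#!/bin/bash
# Hypothetical input: one line has a space, another is a lone asterisk.
input=$(mktemp)
printf 'two words\n*\n' > "$input"

# The parentheses make the function body a subshell, so the IFS change
# and set -f do not leak into the caller.
each_line() (
  IFS=$'\n'   # split on newlines only
  set -f      # disable pathname expansion, so '*' stays literal
  for line in $(cat "$1"); do
    printf '[%s]\n' "$line"
  done
)

result=$(each_line "$input")
rm -f "$input"
echo "$result"
```

Note that empty lines are still dropped with this technique, because consecutive newlines collapse during word splitting.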

Best of luck!
Answered on 2024-04-26T19:56:31+08:00

1631

```
#!/bin/bash
#
# Change the file name from "test.txt" to the desired input file
# (Comments in bash are prefixed with #)
for x in $(cat test.txt)
do
    echo "$x"
done
```

Answered on 2024-04-26T19:56:31+08:00

4
If you don't want your read to be broken by the newline character, use -
```
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
    echo "$line"
done < "$1"
```
Then run the script with the file name as its parameter.
Answered on 2024-04-26T19:56:31+08:00
11
One way to do it is:
```
while read p; do
  echo "$p"
done <peptides.txt
```
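As pointed out in the comments, this has the side effects of trimming leading whitespace, interpreting backslash sequences, and skipping the last line if it is missing a terminating linefeed. If these are concerns, you can do: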
```
while IFS="" read -r p || [ -n "$p" ]
do
  printf '%s\n' "$p"
done < peptides.txt
```
Exceptionally, if the loop body may read from standard input, you can open the file using a different file descriptor:
```
while read -u 10 p; do
  ...
done 10<peptides.txt
```
Here, 10 is just an arbitrary number (different from 0, 1, 2).
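To see why the separate descriptor helps, here is a sketch in which the loop body itself consumes standard input. `cat` stands in for any stdin-reading command (ssh is a well-known example); the list file, the descriptor number 3, and the variable names are assumptions.

```shell
#!/bin/bash
# Hypothetical list file; the loop body consumes standard input itself.
list=$(mktemp)
printf 'one\ntwo\nthree\n' > "$list"

seen=""
while IFS= read -r -u 3 item; do
  # cat reads whatever is on stdin. Because the list is on fd 3,
  # it cannot swallow the remaining lines of the list.
  cat > /dev/null
  seen+="$item "
done 3< "$list" < /dev/null

rm -f "$list"
echo "$seen"
```

If the list were redirected to standard input instead, `cat` would drain it during the first iteration and the loop would process only the first line.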
Answered on 2024-04-26T19:56:31+08:00
12
Suppose you have this file:
```
$ cat /tmp/test.txt
Line 1
    Line 2 has leading space
Line 3 followed by blank line

Line 5 (follows a blank line) and has trailing space    
Line 6 has no ending CR
```
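There are four elements that will alter the meaning of the file output as read by many Bash solutions:
- The blank line 4;
- Leading or trailing spaces on two lines;
- Maintaining the meaning of individual lines (i.e., each line is a record);
- Line 6 is not terminated with a newline (written "CR" in the sample file).
If you want the text file line by line, including blank lines and a final line without a terminating newline, you must use a while loop and you must have an alternate test for the final line.

Here are the methods that may change the file (in comparison to what cat returns):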

1) Lose the last line and the leading and trailing spaces:
```
$ while read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt
'Line 1'
'Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space'
```
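(If you use `while IFS= read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt` instead, you preserve the leading and trailing spaces but still lose the last line if it is not terminated with a newline.)

2) Using command substitution with cat reads the entire file in one gulp and loses the meaning of individual lines: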
```
$ for p in "$(cat /tmp/test.txt)"; do printf "%s\n" "'$p'"; done
'Line 1
    Line 2 has leading space
Line 3 followed by blank line

Line 5 (follows a blank line) and has trailing space    
Line 6 has no ending CR'
```
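(If you remove the " from $(cat /tmp/test.txt), you read the file word by word rather than in one gulp. Also probably not what is intended...)

The most robust and simplest way to read a file line by line and preserve all spacing is: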
```
$ while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/test.txt
'Line 1'
'    Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space    '
'Line 6 has no ending CR'
```
If you want to strip leading and trailing spaces, remove the IFS= part:
```
$ while read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/test.txt
'Line 1'
'Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space'
'Line 6 has no ending CR'
```
(A text file without a terminating \n, while fairly common, is considered broken under POSIX. If you can count on the trailing \n, you do not need || [[ -n $line ]] in the while loop.)
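The effect of the `|| [[ -n $line ]]` test can be checked with a two-line file whose last line lacks the newline. This is only a sketch; the temporary file and counter names are made up.

```shell
#!/bin/bash
# Hypothetical file whose last line lacks a terminating newline.
input=$(mktemp)
printf 'first\nlast' > "$input"

plain=0
while IFS= read -r line; do
  plain=$((plain + 1))
done < "$input"

guarded=0
while IFS= read -r line || [[ -n $line ]]; do
  # read fails at EOF but has still filled $line, so the extra test keeps the loop going
  guarded=$((guarded + 1))
done < "$input"

rm -f "$input"
echo "plain loop: $plain, guarded loop: $guarded"
```

The plain loop sees one line, the guarded loop sees both.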

More at the BASH FAQ
Answered on 2024-04-26T19:56:31+08:00
41
Option 1a: While loop: single line at a time: input redirection
```
#!/bin/bash
filename='peptides.txt'
echo Start
while read p; do
    echo "$p"
done < "$filename"
```
Option 1b: While loop: single line at a time:
Open the file, then read from a file descriptor (in this case file descriptor #4).
```
#!/bin/bash
filename='peptides.txt'
exec 4<"$filename"
echo Start
while read -u4 p ; do
    echo "$p"
done
exec 4<&-   # close file descriptor 4 when done
```
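Option 2: For loop: read the file into a single variable and parse.
This syntax parses "lines" based on any whitespace between the tokens. It still works here because the lines of the given input file are single-word tokens. If there were more than one token per line, this method would not work. Also, reading the full file into a single variable is not a good strategy for large files.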
```
#!/bin/bash
filename='peptides.txt'
filelines=`cat $filename`
echo Start
for line in $filelines ; do
    echo $line
done
```
Answered on 2024-04-26T19:56:31+08:00

```
cat peptides.txt | while read line
do
   # do something with $line here
done
```

Answered on 2024-04-26T19:56:31+08:00

119

@Peter: This could work out for you -

```
echo "Start!";for p in $(cat ./pep); do
echo $p
done
```

This would return the output -

```
Start!
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL
```

Answered on 2024-04-26T19:56:31+08:00

39
Use a while loop, like this:
```
while IFS= read -r line; do
   echo "$line"
done <file
```
Notes:
- If you don't set IFS properly, you will lose indentation.
- You should almost always use the -r option with read.
- Don't read lines with for
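The first two notes can be checked on a single line that has leading spaces and backslashes. A minimal sketch, with an invented input string and variable names:

```shell
#!/bin/bash
# One input line containing leading spaces and backslashes.
input='   C:\temp\new'

read line_default <<< "$input"        # no -r, default IFS
IFS= read -r line_raw <<< "$input"    # -r and empty IFS

printf 'default: [%s]\nraw:     [%s]\n' "$line_default" "$line_raw"
```

Without `-r` and with the default `IFS`, the indentation is trimmed and the backslashes are consumed, leaving `C:tempnew`; with `IFS= read -r` the line comes through unchanged.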
Answered on 2024-04-26T19:56:31+08:00
