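The awk one-liner shown in the test below generates two files, cn.txt and en.txt. It checks whether each line contains at least one non-ASCII character; if it does, the line is treated as a Chinese line and written to cn.txt, otherwise it is written to en.txt. Note that this is a heuristic: any non-ASCII character, not only a Chinese one, sends the line to cn.txt.
A quick test: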
kent$ cat f
this is line1 in english
你好
this is line2 in english
你好你好
this is line3 in english
this is line4 in english
你好你好你好
kent$ awk '/[^\x00-\x7f]/{print >"cn.txt";next}{print > "en.txt"}' f
kent$ head *.txt
==> cn.txt <==
你好
你好你好
你好你好你好
==> en.txt <==
this is line1 in english
this is line2 in english
this is line3 in english
this is line4 in english
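
The `\x00-\x7f` hex range may not be handled identically by every awk implementation or locale. Below is a minimal sketch of a variant that may be more portable, assuming a POSIX-style awk: it forces byte-wise matching with `LC_ALL=C` and writes the ASCII range with octal escapes instead. The scratch directory and the shortened sample file are only for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate a shortened version of the sample input from the test above.
printf '%s\n' \
  'this is line1 in english' \
  '你好' \
  'this is line2 in english' \
  '你好你好' > f

# LC_ALL=C makes awk match byte by byte; \001-\177 is the ASCII range in octal.
# A line holding any byte outside that range goes to cn.txt, the rest to en.txt.
LC_ALL=C awk '/[^\001-\177]/ { print > "cn.txt"; next } { print > "en.txt" }' f

# Show how many lines landed in each file.
wc -l cn.txt en.txt
```

The logic is the same as in the one-liner above; only the way the character range is spelled differs.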