How do I split a string into a list?

476

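I want my Python function to split a sentence (the input) and store each word in a list. My current code splits the sentence but does not store the words as a list. How do I do that?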

```
def split_line(text):

    # split the text
    words = text.split()

    # for each word in the line:
    for word in words:

        # print the word
        print(word)
```

10 Answers

46
```
text.split()
```
This should be enough to store each word in a list. `words` is already a list of the words in the sentence, so there is no need for the loop.

Second, it might be a typo, but your loop is a little messed up. If you really did want to use append, it would be:
```
words.append(word)
```
not
```
word.append(words)
```
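Going back to the first point, here is a minimal sketch of the function returning the list directly instead of printing inside a loop. The name `split_line` comes from the question; the sample sentence is only an illustration.

```python
def split_line(text):
    """Return the whitespace-separated words of text as a list."""
    # str.split() already builds and returns a new list, so no loop is needed.
    return text.split()


words = split_line("a sentence with a few words")
print(words)        # ['a', 'sentence', 'with', 'a', 'few', 'words']
print(type(words))  # <class 'list'>
```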
Answered on 2024-04-29T23:48:54+08:00
11
Splits the string in `text` on any run of consecutive whitespace.
```
words = text.split()
```
Splits the string in `text` on the delimiter `","`.
```
words = text.split(",")
```
The `words` variable will be a list containing the pieces of `text` split on the delimiter.
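One detail worth seeing in practice: with an explicit delimiter, adjacent delimiters produce empty strings and surrounding spaces are kept. The sample strings below are made up for illustration, and the cleanup step is just one possible approach.

```python
text = "red,green,,blue"
words = text.split(",")
print(words)  # ['red', 'green', '', 'blue']

# Optional cleanup: strip whitespace around each piece and drop empty pieces.
csv_line = " red , green ,, blue "
cleaned = [part.strip() for part in csv_line.split(",") if part.strip()]
print(cleaned)  # ['red', 'green', 'blue']
```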
Answered on 2024-04-29T23:48:54+08:00
25
str.split()

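Return a list of the words in the string, using sep as the delimiter string... If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.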
```
>>> line="a sentence with a few words"
>>> line.split()
['a', 'sentence', 'with', 'a', 'few', 'words']
>>>
```
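The difference between the default algorithm and an explicit `sep=" "` is easiest to see on a string with extra spaces. This is a small sketch with a made-up input; the last line uses the optional `maxsplit` argument of `str.split()` to limit the number of splits.

```python
line = " a  b "

# Default (sep=None): runs of whitespace collapse, no empty strings at the ends.
print(line.split())     # ['a', 'b']

# Explicit single-space separator: every space is a split point.
print(line.split(" "))  # ['', 'a', '', 'b', '']

# maxsplit limits the number of splits; the remainder stays in one piece.
print("a b c d".split(None, 2))  # ['a', 'b', 'c d']
```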
Answered on 2024-04-29T23:48:54+08:00
405
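Depending on what you plan to do with the sentence once it is a list, you may want to look at the Natural Language Toolkit (NLTK). It deals heavily with text processing and evaluation. You can also use it to solve your problem: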
```
import nltk
words = nltk.word_tokenize(raw_sentence)
```
This has the added benefit of splitting out punctuation.

Example:
```
>>> import nltk
>>> s = "The fox's foot grazed the sleeping dog, waking it."
>>> words = nltk.word_tokenize(s)
>>> words
['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', ',', 
'waking', 'it', '.']
```
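This allows you to filter out any punctuation you don't want and use only the words.

Note that if you don't plan on doing any complex manipulation of the sentence, the other solutions using `str.split()` are the better choice.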
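As a rough sketch of that filtering step, the snippet below starts from the token list shown in the example (copied by hand, so NLTK is not needed to run it) and keeps only tokens that contain at least one letter or digit. The filter rule is an assumption for illustration, not something NLTK prescribes.

```python
# Token list copied from the word_tokenize output shown above.
tokens = ['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', ',',
          'waking', 'it', '.']

# Keep a token only if it contains at least one alphanumeric character.
words_only = [tok for tok in tokens if any(ch.isalnum() for ch in tok)]
print(words_only)
# ['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', 'waking', 'it']
```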

Answered on 2024-04-29T23:48:54+08:00

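How about this algorithm? Split the text on whitespace, then trim the punctuation. This carefully removes punctuation from the edges of words without harming the apostrophes inside words such as `we're`.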

```
>>> text
"'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'"

>>> text.split()
["'Oh,", 'you', "can't", 'help', "that,'", 'said', 'the', 'Cat:', "'we're", 'all', 'mad', 'here.', "I'm", 'mad.', "You're", "mad.'"]

>>> import string
>>> [word.strip(string.punctuation) for word in text.split()]
['Oh', 'you', "can't", 'help', 'that', 'said', 'the', 'Cat', "we're", 'all', 'mad', 'here', "I'm", 'mad', "You're", 'mad']
```
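The same idea can be wrapped in a small function. One caveat: a token made only of punctuation (such as `--`) strips down to an empty string, so this sketch also drops empty results. The function name `split_words` and the sample sentence are hypothetical.

```python
import string


def split_words(text):
    """Split on whitespace, trim punctuation from word edges, drop empties."""
    stripped = (word.strip(string.punctuation) for word in text.split())
    # A token such as "--" becomes "" after stripping, so filter those out.
    return [word for word in stripped if word]


print(split_words("Wait -- what's that?"))  # ['Wait', "what's", 'that']
```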

Answered on 2024-04-29T23:48:54+08:00

3
I want my Python function to split a sentence (the input) and store each word in a list

The `str.split()` method does this: it takes a string and splits it into a list:
```
>>> the_string = "this is a sentence"
>>> words = the_string.split(" ")
>>> print(words)
['this', 'is', 'a', 'sentence']
>>> type(words)
<type 'list'> # or <class 'list'> in Python 3.0
```
The problem you're having is caused by a typo: you wrote `print(words)` instead of `print(word)`.

With the `word` variable renamed to `current_word`, this is what you had:
```
def split_line(text):
    words = text.split()
    for current_word in words:
        print(words)
```
...when you should have done:
```
def split_line(text):
    words = text.split()
    for current_word in words:
        print(current_word)
```
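If for some reason you want to build the list manually in the for loop, perhaps because you want to lowercase every word (for example), you can use the list `append()` method: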
```
my_list = [] # make empty list
for current_word in words:
    my_list.append(current_word.lower())
```
Or, more neatly, using a list comprehension:
```
my_list = [current_word.lower() for current_word in words]
```
Answered on 2024-04-29T23:48:54+08:00
386
shlex has a `.split()` function. It differs from `str.split()` in that it does not preserve quotes and treats a quoted phrase as a single word:
```
>>> import shlex
>>> shlex.split("sudo echo 'foo && bar'")
['sudo', 'echo', 'foo && bar']
```
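For a side-by-side comparison, the sketch below runs both splitters on the same string. The command string is a made-up example.

```python
import shlex

command = 'cp "my file.txt" backup/'

# str.split() knows nothing about quoting, so the quoted name is broken apart.
print(command.split())       # ['cp', '"my', 'file.txt"', 'backup/']

# shlex.split() applies shell-like rules: quotes group words and are removed.
print(shlex.split(command))  # ['cp', 'my file.txt', 'backup/']
```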
Answered on 2024-04-29T23:48:54+08:00
75

I think you are confused because of a typo.

Replace `print(words)` with `print(word)` inside the loop so that each word is printed on its own line.

Answered on 2024-04-29T23:48:54+08:00

If you want all the characters of a word or sentence in a list, do this:

```
print(list("word"))
#  ['w', 'o', 'r', 'd']


print(list("some sentence"))
#  ['s', 'o', 'm', 'e', ' ', 's', 'e', 'n', 't', 'e', 'n', 'c', 'e']
```

Answered on 2024-04-29T23:48:54+08:00

0
You can use sta (string to array):

pip install sta

Then:
```
>>> import sta
>>> sta("some words on a list")
['some', 'words', 'on', 'a', 'list']
```
Answered on 2024-04-29T23:48:54+08:00
