Python: How can I filter a pandas.Series with a function without losing the index association?

I have a pandas.DataFrame whose rows I'm iterating over. On each row I need to filter out some non-valuable values while keeping the index association. This is where I'm at:

for i, row in df.iterrows():
    my_values = row["first_interesting_column":]
    # Here I need to filter the 'my_values' Series based on a function.
    # Right now I use Python's built-in filter(), but the result no longer carries the index.
    my_valuable_values = filter(lambda x: x != "-", my_values)

How can I do that?

2 Answers

  • 1

    Someone on IRC suggested the answer to me. Here it is:

    w = my_values != "-"  # boolean Series (mask) marking which entries to include/exclude
    my_valuable_values = my_values[w]
    

    ...which can also be shortened to...

    my_valuable_values = my_values[my_values != "-"]
    

    ...and, of course, to skip the intermediate variable as well...

    row["first_interesting_column":][row["first_interesting_column":] != "-"]
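The reason this keeps the labels is that the comparison returns a boolean Series aligned on the same index, so indexing with it only drops entries. Below is a minimal sketch, assuming a made-up row (the labels and values are invented for illustration) and a hypothetical predicate `is_valuable`; for an arbitrary Python function, `Series.map` can build the mask.

```python
import pandas as pd

# Hypothetical row; labels and values are invented for illustration.
my_values = pd.Series(["1", "-", "4", "-"],
                      index=["2009", "2010", "2011", "2012"])

# The comparison yields a boolean Series that shares my_values' index.
mask = my_values != "-"
my_valuable_values = my_values[mask]

# Hypothetical predicate standing in for "a function" from the question.
def is_valuable(x):
    return x != "-"

# Series.map applies the predicate element-wise and returns a boolean mask.
same_result = my_values[my_values.map(is_valuable)]

# The surviving entries still carry their original labels.
print(my_valuable_values)
```

Unlike the built-in `filter`, both forms return a Series, so the label of every kept value remains available.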
    
  • 0

    Iterating over rows is usually bad practice (and very slow). As @JohnE suggested, you want to use applymap.

    If I understand your question, I think what you want to do is:

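    import numpy as np
    import pandas as pd
    from io import StringIO
    
    datastring = StringIO("""\
    2009    2010    2011   2012
    1       4       -      4
    3       -       2      3
    4       -       8      7
    """)
    # Columns are separated by two or more whitespace characters (regex separator)
    df = pd.read_table(datastring, sep=r'\s\s+', engine='python')
    # Mask out the '-' cells (they become NaN), then convert everything to float.
    # np.float was removed from NumPy, so the built-in float is used;
    # in pandas 2.1+ applymap is deprecated in favor of DataFrame.map.
    a = df[df.applymap(lambda x: x != '-')].astype(float).values
    # Keep only the non-NaN values, as a flat array
    a[~np.isnan(a)]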
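
Since the question is about keeping the labels, a variation on the same idea can avoid `iterrows` while retaining the row and column labels instead of flattening to a bare array. This is a sketch under assumptions, not the answerer's code: the frame is built inline from the sample data above, `pd.to_numeric(..., errors="coerce")` stands in for the `applymap` mask, and `stack()` followed by `dropna()` is one possible way to flatten.

```python
import pandas as pd

# Sample data from the answer above, built inline for illustration.
df = pd.DataFrame({
    "2009": [1, 3, 4],
    "2010": ["4", "-", "-"],
    "2011": ["-", "2", "8"],
    "2012": [4, 3, 7],
})

# Convert every column to numbers; anything unparseable ('-') becomes NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Flatten to one Series indexed by (row label, column label).
# The explicit dropna() removes the NaN entries regardless of whether
# the installed pandas version drops them inside stack().
valuable = numeric.stack().dropna()

# Values kept for row 1, still labelled by column.
print(valuable.loc[1])
```

Each remaining value can then be looked up by its row and column label, for example `valuable.loc[(1, "2011")]`.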
    
