
How to write to an existing Excel file without overwriting data (using pandas)?


I use pandas to write to an Excel file in the following way:

import pandas

writer = pandas.ExcelWriter('Masterfile.xlsx') 

data_filtered.to_excel(writer, sheet_name="Main", columns=['Diff1', 'Diff2'])

writer.save()

Masterfile.xlsx already contains a number of different tabs.

Pandas writes to the "Main" sheet correctly; unfortunately, it also deletes all the other tabs.

8 Answers

  • 5

    The pandas docs say it uses openpyxl for xlsx files. A quick look through the code in ExcelWriter suggests that something like this might work:

    import pandas
    from openpyxl import load_workbook
    
    book = load_workbook('Masterfile.xlsx')
    writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') 
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    
    data_filtered.to_excel(writer, sheet_name="Main", columns=['Diff1', 'Diff2'])
    
    writer.save()
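
On recent pandas versions the same idea can be expressed without touching `writer.book` directly. The following is a minimal sketch, assuming pandas 1.3 or later (where `ExcelWriter` accepts `if_sheet_exists`) and openpyxl installed. The temporary workbook and the sample `data_filtered` frame are hypothetical stand-ins for the asker's data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the asker's existing Masterfile.xlsx.
path = os.path.join(tempfile.mkdtemp(), "Masterfile.xlsx")
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"x": [1, 2]}).to_excel(writer, sheet_name="Other", index=False)
    pd.DataFrame({"old": [0]}).to_excel(writer, sheet_name="Main", index=False)

# Hypothetical stand-in for the asker's data_filtered frame.
data_filtered = pd.DataFrame({"Diff1": [1, 2], "Diff2": [3, 4], "Extra": [5, 6]})

# mode="a" opens the existing workbook instead of starting a new one;
# if_sheet_exists="replace" swaps out only the sheet being written.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="replace") as writer:
    data_filtered.to_excel(writer, sheet_name="Main",
                           columns=["Diff1", "Diff2"], index=False)

# Read every sheet back to confirm that "Other" is still there.
sheets = pd.read_excel(path, sheet_name=None)
```

The `with` block saves the file on exit, so no explicit `writer.save()` call is needed.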
    
  • 101

    Here is a helper function:

    def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                           truncate_sheet=False, 
                           **to_excel_kwargs):
        """
        Append a DataFrame [df] to existing Excel file [filename]
        into [sheet_name] Sheet.
        If [filename] doesn't exist, then this function will create it.
    
        Parameters:
          filename : File path or existing ExcelWriter
                     (Example: '/path/to/file.xlsx')
          df : dataframe to save to workbook
          sheet_name : Name of sheet which will contain DataFrame.
                       (default: 'Sheet1')
          startrow : upper left cell row to dump data frame.
                     Per default (startrow=None) calculate the last row
                     in the existing DF and write to the next row...
          truncate_sheet : truncate (remove and recreate) [sheet_name]
                           before writing DataFrame to Excel file
          to_excel_kwargs : arguments which will be passed to `DataFrame.to_excel()`
                            [can be dictionary]
    
        Returns: None
        """
        from openpyxl import load_workbook
    
        import pandas as pd
    
        # ignore [engine] parameter if it was passed
        if 'engine' in to_excel_kwargs:
            to_excel_kwargs.pop('engine')
    
        writer = pd.ExcelWriter(filename, engine='openpyxl')
    
        # Python 2.x: define [FileNotFoundError] exception if it doesn't exist 
        try:
            FileNotFoundError
        except NameError:
            FileNotFoundError = IOError
    
    
        try:
            # try to open an existing workbook
            writer.book = load_workbook(filename)
    
            # get the last row in the existing Excel sheet
            # if it was not specified explicitly
            if startrow is None and sheet_name in writer.book.sheetnames:
                startrow = writer.book[sheet_name].max_row
    
            # truncate sheet
            if truncate_sheet and sheet_name in writer.book.sheetnames:
                # index of [sheet_name] sheet
                idx = writer.book.sheetnames.index(sheet_name)
                # remove [sheet_name]
                writer.book.remove(writer.book.worksheets[idx])
                # create an empty sheet [sheet_name] using old index
                writer.book.create_sheet(sheet_name, idx)
    
            # copy existing sheets
            writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
        except FileNotFoundError:
            # file does not exist yet, we will create it
            pass
    
        if startrow is None:
            startrow = 0
    
        # write out the new sheet
        df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs)
    
        # save the workbook
        writer.save()
    

    NOTE: for pandas < 0.21.0, replace sheet_name with sheetname.

    Usage examples:

    append_df_to_excel('d:/temp/test.xlsx', df)
    
    append_df_to_excel('d:/temp/test.xlsx', df, header=None, index=False)
    
    append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2', index=False)
    
    append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2', index=False, startrow=25)
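
For newer pandas versions, a shorter variant of the append idea might look like the sketch below. It assumes pandas 1.4 or later (for `if_sheet_exists="overlay"`) and openpyxl. The name `append_rows` is hypothetical, and the sketch assumes that an existing sheet already starts with a header row.

```python
import os

import pandas as pd
from openpyxl import load_workbook


def append_rows(path, df, sheet_name="Sheet1"):
    """Hypothetical helper: append df below the existing rows of sheet_name."""
    if not os.path.exists(path):
        # No workbook yet: create it, header included.
        df.to_excel(path, sheet_name=sheet_name, index=False, engine="openpyxl")
        return

    # Find the last used row of the target sheet, if the sheet exists.
    book = load_workbook(path)
    exists = sheet_name in book.sheetnames
    startrow = book[sheet_name].max_row if exists else 0
    book.close()

    # mode="a" keeps the other sheets; "overlay" writes into the existing
    # sheet instead of replacing it.
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        df.to_excel(writer, sheet_name=sheet_name, startrow=startrow,
                    header=not exists, index=False)
```

Unlike the helper above, this sketch opens the workbook with `load_workbook` before creating the `ExcelWriter`, so the row count is read from the file as it exists on disk.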
    
  • 1

    With openpyxl version 2.4.0 and pandas version 0.19.2, the process @ski came up with gets a bit simpler:

    import pandas
    from openpyxl import load_workbook
    
    with pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') as writer:
        writer.book = load_workbook('Masterfile.xlsx')
        data_filtered.to_excel(writer, sheet_name="Main", columns=['Diff1', 'Diff2'])
    #That's it!
    
  • 11

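    Old question, but I am guessing some people still search for this, so...

    I find this method nice because all worksheets are loaded into a dictionary of sheet name and DataFrame pairs, which pandas creates with the sheetname=None option. It is simple to add, delete, or modify worksheets between reading the spreadsheet into the dict and writing it back from the dict. For me, xlsxwriter works better than openpyxl for this particular task in terms of speed and format.

    Note: pandas 0.21.0 renamed the "sheetname" parameter to "sheet_name"; on newer versions, pass sheet_name=None in the read_excel calls below.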

    # read a single or multi-sheet excel file
    # (returns dict of sheetname(s), dataframe(s))
    ws_dict = pd.read_excel(excel_file_path,
                            sheetname=None)
    
    # all worksheets are accessible as dataframes.
    
    # easy to change a worksheet as a dataframe:
    mod_df = ws_dict['existing_worksheet']
    
    # do work on mod_df...then reassign
    ws_dict['existing_worksheet'] = mod_df
    
    # add a dataframe to the workbook as a new worksheet with
    # ws name, df as dict key, value:
    ws_dict['new_worksheet'] = some_other_dataframe
    
    # when done, write dictionary back to excel...
    # xlsxwriter honors datetime and date formats
    # (only included as example)...
    with pd.ExcelWriter(excel_file_path,
                        engine='xlsxwriter',
                        datetime_format='yyyy-mm-dd',
                        date_format='yyyy-mm-dd') as writer:
    
        for ws_name, df_sheet in ws_dict.items():
            df_sheet.to_excel(writer, sheet_name=ws_name)
    

    For the example in the 2013 question:

    ws_dict = pd.read_excel('Masterfile.xlsx',
                            sheetname=None)
    
    ws_dict['Main'] = data_filtered[['Diff1', 'Diff2']]
    
    with pd.ExcelWriter('Masterfile.xlsx',
                        engine='xlsxwriter') as writer:
    
        for ws_name, df_sheet in ws_dict.items():
            df_sheet.to_excel(writer, sheet_name=ws_name)
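
The same round trip, wrapped in a function and written with the newer `sheet_name` keyword, could be sketched as follows. The name `replace_sheet_via_dict` is hypothetical, and the sketch uses the openpyxl engine only so that it runs without xlsxwriter installed. Because the whole workbook is rebuilt from DataFrames, cell formatting and charts in the original file are not carried over.

```python
import pandas as pd


def replace_sheet_via_dict(path, sheet_name, new_df):
    """Hypothetical helper: rewrite the workbook with one sheet swapped out."""
    # sheet_name=None returns {sheet name: DataFrame} for every sheet.
    ws_dict = pd.read_excel(path, sheet_name=None)
    ws_dict[sheet_name] = new_df

    # Rewrite the whole file from the dict. index=False avoids adding an
    # extra index column on every round trip.
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        for ws_name, df_sheet in ws_dict.items():
            df_sheet.to_excel(writer, sheet_name=ws_name, index=False)
```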
    
  • 8

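    I know this is an older thread, but it is the first item you find when searching, and the solutions above do not work if you need to keep the charts in a workbook that has already been created. In that case, xlwings is a better option: it lets you write to the Excel workbook and keeps the charts and chart data.

    A simple example: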

    import xlwings as xw
    import pandas as pd
    
    #create DF
    months = ['2017-01','2017-02','2017-03','2017-04','2017-05','2017-06','2017-07','2017-08','2017-09','2017-10','2017-11','2017-12']
    value1 = [x * 5+5 for x in range(len(months))]
    df = pd.DataFrame(value1, index = months, columns = ['value1'])
    df['value2'] = df['value1']+5
    df['value3'] = df['value2']+5
    
    #load workbook that has a chart in it
    wb = xw.Book('C:\\data\\bookwithChart.xlsx')
    
    ws = wb.sheets['chartData']
    
    ws.range('A1').options(index=False).value = df
    
    #save the updated workbook under a new name
    wb.save('C:\\data\\bookwithChart_updated.xlsx')
    
    xw.apps[0].quit()
    
  • 0
    import pandas
    from openpyxl import load_workbook

    def append_sheet_to_master(self, master_file_path, current_file_path, sheet_name):
        # Load the master workbook so that its existing sheets are kept.
        master_book = load_workbook(master_file_path)
        master_writer = pandas.ExcelWriter(master_file_path, engine='openpyxl')
        master_writer.book = master_book
        master_writer.sheets = dict((ws.title, ws) for ws in master_book.worksheets)
        # Read the first sheet of the current file as raw rows (no header, no index).
        current_file = pandas.ExcelFile(current_file_path)
        current_frames = current_file.parse(current_file.sheet_names[0],
                                            header=None,
                                            index_col=None)
        current_frames.to_excel(master_writer, sheet_name, index=False, header=False)

        master_writer.save()
    

    This works perfectly fine; the only problem is that the formatting of the master file (the file we add the new sheet to) is lost.
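
If the lost formatting is the main concern, one possible workaround (not taken from the answer above) is to skip `ExcelWriter` and assign cell values directly with openpyxl, since setting a cell's value leaves its style alone. This is a rough sketch; the name `write_values_in_place` and its parameters are hypothetical, and it says nothing about charts, which openpyxl does not retain when it loads a workbook.

```python
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def write_values_in_place(path, sheet_name, df, first_row=1, first_col=1):
    """Hypothetical helper: overwrite cell values only, keeping cell styles."""
    book = load_workbook(path)
    ws = book[sheet_name]

    # dataframe_to_rows yields the header row first, then one list per data row.
    rows = dataframe_to_rows(df, index=False, header=True)
    for r, row in enumerate(rows, start=first_row):
        for c, value in enumerate(row, start=first_col):
            # Only the value is assigned; font, fill and number format stay as is.
            ws.cell(row=r, column=c, value=value)

    book.save(path)
```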

  • 13
    writer = pd.ExcelWriter('prueba1.xlsx', engine='openpyxl', keep_date_col=True)
    

    Hopefully "keep_date_col" helps you.

  • 0
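    import pandas as pd
    from openpyxl import load_workbook

    # xlsFilename, sheetName and df are assumed to be defined by the caller
    book = load_workbook(xlsFilename)
    writer = pd.ExcelWriter(xlsFilename, engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    df.to_excel(writer, sheet_name=sheetName, index=False)
    writer.save()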
    
