
How do I import data from Google Drive into Google Colab?


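I have uploaded some data files to Google Drive, and I want to import them into Google Colab.

The REST API method and the PyDrive method show how to create a new file and upload it to Drive and Colab. Using those, I cannot figure out how to read data files that already exist on my Drive from my Python code.

I am completely new to this. Can someone help me?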

2 Answers

  • 1

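    (Updated April 15, 2018: gspread is updated frequently, so I pin the versions to keep the workflow stable.)

    For spreadsheet files, the basic idea is to use the gspread and pandas packages to read the spreadsheets in Drive and convert them into pandas DataFrames.

    In a Colab notebook: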

    #install packages
    !pip install gspread==2.1.1
    !pip install gspread-dataframe==2.1.0
    !pip install pandas==0.22.0
    
    
    #import packages and authorize connection to Google account:
    import pandas as pd
    import gspread
    from gspread_dataframe import get_as_dataframe, set_with_dataframe
    from google.colab import auth
    auth.authenticate_user()  # verify your account to read files which you have access to. Make sure you have permission to read the file!
    from oauth2client.client import GoogleCredentials
    gc = gspread.authorize(GoogleCredentials.get_application_default())
    

    After that, I know of 3 ways to read a Google spreadsheet.

    By file name:

    spreadsheet = gc.open("goal.csv") # Open file using its name. Use this if the file is already anywhere in your drive
    sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
    df2 = pd.DataFrame(sheet.get_all_records())
    df2.head()
    

    By URL:

    spreadsheet = gc.open_by_url('https://docs.google.com/spreadsheets/d/1LCCzsUTqBEq5pemRNA9EGy62aaeIgye4XxwReYg1Pe4/edit#gid=509368585') # use this when you have the complete URL (the #gid=... fragment identifies a worksheet tab; it does not grant permission)
    sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
    df2 = pd.DataFrame(sheet.get_all_records())
    df2.head()
    

    By file key / ID:

    spreadsheet = gc.open_by_key('1vpukIbGZfK1IhCLFalBI3JT3aobySanJysv0k5A4oMg') # use this when you have the key (the string in the URL following spreadsheets/d/)
    sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
    df2 = pd.DataFrame(sheet.get_all_records())
    df2.head()
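
The URL and key methods are closely related, because the key is just the part of the URL that follows `spreadsheets/d/`. Below is a minimal sketch of pulling the key and the worksheet `gid` out of a URL with the standard library. The helper names `extract_sheet_key` and `extract_gid` are hypothetical and are not part of gspread.

```python
import re


def extract_sheet_key(url):
    """Return the spreadsheet key from a Google Sheets URL, or None if absent."""
    match = re.search(r"/spreadsheets/d/([a-zA-Z0-9_-]+)", url)
    return match.group(1) if match else None


def extract_gid(url):
    """Return the worksheet gid from the URL as an int, or None if absent."""
    match = re.search(r"[#&?]gid=(\d+)", url)
    return int(match.group(1)) if match else None


# URL taken from the "By URL" example in this answer.
url = ("https://docs.google.com/spreadsheets/d/"
       "1LCCzsUTqBEq5pemRNA9EGy62aaeIgye4XxwReYg1Pe4/edit#gid=509368585")
key = extract_sheet_key(url)
gid = extract_gid(url)
```

The resulting `key` could then be passed to `gc.open_by_key(key)` in place of the full URL.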
    

    I have shared the code above in a Colab notebook: https://drive.google.com/file/d/1cvur-jpIpoEN3vAO8Fd_yVAT5Qgbr4GV/view?usp=sharing

    Source: https://github.com/burnash/gspread

  • 10

    1) Make your data publicly available; then, for a public spreadsheet:

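    from io import StringIO  # Python 3; in Python 2 this was `from StringIO import StringIO`

    import pandas as pd
    import requests

    # Download the public spreadsheet as CSV
    r = requests.get('https://docs.google.com/spreadsheet/ccc?key=0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc&output=csv')
    data = r.text  # decoded text; StringIO needs str, not the bytes in r.content

    # Parse the CSV text, using the first column as the index
    df = pd.read_csv(StringIO(data), index_col=0, parse_dates=['Quradate'])
    df.head()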
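
The request above uses an older `ccc?key=...&output=csv` URL. As a rough sketch, and assuming the spreadsheet is public and uses the current `spreadsheets/d/<key>/export` URL form, a CSV export link can be built from the key like this. The helper name `csv_export_url` and the placeholder key are hypothetical.

```python
from urllib.parse import urlencode


def csv_export_url(key, gid=0):
    """Build a CSV export URL for one worksheet of a public spreadsheet.

    Assumes the `/export?format=csv&gid=...` URL form; gid=0 is normally
    the first worksheet created in the file.
    """
    query = urlencode({"format": "csv", "gid": gid})
    return f"https://docs.google.com/spreadsheets/d/{key}/export?{query}"


# Placeholder key for illustration only; substitute your own spreadsheet key.
url = csv_export_url("YOUR_SPREADSHEET_KEY")
```

The resulting `url` could be passed to `requests.get(url)` in the same way as the URL in the snippet above.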
    

    More here: Getting Google Spreadsheet CSV into A Pandas Dataframe

    For private data it works much the same way, but you will have to do some auth gymnastics...
