
Python multiple URL requests


Hi guys, I'm new to Python (I started less than two weeks ago), so I need some advice and tips :p

What is the fastest and most efficient way to perform about 1500 API requests?

  • Execute them all with async functions and await the results?

  • Split them into lists of 300 URLs and put each list in a Thread that executes them in an async loop?

  • Do the same as the second suggestion, but with Processes instead of Threads?

For the moment this works for me, but it takes about 8 s to execute 1400 API requests, while a single request without threads takes 9 s. Am I doing something wrong?!

Fetch a single URL (I tried passing the session as a parameter, but I got an error once I reached 700 requests):

import curio
import curio_http

# Fetch one URL; each call opens its own session, since sharing one
# session across ~700 concurrent requests raised errors (see above)
async def fetch_one(url):
    async with curio_http.ClientSession() as session:
        response = await session.get(url)
        content = await response.json()
        return content
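On its own, this coroutine can be driven with curio.run (the URL below is a made-up placeholder):

content = curio.run(fetch_one("https://api.example.com/item/1"))  # hypothetical endpoint
print(content)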

Fetch a list of URLs inside one async loop:

async def fetchMultiURLs(url_list):
    tasks = []
    responses = []
    for url in url_list:
        # spawn starts each request immediately and returns a Task handle
        task = await curio.spawn(fetch_one(url))
        tasks.append(task)

    for task in tasks:
        # join waits for the task to finish and returns its result
        content = await task.join()
        responses.append(content)
    return responses
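Running the whole batch in a single async loop is essentially the first suggestion; a minimal sketch, assuming a placeholder URL list:

URLS = ["https://api.example.com/item/%d" % i for i in range(1500)]  # hypothetical endpoints
responses = curio.run(fetchMultiURLs(URLS))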

Create threads and put an async loop inside each one, depending on the number of URLs and X URLs per loop:

For example, MultiFetch(URLS[600], 200), i.e. a list of 600 URLs with X = 200, will create 3 threads, each executing 200 requests in its own async loop.

from threading import Thread

def MultiFetch(URLS, X):
    MyThreadsList = []
    MyThreadsResults = []
    # Number of threads: ceil(len(URLS) / X)
    N_Threads = len(URLS) // X if len(URLS) % X == 0 else len(URLS) // X + 1

    def Worker(chunk):
        # Thread.join() returns None, so each thread appends its own
        # result list to a shared list instead (list.append is thread-safe)
        MyThreadsResults.append(curio.run(fetchMultiURLs(chunk)))

    for i in range(N_Threads):
        MyThreadsList.append(Thread(target=Worker, args=(URLS[i*X:i*X+X],)))
        MyThreadsList[i].start()
    for i in range(N_Threads):
        MyThreadsList[i].join()
    return MyThreadsResults
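To measure the timings quoted above, a simple stopwatch around the call is enough (placeholder URLs; X = 200 as in the example):

import time

URLS = ["https://api.example.com/item/%d" % i for i in range(1400)]  # hypothetical endpoints
start = time.perf_counter()
results = MultiFetch(URLS, 200)  # 7 threads of 200 requests each
print("fetched in %.2f s" % (time.perf_counter() - start))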

1 Answer


    Finally I found a solution :) Fetching 1400 URLs takes 2.2 s.

    I used the third suggestion (an async loop inside each process).

    # Fetch 1 URL

    # Imports for the whole solution
    import curio
    import curio_http
    from multiprocessing import Pool

    async def fetch_one(url):
        async with curio_http.ClientSession() as session:
            response = await session.get(url)
            content = await response.json()
            return content
    

    # Fetch X URLs inside one async loop

    async def fetchMultiURLs(url_list):
        tasks = []
        responses = []
        for url in url_list:
            task = await curio.spawn(fetch_one(url))
            tasks.append(task)

        for task in tasks:
            content = await task.join()
            responses.append(content)
        return responses
    

    # I tried to use a lambda instead of this function, but it didn't work (Pool.map has to pickle the function it sends to the worker processes, and lambdas can't be pickled)

    def RuningCurio(X):
        # Module-level wrapper that Pool.map can pickle; runs one async loop
        return curio.run(fetchMultiURLs(X))
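
    A quick way to see why the lambda version fails (a sketch, not part of the original answer): multiprocessing pickles the function it sends to worker processes, and pickle can serialize a module-level function by name but not an anonymous lambda.

    import pickle
    pickle.dumps(RuningCurio)  # works: resolvable by name at module level
    pickle.dumps(lambda X: curio.run(fetchMultiURLs(X)))  # raises PicklingError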
    

    # Create processes and async loops depending on the number of URLs / X URLs per loop

    # In my case (I'm using a VPS), a single process can easily fetch 700 links in under 1 s, so don't use multiple processes below that number of URLs (just call the fetchMultiURLs function directly)

    def MultiFetch(URLS, X):
        MyListofLists = []
        LengthURLs = len(URLS)
        # Number of processes: ceil(LengthURLs / X)
        N_Process = LengthURLs // X if LengthURLs % X == 0 else LengthURLs // X + 1
        for i in range(N_Process):  # split URLS into a list of lists ([[1,2,3],[4,5,6],[7,8,9]])
            MyListofLists.append(URLS[i*X:i*X+X])
        with Pool(N_Process) as P:  # one process (and one async loop) per chunk
            return P.map(RuningCurio, MyListofLists)
    

    # I'm fetching 2100 URLs in 1.1 s; I hope this solution helps you guys
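
    A usage sketch (placeholder URLs; 700 per process, as the comment above suggests):

    if __name__ == '__main__':  # guard required when multiprocessing spawns workers
        URLS = ["https://api.example.com/item/%d" % i for i in range(2100)]  # hypothetical endpoints
        results = MultiFetch(URLS, 700)  # 3 processes, 700 URLs each
        print(sum(len(chunk) for chunk in results), "responses fetched")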
