Hi guys, I'm new to Python (started less than two weeks ago), so I need some advice and tips :p
What is the fastest and most efficient way to perform about 1500 API requests?
- Execute them all with async functions and gather the results?
- Split them into lists of 300 URLs each and run each list in a Thread, which executes its requests in an async loop?
- The same as the second suggestion, but with Processes instead of Threads?
For the moment it's working for me, but it takes about 8 s to execute 1400 API requests, while a single request without threads takes 9 s. Am I doing something wrong?!
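To compare the approaches fairly, it helps to time each batch the same way. A minimal sketch (the workload here is a stand-in, not an actual request):

```python
import time

def timed(fn, *args):
    # Run fn(*args) and return (result, elapsed seconds)
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# stand-in workload; replace with e.g. a batch-fetch call
result, secs = timed(sum, range(1_000_000))
print(f"took {secs:.3f}s")
```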
# Fetch one URL (I tried passing the session in as a parameter,
# but I got an error once I reached about 700 requests)
async def fetch_one(url):
    async with curio_http.ClientSession() as session:
        response = await session.get(url)
        content = await response.json()
        return content
# Fetch a list of URLs inside an async loop
async def fetchMultiURLs(url_list):
    tasks = []
    responses = []
    for url in url_list:
        task = await curio.spawn(fetch_one(url))
        tasks.append(task)
    for task in tasks:
        content = await task.join()
        responses.append(content)
        print(content)
    return responses
Create threads and put an async loop inside each one, splitting the URLs into X URLs per loop.
For example, MultiFetch(URLS[:600], 200) will create 3 threads, each fetching 200 requests in its own async loop.
def MultiFetch(URLS, X):
    MyThreadsList = []
    MyThreadsResults = []

    def worker(chunk):
        # Thread.join() returns None, so collect each loop's result here
        MyThreadsResults.append(curio.run(fetchMultiURLs(chunk)))

    N_Threads = (len(URLS) + X - 1) // X  # ceiling division: ListSize / X, rounded up
    for i in range(N_Threads):
        MyThreadsList.append(Thread(target=worker, args=(URLS[i*X:i*X+X],)))
        MyThreadsList[i].start()
    for t in MyThreadsList:
        t.join()
    return MyThreadsResults
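The thread count is just ceiling division over the chunk size. The chunking arithmetic can be checked on its own, without any network calls (the URLs below are stand-ins):

```python
def n_chunks(total, per_chunk):
    # ceiling division, equivalent to the N_Threads computation above
    return (total + per_chunk - 1) // per_chunk

urls = [f"https://api.example.com/item/{i}" for i in range(600)]
X = 200
chunks = [urls[i * X:(i + 1) * X] for i in range(n_chunks(len(urls), X))]

print(len(chunks))     # 3 chunks -> 3 threads
print(len(chunks[0]))  # 200 URLs each
```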
1 Answer
Finally I found a solution :) fetching 1400 URLs takes 2.2 s.
I used the third suggestion (async loops inside processes).
# Fetch 1 URL (same fetch_one as in the question)

# Fetch X URLs
async def fetchMultiURLs(url_list):
    tasks = []
    answers = []
    for url in url_list:
        task = await curio.spawn(fetch_one(url))
        tasks.append(task)
    for task in tasks:
        answers.append(await task.join())
    return answers

# I tried to use a lambda instead of this function, but it did not work
# (a Process target must be picklable, and lambdas cannot be pickled)

# Create processes with an async loop in each, splitting the URLs into X URLs per loop

# In my case (I'm using a VPS) a single process can easily fetch 700 links
# in less than 1 s, so don't use multiple processes below that number of
# URLs (just use the fetchMultiURLs function)

# I'm fetching 2100 URLs in 1.1 s; I hope this solution helps you guys
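The answer describes the process-based version but the spawning code itself is missing. A hypothetical sketch of the structure, using multiprocessing.Pool: `fetch_chunk` below is a runnable stand-in for `curio.run(fetchMultiURLs(chunk))`, so the sketch works without curio or a network. This is also why the lambda failed: Pool targets must be picklable module-level functions.

```python
from multiprocessing import Pool

def fetch_chunk(chunk):
    # stand-in; the real version would be: return curio.run(fetchMultiURLs(chunk))
    return [f"fetched:{url}" for url in chunk]

def multi_fetch_processes(urls, x):
    # split into chunks of x URLs and fetch each chunk in its own process
    chunks = [urls[i:i + x] for i in range(0, len(urls), x)]
    with Pool(processes=len(chunks)) as pool:
        results = pool.map(fetch_chunk, chunks)  # order is preserved
    return [item for chunk_result in results for item in chunk_result]

if __name__ == "__main__":
    urls = [f"https://api.example.com/item/{i}" for i in range(10)]
    print(len(multi_fetch_processes(urls, 4)))  # 10
```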