Multi-threaded Python Web Crawler Got Stuck
I'm writing a Python web crawler and want to make it multi-threaded. I have finished the basic part; here is what it does: a thread gets a URL from the queue; the thread ex
Solution 1:
Your crawl function has an infinite while loop with no possible exit path: the condition True always evaluates to True, so the loop never terminates, which matches the hang you describe.
Modify the crawl function's while loop to include an exit condition. For instance, once the number of links saved to the CSV file reaches a certain minimum, exit the loop.
i.e.,

def crawl():
    while len(exist) <= min_links:
        ...
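To make that concrete, here is a minimal sketch of how the exit condition can be wired into a multi-threaded crawler. The names (`run_crawler`, `url_queue`, `exist`, `min_links`) and the overall structure are assumptions for illustration, not your actual code; the fetching/parsing step is omitted and replaced by a comment.

```python
import threading
import queue

def crawl(url_queue, exist, lock, min_links):
    """Worker: pull URLs until enough links have been saved."""
    while True:
        with lock:
            if len(exist) >= min_links:   # exit condition instead of bare `while True`
                return
        try:
            url = url_queue.get(timeout=1)  # don't block forever on an empty queue
        except queue.Empty:
            return
        # ... fetch the page and extract new links here (omitted) ...
        with lock:
            exist.add(url)                  # record the saved link
        url_queue.task_done()

def run_crawler(seed_urls, min_links, num_threads=4):
    url_queue = queue.Queue()
    for u in seed_urls:
        url_queue.put(u)
    exist, lock = set(), threading.Lock()
    threads = [threading.Thread(target=crawl,
                                args=(url_queue, exist, lock, min_links))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # returns because every worker now has a way to exit
    return exist
```

Two details matter: the shared counter (`exist`) is read and updated under a lock so threads see a consistent size, and `get(timeout=1)` ensures a worker also exits when the queue runs dry before `min_links` is reached, instead of blocking forever.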