Python crawler: scraping img images from a website

Date: 2022-11-13 06:13:00

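import os  # create and enter the output folder
from urllib import request  # download the image files

from bs4 import BeautifulSoup

# Local helper module used by the original post (its code is not shown there);
# getHtmlWinthIp(url) is expected to return the page HTML.
from getHtml import getHtmlWinthIp

PAGE_SIZE = 30  # photos listed per page
imgsrcl = []  # src URLs collected from all pages


def getD(url, no):
    """Collect the src of the first `no` images on one listing page."""
    html = getHtmlWinthIp(url)
    soup = BeautifulSoup(html, 'html.parser')
    # Locate the parent <ul> inside the element with id="content"
    parent = soup.find(id='content').find('ul')
    # Take at most `no` <li> items
    lis = parent.find_all('li', limit=no)
    for each in lis:
        # each.find('img').attrs is a dict of all attributes of the <img> tag
        src = each.find('img').attrs['src']
        imgsrcl.append(src)  # add to the overall list


def store():
    """Save every collected image as 1.jpg, 2.jpg, ... in the folder '范冰冰2'."""
    # os.path.exists() checks whether the folder exists, os.mkdir() creates it,
    # os.chdir() changes the working directory into it
    if not os.path.exists('范冰冰2'):
        os.mkdir('范冰冰2')
    os.chdir('范冰冰2')
    for i, v in enumerate(imgsrcl):
        request.urlretrieve(v, str(i + 1) + '.jpg')


def main(n):
    """Scrape exactly n images, 30 per page."""
    for start in range(0, n, PAGE_SIZE):
        # The URL is site-relative, as given in the source; the scheme and host
        # of the target site must be prepended before it can be fetched.
        url = '/celebrity/1050059/photos/?type=C&start=' + str(start) + '&sortby=like&size=a&subtype=a'
        print("Crawling page " + str(start // PAGE_SIZE + 1))
        # Full pages take 30 images; the last page takes only what is still needed
        getD(url, min(PAGE_SIZE, n - start))
    store()


if __name__ == '__main__':
    main(48)  # the argument is the number of images to download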
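
The comments in the script state the goal: download exactly as many images as the number passed to main, with 30 photos per listing page. The paging arithmetic behind that goal can be checked without touching the network. The following is a minimal sketch, not the author's code; the names page_plan and page_size are hypothetical and introduced only for illustration. It returns, for each page, the start offset to put in the URL and the number of images to take from that page.

```python
def page_plan(n, page_size=30):
    """Return (start, limit) pairs that together cover exactly n items.

    start is the offset for the page URL; limit is how many items to take
    from that page. Both names are illustrative assumptions.
    """
    plan = []
    for start in range(0, n, page_size):
        # Full pages take page_size items; the last page takes the remainder
        limit = min(page_size, n - start)
        plan.append((start, limit))
    return plan


if __name__ == '__main__':
    print(page_plan(48))  # [(0, 30), (30, 18)]
```

Summing the limits always gives n, including when n is an exact multiple of the page size, which is the case most easily mishandled by modulo-based checks.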
