Batch-Downloading Images with Python

Posted by LueOsburn 8 years ago | 2K views | Python

[Python] Code

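# -*- coding: utf-8 -*-

import os
import re
import urllib
import urllib2

# Page to scrape. The address is cut off in the original post; set it before running.
url = u""
# Output path prefix used by download(). The original post never defines it,
# so this empty default (the current directory) is an assumption.
outpath = u""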

def getHtml(url):
    """Fetch the page at url and return its HTML source."""
    webfile = urllib.urlopen(url)
    outhtml = webfile.read()
    print outhtml
    return outhtml

def getImageList(html):
    """Return all http/https image URLs (jpg, jpeg, png, gif, bmp) found in html."""
    # [^\s,"]* consumes the URL up to whitespace, a comma, or a double quote.
    restr = ur'('
    restr += ur'http:\/\/[^\s,"]*\.jpg'
    restr += ur'|http:\/\/[^\s,"]*\.jpeg'
    restr += ur'|http:\/\/[^\s,"]*\.png'
    restr += ur'|http:\/\/[^\s,"]*\.gif'
    restr += ur'|http:\/\/[^\s,"]*\.bmp'
    restr += ur'|https:\/\/[^\s,"]*\.jpg'
    restr += ur'|https:\/\/[^\s,"]*\.jpeg'
    restr += ur'|https:\/\/[^\s,"]*\.png'
    restr += ur'|https:\/\/[^\s,"]*\.gif'
    restr += ur'|https:\/\/[^\s,"]*\.bmp'
    restr += ur')'
    htmlurl = re.compile(restr)
    imgList = re.findall(htmlurl, html)
    print imgList
    return imgList

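def download(imgList, page):
    """Save every URL in imgList as pic<page><index><ext> under the global outpath."""
    x = 1
    for imgurl in imgList:
        # Keep the original file extension, taken from the last URL path segment.
        ext = os.path.splitext(urllib2.unquote(imgurl).decode('utf8').split('/')[-1])[1]
        filepathname = str(outpath + 'pic%09d%010d' % (page, x) + str(ext)).lower()
        print '[Debug] Download file :' + imgurl + ' >> ' + filepathname
        urllib.urlretrieve(imgurl, filepathname)
        x += 1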

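def downImageNum(pagenum):
    """Run the fetch, extract, download cycle once for each page from 1 to pagenum."""
    page = 1
    pageNumber = pagenum
    while page <= pageNumber:
        html = getHtml(url)  # fetch the HTML that url points to
        imageList = getImageList(html)  # collect all image addresses as a list
        download(imageList, page)  # download every image
        page = page + 1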

if __name__ == '__main__':
    downImageNum(1)

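For readers on Python 3, where `urllib2` and the `print` statement no longer exist, the two network-free steps of the script above (finding image URLs in the HTML and building the local file name) might be sketched as follows. This is a minimal sketch rather than the author's code: the names `IMAGE_URL_RE`, `extract_image_urls`, `local_filename`, and `sample_html`, along with the `example.com` addresses, are hypothetical, and the ten alternatives of the original pattern are collapsed into one expression, which can pick a different match when a single URL contains several image extensions.

```python
import os
import re
from urllib.parse import unquote

# Hypothetical single-expression variant of the pattern built in getImageList.
# Like the original, it is case-sensitive and stops at whitespace, ',' or '"'.
IMAGE_URL_RE = re.compile(r'https?://[^\s,"]*\.(?:jpeg|jpg|png|gif|bmp)')


def extract_image_urls(html):
    """Return the image URLs found in html, in order of appearance."""
    return IMAGE_URL_RE.findall(html)


def local_filename(img_url, page, index, outpath=""):
    """Build the pic<page><index><ext> name that download() uses.

    The extension comes from the last '/'-separated segment of the
    percent-decoded URL, and the whole result is lower-cased.
    """
    last_segment = unquote(img_url).split("/")[-1]
    ext = os.path.splitext(last_segment)[1]
    return (outpath + "pic%09d%010d" % (page, index) + ext).lower()


# Made-up HTML fragment used only for illustration.
sample_html = (
    '<img src="http://example.com/a/photo.jpg"> '
    '<img src="https://example.com/b/icon.png">'
)

if __name__ == "__main__":
    for i, img_url in enumerate(extract_image_urls(sample_html), start=1):
        print(img_url, ">>", local_filename(img_url, 1, i))
```

To actually fetch files in Python 3, `urllib.request.urlopen` and `urllib.request.urlretrieve` take the place of the `urllib.urlopen` and `urllib.urlretrieve` calls used in the listing.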

