[python] 妹子图 (Meizitu) image crawler

classn_11

Posted on 2019-9-23 23:52:38
[Python]
import os
from multiprocessing import Pool

import requests
from lxml import etree



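def jie(url):
    # Fetch a page and return it as a parsed lxml element tree.
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36",
        # Same Referer as the image requests in cun(): the site being crawled.
        "Referer": "http://www.mmjpg.com/",
    }
    # A timeout keeps a stalled connection from blocking a worker forever.
    kai = requests.get(url, headers=headers, timeout=10)
    kai.encoding = "utf-8"
    html = etree.HTML(kai.text)
    return html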

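def cun(url):
    # Download the image shown on one picture page into the current directory.
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36",
        "Referer": "http://www.mmjpg.com/",
    }
    html = jie(url)
    tu = html.xpath('//*[@id="content"]/a/img/@src')
    if not tu:
        # No image matched (layout change or blocked request); skip this page.
        print("no image found:", url)
        return
    # Use the last path segment of the image URL as the file name.
    name = tu[0].split("/")[-1]
    xia = requests.get(tu[0], headers=headers, timeout=10)
    with open(name, "wb") as f:
        f.write(xia.content)
    print(name)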

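def da(url, name):
    # Download every picture of one album; `name` is the album title (unused here).
    html = jie(url)
    # The link just before the element with id "opic" is read as the page count.
    wan = html.xpath('//*[@id="opic"]/preceding-sibling::a[1]/text()')[0]
    for i in range(1, int(wan) + 1):
        cun(url + "/" + str(i))


def wang(page):
    # `page` is the parsed listing page returned by jie(), not a URL string.
    links = page.xpath("/html/body/div/div/ul/li/a/@href")
    names = page.xpath('/html/body/div/div/ul/li/a/img/@alt')
    base = os.getcwd()  # the 妹子图 folder the main block switched into
    for (i1, i2) in zip(links, names):
        # One folder per album; do not fail if it is left over from an earlier run.
        os.makedirs(i2, exist_ok=True)
        os.chdir(i2)
        try:
            da(i1, i2)
        finally:
            # Always return to the base folder, even if an album fails.
            os.chdir(base)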

def main(url):
    html = jie(url)
    wang(html)

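if __name__ == '__main__':
    # The front page plus the listing pages home/2 .. home/104.
    jihe = ["http://www.mmjpg.com/"]
    for i in range(2, 105):
        jihe.append(jihe[0] + "home" + "/" + str(i))
    os.chdir(r"C:\Users\Administrator\Desktop")
    os.makedirs("妹子图", exist_ok=True)  # do not fail when the folder already exists
    os.chdir("妹子图")
    # Each worker process handles one listing page at a time.
    with Pool() as pool:
        pool.map(main, jihe)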
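
The album titles read from `@alt` are used directly as folder names, so a title containing a character such as `?` or `:` would probably make the directory creation fail on Windows. Below is a minimal sketch of a helper that cleans a title first. The name `safe_dirname`, its `fallback` parameter and the choice of `_` as the replacement character are assumptions for illustration, not part of the original script.

```python
import re

# Characters that Windows does not allow in file or folder names.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_dirname(title, fallback="untitled"):
    """Return `title` with characters that are invalid in Windows names replaced.

    Hypothetical helper: leading/trailing spaces and dots are also stripped,
    and `fallback` is returned when nothing usable remains.
    """
    cleaned = _INVALID.sub("_", title).strip(" .")
    return cleaned or fallback
```

In `wang`, the folder name `i2` could be passed through `safe_dirname` before the directory is created and entered.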

