
Python web crawler source code

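import os
import requests
from bs4 import BeautifulSoup

# Browser-like request header so the site serves the normal page.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0'
}

for i in range(105, 200):
    try:
        # NOTE: the scheme and host are missing from the source; prepend them before running.
        url = '/web201605/herodetail/' + str(i) + '.shtml'
        # headers must be passed by keyword; the second positional argument is params.
        response = requests.get(url, headers=headers)
        response.encoding = 'gbk'
        soup = BeautifulSoup(response.text, 'html.parser')
        # skill_name = soup.find('p', 'skill-name')
        # skill_desc = soup.find('p', 'skill-desc')
        # print(skill_name.text)
        # print(skill_desc.text)
        # find() returns None for a missing page, so .text raises AttributeError.
        name = soup.find('h2', 'cover-name').text
        # print(name)
        story = soup.find('div', 'pop-bd').text
        if story == '\n':
            print("\nNo story for %d %s!" % (i, name))
        else:
            # Break after each full stop, then mark every line break with a tab and '>>>'.
            # The search string is garbled in the source; '。' is assumed here.
            story_ = story.replace('。', '。\n')
            story_ = story_.replace('\n', '\t>>>')
            print(story_[0:30] + "...")
            # os.mkdir('C:\\Users\\Crystal\\Desktop\\Hero Story 2')
            # os.mkdir('c:\\users\\28459\\desktop\\test\\')
            os.chdir('c:\\users\\28459\\desktop\\test\\')
            # Use a context manager so the file is closed after writing.
            with open('%s' % name + '.txt', 'w', encoding='utf-8') as f:
                f.write(story_)
            print("%d %s story saved!" % (i, name))
            print()
    except AttributeError:
        print("\nNo hero with number %d!" % i)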
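The formatting and saving steps of the script can also be sketched as two small helpers that can be tested without any network access. This is only an illustrative sketch: the names format_story and save_story are hypothetical and do not appear in the original. It assumes that the first replacement targets the Chinese full stop '。' and that the second replacement is meant to apply to the result of the first. It writes to an explicit output directory instead of calling os.chdir.

```python
import tempfile
from pathlib import Path


def format_story(story: str) -> str:
    """Add a line break after each '。', then turn every line break into a tab plus '>>>'.

    Hypothetical helper; the '。' search string is an assumption.
    """
    with_breaks = story.replace('。', '。\n')
    return with_breaks.replace('\n', '\t>>>')


def save_story(name: str, story: str, out_dir: Path) -> Path:
    """Write the formatted story to <out_dir>/<name>.txt and return the file path."""
    out_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    path = out_dir / (name + '.txt')
    path.write_text(format_story(story), encoding='utf-8')
    return path


if __name__ == '__main__':
    # Demo with a temporary directory so nothing is left on disk.
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_story('demo', '第一句。第二句。', Path(tmp))
        print(saved.read_text(encoding='utf-8'))
```

Passing the output directory as an argument keeps the process's working directory unchanged, which tends to be easier to reason about than os.chdir when the script grows.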