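Preface
Hi everyone, Mowang here! ❤
The 2023 Mid-Autumn Festival and National Day are almost here, and the good news is that they combine into one 8-day holiday!
A break this long is a great chance to unwind, and plenty of people can't wait to act on their long pent-up urge to travel, so many have already started planning their trips.
Today we'll analyze travel guide data from Qunar (去哪儿) to see how to enjoy food, lodging, and sightseeing at a reasonable price.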
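Environment
Interpreter version >>> Python 3.8
Code editor >>> PyCharm 2021.2
Modules used
requests >>> sends HTTP requests / third-party module
parsel >>> parses the response string so content can be matched with re, xpath, or css / third-party module
csv >>> writes the scraped rows to a CSV file / built-in module
Installing third-party modules:
Press win + R, type cmd, then run the install command: pip install <module name>
(If installation is slow, you can switch to a mirror in mainland China.)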
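Data source analysis
1. Define the requirements
This time we pick trips in October through December, with travel costs in the 1000 to 2999 price range.
2. Capture and inspect the traffic
Press F12 to open the developer tools, click search, and enter the data you are looking for.
Find the data URL: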
https://travel.qunar.com/travelbook/list.htm?page=1&order=hot_heat&&month=10_11_12&avgPrice=2
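The query string of this URL carries the filters chosen on the page. As a minimal sketch, the same URL can be assembled from named parameters with the standard library, which makes it easier to request other pages later. The helper name `build_list_url` is hypothetical, and reading `month=10_11_12` as the selected months and `avgPrice=2` as the 1000 to 2999 band is an inference from the filters described above, not documented behavior.

```python
from urllib.parse import urlencode

BASE_URL = 'https://travel.qunar.com/travelbook/list.htm'


def build_list_url(page, months=(10, 11, 12), avg_price=2):
    """Build the list-page URL for one page of results (hypothetical helper)."""
    params = {
        'page': page,
        'order': 'hot_heat',
        # Assumed: months are joined with underscores, as in the captured URL
        'month': '_'.join(str(m) for m in months),
        # Assumed: 2 is the site's code for the 1000 to 2999 price band
        'avgPrice': avg_price,
    }
    return f'{BASE_URL}?{urlencode(params)}'


print(build_list_url(1))
```

Whether further pages exist for a given filter has to be checked in the browser first; the sketch only changes the `page` value.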
Code implementation
Import modules
import requests
import parsel
import csv
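Request the data
The headers imitate a browser <can be copied as-is>.
response.text: the response body as text
response.json(): the response body parsed as JSON
response.content: the response body as raw bytes
We call requests.get() to send a GET request to the target URL and receive the response.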
url = 'https://travel.qunar.com/travelbook/list.htm?page=1&order=hot_heat&month=10_11_12&avgPrice=2'
headers = {
    # Browser User-Agent so the request looks like a normal page visit
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
}
response = requests.get(url, headers=headers)
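The three response accessors listed above differ only in how far the body has been decoded. The sketch below imitates that chain with the standard library alone, without calling requests. The JSON payload is made up, and UTF-8 is assumed here, whereas requests picks the text encoding from the response itself.

```python
import json

# Made-up JSON body, standing in for the raw bytes a server would send
raw_body = '{"city": "成都", "days": 3}'.encode('utf-8')

content = raw_body               # like response.content: undecoded bytes
text = content.decode('utf-8')   # like response.text: bytes decoded to str
data = json.loads(text)          # like response.json(): text parsed into Python objects

print(type(content).__name__, type(text).__name__, type(data).__name__)
print(data['city'], data['days'])
```

The list page requested in this article is HTML, so only response.text is needed for the parsing step.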
Parsing
First, build a selector from the response text
selector = parsel.Selector(response.text)
CSS selectors extract content by tag attributes; use the Elements panel in the developer tools to find the tags that hold the data.
# Each .list_item element is one travel guide entry
lis = selector.css('.list_item')
for li in lis:
    title = li.css('.tit a::text').get()
    user_name = li.css('.user_name a::text').get()
    date = li.css('.date::text').get()
    days = li.css('.days::text').get()
    photo_nums = li.css('.photo_nums::text').get()
    fee = li.css('.fee::text').get()
    people = li.css('.people::text').get()
    trip = li.css('.trip::text').get()
    # Join all text under .places, then split it into the "via" part and the "itinerary" part
    places = ''.join(li.css('.places ::text').getall()).split('行程')
    place_1 = places[0].replace('途经：', '')
    place_2 = places[-1].replace('：', '')
    # The last path segment of the href identifies the note
    href = li.css('.tit a::attr(href)').get().split('/')[-1]
    link = f'https://travel.qunar.com/travelbook/note/{href}'
    dit = {
        '标题': title,        # title
        '昵称': user_name,    # author nickname
        '日期': date,         # date
        '耗时': days,         # trip length
        '照片': photo_nums,   # number of photos
        '费用': fee,          # cost
        '人员': people,       # travel companions
        '标签': trip,         # tags
        '途径': place_1,      # places passed through
        '行程': place_2,      # itinerary
        '详情页': link,       # detail page
    }
    print(title, user_name, date, days, photo_nums, fee, people, trip, place_1, place_2, link, sep=' | ')
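The string handling for the places field is the least obvious step in the loop, so here it is in isolation. This is a sketch of a variant, not the loop's exact logic: it uses `str.partition` instead of `split`, and the helper name `split_places` and the sample strings are made up. It assumes the joined text looks like `途经：A>B行程：C>D`, where either part may be missing.

```python
def split_places(text):
    """Split the joined .places text into (via, itinerary).

    Hypothetical helper; assumes text shaped like '途经：A>B行程：C>D'.
    """
    # partition never raises: tail is '' when '行程' does not occur
    head, _, tail = (text or '').partition('行程')
    via = head.replace('途经：', '').strip()
    itinerary = tail.lstrip('：').strip()
    return via, itinerary


print(split_places('途经：成都>重庆行程：解放碑>洪崖洞'))
print(split_places('途经：成都'))
```

With the `split` version in the loop, an entry that has no 行程 part leaves `place_2` holding the via text with only the colon removed; the partition variant returns an empty itinerary in that case.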
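Saving
Run this setup before the parsing loop, then call csv_writer.writerow(dit) at the end of each loop iteration so every entry is written; without that call only the header row is saved. Close the file with f.close() once the loop finishes.
f = open('data.csv', mode='w', encoding='utf-8', newline='')
csv_writer = csv.DictWriter(f, fieldnames=[
    '标题',
    '昵称',
    '日期',
    '耗时',
    '照片',
    '费用',
    '人员',
    '标签',
    '途径',
    '行程',
    '详情页',
])
csv_writer.writeheader()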
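To show the whole write path in one place, here is a small self-contained sketch of `csv.DictWriter` writing a header and then one line per dict. The helper name `write_rows` and the sample rows are made up, and an in-memory buffer stands in for `data.csv` so the result can be printed.

```python
import csv
import io


def write_rows(fp, fieldnames, rows):
    """Write a header plus one line per dict to an open text file object."""
    writer = csv.DictWriter(fp, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Made-up sample rows using two of the column names from the article
rows = [
    {'标题': '成都三日游', '费用': '人均1500元'},
    {'标题': '重庆周末', '费用': '人均1200元'},
]

buffer = io.StringIO(newline='')  # in-memory stand-in for open('data.csv', ...)
write_rows(buffer, ['标题', '费用'], rows)
print(buffer.getvalue())
```

With a real file, wrapping the call in `with open('data.csv', mode='w', encoding='utf-8', newline='') as f:` closes the file automatically when the block ends.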
Data visualization
Import modules and data
import pandas as pd
df = pd.read_csv('data.csv')
df.head()
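The charts in this section read '年份' (year) and '月份' (month) columns, which the scraper does not write to data.csv. One way to get them is to derive both from the '日期' (date) column. The sketch below assumes the date text contains a `YYYY-MM-DD` date, for example `2023-09-20 出发`; that format and the helper name `year_and_month` are assumptions, so check a few real rows first.

```python
import re

DATE_PATTERN = re.compile(r'(\d{4})-(\d{1,2})-(\d{1,2})')


def year_and_month(date_text):
    """Return (year, month) as ints, or (None, None) if no date is found."""
    # Missing CSV cells arrive from pandas as NaN (a float), not a string
    if not isinstance(date_text, str):
        return None, None
    match = DATE_PATTERN.search(date_text)
    if match is None:
        return None, None
    return int(match.group(1)), int(match.group(2))


print(year_and_month('2023-09-20 出发'))
```

With the DataFrame loaded, the columns could then be filled with `df['年份'] = df['日期'].map(lambda s: year_and_month(s)[0])` and the same call with index 1 for '月份'.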
Distribution by year
from pyecharts import options as opts
from pyecharts.charts import Pie

# '年份' (year) is not written by the scraper; derive it from '日期' (date) before running this cell
counts = df['年份'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        # [label, count] pairs, one per slice
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Distribution by year"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
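The pie chart takes a list of [label, count] pairs, which is what zipping the value_counts() index with its values produces. To make that shape concrete without pandas, here is a sketch using `collections.Counter` on made-up values.

```python
from collections import Counter

# Made-up values standing in for one DataFrame column
years = [2023, 2022, 2023, 2021, 2023, 2022]

# most_common() sorts by count, highest first, like value_counts()
pairs = [[label, count] for label, count in Counter(years).most_common()]
print(pairs)
```

For labels with equal counts, the order may differ between Counter and value_counts(), which only affects the order of the slices.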
Distribution by month
from pyecharts import options as opts
from pyecharts.charts import Pie

# '月份' (month) is not written by the scraper; derive it from '日期' (date) before running this cell
counts = df['月份'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Distribution by month"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
Trip length
from pyecharts import options as opts
from pyecharts.charts import Pie

# '耗时' is the trip length column saved by the scraper
counts = df['耗时'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Trip length"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
Cost distribution
from pyecharts import options as opts
from pyecharts.charts import Pie

# '费用' is the cost column saved by the scraper
counts = df['费用'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Cost distribution"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
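Counting the raw cost strings gives one slice per distinct amount, which can get crowded. A possible refinement is to group amounts into bands before counting. The sketch below assumes each cost string contains the amount as digits, for example `人均1500元`; that format, the 500-wide bands, and the helper name `fee_band` are all assumptions.

```python
import re
from collections import Counter


def fee_band(fee_text, width=500):
    """Map a cost string to a band label such as '1500-1999'; None if no number."""
    if not isinstance(fee_text, str):
        return None
    match = re.search(r'\d+', fee_text)
    if match is None:
        return None
    # Round the amount down to the start of its band
    low = int(match.group()) // width * width
    return f'{low}-{low + width - 1}'


# Made-up sample values, including one missing entry
fees = ['人均1500元', '人均1200元', '人均1999元', '人均2800元', None]
bands = Counter(b for b in map(fee_band, fees) if b is not None)
print(bands.most_common())
```

The resulting (label, count) pairs can be passed to the pie chart in place of the zipped value_counts() output.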
Travel companions
from pyecharts import options as opts
from pyecharts.charts import Pie

# '人员' is the travel companions column saved by the scraper
counts = df['人员'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Travel companions"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
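Closing remarks
Thanks for reading! This flight ends here 🛬
I hope this article was helpful 🎉 and that you picked up something new.
Even the stars in hiding 🍥 are working hard to shine, so keep going too (let's keep at it together).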
<a id="article_bottom"></a>