
The 8-Day Holiday Is Coming: Analyzing Qunar (去哪儿) Travel Guide Data with Python and Building Visualization Charts

Preface

Hi everyone, Mowang (魔王) here ❤


The 2023 Mid-Autumn Festival and National Day are almost here, and the good news is that together they make an 8-day holiday!


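This long holiday gives many people an excellent chance to relax,

and many cannot wait to act on travel plans they have been putting off for a long time,

so plenty of friends have already started planning their itineraries.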


Today we analyze travel guide data from Qunar

to see how to enjoy food, lodging, and sightseeing at a reasonable price.

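Environment

  • Interpreter version >>> python 3.8

  • Code editor >>> pycharm 2021.2

Modules used

  • requests >>> mainly used to send HTTP requests / third-party module

  • parsel >>> parses the response string so its content can be matched with re, XPath, or CSS selectors / third-party module

  • csv >>> writes the scraped rows to a CSV file / built-in module

Installing third-party modules:

Press Win + R, type cmd, then run the install command: pip install <module name>

(If installation feels slow, you can switch to a mirror hosted in China.)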

Analyzing the data source

1. Define the requirements

This time the selected months are October to December, and the trip cost is in the 1000 to 2999 range.


2. Inspect the network traffic

Press F12 to open the developer tools, click search, and enter the data you are looking for.


Locate the URL that returns the data:

https://travel.qunar.com/travelbook/list.htm?page=1&order=hot_heat&&month=10_11_12&avgPrice=2
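
The query string in this URL carries the filters chosen earlier. As a minimal sketch, the same URL can be assembled from its parts so that other pages are easy to request. The helper name build_list_url is hypothetical, and it is an assumption that later result pages are reached simply by increasing page; the parameter values are copied from the captured URL.

```python
from urllib.parse import urlencode

BASE_URL = "https://travel.qunar.com/travelbook/list.htm"


def build_list_url(page: int) -> str:
    """Return the list-page URL for one page number (hypothetical helper)."""
    params = {
        "page": page,
        "order": "hot_heat",
        "month": "10_11_12",  # October to December, as in the captured URL
        "avgPrice": 2,        # value taken from the captured URL
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Assumption: pages 2, 3, ... exist and use the same parameters.
urls = [build_list_url(page) for page in range(1, 4)]
print(urls[0])
```

Building the query from a dict also avoids the doubled && separators seen in the captured URL, which are harmless but unnecessary.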

Code implementation

Import modules

import requests
import parsel
import csv

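Request the data

Simulate a browser with a User-Agent header <it can be copied directly>

  • response.text returns the response body as text

  • response.json() returns the response body parsed as JSON

  • response.content returns the response body as raw bytes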
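
The relationship between these three accessors can be illustrated without any network access. The sketch below only mimics it with the standard library: the payload is made up, and the real requests library chooses the text encoding for you rather than assuming UTF-8.

```python
import json

# Hypothetical payload standing in for a response body.
raw_bytes = '{"city": "北京", "days": 3}'.encode("utf-8")  # like response.content (bytes)
text = raw_bytes.decode("utf-8")                            # roughly what response.text gives (str)
data = json.loads(text)                                     # roughly what response.json() gives (dict)

print(type(raw_bytes).__name__, type(text).__name__, type(data).__name__)
```

The Qunar list page is HTML rather than JSON, which is why this article works with response.text.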

We use requests.get() to send a GET request to the given URL and obtain the response.

# List page: sorted by popularity, months 10 to 12, price band 2
url = 'https://travel.qunar.com/travelbook/list.htm?page=1&order=hot_heat&&month=10_11_12&&avgPrice=2'
# Send a browser-like User-Agent so the request looks like a normal visit
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
}
response = requests.get(url, headers=headers)

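Parse

First take the response text:

selector = parsel.Selector(response.text)

CSS selectors extract content based on tag attributes; use the Elements panel in the developer tools to find the tags that hold the data.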

# Each travel note on the list page is one .list_item element
lis = selector.css('.list_item')
for li in lis:
    title = li.css('.tit a::text').get()
    user_name = li.css('.user_name a::text').get()
    date = li.css('.date::text').get()
    days = li.css('.days::text').get()
    photo_nums = li.css('.photo_nums::text').get()
    fee = li.css('.fee::text').get()
    people = li.css('.people::text').get()
    trip = li.css('.trip::text').get()
    # Join all text under .places, then split it at the '行程' (itinerary) label
    places = ''.join(li.css('.places ::text').getall()).split('行程')
    place_1 = places[0].replace('途经：', '')  # places passed through
    place_2 = places[-1].replace('：', '')     # itinerary
    # The last path segment of the link is the note id
    href = li.css('.tit a::attr(href)').get().split('/')[-1]
    link = f'https://travel.qunar.com/travelbook/note/{href}'
    # Keys: title, nickname, date, duration, photos, cost, travelers,
    # tags, via, itinerary, detail page
    dit = {
        '标题': title,
        '昵称': user_name,
        '日期': date,
        '耗时': days,
        '照片': photo_nums,
        '费用': fee,
        '人员': people,
        '标签': trip,
        '途径': place_1,
        '行程': place_2,
        '详情页': link,
    }
    print(title, user_name, date, days, photo_nums, fee, people, trip, place_1, place_2, link, sep=' | ')
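
The handling of the .places text is the least obvious step in the loop, so here it is in isolation. This is a sketch on a made-up string: the function name split_places and the sample text are hypothetical, and the layout "途经：…行程：…" is inferred from the split and replace calls rather than confirmed against the live page.

```python
from typing import Tuple


def split_places(text: str) -> Tuple[str, str]:
    """Split the joined .places text into (via, itinerary)."""
    parts = text.split('行程')                  # cut at the itinerary label
    via = parts[0].replace('途经：', '')         # drop the "via" label
    itinerary = parts[-1].replace('：', '')      # drop the leftover colon
    return via, itinerary


sample = '途经：成都>重庆行程：宽窄巷子>洪崖洞'  # hypothetical sample text
via, itinerary = split_places(sample)
print(via, itinerary, sep=' | ')
```

If the text contains no '行程' label, the split yields a single part, so both results come from the same string; a stricter version might return an empty itinerary in that case.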

Save

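# Run this before the parsing loop so the writer exists when rows arrive
f = open('data.csv', mode='w', encoding='utf-8', newline='')
csv_writer = csv.DictWriter(f, fieldnames=[
    '标题',
    '昵称',
    '日期',
    '耗时',
    '照片',
    '费用',
    '人员',
    '标签',
    '途径',
    '行程',
    '详情页',
])
csv_writer.writeheader()
# Then, at the end of each loop iteration, write the row:
#     csv_writer.writerow(dit)
# and call f.close() once the loop has finished.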
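
The snippet above only writes the header, so the order of operations matters: create the writer, write the header once, then write one row per parsed item. A minimal sketch of that flow, using an in-memory buffer instead of data.csv and two made-up rows with only two of the columns:

```python
import csv
import io

# Hypothetical rows standing in for the dit dictionaries built in the loop.
rows = [
    {'标题': 'Sample trip A', '费用': '人均2000元'},
    {'标题': 'Sample trip B', '费用': '人均1500元'},
]

buffer = io.StringIO(newline='')  # stands in for open('data.csv', ...)
writer = csv.DictWriter(buffer, fieldnames=['标题', '费用'])
writer.writeheader()              # once, before any rows
for row in rows:
    writer.writerow(row)          # once per parsed item

output = buffer.getvalue()
print(output)
```

With a real file, a with open(...) block would also close the file automatically once all rows are written.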

Data visualization

Import modules and load the data (pandas and pyecharts are third-party modules and also need pip install)
import pandas as pd

df = pd.read_csv('data.csv')
df.head()
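
The charts that follow read 年份 (year) and 月份 (month) columns, which are not among the fields saved to data.csv, so they presumably have to be derived from 日期 (date) first. The sketch below shows one way to do that. It assumes the date text contains a YYYY-MM-DD date, which is not confirmed by the article; the function name parse_year_month and the sample string are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumption: the scraped date text contains something like "2023-10-01".
DATE_PATTERN = re.compile(r'(\d{4})-(\d{1,2})-(\d{1,2})')


def parse_year_month(date_text: str) -> Optional[Tuple[int, int]]:
    """Return (year, month) found in the text, or None if no date is present."""
    match = DATE_PATTERN.search(date_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_year_month('2023-10-01 出发'))  # hypothetical sample value
```

The two columns could then be filled by applying this function to each value of df['日期'] and keeping the first or second element, skipping rows where it returns None.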

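Year distribution

The 年份 (year) and 月份 (month) columns used in this and the next chart are not among the fields saved to data.csv; they must first be derived from the 日期 (date) column.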

from pyecharts import options as opts
from pyecharts.charts import Pie

# Count how many travel notes fall into each year
counts = df['年份'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Year distribution"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")  # alternative: write a standalone HTML file
)
c.render_notebook()

Month distribution

from pyecharts import options as opts
from pyecharts.charts import Pie

# Count how many travel notes fall into each month
counts = df['月份'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Month distribution"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")  # alternative: write a standalone HTML file
)
c.render_notebook()

Trip duration

from pyecharts import options as opts
from pyecharts.charts import Pie

# Count how many travel notes report each trip duration
counts = df['耗时'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Trip duration"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")  # alternative: write a standalone HTML file
)
c.render_notebook()

Cost distribution

from pyecharts import options as opts
from pyecharts.charts import Pie

# Count how many travel notes report each cost value
counts = df['费用'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Cost distribution"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")  # alternative: write a standalone HTML file
)
c.render_notebook()

Traveler distribution

from pyecharts import options as opts
from pyecharts.charts import Pie

# Count how many travel notes fall into each traveler category
counts = df['人员'].value_counts()
num = counts.to_list()
info = counts.index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Traveler distribution"),
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # .render("pie_scroll_legend.html")  # alternative: write a standalone HTML file
)
c.render_notebook()
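
The five charts above differ only in the column they count and in their title, so the repeated code could be folded into one helper. The sketch below is one possible way to do that, not the article's own code: value_count_pairs and make_pie are hypothetical names, and the counting is done with collections.Counter instead of pandas value_counts. Unlike value_counts, Counter does not drop missing values, so they are removed beforehand in the usage comment.

```python
from collections import Counter
from typing import Iterable, List


def value_count_pairs(values: Iterable) -> List[list]:
    """Return [label, count] pairs, most frequent first."""
    return [[label, count] for label, count in Counter(values).most_common()]


def make_pie(values: Iterable, title: str):
    """Build a pie chart with the same options as the charts above."""
    # Imported lazily so value_count_pairs works without pyecharts installed.
    from pyecharts import options as opts
    from pyecharts.charts import Pie

    return (
        Pie()
        .add("", value_count_pairs(values), center=["40%", "50%"])
        .set_global_opts(
            title_opts=opts.TitleOpts(title=title),
            legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
        )
        .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    )


print(value_count_pairs(["couple", "family", "couple"]))
# In a notebook, with df loaded as above:
# make_pie(df['人员'].dropna().to_list(), "Traveler distribution").render_notebook()
```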

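Closing words

Finally, thank you for reading my article~ This flight ends here 🛬

I hope this article was helpful 🎉 and that you learned something from it~

Even the hidden stars 🍥 are working hard to shine, so keep going too (let's work hard together).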


<a id="article_bottom"></a>


https://www.xamrdz.com/lan/5b41995230.html
