Python Flask Scheduled Epidemic Big-Data Crawling Full-Stack Project in Practice - 7. Crawling Tencent Epidemic Data

python-flask xuhss

Crawling Tencent data

Crawl Tencent's epidemic statistics site.

The API URL that the page's JavaScript fetches its data from: https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5

import requests
import json
header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
}
url = "https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5"
res = requests.get(url, headers = header)
d = json.loads(res.text)

data_all = json.loads(d["data"])
print(type(data_all))

print(data_all.keys())
print(data_all["lastUpdateTime"])  # time of the last data update
print(data_all["chinaTotal"])  # current cumulative totals
print(data_all["chinaAdd"])  # current newly added figures
print(data_all["areaTree"][0].keys())  # data for China
print(data_all["areaTree"][0]["name"])  # name
print(data_all["areaTree"][0]["today"])  # today's data
print(data_all["areaTree"][0]["total"])  # cumulative data
print(len(data_all["areaTree"][0]["children"]))  # epidemic data for the 34 provincial-level regions
for i in data_all["areaTree"][0]["children"]:
    print(i["name"])  # print the name of each region
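
One detail in the script above is easy to miss: the response is decoded twice, because the "data" field arrives as a JSON string embedded in the outer JSON object. The sketch below reproduces that shape offline. The payload is a made-up sample with placeholder values (including the "ret" field), not real output from the API, and decode_payload is a hypothetical helper name.

```python
import json

def decode_payload(text):
    """Decode the outer JSON, then the JSON string stored under "data"."""
    outer = json.loads(text)          # outer envelope holding a "data" string
    return json.loads(outer["data"])  # inner document with the actual figures

# Made-up sample mimicking the double-encoded shape (values are placeholders).
inner = {"lastUpdateTime": "2021-03-09 21:00:00", "chinaTotal": {"confirm": 1}}
sample_text = json.dumps({"ret": 0, "data": json.dumps(inner)})

data_all = decode_payload(sample_text)
print(type(data_all))              # a dict, like data_all in the script above
print(data_all["lastUpdateTime"])
```

Calling json.loads only once would leave "data" as a plain string, and indexing it with a key such as "lastUpdateTime" would raise a TypeError.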


JSON data structure

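lastUpdateTime: time of the last update

chinaTotal: cumulative totals

areaTree:

    areaTree[0]: data for China

        name: name

        total: cumulative data

        today: the current day's data

        children: data for the 34 provincial-level regions (a list)

            name: name of each provincial-level region

            today

            total

            children: city-level data (a list)

                name: name of each city-level region

                today

                total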
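
To make the nesting concrete, here is a minimal sketch that walks the tree from the country level down to the cities and flattens it into rows. The sample tree, its region names and all its numbers are placeholders invented for illustration, and flatten_area_tree is a hypothetical helper, not part of the original project.

```python
def flatten_area_tree(area_tree):
    """Yield (province, city, total_confirm, today_confirm) for every city."""
    country = area_tree[0]                         # areaTree[0] is China
    for province in country["children"]:           # provincial-level regions
        for city in province.get("children", []):  # city-level regions
            yield (
                province["name"],
                city["name"],
                city["total"]["confirm"],
                city["today"]["confirm"],
            )

# Made-up sample with placeholder numbers, shaped like the outline above.
sample_tree = [{
    "name": "China",
    "today": {"confirm": 3},
    "total": {"confirm": 30},
    "children": [{
        "name": "Hubei",
        "today": {"confirm": 1},
        "total": {"confirm": 10},
        "children": [
            {"name": "Wuhan", "today": {"confirm": 1}, "total": {"confirm": 8}},
            {"name": "Xiaogan", "today": {"confirm": 0}, "total": {"confirm": 2}},
        ],
    }],
}]

rows = list(flatten_area_tree(sample_tree))
for row in rows:
    print(row)
```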

Consolidated code that collects the data to be written to the database

import requests
import json

def get_tencent_data():
    """
    :return: the history data and the current day's detailed data
    """
    header = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
    }
    url = "https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5"
    r = requests.get(url, headers=header)
    res = json.loads(r.text)  # convert the JSON string to a dict
    data_all = json.loads(res["data"])  # the "data" field is itself a JSON string
    update_time = data_all["lastUpdateTime"]

    history = {}  # dict: key is the update time, value holds the figures
    china_total = data_all["chinaTotal"]
    confirm = china_total["confirm"]
    suspect = china_total["suspect"]
    heal = china_total["heal"]
    dead = china_total["dead"]
    history[update_time] = {"confirm": confirm, "suspect": suspect, "heal": heal, "dead": dead}

    china_add = data_all["chinaAdd"]
    confirm_add = china_add["confirm"]
    suspect_add = china_add["suspect"]
    heal_add = china_add["heal"]
    dead_add = china_add["dead"]
    history[update_time].update({"confirm_add": confirm_add, "suspect_add": suspect_add, "heal_add": heal_add, "dead_add": dead_add})
    print(history)

    details = []  # the current day's detailed data, one row per city
    data_country = data_all["areaTree"]  # countries: only China, no others
    data_province = data_country[0]["children"]  # China's provinces
    for pro_infos in data_province:
        province = pro_infos["name"]  # province name
        for city_infos in pro_infos["children"]:
            city = city_infos["name"]  # city name
            confirm = city_infos["total"]["confirm"]
            confirm_add = city_infos["today"]["confirm"]
            heal = city_infos["total"]["heal"]
            dead = city_infos["total"]["dead"]
            details.append([update_time, province, city, confirm, confirm_add, heal, dead])
            print([update_time, province, city, confirm, confirm_add, heal, dead])
    return history, details

get_tencent_data()
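
The function returns two structures meant for storage: history, keyed by update time, and details, with one row per city. As a rough sketch of how they might be persisted, the following uses Python's built-in sqlite3 module with an in-memory database. The choice of SQLite, the table and column names, the save_to_db helper and the sample rows are all assumptions for illustration; the original project may use a different database and schema.

```python
import sqlite3

def save_to_db(conn, history, details):
    """Store the two return values of get_tencent_data() (hypothetical schema)."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "ds TEXT PRIMARY KEY, confirm INTEGER, confirm_add INTEGER, "
        "suspect INTEGER, suspect_add INTEGER, heal INTEGER, heal_add INTEGER, "
        "dead INTEGER, dead_add INTEGER)"
    )
    cur.execute(
        "CREATE TABLE IF NOT EXISTS details ("
        "update_time TEXT, province TEXT, city TEXT, "
        "confirm INTEGER, confirm_add INTEGER, heal INTEGER, dead INTEGER)"
    )
    for ds, v in history.items():
        # INSERT OR REPLACE keeps one row per update time if the job reruns.
        cur.execute(
            "INSERT OR REPLACE INTO history VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (ds, v["confirm"], v["confirm_add"], v["suspect"], v["suspect_add"],
             v["heal"], v["heal_add"], v["dead"], v["dead_add"]),
        )
    cur.executemany("INSERT INTO details VALUES (?, ?, ?, ?, ?, ?, ?)", details)
    conn.commit()

# Placeholder data in the same shape as the function's return values.
history = {"2021-03-09 21:00:00": {
    "confirm": 10, "suspect": 1, "heal": 8, "dead": 1,
    "confirm_add": 2, "suspect_add": 0, "heal_add": 1, "dead_add": 0}}
details = [["2021-03-09 21:00:00", "Hubei", "Wuhan", 8, 1, 7, 1]]

conn = sqlite3.connect(":memory:")
save_to_db(conn, history, details)
print(conn.execute("SELECT COUNT(*) FROM details").fetchone()[0])
```

Keying the history table on the update time means a scheduled job that runs again before the source data changes overwrites the same row instead of duplicating it. The details table has no such key in this sketch, so repeated runs would append duplicate city rows unless old rows are cleared first.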
