测试积点老人 posted on 2022-1-4 10:29:19

How can I use selenium to get all the followers and post content and write them to an Excel sheet?

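Symptoms and background
I can't get all of the data, and I don't know how to link the data from the posts page with the data from the followers page.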
import time

from selenium import webdriver
from selenium.webdriver.common.by import By


# Read the credentials at run time instead of hard-coding them
user_ = input('Enter account: ')
password = input('Enter password: ')
url = "https://weibo.com/login.php"
driver = webdriver.Chrome()
driver.get(url)
time.sleep(0.5)
driver.maximize_window()


# Fill in the account name
driver.find_element(By.ID, 'loginname').send_keys(user_)
time.sleep(2)


# Fill in the password and submit the login form
driver.find_element(By.XPATH, '//*[@id="pl_login_form"]/div/div/div/div/input').send_keys(password)
time.sleep(2)
driver.find_element(By.XPATH, '//*[@id="pl_login_form"]/div/div/div/a').click()
# Leave time to complete any manual verification step
time.sleep(20)

# After logging in, open the 六星 (liuxingedu) page
url_six = "https://weibo.com/liuxingedu"
driver.get(url_six)
time.sleep(2)
# driver.find_element(By.XPATH, '//*[@id="__sidebar"]/div/div/div/div/div/div/div/div/div').click()

# Scroll the page
for i in range(1, 1000):
    js = "var q=document.documentElement.scrollTop=%s" % (i * 300)
    time.sleep(0.3)
    driver.execute_script(js)

# Get the post content
# This endpoint returns JSON rather than the rendered page, so there is no
# "#app" element to query; Chrome normally shows the raw response in a <pre>
title_url = "https://weibo.com/ajax/statuses/mymblog?uid=7617227236&page=1&feature=0"
driver.get(title_url)
titles = driver.find_elements(By.TAG_NAME, 'pre')
for title in titles:
    print(1, title.text)

# Go to the followers page
# The link that was clicked here does not exist on the JSON page above,
# so the followers URL is opened directly instead
fans_url = 'https://weibo.com/u/page/follow/7617227236?relate=fans'
driver.get(fans_url)
time.sleep(2)
fans_list = []
fans_nums = driver.find_elements(By.CLASS_NAME, 'vue-recycle-scroller__item-view')
for fans in fans_nums:
    # Store the visible text; {fans} would build a set holding the WebElement
    f_dict = {'text': fans.text}
    fans_list.append(f_dict)
print(fans_list)

# Scroll the page
for i in range(1, 1000):
    js = "var q=document.documentElement.scrollTop=%s" % (i * 300)
    time.sleep(0.3)
    driver.execute_script(js)
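
One likely reason the followers list comes back incomplete: the class name `vue-recycle-scroller__item-view` belongs to a virtual scroller, which keeps only the rows near the viewport in the DOM and reuses them while scrolling. A single `find_elements` call therefore sees only the rows that happen to be rendered at that moment. Below is a minimal sketch of collecting while scrolling and stopping once nothing new shows up. The names `collect_while_scrolling`, `read_visible` and `scroll_once` are made up for illustration, and the Selenium wiring in the trailing comment is an untested assumption about how it could be hooked up.

```python
import time
from typing import Callable, Iterable, List


def collect_while_scrolling(
    read_visible: Callable[[], Iterable[str]],
    scroll_once: Callable[[], None],
    max_idle_rounds: int = 5,
    max_rounds: int = 1000,
    pause: float = 0.0,
) -> List[str]:
    """Scroll step by step and keep every distinct row text seen so far.

    Rows are de-duplicated by their text, so two rows with identical text
    are counted once (an assumption that may not suit every page).
    """
    seen = set()
    ordered = []
    idle = 0
    for _ in range(max_rounds):
        new_found = False
        for text in read_visible():
            if text and text not in seen:
                seen.add(text)
                ordered.append(text)
                new_found = True
        # Count consecutive rounds that produced nothing new
        idle = 0 if new_found else idle + 1
        if idle >= max_idle_rounds:
            break
        scroll_once()
        if pause:
            time.sleep(pause)
    return ordered


# Hypothetical wiring with Selenium 4 (`driver` is the WebDriver instance):
# from selenium.webdriver.common.by import By
# fans = collect_while_scrolling(
#     lambda: [e.text for e in driver.find_elements(
#         By.CLASS_NAME, "vue-recycle-scroller__item-view")],
#     lambda: driver.execute_script("window.scrollBy(0, 300)"),
#     pause=0.3,
# )
```

Passing the two page operations in as callables keeps the loop itself independent of the browser, so the stop condition can be tried out without logging in.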

qqq911 posted on 2022-1-5 10:46:29

Take a look at how web crawlers handle this.
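
As a rough illustration of the crawler-style approach: the `mymblog` URL in the question returns JSON, so the posts can be parsed directly instead of being scraped from the rendered page. The posts and the followers are two separate lists for the same account, so one simple way to "link" them is to tag every row with its kind and write both into one table. This is only a sketch under assumptions: the keys `data`, `list` and `text_raw`, the sample response, and the idea that increasing `page` until the list is empty reaches every post are all guesses to check against the real response. It writes CSV, which Excel can open, rather than a real .xlsx file, and the function names are made up.

```python
import csv
import io
import json
from typing import Callable, Dict, List

# Made-up sample shaped like what the endpoint is ASSUMED to return
SAMPLE_RESPONSE = (
    '{"ok": 1, "data": {"list": ['
    '{"id": 1, "text_raw": "first post"}, '
    '{"id": 2, "text_raw": "second post"}]}}'
)
EMPTY_RESPONSE = '{"ok": 1, "data": {"list": []}}'


def extract_posts(raw_json: str) -> List[Dict[str, str]]:
    """Turn one JSON page into rows; the key names are assumptions."""
    payload = json.loads(raw_json)
    items = payload.get("data", {}).get("list", [])
    return [{"kind": "post", "content": str(item.get("text_raw", ""))}
            for item in items]


def collect_all_posts(fetch_page: Callable[[int], str],
                      max_pages: int = 1000) -> List[Dict[str, str]]:
    """Request page 1, 2, ... until a page yields no posts."""
    rows = []
    for page in range(1, max_pages + 1):
        page_rows = extract_posts(fetch_page(page))
        if not page_rows:
            break
        rows.extend(page_rows)
    return rows


def fans_to_rows(fans: List[str]) -> List[Dict[str, str]]:
    """Tag follower texts so they can share a table with the posts."""
    return [{"kind": "fan", "content": name} for name in fans]


def write_rows(rows: List[Dict[str, str]], fileobj) -> None:
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=["kind", "content"])
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical usage; "utf-8-sig" helps Excel detect UTF-8 text:
# rows = collect_all_posts(fetch_page) + fans_to_rows(fans)
# with open("weibo.csv", "w", newline="", encoding="utf-8-sig") as f:
#     write_rows(rows, f)
```

Here `fetch_page` stands for whatever returns the raw JSON text of a given page, for example the text of the `<pre>` element after `driver.get(...)` on the page URL. For a real .xlsx file, `openpyxl` or pandas' `DataFrame.to_excel` could replace the `csv` part.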