51Testing Software Testing Forum

Title: Selenium crawler gets detected. How do I get around it?

Author: 测试积点老人    Time: 2020-7-13 11:32
Hello, I am using Selenium to scrape a website, but the site identifies me as a crawler. Is there any way to get around this? My code is as follows:
    import time
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.common.exceptions import TimeoutException

    browser = webdriver.Chrome()
    browser.implicitly_wait(40)  # wait up to 40 s when locating elements
    browser.get("https://www.crunchbase.com/app/search/companies/")

    time.sleep(60)  # keep the browser open for 60 s to inspect the page
The page returns:
    Pardon Our Interruption...
    As you were browsing crunchbase accelerates innovation by bringing together data on companies and the people behind them. something about your browser made us think you were a bot. There are a few reasons this might happen:

    You're a power user moving through this website with super-human speed.
    You've disabled JavaScript in your web browser.
    A third-party browser plugin, such as Ghostery or NoScript, is preventing JavaScript from running. Additional information is available in this support article.
    To request an unblock, please fill out the form below and we will review it as soon as possible.

The site uses the anti-crawler service from http://distilnetworks.com.
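For anyone reproducing this, it may help to detect the block page in code instead of reading it by eye. The following is only a minimal sketch: it assumes the heading "Pardon Our Interruption" quoted in the question appears somewhere in the returned HTML, and the names `BLOCK_MARKER` and `looks_blocked` are made up for illustration.

```python
# Marker text taken from the block page quoted in the question.
# Assumption: the same text appears in the raw HTML of the block page.
BLOCK_MARKER = "Pardon Our Interruption"


def looks_blocked(page_source: str) -> bool:
    """Return True if the HTML appears to be the anti-bot interruption page."""
    # Case-insensitive substring check on the page source.
    return BLOCK_MARKER.lower() in page_source.lower()
```

With the script from the question, this could be called as `looks_blocked(browser.page_source)` right after `browser.get(...)`, so the script can stop or retry instead of sleeping for 60 seconds on a block page.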

Author: 海海豚    Time: 2020-7-14 09:38
Take a look at this: https://blog.csdn.net/Python1996/article/details/99709167
Author: 郭小贱    Time: 2020-7-14 09:49
This might be a useful reference: https://www.zhihu.com/question/50738719
Author: bellas    Time: 2020-7-14 09:54
See this link for reference: https://www.jianshu.com/p/5e34a8f95512
Author: qqq911    Time: 2020-7-14 10:19
Add request headers.
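
The last reply does not say which headers to add or how. As far as I know, Selenium's WebDriver has no built-in call for setting arbitrary request headers, but Chrome accepts a `--user-agent` command-line switch that can be passed through `ChromeOptions`. Below is a minimal sketch under that assumption. The User-Agent string is only an example value, and `user_agent_argument` and `make_browser` are hypothetical helper names. Nothing in the thread confirms that this alone gets past the Distil check.

```python
# Example value only (an assumption, not from the thread):
# a desktop Chrome User-Agent string.
EXAMPLE_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"
)


def user_agent_argument(user_agent: str) -> str:
    """Build the Chrome switch that overrides the User-Agent header."""
    return f"--user-agent={user_agent}"


def make_browser(user_agent: str = EXAMPLE_USER_AGENT):
    """Start Chrome with a custom User-Agent (needs selenium and chromedriver)."""
    # Imported here so the helper above can be used without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(user_agent_argument(user_agent))
    return webdriver.Chrome(options=options)
```

Usage would look like `browser = make_browser()` followed by `browser.get("https://www.crunchbase.com/app/search/companies/")`, replacing the plain `webdriver.Chrome()` call in the original script.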



