Fixing images that fail to load in Selenium screenshots [lazy loading]

Reposted article. 2020-10-10 19:26. Category: Web scraping / Selenium

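Requirement:

Take a screenshot of a page, then convert it to PDF.

Problem:

After taking a screenshot with Selenium, the images in the page have not loaded.

[Figure: screenshot in which the images are missing]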

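Cause:

The site uses lazy loading: page elements are loaded dynamically only when the browser's vertical scroll bar reaches their position.

What is image lazy loading?

Image lazy loading is a web optimization technique. Images are network resources, and requesting them consumes bandwidth just like any other static resource. Loading every image on the page at once significantly increases the first-screen load time.

To address this, the front end and back end cooperate so that an image is loaded only when it appears in the browser's current viewport. This technique, which reduces the number of image requests on the first screen, is called "image lazy loading".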
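
To make the mechanism concrete, here is a toy model in Python (not browser code). It assumes each image is described only by the vertical offset of its top edge, and that an image starts loading once that offset falls inside the current viewport. The function name `images_in_viewport` and the sample offsets are made up for illustration; real lazy-loading libraries use more elaborate rules.

```python
def images_in_viewport(image_tops, scroll_top, viewport_height):
    """Return the offsets of images whose top edge lies inside the viewport."""
    bottom = scroll_top + viewport_height
    return [top for top in image_tops if scroll_top <= top < bottom]


# Hypothetical page: four images at these vertical offsets (in pixels).
image_tops = [100, 900, 1700, 2500]

# Without scrolling, an 800px-tall viewport only triggers the first image.
loaded = set(images_in_viewport(image_tops, scroll_top=0, viewport_height=800))

# Scrolling down in 500px steps eventually brings every image into view.
loaded_after_scroll = set()
for scroll_top in range(0, 3000, 500):
    loaded_after_scroll.update(images_in_viewport(image_tops, scroll_top, 800))
```

In this model a screenshot taken without scrolling would contain only the first image, which matches the symptom described above.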

Solution:

Simulate a user dragging the scroll bar so that the page loads its content.

Code that simulates scrolling:

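        # Assumes `driver`, `link`, and `import time` are already set up (see the full code below).
        js_height = "return document.body.clientHeight"
        driver.get(link)
        k = 1
        height = driver.execute_script(js_height)
        # Scroll down 500px at a time until the bottom of the page is reached.
        while k * 500 < height:
            js_move = "window.scrollTo(0,{})".format(k * 500)
            print(js_move)
            driver.execute_script(js_move)
            time.sleep(0.2)
            # Re-read the height, since it can grow as content loads.
            height = driver.execute_script(js_height)
            k += 1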
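
The same loop can be wrapped in a reusable function. The sketch below is one possible packaging, not the original author's code: the name `scroll_through_page` and the `step`, `pause`, and `max_steps` parameters are assumptions. `max_steps` is an added safety cap for pages that keep growing as you scroll (infinite feeds), which the original loop does not guard against.

```python
import time


def scroll_through_page(driver, step=500, pause=0.2, max_steps=200):
    """Scroll down in fixed steps so lazy-loaded content gets triggered.

    `driver` is any object with a Selenium-style execute_script() method.
    Returns the number of scroll steps performed.
    """
    js_height = "return document.body.clientHeight"
    height = driver.execute_script(js_height)
    k = 1
    while k * step < height and k <= max_steps:
        driver.execute_script("window.scrollTo(0, {})".format(k * step))
        time.sleep(pause)
        # Re-read the height: it can grow as new content is loaded.
        height = driver.execute_script(js_height)
        k += 1
    return k - 1
```

Typical use would be to call `scroll_through_page(driver)` right after `driver.get(link)` and before measuring the page size for the screenshot.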

Full code:

#!/usr/bin/python3
# -*- coding:utf-8 -*-
"""
@author: lms
@file: screenshot.py
@time: 2020/10/10 13:02
@desc: 
"""

import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from PIL import Image


def screenshot_and_convert_to_pdf(link):
    path = '.'

    # Headless mode is required; otherwise the screenshot is limited to the
    # height of your screen instead of covering the full page.
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')
    chrome_options.add_argument('--no-sandbox')
    # Pass `options=`; the older `chrome_options=` keyword is deprecated.
    driver = webdriver.Chrome(options=chrome_options)
    try:
        driver.implicitly_wait(20)
        driver.get(link)

        # Simulate a user scrolling down to trigger lazy-loaded images.
        js_height = "return document.body.clientHeight"
        k = 1
        height = driver.execute_script(js_height)
        while k * 500 < height:
            js_move = "window.scrollTo(0,{})".format(k * 500)
            print(js_move)
            driver.execute_script(js_move)
            time.sleep(0.2)
            # Re-read the height, since it can grow as content loads.
            height = driver.execute_script(js_height)
            k += 1

        time.sleep(1)
        # Key step for a full-page capture: read the page size via JavaScript.
        width = driver.execute_script("return document.documentElement.scrollWidth")
        height = driver.execute_script("return document.documentElement.scrollHeight")
        print(width, height)
        # Resize the browser window to the page size just measured.
        driver.set_window_size(width, height)
        time.sleep(1)

        png_path = '{}/{}.png'.format(path, '123456')
        # Take the screenshot.
        driver.save_screenshot(png_path)

        # Convert the PNG to PDF.
        image1 = Image.open(png_path)
        im1 = image1.convert('RGB')
        pdf_path = png_path.replace('.png', '.pdf')
        im1.save(pdf_path)

    except Exception as e:
        print(e)
    finally:
        # Always shut the browser down, even if something failed above.
        driver.quit()


if __name__ == '__main__':
    screenshot_and_convert_to_pdf('https://mp.weixin.qq.com/s/nJRnGpPVeJ1kdMIOwiPNpg')
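
The fixed `time.sleep(1)` before the screenshot is a guess at how long the images need. One possible alternative, sketched below, is to poll the page until the browser reports every `<img>` as loaded. The helper name `wait_for_images` and its `timeout` and `poll` parameters are assumptions, not part of the original script. The check relies on the standard DOM properties `complete` and `naturalWidth`, so a permanently broken image will make it run until the timeout, and a lazy-loading library that shows a placeholder image may be reported as loaded too early.

```python
import time

# JavaScript that counts <img> elements the browser has not finished loading.
JS_PENDING_IMAGES = """
return Array.from(document.images).filter(
    function (img) { return !img.complete || img.naturalWidth === 0; }
).length;
"""


def wait_for_images(driver, timeout=10.0, poll=0.5):
    """Poll until no image is pending or the timeout expires.

    Returns True if all images are reported as loaded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script(JS_PENDING_IMAGES) == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

In the script above, a call such as `wait_for_images(driver)` could stand in for the `time.sleep(1)` that follows the scrolling loop.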

Screenshot after the fix:

Thanks for reading~
