Tuesday, September 24, 2019

Tips of Web Scraping

Python Automatic configuration script PAC:
CSV read/write:
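A minimal sketch of CSV reading and writing with Python's standard `csv` module, which is usually enough for saving scraped rows. The helper names `write_rows` and `read_rows` and the column names in the usage note are illustrative assumptions, not something from the original note.

```python
import csv


def write_rows(path, fieldnames, rows):
    """Write a list of dicts to `path` as CSV with a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_rows(path):
    """Read a CSV file with a header row back into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

For example, `write_rows("items.csv", ["title", "url"], rows)` followed by `read_rows("items.csv")` should round-trip the rows, with every value coming back as a string.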
"Selenium is a set of tools designed for browser automation that lets a program drive the browser directly to perform all kinds of operations on websites." ...
Python + Selenium:
  1. install python
  2. install Selenium $ pip install selenium
  • for Windows
  • for Linux (or Linux under chromeos):
    • get the appropriate version of ChromeDriver (one that matches your installed Chrome)
    • unzip it and move "chromedriver" (e.g., the file extracted from "chromedriver_linux64.zip") to '/usr/local/bin': sudo mv chromedriver /usr/local/bin
    • on ChromeOS, see https://sites.google.com/a/chromium.org/chromedriver/getting-started/chromeos: "All ChromeOS test images shall have Chrome Driver binary installed in /usr/local/chromedriver/."
    • Now, you would need to run something like
      sudo chmod a+x /usr/local/bin/chromedriver to mark it executable.
  3. install BeautifulSoup $ pip install beautifulsoup4
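
Once the steps above are done, a quick check of the setup could look like the sketch below: Selenium loads a page in headless Chrome, and BeautifulSoup pulls the links out of the rendered HTML. It assumes "chromedriver" is on the PATH (e.g., in /usr/local/bin) and a reasonably recent Selenium release. The function names `fetch_rendered_html` and `extract_links` are made up for illustration.

```python
from urllib.parse import urljoin


def fetch_rendered_html(url):
    """Load `url` in headless Chrome and return the rendered page source."""
    # Imported inside the function so the parsing helper below
    # can be tried without a browser or driver installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    # Assumes chromedriver is found on PATH, so no driver path is passed.
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()


def extract_links(html, base_url):
    """Return (link text, absolute URL) pairs for every <a href=...> in `html`."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), urljoin(base_url, a["href"]))
        for a in soup.find_all("a", href=True)
    ]
```

Calling `extract_links(fetch_rendered_html(url), url)` on a page of your choice should then return its links. If Chrome fails to start, a mismatch between the Chrome and ChromeDriver versions is a common cause.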
