Tips for Web Scraping
- How to scrape websites with Python and BeautifulSoup
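- Hands-on | Examples + code: get started with BeautifulSoup, the Python web-scraping library, in one article
- Python web crawler notes for beginners
- A first try at scraping inside the browser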
- Selenium with Python
- Installing Python & Selenium on Windows + a simple tutorial
- Practical Introduction to Web Scraping in Python
- https://medium.com/@yanweiliu/python%E7%88%AC%E8%9F%B2%E5%AD%B8%E7%BF%92%E7%AD%86%E8%A8%98-%E4%B8%80-beautifulsoup-1ee011df8768
- Using Beautiful Soup in Python to fetch and parse web page data: a web crawler development tutorial
- Parsing web pages with Beautiful Soup
- http://blog.castman.net/%E6%95%99%E5%AD%B8/2016/12/22/python-data-science-tutorial-3.html
- https://www.dataquest.io/blog/web-scraping-tutorial-python/
Python and proxy auto-config (PAC) scripts:
- http://programmersought.com/article/63531430070/
- https://pypac.readthedocs.io/en/latest/user_guide.html
CSV read/write:
- https://docs.python.org/3/library/csv.html
- https://realpython.com/python-csv/
- https://blog.gtwang.org/programming/python-csv-file-reading-and-writing-tutorial/
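As a small illustration of what the CSV links above cover, here is a minimal sketch that writes a few scraped records with `csv.DictWriter` and reads them back with `csv.DictReader`. The field names `title` and `url`, the sample rows, and the helper names `write_rows` / `read_rows` are made up for this example; an in-memory buffer stands in for a real file.

```python
import csv
import io

# Hypothetical scraped records; the field names are assumptions for this sketch.
rows = [
    {"title": "Example Domain", "url": "https://example.com/"},
    {"title": "Python", "url": "https://www.python.org/"},
]


def write_rows(fileobj, records, fieldnames=("title", "url")):
    """Write a header line followed by one CSV line per record."""
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)


def read_rows(fileobj):
    """Read every CSV line back as a dict keyed by the header names."""
    return list(csv.DictReader(fileobj))


# An in-memory text buffer keeps the sketch free of side effects.
buffer = io.StringIO(newline="")
write_rows(buffer, rows)
buffer.seek(0)
loaded = read_rows(buffer)
```

With a real file, the csv documentation recommends opening it with `newline=""`, for example `open("results.csv", "w", newline="", encoding="utf-8")` (the file name is only a placeholder).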
"Selenium is a set of tools designed for browser automation; it lets a program drive the browser directly to carry out all kinds of operations on websites." ...
Python + Selenium:
- install python
- install Selenium
$ pip install selenium
- install webdriver https://www.seleniumhq.org/about/platforms.jsp
- for Windows
- download webdriver for Chrome https://sites.google.com/a/chromium.org/chromedriver/
- unzip the webdriver download and put the resulting chromedriver.exe in the same folder as python.exe (e.g., C:\ProgramData\Anaconda3)
- for Linux (or Linux under chromeos):
- get the appropriate version of chromedriver
- copy "chromedriver" (e.g., the file unzipped from "chromedriver_linux64.zip") to '/usr/local/bin' (see https://sites.google.com/a/chromium.org/chromedriver/getting-started/chromeos)
- "All ChromeOS test images shall have Chrome Driver binary installed in /usr/local/chromedriver/."
sudo mv chromedriver /usr/local/bin
- Now, you would need to run something like
sudo chmod a+x /usr/local/bin/chromedriver
to mark it executable.
- install BeautifulSoup
$ pip install beautifulsoup4
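After the install steps above, a quick sanity check can confirm that everything landed where Python expects it. The sketch below uses only the standard library: it checks whether the `selenium` and `bs4` packages are importable and whether a `chromedriver` executable is on the PATH. The function name `check_setup` is made up for this example, and the check assumes the driver folder (e.g., /usr/local/bin, or the folder holding python.exe on Windows) is on the PATH.

```python
import importlib.util
import shutil


def check_setup(packages=("selenium", "bs4"), driver="chromedriver"):
    """Return a dict describing which parts of the setup were found."""
    report = {}
    for name in packages:
        # find_spec returns None when a top-level package cannot be imported.
        report[name] = importlib.util.find_spec(name) is not None
    # shutil.which returns the full path of an executable on PATH, or None.
    report[driver] = shutil.which(driver)
    return report


if __name__ == "__main__":
    for item, status in check_setup().items():
        print(f"{item}: {status if status else 'NOT FOUND'}")
```

Note that the importable name for Beautiful Soup is `bs4`, while the package installed by pip is `beautifulsoup4`.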