javascript - Selenium page_source does not return modified DOM tree


I want to understand how add-ons such as NoScript / Ghostery change the DOM tree before and after loading a certain webpage. NoScript / Ghostery block trackers' and advertisers' scripts and remove them from the DOM tree (as an example, I checked this on CNN.com before and after enabling NoScript in Firefox). However, the scripts are still there if I dump the DOM tree using the Selenium-driven browser. I am using the following code for this process:

    import pickle
    from selenium import webdriver

    # "../<extensions/addons/>" is the placeholder path from the question
    fp = webdriver.FirefoxProfile("../<extensions/addons/>")
    browser = webdriver.Firefox(firefox_profile=fp)
    browser.get("http://www.cnn.com")
    html_source = browser.page_source
    with open("cnn.p", "wb") as f:
        pickle.dump(html_source, f)

The documentation for Selenium's page_source says that it returns the modified DOM tree (in my case, modified by NoScript), but I do not see that happening. I would appreciate it if someone could comment on how to dump the DOM tree as modified by an add-on, using Selenium or any other automated tool.

After trying several approaches, I finally solved my problem. Instead of using webdriver.page_source (which outputs the 'HTML source'), I used webdriver.execute_script("return document.documentElement.outerHTML") to dump the modified HTML.
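For anyone who wants to reuse this, here is a minimal sketch of that workaround. The helper name `dump_live_dom` and its `path` argument are my own and purely illustrative; the only Selenium call it relies on is `execute_script`, and it assumes `driver` is an already-started WebDriver with the add-on loaded, as in the question.

```python
import pickle


def dump_live_dom(driver, path):
    """Serialize the DOM as the browser currently holds it and pickle it.

    `driver` is assumed to be a started Selenium WebDriver (or any object
    with a compatible execute_script method). `path` is the output file.
    """
    # Ask the browser itself for the current document, instead of
    # relying on driver.page_source.
    html = driver.execute_script("return document.documentElement.outerHTML")
    with open(path, "wb") as f:
        pickle.dump(html, f)
    return html
```

With the `browser` object from the question, this would be called as `dump_live_dom(browser, "cnn.p")` after `browser.get(...)` has returned. Note that `outerHTML` of the root element does not include the doctype, so the result is not byte-identical to a saved page.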

