<?xml version="1.0" encoding="utf-8" ?><rss version="2.0" xml:base="http://tech.saigonist.com/taxonomy/term/21/all" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>webcrawler</title>
    <link>http://tech.saigonist.com/taxonomy/term/21/all</link>
    <description></description>
    <language>en</language>
          <item>
    <title>How to log in to any website using Curl from the command line or shell script</title>
    <link>http://tech.saigonist.com/b/code/how-login-any-website-using-curl-command-line-or-shell-script</link>
    <description>&lt;span class=&quot;submitted-by&quot;&gt;January 8, 2016&lt;/span&gt;&lt;div class=&quot;field_tags&quot;&gt;&lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/curl&quot;&gt;curl&lt;/a&gt;&lt;/span&gt; &lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/webcrawler&quot;&gt;webcrawler&lt;/a&gt;&lt;/span&gt; &lt;/div&gt;&lt;div class=&quot;field field-name-body field-type-text-with-summary field-label-hidden&quot;&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;p&gt;Sometimes you need to scrape a field from a page that requires authentication (logging in). If the site uses Basic Auth, you can put the username and password in the URL, like &lt;code&gt;http://username:1234paSSwoRd@target.site/&lt;/code&gt;; otherwise you&#039;ll need to use curl with more sophistication. Besides curl, other command-line web tools include links and elinks (elinks is an enhanced version of links that also supports JavaScript to a very limited extent). Links and curl will not execute JavaScript though, so if that&#039;s necessary to get...&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</description>
     <pubDate>Fri, 08 Jan 2016 10:04:11 +0000</pubDate>
 <dc:creator>tomo</dc:creator>
 <guid isPermaLink="false">27 at http://tech.saigonist.com</guid>
  </item>
  <item>
    <title>Selenium IDE vs Selenium Webdriver vs CasperJS</title>
    <link>http://tech.saigonist.com/b/code/selenium-ide-vs-selenium-webdriver-vs-casperjs</link>
    <description>&lt;span class=&quot;submitted-by&quot;&gt;November 24, 2015&lt;/span&gt;&lt;div class=&quot;field_tags&quot;&gt;&lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/webcrawler&quot;&gt;webcrawler&lt;/a&gt;&lt;/span&gt; &lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/scrape&quot;&gt;scrape&lt;/a&gt;&lt;/span&gt; &lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/selenium&quot;&gt;selenium&lt;/a&gt;&lt;/span&gt; &lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/casperjs&quot;&gt;casperjs&lt;/a&gt;&lt;/span&gt; &lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/phantomjs&quot;&gt;phantomjs&lt;/a&gt;&lt;/span&gt; &lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/javascript&quot;&gt;javascript&lt;/a&gt;&lt;/span&gt; &lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/chrome&quot;&gt;chrome&lt;/a&gt;&lt;/span&gt; &lt;span class=&quot;label label-info&quot;&gt;&lt;a href=&quot;/tags/firefox&quot;&gt;firefox&lt;/a&gt;&lt;/span&gt; &lt;/div&gt;&lt;div class=&quot;field field-name-body field-type-text-with-summary field-label-hidden&quot;&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;p&gt;Or more specifically: Selenium IDE (Firefox plugin) vs Selenium Webdriver (Python and other languages) vs CasperJS (and PhantomJS or SlimerJS)&lt;/p&gt;&lt;p&gt;Selenium allows you, a programmer or non-programmer, to control a web browser and make it do things that you would otherwise do manually. 
With that ability, you can test your website over and over (and automatically from cron), simulate users, or visit any number of web pages, read data from them (web scraping), and save it to a file for processing.&lt;/p&gt;&lt;p&gt;If you go to the &lt;a href=&quot;http://www.seleniumhq.org/download/&quot;&gt;Selenium website&lt;/a&gt; you will...&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</description>
     <pubDate>Tue, 24 Nov 2015 03:50:23 +0000</pubDate>
 <dc:creator>tomo</dc:creator>
 <guid isPermaLink="false">14 at http://tech.saigonist.com</guid>
  </item>
  </channel>
</rss>
