cairo-dock-team team mailing list archive

[Merge] lp:~eduardo-mucelli/cairo-dock-plug-ins-extras/WebSearch into lp:cairo-dock-plug-ins-extras

To: mp+169532@xxxxxxxxxxxxxxxxxx
From: Eduardo Mucelli Rezende Oliveira <edumucelli@xxxxxxxxx>
Date: Fri, 14 Jun 2013 20:23:24 -0000
Reply-to: mp+169532@xxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

Eduardo Mucelli Rezende Oliveira has proposed merging lp:~eduardo-mucelli/cairo-dock-plug-ins-extras/WebSearch into lp:cairo-dock-plug-ins-extras.

Requested reviews:
  Cairo-Dock Third-Party (cairo-dock-third-party)

For more details, see:
https://code.launchpad.net/~eduardo-mucelli/cairo-dock-plug-ins-extras/WebSearch/+merge/169532

Fixes Google and Bing search. Removes Webshots and Twitter: Webshots closed its doors :( and Twitter changed its search to something that is really hard to fetch.
-- 
https://code.launchpad.net/~eduardo-mucelli/cairo-dock-plug-ins-extras/WebSearch/+merge/169532
Your team Cairo-Dock Third-Party is requested to review the proposed merge of lp:~eduardo-mucelli/cairo-dock-plug-ins-extras/WebSearch into lp:cairo-dock-plug-ins-extras.

=== modified file 'WebSearch/Changelog.txt'
--- WebSearch/Changelog.txt	2012-02-01 01:57:11 +0000
+++ WebSearch/Changelog.txt	2013-06-14 20:22:27 +0000
@@ -1,3 +1,4 @@
+1.4.6: (June/14/2013): Fixing Google and Bing search. Removed Webshots and Twitter. Webshots closed :( its doors and Twitter changed its search to something really hard to fetch.
 1.4.3: (January/31/2012): Fixing Youtube search. Fixing a problem when opening URLs.
 1.4.2: (November/17/2011): Simplifying the dependencies. Fixing Wikipedia search.
 1.4.0: (July/13/2010): WebSearch now keeps a history of recently searched terms.

=== modified file 'WebSearch/WebSearch.conf'
--- WebSearch/WebSearch.conf	2012-11-05 14:51:51 +0000
+++ WebSearch/WebSearch.conf	2013-06-14 20:22:27 +0000
@@ -1,4 +1,4 @@
-#1.4.5
+#1.4.6
 
 #[gtk-about]
 [Icon]
@@ -97,7 +97,7 @@
 #[gtk-preferences]
 [Configuration]
 
-#l[Google;Bing;Yahoo!;Teoma;Youtube;Webshots;Flickr;Wikipedia;ImageShack;Twitter;Digg] Search engine :
+#l[Google;Bing;Yahoo!;Teoma;Youtube;Flickr;Wikipedia;ImageShack;Digg] Search engine :
 engine = 0
 #i[5;10] Maximum number of results shown :
 #{in sub-icons.}
@@ -106,6 +106,6 @@
 show current page = true
 #b Show the description of the result instead of its URL in the sub-icons ?
 show description instead url = true
-#b Enable thumbnail preview for Youtube, Webshots, Flickr, ImageShack, Twitter, and Digg searches ?
+#b Enable thumbnail preview for Youtube, Flickr, ImageShack, and Digg searches ?
 #{for slow connections, disable will result in significantly faster fetching.}
 show thumbnail preview = true

=== modified file 'WebSearch/auto-load.conf'
--- WebSearch/auto-load.conf	2012-11-05 14:51:51 +0000
+++ WebSearch/auto-load.conf	2013-06-14 20:22:27 +0000
@@ -4,10 +4,10 @@
 author = Eduardo Mucelli Rezende Oliveira
 
 # A short description of the applet and how to use it.
-description = This applet provides an interface to some search engines such as\nGoogle, Bing, Teoma, Yahoo!, Youtube, Webshots, Flickr, Wikipedia,  ImageShack, and Twitter.\nTo choose the search engine you can\n    (1) Right-click on the main icon -> WebSearch -> (Choose the engine)\n    (2) Right-click -> Configure this applet -> Configuration -> Search engine\n    (3) Scroll up or down over the icon (applicable only for the first search)\nYou can search in three ways\n    (1) Middle-click on the main icon\n    (2) Left-click on main icon (right after choosing a new engine)\nType your query and validate. Each result will be shown as a sub-icon.\nLeft-click to open the the result in the default Web Browser\nMiddle-click on the sub-icon of any result to show its description\nScroll up to fetch the next results\nScroll down to fetch the previous results\nLeft-click on the main icon to show search stats
+description = This applet provides an interface to some search engines such as\nGoogle, Bing, Teoma, Yahoo!, Youtube, Flickr, Wikipedia, and ImageShack.\nTo choose the search engine you can\n    (1) Right-click on the main icon -> WebSearch -> (Choose the engine)\n    (2) Right-click -> Configure this applet -> Configuration -> Search engine\n    (3) Scroll up or down over the icon (applicable only for the first search)\nYou can search in three ways\n    (1) Middle-click on the main icon\n    (2) Left-click on main icon (right after choosing a new engine)\nType your query and validate. Each result will be shown as a sub-icon.\nLeft-click to open the the result in the default Web Browser\nMiddle-click on the sub-icon of any result to show its description\nScroll up to fetch the next results\nScroll down to fetch the previous results\nLeft-click on the main icon to show search stats
 
 # Category of the applet : 2 = files, 3 = internet, 4 = Desktop, 5 = accessory, 6 = system, 7 = fun
 category = 3
 
 # Version of the applet; change it everytime you change something in the config file. Don't forget to update the version both in this file and in the config file.
-version = 1.4.5
+version = 1.4.6

=== modified file 'WebSearch/lib/Bing.rb'
--- WebSearch/lib/Bing.rb	2010-07-13 16:02:17 +0000
+++ WebSearch/lib/Bing.rb	2013-06-14 20:22:27 +0000
@@ -16,12 +16,12 @@
 	# Fetch links from Bing. Since Bing does not provide an in-url way to fetch more links than the 10
 	# as Google does (&num=amount_to_fetch), this method will be called every time that 10 new results need to be shown
 	def retrieve_links(query, offset = 1)
-		bing = Nokogiri::HTML(open(URI.encode("#{self.query_url}#{query}&first=#{offset}")))
+		bing = Nokogiri::HTML(open(URI.encode("#{self.query_url}#{query}&first=#{offset}"), "User-Agent" => self.user_agent))
 		self.stats = retrieve_bing_result_stats(bing, query)
-		(bing/"h3").search("a[@onmousedown]").each do |raw_link|
-			url = raw_link['href']
-			description = raw_link.inner_text
-			self.links << Link.new(url, description)
+		(bing/"div[@class=sb_tlst]/h3/a").each do |raw_link|
+		  url = raw_link['href']
+		  description = raw_link.inner_text
+		  self.links << Link.new(url, description)
 		end
 		self.links
 	end

=== modified file 'WebSearch/lib/Engine.rb'
--- WebSearch/lib/Engine.rb	2010-07-13 16:02:17 +0000
+++ WebSearch/lib/Engine.rb	2013-06-14 20:22:27 +0000
@@ -11,11 +11,12 @@
 	require './lib/Link.rb'
 	require './lib/Exceptions.rb'
 
-	attr_accessor :name, :stats, :links, :base_url, :query_url
+	attr_accessor :name, :stats, :links, :base_url, :query_url, :user_agent
 
 	def initialize
 		self.links =[]
 		self.stats = ""
+		self.user_agent = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11"
 	end
 
 	def connect

=== modified file 'WebSearch/lib/Engines.rb'
--- WebSearch/lib/Engines.rb	2010-07-13 16:02:17 +0000
+++ WebSearch/lib/Engines.rb	2013-06-14 20:22:27 +0000
@@ -13,18 +13,16 @@
 	TEOMA = "Teoma"
 	WIKIPEDIA = "Wikipedia"
 	YOUTUBE = "Youtube"
-	WEBSHOTS = "Webshots"
 	FLICKR = "Flickr"
 	IMAGESHACK = "ImageShack"
-	TWITTER = "Twitter"
 	DIGG = "Digg"
 	
 	# All the engines. Help to create the list of strings controlled by mouse scroll to be shown in the icon
-	@List = [GOOGLE, BING, YAHOO, TEOMA, WIKIPEDIA, YOUTUBE, WEBSHOTS, FLICKR, IMAGESHACK, TWITTER, DIGG]
+	@List = [GOOGLE, BING, YAHOO, TEOMA, WIKIPEDIA, YOUTUBE, FLICKR, IMAGESHACK, DIGG]
     
     # some engines use the concept of offset which is the first index of an interval of links/images to be shown
 	# but there is those that use a sequential page (1,2,3, ...) which has an amount of links/images, etc
-    @PaginatedByPage = [TEOMA, YOUTUBE, FLICKR, IMAGESHACK, TWITTER, DIGG]
+    @PaginatedByPage = [TEOMA, YOUTUBE, FLICKR, IMAGESHACK, DIGG]
 #	PaginatedByOffset = [GOOGLE, BING, YAHOO, WEBSHOTS, WIKIPEDIA]
 
     def self.list

=== modified file 'WebSearch/lib/Google.rb'
--- WebSearch/lib/Google.rb	2010-07-13 16:02:17 +0000
+++ WebSearch/lib/Google.rb	2013-06-14 20:22:27 +0000
@@ -19,7 +19,7 @@
 	# Fetch a user-defined number links from Google with just one query. The parameter offset is the index of the first link.
     # It is better to fetch a higher amount of links in order to minimize the number of queries to be sent to google
 	def retrieve_links (query, offset)
-		google = Nokogiri::HTML(open(URI.encode("#{self.query_url}#{query}&start=#{offset}&num=#{self.number_of_fetched_links}")))
+		google = Nokogiri::HTML(open(URI.encode("#{self.query_url}#{query}&start=#{offset}&num=#{self.number_of_fetched_links}"), "User-Agent" => self.user_agent))
 		self.stats = retrieve_result_stats(google, query)
 		(google/"h3[@class='r']").search("a[@href]").each do |raw_link|
 			url = raw_link['href']

=== removed file 'WebSearch/lib/Twitter.rb'
--- WebSearch/lib/Twitter.rb	2010-07-13 16:02:17 +0000
+++ WebSearch/lib/Twitter.rb	1970-01-01 00:00:00 +0000
@@ -1,31 +0,0 @@
-# This is a part of the external WebSearch applet for Cairo-Dock
-# Author: Eduardo Mucelli Rezende Oliveira
-# E-mail: edumucelli@xxxxxxxxx or eduardom@xxxxxxxxxxx
-#
-# This module fetch results from Twitter, including thumbnails - www.twitter.com
-
-class Twitter < Engine
-
-	def initialize
-		self.name = self.class.to_s
-		self.query_url = "http://search.twitter.com/search?q=";											# 15 results per page
-		super
-	end
-
-	# url, e.g., http://twitter.com/runscored_cin/statuses/14382443834
-	# thumb_url, e.g., http://a1.twimg.com/profile_images/768793556/n12430139_39051956_3639_normal.jpg ; twitter does not change the original pic name
-	# description, e.g, Dusty, I hope you literally kill Miguel Cairo with your words after today. #RedsFAIL
-	def retrieve_links(query, page = 1)
-		twitter = Nokogiri::HTML.parse(open(URI.encode("#{self.query_url}#{query}&page=#{page}")))
-		(twitter/"div[@id='results']/ul").each do |res|
-			(res/"li").each do |raw_result|
-				thumb_url = raw_result.at("div[@class='avatar']/a/img")['src']							# the thumb of the avatar which tweeted
-				description = raw_result.at("div[@class='msg']/span[@class^='msgtxt']").inner_text		# the tweet text
-				url = raw_result.at("div[@class='info']/a[@class='lit']")['href']						# url of the tweet
-				self.links << ThumbnailedLink.new(url, description, thumb_url, self.name)
-			end
-		end
-		self.links
-	end
-
-end

=== removed file 'WebSearch/lib/Webshots.rb'
--- WebSearch/lib/Webshots.rb	2010-07-13 16:02:17 +0000
+++ WebSearch/lib/Webshots.rb	1970-01-01 00:00:00 +0000
@@ -1,34 +0,0 @@
-# This is a part of the external WebSearch applet for Cairo-Dock
-# Author: Eduardo Mucelli Rezende Oliveira
-# E-mail: edumucelli@xxxxxxxxx or eduardom@xxxxxxxxxxx
-#
-# This module fetch results from Webshots, including thumbnails - www.webshots.com
-
-class Webshots < Engine
-	
-	def initialize
-		self.name = self.class.to_s
-		self.base_url = "http://www.webshots.com";
-		self.query_url = "#{self.base_url}/search?querySource=community&query="								# 36 results per page
-		super
-	end
-
-	# url, e.g, http://good-times.webshots.com/photo/2500137270102572130
-	# thumb_url, e.g, http://thumb10.webshots.net/t/24/665/1/37/27/2500137270102572130SmNoHt_th.jpg";
-	def retrieve_links(query, offset = 0)
-		webshots = Nokogiri::HTML(open(URI.encode("#{self.query_url}#{query}&start=#{offset}")))
-		self.stats = retrieve_webshots_result_stats(webshots, query)
-		(webshots/"a[@class='searchListItemLink']").each do |res|
-			url = res['href']
-			description = res['title']
-			thumb_url = res.at("img[@class='searchListItemImg']")['src']
-			self.links << ThumbnailedLink.new(url, description, thumb_url, self.name)
-		end
-		self.links
-	end
-
-	def retrieve_webshots_result_stats(webshots, query)
-		total = webshots.at("span[@class='resultsNo']/strong").inner_text
-		"Search for #{query} returned #{total} results"
-	end
-end
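
The Google and Bing fixes in the patch above both come down to sending a browser-like User-Agent header with each search request. The sketch below shows that idea using only the Ruby standard library, and it builds the request without sending it. `USER_AGENT` and `build_search_request` are hypothetical names that are not part of the applet, and the Bing query URL is an assumption, since `Bing.rb` does not show its `query_url` in this diff. The sketch also uses `URI.encode_www_form_component` where the patch uses `URI.encode`, because `URI.encode` was removed in Ruby 3.0.

```ruby
require "net/http"
require "uri"

# Hypothetical constant. The value is the string this patch assigns to
# Engine#user_agent.
USER_AGENT = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.11 " \
             "(KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11"

# Hypothetical helper: builds, but does not send, a GET request for a query.
# query_url is expected to end with the query parameter, e.g. "...?q=".
def build_search_request(query_url, query, params = {})
  # Escape the user-supplied query so characters such as "&" stay inside it.
  uri = URI.parse("#{query_url}#{URI.encode_www_form_component(query)}")
  # Append paging parameters such as first=11 (Bing) or start/num (Google).
  unless params.empty?
    uri.query = [uri.query, URI.encode_www_form(params)].compact.join("&")
  end
  # Attach the browser-like User-Agent header to the request.
  Net::HTTP::Get.new(uri, "User-Agent" => USER_AGENT)
end

# Assumed Bing query URL, for illustration only.
request = build_search_request("http://www.bing.com/search?q=", "cairo dock", first: 11)
puts request.uri
puts request["User-Agent"]
```

To actually fetch the page, the request could be handed to `Net::HTTP.start(request.uri.hostname, request.uri.port) { |http| http.request(request) }` and the response body passed to `Nokogiri::HTML`, as the engines in this applet do with the result of `open`.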

Follow ups

Re: [Merge] lp:~eduardo-mucelli/cairo-dock-plug-ins-extras/WebSearch into lp:cairo-dock-plug-ins-extras
From: Matthieu Baerts, 2013-06-16
[Merge] lp:~eduardo-mucelli/cairo-dock-plug-ins-extras/WebSearch into lp:cairo-dock-plug-ins-extras
From: noreply, 2013-06-15
Re: [Merge] lp:~eduardo-mucelli/cairo-dock-plug-ins-extras/WebSearch into lp:cairo-dock-plug-ins-extras
From: Matthieu Baerts, 2013-06-14