searxng - Privacy-respecting federated metasearch engine

Age	Commit message	Author
2024-06-16	[fix] !goi irrelevant results AND display more results	Allen

2024-03-11	[mod] pylint all engines without PYLINT_SEARXNG_DISABLE_OPTION	Markus Heiser
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-02-25	[refactor] images: add resolution, image format and filesize fields	Bnyro
	Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2023-12-03	[mod] add option max_page	Markus Heiser
	Related: https://github.com/searxng/searxng/issues/2982
	Closes: https://github.com/searxng/searxng/issues/2972
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-09-21	[fix] engine - google images error when no results	jazzzooo

2023-03-24	[mod] Google: reversed engineered & upgrade to data_type: traits_v1	Markus Heiser
	Partial reverse engineering of the Google engines, including improved language and region handling based on the engine.traits_v1 data.
	Whenever possible, the implementations of the Google engines try to make use of the async REST APIs.
	get_lang_info() has been generalized to a get_google_info() function; in particular, the region handling has been improved by adding the cr parameter.
	searx/data/engine_traits.json
	  Add data type "traits_v1" generated by the fetch_traits() functions from:
	  - Google (WEB),
	  - Google images,
	  - Google news,
	  - Google scholar and
	  - Google videos
	  and remove data from the obsolete data type "supported_languages".
	  A traits.custom type that maps region codes to supported_domains is fetched from https://www.google.com/supported_domains
	searx/autocomplete.py
	  Reverse engineered autocomplete from Google WEB. Supports Google's languages and subdomains. The old API suggestqueries.google.com/complete has been replaced by the async REST API: https://{subdomain}/complete/search?{args}
	searx/engines/google.py
	  Reverse engineering and extensive testing ..
	  - fetch_traits(): Fetch languages & regions from Google properties.
	  - always use the async REST API (formerly known as 'use_mobile_ui')
	  - use supported_domains from traits
	  - improved the result list by fetching './/div[@data-content-feature]' and parsing the type of the various content features --> thumbnails are added
	searx/engines/google_images.py
	  Reverse engineering and extensive testing ..
	  - fetch_traits(): Fetch languages & regions from Google properties.
	  - use supported_domains from traits
	  - if it exists, freshness_date is added to the result
	  - issue 1864: result list has been improved a lot (due to the new cr parameter)
	searx/engines/google_news.py
	  Reverse engineering and extensive testing ..
	  - fetch_traits(): Fetch languages & regions from Google properties. supported_domains is not needed but a ceid list has been added.
	  - different region handling compared to Google WEB
	  - fixed for various languages & regions (due to the new ceid parameter) / avoid CONSENT page
	  - Google News no longer supports time range
	  - result list has been fixed: XPath of pub_date and pub_origin
	searx/engines/google_videos.py
	  - fetch_traits(): Fetch languages & regions from Google properties.
	  - use supported_domains from traits
	  - add paging support
	  - implement an async request ('asearch': 'arc' & 'async': 'use_ac:true,_fmt:html')
	  - simplified code (thanks to '_fmt:html' request)
	  - issue 1359: fixed xpath of video length data
	searx/engines/google_scholar.py
	  - fetch_traits(): Fetch languages & regions from Google properties.
	  - use supported_domains from traits
	  - request(): include patents & citations
	  - response(): fixed CAPTCHA detection (Scholar has its own CAPTCHA manager)
	  - hardening XPath to iterate over results
	  - fixed XPath of pub_type (has been changed from gs_ct1 to gs_cgt2 class)
	  - issue 1769 fixed: new request implementation is no longer incompatible
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
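The async request arguments named in the google_videos notes above ('asearch': 'arc' and 'async': 'use_ac:true,_fmt:html') could be attached to a query URL roughly as follows. This is a minimal sketch, not the engine's code: the function name `build_async_query_url`, the `/search` path and the `q` argument are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_async_query_url(subdomain: str, query: str) -> str:
    """Hypothetical helper: build a search URL carrying the async arguments."""
    args = {
        "q": query,  # assumption: the query is sent as 'q'
        # The two values below are quoted from the commit message.
        "asearch": "arc",
        "async": "use_ac:true,_fmt:html",
    }
    # urlencode percent-encodes ':' and ',' inside the values.
    return f"https://{subdomain}/search?{urlencode(args)}"


url = build_async_query_url("www.google.com", "time")
```

Passing the subdomain as an argument mirrors the commit's note that the domain is taken from supported_domains rather than hard-coded.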
2023-03-24	[mod] Google: fetch engine traits (data_type: supported_languages)	Markus Heiser
	Implements a fetch_traits function for the Google engines.
	.. note:: Does not include migration of the request method from 'supported_languages' to the 'traits' (EngineTraits) object!
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-21	[mod] google-images: slightly improvements of the engine	Markus Heiser
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-20	use the internal API for google images	Emilien Devos

2022-08-01	[mod] add 'Accept-Language' HTTP header to online processores	Markus Heiser
	Most engines that support languages (and regions) use the Accept-Language header from the web browser to build a response that fits the language (and region).
	- add new engine option: send_accept_language_header
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
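One way to picture the send_accept_language_header option is a small helper that adds the header only when the option is enabled. This is a sketch under assumptions: the function name, the locale tag format and the q-value weighting are illustrative and not taken from the SearXNG sources.

```python
def add_accept_language(headers: dict, locale_tag: str,
                        send_accept_language_header: bool) -> dict:
    """Hypothetical helper: return a copy of headers, optionally with Accept-Language."""
    result = dict(headers)
    if not send_accept_language_header or not locale_tag:
        return result
    language = locale_tag.split("-")[0]
    if "-" in locale_tag:
        # Assumption: prefer the full tag, fall back to the bare language.
        result["Accept-Language"] = f"{locale_tag},{language};q=0.7"
    else:
        result["Accept-Language"] = language
    return result


with_header = add_accept_language({}, "de-DE", True)
without_header = add_accept_language({}, "de-DE", False)
```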
2022-07-25	[fix] google & youtube - set EU consent cookie	Emilien Devos
	This changes the previous bypass method for the Google consent using ``ucbcb=1`` (6face215b8) to accept the consent using ``CONSENT=YES+``.
	The youtube_noapi and google engines have a similar API, at least for the consent [1].
	Get the CONSENT cookie from a Google request::
	    curl -i "https://www.google.com/search?q=time&tbm=isch" \
	         -A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \
	         | grep -i consent
	    ...
	    location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1
	    set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure
	    ...
	PENDING & YES [2]:
	    Google change the way for consent about YouTube cookies agreement in EU countries. Instead of showing a popup in the website, YouTube redirects the user to a new webpage at consent.youtube.com domain ... Fix for this is to put a cookie CONSENT with YES+ value for every YouTube request
	[1] https://github.com/iv-org/invidious/pull/2207
	[2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592
	Closes: https://github.com/searxng/searxng/issues/1432
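The fix described above amounts to sending a CONSENT cookie with the value YES+ on every request. A minimal sketch, assuming the request parameters are a plain dict with a "cookies" entry (the dict layout and the function name are assumptions):

```python
def set_consent_cookie(params: dict) -> dict:
    """Hypothetical helper: mark the EU consent as accepted for this request."""
    # Create the cookie dict if the request has none yet; keep existing cookies.
    cookies = params.setdefault("cookies", {})
    cookies["CONSENT"] = "YES+"
    return params


request_params = set_consent_cookie({"cookies": {"session": "abc"}})
```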
2022-07-09	bypass google consent with ucbcb=1	Emilien Devos

2022-02-19	[fix] google images engine: Fix 'scrap_img_by_id' function	Markus Heiser
	The 'scrap_img_by_id' function no longer returned anything useful. This fix allows the google images engine to present the full source image instead of only the thumbnail.
	The function scrap_img_by_id() is replaced by a full rewrite that parses image URLs with a regular expression. The new function parse_urls_img_from_js(dom) returns a mapping of data-id to image URL.
	Closes: https://github.com/searxng/searxng/issues/909
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
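The idea of mapping a data-id to an image URL with a regular expression can be illustrated on a toy input. The script layout below (`["<id>",["<url>",width,height]]`), the pattern and the function name `map_data_id_to_url` are all invented for this sketch; the real parse_urls_img_from_js() works on Google's actual markup, which is not shown in the commit message.

```python
import re

# Assumed toy layout: ["<data-id>",["<image url>",<width>,<height>]]
ID_URL_PATTERN = re.compile(r'\["(?P<id>[^"]+)",\s*\["(?P<url>https?://[^"]+)"')


def map_data_id_to_url(js_text: str) -> dict:
    """Hypothetical helper: return {data-id: image URL} for every match."""
    return {m.group("id"): m.group("url") for m in ID_URL_PATTERN.finditer(js_text)}


sample = (
    '["id1",["https://example.org/a.jpg",800,600]] '
    '["id2",["https://example.org/b.png",10,20]]'
)
img_urls = map_data_id_to_url(sample)
```

A result parser can then look up each result's data-id in the mapping and fall back to the thumbnail when no entry exists.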
2022-01-05	[enh] add more categories	Martin Fischer

2021-12-27	[format.python] initial formatting of the python code	Markus Heiser
	This patch was generated by black [1]::
	    make format.python
	[1] https://github.com/psf/black
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-21	[fix] google images: @href index 0 not found	Markus Heiser
	Sometimes there is no href in the `<a ..>` tag of a link_node [1].
	[1] https://github.com/searxng/searxng/issues/532
	Reported-by: @TheEssem
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
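The failure mode described above, an `<a>` tag without an href, can be guarded against by checking for the attribute before using it. The sketch below uses the standard library's ElementTree on a toy fragment rather than the engine's XPath helpers; the function name and the fragment are assumptions.

```python
import xml.etree.ElementTree as ET


def collect_hrefs(fragment: str) -> list:
    """Hypothetical helper: return the href of every <a> that actually has one."""
    root = ET.fromstring(fragment)
    hrefs = []
    for link_node in root.iter("a"):
        href = link_node.get("href")  # None when the attribute is missing
        if href is None:
            continue  # skip instead of failing on a missing href
        hrefs.append(href)
    return hrefs


links = collect_hrefs('<div><a href="https://example.org/1">x</a><a>y</a></div>')
```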
2021-09-07	[fix] drop useless pylint: disable=undefined-variable	Markus Heiser
	Since 7b235a1 (see line 591) it is no longer needed to disable 'undefined-variable' for names defined in::
	    PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES
	Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06	[mod] one logger per engine - drop obsolete logger.getChild	Markus Heiser
	Remove the no longer needed `logger = logger.getChild(...)` from engines. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-08-31	[pylint] Pylint 2.10 - fix use-list-literal & use-dict-literal	Markus Heiser
	Pylint 2.10 added new default checks [1]:
	use-list-literal
	    Emitted when list() is called with no arguments instead of using []
	use-dict-literal
	    Emitted when dict() is called with no arguments instead of using {}
	[1] https://pylint.pycqa.org/en/latest/whatsnew/2.10.html
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
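The two checks concern only how empty containers are spelled. A short illustration (the variable names are made up for this example):

```python
# Spellings reported by use-list-literal / use-dict-literal:
results_old = list()
params_old = dict()

# Literal spellings the checks ask for; the resulting objects are equal.
results = []
params = {}
```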
2021-07-10	Fix google images	Émilien Devos
	Proposed fix in https://github.com/searx/searx/pull/2115#issuecomment-876716010
2021-06-21	[docs] add documentation from the sources of the google engines	Markus Heiser
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-11	[fix] log messages from: google- images, news, scholar, videos	Markus Heiser
	- HTTP header Accept-Language --> lang_info['headers']['Accept-Language']
	- remove obsolete query_url log messages, which are already logged by httpx._client:HTTP request
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-10	[enh] google engine: supports "default language"	Alexandre Flament
	Same behaviour as Whoogle [1]. Only the google engine with the "Default language" choice "(all)" is changed by this patch.
	When searching for a local place, the results are in the expected language, without missing results [2]:
	> When a language is not specified, the language interpretation is left up to
	> Google to decide how the search results should be delivered.
	The query parameters are copied from Whoogle. With the ``all`` language:
	- add parameter ``source=lnt``
	- don't use parameter ``lr``
	- don't add an ``Accept-Language`` HTTP header.
	The new signature of function ``get_lang_info()`` is:
	    lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language)
	Argument ``supported_any_language`` is True for google.py and False for the other google engines.
	With this patch the function now returns:
	- query parameters: ``lang_info['params']``
	- HTTP headers: ``lang_info['headers']``
	- and as before this patch:
	  - ``lang_info['subdomain']``
	  - ``lang_info['country']``
	  - ``lang_info['language']``
	[1] https://github.com/benbusby/whoogle-search
	[2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
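The three rules for the ``all`` language can be sketched as a small function that fills the params and headers entries. This is not the real get_lang_info(): the simplified signature, the ``lang_xx`` format of ``lr`` and the ``en`` fallback are assumptions made for illustration.

```python
def sketch_lang_info(language: str, supported_any_language: bool) -> dict:
    """Hypothetical, reduced stand-in for get_lang_info()."""
    lang_info = {"params": {}, "headers": {}}
    if language == "all" and supported_any_language:
        # 'all': add source=lnt, no lr parameter, no Accept-Language header.
        lang_info["params"]["source"] = "lnt"
        return lang_info
    # Assumption: engines without 'all' support fall back to English.
    lang_code = "en" if language == "all" else language.split("-")[0]
    lang_info["params"]["lr"] = "lang_" + lang_code  # assumed lr format
    lang_info["headers"]["Accept-Language"] = lang_code
    return lang_info


all_info = sketch_lang_info("all", True)
de_info = sketch_lang_info("de-DE", True)
```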
2021-04-26	[pylint] tag PYLINT_FILES by comment `# lint: pylint`	Markus Heiser
	These py files are linted by `test.pylint`, all other files are linted by `test.pep8`. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-01	[mod] dynamically set language_support variable	Alexandre Flament
	The language_support variable is set to True by default, and set to False in only 5 engines. Apart from the documentation and the /config URL, this variable is not used.
	This commit removes the variable definition from the engines and sets the value according to the length of supported_languages: False when the length is 0, True otherwise.
	Close #2485
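The rule stated above fits in one line. A minimal sketch, with a hypothetical function name standing in for wherever the engine loader sets the attribute:

```python
def derive_language_support(supported_languages) -> bool:
    """Hypothetical helper: False when no languages are listed, True otherwise."""
    return len(supported_languages) > 0


no_languages = derive_language_support([])
some_languages = derive_language_support(["en", "de"])
```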
2021-01-28	[fix] normalize the language & region aspects of all google engines	Markus Heiser
	BTW: make the engines ready for search.checker:
	- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
	- google_images: remove outer try/except block
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22	[fix] revise of the google-news engine	Markus Heiser
	This revision is based on the methods developed in the revision of the google engine (see commit 410c2f9).
	Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-14	[enh] engines: add about variable	Alexandre Flament
	Move meta information from the comment to the about variable, so that the preferences and the documentation can show this information.
2020-12-03	[mod] various engines: use eval_xpath* functions and searx.exceptions.*	Alexandre Flament
	Engine list: ahmia, duckduckgo_images, elasticsearch, google, google_images, google_videos, youtube_api
2020-12-01	[mod] pylint: numerous minor code fixes	Alexandre Flament

2020-11-14	[mod] remove unused import	Alexandre Flament
	use
	    from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url # NOQA
	so it is possible to easily remove all unused imports using autoflake:
	    autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-10-02	[mod] move extract_text, extract_url to searx.utils	Alexandre Flament

2020-09-10	Drop Python 2 (1/n): remove unicode string and url_utils	Dalf

2020-08-08	Fix google images 'get image' button bug from issue #2103 (#2115)	Vlad
	Closes #2103
2020-07-08	[fix] pep8	Adam Tauber

2020-07-07	[fix] revise google images engine	Markus Heiser
	this commit is picked from #1985
2019-08-26	[fix] google images	Marc Abonce Seguin

2019-05-28	Use string formatter to create source and img_format labels (#1566)	Frank de Lange
	google_images : use JSON embedded in HTML (engine expected pure JSON)
2019-04-14	Fix google image search	Nick Espig
	- Because there is no full image URL in the DOM, we replace "image_url" with the same URL as the "url" (URL of the source). See example HTML https://gist.github.com/Nachtalb/2dea8a4d2c723c49226ad9645838121f
	- Remove unused import
	- Fix google image search title
	- Keep google image safe value up to date
2018-06-14	[fix] use html result page in google images (previous endpoint stopped working)	Adam Tauber

2017-06-13	[fix] fix xpath of google images	Noémi Ványi

2017-05-15	[enh] py3 compatibility	Adam Tauber

2016-12-11	add year to time range to engines which support "Last year"	Noémi Ványi
	Engines:
	* Bing images
	* Flickr (noapi)
	* Google
	* Google Images
	* Google News
2016-08-13	[fix] google images paging - closes #571	Adam Tauber

2016-07-26	[fix] time range detection	Adam Tauber

2016-07-25	fix pep8	Noemi Vanyi

2016-07-25	add time range search for google images	Noemi Vanyi

2016-04-07	[fix] broken google images parsing	Adam Tauber

2015-12-22	[fix] remove debug message	Adam Tauber

2015-12-09	[doc] correct google images docstring	Adam Tauber