path: root/searx/engines/google_images.py
Age | Commit message | Author
2022-08-01 | [mod] add 'Accept-Language' HTTP header to online processores | Markus Heiser
Most engines that support languages (and regions) use the Accept-Language header from the web browser to build a response that fits the language (and region).
- add new engine option: send_accept_language_header
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-25 | [fix] google & youtube - set EU consent cookie | Emilien Devos
This changes the previous bypass method for Google consent, which used ``ucbcb=1`` (6face215b8), to accept the consent using ``CONSENT=YES+``. The youtube_noapi and google engines have a similar API, at least for the consent [1].

Get the CONSENT cookie from a Google request::

    curl -i "https://www.google.com/search?q=time&tbm=isch" \
         -A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \
         | grep -i consent
    ...
    location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1
    set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure
    ...

PENDING & YES [2]:

    Google change the way for consent about YouTube cookies agreement in EU countries. Instead of showing a popup in the website, YouTube redirects the user to a new webpage at consent.youtube.com domain ... Fix for this is to put a cookie CONSENT with YES+ value for every YouTube request

[1] https://github.com/iv-org/invidious/pull/2207
[2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592

Closes: https://github.com/searxng/searxng/issues/1432
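The cookie-based fix described above can be sketched as follows. This is a minimal illustration and not the engine's actual code: the helper name `add_consent_cookie` and the shape of `params` (a plain dict that may carry a `cookies` dict) are assumptions made for the example.

```python
def add_consent_cookie(params):
    """Attach the consent cookie to an outgoing request description.

    `params` is assumed to be a plain dict that may already hold a
    'cookies' dict (hypothetical structure for this sketch).
    """
    cookies = params.setdefault("cookies", {})
    # Accept the consent instead of bypassing it with ``ucbcb=1``.
    cookies["CONSENT"] = "YES+"
    return params


params = add_consent_cookie({"url": "https://www.google.com/search?q=time&tbm=isch"})
print(params["cookies"])
```

Cookies that are already present in `params` are kept, so the helper can be applied to a request that has been partly prepared elsewhere.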
2022-07-09 | bypass google consent with ucbcb=1 | Emilien Devos
2022-02-19 | [fix] google images engine: Fix 'scrap_img_by_id' function | Markus Heiser
The 'scrap_img_by_id' function no longer returned anything useful. This fix allows the google images engine to present the full source image instead of only the thumbnail. The function scrap_img_by_id() is replaced by a full rewrite that parses image URLs with a regular expression. The new function parse_urls_img_from_js(dom) returns a mapping of data-id to image URL.
Closes: https://github.com/searxng/searxng/issues/909
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
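The idea of building a data-id to image URL mapping with a regular expression might look roughly like the sketch below. Everything specific here is an assumption: the pattern, the layout of the embedded JavaScript data, and the helper name `parse_img_urls` are hypothetical, and the real parse_urls_img_from_js(dom) takes a DOM rather than a string.

```python
import re

# Hypothetical layout of one entry inside the page's JavaScript data:
#   ["<data-id>",["<thumbnail url>",<h>,<w>],["<full image url>",<h>,<w>]
# The real page format is not given in the commit message.
_IMG_RE = re.compile(
    r'\["(?P<data_id>[^"]+)",'                    # quoted data-id
    r'\["[^"]*",\d+,\d+\],'                       # thumbnail triple (skipped)
    r'\["(?P<url>https?://[^"]+)",\d+,\d+\]'      # full-size image triple
)


def parse_img_urls(js_text):
    """Return a dict mapping data-id to the full-size image URL."""
    return {m.group("data_id"): m.group("url") for m in _IMG_RE.finditer(js_text)}


sample = (
    '[["abc",["https://t.example/1.jpg",90,120],["https://i.example/1.jpg",900,1200]],'
    '["def",["https://t.example/2.jpg",90,120],["https://i.example/2.jpg",600,800]]]'
)
print(parse_img_urls(sample))
```

A result can then look up its full image by the data-id of its thumbnail node and fall back to the thumbnail when the id is missing from the mapping.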
2022-01-05 | [enh] add more categories | Martin Fischer
2021-12-27 | [format.python] initial formatting of the python code | Markus Heiser
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-21 | [fix] google images: @href index 0 not found | Markus Heiser
Sometimes there is no href in the `<a ..>` tag of a *link_node* [1].
[1] https://github.com/searxng/searxng/issues/532
Reported-by: @TheEssem
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 | [fix] drop useless pylint: disable=undefined-variable | Markus Heiser
Since 7b235a1 (see line 591) it is no longer needed to disable 'undefined-variable' for names defined in::

    PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES

Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06 | [mod] one logger per engine - drop obsolete logger.getChild | Markus Heiser
Remove the no longer needed `logger = logger.getChild(...)` from engines.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-08-31 | [pylint] Pylint 2.10 - fix use-list-literal & use-dict-literal | Markus Heiser
Pylint 2.10 added new default checks [1]:

use-list-literal
  Emitted when list() is called with no arguments instead of using []

use-dict-literal
  Emitted when dict() is called with no arguments instead of using {}

[1] https://pylint.pycqa.org/en/latest/whatsnew/2.10.html

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
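As a small illustration of the two checks, the snippet below shows the call forms that Pylint flags next to the literal forms it prefers. The variable names are made up for the example and are not taken from the engine.

```python
# Flagged by use-list-literal / use-dict-literal:
results = list()
options = dict()

# Preferred literal forms; they produce the same empty objects:
results = []
options = {}
```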
2021-07-10 | Fix google images | Émilien Devos
Proposed fix in https://github.com/searx/searx/pull/2115#issuecomment-876716010
2021-06-21 | [docs] add documentation from the sources of the google engines | Markus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-11 | [fix] log messages from: google- images, news, scholar, videos | Markus Heiser
- HTTP header Accept-Language --> lang_info['headers']['Accept-Language']
- remove obsolete query_url log messages, which are already logged by httpx._client:HTTP request
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-10 | [enh] google engine: supports "default language" | Alexandre Flament
Same behaviour as Whoogle [1]. Only the google engine with the "Default language" choice "(all)" is changed by this patch. When searching for a local place, the results are in the expected language, without missing results [2]:

> When a language is not specified, the language interpretation is left up to
> Google to decide how the search results should be delivered.

The query parameters are copied from Whoogle. With the ``all`` language:

- add parameter ``source=lnt``
- don't use parameter ``lr``
- don't add an ``Accept-Language`` HTTP header.

The new signature of function ``get_lang_info()`` is:

    lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language)

Argument ``supported_any_language`` is True for google.py and False for the other google engines. With this patch the function now returns:

- query parameters: ``lang_info['params']``
- HTTP headers: ``lang_info['headers']``
- and as before this patch:
  - ``lang_info['subdomain']``
  - ``lang_info['country']``
  - ``lang_info['language']``

[1] https://github.com/benbusby/whoogle-search
[2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
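The ``all`` language rules listed above can be sketched as a simplified stand-in for ``get_lang_info()``. This is not the real function: the name `build_lang_info`, its reduced signature, the `lang_xx` value format for ``lr``, and the `en-US` fallback are assumptions made for illustration only.

```python
def build_lang_info(language, supported_any_language):
    """Return query parameters and HTTP headers for a language choice.

    Simplified, hypothetical stand-in for get_lang_info().
    """
    lang_info = {"params": {}, "headers": {}}

    if language == "all" and supported_any_language:
        # Leave the language interpretation to Google:
        # no ``lr`` parameter and no Accept-Language header.
        lang_info["params"]["source"] = "lnt"
        return lang_info

    if language == "all":
        # Assumed fallback for engines that do not support "any language".
        language = "en-US"

    lang_code = language.split("-")[0]
    lang_info["params"]["lr"] = "lang_" + lang_code  # assumed value format
    lang_info["headers"]["Accept-Language"] = language
    return lang_info


print(build_lang_info("all", supported_any_language=True))
print(build_lang_info("de-DE", supported_any_language=False))
```

Keeping query parameters and headers in separate keys mirrors the ``lang_info['params']`` and ``lang_info['headers']`` split named in the commit message, so a caller can merge each part into the request independently.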
2021-04-26 | [pylint] tag PYLINT_FILES by comment `# lint: pylint` | Markus Heiser
These py files are linted by `test.pylint`, all other files are linted by `test.pep8`. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-01 | [mod] dynamically set language_support variable | Alexandre Flament
The language_support variable is set to True by default, and set to False in only 5 engines. Except for the documentation and the /config URL, this variable is not used. This commit removes the variable definition from the engines and sets the value according to the length of supported_languages: False when the length is 0, True otherwise.
Close #2485
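The rule described above amounts to a one-line derivation. The sketch below assumes a hypothetical helper `has_language_support` and uses `SimpleNamespace` objects in place of real engine modules; where the real code performs this assignment is not shown in the commit message.

```python
from types import SimpleNamespace


def has_language_support(supported_languages):
    """False when no languages are listed, True otherwise."""
    return len(supported_languages) > 0


# Stand-ins for engine modules (hypothetical data for the example).
engines = [
    SimpleNamespace(name="with_langs", supported_languages=["en", "de"]),
    SimpleNamespace(name="without_langs", supported_languages=[]),
]

for engine in engines:
    engine.language_support = has_language_support(engine.supported_languages)
    print(engine.name, engine.language_support)
```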
2021-01-28 | [fix] normalize the language & region aspects of all google engines | Markus Heiser
BTW: make the engines ready for search.checker:
- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22 | [fix] revise of the google-news engine | Markus Heiser
This revision is based on the methods developed in the revision of the google engine (see commit 410c2f9).
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-14 | [enh] engines: add about variable | Alexandre Flament
Move the meta information from a comment to the about variable, so that the preferences and the documentation can show this information.
2020-12-03 | [mod] various engines: use eval_xpath* functions and searx.exceptions.* | Alexandre Flament
Engine list: ahmia, duckduckgo_images, elasticsearch, google, google_images, google_videos, youtube_api
2020-12-01 | [mod] pylint: numerous minor code fixes | Alexandre Flament
2020-11-14 | [mod] remove unused import | Alexandre Flament
use

    from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url  # NOQA

so it is possible to easily remove all unused imports using autoflake:

    autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-10-02 | [mod] move extract_text, extract_url to searx.utils | Alexandre Flament
2020-09-10 | Drop Python 2 (1/n): remove unicode string and url_utils | Dalf
2020-08-08 | Fix google images 'get image' button bug from issue #2103 (#2115) | Vlad
Closes #2103
2020-07-08 | [fix] pep8 | Adam Tauber
2020-07-07 | [fix] revise google images engine | Markus Heiser
this commit is picked from #1985
2019-08-26 | [fix] google images | Marc Abonce Seguin
2019-05-28 | Use string formatter to create source and img_format labels (#1566) | Frank de Lange
google_images : use JSON embedded in HTML (engine expected pure JSON)
2019-04-14 | Fix google image search | Nick Espig
- Because there is no full image URL in the DOM, we replace "image_url" with the same URL as the "url" (URL of the source). See example HTML https://gist.github.com/Nachtalb/2dea8a4d2c723c49226ad9645838121f
- Remove unused import
- Fix google image search title
- Keep google image safe value up to date
2018-06-14 | [fix] use html result page in google images (previous endpoint stopped working) | Adam Tauber
2017-06-13 | [fix] fix xpath of google images | Noémi Ványi
2017-05-15 | [enh] py3 compatibility | Adam Tauber
2016-12-11 | add year to time range to engines which support "Last year" | Noémi Ványi
Engines:
* Bing images
* Flickr (noapi)
* Google
* Google Images
* Google News
2016-08-13 | [fix] google images paging - closes #571 | Adam Tauber
2016-07-26 | [fix] time range detection | Adam Tauber
2016-07-25 | fix pep8 | Noemi Vanyi
2016-07-25 | add time range search for google images | Noemi Vanyi
2016-04-07 | [fix] broken google images parsing | Adam Tauber
2015-12-22 | [fix] remove debug message | Adam Tauber
2015-12-09 | [doc] correct google images docstring | Adam Tauber
2015-12-09 | [fix] replace the dead google images ajax api with a working one | Adam Tauber
2015-05-02 | Merge pull request #308 from dalf/versions_upgrade | Adam Tauber
update versions.cfg to use the current up-to-date packages
2015-05-02 | update versions.cfg to use the current up-to-date packages | Alexandre Flament
2015-05-02 | [enh] reduce the number of http outgoing connections. | Alexandre Flament
engines that still use http : gigablast, bing image for thumbnails, 1x and dbpedia autocompleter
2015-02-08 | [enh] set google safesearch filter more restictive | Thomas Pointhuber
2015-02-08 | [enh] add safesearch to google_images | Thomas Pointhuber
2015-01-31 | Google images' unit test | Cqoicebordel
2015-01-17 | Tiny forgots | Cqoicebordel
2015-01-17 | Add thumbnails in images results | Cqoicebordel
- Modify engines to create/fetch a URL for the thumbnails
- Modify themes to show thumbnails instead of full images. In Courgette, the result is not very beautiful. Should we change it?