summaryrefslogtreecommitdiff
path: root/searx/engines/google_news.py
AgeCommit message (Collapse)Author
2021-12-27[format.python] initial formatting of the python codeMarkus Heiser
This patch was generated by black [1]:: make format.python [1] https://github.com/psf/black Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07[pylint] engines: drop no longer needed 'missing-function-docstring'Markus Heiser
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07[fix] drop useless pylint: disable=undefined-variableMarkus Heiser
Since 7b235a1 (see line 591) it is no longer needed to disable 'undefined-variable' for names defined in:: PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06[mod] one logger per engine - drop obsolete logger.getChildMarkus Heiser
Remove the no longer needed `logger = logger.getChild(...)` from engines. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21[docs] add documentation from the sources of the google enginesMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-18[fix] google news - send CONSENT Cookie to not be redirectedMarkus Heiser
In the EU there exists a "General Data Protection Regulation" [1] aka GDPR (BTW: very user friendly!) which requires consent to tracking. To get the consent from the user, google-news requests are redirected to confirm and get a CONSENT Cookie from https://consent.google.de/s?continue=... This patch adds a CONSENT Cookie to the google-news request to avoid redirection. The behavior of the CONTENTS cookies over all google engines seems similar but the pattern is not yet fully clear to me, here are some random samples from my analysis .. Using common google search from different domains:: google.com: CONSENT=YES+cb.{{date}}-14-p0.de+FX+816 google.de: CONSENT=YES+cb.{{date}}-14-p0.de+FX+333 google.fr: CONSENT=YES+srp.gws-{{date}}-0-RC2.fr+FX+826 When searching about videos (google-videos):: google.es: CONSENT=YES+srp.gws-{{date}}-0-RC2.es+FX+076 google.de: CONSENT=YES+srp.gws-{{date}}-0-RC2.de+FX+171 Google news has only one domain for all languages:: news.google.com: CONSENT=YES+cb.{{date}}-14-p0.de+FX+816 Using google-scholar search from different domains:: scholar.google.de: CONSENT=YES+cb.{{date}}-14-p0.de+FX+333 scholar.google.fr: does not use such a cookie / did not ask the user scholar.google.es: does not use such a cookie / did not ask the user Interim summary: Pattern is unclear and I won't apply the CONSENT cookie to all google engines. More experience is need before we generalize the CONSENT cookies over all google engines. Related: - e9a6ab401 [fix] youtube - send CONSENT Cookie to not be redirected - https://github.com/benbusby/whoogle-search/issues/311 - https://github.com/benbusby/whoogle-search/issues/243 [1] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-18[fix] google-news engine - KeyError: 'hl in requestMarkus Heiser
Since we added - 1c67b6aec [enh] google engine: supports "default language" there is a KeyError: 'hl in request,error pattern:: ERROR:searx.searx.search.processor.online:engine google news : exception : 'hl' Traceback (most recent call last): File "searx/search/processors/online.py", line 144, in search search_results = self._search_basic(query, params) File "searx/search/processors/online.py", line 118, in _search_basic self.engine.request(query, params) File "searx/engines/google_news.py", line 97, in request if lang_info['hl'] == 'en': KeyError: 'hl' Closes: https://github.com/searxng/searxng/issues/154 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-11[fix] log messages from: google- images, news, scholar, videosMarkus Heiser
- HTTP header Accept-Language --> lang_info['headers']['Accept-Language'] - remove obsolete query_url log messages which is already logged by httpx._client:HTTP request Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-10[enh] google engine: supports "default language"Alexandre Flament
Same behaviour behaviour than Whoogle [1]. Only the google engine with the "Default language" choice "(all)"" is changed by this patch. When searching for a locate place, the result are in the expect language, without missing results [2]: > When a language is not specified, the language interpretation is left up to > Google to decide how the search results should be delivered. The query parameters are copied from Whoogle. With the ``all`` language: - add parameter ``source=lnt`` - don't use parameter ``lr`` - don't add a ``Accept-Language`` HTTP header. The new signature of function ``get_lang_info()`` is: lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language) Argument ``supported_any_language`` is True for google.py and False for the other google engines. With this patch the function now returns: - query parameters: ``lang_info['params']`` - HTTP headers: ``lang_info['headers']`` - and as before this patch: - ``lang_info['subdomain']`` - ``lang_info['country']`` - ``lang_info['language']`` [1] https://github.com/benbusby/whoogle-search [2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
2021-04-26[pylint] tag PYLINT_FILES by comment `# lint: pylint`Markus Heiser
These py files are linted by `test.pylint`, all other files are linted by `test.pep8`. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-01[mod] dynamically set language_support variableAlexandre Flament
The language_support variable is set to True by default, and set to False in only 5 engines. Except the documentation and the /config URL, this variable is not used. This commit remove the variable definition in the engines, and set value according to supported_languages length: False when the length is 0, True otherwise. Close #2485
2021-01-28[fix] normalize the language & region aspects of all google enginesMarkus Heiser
BTW: make the engines ready for search.checker: - replace eval_xpath by eval_xpath_getindex and eval_xpath_list - google_images: remove outer try/except block Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24[fix] google_news: avoid one HTTP redirect except for the English resultsAlexandre Flament
also add params['soft_max_redirects'] = 1 to avoid false error reporting in /stats/errors
2021-01-23[fix] google-news: query uses locale without country tagMarkus Heiser
Wthout country-region tag google will redirect to correct the contry tag [1]: SEARX_DEBUG=1 searx-checker -v "google news" ... https://news.google.com:443 "GET /search?q=computer&hl=en... HTTP/1.1" 302 0 https://news.google.com:443 "GET /search?q=computer&hl=en-US&.... HTTP/1.1" 200 None ... [1] https://github.com/searx/searx/pull/2483#issuecomment-765600849 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22[fix] revise of the google-news engineMarkus Heiser
This revise is based on the methods developed in the revise of the google engine (see commit 410c2f9). Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-14[enh] engines: add about variableAlexandre Flament
move meta information from comment to the about variable so the preferences, the documentation can show these information
2020-12-01[mod] pylint: numerous minor code fixesAlexandre Flament
2020-11-14[mod] remove unused importAlexandre Flament
use from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url # NOQA so it is possible to easily remove all unused import using autoflake: autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-09-10Drop Python 2 (1/n): remove unicode string and url_utilsDalf
2020-02-25bugfix: google-news and bing-news has changed the language parameterMarkus Heiser
closes: https://github.com/asciimoo/searx/issues/1838 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2019-01-07Revert "remove 'all' option from search languages"Noémi Ványi
This reverts commit 4d1770398a6af8902e75c0bd885781584d39e796.
2019-01-06[fix] check language aliases when setting search languageMarc Abonce Seguin
2018-04-08[fix] google news xpathMarc Abonce Seguin
2018-03-27refactor engine's search language handlingMarc Abonce Seguin
Add match_language function in utils to match any user given language code with a list of engine's supported languages. Also add language_aliases dict on each engine to translate standard language codes into the custom codes used by the engine.
2017-12-06remove 'all' option from search languagesmarc
2017-08-31[fix] google news dom xpath fixmisnyo
2017-05-15[enh] py3 compatibilityAdam Tauber
2017-01-10[fix] skip non-complete google news resultsAdam Tauber
2016-12-28Merge branch 'master' into languagesAdam Tauber
2016-12-23[fix] handle missing images in google newsAdam Tauber
2016-12-15tests for _fetch_supported_languages in enginesmarc
and refactor method to make it testable without making requests
2016-12-13[mod] fetch supported languages for several enginesmarc
utils/fetch_languages.py gets languages supported by each engine and generates engines_languages.json with each engine's supported language.
2016-12-13[enh] add supported_languages on engines and auto-generate languages.pymarc
2016-12-11add year to time range to engines which support "Last year"Noémi Ványi
Engines: * Bing images * Flickr (noapi) * Google * Google Images * Google News
2016-12-11[fix] google news engine change follow-upAdam Tauber
2015-05-02update versions.cfg to use the current up-to-date packagesAlexandre Flament
2015-01-31Google news' unit testCqoicebordel
2014-12-07[fix] pep8 : engines (errors E121, E127, E128 and E501 still exist)dalf
2014-09-01add comments to google-enginesThomas Pointhuber
2014-03-18[fix] remove unused imports ++ pep8Adam Tauber
2014-03-18simplify datetime extractionThomas Pointhuber
2014-03-15[fix] pep8Adam Tauber
2014-03-14showing publishedDate for newsThomas Pointhuber
2014-03-04[fix] pep8Adam Tauber
2014-03-04Create google_news.pyThomas Pointhuber