path: root/searx/engines/wikipedia.py
Age | Commit message | Author
2024-03-11[mod] pylint all engines without PYLINT_SEARXNG_DISABLE_OPTIONMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-09-19wikipedia wikidata infobox + disable wikisource (#2806)Émilien (perso)
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-25[fix] engine & network issues / documentation and type annotationsMarkus Heiser
This patch fixes some quirks and issues related to the engines and the network. Each engine has its own network, and this network was broken for the following engines [1]:

- archlinux
- bing
- dailymotion
- duckduckgo
- google
- peertube
- startpage
- wikipedia

Since the files have been touched anyway, the type annotations of the engine modules have also been completed so that the type checker no longer reports error messages.

Related and (partially) fixed issues:

- [1] https://github.com/searxng/searxng/issues/762#issuecomment-1605323861
- [2] https://github.com/searxng/searxng/issues/2513
- [3] https://github.com/searxng/searxng/issues/2515

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15[fix] searxng_extra/update/update_engine_descriptions.py (part 1)Markus Heiser
Follow-up of #2269

The script to update the descriptions of the engines no longer works since PR #2269 has been merged.

searx/engines/wikipedia.py
==========================

1. There was a misuse of zh-classical.wikipedia.org:

   - `zh-classical` is dedicated to classical Chinese [1], which is not traditional Chinese [2].
   - zh.wikipedia.org has LanguageConverter enabled [3] and dynamically shows simplified or traditional Chinese according to the HTTP Accept-Language header.

2. The update_engine_descriptions.py script needs a list of all Wikipedias. The implementation from #2269 included only a reduced list:

   - https://meta.wikimedia.org/wiki/Wikipedia_article_depth
   - https://meta.wikimedia.org/wiki/List_of_Wikipedias

searxng_extra/update/update_engine_descriptions.py
==================================================

Before PR #2269 there was a match_language() function that did an approximation using various methods. With PR #2269 there are only the types in the data model of the languages, which can be recognized by babel. The approximation methods, which are needed (only here) to determine the descriptions, must be replaced by other methods.

[1] https://en.wikipedia.org/wiki/Classical_Chinese
[2] https://en.wikipedia.org/wiki/Traditional_Chinese_characters
[3] https://www.mediawiki.org/wiki/Writing_systems#LanguageConverter

Closes: https://github.com/searxng/searxng/issues/2330

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24[mod] wikipedia & wikidata: upgrade to data_type: traits_v1Markus Heiser
BTW this fixes an issue in wikipedia: SearXNG's locales zh-TW and zh-HK are now using language `zh-classical` from wikipedia (and not `zh`).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24[mod] Wikipedia: fetch engine traits (data_type: supported_languages)Markus Heiser
Implements a fetch_traits function for the Wikipedia engines.

.. note::

   Does not include migration of the request method from 'supported_languages' to the 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-01-29wikipedia engine: update _fetch_supported_languagesAlexandre Flament
The layout of https://meta.wikimedia.org/wiki/List_of_Wikipedias has changed.
2022-08-01[mod] add 'Accept-Language' HTTP header to online processoresMarkus Heiser
Most engines that support languages (and regions) use the Accept-Language header from the web browser to build a response that fits the language (and region).

- add new engine option: send_accept_language_header

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27[format.python] initial formatting of the python codeMarkus Heiser
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-04-12[httpx] replace searx.poolrequests by searx.networkAlexandre Flament
settings.yml:

* outgoing.networks:
  * can contain network definitions
  * properties: enable_http, verify, http2, max_connections, max_keepalive_connections, keepalive_expiry, local_addresses, support_ipv4, support_ipv6, proxies, max_redirects, retries
  * retries: 0 by default, number of times searx retries sending the HTTP request (using a different IP & proxy each time)
  * local_addresses can be "192.168.0.1/24" (it supports IPv6)
  * support_ipv4 & support_ipv6: both True by default
  see https://github.com/searx/searx/pull/1034
* each engine can define a "network" section:
  * either a full network description
  * or a reference to an existing network
* all HTTP requests of an engine use the same HTTP configuration (this was not the case before, see proxy configuration in master)
2021-03-25[fix] wikipedia: remove HTML from the titleAlexandre Flament
fr.wikipedia.org (and, it seems, no other Wikipedia website) adds HTML to api_result['displayTitle'] (search for '!wp :fr Braid', for example). The commit uses api_result['title'] instead.
2021-02-25remove articles number from engines_languages.jsonMarc Abonce Seguin
2021-02-11[upd] wikipedia engine: return an empty result on query with illegal charactersAlexandre Flament
On some queries (like an IT error message), Wikipedia returns an HTTP error 400. This commit returns an empty result instead of showing an error to the user.
2021-02-08add support for Chinese variants in WikipediaMarc Abonce Seguin
2021-01-14[enh] engines: add about variableAlexandre Flament
Move meta information from the comment to the about variable so that the preferences and the documentation can show this information.
2020-12-11[enh] add raise_for_httperrorAlexandre Flament
check HTTP response:

* detect some common CAPTCHA challenges (no solving). In this case the engine is suspended for a long time.
* otherwise raise HTTPError as before

The check is done in poolrequests.py (it was in search.py before).

Update qwant, wikipedia, wikidata to use raise_for_httperror instead of raise_for_status.
2020-12-07[fix] wikipedia: minor fix: return no result instead of crash in some very few casesAlexandre Flament
In a few cases, the JSON result doesn't contain the key 'type'.
2020-12-04[fix] wikipedia engine: don't raise an error when the query is not foundAlexandre Flament
Add a new parameter "raise_for_status", set to True by default.

When True, any HTTP status code >= 300 raises an exception (#2332).
When False, the engine can manage the HTTP status code by itself.
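
The behaviour described in this commit message can be sketched as follows. This is only an illustration of the stated rule (status >= 300 raises unless the check is disabled); `check_response` and `HTTPStatusError` are hypothetical names, not the identifiers used in searx.

```python
class HTTPStatusError(Exception):
    """Hypothetical exception type, used only in this sketch."""


def check_response(status_code: int, raise_for_status: bool = True) -> int:
    """Return the status code, or raise if checking is enabled and it is >= 300."""
    if raise_for_status and status_code >= 300:
        raise HTTPStatusError(f"unexpected HTTP status {status_code}")
    # With raise_for_status=False the caller (the engine) handles the status itself.
    return status_code
```

With the check disabled, an engine could, for example, treat a 404 as "no result" instead of an error.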
2020-09-10Drop Python 2 (1/n): remove unicode string and url_utilsDalf
2020-09-10use Wikipedia's REST v1 APIMarc Abonce Seguin
2020-07-26fix Wikipedia's paragraph extractionMarc Abonce Seguin
2019-12-21remove empty parenthesis in wikipedia's summaryMarc Abonce Seguin
They're usually IPA pronunciations which are removed by the API.
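
A minimal sketch of this kind of cleanup, assuming a simple regular expression is sufficient; the helper name `strip_empty_parentheses` is hypothetical and the engine's actual implementation may differ.

```python
import re

# Matches "()" or "( )" together with any whitespace directly before it.
_EMPTY_PARENS = re.compile(r"\s*\(\s*\)")


def strip_empty_parentheses(summary: str) -> str:
    """Remove parentheses left empty after the API dropped their content."""
    return _EMPTY_PARENS.sub("", summary)
```

Parentheses that still contain text, such as "f(x)", are left untouched.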
2019-12-21exclude disambiguation pages from wikipedia infoboxMarc Abonce Seguin
2019-12-21[fix] handle empty response from wikipedia engine - closes #1114Adam Tauber
2019-01-07fix after rebaseNoémi Ványi
2019-01-07Revert "remove 'all' option from search languages"Noémi Ványi
This reverts commit 4d1770398a6af8902e75c0bd885781584d39e796.
2019-01-06[fix] check language aliases when setting search languageMarc Abonce Seguin
2018-03-27refactor engine's search language handlingMarc Abonce Seguin
Add match_language function in utils to match any user given language code with a list of engine's supported languages. Also add language_aliases dict on each engine to translate standard language codes into the custom codes used by the engine.
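
The idea can be sketched roughly as below. This is a simplified illustration, not the function added by the commit: the signature, the fallback behaviour, and the alias values are assumptions made for the example.

```python
def match_language(locale_code, supported, aliases=None, fallback="en"):
    """Map a user-given language code to one of the engine's supported codes."""
    aliases = aliases or {}

    # 1. The code is supported as given.
    if locale_code in supported:
        return locale_code

    # 2. The full code has an engine-specific alias.
    alias = aliases.get(locale_code)
    if alias in supported:
        return alias

    # 3. Retry with the language part only ("fr-CA" -> "fr").
    language = locale_code.split("-")[0]
    if language in supported:
        return language
    alias = aliases.get(language)
    if alias in supported:
        return alias

    # 4. Nothing matched.
    return fallback


# Illustrative alias table (hypothetical values): standard code -> engine code.
language_aliases = {"nb": "no"}
```

For instance, `match_language("nb-NO", ["en", "no"], language_aliases)` falls through to the alias of the language part and yields the engine's own code.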
2017-12-06remove 'all' option from search languagesmarc
2017-05-15[enh] py3 compatibilityAdam Tauber
2016-12-29change language list to only include languages with a minimum of enginesmarc
that support them. Users can still query less-supported languages through the :lang_code bang.
2016-12-16minor fixes in utils/fetch_languages.pymarc
2016-12-15tests for _fetch_supported_languages in enginesmarc
and refactor method to make it testable without making requests
2016-12-13[mod] fetch supported languages for several enginesmarc
utils/fetch_languages.py gets the languages supported by each engine and generates engines_languages.json with each engine's supported languages.
2016-12-13[enh] add supported_languages on engines and auto-generate languages.pymarc
2016-08-05[fix] urls merge in infobox (#593)marc
TODO: merge attributes
2016-04-17[enh] wikipedia infoboxa01200356
creates simple multilingual infobox using wikipedia's api
2014-09-03using general mediawiki-engineThomas Pointhuber
* writing general mediawiki-engine
* using this engine for wikipedia
* using this engine for uncyclopedia
2014-09-02fix wikipedia engine and add commentsThomas Pointhuber
* add paging support
* make number_of_results changeable
* make result calculation clearer
* add comments
2014-01-31[enh] search language support initasciimoo
2013-10-23[mod] wikipedia engine removedasciimoo
2013-10-16[mod] wikipedia limited to first resultasciimoo
2013-10-15[enh] proper urlsasciimoo
2013-10-15[enh] wikipedia search addedasciimoo