path: root/searx/engines
Age  Commit message  Author
2024-11-17[fix] engine: duckduckgo - don't quote query stringMarkus Heiser
The query string sent to DDG must not be quoted. The query string was URL-quoted in #4011, but the URL-quoted query string results in unexpected *URL decoded* and other garbage results, as reported in #4019 and #4020.

To test, compare the results of queries like::

    !ddg Häuser und Straßen :de
    !ddg Häuser und Straßen :all
    !ddg 房屋和街道 :all
    !ddg 房屋和街道 :zh

Closed:
- [#4019] https://github.com/searxng/searxng/issues/4019
- [#4020] https://github.com/searxng/searxng/issues/4020

Related:
- [#4011] https://github.com/searxng/searxng/pull/4011

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
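One plausible way to see why a pre-quoted query goes wrong: if the HTTP client form-encodes the POST ``data`` field itself (an assumption here; the commit message does not say so), a query that has already been passed through ``urllib.parse.quote_plus`` is encoded a second time. A minimal standard-library sketch, with ``urlencode`` standing in for the client's form encoding:

```python
from urllib.parse import quote_plus, urlencode

query = "Häuser und Straßen"

# Raw query: the form encoder percent-encodes it exactly once.
encoded_once = urlencode({"q": query})

# Pre-quoted query: the '%' and '+' characters produced by quote_plus
# are themselves encoded again by the form encoder.
encoded_twice = urlencode({"q": quote_plus(query)})

print(encoded_once)   # q=H%C3%A4user+und+Stra%C3%9Fen
print(encoded_twice)  # q=H%25C3%25A4user%2Bund%2BStra%25C3%259Fen
```

The field name ``q`` is only illustrative. A server that decodes the second string once ends up with the literal text ``H%C3%A4user+und+Stra%C3%9Fen`` as the search term, which would match the "URL decoded" results described above.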
2024-11-14[fix] engine: duckduckgo - only uses first word of the search termsNicolas Dato
During the revision in PR #3955 the query string was accidentally converted into a list of words. Furthermore, the query must be quoted before being POSTed in the ``data`` field, see ``urllib.parse.quote_plus`` [1].

[1] https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote_plus

Closed: #4009
Co-Authored-by: @return42
2024-11-01[fix] annas archive: crash when no thumbnail, differing results, pagingBnyro
2024-10-31[fix] google: display every result when keyword is contained in content fielduply23333
2024-10-29[refactor] engine: duckduckgo - https://html.duckduckgo.com/htmlMarkus Heiser
The entire source code of the duckduckgo engine has been reengineered and cleaned up.

1. DDG uses the URL https://html.duckduckgo.com/html for no-JS requests, whose response is also easier to parse than that of the previous https://lite.duckduckgo.com/lite/ URL.

2. The bot detection of DDG has so far caused problems and often led to a CAPTCHA. This can be circumvented by setting the `Sec-Fetch-Mode` header to `navigate`.

Closes: https://github.com/searxng/searxng/issues/3927
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
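The two points above can be sketched as a small request builder. This is not the engine's actual code: the function name ``build_request_params``, the dictionary layout and the form field ``q`` are assumptions made for illustration. Only the URL and the ``Sec-Fetch-Mode`` header value come from the commit message.

```python
def build_request_params(query: str) -> dict:
    """Assemble URL, headers and form data for a no-JS DDG request (sketch)."""
    return {
        # no-JS endpoint named in the commit message
        "url": "https://html.duckduckgo.com/html",
        "method": "POST",
        "headers": {
            # header named in the commit message as the way around the CAPTCHA
            "Sec-Fetch-Mode": "navigate",
        },
        # hypothetical form field name, used here only for illustration
        "data": {"q": query},
    }


params = build_request_params("searxng")
print(params["headers"])
```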
2024-10-19[fix] engine: duckduckgo - CAPTCHA detectionMarkus Heiser
The previous implementation could not distinguish a CAPTCHA response from an ordinary result list: a CAPTCHA was treated as a result list containing no items.

DDG does not block IPs. Instead, a CAPTCHA wall is placed in front of a dubious request.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-15[upd] pypi: Bump pylint from 3.2.7 to 3.3.1dependabot[bot]
Bumps [pylint](https://github.com/pylint-dev/pylint) from 3.2.7 to 3.3.1.
- [Release notes](https://github.com/pylint-dev/pylint/releases)
- [Commits](https://github.com/pylint-dev/pylint/compare/v3.2.7...v3.3.1)

---
updated-dependencies:
- dependency-name: pylint
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-15[feat] engine: support for openlibraryBnyro
2024-10-15[enh] engine: mojeek - add language support0xhtml
Improve region and language detection / all locale

Testing has shown the following behaviour for the different default and empty values of Mojeek's parameters:

| param    | idx | value  | behaviour                 |
| -------- | --- | ------ | ------------------------- |
| region   | 0   | ''     | detect region based on IP |
| region   | 1   | 'none' | all regions               |
| language | 0   | ''     | all languages             |
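The table suggests a simple mapping from the selected locale to request values. The sketch below is an illustration only: the function name and the keys ``region`` and ``language`` are taken from the table's ``param`` column, not from the engine's real request arguments, and the fallback to ``'none'`` for an unspecified region is an assumed design choice.

```python
from typing import Optional


def mojeek_locale_params(region: Optional[str], language: Optional[str]) -> dict:
    """Map an optional region/language selection to Mojeek parameter values."""
    return {
        # 'none' asks for all regions; '' would instead let Mojeek
        # detect the region from the requesting IP address.
        "region": region if region else "none",
        # '' means all languages.
        "language": language if language else "",
    }


print(mojeek_locale_params(None, None))
print(mojeek_locale_params("de", "de"))
```

Falling back to ``'none'`` rather than ``''`` avoids IP-based region detection when no region is selected.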
2024-10-14[mod] engine gitea: compatible with modern gitea or forgejoSnoweuph
Without this patch the Gitea search engine is only partially compatible with modern Gitea or Forgejo:

- Fixing some JSON fields
- Using the repository avatar when available

To verify my results you can look at the modern API doc and results; it is available on all Gitea and Forgejo instances by default. Here is a search API result of mine:

- https://git.euph.dev/api/v1/repos/search?q=ccna
2024-10-03[doc] slightly improve documentation of SQL enginesMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-03[feat] implement mariadb engineGrant Lanham
2024-10-03add get_embeded_stream_url to searx.utilsAustin-Olacsi
2024-09-29[enh] engine: stract - add language/region support0xhtml
2024-09-26[fix] use get accessor to pull desc from bing_imagesGrant Lanham
2024-09-23add Cloudflare AI Gateway engineZhijie He
add Cloudflare AI Gateway engine
add settings for Cloudflare AI Gateway engine
set utf8 encode for data, fix non english char cause 500 error
format json data
fixed indentation and config format error
fix line-length limitation in CI
reformatted code for CI
reformatted code for CI
limit system prompts to less 120 chars
cleanup unused variable & format code
2024-09-15[fix] Removes ``/>`` ending tags for void HTML elementsGrant Lanham
Removes ``/>`` ending tags for void elements [1] and replaces them with ``>``. Part of the larger effort to clean up invalid HTML throughout the codebase [2].

[1] https://html.spec.whatwg.org/multipage/syntax.html#void-elements
[2] https://github.com/searxng/searxng/issues/3793
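As an illustration of the kind of rewrite described, the following sketch replaces ``/>`` with ``>`` on void elements using a regular expression. It is not the tooling used for the commit. The helper name ``strip_void_slashes`` is hypothetical, and a regex like this does not cope with ``>`` inside attribute values.

```python
import re

# Void elements as listed in the HTML standard's syntax section.
VOID_ELEMENTS = (
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
)

_VOID_SELF_CLOSING = re.compile(
    r"<(" + "|".join(VOID_ELEMENTS) + r")\b([^<>]*?)\s*/>",
    re.IGNORECASE,
)


def strip_void_slashes(html: str) -> str:
    """Replace '<br/>' style endings with '<br>' for void elements only."""
    return _VOID_SELF_CLOSING.sub(r"<\1\2>", html)


print(strip_void_slashes('<img src="a.png" /><br/><svg/>'))
# <img src="a.png"><br><svg/>
```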
2024-09-15[fix] engine: qwant - detect captchaUrl and raise SearxEngineCaptchaExceptionMarkus
So far a CAPTCHA was not recognized in the response of the qwant engine and a SearxEngineAPIException was raised by mistake. With this patch a CAPTCHA redirect is recognized and the correct SearxEngineCaptchaException is raised. Closes: https://github.com/searxng/searxng/issues/3806 Signed-off-by: Markus <markus@venom.fritz.box>
2024-09-15[fix] fetch_traits: brave, google, annas_archive & radio_browserMarkus
This patch fixes a bug reported by CI "Fetch traits" [1] (brave) and improves other fetch traits functions (google, annas_archive & radio_browser).

brave:

    File "/home/runner/work/searxng/searxng/searx/engines/brave.py", line 434, in fetch_traits
      sxng_tag = region_tag(babel.Locale.parse(ui_lang, sep='-'))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/home/runner/work/searxng/searxng/searx/locales.py", line 155, in region_tag
    Error: raise ValueError('%s missed a territory')

google:

    change ERROR message about unknown UI language to INFO message

radio_browser:

    country_list contains duplicates that differ only in upper/lower case

annas_archive:

    for better diff; sort the persistence of the traits

[1] https://github.com/searxng/searxng/actions/runs/10606312371/job/29433352518#step:6:41

Signed-off-by: Markus <markus@venom.fritz.box>
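The radio_browser and annas_archive points can be illustrated with a short sketch. It assumes nothing about the engines' actual code: the helper name ``dedupe_case_insensitive`` and the sample data are hypothetical.

```python
import json


def dedupe_case_insensitive(names):
    """Keep the first spelling of each name, ignoring upper/lower case."""
    seen = set()
    unique = []
    for name in names:
        key = name.casefold()
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique


# Duplicates that differ only in case collapse to a single entry.
countries = dedupe_case_insensitive(["Germany", "germany", "France"])
print(countries)  # ['Germany', 'France']

# Serializing with sorted keys keeps diffs between two fetches small.
traits = {"zh": "Chinese", "de": "German"}
print(json.dumps(traits, sort_keys=True, indent=2))
```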
2024-09-15[feat] gitlab: implement dedicated moduleBnyro
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-12[fix] json_engine: Fix result fields being mixed upLucas Schwiderski
Fixes #3810.
2024-09-12[fix] yep engine: remove links to other engines0xhtml
Among its search results, Yep includes links to search for the same query on Google and other search engines. This fix skips these results.
2024-09-06[fix] bilibili engine - ValueError in duration & HTML in titleMarkus Heiser
- ValueError in duration: issue reported in #3799
- HTML in title: related to #3770

[#3799] https://github.com/searxng/searxng/issues/3799
[#3770] https://github.com/searxng/searxng/pull/3770

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-03[fix] engine yahoo: HTML tags are included in result titlesMarkus
- https://github.com/searxng/searxng/issues/3790 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-08-30[fix] Do not show DDG user-agent from zero clickAlexander Sulfrian
We do not want to show the user-agent information from the duckduckgo zero click info. This is the user-agent used by searxng and not the user-agent used by the user. This was already done for the IP address in: 0fb3f0e4aeecf62612cb6568910cf0f97c98cab9
2024-08-21[feat] engine: implementation of yandex (web, images)Austin-Olacsi
It's set to inactive in settings.yml because of CAPTCHA. To use the engine, remove that setting from settings.yml.

Closes: https://github.com/searxng/searxng/issues/961
2024-08-21Fix tineye engine url, datetime parsing, and minor refactorGrant Lanham
Changes made to tineye engine:

1. Importing logging if TYPE_CHECKING is enabled
2. Remove unnecessary try-catch around JSON parsing of the response, as this masked the original error and had no immediate benefit
3. Improve error handling explicitly for status codes 422 and 400 upfront, deferring JSON parsing to these status codes and successful status codes only
4. Unit test all new applicable changes to ensure compatibility
2024-08-08[fix] engine google: use extract_text everywhere0xhtml
2024-08-08[fix] engine google: strip bubble text from answers0xhtml
Google underlines words inside of answers that can be clicked to show additional definitions. These definitions inside the answer were not correctly handled and ended up in the middle of the answer text. With this fix, the extra definitions are stripped from the answer shown by the frontend.
2024-07-29[fix] brave fetch_traits: Brave added Chinese (zh-hant) to UIMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-28[fix] engine geizhals: if there are no offers, there is no best priceMarkus Heiser
Fault pattern: if there are no offers, an exception was thrown:

    IndexError: list index out of range

This patch makes the addition of "best price" dependent on whether one exists.

Closes: #3685
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
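The fault pattern and its guard can be sketched as follows. The function name ``build_result`` and the field names are hypothetical and do not reflect the geizhals engine's actual result structure.

```python
def build_result(title, offers):
    """Build a result dict, adding 'best_price' only if an offer exists."""
    result = {"title": title}
    # Indexing offers[0] unconditionally raises IndexError for an empty list,
    # so the field is only added when there is at least one offer.
    if offers:
        result["best_price"] = offers[0]
    return result


print(build_result("Widget", ["12.99 EUR", "14.50 EUR"]))
print(build_result("Discontinued widget", []))
```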
2024-07-27[feat] videos template: support for view countBnyro
2024-07-27[feat] engine: implementation of geizhals.deBnyro
2024-07-27[enh] Add API Key support for discourse.org forumsSylvain Cau
2024-07-20[fix] engine yacy: update list of base URLsMarkus Heiser
https://search.lomig.me
  Poor results / tested `!yacy :en hello` and got zero results

https://yacy.ecosys.eu
  Slow response (> 6sec for trivial search terms)

https://search.webproject.link
  Dead instance / URL offline

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-15Update mullvad_leta.py to account for img_elemGrant Lanham
A recent update from Mullvad Leta introduced the img_elem. This update broke the existing logic. Now, by checking the length of the dom_result to see if it was included in the return results, we can handle the logic accordingly.
2024-07-14[feat] engine: implementation of alpine linux packagesBnyro
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-07Implement google/brave switch in Mullvad LetaGrant Lanham
cleanup import annotations
2024-07-03[fix] gentoo: use mediawiki engineBnyro
2024-06-30[mod] libretranslate: add direct link to translation (engine)Thomas Renard
2024-06-25[fix] brave fetch_traits: layout of the settings page has changedMarkus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-25[fix] engine zlibrary: handle seized domainMarkus Heiser
The domains of zlibrary instances are known to be seized from time to time. This leads to problems when, for example, the automated tasks try to update the engine traits (aka fetch_traits). The search function should also generate a suitable error message (currently either SSL errors or empty result lists are returned). [1]

[1] https://github.com/searxng/searxng/issues/3610

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-25[fix] bing news results return invalid imagesMarkus Heiser
Closes: https://github.com/searxng/searxng/issues/3502 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-23[fix] implement tests and remove usage of gen_useragent in enginesGrant Lanham
2024-06-20Fix search_url building.Richard Lyons
2024-06-16[fix] \!goi irrelevant results AND display more resultsAllen
2024-06-15[perf] torrents.html, files.html: don't parse and re-format filesizeBnyro
2024-06-15[feat] mozhi: fix crash, support synonyms and definitionBnyro
2024-06-15[refactor] duckduckgo: use extr helper function in get_vqdBnyro
2024-06-07[feat] mojeek: implement dedicated moduleBnyro