path: root/searx/utils.py
Age | Commit message | Author
2024-11-24 | [chore] *: fix typos detected by typos-cli | Bnyro
2024-10-03 | add get_embeded_stream_url to searx.utils | Austin-Olacsi
2024-07-27 | [feat] videos template: support for view count | Bnyro
2024-07-27 | [fix] remove unused code / `_STORAGE_UNIT_VALUE` | Markus Heiser
The `_STORAGE_UNIT_VALUE` dictionary is a leftover from:
- https://github.com/searxng/searxng/pull/3570
In that PR we removed the old implementations but forgot to delete this `_STORAGE_UNIT_VALUE`.
Closes: https://github.com/searxng/searxng/pull/3672
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-15 | [perf] torrents.html, files.html: don't parse and re-format filesize | Bnyro
2024-05-29 | [enh] add re-usable func to filter text | Allen
2024-04-08 | [fix] remove usage of no longer existing names from lxml | Markus Heiser
In lxml 5.1.1 the private name `_ElementStringResult` in module `lxml.etree` no longer exists. This code was written nearly a decade ago, and it is no longer clear what the intention behind `_ElementStringResult` and `_ElementUnicodeResult` had been. It can be assumed that these classes will no longer occur.
Closes: https://github.com/searxng/searxng/issues/3368
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-11 | [mod] pylint all files with one profile / drop PYLINT_SEARXNG_DISABLE_OPTION | Markus Heiser
In the past, some files were tested with the standard profile, others with a profile in which most of the messages were switched off ... some files were not checked at all.
- ``PYLINT_SEARXNG_DISABLE_OPTION`` has been abolished
- the distinction ``# lint: pylint`` is no longer necessary
- the pylint tasks have been reduced from three to two
  1. ./searx/engines -> lint engines with additional builtins
  2. ./searx ./searxng_extra ./tests -> lint all other python files
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-07 | [fix] nyaa engine - paging support & filesize (GiB) | Markus Heiser
BTW: pylint engine
Closes: https://github.com/searxng/searxng/issues/3290
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-02-25 | [refactor] images: add resolution, image format and filesize fields | Bnyro
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-22 | [fix] HTMLParser: undocumented not implemented method | Markus Heiser
In Python versions <py3.10 there is an issue with an undocumented method HTMLParser.error() [1][2] that was deprecated in Python 3.4 and removed in Python 3.5. To be compatible with higher versions (>=py3.10), an error method is implemented which throws an AssertionError exception like the higher Python versions do [3].
[1] https://github.com/python/cpython/issues/76025
[2] https://bugs.python.org/issue31844
[3] https://github.com/python/cpython/pull/8562
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
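The approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from searx/utils.py: the class name `CompatHTMLParser` is hypothetical, and only the idea of overriding `error()` to raise `AssertionError` is taken from the commit message.

```python
from html.parser import HTMLParser


class CompatHTMLParser(HTMLParser):
    """Hypothetical parser subclass, for illustration only."""

    def error(self, message):
        # On older interpreters (<py3.10) the inherited error() hook is not
        # implemented. Raising AssertionError here gives the same failure
        # mode on every Python version.
        raise AssertionError(message)
```

Any code path that ends up calling `error()` then fails with the same exception type regardless of the interpreter version.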
2023-09-18 | [fix] spelling | jazzzooo
2023-09-15 | [fix] brave.news | jazzzooo
2023-09-09 | Replace chompjs with pure Python code | Alexandre Flament
The new implementation is good enough for the current usage (brave)
2023-09-08 | [mod] utils.py: add markdown_to_text helper function | Bnyro
2023-03-24 | [mod] replace utils.match_language by locales.match_locale | Markus Heiser
This patch replaces the *full of magic* ``utils.match_language`` function with ``locales.match_locale``. The ``locales.match_locale`` function is based on the ``locales.build_engine_locales`` introduced in 9ae409a0 [1].
In the past SearXNG only supported searching by language, not by region. This was changed a long time ago, and regions were added to the SearXNG core but not to the engines. ``utils.match_language`` was the function that handled the different aspects of languages/regions in the SearXNG core and the supported *languages* in the engines. It did so with some magic and works well for most use cases but fails in some edge cases.
To replace the concurrence of languages and regions in the SearXNG core, ``locales.build_engine_locales`` was introduced in 9ae409a0 [1]. With the last patches all engines have been migrated to a ``fetch_traits`` and a language/region concept that is based on ``locales.build_engine_locales``.
To summarize: there is no longer a need for ``utils.match_language``.
[1] https://github.com/searxng/searxng/pull/1652
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 | [mod] replace searx.languages by searx.sxng_locales | Markus Heiser
With the language and region tags from the EngineTraitsMap, the handling of SearXNG's language and region tags has been normalized and is no longer a *mystery*. The "languages" became "locales" that are supported by babel, and with this the update_engine_traits.py can be simplified a lot. Other code places can be simplified as well, but these simplifications should (respectively can) only be done when none of the engines work with the deprecated EngineTraits.supported_languages interface anymore.
This commit replaces searx.languages by searx.sxng_locales and fixes the naming of some names from "language" to "locale" (e.g. language_codes --> sxng_locales).
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-19 | [doc] improved docs of implementations for automatic speech recognition | Markus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-17 | Add "Auto-detected" as a language. | Alexandre Flament
When the user chooses "Auto-detected", the choice remains on the following queries. The detected language is displayed. For example "Auto-detected (en)":
* the next query language is going to be auto detected
* for the current query, the detected language is English.
This replaces the autodetect_search_language plugin.
2022-12-26 | Lazy load fasttext-predict | Alexandre Flament
2022-12-16 | Replace langdetect with fasttext | ArtikusHG
2022-09-27 | [fix] typos / reported by @kianmeng in searx PR-3366 | Markus Heiser
[PR-3366] https://github.com/searx/searx/pull/3366
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-30 | [fix] pyright repported errors | Alexandre Flament
The errors make pyright usage useless since a new error won't be seen [1].
[1] https://github.com/searxng/searxng/pull/1569
```
searx/compat.py:11:27 - error: Expression of type "Type[cached_property[_T@cached_property]]" cannot be assigned to declared type "Type[cached_property]" "Type[cached_property[_T@cached_property]]" is incompatible with "Type[cached_property]" Type "Type[cached_property[_T@cached_property]]" cannot be assigned to type "Type[cached_property]" (reportGeneralTypeIssues)
searx/utils.py:69:36 - error: Expression of type "None" cannot be assigned to parameter of type "str" Type "None" cannot be assigned to type "str" (reportGeneralTypeIssues)
searx/utils.py:573:85 - error: Expression of type "None" cannot be assigned to parameter of type "int" Type "None" cannot be assigned to type "int" (reportGeneralTypeIssues)
searx/webapp.py:1306:22 - error: Argument of type "str" cannot be assigned to parameter "__a" of type "BytesPath" in function "join" Type "str" cannot be assigned to type "BytesPath" "str" is incompatible with "bytes" "str" is incompatible with protocol "PathLike[bytes]" "__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:68 - error: Argument of type "Literal['themes']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join" Type "Literal['themes']" cannot be assigned to type "BytesPath" "Literal['themes']" is incompatible with "bytes" "Literal['themes']" is incompatible with protocol "PathLike[bytes]" "__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:78 - error: Argument of type "str | Any | None" cannot be assigned to parameter "paths" of type "BytesPath" in function "join" Type "str | Any | None" cannot be assigned to type "BytesPath" Type "str" cannot be assigned to type "BytesPath" "str" is incompatible with "bytes" "str" is incompatible with protocol "PathLike[bytes]" "__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:85 - error: Argument of type "Literal['img']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join" Type "Literal['img']" cannot be assigned to type "BytesPath" "Literal['img']" is incompatible with "bytes" "Literal['img']" is incompatible with protocol "PathLike[bytes]" "__fspath__" is not present (reportGeneralTypeIssues)
searx/engines/mongodb.py:8:6 - warning: Import "pymongo" could not be resolved (reportMissingImports)
searx/engines/mysql_server.py:9:8 - warning: Import "mysql.connector" could not be resolved (reportMissingImports)
searx/engines/postgresql.py:9:8 - warning: Import "psycopg2" could not be resolved from source (reportMissingModuleSource)
searx/engines/xpath.py:187:28 - warning: "categories" is not defined (reportUndefinedVariable)
searx/search/__init__.py:184:82 - warning: "flask" is not defined (reportUndefinedVariable)
searx/search/checker/background.py:19:26 - error: Type of "schedule" is partially unknown Type of "schedule" is "(delay: Any, func: Any, *args: Any) -> Literal[True]" (reportUnknownVariableType)
searx/shared/__init__.py:8:12 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
searx/shared/shared_uwsgi.py:5:8 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
```
2022-06-03 | [fix] prepare for pylint 2.14.0 | Markus Heiser
Remove issues reported by Pylint 2.14.0:
- no-self-use: has been moved to optional extension [1]
- The refactoring checker now also raises 'consider-using-generator' messages for max(), min() and sum(). [2]
.pylintrc:
- <option name>-hint was removed long ago; Pylint 2.14.0 raises an error on invalid options
- bad-continuation and bad-whitespace have been removed [3]
[1] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/summary.html#removed-checkers
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/full.html#what-s-new-in-pylint-2-14-0
[3] https://pylint.pycqa.org/en/latest/whatsnew/2/2.6/summary.html#summary-release-highlights
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
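For readers unfamiliar with the 'consider-using-generator' message mentioned above, the following sketch shows the kind of change it asks for. The variable names are made up for illustration and are not taken from the SearXNG code base.

```python
sizes = [512, 2048, 1024]  # hypothetical sample data

# Flagged style: the list comprehension builds a temporary list first.
total_flagged = sum([size * 2 for size in sizes])

# Preferred style: a generator expression feeds sum() item by item.
total = sum(size * 2 for size in sizes)

# The same applies to max() and min().
largest_below_2000 = max(size for size in sizes if size < 2000)
```

Both forms compute the same value; the generator form simply avoids materializing an intermediate list.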
2022-04-22 | [test.pyright] suppress unneeded error & warning messages | Markus Heiser
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-04-16 | searx.utils.html_to_text: replace <br/> by a space | Alexandre Flament
2022-01-30 | [mod] searx.utils: more typing | Alexandre Flament
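The effect of replacing `<br/>` with a space during HTML-to-text conversion can be illustrated with a small standard-library sketch. The names `_TextCollector` and `html_to_text_sketch` are hypothetical; the real `searx.utils.html_to_text` may be implemented differently.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Hypothetical helper: collects text and turns <br> into a space."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser routes self-closing tags such as <br/> through
        # handle_starttag() as well, so one check covers <br> and <br/>.
        if tag == "br":
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text_sketch(html):
    collector = _TextCollector()
    collector.feed(html)
    collector.close()  # flush any text still buffered by the parser
    return "".join(collector.parts).strip()
```

Without the space, `first<br/>second` would collapse into `firstsecond`; with it, the two words stay separated.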
2022-01-29 | [mod] add documentation about searx.utils | Alexandre Flament
This module is a toolbox for the engines. It should be documented. In addition, searx/utils.py is checked by pylint.
2021-12-27 | [format.python] initial formatting of the python code | Markus Heiser
This patch was generated by black [1]::
    make format.python
[1] https://github.com/psf/black
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-12 | [fix] fix match_language issue to make zh-TW match to zh-Hant-TW | Marc Abonce Seguin
pybabel separates locales with underscores, but we use hyphens everywhere babel doesn't directly touch
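The separator mismatch mentioned here can be sketched with two tiny helpers. They are hypothetical and only illustrate the underscore/hyphen conversion; the actual fix in `match_language` is not reproduced here.

```python
def to_babel_tag(hyphen_tag: str) -> str:
    """Convert a hyphen-separated tag (e.g. 'zh-Hant-TW') to babel's form."""
    return hyphen_tag.replace("-", "_")


def to_hyphen_tag(babel_tag: str) -> str:
    """Convert babel's underscore-separated tag back to the hyphen form."""
    return babel_tag.replace("_", "-")
```

Comparing tags only after bringing both sides to the same separator avoids mismatches caused purely by the separator.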
2021-10-06 | [fix] don't mix loaded modules with imported modules (sys.modules) | Markus Heiser
The utils.load_module() function is used to load a python file (aka module) and return the module's namespace. SearXNG uses this function to load *engines and answerers* from arbitrary locations with arbitrary modifications. These are not real python modules, and it is not intended to mix these *engines and answerers* with the python modules registered in sys.modules.
Closes: https://github.com/searxng/searxng/issues/312
Suggested-by: @dalf in https://github.com/searxng/searxng/issues/312
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
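Loading a file as a module without registering it in `sys.modules` can be done with `importlib.util`. The following is a minimal sketch under the assumption of a `(filename, module_dir)` signature; the function name `load_module_sketch` is hypothetical and the real `utils.load_module()` may differ in its details.

```python
import importlib.util
import os


def load_module_sketch(filename, module_dir):
    """Execute a python file and return its namespace as a module object."""
    path = os.path.join(module_dir, filename)
    name = os.path.splitext(filename)[0]
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    # exec_module() runs the file's code in the new module object.
    # Nothing here writes to sys.modules, so the loaded file stays
    # separate from regularly imported modules.
    spec.loader.exec_module(module)
    return module
```

Because the module is never registered, loading the same file twice yields two independent module objects.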
2021-08-24 | [mod] searx.utils.dict_subset: rewrite with comprehension | Alexandre Flament
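A dict-subset helper written as a comprehension might look like the sketch below. The name `dict_subset_sketch` and its signature are assumptions made for illustration; they are not copied from searx/utils.py.

```python
def dict_subset_sketch(dictionary, properties):
    """Return a new dict with only the keys listed in `properties`.

    Keys that are missing from `dictionary` are skipped silently.
    """
    return {key: dictionary[key] for key in properties if key in dictionary}
```

For example, `dict_subset_sketch({"A": "a", "B": "b", "C": "c"}, ["A", "C", "D"])` keeps only the `"A"` and `"C"` entries.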
2021-07-30 | version based on the git repository | Alexandre Flament
This commit removes the need to update the brand for GIT_URL and GIT_BRANCH: they are read from the git repository. It is possible to call python -m searx.version freeze to freeze the current version. Useful when the code is installed outside git (distro package, docker, etc...)
2021-06-09 | [fix] strip spaces from searx user agent | Alexandre Flament
h11 (used by httpx) rejects HTTP requests with a trailing space in HTTP headers
2021-06-01 | [mod] move all default settings into searx.settings_defaults | Alexandre Flament
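How a trailing space can end up in a header value, and how stripping avoids it, is shown in the sketch below. The function `build_user_agent` and its parameters are hypothetical and do not reflect the actual searx implementation.

```python
def build_user_agent(name, version, suffix=""):
    """Assemble a User-Agent value; `suffix` is an optional extra token."""
    # With an empty suffix the f-string ends in a space ("name/1.0 ").
    # strip() removes it so the header value has no trailing whitespace.
    return f"{name}/{version} {suffix}".strip()
```

With an empty suffix the unstripped string would end in a space, which is the kind of header value the commit message says h11 rejects.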
2021-05-28 | [mod] utils.get_value() - avoidance of a recursion | Markus Heiser
In a comment [1] dalf suggested to avoid a recursion of get_value()
[1] https://github.com/searxng/searxng/pull/99#discussion_r640833716
Suggested-by: Alexandre Flament <alex@al-f.net>
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
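A nested-dictionary lookup that walks the keys in a loop instead of recursing could be sketched as follows. The name `get_value_sketch`, the `default` keyword and the error behaviour are assumptions made for illustration, not the actual `utils.get_value()` signature.

```python
_MISSING = object()  # sentinel: distinguishes "no default given" from None


def get_value_sketch(dictionary, *keys, default=_MISSING):
    """Follow `keys` through nested dicts iteratively."""
    current = dictionary
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            if default is _MISSING:
                raise KeyError(key)
            return default
        current = current[key]
    return current
```

The loop keeps a single cursor into the nested structure, so the lookup needs no recursive calls however long the key path is.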
2021-05-28 | [enh] add settings option to enable/disable search formats | Markus Heiser
Access to formats can be denied by settings configuration::
    search:
      formats: [html, csv, json, rss]
Closes: https://github.com/searxng/searxng/issues/95
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
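A check against such a setting might look like the sketch below. The function `is_format_allowed`, the plain-dict settings object and the fallback to `["html"]` are all assumptions made for illustration; the real webapp code may enforce the option differently.

```python
def is_format_allowed(requested_format, settings):
    """Return True if `requested_format` is listed under search.formats."""
    # Assumption: when the option is missing, only "html" is allowed.
    allowed = settings.get("search", {}).get("formats", ["html"])
    return requested_format in allowed


# Example settings mirroring the YAML snippet from the commit message.
example_settings = {"search": {"formats": ["html", "csv", "json", "rss"]}}
```

A request for a format that is not in the list (for example `json` when only `html` is configured) would then be denied.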
2021-04-10 | [enh] replace requests by httpx | Alexandre Flament
2020-12-20 | [fix] pylint: use "raise ... from ..." | Alexandre Flament
2020-12-03 | [mod] bing_news: use eval_xpath_getindex | Alexandre Flament
remove unused function searx.utils.list_get
2020-12-03 | [enh] record details exception per engine | Alexandre Flament
add a new API /stats/errors
2020-12-01 | [mod] pylint: numerous minor code fixes | Alexandre Flament
2020-11-14 | [mod] remove unused import | Alexandre Flament
use
    from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url  # NOQA
so it is possible to easily remove all unused imports using autoflake:
    autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-10-28 | [mod] duckduckgo_definitions: display only user friendly attributes / URL | Alexandre Flament
various bug fixes
2020-10-07 | [mod] Add searx.data module | Alexandre Flament
Instead of loading the data/*.json files in different locations, load these files in the new searx.data module.
2020-10-06 | [fix] drop Python 2: use importlib instead of imp.load_source | Alexandre Flament
imp.load_source is not documented in Python 3.
See documentation: https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly
Partial fix of https://github.com/searx/searx/issues/1674
2020-10-03 | [mod] searx.utils.normalize_url: remove Yahoo hack | Alexandre Flament
* The hack for Yahoo URLs is not necessary anymore. (see searx.engines.yahoo.parse_url)
* move the URL normalization in extract_url to normalize_url
2020-10-02 | [mod] searx/utils.py: add docstring | Alexandre Flament
2020-10-02 | [mod] move extract_text, extract_url to searx.utils | Alexandre Flament
2020-09-22 | [mod] add searx/webutils.py | Alexandre Flament
contains utility functions and classes used only by webapp.py