Age | Commit message | Author |
|
The request function should not request a language (aka locale) that is not
supported by qwant. Selecting a locale like zh-TW ends in qwant's API error:
ERROR searx.engines.qwant news: exception : \
API error::locale must be one of the following values: \
en_gb, en_ie, en_us, en_ca, en_my, en_au, en_nz, de_de, de_ch, de_at, fr_fr, \
fr_be, fr_ch, fr_ca, fr_ad, fc_ca, co_fr, es_es, es_ar, es_cl, es_co, es_mx, \
es_pe, es_ad, ca_es, ca_ad, ca_fr, eu_es, eu_fr, it_it, it_ch, pt_pt, pt_ad, \
nl_be, nl_nl
The existing searx.utils.match_language function is unsuitable for this purpose;
it is replaced by the function searx.locales.get_engine_locale, which is based on
methods from the babel package.
The qwant _fetch_supported_languages function has been revised to filter out
languages (aka locales) not supported by qwant.
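The filtering idea can be sketched as follows. This is a minimal illustration, not the engine's actual code: `QWANT_LOCALES` is only a subset of the values from the error message above, and the function name `filter_supported` is hypothetical.

```python
# Subset of the locale values listed in qwant's error message above.
QWANT_LOCALES = {"en_gb", "en_us", "de_de", "fr_fr", "fr_be", "nl_be", "nl_nl"}


def filter_supported(candidates):
    """Keep only SearXNG-style tags (e.g. 'fr-BE') that qwant accepts.

    Returns a dict that maps the SearXNG tag to qwant's spelling ('fr_be').
    The name and the return shape are assumptions made for this sketch.
    """
    supported = {}
    for tag in candidates:
        qwant_tag = tag.replace("-", "_").lower()
        if qwant_tag in QWANT_LOCALES:
            supported[tag] = qwant_tag
    return supported


print(filter_supported(["fr-BE", "zh-TW", "en-US"]))
```

With this kind of filter, a tag such as zh-TW never reaches the request function, so the API error shown above cannot be triggered by it.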
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
The match_language function sometimes returns incorrect results which is why a
new function get_engine_locale is required.
Fixing match_language is not easily possible, because there is almost no
documentation for it and even its call parameters are undefined. E.g. the
function processes values like the ones from yahoo::
"yahoo": [
"ar",
...
"zh_chs",
"zh_cht"
]
get_engine_locale has been documented in detail: there is a clear
description of the assumptions as well as the requirements and approximation
rules (read the doc-string for more details)::
Argument ``engine_locales`` is a python dict that maps *SearXNG locales* to
corresponding *engine locales*:
<engine>: {
# SearXNG string : engine-string
'ca-ES' : 'ca_ES',
'fr-BE' : 'fr_BE',
'fr-CA' : 'fr_CA',
'fr-CH' : 'fr_CH',
'fr' : 'fr_FR',
...
'pl-PL' : 'pl_PL',
'pt-PT' : 'pt_PT'
}
.. hint::
The *SearXNG locale* string has to be known by babel!
In the following you will find a comparison:
>>> import babel.languages
>>> from searx.utils import match_language
>>> from searx.locales import get_engine_locale
Assume we have an engine that supports the following locales:
>>> lang_list = {
... "zh-CN": "zh_CN",
... "zh-HK": "zh_HK",
... "nl-BE": "nl_BE",
... "fr-CA": "fr_CA",
... }
Assumptions:
A. When a user selects a language the results should be optimized according to
the selected language.
B. When user selects a language and a territory the results should be
optimized with first priority on territory and second on language.
----
Example: (Assumption A.)
A user selects region 'zh-TW' which should end in zh_HK
hint:
CN is 'Hans' and HK ('Hant') fits better to TW ('Hant')
>>> get_engine_locale('zh-TW', lang_list)
'zh_HK'
>>> lang_list[match_language('zh-TW', lang_list)]
'zh_CN'
----
Example: (Assumption A.)
A user selects only the language 'zh' which should end in CN
>>> get_engine_locale('zh', lang_list)
'zh_CN'
>>> lang_list[match_language('zh', lang_list)]
'zh_CN'
----
Example: (Assumption B.)
A user selects region 'fr-BE' which should end in nl-BE
hint:
priority should be on the territory the user selected. If the user
prefers 'fr' he will select 'fr' without a region tag.
>>> get_engine_locale('fr-BE', lang_list, default='unknown')
'nl_BE'
>>> match_language('fr-BE', lang_list, fallback='unknown')
'fr-CA'
----
Example: (Assumption A.)
A user selects only the language 'fr' which should end in fr_CA
>>> get_engine_locale('fr', lang_list)
'fr_CA'
>>> lang_list[match_language('fr', lang_list)]
'fr_CA'
----
The difference in priority on the territory is best shown with an engine that
supports the following locales:
>>> lang_list = {
... "fr-FR": "fr_FR",
... "fr-CA": "fr_CA",
... "en-GB": "en_GB",
... "nl-BE": "nl_BE",
... }
----
Example: (Assumption A.)
A user selects only a language
>>> get_engine_locale('en', lang_list)
'en_GB'
>>> match_language('en', lang_list)
'en-GB'
hint: the engine supports fr_FR and fr_CA; since no territory is given, fr_FR
takes priority.
>>> get_engine_locale('fr', lang_list)
'fr_FR'
>>> lang_list[match_language('fr', lang_list)]
'fr_FR'
----
Example: (Assumption B.)
A user selects region 'fr-BE' which should end in nl-BE
>>> get_engine_locale('fr-BE', lang_list)
'nl_BE'
>>> lang_list[match_language('fr-BE', lang_list)]
'fr_FR'
----
If the user selects a language and there are two locales like the following:
>>> lang_list = {
... "fr-BE": "fr_BE",
... "fr-CH": "fr_CH",
... }
>>>
>>> get_engine_locale('fr', lang_list)
'fr_BE'
>>> lang_list[match_language('fr', lang_list)]
'fr_BE'
Looks like both functions return the same value, but match_language depends on the
order of the dictionary (which is not predictable):
>>> lang_list = {
... "fr-CH": "fr_CH",
... "fr-BE": "fr_BE",
... }
>>> get_engine_locale('fr', lang_list)
'fr_BE'
>>> lang_list[match_language('fr', lang_list)]
'fr_CH'
>>>
get_engine_locale selects the locale by looking at the "population percent",
and this percentage is higher in BE (68.%) than in CH (21%).
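The order dependence described above can be illustrated with a small sketch. It does not reproduce either real function: `first_match` only imitates the dictionary-order behaviour, `best_match` imitates a selection by weight, and `POPULATION_WEIGHT` holds hypothetical placeholder values rather than data read from babel.

```python
# Hypothetical weights standing in for babel's "population percent";
# placeholders for illustration only, not values taken from babel.
POPULATION_WEIGHT = {"BE": 2.0, "CH": 1.0}


def _split(tag):
    """Split 'fr-BE' into ('fr', 'BE'); a bare 'fr' yields ('fr', '')."""
    language, _, territory = tag.partition("-")
    return language, territory


def first_match(language, lang_list):
    """Order dependent: return the first entry whose language part matches."""
    for tag, engine_tag in lang_list.items():
        if _split(tag)[0] == language:
            return engine_tag
    return None


def best_match(language, lang_list, weights=POPULATION_WEIGHT):
    """Order independent: pick the matching territory with the largest weight."""
    candidates = [tag for tag in lang_list if _split(tag)[0] == language]
    if not candidates:
        return None
    best = max(candidates, key=lambda tag: weights.get(_split(tag)[1], 0.0))
    return lang_list[best]
```

Calling both functions with `{"fr-CH": "fr_CH", "fr-BE": "fr_BE"}` and with the two entries swapped shows the difference: `first_match` follows the insertion order, while `best_match` returns the same entry either way.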
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
By using the new property `qwant_categ:` the category of qwant is no longer bound to
the category of SearXNG.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
[fix] improve OpenSearch description
|
|
Add neeva engine
|
|
Add custom time_range_url and time_range_map
Set soft_max_redirects = 2 to prevent "ErrorContext('searx/search/processors/online.py', 116, 'count_error(', None, '2 redirects, maximum: 0', ('200', 'OK', 'neeva.com')) True"
|
|
Neeva is "the world's first ad-free, private search engine" and uses data from Apple, Bing, Yelp and "others".
They claim to crawl "hundreds of millions" of URLs a day (https://twitter.com/Neeva/status/1536447373903335426).
|
|
f2997bfa - 2022-08-12 - Markus Heiser <markus.heiser@darmarit.de>
eeca674f - 2022-08-10 - Edrean Ernst <edrean@allesbeste.com>
7478de6a - 2022-08-11 - Markus Heiser <markus.heiser@darmarit.de>
c4fb9110 - 2022-08-07 - wordpure <wordlesspure@gmail.com>
a5b432e2 - 2022-08-11 - Markus Heiser <markus.heiser@darmarit.de>
eb01d415 - 2022-08-09 - Markus Heiser <markus.heiser@darmarit.de>
f96eb06e - 2022-08-11 - Shopimisrel <shopisrael12@gmail.com>
e7c79191 - 2022-08-08 - ajnasaboobacker <ajnasaboobacker@gmail.com>
f4dbd424 - 2022-08-08 - ajnasaboobacker <ajnasaboobacker@gmail.com>
|
|
Some HTTP clients have issues with the ``opensearch.xml`` from SearXNG
(related [1][2]) while other OpenSearch descriptions[3] (e.g. from qwant) work
flawlessly.
Inspired by the OpenSearch description from qwant and with information from the
specification[4] the ``opensearch.xml`` has been *improved*.
- convert `<Url>` methods from lower case to upper case (`POST`|`GET`)
- add `<moz:SearchForm>` and `xmlns:moz="http://www.mozilla.org/2006/browser/search/"`
- add `<Query role="example" searchTerms="SearXNG" />` [4]
OpenSearch description documents should include at least one Query element of
`role="example"` that is expected to return search results. Search clients may
use this example query to validate that the search engine is working properly.
- modified `<LongName>` to SearXNG
- modified `<Description>`: the word 'hackable' scares uninitiated users and was removed
- add the `type="image/png"` to `<Image>`
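The elements listed above can be sketched with Python's standard library. This is only an illustration of the resulting document shape, not the template SearXNG ships: the function name `build_description`, the `search` path and the `favicon.png` file name are assumptions.

```python
import xml.etree.ElementTree as ET

OS_NS = "http://a9.com/-/spec/opensearch/1.1/"
MOZ_NS = "http://www.mozilla.org/2006/browser/search/"

# Serialize OpenSearch elements without a prefix and Mozilla ones as "moz:".
ET.register_namespace("", OS_NS)
ET.register_namespace("moz", MOZ_NS)


def build_description(base_url, method="get"):
    """Return an OpenSearch description as a string (hypothetical helper)."""
    root = ET.Element(f"{{{OS_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OS_NS}}}ShortName").text = "SearXNG"
    ET.SubElement(root, f"{{{OS_NS}}}LongName").text = "SearXNG"
    # <Image> carries an explicit MIME type; the file name is an assumption.
    image = ET.SubElement(root, f"{{{OS_NS}}}Image", {"type": "image/png"})
    image.text = base_url + "favicon.png"
    # Example query that clients may use to validate the engine.
    ET.SubElement(
        root, f"{{{OS_NS}}}Query", {"role": "example", "searchTerms": "SearXNG"}
    )
    # The method is always written in upper case (GET or POST).
    ET.SubElement(
        root,
        f"{{{OS_NS}}}Url",
        {
            "type": "text/html",
            "method": method.upper(),
            "template": base_url + "search?q={searchTerms}",
        },
    )
    ET.SubElement(root, f"{{{MOZ_NS}}}SearchForm").text = base_url + "search"
    return ET.tostring(root, encoding="unicode")


print(build_description("http://127.0.0.1:8888/"))
```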
Test can be done by::
make run
Visit http://127.0.0.1:8888/ and add the search engine to your web browser, then
test with different web browsers on desktop and smartphones (if there are any iOS
users here, please test on Safari and Chrome).
[1] https://app.element.io/#/room/#searxng:matrix.org/$xN_abdKhNqUlgXRBrb_9F3pqOxnSzGQ1TG0s0G9hQVw
[2] https://github.com/searxng/searxng/issues/431
[3] https://developer.mozilla.org/en-US/docs/Web/OpenSearch
[4] https://github.com/dewitt/opensearch/blob/master/opensearch-1-1-draft-6.md#the-query-element
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
./manage pyenv.cmd python ./searxng_extra/update/update_engine_descriptions.py
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
output format protobuf to HTML for google mobile
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
This reverts the changes made to the Google results XPath in PR #1633.
|
|
It seems Google rolls out changes first on the `google.com` domain and later on the
"language" domains. For example: yesterday [1] `google.com` did not work but
`google.de` and `google.fr` did; today they no longer work either and this
fix is needed on all domains.
Closes: https://github.com/searxng/searxng/issues/1628
[1] https://github.com/searxng/searxng/issues/1628#issuecomment-1208191816
|
|
[enh] Initial Marginalia.nu support (foss)
|
|
[mod] engine yep.com: show all 100 results yep.com has
|
|
Currently it uses a public api_key `/public/` [1]
The 'index' parameter selects the search index, corresponding to the drop down
next to the search field in the main GUI.
0: popular
1: blogs
2: big_sites
3: default
4: experimental
'experimental' is more up to date and does not exclude other sites, which is the
case with 'big sites' or 'blogs'.
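The index numbering above can be sketched as a small lookup used when building a request URL. The host and the `public` key come from the message; the exact path layout, the `count` parameter and the function name `search_url` are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode

# Index names and numbers as listed in the commit message above.
INDEX_BY_NAME = {
    "popular": 0,
    "blogs": 1,
    "big_sites": 2,
    "default": 3,
    "experimental": 4,
}


def search_url(query, index="experimental", api_key="public", count=20):
    """Build a search URL; the path layout is an assumption, not verified."""
    params = urlencode({"index": INDEX_BY_NAME[index], "count": count})
    return (
        f"https://api.marginalia.nu/{api_key}/search/"
        f"{quote(query, safe='')}?{params}"
    )


print(search_url("json parser"))
```

Defaulting to `experimental` follows the reasoning in the message: that index is more up to date and does not exclude other sites.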
[1] https://api.marginalia.nu/
[2] https://git.marginalia.nu/marginalia/marginalia.nu
[3] https://news.ycombinator.com/item?id=31536626
Closes: https://github.com/searxng/searxng/issues/1620
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
yep.com is still in beta, the api.yep.com does not have paging support. There
is only a 'limit' argument with a maximum of 100 results.
yep.com seems fast; there is no need for a timeout of 12 sec.
The API returns JSON regardless of the HTTP header; the "show more"
button on yep.com's web site does not set a special HTTP Accept header.
FYI: The index does not support languages, the web UI does not offer a language
selection of the results and the entire index seems to be in English.
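Since there is no paging, a request boils down to a single call with a capped `limit`. A minimal sketch, assuming a query parameter named `q` (only `limit` and its maximum of 100 are taken from the message; the function name `yep_params` is hypothetical):

```python
MAX_RESULTS = 100  # upper bound of the 'limit' argument, per the message above


def yep_params(query, limit=MAX_RESULTS):
    """Build parameters for one unpaged request; 'q' is an assumed name."""
    return {"q": query, "limit": max(1, min(limit, MAX_RESULTS))}


print(yep_params("searxng", limit=500))
```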
Closes: https://github.com/searxng/searxng/issues/1619
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
acf8bd39 - 2022-08-05 - Markus Heiser <markus.heiser@darmarit.de>
4ad75b6e - 2022-08-04 - Markus Heiser <markus.heiser@darmarit.de>
ee8cbee6 - 2022-07-31 - Markus Heiser <markus.heiser@darmarit.de>
87c19313 - 2022-08-01 - Academic tyro <y13593582403@gmail.com>
cbe0de32 - 2022-07-30 - Markus Heiser <markus.heiser@darmarit.de>
45029a17 - 2022-08-04 - Markus Heiser <markus.heiser@darmarit.de>
6eec3795 - 2022-08-03 - Markus Heiser <markus.heiser@darmarit.de>
f8d8f31f - 2022-07-29 - Markus Heiser <markus.heiser@darmarit.de>
b3fb365f - 2022-07-29 - Markus Heiser <markus.heiser@darmarit.de>
aaeabbc9 - 2022-08-03 - Lakatos Tamás <tomimost@gmail.com>
6c71c501 - 2022-08-03 - Markus Heiser <markus.heiser@darmarit.de>
f7b5ba19 - 2022-08-01 - Markus Heiser <markus.heiser@darmarit.de>
850e7fa0 - 2022-08-04 - Mico Hautaluoma <m@mha.fi>
0cb696fc - 2022-07-31 - Markus Heiser <markus.heiser@darmarit.de>
04c3785f - 2022-08-02 - Markus Heiser <markus.heiser@darmarit.de>
b500f2ad - 2022-08-01 - Edrean Ernst <edrean@allesbeste.com>
0b576b83 - 2022-08-01 - GooGuJiang <gu@gmoe.cc>
0adeb6e2 - 2022-08-01 - Edrean Ernst <edrean@allesbeste.com>
0b025f17 - 2022-07-31 - PRATYAY MUSTAFI <pratyaymustafi@gmail.com>
|
|
Most engines that support languages (and regions) use the Accept-Language from
the WEB browser to build a response that fits to the language (and region).
- add new engine option: send_accept_language_header
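What such a header could look like can be sketched as follows. The helper name `accept_language` and the q-values are assumptions for illustration and are not taken from the commit itself.

```python
def accept_language(locale_tag):
    """Build an Accept-Language value from a tag like 'fr-BE' or 'de'.

    With a territory, the full tag is preferred, then the bare language,
    then anything else; the weights used here are illustrative.
    """
    language, _, territory = locale_tag.partition("-")
    if territory:
        return f"{language}-{territory},{language};q=0.9,*;q=0.5"
    return language


headers = {"Accept-Language": accept_language("fr-BE")}
print(headers)
```

An engine with the option enabled would then send this header along with its request instead of encoding the language in a URL parameter.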
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
The errors make pyright usage useless since a new error won't be seen [1].
[1] https://github.com/searxng/searxng/pull/1569
```
searx/compat.py:11:27 - error: Expression of type "Type[cached_property[_T@cached_property]]" cannot be assigned to declared type "Type[cached_property]"
"Type[cached_property[_T@cached_property]]" is incompatible with "Type[cached_property]"
Type "Type[cached_property[_T@cached_property]]" cannot be assigned to type "Type[cached_property]" (reportGeneralTypeIssues)
searx/utils.py:69:36 - error: Expression of type "None" cannot be assigned to parameter of type "str"
Type "None" cannot be assigned to type "str" (reportGeneralTypeIssues)
searx/utils.py:573:85 - error: Expression of type "None" cannot be assigned to parameter of type "int"
Type "None" cannot be assigned to type "int" (reportGeneralTypeIssues)
searx/webapp.py:1306:22 - error: Argument of type "str" cannot be assigned to parameter "__a" of type "BytesPath" in function "join"
Type "str" cannot be assigned to type "BytesPath"
"str" is incompatible with "bytes"
"str" is incompatible with protocol "PathLike[bytes]"
"__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:68 - error: Argument of type "Literal['themes']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
Type "Literal['themes']" cannot be assigned to type "BytesPath"
"Literal['themes']" is incompatible with "bytes"
"Literal['themes']" is incompatible with protocol "PathLike[bytes]"
"__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:78 - error: Argument of type "str | Any | None" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
Type "str | Any | None" cannot be assigned to type "BytesPath"
Type "str" cannot be assigned to type "BytesPath"
"str" is incompatible with "bytes"
"str" is incompatible with protocol "PathLike[bytes]"
"__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:85 - error: Argument of type "Literal['img']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
Type "Literal['img']" cannot be assigned to type "BytesPath"
"Literal['img']" is incompatible with "bytes"
"Literal['img']" is incompatible with protocol "PathLike[bytes]"
"__fspath__" is not present (reportGeneralTypeIssues)
searx/engines/mongodb.py:8:6 - warning: Import "pymongo" could not be resolved (reportMissingImports)
searx/engines/mysql_server.py:9:8 - warning: Import "mysql.connector" could not be resolved (reportMissingImports)
searx/engines/postgresql.py:9:8 - warning: Import "psycopg2" could not be resolved from source (reportMissingModuleSource)
searx/engines/xpath.py:187:28 - warning: "categories" is not defined (reportUndefinedVariable)
searx/search/__init__.py:184:82 - warning: "flask" is not defined (reportUndefinedVariable)
searx/search/checker/background.py:19:26 - error: Type of "schedule" is partially unknown
Type of "schedule" is "(delay: Any, func: Any, *args: Any) -> Literal[True]" (reportUnknownVariableType)
searx/shared/__init__.py:8:12 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
searx/shared/shared_uwsgi.py:5:8 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
```
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
c0c9107c - 2022-07-27 - Sangha Lee <totoriato@gmail.com>
5b48bce6 - 2022-07-24 - Linerly <linerly@protonmail.com>
79669e65 - 2022-07-29 - Markus Heiser <markus.heiser@darmarit.de>
520e9284 - 2022-07-25 - Markus Heiser <markus.heiser@darmarit.de>
7cf52ff5 - 2022-07-25 - Markus Heiser <markus.heiser@darmarit.de>
9d3ebe72 - 2022-07-24 - Markus Heiser <markus.heiser@darmarit.de>
2d03c097 - 2022-07-24 - Markus Heiser <markus.heiser@darmarit.de>
388af012 - 2022-07-27 - Markus Heiser <markus.heiser@darmarit.de>
a4bcf098 - 2022-07-25 - Miguel Silva <miguelcabeca.dev@gmail.com>
93fd0b72 - 2022-07-27 - Markus Heiser <markus.heiser@darmarit.de>
8f68b206 - 2022-07-26 - tents <remendne@pentrens.jp>
9007c99c - 2022-07-24 - Markus Heiser <markus.heiser@darmarit.de>
aeec96f2 - 2022-07-26 - Matija Kromar <matija.kromar@gmail.com>
69084863 - 2022-07-25 - Markus Heiser <markus.heiser@darmarit.de>
b48190ab - 2022-07-24 - alexfs2015 <alex04fs@gmail.com>
b6bbc0a5 - 2022-07-23 - Markus Heiser <markus.heiser@darmarit.de>
1a503806 - 2022-07-29 - Markus Heiser <markus.heiser@darmarit.de>
c960cb93 - 2022-07-27 - Markus Heiser <markus.heiser@darmarit.de>
8a2bd34b - 2022-07-25 - Markus Heiser <markus.heiser@darmarit.de>
1064cea0 - 2022-07-23 - LagManCZ <lagmen@post.cz>
67423045 - 2022-07-24 - alexfs2015 <alex04fs@gmail.com>
56c87fda - 2022-07-24 - Markus Heiser <markus.heiser@darmarit.de>
36a64f1c - 2022-07-24 - Ankit Gupta <guptaa.ankitt@gmail.com>
|
|
Update searx.data - update_engine_descriptions.py
|
|
Update searx.data - update_currencies.py
|
|
Update searx.data - update_firefox_version.py
|
|
Update searx.data - update_languages.py
|
|
This reverts commit 747cf1a246df587aeb3b6b175c315ef0b9612dc4.
|
|
This reverts part of commit https://github.com/searxng/searxng/commit/5fb2071cb2248c0f0ada7affb0c47f841ddbf102
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
This changes the previous bypass method for Google consent, which used
``ucbcb=1`` (6face215b8), to accept the consent using ``CONSENT=YES+``.
The youtube_noapi and google engines have a similar API, at least for the consent[1].
Get the CONSENT cookie from a google request::
curl -i "https://www.google.com/search?q=time&tbm=isch" \
-A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \
| grep -i consent
...
location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1
set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure
...
PENDING & YES [2]:
Google change the way for consent about YouTube cookies agreement in EU
countries. Instead of showing a popup in the website, YouTube redirects the
user to a new webpage at consent.youtube.com domain ... Fix for this is to
put a cookie CONSENT with YES+ value for every YouTube request
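Sending the cookie can be sketched with the standard library alone. This is a minimal illustration that only prepares a request and does not send it; the helper name `google_request` is hypothetical and this is not the engine's code.

```python
import urllib.request


def google_request(url):
    """Prepare a request that carries the CONSENT=YES+ cookie."""
    req = urllib.request.Request(url)
    req.add_header("Cookie", "CONSENT=YES+")
    return req


req = google_request("https://www.google.com/search?q=time&tbm=isch")
print(req.get_header("Cookie"))
```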
[1] https://github.com/iv-org/invidious/pull/2207
[2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592
Closes: https://github.com/searxng/searxng/issues/1432
|
|
The engine name is not only a *name*, it is also an identifier that is used in
logs, HTTP headers and more. Unicode characters in the name of an engine could
cause various issues.
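A check of this kind can be sketched in a few lines. The function name `is_valid_engine_name` and the exact rule (non-empty, printable ASCII) are assumptions for illustration; the actual validation may differ.

```python
def is_valid_engine_name(name):
    """Accept only non-empty names made of printable ASCII characters."""
    return bool(name) and name.isascii() and name.isprintable()


print(is_valid_engine_name("qwant news"))
print(is_valid_engine_name("caf\u00e9 search"))
```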
Closes: https://github.com/searxng/searxng/issues/1544
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Closes: https://github.com/searxng/searxng/issues/1449
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
cf6e9482 - 2022-07-19 - Linerly <linerly@protonmail.com>
918c1bfe - 2022-07-20 - Markus Heiser <markus.heiser@darmarit.de>
4e65ecf6 - 2022-07-21 - calb sepherus <calb.sepherus@protonmail.com>
a54be8fe - 2022-07-19 - Markus Heiser <markus.heiser@darmarit.de>
cad6cb2f - 2022-07-19 - Markus Heiser <markus.heiser@darmarit.de>
a6bd1170 - 2022-07-19 - Markus Heiser <markus.heiser@darmarit.de>
9d0e8754 - 2022-07-19 - Markus Heiser <markus.heiser@darmarit.de>
|
|
Fix missing option value "0".
|
|
Update translations
|