Age | Commit message | Author |
|
|
|
Many things have changed since the last review of this engine. This patch fixes
the xpath selectors, implements suggestions and is a complete review / rewrite
of the engine.
Signed-off-by: Markus Heiser <markus@darmarit.de>
|
|
|
|
Update searx.data - update_languages.py
|
|
When initializing engines, a "SearxEngineResponseException" is logged very
verbosely, including full traceback information:
ERROR:searx.engines:yggtorrent engine: Fail to initialize
Traceback (most recent call last):
File "share/searx/searx/engines/__init__.py", line 293, in engine_init
init_fn(get_engine_from_settings(engine_name))
File "share/searx/searx/engines/yggtorrent.py", line 42, in init
resp = http_get(url, allow_redirects=False)
File "share/searx/searx/poolrequests.py", line 197, in get
return request('get', url, **kwargs)
File "share/searx/searx/poolrequests.py", line 190, in request
raise_for_httperror(response)
File "share/searx/searx/raise_for_httperror.py", line 60, in raise_for_httperror
raise_for_captcha(resp)
File "share/searx/searx/raise_for_httperror.py", line 43, in raise_for_captcha
raise_for_cloudflare_captcha(resp)
File "share/searx/searx/raise_for_httperror.py", line 30, in raise_for_cloudflare_captcha
raise SearxEngineCaptchaException(message='Cloudflare CAPTCHA', suspended_time=3600 * 24 * 15)
searx.exceptions.SearxEngineCaptchaException: Cloudflare CAPTCHA, suspended_time=1296000
For SearxEngineResponseException this is not needed. Exceptions of this type
can occur in normal operation, e.g. for CAPTCHA errors like the one shown in
the example above. It should be enough to log a warning for such issues:
WARNING:searx.engines:yggtorrent engine: Fail to initialize // Cloudflare CAPTCHA, suspended_time=1296000
closes: #2612
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
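The logging behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual searx code: the exception class is a local stand-in for the searx exception of the same name, and `init_engine` is a hypothetical helper.

```python
import logging

logger = logging.getLogger("searx.engines")


class SearxEngineResponseException(Exception):
    """Local stand-in for the searx exception of the same name (assumption)."""


def init_engine(engine_name, init_fn):
    """Run init_fn and log failures; return True on success, False otherwise.

    Hypothetical helper, only meant to show the two logging levels.
    """
    try:
        init_fn()
    except SearxEngineResponseException as exc:
        # Expected kind of failure (e.g. a CAPTCHA): one-line warning, no traceback.
        logger.warning("%s engine: Fail to initialize // %s", engine_name, exc)
        return False
    except Exception:
        # Unexpected failure: keep the full traceback in the log.
        logger.exception("%s engine: Fail to initialize", engine_name)
        return False
    return True
```

The more specific `except` clause has to come first; otherwise the generic handler would also catch the response exceptions and print the traceback again.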
|
|
Update searx.data - update_wikidata_units.py
|
|
Update searx.data - update_ahmia_blacklist.py
|
|
Update autocomplete
|
|
|
|
The old xpath configuration for google scholar did not work and is replaced by a
python implementation.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
[py2to3] use unittest from py3, remove unittest2 from py2
|
|
Update searx.data - firefox_version
|
|
Fix fetch_languages for Bing
|
|
Add freesound engine with player.
Co-authored-by: Gazoil <maildeguzel@gmail.com>
|
|
- unittest2 is a backport of the new features added to the unittest testing
framework in Python 2.7
- unittest2 was only needed in py2 and can be dropped now
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Update searx.data - currencies
|
|
Update searx.data - wikidata_units
|
|
Bing has a list of regions that it supports and some of these regions
may have more than one possible language.
In some cases, like Switzerland, these languages are always shown as
options, so there is no issue. But in other cases, like Andorra, Bing
will only show one language at a time, either the region's default or
the request's language if the latter is supported by that region.
For example, if the HTTP request is in French, Andorra will appear as
fr-AD but if the same page is requested in any other language Andorra
will appear as ca-AD.
This is especially a problem when Bing assumes that the request is in
English because it overrides enough language codes to make several major
languages like Arabic disappear from the languages.py file.
To avoid that issue, I set the Accept-Language header to a language
that's only supported in one region to hopefully avoid these overrides.
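A minimal sketch of the idea, assuming the language list is fetched with a plain HTTP request. The URL and the language code below are placeholders (the message does not name the language that was chosen), and `build_languages_request` is a hypothetical helper that only builds the request without sending it.

```python
import urllib.request

# Placeholders, not taken from the commit:
LANGUAGES_URL = "https://example.org/supported-languages"
SINGLE_REGION_LANGUAGE = "xx"  # stands for a language offered in one region only


def build_languages_request(url=LANGUAGES_URL, language=SINGLE_REGION_LANGUAGE):
    """Build (but do not send) a request with a fixed Accept-Language header."""
    return urllib.request.Request(url, headers={"Accept-Language": language})


request = build_languages_request()
# urllib normalizes header names with str.capitalize(), hence the lower-case "l".
print(request.get_header("Accept-language"))
```

Pinning the header makes the fetched list independent of whatever default language the server would otherwise assume for the request.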
|
|
Based on duckduckgo bangs
Store bangs in a trie to allow autocomplete (not in this commit)
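Why a trie helps with autocomplete can be shown with a small sketch. This is not the data layout used by the commit: `BangTrie`, its end marker and the example URLs are assumptions made for illustration only.

```python
class BangTrie:
    """Minimal character trie mapping bang names to target URLs (illustrative)."""

    # Key under which a node stores its value; assumed never to occur in a bang name.
    _END = "\0"

    def __init__(self):
        self.root = {}

    def insert(self, bang, url):
        node = self.root
        for char in bang:
            node = node.setdefault(char, {})
        node[self._END] = url

    def get(self, bang):
        """Return the URL stored for an exact bang name, or None."""
        node = self.root
        for char in bang:
            node = node.get(char)
            if node is None:
                return None
        return node.get(self._END)

    def complete(self, prefix):
        """Return all stored bang names that start with prefix, sorted."""
        node = self.root
        for char in prefix:
            node = node.get(char)
            if node is None:
                return []
        found = []
        stack = [(prefix, node)]
        while stack:
            name, current = stack.pop()
            for key, child in current.items():
                if key == self._END:
                    found.append(name)
                else:
                    stack.append((name + key, child))
        return sorted(found)


trie = BangTrie()
trie.insert("g", "https://example.org/g?q={query}")
trie.insert("gh", "https://example.org/gh?q={query}")
print(trie.complete("g"))
```

An exact lookup and a prefix completion both walk the same path from the root, so completion costs only the extra traversal of the subtree below the prefix.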
|
|
[mod] update wikidata_units.json and fetch_wikidata_units.py
|
|
Use a SPARQL request on Wikidata to get the list of currencies.
currencies.json contains the translations for all supported searx languages.
Supersede #993
|
|
The fetch_wikidata_units.py result won't change randomly.
See comments in the script.
|
|
|
|
Update rumble.py
some lines too long.
Disable Rumble engine
disabled : True
PEP8 fix
change line spacing
|
|
update yggtorrent url + add it back
|
|
|
|
At the moment, videos without a description are not shown; setting the
default content to "" fixes this.
Another current bug is that thumbnails are not displayed. This is caused
by a double slash in the URL. To fix this, every trailing slash is now
stripped (for backwards compatibility) and the API response is parsed
correctly.
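Both fixes can be illustrated with a short sketch. The field names (`name`, `description`, `thumbnailPath`) and the helper functions are assumptions for the example and are not taken from the engine's code.

```python
def normalize_base_url(base_url):
    """Strip trailing slashes so that joining with '/path' yields no '//'."""
    return base_url.rstrip("/")


def build_result(base_url, video):
    """Map one API item (a dict) to a result dict; field names are hypothetical."""
    base_url = normalize_base_url(base_url)
    return {
        "title": video.get("name", ""),
        # A missing or null description falls back to an empty string.
        "content": video.get("description") or "",
        "thumbnail": base_url + video.get("thumbnailPath", ""),
    }


item = {"name": "demo", "description": None, "thumbnailPath": "/static/t.jpg"}
print(build_result("https://example.org/", item))
```

Using `or ""` instead of a `.get()` default also covers the case where the key is present but its value is null.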
|
|
[remove] yandex engine
|
|
* searx understands "!ddg !g time" as: send "!g time" to DDG
* !g is a DDG bang for Google: DDG returns an HTTP redirect to Google
This commit adds the allow_redirects param so that HTTP redirects are not followed.
The DDG engine returns an empty result as before, but without following the HTTP redirect.
|
|
[mod] json_engine: add content_html_to_text and title_html_to_text
|
|
[upd] wikipedia engine: return an empty result on query with illegal characters
|
|
[fix] fix seznam engine
|
|
Fix duckduckgo
|
|
Fix: activate raise_for_error by default
|
|
[enh] add engine MediathekViewWeb (API)
|
|
|
|
no paging support
|
|
On some queries (like an IT error message), Wikipedia returns an HTTP error 400.
This commit returns an empty result instead of showing an error to the user.
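A minimal sketch of that behaviour, assuming a requests-style response object with a `status_code` attribute and a `json()` method. `parse_response` and the `title` field are hypothetical and do not reproduce the engine's real parsing.

```python
from types import SimpleNamespace


def parse_response(resp):
    """Return a list of results; an HTTP 400 response yields an empty list."""
    if resp.status_code == 400:
        # Treated as "no results" rather than as an engine error.
        return []
    return [{"title": resp.json().get("title", "")}]


# Stand-in response objects for illustration only.
bad_request = SimpleNamespace(status_code=400)
ok = SimpleNamespace(status_code=200, json=lambda: {"title": "Example"})
print(parse_response(bad_request), parse_response(ok))
```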
|
|
Some JSON APIs return HTML in either the title or the content.
This commit adds two new parameters to the json_engine:
content_html_to_text and title_html_to_text, False by default.
If True, searx.utils.html_to_text removes the HTML tags.
Update crossref, openairedatasets and openairepublications engines
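The effect of the two flags can be sketched with the standard library. `strip_tags` below is a simplified stand-in for `searx.utils.html_to_text`, not its implementation, and `extract_field` is a hypothetical helper showing how an opt-in flag might be applied.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Remove tags, decode character references and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def extract_field(value, html_to_text=False):
    """Return value unchanged unless the flag is set (False by default)."""
    return strip_tags(value) if html_to_text else value


print(extract_field("<b>Hello</b> <i>world</i>", html_to_text=True))
```

Keeping the flags off by default leaves existing engine configurations untouched; only engines whose API is known to return HTML need to opt in.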
|
|
[Engine] Add Library of Congress engine
|