Age | Commit message | Author |
|
Based on @MarcAbonce's commit on searx.
|
|
|
|
* download images using the "image_proxy" network (HTTP/1 instead of HTTP/2)
* don't cache data: URLs (reduces memory usage)
* after each test: purge the image URL cache, then call the garbage collector
* download only the first 64 KB of each image (see the sketch below)
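A minimal sketch of the partial download, using plain httpx (the real code goes
through searx's "image_proxy" network; the helper name is hypothetical)::

    import httpx

    MAX_IMAGE_BYTES = 64 * 1024   # only the first 64 KB of the image are fetched

    def fetch_image_head(url):
        """Download at most MAX_IMAGE_BYTES of the image body."""
        chunks, received = [], 0
        with httpx.stream('GET', url, timeout=10.0) as response:
            response.raise_for_status()
            for chunk in response.iter_bytes():
                chunks.append(chunk)
                received += len(chunk)
                if received >= MAX_IMAGE_BYTES:
                    break             # stop reading; leaving the context closes the connection
        return b''.join(chunks)[:MAX_IMAGE_BYTES]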
|
|
If there is no write access, there is no need for global. Remove the global
statement if there is no assignment.
global-variable-not-assigned:
  "Using global for names but no assignment is done": used when a variable is
  defined through the "global" statement but no assignment to this variable is
  done.
In Pylint 2.11 the global-variable-not-assigned checker now catches global
variables that are never reassigned in a local scope, and it also catches
(reassigned) functions [1][2].
[1] https://pylint.pycqa.org/en/latest/whatsnew/2.11.html
[2] https://github.com/PyCQA/pylint/issues/1375
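A small illustration of the rule (module and variable names are hypothetical)::

    LOG_LEVEL = 'INFO'

    def get_log_level():
        # Read-only access: a "global LOG_LEVEL" statement here would trigger
        # global-variable-not-assigned and can simply be removed.
        return LOG_LEVEL

    def set_log_level(level):
        # The global statement is only needed when the name is reassigned.
        global LOG_LEVEL
        LOG_LEVEL = level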
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
metrics & processors use the engine logger
|
|
|
|
Currently, searx.search.Search calls on_result once the engine results have been merged
(ResultContainer.order_results).
on_result plugins can rewrite the results: even if modifying the URL(s) would allow two
results to be merged, it won't happen because ResultContainer.order_results has already been called.
This commit calls on_result for each result of each engine, before the merge.
In addition, the on_result function can return False to remove the result.
Note: the on_result function now runs on the engine threads instead of the Flask thread.
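A minimal sketch of a plugin using the per-result hook (the three-argument
on_result signature follows the existing plugins; the filtering rule itself is
hypothetical)::

    name = 'Example filter'
    description = 'Rewrite http:// URLs to https:// and drop results without a URL'
    default_on = False

    def on_result(request, search, result):
        url = result.get('url')
        if not url:
            return False              # returning False removes the result
        if url.startswith('http://'):
            # rewriting happens per result, before the results are merged
            result['url'] = 'https://' + url[len('http://'):]
        return True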
|
|
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
So checker_results['status'] == 'ok' is enough to check the checker result.
See searx/webapp.py, /preferences endpoint
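A hedged sketch of the simplified check in the /preferences endpoint (the
accessor name and the 'engines' key are assumptions)::

    checker_results = checker_get_result()
    checker_results = (
        checker_results['engines'] if checker_results['status'] == 'ok' else {}
    )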
|
|
Upgrade from pylint v2.8.3 to 2.9.3 raises some new issues::
    searx/search/checker/__main__.py:37:26: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
    searx/search/checker/__main__.py:38:26: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
    searx/search/processors/__init__.py:20:0: R0402: Use 'from searx import engines' instead (consider-using-from-import)
    searx/preferences.py:182:19: C0207: Use data.split('-', maxsplit=1)[0] instead (use-maxsplit-arg)
    searx/preferences.py:506:15: R1733: Unnecessary dictionary index lookup, use 'user_setting' instead (unnecessary-dict-index-lookup)
    searx/webapp.py:436:0: C0206: Consider iterating with .items() (consider-using-dict-items)
    searx/webapp.py:950:4: C0206: Consider iterating with .items() (consider-using-dict-items)
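For illustration, the use-maxsplit-arg and consider-using-with items boil down
to changes of this kind (a hedged sketch, not the actual diff)::

    # use-maxsplit-arg: only the first field is needed, so limit the split
    lang = data.split('-', maxsplit=1)[0]     # instead of data.split('-')[0]

    # consider-using-with: let a context manager release the resource
    with open(filename, encoding='utf-8') as output_file:
        content = output_file.read()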
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Lint files that have been touched by [PR #58]
[PR #58] https://github.com/searxng/searxng/pull/58
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
It prepares the new architecture change: everything about multithreading is moved into the searx.search.* packages.
Previously the call to the "init" function of the engines was done in searx.engines:
* the network was not set (requests were not sent using the defined proxy)
* it required monkey patching the code to avoid HTTP requests during the tests
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
* display the median time instead of the average.
* add a "Reliability" column (sums up the metrics and the checker results).
* the "selected language", "SafeSearch", "Time range" values are displayed as "broken" when the checker tests fail.
|
|
Some errors won't stop the engine:
* additional HTTP redirects, for example
* some invalid results
secondary=True allows flagging these errors as not important.
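A hedged sketch of the idea (record_error and ErrorContext here are simplified
stand-ins, not the actual searx.metrics API)::

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ErrorContext:
        engine_name: str
        message: str
        secondary: bool = False   # secondary errors are reported but never suspend the engine

    ERRORS = []

    def record_error(engine_name, message, secondary=False):
        ERRORS.append(ErrorContext(engine_name, message, secondary))

    # an extra HTTP redirect is worth recording, but should not count as a hard failure
    record_error('example engine', 'additional HTTP redirects', secondary=True)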
|
|
|
|
Report suspended engines to the user.
searx.search.processor.abstract:
* manages suspend time (per network).
* reports suspended time to the ResultContainer (method extend_container_if_suspended)
* adds the results to the ResultContainer (method extend_container)
* handles exceptions (method handle_exception)
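A hedged sketch of what extend_container_if_suspended can do (attribute names
and the ResultContainer call are assumptions based on this description)::

    import time

    class EngineProcessorSketch:
        def __init__(self, engine_name):
            self.engine_name = engine_name
            self.suspend_end_time = 0   # updated by the error handling when the engine is suspended

        def extend_container_if_suspended(self, result_container) -> bool:
            # If the engine is still suspended, report it and tell the caller to skip the request.
            if self.suspend_end_time >= time.time():
                result_container.add_unresponsive_engine(self.engine_name, 'Suspended')
                return True
            return False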
|
|
settings.yml:
* outgoing.networks:
  * can contain network definitions
  * properties: enable_http, verify, http2, max_connections, max_keepalive_connections,
    keepalive_expiry, local_addresses, support_ipv4, support_ipv6, proxies, max_redirects, retries
  * retries: 0 by default, the number of times searx retries to send the HTTP request (using a different IP & proxy each time)
  * local_addresses can be "192.168.0.1/24" (IPv6 is supported too)
  * support_ipv4 & support_ipv6: both True by default
    see https://github.com/searx/searx/pull/1034
* each engine can define a "network" section:
  * either a full network description
  * or a reference to an existing network
* all HTTP requests of an engine use the same HTTP configuration (this was not the case before, see the proxy configuration in master); a configuration sketch follows below
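A hedged configuration sketch showing both forms (the network name, proxy
address and engine names are placeholders)::

    outgoing:
      networks:
        my_proxy_network:            # a named network definition
          proxies: socks5://127.0.0.1:9050
          enable_http: false
          http2: true
          retries: 1                 # retry once, using a different IP & proxy if available
          max_redirects: 5

    engines:
      - name: engine one
        network:                     # full inline network description
          http2: false
          max_connections: 10
      - name: engine two
        network: my_proxy_network    # reference to the network defined above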
|
|
|
|
* initialize engine_data (youtube engine)
* don't crash if an engine doesn't set result['url']
|
|
|
|
Related to https://github.com/searx/searx/pull/2373
|
|
|
|
Use a SPARQL request on Wikidata to get the list of currencies.
currencies.json contains the translations for all supported searx languages.
Supersedes #993
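A hedged sketch of such a fetch (the SPARQL query, the Q8142 "currency" class
and the helper name are illustrative, not the script shipped with this commit)::

    import httpx

    SPARQL_ENDPOINT = 'https://query.wikidata.org/sparql'
    QUERY = """
    SELECT ?currency ?currencyLabel ?iso4217 WHERE {
      ?currency wdt:P31 wd:Q8142 .                # instance of: currency
      OPTIONAL { ?currency wdt:P498 ?iso4217 . }  # ISO 4217 code
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    """

    def fetch_currencies():
        response = httpx.get(SPARQL_ENDPOINT, params={'query': QUERY, 'format': 'json'})
        response.raise_for_status()
        return response.json()['results']['bindings']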
|
|
* searx understands "!ddg !g time" as: send "!g time" to DDG
* !g is a DDG bang for Google: DDG returns an HTTP redirect to Google
This commit adds the allows_redirect param so that the HTTP redirect is not followed.
As before, the DDG engine returns an empty result when there is no HTTP redirect.
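A plain-httpx illustration of the idea (the real code goes through searx.network
and the new allows_redirect parameter; the function below is only illustrative
and assumes a recent httpx where the keyword is follow_redirects)::

    import httpx

    def resolve_external_bang(query):
        """Ask DDG where a bang query redirects to, without following the redirect."""
        response = httpx.get(
            'https://duckduckgo.com/',
            params={'q': query},
            follow_redirects=False,   # keep the 3xx answer instead of loading the target page
        )
        if 300 <= response.status_code < 400:
            return response.headers.get('location')
        return None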
|
|
Fix commit d703119d3a313a406482b121ee94c6afee3bc307:
Some engines need to parse the HTTP error, but
raise_for_error is always set to False in the "request" function.
|
|
|
|
pycld3 requires the native cld3 library.
langdetect is a pure Python package.
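A quick langdetect usage sketch (illustrative; the commit itself only swaps the
dependency)::

    from langdetect import DetectorFactory, detect

    DetectorFactory.seed = 0                          # make the detection deterministic
    print(detect('ceci est un texte en français'))    # expected: 'fr'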
|
|
|
|
Without this commit, the URL /stats/errors shows percentages above 100% after the checker has run.
|
|
Before this commit, even with the scheduler disabled, the checker was running
at least once for each uwsgi worker.
|
|
|
|
for each engine: replace "status" with "success"
|
|
the query "time" is convinient because most of the search engine will return some results,
but some engines in the general category will return documentation about the HTML tags <time> or <input type="time">
|
|
* output is unbuffered
* verbose mode describes the errors more precisely
|
|
See settings.yml for the options.
The SIGUSR1 signal starts the checker.
The result is available at /stats/checker.
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
from_bang is True when the user query contains a bang.
In this case the category is also set to 'none'.
The only usage of from_bang was in searx.webadapter.parse_specific:
if from_bang is True, then the EngineRef category is ignored and forced to 'none'.
This commit also removes the searx.webadapter.parse_specific function.
|
|
The categories parameter is useless in the constructor:
it is always the categories from the EngineRefs.
categories becomes a property.
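A hedged sketch of the change (class and attribute names are assumptions)::

    class EngineRef:
        def __init__(self, name, category):
            self.name = name
            self.category = category

    class SearchQuery:
        def __init__(self, query, engineref_list):
            self.query = query
            self.engineref_list = engineref_list    # no separate categories argument anymore

        @property
        def categories(self):
            # always derived from the EngineRefs
            return {engineref.category for engineref in self.engineref_list}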
|
|
|
|
see searx.search.processors.abstract.EngineProcessor
First, searx calls the get_params method.
If the return value is not None, then searx calls the search method.
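A hedged sketch of the calling convention (argument names are assumptions based
on this description)::

    processor = processors[engineref.name]                    # an EngineProcessor instance
    params = processor.get_params(search_query, engineref.category)
    if params is not None:
        # only engines that returned params actually perform the search
        processor.search(search_query.query, params, result_container, start_time, timeout_limit)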
|
|
|
|
|