Age | Commit message (Collapse) | Author |
|
|
|
[remove] yandex engine
|
|
[mod] json_engine: add content_html_to_text and title_html_to_text
|
|
[fix] fix seznam engine
|
|
[enh] add engine MediathekViewWeb (API)
|
|
|
|
no paging support
|
|
Some JSON API returns HTML in either in the HTML or the content.
This commit adds two new parameters to the json_engine:
content_html_to_text and title_html_to_text, False by default.
If True, then the searx.utils.html_to_text removes the HTML tags.
Update crossref, openairedatasets and openairepublications engines
|
|
[Engine] Add Library of Congress engine
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Google Play Music has been replaced by Youtube music.
|
|
Closes #2540
|
|
The new version of MetaGer needs to reload the reults (into a iframe) with a
unique tag (see HTML response below).
Implementing a dedicated metager-engine for searx makes no sense to me. The
great days of MetaGer seems to be ended. I remember the good old days this
project started in the 90's of the last century. But in the last few years it
becomes more and more crap. As the name suggested, MetaGer was made for
germans in the first place. They have added a english and spain translation but
the i18n is very poor compared to what searx offers.
It's a pity, lets drop MetaGer.
This is the first response, the id (b82679980656899ba5a17ffd02a56846) is unique
for each query:
$ curl "https://metager.org/meta/meta.ger3?eingabe=foo&submit-query=&focus=web"
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<link rel="stylesheet" href="/index.css?id=b82679980656899ba5a17ffd02a56846">
<script src="/index.js?id=b82679980656899ba5a17ffd02a56846"></script>
<title>foo - MetaGer</title>
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
</head>
<body>
<iframe id="mg-framed" src="https://metager.org/meta/meta.ger3?eingabe=foo&submit-query=&focus=web&mgv=b82679980656899ba5a17ffd02a56846" autofocus="true" onload="this.contentWindow.focus();"></iframe>
</body>
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
BTW: fix indentation by 2 spaces
The additional tests has been commented out in the google engines to not release
any CAPTCHA issues.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
|
|
Add wiby.me engine
|
|
[Fix] Invidious Engine
|
|
move meta information from comment to the about variable
so the preferences, the documentation can show these information
|
|
working instances
|
|
See settings.yml for the options
SIGUSR1 signal starts the checker.
The result is available at /stats/checker
|
|
|
|
|
|
|
|
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Closes #2339
|
|
Fixes: #1998
Suggested-by: @linuxmue
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
|
|
Voat shutted down on December 25th, 2020 at 12 noon PST: https://voat.co/host/voat/static/inactive.min.html?ReturnUrl=/
|
|
[remove] remove searchcode_doc and twitter
|
|
check HTTP response:
* detect some comme CAPTCHA challenge (no solving). In this case the engine is suspended for long a time.
* otherwise raise HTTPError as before
the check is done in poolrequests.py (was before in search.py).
update qwant, wikipedia, wikidata to use raise_for_httperror instead of raise_for_status
|
|
* twitter: the API has changed. the engine needs to rewritten.
* searchcode_doc: the API about documentation doesn't exist anymore.
|
|
[mod] libgen: update the URL to http://libgen.rs/
|
|
[remove] seedpeer engine
|
|
the website is offline.
|
|
|
|
https://libgen.is actually redirect to http://libgen.rs/
It seems there is no HTTPS version:
* https://www.wikidata.org/wiki/Q22017206
* https://librarygenesis.net/
|
|
|
|
recoll is a local search engine based on Xapian:
http://www.lesbonscomptes.com/recoll/
By itself recoll does not offer web or API access,
this can be achieved using recoll-webui:
https://framagit.org/medoc92/recollwebui.git
This engine uses a custom 'files' result template
set `base_url` to the location where recoll-webui can be reached
set `dl_prefix` to a location where the file hierarchy as indexed by recoll can be reached
set `search_dir` to the part of the indexed file hierarchy to be searched, use an empty string to search the entire search domain
|
|
credits go to @bauruine see https://github.com/searx/searx/pull/1958
|
|
|
|
|
|
Xpath engine and results template changed to account for the fact that
archive.org doesn't cache .onions, though some onion engines migth have
their own cache.
Disabled by default. Can be enabled by setting the SOCKS proxies to
wherever Tor is listening and setting using_tor_proxy as True.
Requires Tor and updating packages.
To avoid manually adding the timeout on each engine, you can set
extra_proxy_timeout to account for Tor's (or whatever proxy used) extra
time.
|
|
|
|
|
|
|
|
|
|
|
|
supported_languages values: see https://framagit.org/framasoft/peertube/search-index/-/blob/master/client/src/views/Search.vue#L618-641
|
|
fix URLs
|