[users] [announce] geminispace.info - alternative search provider

Stephane Bortzmeyer stephane at sources.org
Sat Feb 27 10:16:46 GMT 2021


On Sat, Feb 27, 2021 at 10:21:18AM +0100,
 Côme Chilliet <come at chilliet.eu> wrote 
 a message of 4 lines which said:

> I was kind of expecting to see a solution based on an existing
> search engine to emerge, such as elastic search, by implementing
> only the Gemini-specific parts, but I looked into quite a few
> projects and all were terribly complicated…

A search engine service has three parts: the crawler, the indexer and
the querier (the one the user interacts with). ElasticSearch could be
a good idea for the last two (at least the second, and maybe part of
the third). You still have to write the crawler and, speaking from
experience, this is not a one-weekend project. At the beginning, it
is: you have a prototype running quite rapidly. But then, in the real
world, a lot of problems happen. My "favorite" is capsules accepting
TCP, completing the TLS handshake, but then not replying to queries,
but there are also endless redirections and other "funny" stuff. A
crawler has to be paranoid! Managing such a beast takes time, and the
growth of the geminispace (47 capsules added yesterday, a new record,
including one in Catalan, apparently the first one) requires that you
plan in advance: what works today won't in a few months.
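
To make "paranoid" a bit more concrete, here is a minimal sketch in
Python of the defensive checks mentioned above: timeouts for capsules
that complete the TLS handshake and then stay silent, and a cap plus
loop detection for redirections. It is not the code of any existing
crawler. All names (fetch_once, crawl, resolve, CrawlError) and all
limits (5 redirects, 10 s per socket operation, 30 s per request,
1 MiB per response) are assumptions chosen for illustration, and
certificate verification is simply switched off instead of
implementing TOFU.

```python
import socket
import ssl
import time
from urllib.parse import urljoin, urlsplit

# All limits below are assumed values for illustration; tune them.
MAX_REDIRECTS = 5      # redirections followed before giving up
TIMEOUT = 10.0         # seconds allowed for each socket operation
TOTAL_BUDGET = 30.0    # seconds allowed for reading one response
MAX_BYTES = 1 << 20    # stop reading after roughly 1 MiB


class CrawlError(Exception):
    """Any reason to give up on a URL."""


def parse_header(line):
    """Parse a '<STATUS> <META>' response header into (status, meta)."""
    text = line.decode("utf-8", errors="replace")
    status, _, meta = text.partition(" ")
    if len(status) != 2 or not (status.isascii() and status.isdigit()):
        raise CrawlError(f"malformed header: {text[:40]!r}")
    return int(status), meta.strip()


def resolve(base, target):
    """Resolve a redirect target against a gemini:// base URL.

    urljoin() leaves relative references untouched for schemes it does
    not know, so the base is temporarily rewritten to https://.
    """
    if urlsplit(target).scheme:
        return target
    joined = urljoin("https" + base[len("gemini"):], target)
    return "gemini" + joined[len("https"):]


def fetch_once(url):
    """Send one request; return (status, meta, body) or raise CrawlError."""
    parts = urlsplit(url)
    if parts.scheme != "gemini" or not parts.hostname:
        raise CrawlError(f"not a usable gemini URL: {url!r}")
    ctx = ssl.create_default_context()
    ctx.check_hostname = False        # self-signed certificates are common;
    ctx.verify_mode = ssl.CERT_NONE   # a real crawler might do TOFU instead
    deadline = time.monotonic() + TOTAL_BUDGET
    data = b""
    try:
        addr = (parts.hostname, parts.port or 1965)
        with socket.create_connection(addr, timeout=TIMEOUT) as raw:
            with ctx.wrap_socket(raw, server_hostname=parts.hostname) as tls:
                tls.sendall(url.encode("utf-8") + b"\r\n")
                while len(data) < MAX_BYTES:
                    if time.monotonic() > deadline:
                        raise CrawlError("server too slow")
                    chunk = tls.recv(4096)  # raises on a silent server
                    if not chunk:
                        break
                    data += chunk
    except OSError as exc:  # covers timeouts, TLS and connection errors
        raise CrawlError(f"network failure: {exc}") from exc
    header, sep, body = data.partition(b"\r\n")
    if not sep:
        raise CrawlError("no response header")
    return parse_header(header) + (body,)


def crawl(url, fetch=fetch_once, max_redirects=MAX_REDIRECTS):
    """Follow 3x redirects with a cap and loop detection."""
    seen = set()
    for _ in range(max_redirects + 1):
        if url in seen:
            raise CrawlError(f"redirect loop at {url}")
        seen.add(url)
        status, meta, body = fetch(url)
        if status // 10 != 3:
            return url, status, meta, body
        url = resolve(url, meta)
    raise CrawlError("too many redirects")
```

The fetch parameter of crawl() is there so the redirect logic can be
exercised without touching the network, for instance with a small
dictionary of canned responses. A real crawler would also need
robots.txt handling, rate limiting and persistent state, none of
which is shown here.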






