[SPEC] Backwards-compatible metadata in Gemini

Omar Polo op at omarpolo.com
Thu Feb 25 20:56:27 GMT 2021


John Cowan <cowan at ccil.org> writes:

> On Thu, Feb 25, 2021 at 4:17 AM Omar Polo <op at omarpolo.com> wrote:
>
>  - we have TLS because it's fundamental to guarantee confidentiality
>>    between servers and clients
>>
>
> I personally don't give a damn about this (after all, who's going to pass
> confidential information over Gemini?  See below.), but I accept that other
> people do.

"confidentiality" maybe was the wrong word.  The idea is that I don't
want to let everyone in the same network see (and possibly hijack) the
pages I visit.  (OK, there are tons of possible issues with this, given
TOFU and how specific clients implement it, etc.; I don't want to get
off topic, though.)

>  - we have status codes, because a page that says "an error occurred"
>>    or "certificate required" cannot be interpreted correctly otherwise
>>
>
> They aren't actually part of the required protocol engine, except for 1x
> vs. 2x.  The META for the others is just human-readable content, and could
> be replaced by a 2x document, and some other mechanism could be found for
> the marginal case of 1x.  (Clients might exploit 6x to automatically retry
> with a (different) cert, but it's unclear that this is the Right Thing: the
> protocol document is, as usual, ambiguous, as it says "should be retried"
> but not who should retry it.)

I'm not sure I understand what you mean.  If my server fails to execute
a CGI script, should it return a 20 reply with "Error while executing
the script" instead of 42?  The meta is human-readable, but the codes
3x, 4x, 5x and 6x carry a meaning: it's extra information.  Heck, you
can consider that metadata in some broad interpretation of the word.
(Maybe 3x can be dropped, and some codes inside 4x and 5x as well if we
really want to be barebones, but again, I don't want to get off-topic.)
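
To illustrate the point about status codes, here is a minimal sketch of
how a client might branch on the first digit of the status.  The
function names (parse_header, classify) are hypothetical and not taken
from any existing client; the category labels follow the ranges in the
Gemini specification.

```python
# Sketch only: names are illustrative, not from a real client.
CATEGORIES = {
    "1": "input",
    "2": "success",
    "3": "redirect",
    "4": "temporary failure",
    "5": "permanent failure",
    "6": "client certificate required",
}


def parse_header(header: str) -> tuple[str, str]:
    """Split a '<STATUS> <META>' response header into (status, meta)."""
    status, _, meta = header.rstrip("\r\n").partition(" ")
    if len(status) != 2 or not (status.isascii() and status.isdigit()):
        raise ValueError(f"malformed status: {status!r}")
    return status, meta


def classify(header: str) -> str:
    """Return the category a client can act on without reading META."""
    status, _meta = parse_header(header)
    category = CATEGORIES.get(status[0])
    if category is None:
        raise ValueError(f"unknown status category: {status!r}")
    return category
```

With a header such as "42 Error while executing the script" the client
can tell it is a temporary failure; if the same text came back in the
body of a 20 reply, a program could not distinguish it from a regular
page.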

>>  - we have pre-formatted blocks to allow certain types of
>>    explanations/presentations that otherwise would have been impossible
>>    (how do we teach how to write text/gemini in text/gemini?)
>>
>
> It's still hard to explain ``` within ```: you can talk about it but you
> can't show an example.

Touché

> Anyway, I don't care about any of these points, just that "necessary" is in
> the eye of the beholder.
>
>  - the one adding the line-type =: (or whatever): you have to parse the
>>
>    whole document to extract the metadata
>
>
> True, but that's true for any approach except metadata-at-the-top.  Note
> that ^^^ within ``` does not mark metadata, so you have to parse at least
> that much.
>
> and it allows for possibly unreadable text/gemini files[1]
>>
>

[...]

> you probably will have trouble here

FYI, Emacs handled that wall of text surprisingly well :)

(Just out of curiosity, what languages are they?  I couldn't recognize
them.)

> Workers at the Tower of Babel:
> "Bitte geben Sie mir einen kleineren Schraubenschlüssel.'
> 'Non ho idea di quello che stai chiedendo.’

Adesso ne ho una, o almeno mezza, sperando che la traduzione sia fedele :)

(Sorry, I couldn't resist.  I don't often see examples featuring
Italian.)

> ‘Давайте чашку чая и домой.’
> ‘Kei te korero koe i tito noa, ko ahau ngenge o te whare pourewa.'
>
> There.  A perfectly valid text/gemini or indeed text/plain document, and
> yet quite unintelligible, and it would be trivial to make it far worse.
> The question is, what is the _motive_ for writing and serving such rubbish?
> Web pages lie to search engines because money.  Web pages contain hidden
> text because money.  Web pages plant tracking pixels because money.  The
> love of money is the root of all Web evil. But where's the money in
> Gemini?  I mean, you could put textual ads in your pages and a link to a
> website, but why not just use the website?

The motive for serving rubbish can simply be some sloppiness when
generating content automatically.  If I know `=:'-style metadata can be
put anywhere, I can write scripts that *for convenience* don't try to
put everything at the top/bottom.  But I agree, this is probably a moot
point.  The real issue I see lies in just adding an explicit syntax for
key-value pairs.  I explain it better below.

>    An user on a non-sophisticate client cannot (easily) understand
>>    that.  It's just full of bloat.
>>
>
> Again, you'd have to be a fool to write something like your example or mine
> except for hack value.  Reasonable people would either put metadata at the
> bottom of the document or the most important entries at the top and less
> important ones at the bottom; it's the possibility of doing that, along
> with allowing links-with-metadata, that make me want it to be able to go
> anywhere.  Gemini is like (anarchist) Anarres: everything is open to
> everyone's .  The Web has become like (capitalist) Urras: there is lots of
> glitz on top, but the important stuff is hidden in the cellars, where
> people are bleeding to death.
>
> Repeating myself:  metadata conventions should go in a metadata spec, *not*
> in the text/gemini spec, since neither clients nor servers are required to
> take any notice of them.

> [snip]
>
>> One thing that I haven't though about when writing the mail, but only
>> later when discussing the matter with thfr@, is that we're trying to
>> hide stuff from users eyes.  Sure, if used correctly those two proposed
>> syntaxes (=: and ^^^) can be easy to read, but lets be honest: clients
>> won't show them as-is, in particular the more advanced ones.
>>
>
> I'm sorry, but I am unwilling to take this for granted.  Why hide them?
> (I'm already annoyed that Gmail hides signatures, and mine aren't always
> the same, so I don't use the standard "-- " line to announce them any more.)
>
> As things stands now, there are only two things that Gemini clients
>> usually hide: the URL of a link-line and the alt-text of a pre-formatted
>> block.
>
>
> I actually think hiding the URL is bad UX.  People should be able to know
> *before* going to a page where it's coming from.  Showing the domain is not
> enough, because multi-hosting.  (This is an option in Lagrange now.)
>
>> There's a understandable UX reason for that, but do we really
>> need to add something else that we know will be hidden to end users?
>>
>
> "Know" is a very strong verb, especially for something that hasn't happened
> yet.  Metadata in HTML head elements was *required* to be hidden from day 1.
>
>> Another thing that I forgot to explicitly say in my previous message is
>> that we can use some sort of common notation, a convention, rather than
>> adding new things to the specification.  See for instance the
>> "Subscribing to Gemini pages" companion specification: a lightweight,
>> convention-based way to provide atom-like feeds.  I found it pretty
>> elegant, and has proven a) easy to implement b) easy for content writers
>> to use c) easy for end-users to consume and d) avoid adding extra line
>> types/file types/etc to the specification.
>>
>
> That's what I want too.  =: does not have to be a text/gemini line type,
> just a convention explained elsewhere.

I feel like we're not talking about the same thing.  This thread has
grown pretty large, so I apologize if I missed some parts of it.

My understanding of the proposals is that they want to add some sort of
extra notation (either as a line type or a block, it doesn't matter) to
express (potentially) arbitrary key-value pairs.  Given this, we know
that non-trivial clients will hide them in the document, probably to
show that information in a "better" way.  I mean, it wouldn't be bad if
a client were able to display the authors of the document and the
creation/update dates in a sidebar.  But then we have given users ways
to extend the format way beyond its purpose, so it gets way easier to
add styling, formatting, and other stuff.

But from what you write, it seems that you're aiming at some sort of
convention for metadata, and this is something I can actually agree on.
To make it clear, since in another mail you mentioned two "factions":
I'm not against the idea of metadata in the first place, I simply don't
like the existing proposals because of the issues already mentioned.

Even if I'm not 100% happy with what follows, if I were to add metadata,
I'd advocate for something simpler, without a prefix that smells like an
"unofficial line-type", and probably already in use, like:

Author: Omar Polo
Published: 2020-02-24
Edited: 2020-02-25
Licence: ISC

(even the FAQ document has a Last-Update line near the top, or something
like that)

With something like this I think we could achieve metadata that:
 - is not hidden from the end user's eyes
 - is easy (and natural) to use for those who write content
 - is intuitive for the readers
 - is backward- and forward-compatible with the spec
 - doesn't require special treatment by already existing clients
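
As a rough illustration of how a tool could pick up such lines, here is
a minimal sketch.  It assumes (my assumption, nothing is specified
anywhere) that a metadata line is an ordinary text line of the form
"Key: Value", and that anything inside a pre-formatted block, or on a
link, heading, list or quote line, is not metadata.  The names
META_LINE and extract_metadata are hypothetical.

```python
import re

# Assumed convention: a single-word key (hyphens allowed), a colon,
# some whitespace, then a non-empty value.
META_LINE = re.compile(r"^([^\W\d_][\w-]*):[ \t]+(\S.*)$")


def extract_metadata(document: str) -> dict[str, str]:
    """Collect 'Key: Value' text lines from a text/gemini document."""
    metadata: dict[str, str] = {}
    preformatted = False
    for line in document.splitlines():
        if line.startswith("```"):
            # Toggle lines open and close pre-formatted blocks.
            preformatted = not preformatted
            continue
        if preformatted or line.startswith(("=>", "#", "* ", ">")):
            continue
        match = META_LINE.match(line)
        if match:
            key, value = match.groups()
            # Keep the first occurrence of each key.
            metadata.setdefault(key, value.strip())
    return metadata
```

A plain sentence such as "Note: see below" would also match, so a real
tool would probably want to restrict itself to a known set of keys or to
the first lines of the document.  Clients that know nothing about the
convention simply show the lines as text.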

Something like this would also avoid various problems regarding the
classification of valid keywords and the troubles regarding
internationalisation.  In an Italian document I would happily write

Autore: Omar Polo
Pubblicato: 2020-02-24
Aggiornato: 2020-02-25
Licenza: ISC

and all my readers would understand the meaning.  We could then have
specific search-engines per-language, adapt tools to cope with this etc,
all without modifying the spec and without providing a syntax that can
be abused.
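
A per-language tool could, for example, map localized keys onto a common
vocabulary.  The sketch below is only an illustration: the ALIASES table
and the canonical names are made up for this mail and are not part of
any spec or existing tool.

```python
# Hypothetical alias tables: localized key -> canonical key.
ALIASES = {
    "en": {
        "author": "author",
        "published": "published",
        "edited": "edited",
        "licence": "licence",
    },
    "it": {
        "autore": "author",
        "pubblicato": "published",
        "aggiornato": "edited",
        "licenza": "licence",
    },
}


def normalise(metadata: dict[str, str], language: str) -> dict[str, str]:
    """Translate known localized keys; drop the keys we don't recognise."""
    table = ALIASES.get(language, {})
    result: dict[str, str] = {}
    for key, value in metadata.items():
        canonical = table.get(key.casefold())
        if canonical is not None:
            result[canonical] = value
    return result
```

All of this lives in the tools, not in the document format: the page
itself stays plain text/gemini.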

Now, to the OT.  (I'm lazy today and I don't want to write another mail.)

> OT: I'm actually designing something like GeminiScript, but not for Gemini
> clients.  I think the idea of no-install instant-download software with
> severe limitations on presentation and no access to the local system except
> very limited keyboard/screen/mouse is in fact a good one, but unlike
> Brendan Eich I have the luxury of more than two weeks to think about it.  I
> also want to make it as accessible to non-professionals as microcomputer
> Basic was.  It would run in its own native client, either CLI or TUI or GUI
> (there are some issues around the fact that CLIs linearize access).  Like
> most languages, it could probably be compiled to JavaScript.
>

I don't like the idea of instant-download software because of various
ethical and practical concerns, but I have a soft spot for programming
language design and compilers, so, if you don't mind, I'd be curious to
read more about it :)



P.S.: I find your way of quoting text strange.  Why isn't the first
line of every cited block prefixed by > when all the others are?  Is
that some sort of arcane custom that youngsters like me don't
understand?

>
>
> John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
> Your worships will perhaps be thinking that it is an easy thing
> to blow up a dog? [Or] to write a book?
>     --Don Quixote, Introduction


