[SPEC] Backwards-compatible metadata in Gemini

Omar Polo op at omarpolo.com
Fri Feb 26 13:01:21 GMT 2021

Stephane Bortzmeyer <stephane at sources.org> writes:

> On Thu, Feb 25, 2021 at 10:16:52AM +0100,
>  Omar Polo <op at omarpolo.com> wrote 
>  a message of 161 lines which said:
>>     As things stand, I know I can
>>     cat file1.gmi file2.gmi ... > result.gmi
>>    and obtain a valid text/gemini file.
> One of the things missing in the current specification is a formal
> grammar of gemtext, so there is currently no way to know if lines must
> end with an end-of-line or whether end-of-file is sufficient.
> => gemini://gemini.bortzmeyer.org/gemini/missing-eol.gmi Example
> Amfora and Lagrange seem to accept the last line (the one without an
> end-of-line). But your example with cat would break it, concatenating
> the last line of the first file with the first line of the second file.

That's not completely true.  This example was specifically for UNIX-like
systems, where files are expected to end with a newline.  It's POSIX's
fault, as it defines a line as a sequence of characters ending with a \n.


(this isn't an excuse for clients not to handle a missing final newline
correctly, though!)
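
To make the concatenation point concrete, here is a minimal sketch of a
"safe cat" for gemtext.  The function name concat_gemtext is made up for
this example and is not part of any existing tool; it only assumes that a
part which doesn't end with a newline should get one before the next part
is appended.

```python
def concat_gemtext(parts):
    """Join gemtext documents so that no two lines get glued together.

    `parts` is an iterable of strings, one per file.  A non-empty part
    that lacks a trailing newline gets one appended; everything else is
    passed through untouched (a final "\r\n" also counts as terminated).
    """
    out = []
    for text in parts:
        if text and not text.endswith("\n"):
            text += "\n"
        out.append(text)
    return "".join(out)


# Hypothetical inputs: the first "file" is missing its final newline.
first = "# File one\nlast line without EOL"
second = "=> gemini://example.org/ A link\n"

naive = first + second                    # what plain cat would produce
safe = concat_gemtext([first, second])    # what the sketch produces
```

With plain concatenation the link line is swallowed by the previous text
line; with the sketch it stays on its own line and is still a link.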

>> Also, the examples you gave in support of your proposals seem bogus
>> too.  Serving a mailing list archive over Gemini?  Cool, but why convert
>> the mails to text/gemini?  Wrapping them in ``` (with headers visible)
>> or serving them "raw" is not enough?
> Because humans (especially non-anglosaxon humans) have trouble with
> "From", "Subject" and "Message-ID"?

I can agree, but I would prefer something like

	# Hypothetical archive page for a Gemini mailing list in Italian

	Da: Omar Polo <op at omarpolo.com>
	A: <someone at example.com>
	CC: ...
	Oggetto: [spec] proposta per i metadati

	actual raw body of the e-mail...

instead of

	-: x-mail-from="Omar Polo <op at omarpolo.com>"
	-: x-mail-to=<someone at example.com>
	-: x-mail-subject="..."

or even

	actual body of the email

	from: ...
	to: ...

"Free-form metadata", as in simple text paragraphs laid out like that,
is probably already widespread, intuitive for writers and readers, and
probably also "simple enough" for tools to grok, at least partially.  It
also solves localisation problems, because the author is free to pick
whatever wording best conveys the meaning to his/her audience.
Different authors may choose different keys for the same "idea" (e.g.
"Updated", "last-updated", "edited" ...), that's true: this solution is
not without its drawbacks.
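
As a rough illustration of "simple enough for tools to grok", here is a
sketch that guesses free-form metadata from the top of a page.  It is
only an assumption about how a tool might do it: the name guess_metadata,
the "Key: value" pattern and the rule that the block ends at the first
line that doesn't match are all invented for this example, not taken from
any spec or existing crawler.

```python
import re

# Assumed convention: "Key: value", with whitespace after the colon.
# The key may not start with whitespace or contain a colon, so link
# lines such as "=> gemini://..." do not match.
HEADER_RE = re.compile(r"^([^\s:][^:]*):\s+(.*)$")


def guess_metadata(text):
    """Collect the first run of "Key: value" lines in a gemtext page.

    Blank lines and headings before the run are skipped; the run ends at
    the first line that doesn't look like "Key: value".  Keys are
    lowercased but otherwise kept exactly as the author wrote them.
    """
    meta = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not meta and (not stripped or stripped.startswith("#")):
            continue  # nothing collected yet: skip headings and blanks
        match = HEADER_RE.match(stripped)
        if match is None:
            break  # first non-matching line ends the block
        meta[match.group(1).strip().lower()] = match.group(2).strip()
    return meta


# Hypothetical page, modelled on the Italian example above.
page = (
    "# Archive page\n"
    "\n"
    "Da: Omar Polo <op at omarpolo.com>\n"
    "A: <someone at example.com>\n"
    "Oggetto: [spec] proposta\n"
    "\n"
    "actual raw body: this line is not metadata\n"
)
meta = guess_metadata(page)
```

Note that the tool ends up with the keys "da", "a" and "oggetto", not
"from", "to" and "subject": mapping localised or differently spelled keys
to a common idea is left to the tool, which is exactly the drawback
mentioned above.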

(ah, all of this without adding dangerous ways to extend the text format)

>> If we want to build a better GUS I don't think that adding metadata
>> to text/gemini will solve anything, it will actually make things
>> worse.  The point is, you can't trust 3rd-parties metadata. [...]
>> people will abuse the metadata to "go up" in the search results, and
>> the outcome of that is crystal-clear on the Web,
> I'm not convinced by this "look at the SEO mess" argument. Gemini is
> not the Web, there is no money at stake, marketing people and salesmen
> frown upon Gemini ("what, no pictures? No tracking?") so I really
> doubt that many people would resort to dirty tricks just to be higher
> in GUS' results.

I acknowledge that this is not a problem right now, and hopefully never
will be, but we should be aware of the risks.  Adding a syntax for
metadata will (possibly) open the door to tracking, styling, etc.
