[spec] The Tragedy of &

Gary Johnson lambdatronic at disroot.org
Sat Jan 30 20:54:39 GMT 2021

>> ## Section 3.3: (DESIRED) Client-side Requests
>> The intention of this example is that the clients would produce requests
>> of this form after each input prompt:
>> => gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson
>> => gemini://awesome.capsule.net/form?$SESSION&password&secret
>> => gemini://awesome.capsule.net/form?$SESSION&smog&yes
>> => gemini://awesome.capsule.net/form?$SESSION&plant&Ficus
>> where $SESSION is whatever value was generated by the CGI script on the
>> first page load.
> I do not understand this example.
> When using regular inputs, the client will send these requests:
> gemini://awesome.capsule.net/form/name?Gary%20Johnson
> gemini://awesome.capsule.net/form/password?secret
> gemini://awesome.capsule.net/form/smog?yes
> gemini://awesome.capsule.net/form/plant?Ficus
> gemini://awesome.capsule.net/form/submit
> (No "?" on "submit" since it's just telling the server that we're done.)
> What is the benefit of doing it your way?

Hi Katarina,

  Thanks for taking the time to reply to my message. I'll try to clarify
my point here.

The issue I'm raising is that there appears to be no way to pass more
than one piece of information at a time in our query strings. This
significantly affects anyone writing CGI scripts, and CGI scripts are
how many Gemini servers let users add dynamic pages to their capsules.

But why, you ask?

Because each CGI script lives at one particular file path, additional
path segments can't be used to pass information to it. The script has
to get its inputs from the query string.

This is a script. It probably returns a 20 response:

=> gemini://awesome.capsule.net/form.clj

If I want to fill in a name field on that page, I might provide a link
like this:

=> gemini://awesome.capsule.net/form.clj?name

This calls the CGI script with a query parameter. Great! The script can
use "name" to look up the appropriate response. Here it is:

10 Please enter your name\r\n

However, when the user fills in their name, the browser will now send
this request to the server:

=> gemini://awesome.capsule.net/form.clj?Gary%20Johnson

There is no way for the CGI script to know that this is a name value and
not the value for any other form field on the page.

And therein lies the rub. If the only way to associate input values with
the variables they represent is with path segments, then CGI scripts
simply can't ever use more than one input field per page. Even with a
single field there is a trap: if the user happens to type the same
string that triggers the 10 INPUT response (e.g., "name") into the
totally free form text field they are presented, the script cannot tell
that input apart from the trigger, and it will respond with yet another
10 INPUT response.
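
To make the ambiguity concrete, here is a minimal Python sketch of what
a CGI script sees when the user's input replaces the query string. It
assumes the script's only input is the raw query string (as it would
read from QUERY_STRING in ordinary CGI). The handle function, the
PROMPTS table, and the second "plant" field are all hypothetical names
chosen for illustration.

```python
from urllib.parse import unquote

# Hypothetical prompts for a two-field form; the names are illustrative.
PROMPTS = {
    "name": "Please enter your name",
    "plant": "Please enter your favorite plant",
}

def handle(query_string):
    """Return (response_header, field) for one request.

    `field` is the form field the request refers to, or None when the
    script has no way to tell.
    """
    text = unquote(query_string)
    if text in PROMPTS:
        # Indistinguishable from a user who typed "name" at the prompt.
        return f"10 {PROMPTS[text]}\r\n", text
    # Some user input arrived, but the field name was replaced by it.
    return "20 text/gemini\r\n", None
```

Both handle("Gary%20Johnson") and handle("Ficus") come back with no
field attached, and a user who types the word "name" at the prompt is
sent straight back to the same prompt.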

This would make a form with N fields require N+1 separate CGI scripts,
all chained together via links that represent the directory structure
into which they are installed.

This is an absolute nightmare scenario for programming anything that
wants to accept user inputs.

So what does this mean for Geminispace?

It means essentially that CGI scripts are currently second-class
citizens, and the only people who can write dynamic capsules are server
authors (or people willing to hack on server code). This is because
encoding information using path segments requires injecting custom
routing table code into the server's request handler.
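
As a rough illustration of what such routing code involves, the Python
sketch below maps the path-segment URLs from the quoted example
(/form/name?..., /form/submit) onto form actions. It is not taken from
any real server: the route function, the FORM_FIELDS set, and the
action names are assumptions, and a real server would wire this into
its own request handler.

```python
from urllib.parse import urlsplit, unquote

# Hypothetical routing table a server author would have to hard-code
# for this one capsule.
FORM_FIELDS = {"name", "password", "smog", "plant"}

def route(url):
    """Map a request URL to an (action, field, value) triple."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) == 2 and segments[0] == "form":
        field = segments[1]
        if field == "submit":
            return ("submit", None, None)
        if field in FORM_FIELDS:
            if parts.query == "":
                # No input yet: the server should answer with 10 INPUT.
                return ("prompt", field, None)
            # The path segment names the field; the query is its value.
            return ("store", field, unquote(parts.query))
    return ("not_found", None, None)
```

The point is that the field name only survives because it sits in the
path, and a CGI script installed at a single file path has no such
table to consult.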

As a server author, I am capable of creating a custom fork of my server
with a new routing table for each dynamic capsule I want to build.
However, I suspect the majority of Gemini users are not going to have
both the skill and willingness to engage in this level of coding on
their pages.

That is why I and many other authors have added support for CGI scripts
to our servers. But under the "only one piece of information in the
query string" paradigm, these scripts are currently rather handicapped
when it comes to accepting user input.

Hopefully, I've made the technical merits of my case clear here.

>> ## Section 4.2: Append Don't Replace!
>> As far as I can tell, the fix here is for Solderpunk to update the text
>> in section 3.2.1 to indicate that if a query string is already part of
>> the request leading to an INPUT response, then the user's input should
>> be appended (using &) to the existing query string rather than replacing
>> it wholesale (using ?).
> This is not a necessary spec change.

Yes, it really is if anyone other than server authors is ever going to
be able to write their own dynamic pages.
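
For concreteness, the proposed append rule could look like the Python
sketch below. This illustrates the proposal, not the behavior the
current spec describes. The names next_request and parse_query are
hypothetical, and "abc123" stands in for whatever $SESSION value the
script generated.

```python
from urllib.parse import quote, unquote

def next_request(url, user_input):
    """Build the URL a client would request after an INPUT prompt,
    appending to an existing query string instead of replacing it."""
    # safe="" also encodes "&", so typed input cannot add extra fields.
    encoded = quote(user_input, safe="")
    base, sep, query = url.partition("?")
    if sep and query:
        return f"{base}?{query}&{encoded}"
    return f"{base}?{encoded}"

def parse_query(query_string):
    """Split an appended query string into its decoded parts."""
    return [unquote(part) for part in query_string.split("&")]
```

With this rule, answering the prompt at
gemini://awesome.capsule.net/form?abc123&name with "Gary Johnson"
yields gemini://awesome.capsule.net/form?abc123&name&Gary%20Johnson,
and the script can split the query on "&" to recover the session, the
field name, and the value.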

>> Otherwise, we really have no way to input more than one query param
>> (with &) other than asking the user to type it directly into the INPUT
>> prompt (e.g., cat&dog&pig).
> The responsibility for collecting parameters falls on the server, not on the
> client. The only thing the client needs to do is send one query for each
> field.

Again, see above. A single query value cannot be associated with its
variable without adding a custom routing table to the server to enable
the parsing of path segment data as additional inputs.

>> I'm hoping this isn't the spec's intention
>> here and that we just have a case of ambiguous wording that has led some
>> client authors to create divergent (or broken) implementations
> Sorry to disappoint you. I suggest leaving the ampersands to the web
> queries.

I'm afraid we disagree here.

> Thank you for providing example code and I'm sorry for not doing the same.

If you can write a CGI script that can correctly associate INPUT
responses with their intended variables, please share it. I suspect it
would be quite educational.

Happy hacking,

GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

Why is HTML email a security nightmare? See https://useplaintext.email/

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
