@dgl I've been doing a bunch of thinking on your "shells should have json as a first class citizen" thing.

I want to follow the "shape" of rc(1) which had lists as a first class type. So extending that to objects can't be that hard, right?

@dgl Some decisions:
* () is used for grouping commands (instead of {} in rc), thus leaving {} free for objects. Use [] for lists (instead of () in rc).
* Use $foo[name] to make it possible to use $foo[$bar[baz]] and $foo[$bar][baz] (instead of $foo.$bar.baz which is ambiguous).
* (Optionally) require , between items in [] lists.
* Support multidimensional ifs splitting.
* Extend rc ^ to support mapping a function (to allow mapping over an object's key/values).
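
To make the indexing decision concrete, here is a small Python sketch of the two readings that $foo.$bar.baz would blur together. The dictionaries and their contents are made up for illustration; only the two bracketed forms come from the list above.

```python
# Hypothetical data, chosen only to show that the two forms differ.
foo = {"users": {"baz": "nested"}, "x": "flat"}

# Reading 1: $foo[$bar[baz]] -- $bar is an object; look up its "baz"
# member first, then use the result as the key into $foo.
bar_obj = {"baz": "x"}
inner_first = foo[bar_obj["baz"]]

# Reading 2: $foo[$bar][baz] -- $bar is a plain string; index $foo with
# it, then take the "baz" member of whatever comes back.
bar_str = "users"
outer_first = foo[bar_str]["baz"]
```

With a dotted path the parser cannot tell which of the two was meant, whereas the brackets make the nesting explicit.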

@dgl one thing that I am not particularly happy about is using `(command) to support inclusion.

What type should the output be? rc used ifs (the only place ifs is used) to split into words which were then used in a list. sh obviously transcludes it as a string then reparses.

My thought is to keep the rc behaviour. But allow `J(command) to parse the output of command as json. This seems like a bit of a hack. It does mean I could have other parse types in the future (csv? XML?)

@isomer @dgl

Are those other things representable as json, or do you want to extract e.g. metadata nonrepresentable in json and somehow make it accessible?

If former, then you don't really need anything: one could always pipe the whole thing through a converter (and then maybe something other than JSON is a better choice for the thing the shell actually understands?).

Also, you could have shell builtins that can only be used as last stages of a pipeline in $() construction and that "emit" "objects". In reality, their presence there is pure fakery and they have no byte-serialized output format: they just indicate how the thing should be parsed.

@robryk @dgl `() is how rc(1) spells sh's $().

Yeah I could just always use json, and then require you always to use a converter. But that seems like it would get annoying quick. Shells are optimised for ease of typing and quickly trying things out. TIMTOWTDI is common in shells, to save typing. It's one thing that makes their code difficult to maintain.

@isomer @dgl

You can also require a converter _always_ and use its presence as an indication that this is an object, but this is somewhat more verbose anyway if it's actually json \shruggie{}

@robryk @isomer @dgl I also thought of the syntax being '$(... | json)' or something like that, with 'json' being a shell builtin that actually merely marked the result as json. One advantage of a shell builtin is that people could alias it if they were doing it a lot: $(... | j), or the like.

But maybe the idea of general $() post-processors would be better, eg (in rc syntax): `` json (....). rc already has "`` $ifs (...)" as a syntax, so if $ifs is a builtin/function it seems natural.
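
A rough Python sketch of the "marker builtin" idea, assuming a pipeline is already parsed into a list of argv lists. The names `MARKERS`, `ALIASES` and `split_marker` are hypothetical, not anything from rc or an existing shell; the point is only that a trailing `json` (or an alias such as `j`) never runs as a process and just selects how the captured output gets parsed.

```python
import json

# Hypothetical table of marker builtins: name -> parser for captured output.
MARKERS = {"json": json.loads}

# Hypothetical user aliases, e.g. "j" as a shorthand for "json".
ALIASES = {"j": "json"}


def split_marker(pipeline):
    """Split a pipeline into (real_stages, parser).

    `pipeline` is a list of argv lists. If the last stage is a bare marker
    builtin (or an alias for one), it is removed and its parser returned.
    Otherwise the pipeline is returned unchanged with parser None.
    """
    if not pipeline:
        return [], None
    last = pipeline[-1]
    # Only a bare word counts as a marker; "json --flag" is a normal command.
    name = ALIASES.get(last[0], last[0]) if len(last) == 1 else None
    if name in MARKERS:
        return pipeline[:-1], MARKERS[name]
    return pipeline, None
```

For example, `split_marker([["cat", "file.dat"], ["j"]])` yields the single real stage plus `json.loads`, while a pipeline without a trailing marker comes back untouched and would fall through to the default ifs splitting.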

@cks @robryk @dgl one of the other things I've tinkered with is having a command/function tentatively called "show" that uses magic to figure out what the content type of stdin is, and then can do various transforms to format it for humans (convert CSV to nicely aligned tables, syntax colour highlight code, etc).

I've wanted something like this, so I've coded up a few versions but none have "felt" right yet.

@isomer @cks @dgl

Based on the bytes of stdin or based on some other sources?

@robryk @cks @dgl basically read the first 1MiB into a file. Call file(1) on it in mime mode. Have a big case statement to match on it and dispatch to a renderer.
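
A minimal Python sketch of that approach, for illustration only. The function names, the renderer table and the two renderers are assumptions; the only external piece is file(1) with `--brief --mime-type`, and what it reports for a given input depends on its version and magic database.

```python
import csv
import io
import json
import subprocess
import tempfile

SNIFF_BYTES = 1 << 20  # look at no more than the first 1 MiB


def detect_mime(head):
    """Write the sniffed bytes to a temp file and ask file(1) for a MIME type."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(head)
        tmp.flush()
        result = subprocess.run(
            ["file", "--brief", "--mime-type", tmp.name],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


def render_csv(data):
    """Render CSV as left-aligned columns (assumes every row has the same width)."""
    rows = list(csv.reader(io.StringIO(data.decode())))
    widths = [max(len(cell) for cell in col) for col in zip(*rows)]
    lines = (
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )
    return "\n".join(lines)


def render_json(data):
    """Pretty-print JSON with two-space indentation."""
    return json.dumps(json.loads(data), indent=2)


# The "big case statement": MIME type -> renderer.
RENDERERS = {"text/csv": render_csv, "application/json": render_json}


def show(data, mime=None):
    """Render data for a human; sniff the type only when none is given."""
    if mime is None:
        mime = detect_mime(data[:SNIFF_BYTES])
    renderer = RENDERERS.get(mime, lambda d: d.decode(errors="replace"))
    return renderer(data)
```

Calling `show(data)` guesses the type, while `show(data, "text/csv")` skips the guess entirely, which keeps an explicit override available when sniffing gets it wrong.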

@isomer @cks @dgl

That sounds like yet another nightmare similar to browsers' mime type sniffing, with similar consequences~~

@robryk @cks @dgl it would be nice if pipes were somehow typed, but I don't know any good way to pass that information along other than magic.

If you're already gonna blat it to a terminal you're already not in particularly safe ground...

@isomer @cks @dgl

So you have the option of requiring that the user tell you what it is, or trying to guess. If the user tells you the wrong thing, it will simply never work, so it will be ~immediately noticed. I see a clear difference in safety between this and "usually works unless your csv file contains a field with a long enough piece of json".

Also: terminal? Are you talking about people writing/reading that input by hand/eye, or did you also mean pipes/sockets/fifos/...?

@robryk @cks @dgl I mean I have a pipeline that produces data. The default is just to output to stdout.

Yes, the operator could specify the type. But again, the problem is that getting humans to have to be explicit everywhere causes them to take shortcuts.

@isomer @cks @dgl

What kinds of shortcuts? I don't see any better ones than "just specify what it actually is, when you get an error telling you »you must specify what this thing (which seems to be json) is«".

@isomer @cks @dgl

A kind of magic you can use is:

on the shell side: give the program a stdout that's an (anonymous, from socketpair) Unix stream socket. If the first thing you ever receive over it is an fd, read metadata from that fd.

on the program side: try to send a pipe fd blindly over stdout. If that succeeds, write metadata into the pipe (ignore EPIPE in case the other end dropped it on the floor).

This can also work for intermediate parts of a pipeline, as long as shell instantiates the pipe as a stream unix socket.
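
The handshake described above could look roughly like this in Python (3.9+ for `socket.send_fds` / `socket.recv_fds`, Unix only). This is a sketch under assumptions: the function names, the JSON encoding of the metadata and the one-byte marker are all made up here. The marker exists because a stream socket needs at least one byte of ordinary data to carry the fd.

```python
import json
import os
import socket


def program_side(stdout_sock, metadata, payload):
    """Offer metadata over a side pipe, then write the normal output."""
    read_end, write_end = os.pipe()
    try:
        # The fd rides along with one marker byte of ordinary data.
        socket.send_fds(stdout_sock, [b"\0"], [read_end])
    except OSError:
        # stdout cannot carry fds; fall back to plain output.
        os.close(write_end)
    else:
        try:
            os.write(write_end, json.dumps(metadata).encode())
        except BrokenPipeError:
            pass  # the receiver dropped the fd on the floor
        finally:
            os.close(write_end)
    finally:
        os.close(read_end)  # the receiver holds its own copy
    stdout_sock.sendall(payload)


def shell_side(sock):
    """Return (metadata or None, payload bytes) read from the program."""
    first, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    metadata = None
    if fds:
        # An fd arrived first: the byte was only a marker, so discard it.
        with os.fdopen(fds[0], "rb") as meta:
            metadata = json.loads(meta.read())
        first = b""
    chunks = [first]
    while chunk := sock.recv(65536):
        chunks.append(chunk)
    return metadata, b"".join(chunks)
```

A program that knows nothing about the scheme just writes bytes; `shell_side` then sees no fd, keeps the first byte as data and returns `None` for the metadata. In this sketch the metadata must fit in the pipe buffer if the sender writes it before the receiver starts reading.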

@cks @robryk @dgl currently my parser has (approximately):

inclusion := BACKQUOTE block
           | BACKQUOTE word block

word can be anything.

(Currently only the parser exists so the next bit is hypothetical)
If word evaluates to a list then it's used for ifs. If it evaluates to a string then it can be used to decide format.

In theory you could do `$fmt(cat file.dat), where $fmt could be the string "json", or [":"] to split on colons, etc.
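
Since only the parser exists, the following is a hypothetical Python model of that evaluation rule rather than the shell's actual behaviour: a list-valued word is treated as ifs, a string-valued word names a format. `include`, `split_ifs` and `PARSERS` are invented names, and dropping empty fields when splitting is an assumption about the intended rc-like behaviour.

```python
import json

DEFAULT_IFS = [" ", "\t", "\n"]

# Hypothetical registry of named formats; csv, XML etc. could be added later.
PARSERS = {"json": json.loads}


def split_ifs(text, separators):
    """Split on any separator character, dropping empty fields (assumed)."""
    words, current = [], []
    for ch in text:
        if ch in separators:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


def include(output, word=None):
    """Convert captured command output according to the selector word."""
    if word is None:
        word = DEFAULT_IFS
    if isinstance(word, list):
        # List-valued word: use its elements as the ifs characters.
        return split_ifs(output, "".join(word))
    if isinstance(word, str):
        # String-valued word: pick a named format.
        if word not in PARSERS:
            raise ValueError(f"unknown inclusion format: {word!r}")
        return PARSERS[word](output)
    raise TypeError("selector must be a list (ifs) or a string (format)")
```

Here `include(text, "json")` plays the role of `$fmt(...) with $fmt set to "json", and `include(text, [":", "\n"])` the role of splitting on colons and newlines.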
