Anyone who works in a field related to computer software ends up
running a bunch of CLI (command line interface) commands pretty
often. The longer you stay in the field, the larger your command
set grows. With new systems being invented every day, the complexity
of the commands being run is also increasing. Day by day, I found
myself running more and more complex commands. That's when I decided I
needed something to manage these commands for me. Shell history is
great, but it is neither portable nor reliable. So, I built pin, a
portable commands organizer. Pin is available for use at
https://github.com/harshadjs/pin. In this post, I'll go over some of
the key design choices I made while building this tool.
Portability and the Ease of First Use
I often run complex commands in resource-constrained or ill-configured
environments that either don't retain any state (ephemeral VMs) or are
minimalistic chrooted environments, such as a custom kernel running
with a minimal Debian image and no network configuration under qemu
virtualization. This often means that installing a collection of
libraries and dependencies is not feasible or convenient. Therefore,
the primary goals I had in mind while designing pin were portability
and ease of first use.
So, given the above constraints, how does pin provide ease of first use and portability?
Pin is just one shell script that relies on very basic shell programs
(such as base64 or sha1sum) for doing most of its work. Therefore,
pin can be run in almost any environment where you have shell access;
and given that you are trying to remember your shell commands, you
… well - do have shell access.
This makes pin installation a very simple process. Just use wget or
curl to download the pin script and you are all set to use it! For
environments where internet connectivity is not available (often the
case inside qemu while testing a custom kernel), you can still paste
the compressed and base64-encoded version of the pin script, which
gets decompressed and decoded by simple shell commands, allowing you
to use pin.
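The paste-based path can be sketched roughly as follows. This is not
pin's actual installer: the helper names encode_script and
decode_script are hypothetical, gzip is an assumed choice of
compressor, and base64 -d assumes the GNU coreutils or busybox
flavour of base64.

```shell
#!/bin/sh
# Hypothetical helpers, not part of pin itself.

# On a machine that already has the script: turn it into pasteable text.
encode_script() {
    # gzip -c writes the compressed bytes to stdout; base64 makes them text.
    gzip -c "$1" | base64
}

# On the offline machine: rebuild the script from the text read on stdin.
decode_script() {
    # Reverse the two steps, then mark the result executable.
    base64 -d | gunzip > "$1" && chmod +x "$1"
}
```

On the connected machine, encode_script pin prints a block of text. On
the offline machine, running decode_script ./pin and pasting that text
into the terminal (followed by Ctrl-D) recreates the script using only
base64 and gunzip.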
Self Identifying Objects
Another design choice that I made was to store each command as a file
named after the SHA1 hash of the command (CMD) itself, in the
following format:
CMD="<base64 encoding of "find . -type f -iname *.[ch] | etags -">"
DESC="Generate emacs tags"
TYPE=cmd
This implies that there's inherent deduplication of all the commands
stored in pin. Also, self-identification of the object prevents
accidental overwrites of command files.
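To make the idea concrete, here is a minimal sketch of hash-named
command files. It is an illustration under assumptions, not pin's own
code: the PIN_DIR location and the store_cmd and load_cmd helpers are
made up for this example, and skipping an existing file is just one
way to get the "no accidental overwrite" behaviour.

```shell
#!/bin/sh
# Hypothetical storage directory; pin's real location may differ.
PIN_DIR="${PIN_DIR:-$HOME/.pin-sketch}"

# store_cmd COMMAND DESCRIPTION: save the command, print its object name.
store_cmd() {
    cmd=$1
    desc=$2
    # The object name depends only on the command text.
    id=$(printf '%s' "$cmd" | sha1sum | cut -d' ' -f1)
    mkdir -p "$PIN_DIR"
    # The same command always maps to the same file, so an existing
    # object is left untouched instead of being written a second time.
    if [ ! -e "$PIN_DIR/$id" ]; then
        {
            printf 'CMD="%s"\n' "$(printf '%s' "$cmd" | base64 | tr -d '\n')"
            printf 'DESC="%s"\n' "$desc"
            printf 'TYPE=cmd\n'
        } > "$PIN_DIR/$id"
    fi
    printf '%s\n' "$id"
}

# load_cmd ID: print the decoded command stored under ID.
load_cmd() {
    sed -n 's/^CMD="\(.*\)"$/\1/p' "$PIN_DIR/$1" | base64 -d
}
```

Calling store_cmd twice with the same command text yields the same
name and a single file, which is the deduplication described above.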
Output Caching
Lastly, even though shell is very portable and requires the fewest
dependencies in order to be used, it is definitely slower than, say, a
Python script or a compiled C/C++ program. Since in the write path pin
mostly performs writes or updates to a fixed set of files, the
slowness of shell doesn't affect the write path much.
However, there's a slight impact on the read path. For example, the
pin grep command, which searches for occurrences of a term in the
database of commands (which are base64 encoded), can be slow if the
number of commands is very high. To deal with this problem, pin
implements output caching, which allows it to significantly reduce
read path latencies. pin stores the output of grep queries in the /tmp
folder. There's some more work needed to maintain and perhaps
configure the size of the cache, but in my experience, the number of
such commands that I run in a reboot cycle has not resulted in any
significant bloating of the cache.
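The caching idea can be sketched along these lines. Everything here
is assumed for illustration: the PIN_DIR and CACHE_DIR paths, the
function names, keying the cache on the SHA1 of the search term, and
clearing the whole cache on invalidation. pin's real cache layout may
differ.

```shell
#!/bin/sh
# Hypothetical locations; pin's real paths may differ.
PIN_DIR="${PIN_DIR:-$HOME/.pin-sketch}"
CACHE_DIR="${CACHE_DIR:-/tmp/pin-sketch-cache}"

# slow_grep TERM: decode every stored command and print the matching ones.
slow_grep() {
    for f in "$PIN_DIR"/*; do
        [ -f "$f" ] || continue
        # A file with no match is not an error, hence the "|| true".
        sed -n 's/^CMD="\(.*\)"$/\1/p' "$f" | base64 -d | grep -- "$1" || true
    done
}

# cached_grep TERM: reuse the saved output for TERM if there is one.
cached_grep() {
    key=$(printf '%s' "$1" | sha1sum | cut -d' ' -f1)
    out="$CACHE_DIR/$key"
    if [ ! -f "$out" ]; then
        mkdir -p "$CACHE_DIR"
        slow_grep "$1" > "$out"
    fi
    cat "$out"
}

# invalidate_cache: drop all saved results, e.g. after adding a command.
invalidate_cache() {
    rm -rf -- "$CACHE_DIR"
}
```

The first cached_grep for a term pays the full decode-and-search cost,
and repeated queries for the same term only read one file. Since the
cache lives under /tmp, it is typically cleared on reboot on systems
that clean /tmp at boot.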