Saturday, 9 May 2015

Automated File Modification in Bash


Disclaimer

My basic premise is that you want to make a throwaway script which requires adding, removing or modifying lines in a file whose contents and structure are more or less well known to you. Use cases might include tweaking a configuration file or modifying a template REST request body. Modifying files using basic tools (especially regular expressions) is prone to error. In general, if you want to make a maintainable, robust and efficient system which needs to perform file modifications, you should use a purpose-built third-party library or tool which understands the format of the file to be modified (JSON, XML etc.), for example Augeas.

Assumptions

For the purposes of this example, I am making the reasonable assumptions that you have access to a Bash shell and an up-to-date version of GNU Sed and Echo (OS X users will need to replace the BSD version of Sed with the GNU one using Brew).

Appending Lines to Files

Adding a line onto the end of a file is very simple using the double redirect operator >>, which appends the output of the command preceding it onto the end of a file. If all your script needs is to append a line to a file, do it this way:

echo '127.0.0.1    hostname' >> /etc/hosts

In this case, echo is used as a convenient command whose output can be fully controlled. We could take this one step further and parameterise the call to echo with an environment variable, giving a bit of code that can be used to add aliases to the hosts config file which point to the local operating system's loopback IP address.

echo "127.0.0.1    $HOSTNAME" >> /etc/hosts

Note that the variable $HOSTNAME will be evaluated when using double quotes, which is not the case when using single quotes. You should also know that the single redirect > will overwrite the contents of the target file with the output of the command rather than appending.
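
One drawback of a plain >> is that running the script twice appends the line twice. A minimal sketch of one way around that, assuming GNU Grep is available: append_line_once is a hypothetical helper name, and the demonstration writes to a scratch file rather than the real /etc/hosts.

```shell
#!/usr/bin/env bash
# append_line_once is a hypothetical helper name, not part of any tool.
# Append LINE to FILE only when no identical line is already present.
append_line_once() {
    local line=$1 file=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a scratch file rather than the real /etc/hosts.
hosts_copy=$(mktemp)
append_line_once "127.0.0.1    $HOSTNAME" "$hosts_copy"
append_line_once "127.0.0.1    $HOSTNAME" "$hosts_copy"   # second call is a no-op
cat "$hosts_copy"
```

The fixed-string, whole-line match means the check only succeeds on an exact duplicate, so a line that merely contains the same hostname does not block the append.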

Replacing Lines in a File by Line Number

Unfortunately, not every scenario is as simple as the append case. It might be desirable to have a script modify the logrotate configuration so that logs are rotated daily instead of weekly. Examination of the logrotate config file /etc/logrotate.conf reveals:

1 # see "man logrotate" for details 
2 # rotate log files weekly 
3 weekly 
4  
5 # keep 4 weeks worth of backlogs 
6 rotate 4
...

Note that line numbers can be shown when viewing the file in vim by entering :set number and hidden again by entering :set nonumber.

Line 3 needs to be changed from 'weekly' to 'daily'. This can be accomplished using GNU Sed:

sed -i '3d' /etc/logrotate.conf
sed -i '3i\daily' /etc/logrotate.conf

The first command deletes the 3rd line 'in place' (rather than outputting the modified file).
The second command modifies the file (again, 'in place') by inserting a line containing 'daily' at the 3rd line.
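
Editing by line number silently does the wrong thing if the file is not laid out as expected. The following is a sketch, not a definitive approach: it checks the current content of the line first and then replaces it in one step with GNU Sed's c command. replace_line_if is a hypothetical helper name and the scratch file stands in for /etc/logrotate.conf.

```shell
#!/usr/bin/env bash
# replace_line_if is a hypothetical helper: replace line N of FILE with NEW,
# but only when that line currently reads exactly EXPECTED.
replace_line_if() {
    local file=$1 n=$2 expected=$3 new=$4
    local current
    current=$(sed -n "${n}p" "$file")
    if [[ $current != "$expected" ]]; then
        echo "line $n is '$current', not '$expected'; leaving $file alone" >&2
        return 1
    fi
    # GNU sed's c command replaces the addressed line in a single step.
    sed -i "${n}c\\${new}" "$file"
}

# Scratch copy with the same first lines as the logrotate example.
conf_copy=$(mktemp)
printf '%s\n' '# see "man logrotate" for details' \
    '# rotate log files weekly' 'weekly' '' > "$conf_copy"

replace_line_if "$conf_copy" 3 weekly daily
sed -n '3p' "$conf_copy"
```

Because the check is an exact string comparison, trailing whitespace on the target line would cause the helper to refuse the edit rather than guess.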

Regular Expressions

For more advanced requirements it might be useful to use a regular expression to define the modifications to the file. One way to achieve the same effect as the previous example, replacing 'weekly' with 'daily', is to use the substitution command of Sed:

sed -i 's/weekly/daily/' /etc/logrotate.conf

This will replace the first occurrence of 'weekly' on each line with 'daily'. This does perform the change as required, but may change more than intended depending on the contents of the rest of the file (in the example file it would also rewrite the comment on line 2). It is safer to substitute only those instances of 'weekly' which are positioned at the beginning of a line by prepending the start-of-line anchor ^:

sed -i 's/^weekly/daily/' /etc/logrotate.conf

For further information about using regular expressions with Sed, see the official documentation. As stated earlier, regular expressions are particularly prone to causing unintended side effects. In more advanced scenarios, it is often better to use a program capable of understanding the file format being modified.
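
Since a substitution can touch more lines than expected, it can help to preview the change before applying it. A minimal sketch, assuming GNU Sed and diff: the scratch file and its contents are illustrative, and the expression is anchored at both ends so only a line reading exactly 'weekly' is affected.

```shell
#!/usr/bin/env bash
# Scratch copy standing in for /etc/logrotate.conf (contents are illustrative).
conf_copy=$(mktemp)
printf '%s\n' '# rotate log files weekly' 'weekly' 'rotate 4' > "$conf_copy"

# Anchoring at both ends (^ and $) matches only a line that is exactly 'weekly'.
script='s/^weekly$/daily/'

# Preview: without -i, sed writes to stdout, so diff can show what would change.
# diff exits with status 1 when its inputs differ, which is expected here.
sed "$script" "$conf_copy" | diff "$conf_copy" - || true

# Apply in place, keeping the original alongside as a .bak file.
sed -i.bak "$script" "$conf_copy"
```

The .bak copy gives a simple way back if the result is not what was intended.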

Wednesday, 8 April 2015

Extracting Values from JSON Responses in Bash

Disclaimer

My basic premise is that you want to make a throwaway script for exploring or monitoring a JSON REST API and you want to be able to pull out the value of a JSON property to feed into the next request, to log or to make a decision about what to do next. In general, if you want to make a maintainable, robust and efficient system which needs to do structured text parsing, you should use a purpose-built third-party library or tool like jq.

Assumptions

For the purposes of this example, I am making the reasonable assumptions that you have access to a Bash shell, Curl, Python 2.6+ and an up-to-date version of GNU Grep (OS X users will need to replace the BSD version with the GNU one using Brew).

Extracting JSON Data

An example of an API that returns JSON in the response body is the Wayback Machine REST API. I'll use a request to get the archive data for example.com in this example:

$ curl 'http://archive.org/wayback/available?url=example.com' 2> /dev/null

Which responds with some archive metadata:

{"archived_snapshots":{"closest":{"available":true,"url":"http://web.archive.org/web/20150119140634/http://me@example.com/","timestamp":"20150119140634","status":"200"}}}

Let's say that it's important to me to have a script make decisions on what to do next depending on the value of "status" returned in the response. It would be hard to use grep to extract this data without formatting the JSON first. This can be achieved using the Python JSON formatter as follows:

$ curl 'http://archive.org/wayback/available?url=example.com' 2> /dev/null | python -m json.tool

{
  "archived_snapshots": {
    "closest": {
      "available": true,
      "status": "200",
      "timestamp": "20150119140634",
      "url": "http://web.archive.org/web/20150119140634/http://me@example.com/"
    }
  }
}

It is then possible to pull out the status itself using Grep with a Perl-compatible regular expression (-P), printing only the matching part of the line (-o). The following regular expression uses a lookbehind to match the run of characters that follows "status": " and contains no newline, comma or double quote:

(?<="status": ")[^\n,"]*

Piping it together:

$ curl 'http://archive.org/wayback/available?url=example.com' 2>/dev/null | python -m json.tool | grep -Po '(?<="status": ")[^\n,"]*'

Gives the result:

200

If I wanted to extract the value of a numeric or boolean property, I'd have to adjust the expression a little by removing the double quote from the end of the lookbehind, since those values are not quoted:

$ curl 'http://archive.org/wayback/available?url=example.com' 2>/dev/null | python -m json.tool | grep -Po '(?<="available": )[^\n,"]*'

Resulting in:

true
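
Both expressions can be tried without touching the network. The sketch below assumes a GNU Grep built with -P support and feeds it an abridged copy of the formatted response from a scratch file; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# An abridged copy of the formatted response, saved for offline testing.
response=$(mktemp)
cat > "$response" <<'EOF'
{
  "archived_snapshots": {
    "closest": {
      "available": true,
      "status": "200",
      "timestamp": "20150119140634"
    }
  }
}
EOF

# String property: the lookbehind includes the opening double quote.
status=$(grep -Po '(?<="status": ")[^\n,"]*' "$response")
# Boolean property: no opening quote in the lookbehind.
available=$(grep -Po '(?<="available": )[^\n,"]*' "$response")

echo "status=$status available=$available"
```

Both patterns depend on the exact "key": value spacing that the formatter produces, which is why the response is formatted before it reaches Grep.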

To complete the example, the result can be stored directly into a variable for use later in the script:

$ export STATUS=$(curl 'http://archive.org/wayback/available?url=example.com' 2> /dev/null | python -m json.tool | grep -Po '(?<="status": ")[^\n,"]*')

if [[ $STATUS == '200' ]]; then
    echo 'OK'
fi
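
When the property is missing from the response, Grep prints nothing, so the variable ends up empty. A small sketch of handling that case alongside the expected one; describe_status is a hypothetical helper name and the return codes are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# describe_status is a hypothetical helper that maps the extracted value
# to a message and a return code.
describe_status() {
    case $1 in
        200) echo 'OK' ;;
        '')  echo 'no status found'; return 2 ;;
        *)   echo "unexpected status: $1"; return 1 ;;
    esac
}

describe_status '200'
describe_status '' || echo "(return code $?)"
```

In a real script the argument would be "$STATUS", quoted so that an empty value is still passed as an argument.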

Friday, 27 March 2015

ssh private/public keys

Note to self

I've always found setting up ssh rsa private/public keys fiddly. I have compiled some notes here:

When you are setting up ssh private/public keys do this.

Once you've generated your key, you can copy the public key across to the remote server in a one liner:

    $ ssh-copy-id -i ~/.ssh/id_rsa.pub yourusername@remoteserver

See here for more info.

Troubleshooting

SSH onto the remote host as the user you want to set up and ensure that the .ssh directory has the correct permissions:

    $ chmod 700 ~/.ssh
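
The authorized_keys file inside that directory usually needs tightening as well, since sshd typically ignores key files that other users can write to. A minimal sketch, assuming GNU stat: tighten_ssh_perms is a hypothetical helper name, and the demonstration runs against a scratch directory rather than a real ~/.ssh.

```shell
#!/usr/bin/env bash
# tighten_ssh_perms is a hypothetical helper; DIR defaults to ~/.ssh.
tighten_ssh_perms() {
    local dir=${1:-$HOME/.ssh}
    chmod 700 "$dir"
    # GNU stat: %a prints the octal mode, %n the file name.
    stat -c '%a %n' "$dir"
    if [[ -f $dir/authorized_keys ]]; then
        # authorized_keys should be readable and writable only by its owner.
        chmod 600 "$dir/authorized_keys"
        stat -c '%a %n' "$dir/authorized_keys"
    fi
}

# Demonstrate on a scratch directory with deliberately loose permissions.
scratch=$(mktemp -d)
touch "$scratch/authorized_keys"
chmod 755 "$scratch"
chmod 644 "$scratch/authorized_keys"
tighten_ssh_perms "$scratch"
```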

On the (RHEL) remote host:

    # vi /etc/ssh/sshd_config

Check that the following config is set:

 
    PubkeyAuthentication yes
    AuthorizedKeysFile    .ssh/authorized_keys
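
As an alternative to opening the file in an editor, the two settings can be listed with Grep. This is only a sketch: show_pubkey_settings is a hypothetical helper name, and it is demonstrated against a scratch file, since reading the real /etc/ssh/sshd_config normally requires root.

```shell
#!/usr/bin/env bash
# show_pubkey_settings is a hypothetical helper: print the uncommented
# PubkeyAuthentication and AuthorizedKeysFile lines from a config file.
show_pubkey_settings() {
    local config=${1:-/etc/ssh/sshd_config}
    grep -Ei '^[[:space:]]*(PubkeyAuthentication|AuthorizedKeysFile)[[:space:]]' "$config"
}

# Scratch file with one commented-out line that should be skipped.
sample=$(mktemp)
printf '%s\n' '#PubkeyAuthentication no' \
    'PubkeyAuthentication yes' \
    'AuthorizedKeysFile    .ssh/authorized_keys' > "$sample"

show_pubkey_settings "$sample"
```

Commented-out lines are skipped because the pattern only allows whitespace before the keyword.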

If you had to change it, restart the daemon:

    # service sshd restart