Showing posts with label bash. Show all posts

Saturday, 9 May 2015

Automated File Modification in Bash

Automated File Modification in Bash.md

Disclaimer

My basic premise is that you want to make a throwaway script which requires adding, removing or modifying lines in a file whose contents and structure is more or less well known to you. Use cases might include tweaking a configuration file or modifying a template REST request body. Modifying files using basic tools (especially regular expressions) is prone to error. In general, if you want to make a maintainable, robust and efficient system which needs to perform file modifications, you should use a purpose built 3rd party library or tool which understands the format of the file to be modified (JSON, XML etc), for example Augeas.

Assumptions

For the purposes of this example, I am making the reasonable assumptions that you have access to a Bash shell and an up to date version of GNU Sed and Echo (OSX users will need to upgrade from the BSD version of Sed using Brew).

Appending Lines to Files

Adding a line onto the end of a file is very simple using the double redirect operator >> which appends the output of the command preceeding it onto the end of a file. If you simply want to append a line onto the end of a file as part of a script and this does the job, do it this way:

echo '127.0.0.1    hostname' >> /etc/hosts

In this case, echo is used as a convenient command whose output can be fully controlled. We could take this one step further and parameterise the call to echo with an environmental variable making a bit of code that can be used to add aliases to the hosts config file which point to the local operating system's loopback IP address.

echo "127.0.0.1    $HOSTNAME" >> /etc/hosts

Note that the variable $HOSTNAME will be evaluated when using double quotes, which is not the case when using single quotes. You should also know that the single redirect > will overwrite the contents of the target file with the output of the command rather than appending.

Replacing Lines in a File by Line Number

Unfortunately, not every scenario is as simple as the append case. It might be desirable to have a script modify the logrotate configuration so that logs are rotated daily instead of weekly. Examination of the logrotate config file /etc/logrotate.conf reveals:

1 # see "man logrotate" for details 
2 # rotate log files weekly 
3 weekly 
4  
5 # keep 4 weeks worth of backlogs 
6 rotate 4
...

Note that line numbers can be added when viewing the file in vim by entering :set number and removed by entering :set nonumber
Line 3 needs to be changed from 'weekly' to 'daily'. This can be accomplished using GNU Sed:

sed -i '3d' /etc/logrotate.conf
sed -i '3i\daily' /etc/logrotate.conf

The first command deletes the 3rd line 'in place' (rather than outputting the modified file).
The second command modifies the file (again, 'in place') by inserting a line containing 'daily' at the 3rd line.

Regular Expressions

For more advanced requirements it might be useful to use a regular expression to define the modifications to the file. An attempt to acheive the effect of replacing the 'weekly' with 'daily' as in the previous example would be to use the substitution command of Sed:

sed -i 's/weekly/daily/' /etc/logrotate.conf

This will replace the first occurrence of 'weekly' on each line with 'daily'. This does perform the change as required, but may change more than intended depending on the contents of the rest of the file. It may be safer to substitute all instances of 'weekly' which are positioned at the beginning of the line by prepending the line beginning matcher character ^:

sed -i 's/^weekly/daily/' /etc/logrotate.conf

For further information about using regular expressions with Sed, see the official documentation. As stated earlier, regular expressions are particularly prone to causing unintended side effects. In more advanced scenarios, it is often better to use a program capable of understanding the file format being modified.

Wednesday, 8 April 2015

Extracting Values from JSON Responses in Bash

Disclaimer

My basic premise is that you want to make a throwaway script for exploring or monitoring a JSON REST API and you want to be able to pull the value of a JSON property to feed into the next request, to log or to make a decision about what to do next. In general, if you want to make a maintainable, robust and efficient system which needs to do structured text parsing, you should use a purpose built 3rd party library or framework like jq.

Assumptions

For the purposes of this example, I am making the reasonable assumptions that you have access to a Bash shell, Curl, Python 2.6+ and an up to date version of GNU Grep (OSX users will need to upgrade from the BSD version using Brew).

Extracting JSON Data

An example of an API that returns JSON in the response body is the Way Back When Machine REST API. I'll use a request to get the archive data for example.com in this example:

$ curl http://archive.org/wayback/available?url=example.com 2> /dev/null

Which responds with some archive metadata:

{"archived_snapshots":{"closest":{"available":true,"url":"http://web.archive.org/web/20150119140634/http://me@example.com/","timestamp":"20150119140634","status":"200"}}}

Let's say that it's important to me to have a script make decisions on what to do next depending on the value of "status" returned in the response. It would be hard to use grep to extract this data without formatting the JSON first. This can be achieved using the Python JSON formatter as follows:

$ curl http://archive.org/wayback/available?url=example.com 2> /dev/null | python -m json.tool

{
  "archived_snapshots": {
    "closest": {
      "available": true,
      "status": "200",
      "timestamp": "20150119140634",
      "url": "http://web.archive.org/web/20150119140634/http://me@example.com/"
    }
  }
}

It is then possible to pull out the status itself using a Grep Perl regular expression (-P) and printing out only the matching part of the line (-o). The following regular expression matches any character preceded by "status": " which is not a newline, a comma or a double quote:

(?<="status": ")[^\n,"]*:

Piping it together:

$ curl http://archive.org/wayback/available?url=example.com 2>/dev/null | python -m json.tool | grep -Po '(?<="status": ")[^\n,"]*'

Gives result:

If I wanted to extract the value of a numeric or boolean type property I'd have to adjust the expression a little bit by removing the double quote on the end of the initial matcher:

$ curl http://archive.org/wayback/available?url=example.com 2>/dev/null | python -m json.tool | grep -Po '(?<="available": )[^\n,"]*'

Resulting in:

true

To complete the example, the result can be stored directly into a variable for use later in the script:

$ export STATUS=$(curl http://archive.org/wayback/available?url=example.com 2> /dev/null | python -m json.tool | grep -Po '(?<="status": ")[^\n,"]*')

if [[ $STATUS == '200' ]]; then
    echo 'OK'
fi

Monday, 19 May 2014

How to Get the Absolute Path of the Currently Executing Script

Assign the directory of the currently executing script to a variable and then print it to the standard out:

pushd `dirname $0` > /dev/null
PATH_TO_SCRIPT=`pwd -P`popd > /dev/null
echo "Path:" $PATH_TO_SCRIPT;

Explanation:

1. Move to the directory of the script

2. Assign PATH_TO_SCRIPT to the absolute path of the current directory resolving symbolic links

3. Move back to the original directory

4. Print out the path