Saturday 9 May 2015

Automated File Modification in Bash

Disclaimer

My basic premise is that you want to make a throwaway script which requires adding, removing or modifying lines in a file whose contents and structure are more or less well known to you. Use cases might include tweaking a configuration file or modifying a template REST request body. Modifying files using basic tools (especially regular expressions) is prone to error. In general, if you want to make a maintainable, robust and efficient system which needs to perform file modifications, you should use a purpose-built third-party library or tool which understands the format of the file to be modified (JSON, XML etc.), for example Augeas.

Assumptions

For the purposes of this example, I am making the reasonable assumptions that you have access to a Bash shell and an up to date version of GNU Sed and Echo (OSX users will need to upgrade from the BSD version of Sed using Brew).

Appending Lines to Files

Adding a line onto the end of a file is very simple using the double redirect operator >>, which appends the output of the command preceding it onto the end of a file. If all you need is to append a line to a file as part of a script, do it this way:
echo '127.0.0.1    hostname' >> /etc/hosts
In this case, echo is used as a convenient command whose output can be fully controlled. We could take this one step further and parameterise the call to echo with an environment variable, giving a bit of code that can be used to add aliases to the hosts config file which point to the local operating system's loopback IP address.
echo "127.0.0.1    $HOSTNAME" >> /etc/hosts
Note that the variable $HOSTNAME will be evaluated when using double quotes, which is not the case when using single quotes. You should also know that the single redirect > will overwrite the contents of the target file with the output of the command rather than appending.
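
Running the append twice leaves two identical entries in the file. A minimal sketch of an append that is safe to repeat, assuming a hypothetical helper called append_once and a scratch file standing in for /etc/hosts:

```shell
# append_once is a hypothetical helper, not a standard command.
# It appends a line only when the file does not already contain that exact line.
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

hosts_file=$(mktemp)    # scratch file standing in for /etc/hosts
append_once "127.0.0.1    myalias" "$hosts_file"
append_once "127.0.0.1    myalias" "$hosts_file"    # second call changes nothing
line_count=$(wc -l < "$hosts_file")
echo "lines: $line_count"
```

printf is used inside the helper instead of echo only so that a line beginning with a dash is not mistaken for an option.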

Replacing Lines in a File by Line Number

Unfortunately, not every scenario is as simple as the append case. It might be desirable to have a script modify the logrotate configuration so that logs are rotated daily instead of weekly. Examination of the logrotate config file /etc/logrotate.conf reveals:
1 # see "man logrotate" for details 
2 # rotate log files weekly 
3 weekly 
4  
5 # keep 4 weeks worth of backlogs 
6 rotate 4
...
Note that line numbers can be added when viewing the file in vim by entering :set number and removed by entering :set nonumber
Line 3 needs to be changed from 'weekly' to 'daily'. This can be accomplished using GNU Sed:
sed -i '3d' /etc/logrotate.conf
sed -i '3i\daily' /etc/logrotate.conf
  • The first command deletes the 3rd line 'in place' (rather than outputting the modified file).
  • The second command modifies the file (again, 'in place') by inserting a line containing 'daily' at the 3rd line.
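
Line numbers shift whenever a file changes, so a script can check that line 3 holds the expected text before touching it. A minimal sketch, assuming GNU Sed and a scratch copy of the config rather than /etc/logrotate.conf itself; it replaces the line with a single substitution instead of a delete followed by an insert:

```shell
conf=$(mktemp)    # scratch file standing in for /etc/logrotate.conf
printf '%s\n' '# see "man logrotate" for details' \
              '# rotate log files weekly' \
              'weekly' \
              '' \
              '# keep 4 weeks worth of backlogs' \
              'rotate 4' > "$conf"

# Print only line 3 so it can be checked before the file is changed
current=$(sed -n '3p' "$conf")

if [ "$current" = 'weekly' ]; then
    # Replace the whole of line 3 in place
    sed -i '3s/.*/daily/' "$conf"
else
    echo "line 3 is '$current', not 'weekly'; leaving the file alone" >&2
fi

new_line=$(sed -n '3p' "$conf")
echo "$new_line"
```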

Regular Expressions

For more advanced requirements it might be useful to use a regular expression to define the modifications to the file. An attempt to achieve the effect of replacing 'weekly' with 'daily' as in the previous example would be to use the substitution command of Sed:
sed -i 's/weekly/daily/' /etc/logrotate.conf
This will replace the first occurrence of 'weekly' on each line with 'daily'. This does perform the change as required, but may change more than intended depending on the contents of the rest of the file. It may be safer to substitute all instances of 'weekly' which are positioned at the beginning of the line by prepending the line beginning matcher character ^:
sed -i 's/^weekly/daily/' /etc/logrotate.conf
For further information about using regular expressions with Sed, see the official documentation. As stated earlier, regular expressions are particularly prone to causing unintended side effects. In more advanced scenarios, it is often better to use a program capable of understanding the file format being modified.
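
To see how much an expression would change before committing to it, the edit can be run without -i and compared with the original. A minimal sketch, assuming GNU Sed and a scratch file whose contents (including the weekly_report_dir line) are made up for illustration. It also tries an expression anchored at both ends, so that only a line consisting solely of 'weekly' is replaced:

```shell
conf=$(mktemp)    # scratch file with made-up contents
printf '%s\n' '# rotate log files weekly' \
              'weekly' \
              'weekly_report_dir /var/reports' > "$conf"

# Count the lines each expression would change, without modifying the file.
# In diff's default output every changed line from the new version starts with '>'.
loose=$(sed 's/weekly/daily/' "$conf" | diff "$conf" - | grep -c '^>')
start=$(sed 's/^weekly/daily/' "$conf" | diff "$conf" - | grep -c '^>')
exact=$(sed 's/^weekly$/daily/' "$conf" | diff "$conf" - | grep -c '^>')
echo "unanchored: $loose, start-anchored: $start, fully anchored: $exact"

# Apply only the narrowest expression in place
sed -i 's/^weekly$/daily/' "$conf"
```

A fully anchored expression would not match a line with trailing whitespace, so it is worth checking the count is what you expect before applying it.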

Wednesday 8 April 2015

Extracting Values from JSON Responses in Bash

Disclaimer

My basic premise is that you want to make a throwaway script for exploring or monitoring a JSON REST API and you want to be able to pull the value of a JSON property to feed into the next request, to log or to make a decision about what to do next. In general, if you want to make a maintainable, robust and efficient system which needs to do structured text parsing, you should use a purpose built 3rd party library or framework like jq.

Assumptions

For the purposes of this example, I am making the reasonable assumptions that you have access to a Bash shell, Curl, Python 2.6+ and an up to date version of GNU Grep (OSX users will need to upgrade from the BSD version using Brew).

Extracting JSON Data

An example of an API that returns JSON in the response body is the Wayback Machine REST API. I'll use a request to get the archive data for example.com in this example:

$ curl http://archive.org/wayback/available?url=example.com 2> /dev/null

Which responds with some archive metadata:

{"archived_snapshots":{"closest":{"available":true,"url":"http://web.archive.org/web/20150119140634/http://me@example.com/","timestamp":"20150119140634","status":"200"}}}

Let's say that it's important to me to have a script make decisions on what to do next depending on the value of "status" returned in the response. It would be hard to use grep to extract this data without formatting the JSON first. This can be achieved using the Python JSON formatter as follows:

$ curl http://archive.org/wayback/available?url=example.com 2> /dev/null | python -m json.tool

{
  "archived_snapshots": {
    "closest": {
      "available": true,
      "status": "200",
      "timestamp": "20150119140634",
      "url": "http://web.archive.org/web/20150119140634/http://me@example.com/"
    }
  }
}

It is then possible to pull out the status itself using a Grep Perl regular expression (-P) and printing out only the matching part of the line (-o). The following regular expression matches any character preceded by "status": " which is not a newline, a comma or a double quote:

(?<="status": ")[^\n,"]*

Piping it together:

$ curl http://archive.org/wayback/available?url=example.com 2>/dev/null | python -m json.tool | grep -Po '(?<="status": ")[^\n,"]*'

Gives result:

200

If I wanted to extract the value of a numeric or boolean type property I'd have to adjust the expression a little bit by removing the double quote on the end of the initial matcher:

$ curl http://archive.org/wayback/available?url=example.com 2>/dev/null | python -m json.tool | grep -Po '(?<="available": )[^\n,"]*'

Resulting in:

true

To complete the example, the result can be stored directly into a variable for use later in the script:

$ export STATUS=$(curl http://archive.org/wayback/available?url=example.com 2> /dev/null | python -m json.tool | grep -Po '(?<="status": ")[^\n,"]*')

if [[ $STATUS == '200' ]]; then
    echo 'OK'
fi
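
The formatting step is needed because the look-behind has to match the exact spacing after the colon. A minimal sketch of an alternative, assuming a GNU Grep built with Perl regular expression support: \K discards everything matched so far, so the part before it can allow optional whitespace, and the same expression then works on formatted and unformatted JSON alike. The response is held in a variable here rather than fetched with Curl:

```shell
response='{"archived_snapshots":{"closest":{"available":true,"url":"http://web.archive.org/web/20150119140634/http://me@example.com/","timestamp":"20150119140634","status":"200"}}}'

# String property: match up to the opening quote, drop that part with \K,
# then keep everything up to the closing quote
status=$(printf '%s\n' "$response" | grep -Po '"status":\s*"\K[^"]*')

# Boolean or numeric property: no opening quote, so stop at a comma,
# a closing brace or whitespace
available=$(printf '%s\n' "$response" | grep -Po '"available":\s*\K[^,}\s]+')

echo "status=$status available=$available"
```

This is still pattern matching on text, so the caveats in the disclaimer apply just as much.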

Friday 27 March 2015

ssh private/public keys

Note to self

I've always found setting up ssh rsa private/public keys fiddly. I have compiled some notes here:

When you are setting up ssh private/public keys do this.

Once you've generated your key, you can copy the public key across to the remote server in a one liner:
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub yourusername@remoteserver

See here for more info.

Troubleshooting

Ssh onto the remote host as the user you want to set up and ensure that the .ssh directory has the correct permissions:
    $ chmod 700 ~/.ssh
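
The files inside the directory matter as well: with its default StrictModes setting, sshd refuses an authorized_keys file that group or other users can write to. A minimal sketch, assuming GNU stat and a scratch directory in place of the real ~/.ssh. The 600 mode on authorized_keys is the conventional choice rather than something taken from these notes:

```shell
ssh_dir=$(mktemp -d)/.ssh              # scratch directory standing in for ~/.ssh
mkdir -p "$ssh_dir"
touch "$ssh_dir/authorized_keys"

chmod 700 "$ssh_dir"                   # owner only: read, write, enter
chmod 600 "$ssh_dir/authorized_keys"   # owner only: read, write

# %a prints the permissions in octal (GNU stat)
dir_mode=$(stat -c '%a' "$ssh_dir")
keys_mode=$(stat -c '%a' "$ssh_dir/authorized_keys")
echo "$dir_mode $keys_mode"
```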

On the (RHEL) remote host:
    # vi /etc/ssh/sshd_config

Check that the following config is set:
 
    PubkeyAuthentication yes
    AuthorizedKeysFile    .ssh/authorized_keys

If you had to change it, restart the daemon:
    # service sshd restart

Tuesday 19 August 2014

Checking the JVM Heap Allocation Quickly from the Command Line

One liner for checking the JVM heap memory allocation:

java -server -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep -i heapsize

On a 32 bit JVM:

uintx ErgoHeapSizeLimit                    = 0                {product}
uintx InitialHeapSize                     := 63763392         {product}
uintx LargePageHeapSizeThreshold           = 134217728        {product}
uintx MaxHeapSize                         := 1020214272       {product}
java version "1.6.0_22"
Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
Java HotSpot(TM) Server VM (build 17.1-b03, mixed mode)

(1 gigabyte allocated)

On a 64 bit JVM:

uintx ErgoHeapSizeLimit                    = 0                {product}
uintx InitialHeapSize                     := 128755328        {product}
uintx LargePageHeapSizeThreshold           = 134217728        {product}
uintx MaxHeapSize                         := 2060085248       {product}
java version "1.6.0_22"
Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03, mixed mode)

(2 gigabytes allocated)

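On these machines a 32 bit JVM defaults to a max heap of roughly 1G and a 64 bit JVM to roughly 2G. The default is derived from the physical memory of the machine, so the figures will differ elsewhere.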
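
The sizes are reported in bytes. A minimal sketch of converting them to whole mebibytes with awk, fed with the MaxHeapSize lines from the two outputs above instead of a live JVM (to_mib is a hypothetical helper name):

```shell
# to_mib is a hypothetical helper: it reads PrintFlagsFinal-style lines on stdin
# and prints the MaxHeapSize value in whole MiB (1 MiB = 1048576 bytes).
# Fields are: type, flag name, '=' or ':=', value, category.
to_mib() {
    awk '$2 == "MaxHeapSize" { print int($4 / 1048576) }'
}

mib_32=$(echo 'uintx MaxHeapSize                         := 1020214272       {product}' | to_mib)
mib_64=$(echo 'uintx MaxHeapSize                         := 2060085248       {product}' | to_mib)
echo "32 bit: ${mib_32} MiB, 64 bit: ${mib_64} MiB"
```

With a JVM installed, the output of the one-liner could be piped into to_mib in the same way.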

Tuesday 3 June 2014

How to find the Most Recent Commit to a File in Git

As pointed out here, it is pretty simple to find the most recent commit to a particular file in Git:

git log -n 1 -- example/file

Thursday 22 May 2014

Git Diff Filenames Ignoring Whitespace

A colleague of mine asked me whether I knew how to compare 2 branches in Git outputting only the filenames, but ignoring files with only whitespace changes.

Taking inspiration from here, I tried:


git diff --ignore-space-at-eol -b -w --name-only

But I was quite surprised to see files which only had whitespace changes showing up in the output. So I suggested working around the strange Git behaviour like this:


git diff --ignore-space-at-eol -b -w | grep ^diff

Which filters the output of the full diff, giving lines like:


diff --git a/non-whitespace.txt b/non-whitespace.txt
diff --git a/non-whitespace2.txt b/non-whitespace2.txt


He further improved on this by removing the a/ b/ prefix and using awk to extract only the filename for each line:


git diff --ignore-space-at-eol -b -w --no-prefix | grep ^diff | awk '{print $3}'


Giving a nice output:

non-whitespace.txt
non-whitespace2.txt


How To: Create a Bash Terminal in Windows!

Why I Want a Bash Terminal in Windows

Since I have a choice, my favourite operating system for development is Linux and I use it at work. On the other hand, I often want to play computer games in my free time, so I use Windows 7 at home. However, sometimes I like to develop at home so I decided to see if I could add a Linux like terminal to Windows. Why?


  • I find a Bash terminal is a more compelling command line than trying to use cmd.exe
  • The skills I have gained from day to day use of Linux are now portable to Windows
  • I find that being able to open a terminal and perform some operation on the local file system or a remote Linux box using only the keyboard improves my ability to multitask

I understand that there are other ways of dealing with the "Linux for Work, Windows for Fun" problem. I could use a dual boot system or run Windows emulation for games at home, but I have tried both of those in the past and they are not to my liking.


    A Solution

    Install freely available tools MinGW, msys and mintty for a Bash terminal with GNU utilities in Windows. You get familiar tools like: ssh scp cp mv rm ls grep vim wget tar gzip and a Bash terminal!


    Warning - This terminal does not like spaces in paths and you will have to perform some workarounds for this for Windows programs you want to be able to invoke from the command line (see "Setup Notepad++" below). Also, these tools do not fully support all Unix functions, but that's OK, the aim here is to make a system which is convenient for developers who work in Linux and Windows rather than creating a full Unix system on top of Windows.

    Disclaimer

    Another popular solution is Cygwin. A comparison of the two is outside the scope of this blog post; read more here.

    MinGW (Minimalist GNU for Windows)

    Download from here
    Choose: mingw-get-inst-20120426.exe
    (latest version at the time of writing)

    When installing, select Components:

    • C Compiler
    • MSYS Basic System
    Finish installer.

    Run cmd.exe
    Enter the following commands:

    cd c:\MinGW\bin
    mingw-get update
    mingw-get upgrade
    mingw-get install msys-mintty msys-vim msys-wget msys-openssl msys-openssh msys-tar msys-gzip msys-zip msys-unzip msys-libbz2


    Mintty (Xterm Emulator for Windows):


    Make a shortcut (link) on desktop for mintty:
    name:
    mintty
    path: c:\MinGW\msys\1.0\bin\mintty.exe /bin/bash --login -i

    Launch mintty (all terminal commands from this point on are to be entered into mintty)

    Mintty Solarized Theme (Optional, but really cool):

    Read more about Solarized here

    Get from here

    Paste the text of light or dark theme into ~/.minttyrc

    Also add the lines:

    BoldAsFont=no
    Font=Consolas
    FontHeight=11


    Mintty Familiar Behaviour (Optional):


    I really miss ll if I don't have it.

    Add the following lines to ~/.bash_profile

    alias ls='ls --color=always'
    alias ll='ls -l'

    Move to previous/next word using CTRL+Left/Right Arrow:

    Add the following lines to ~/.inputrc

    # Ctrl+Left/Right to move by whole words
    "\e[1;5C": forward-word
    "\e[1;5D": backward-word

    Exit and open a new terminal to check that these work.

    Customise Vim (Optional):


    I added syntax highlighting, spaces in place of tabs, paste mode and a mode dependent cursor to vimrc:

    vim /share/vim/vimrc

    " Syntax highlighting on
    syntax on
    
    " Allow pasting blocks of code
    set paste
    
    " Tabs for spaces
    set expandtab
    set tabstop=4
    
    " Mode dependent cursor
    let &t_ti.="\e[1 q"
    let &t_SI.="\e[5 q"
    let &t_EI.="\e[1 q"
    let &t_te.="\e[0 q"
    

    Msysgit (Git for Windows):


    If msysgit is already installed on your system, remove it first, we need to install it again in a non-default directory.

    Download from here

    Choose: Git-1.8.1.2-preview20130201.exe
    (latest version at the time of writing)


    Msysgit installation path: C:\msysgit

    Configuring the line ending conversions
    Checkout as-is, commit Unix-style line endings

    Add c:\msysgit\cmd to the PATH env variable

    (Start bar -> run "Edit the system environment variables" -> Environment Variables -> Edit PATH variable)

    Git Colour (Optional):


    vim /c/Users/[username]/.gitconfig

    http://nathanhoad.net/how-to-colours-in-git

    Set up Notepad++ (Optional):


    Notepad++ is a pretty cool text editor, in fact, I would not mind having this program in a Linux environment. However, MinGW does not like spaces in paths (e.g. C:\Program Files (x86)\Notepad++\ ). So to access Notepad++ from mintty, it is necessary to uninstall Notepad++ and reinstall to a directory not containing a space for example: C:\Notepad++

    Once this is done, add C:\Notepad++ to the PATH env variable

    (Start bar -> run "Edit the system environment variables" -> Environment Variables -> Edit PATH variable)

    (Optional again) Make a shortcut for Notepad++ with fewer characters:

    cd /c/Notepad++/
    ln -s notepad++.exe np


    Then edit a text file from anywhere in mintty like:

    np my.xml

    However, Windows 7 does not allow its users to set the default for opening a particular file type to an executable which is outside one of the 'Program Files' folders. This might trouble you if you want Notepad++ to be the default .txt, .xml etc. editor. MinGW doesn't like spaces in paths so let's do a workaround!

    The workaround for this is to add a .bat file (Windows version of a shell script) in a folder contained in the Program Files folder. Start off by creating the directory:

    C:\Program Files\Shortcuts
    (You may prefer C:\Program Files\AAShortcuts as it will be at the top of the list)

    Create a file inside that directory called notepad++.bat and add the line:

    start notepad++ %*

    This starts Notepad++ (which is on the PATH) in a new window and forwards the arguments given to the script.

    Then you can set the default application for the desired file type (e.g. .xml) as normal
    Shift+Right Click on an .xml file -> Choose Default Program -> Browse to:

    C:\Program Files\Shortcuts\notepad++.bat


    Notepad++ Solarized Theme (Optional):


    Notepad++ has Solarized theme built in:

    Settings -> Style Configurator -> Select Theme -> Solarized

    For a consistent theme with the terminal:

    Font name: Consolas
    Font size: 11

    Save & Close