Posts tagged ‘Unix’

Parsing XML data using bash and standard Unix tools

Parsing XML can be a tedious and unpleasant job if you insist on using just standard Unix tools like sed, awk, cut, grep and so on. One might say that it’s better to use Python, Perl, Ruby, or another language that ships with a full-blown XML parser, and keep the standard Unix utilities for what they were meant for: plain old text files, not pesky XML. The problem with those nice programming languages is that they take away the one-liners. You need to import stuff, have variables, flow control and so on.

A nice tool that makes one’s life easier when it comes to XML is xml2. It converts an XML file into a flat, line-oriented format in which every line carries the full path of an element or attribute, followed by its value. Debian has this neat tool in its standard repositories, so you are one apt-get away from using it.

 

One simple example. Take this XML file:


<xml>
<fruits>
<fruit name="apple" type="royal gala" quantity="2" price="1"/>
<fruit name="orange" type="tasty" quantity="4" price="1.5"/>
<fruit name="banana" type="green" quantity="3" price="1"/>
</fruits>
</xml>

We run xml2 against it:

cosu@roadwarrior:/tmp$ xml2 < fruits.xml
/xml/fruits/fruit/@name=apple
/xml/fruits/fruit/@type=royal gala
/xml/fruits/fruit/@quantity=2
/xml/fruits/fruit/@price=1
/xml/fruits/fruit
/xml/fruits/fruit/@name=orange
/xml/fruits/fruit/@type=tasty
/xml/fruits/fruit/@quantity=4
/xml/fruits/fruit/@price=1.5
/xml/fruits/fruit
/xml/fruits/fruit/@name=banana
/xml/fruits/fruit/@type=green
/xml/fruits/fruit/@quantity=3
/xml/fruits/fruit/@price=1

And now we extract all the fruit names:

cosu@roadwarrior:/tmp$ xml2 < fruits.xml | grep '/@name=' | cut -d= -f2-
apple
orange
banana
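
The same idea can be wrapped in a small helper so that any attribute can be pulled out by its path. This is only a minimal sketch, assuming the xml2 output format shown above; `xml_attr` is a made-up name for this illustration, not something xml2 provides, and the sample lines are inlined so the snippet runs even where xml2 is not installed.

```shell
#!/bin/sh
# xml_attr PATH ATTR (hypothetical helper, not part of xml2):
# read xml2-style lines on stdin and print the values of attribute ATTR
# for elements found at PATH.
xml_attr() {
    # Compare against the exact "PATH/@ATTR=" prefix and print everything
    # after it, so values that contain "=" are not cut short.
    awk -v prefix="$1/@$2=" \
        'index($0, prefix) == 1 { print substr($0, length(prefix) + 1) }'
}

# Sample input copied from the xml2 output above.
printf '%s\n' \
    '/xml/fruits/fruit/@name=apple' \
    '/xml/fruits/fruit/@type=royal gala' \
    '/xml/fruits/fruit' \
    '/xml/fruits/fruit/@name=orange' \
    '/xml/fruits/fruit/@type=tasty' |
    xml_attr /xml/fruits/fruit type
```

Against the real file this would be called as `xml2 < fruits.xml | xml_attr /xml/fruits/fruit name`.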

There you go! A fruit salad! Of course for more complicated stuff use other tools :)
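
Before switching tools, though, the flattened format stretches a little further with awk. In the output above, a bare `/xml/fruits/fruit` line separates one fruit from the next, which is enough to group attributes per record. The sketch below is an illustration under that assumption only; `fruit_totals` is a hypothetical helper name, and it multiplies quantity by price for each fruit.

```shell
#!/bin/sh
# fruit_totals (hypothetical helper): read xml2-style lines on stdin and
# print "name total" for each fruit, where total = quantity * price.
fruit_totals() {
    LC_ALL=C awk -F= '
        # Print the record collected so far, if any, then reset it.
        function emit() {
            if (name != "") printf "%s %.2f\n", name, qty * price
            name = ""; qty = 0; price = 0
        }
        $1 == "/xml/fruits/fruit"           { emit() }    # bare path: a new fruit starts
        $1 == "/xml/fruits/fruit/@name"     { name  = $2 }
        $1 == "/xml/fruits/fruit/@quantity" { qty   = $2 }
        $1 == "/xml/fruits/fruit/@price"    { price = $2 }
        END { emit() }                                    # last record has no separator after it
    '
}

# Sample input copied from the xml2 output above (type attributes omitted).
printf '%s\n' \
    '/xml/fruits/fruit/@name=apple' \
    '/xml/fruits/fruit/@quantity=2' \
    '/xml/fruits/fruit/@price=1' \
    '/xml/fruits/fruit' \
    '/xml/fruits/fruit/@name=orange' \
    '/xml/fruits/fruit/@quantity=4' \
    '/xml/fruits/fruit/@price=1.5' |
    fruit_totals
```

With the real file the call would be `xml2 < fruits.xml | fruit_totals`. Splitting on `=` keeps the script short, at the cost of truncating any name that itself contains an `=`.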

 

Unix toolbox

Some things you simply cannot keep in your head. That is what man pages, books, manuals, tutorials, how-tos, or this link are for.

File System History

Ars Technica has a great article that walks through nearly all the important file systems, from the days of Burebista up to the present. Implementation details are kept to a minimum, so it makes for a pleasant evening read. Recommended!

And a quote from Linus Torvalds about the file system on OS X: “Their file system is complete and utter crap, which is scary.”

30+ Reasons Why You Should Say MsDog and not MsDos.

While digging through the libc documentation, I came across the following comment:

This function is not part of the ISO or POSIX standards, and is not customary on Unix systems, but we did not invent it either. Perhaps it comes from MS-DOG.

A few seconds later I was on the floor laughing. Here is why:

http://www.delorie.com/djgpp/doc/msdog.html