README

Demonstrates how to handle XML in shell scripts.

Install

# install xmllint package
sudo apt install libxml2-utils

Troubleshooting

xmllint --debug ./danceloop_00009.svg

Shell

Manually find the correct paths

xmllint --shell ./danceloop_00009.svg 

> help
> setns x=http://www.w3.org/2000/svg
> xpath /x:svg/x:g
> xpath /x:svg/x:g/x:path/@d

Extraction

# this fails because svg is namespaced
xmllint --xpath '//svg/g/path/@d' ./danceloop_00009.svg

# loosen the match with local-name() so the namespace does not have to be specified at every step
xmllint --debug --xpath "//*[local-name()='path']" ./danceloop_00009.svg

# extract the attribute
xmllint --debug --xpath "string(//*[local-name()='path']/@d)" ./danceloop_00009.svg
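
The namespace behaviour above can be reproduced without the original SVG. The following is a minimal sketch, assuming a throwaway `sample.svg` in a temporary directory (a hypothetical file, not part of this repo) that contains a single namespaced path. It runs the plain query and the `local-name()` query side by side so the difference is visible.

```shell
#!/usr/bin/env bash
# Work in a temporary directory; sample.svg is a hypothetical file used only for this sketch.
workdir=$(mktemp -d)
cat > "${workdir}/sample.svg" <<'EOF'
<svg xmlns="http://www.w3.org/2000/svg">
  <g>
    <path d="M0 0 L10 10"/>
  </g>
</svg>
EOF

# Plain element names match nothing, because every element lives in the SVG namespace.
plain=$(xmllint --xpath 'string(//svg/g/path/@d)' "${workdir}/sample.svg" 2>/dev/null)

# local-name() compares only the unqualified element name, so the namespace is ignored.
loose=$(xmllint --xpath "string(//*[local-name()='path']/@d)" "${workdir}/sample.svg" 2>/dev/null)

echo "plain: '${plain}'"
echo "loose: '${loose}'"
```

The plain query comes back empty, while the loosened query returns the `d` attribute of the path.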

Convert into JSON

Refer to the jq examples

cat <<- EOF > "./frames.json"
{
    "frames": [
    ]
}
EOF
_filename="./danceloop_00009.svg"
_no_extension="${_filename%.*}"
_frame_number=$(echo "${_no_extension}" | grep --color=never -o -E '[0-9]+')
# write the path data to ./path.txt, the file that jq reads with --rawfile
xmllint --debug --xpath "string(//*[local-name()='path']/@d)" ./danceloop_00009.svg > ./path.txt
# prints the updated document to stdout; ./frames.json itself is not modified
jq --rawfile path ./path.txt --arg filename "${_no_extension}" --arg number "${_frame_number}" '.frames += [ {"name":$filename, "path":$path, "number":$number | tonumber }]' "./frames.json"
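
The frame number extracted above keeps its leading zeros (`00009`), and the `grep` would return more than one match if the path contained other digits. A small sketch of the filename handling on its own, assuming bash; the helper name `frame_number` is hypothetical. It takes the last run of digits and uses `10#` arithmetic to force base 10, since bash otherwise treats a leading zero as octal.

```shell
#!/usr/bin/env bash
# frame_number is a hypothetical helper: prints the numeric frame index found in a file name.
frame_number() {
    local filename="$1"
    local no_extension="${filename%.*}"   # ./danceloop_00009.svg -> ./danceloop_00009
    local digits
    # Keep only the last run of digits, so digits earlier in the path do not interfere.
    digits=$(printf '%s\n' "${no_extension}" | grep -o -E '[0-9]+' | tail -n 1)
    # Fail instead of doing arithmetic on an empty string.
    [ -n "${digits}" ] || return 1
    # 10# forces base 10, so 00009 is read as 9 rather than as an octal literal.
    printf '%d\n' "$((10#${digits}))"
}

frame_number "./danceloop_00009.svg"   # prints 9
```

The result can be passed to jq with `--argjson number "$(frame_number "${_filename}")"`, which removes the need for `tonumber` in the filter.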

Process NMAP

Ref: 11_nmap_scanning

# scan network (save xml file)
nmap -p 22 -oX ./net.xml -vvv 192.168.1.0/24   

# show hosts that are up
xmllint --xpath '//host/status[@state="up"]/../address/@addr' ./net.xml
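
Selecting attribute nodes makes xmllint print them as `addr="..."` pairs rather than bare values. A hedged sketch of reducing that output to plain addresses follows; the helper name `only_ipv4` is hypothetical, and the sample input is an assumed illustration of the output shape, not captured from a real scan.

```shell
#!/usr/bin/env bash
# only_ipv4 is a hypothetical helper: reads text on stdin and prints each
# IPv4-looking token on its own line.
only_ipv4() {
    grep -o -E '([0-9]{1,3}\.){3}[0-9]{1,3}'
}

# Assumed sample shaped like attribute output; a real scan will list different hosts.
printf ' addr="192.168.1.1"\n addr="192.168.1.20"\n' | only_ipv4
```

With a real scan this would be used as `xmllint --xpath '//host/status[@state="up"]/../address/@addr' ./net.xml | only_ipv4`. The filter also drops MAC addresses, which nmap can record as additional `address` elements for hosts on the local network.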

RSS Feed

mkdir -p ./out

# get rss feed
curl -s -o ./out/rss.xml https://latenightlinux.com/feed/mp3

# get first url
FEED_URL=$(xmllint --xpath 'string(//rss/channel/item[1]/enclosure/@url)' --format --pretty 2 ./out/rss.xml)

# get the file
curl -s -L -o "./out/$(basename "${FEED_URL}")" "${FEED_URL}"
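
Enclosure URLs sometimes carry a query string or fragment, which `basename` would keep as part of the file name. A minimal sketch of deriving a cleaner name, assuming bash; the helper name `url_filename` is hypothetical and the example.com URL is a placeholder, not a real enclosure from the feed.

```shell
#!/usr/bin/env bash
# url_filename is a hypothetical helper: derives a local file name from a download URL.
url_filename() {
    local url="$1"
    url="${url%%\#*}"   # drop any #fragment
    url="${url%%\?*}"   # drop any ?query
    basename "${url}"
}

# Placeholder URL for illustration only.
url_filename "https://example.com/audio/episode-1.mp3?source=feed"   # prints episode-1.mp3
```

Checking `[ -n "${FEED_URL}" ]` before calling curl avoids a confusing download error when the XPath query matches nothing and returns an empty string.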

S3 bucket listing

Process S3 bucket listing.

# acquire the bucket listing and save it as ./s3listing.xml for the commands below
curl -s "https://mybucket.s3.eu-west-1.amazonaws.com/?prefix=myprefix&max-keys=3&marker=key/path/file" | xmllint --format - | tee ./s3listing.xml

# process manually in xmllint shell
xmllint --shell ./s3listing.xml  
setns x=http://s3.amazonaws.com/doc/2006-03-01/
xpath /x:ListBucketResult/x:Contents/x:Key/text()

# extract the keys non-interactively with xmllint
xmllint --xpath '//*[local-name()="ListBucketResult"]/*[local-name()="Contents"]/*[local-name()="Key"]/text()' --format ./s3listing.xml   

# easier to write it in python.
./extract_xml.py ./s3listing.xml   

# using yq that ignores namespaces
yq -oy '.ListBucketResult.Contents[].Key' ./s3listing.xml
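
The request above uses `max-keys` and `marker`, so a listing may arrive one page at a time. The following sketch shows one way to read the paging state out of a page with xmllint. It assumes a hand-written sample file shaped like a `ListBucketResult` (the keys are placeholders) and uses the last `Key` as the marker for the next request, which is how this style of listing is paged when no delimiter is supplied.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Assumed, hand-written sample shaped like a ListBucketResult; keys are placeholders.
cat > "${workdir}/s3listing.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>mybucket</Name>
  <IsTruncated>true</IsTruncated>
  <Contents><Key>myprefix/a.txt</Key></Contents>
  <Contents><Key>myprefix/b.txt</Key></Contents>
</ListBucketResult>
EOF

# string() returns the text of the first matching node.
truncated=$(xmllint --xpath "string(//*[local-name()='IsTruncated'])" "${workdir}/s3listing.xml")

# The last Key on the page can serve as the marker for the next request.
next_marker=$(xmllint --xpath "string((//*[local-name()='Key'])[last()])" "${workdir}/s3listing.xml")

echo "truncated=${truncated} next_marker=${next_marker}"
```

A loop would repeat the curl request with `marker=${next_marker}` until `truncated` is no longer `true`.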

Resources

Some example Xml tooling here
Explanation of the namespacing issues here
More namespacing here
Extract XML Elements Using xmllint here
xmllint in Linux here
xpath cheatsheet here
What is RSS? here
YQ: Working with XML here
