Thursday, 4 June 2015

PBS Pro Tutorial

Posted by sparutwar
What is PBS Pro?
Portable Batch System (PBS) is software used in cluster computing to schedule jobs across multiple nodes. PBS started as a contract project at NASA.
            PBS is available in three versions, as below:
1) Torque: Terascale Open-source Resource and QUEue Manager (Torque) was developed from OpenPBS. It is developed and maintained by Adaptive Computing Enterprises. It is used as a distributed resource manager and performs well when integrated with the Maui cluster scheduler.
2) PBS Professional (PBS Pro): The commercial version of PBS, offered by Altair Engineering.
3) OpenPBS: The original open-source version, released in 1998 and developed by NASA. It is no longer actively developed.
In this article we are going to concentrate on PBS Pro, which is similar to Torque to some extent.
Fig. 1.1 PBS complex cluster with eight execution host
PBS consists of three basic units: the server, the MoM (execution host), and the scheduler.
1)      Server: The heart of PBS, with the executable named “pbs_server”. It uses the IP network to communicate with the MoMs. The PBS server creates batch jobs and modifies jobs as requested from the different MoMs. It keeps track of all resources available and assigned in the PBS complex across the MoMs. It also monitors the PBS license for jobs; if your license expires it will throw an error.
2)      Scheduler: The PBS scheduler (executable “pbs_sched”) uses various algorithms to decide when a job should execute and on which node or vnode, using the details of available resources reported by the server.
3)      MoM: The MoM (executable “pbs_mom”) is the mother of all executing jobs. When a MoM gets a job from the server, it actually executes that job on its host. Every node must run a MoM to participate in execution.

Installation and Setting up of environment (cluster with multiple nodes)
Extract the compressed PBS Pro package and go to the path of the extracted folder; it contains an “INSTALL” file. Make that file executable, for example with “chmod +x ./INSTALL”, and run it as shown in the image below. It will ask for the “execution directory”, where you want to store the executables (such as qsub, pbsnodes, qdel etc.) used for the different PBS operations, and the “home directory”, which contains the different configuration files. Keep both as default for simplicity.
            There are three kinds of installation available, as shown in the figure:
Fig. 1.2 PBS installation
1) Server node: The PBS server, scheduler, MoM and commands are installed on this node. The PBS server keeps track of all execution MoMs present in the cluster and schedules jobs on those execution nodes. Since the MoM and commands are also installed on the server node, it can be used to submit and execute jobs as well.
2) Execution node: This type installs the MoM and commands. These nodes are added as available execution nodes in the cluster. They are also allowed to submit jobs to the server, given specific permission by the server, as we are going to see below. They are not involved in scheduling. This kind of installation asks for the PBS server name, which is used to submit jobs, get the status of jobs, etc.
3) Client node: These nodes are only allowed to submit PBS jobs to the server (with specific permission from the server) and to see the status of jobs. They are not involved in execution or scheduling.

Creating vnode in PBS Pro:
We can create multiple vnodes on a single node, each containing some part of the node's resources, and execute jobs on these vnodes with the specified allocated resources. We can create vnodes using the qmgr command, which is the command-line interface to the PBS server. The command given below creates vnodes using qmgr.

Qmgr: create node Vnode1,Vnode2 resources_available.ncpus=8, resources_available.mem=10gb, 
resources_available.ngpus=1, sharing=default_excl 

The command above will create two vnodes named Vnode1 and Vnode2, each with 8 CPU cores, 10gb of memory and 1 GPU, and with the sharing mode default_excl, which means a vnode executes only one job at a time, exclusively, regardless of how many resources are free. The sharing mode can instead be default_shared, which means any number of jobs can run on the vnode until all of its resources are busy. All the attributes that can be used during vnode creation are listed in the PBS Pro reference guide.

You can also create a file in the "/var/spool/PBS/mom_priv/config.d/" folder, with any name you want (I prefer hostname-vnode); a sample is given below. PBS reads every file in this folder, even temporary files ending in (~), and later files override the configuration for the same vnode, so delete unnecessary files to get the proper vnode configuration.

$configversion 2
In this example we set resources_available to 0 on the default node, because by default PBS detects and allocates all available resources to the default node, whose sharing attribute is default_shared. That causes a problem: all jobs get scheduled on that default vnode by default, precisely because its sharing type is default_shared. If you want jobs scheduled on your customized vnodes, you should set resources_available to 0 on the default vnode. Vnodes defined this way are recreated every time you restart the PBS server, unlike vnodes created manually on the command line.
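A minimal sketch of what such a vnode definition file might contain, following the version-2 format whose header line survives above. The hostname titan and the resource values are illustrative; the natural vnode (the bare hostname) gets 0 resources, as explained above, while the custom vnode titan[0] receives the real resources:

```
$configversion 2
titan: resources_available.ncpus = 0
titan: resources_available.mem = 0
titan[0]: resources_available.ncpus = 8
titan[0]: resources_available.mem = 10gb
titan[0]: resources_available.ngpus = 1
titan[0]: sharing = default_excl
```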

PBS get status:
Get status of jobs:
qstat gives details about jobs, their states, etc.
Useful options:
To print details of all jobs which are running or in the hold state: qstat -a
To print details of subjobs in a Job Array which are running or in the hold state: qstat -ta
To print all finished jobs: qstat -x

Get status of PBS nodes and vnodes:
The "pbsnodes -a" command lists all nodes present in the PBS complex with their resources available, resources assigned, status, etc.
To get details of all nodes and of the vnodes you created, use the "pbsnodes -av" command.
You can also specify a node or vnode name to get detailed information about that specific node or vnode:
pbsnodes wolverine (here wolverine is the hostname of a node in the PBS complex, mapped to an IP address in the /etc/hosts file)
Job submission (qsub):
A PBS MoM submits jobs to the PBS server. The server maintains queues of jobs; by default all jobs are submitted to the default queue named “workq”. You may create multiple queues using the “qmgr” command, the administrator interface mainly used to create, delete and modify queues and vnodes. The PBS server decides which job is to be scheduled on which node or vnode, based on the scheduling policy and the privileges set by the user. To schedule jobs, the server continuously pings all MoMs in the PBS complex to get details of the resources available and assigned. PBS assigns a unique job identifier, called the JobID, to each and every job.
For job submission PBS uses the “qsub” command. It has the syntax shown below:
qsub script
Here script may be a shell (sh, csh, tcsh, ksh, bash) script. By default PBS uses /bin/sh. You may refer to the simple script given below:
#!/bin/sh
echo "This is PBS job"
sleep 100

When PBS completes execution of a job it stores the errors in a file named JobName.e{JobID}, e.g. Job1.e1492, and the output in a file named
JobName.o{JobID}, e.g. Job1.o1492.
By default it stores these files in the current working directory (which can be seen with the pwd command). You can change the output location by giving a path with the -o option (and the error location with -e).

You may specify a job name with the -N option while submitting the job:
qsub -N firstJob ./

If you don't specify a job name, PBS stores the files with JobName replaced by the script name.
e.g. qsub ./ — this command stores the results in files named after the script, in the current working directory.
qsub -N firstJob -o /home/user1/ ./ — this command stores the output file firstJob.o{JobID} in the /home/user1/ directory.
If a submitted job terminates abnormally (errors in the job are not abnormal; those errors get stored in the JobName.e{JobID} file), PBS stores its error and output files in the "/var/spool/PBS/undelivered/" folder.
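Putting the options above together: a job script can also carry its qsub options as #PBS directives at the top of the file. The sketch below is illustrative (the job name, resource values and output path are assumptions); since #PBS lines are ordinary comments to the shell, the script also runs standalone:

```shell
# Create a sample job script; the #PBS lines are read by qsub at submission
# time but are plain comments to the shell itself.
cat > job.sh <<'EOF'
#!/bin/sh
#PBS -N firstJob
#PBS -l select=1:ncpus=2:mem=1gb
#PBS -o /home/user1/
echo "This is PBS job"
EOF
# With a PBS server you would submit it with: qsub job.sh
# Without one, it still runs directly:
sh job.sh
```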

Useful Options:
Select resources:
qsub -l select=<number of chunks>:ncpus=3:ngpus=1:mem=2gb script

qsub -l select=2:ncpus=3:ngpus=1:mem=2gb /home/titan/PBS/scripts/

This job selects 2 chunks, each with 3 cpus, 1 gpu and 2gb of memory, which means it selects 6 cpus, 2 gpus and 4gb of RAM in total.

qsub -l nodes=megamind:ncpus=3 /home/titan/PBS/input/

This job will run on the node with the specified hostname (megamind).
To select multiple nodes you may use the command given below:
qsub -l nodes=megamind+titan:ncpus=3 /home/titan/PBS/input/
Submit multiple jobs with the same script (Job Array):
qsub -J 1-20 script

If you specify resources for a Job Array, each subjob requires the specified resources. The JobID of each subjob has the form JobArrayID[index]; for the example above the subjobs are JobArrayID[1], JobArrayID[2], ..., JobArrayID[20].
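Within each subjob, PBS Pro exports the index in the PBS_ARRAY_INDEX environment variable, which the script can use to pick its own piece of work. A minimal sketch (the input file naming pattern is made up, and the subjob is simulated locally by setting the variable by hand):

```shell
# Script meant to be submitted as: qsub -J 1-20 array_job.sh
cat > array_job.sh <<'EOF'
#!/bin/sh
# Each subjob sees its own index in PBS_ARRAY_INDEX:
echo "Processing input_${PBS_ARRAY_INDEX}.dat"
EOF
# Simulate subjob 7 locally by setting the variable ourselves:
PBS_ARRAY_INDEX=7 sh array_job.sh
```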

Submit dependent jobs:
In some cases you may require a job to run only after the successful or unsuccessful completion of some specified jobs; for that PBS provides options such as:

qsub -W depend=afterok:316.megamind /home/titan/PBS/input/

This job will start only after the successful completion of the job with job ID "316.megamind". Besides afterok, PBS has other options such as beforeok, beforenotok and afternotok. You may find all these details in the man page of qsub.
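Since qsub prints the new JobID on stdout, a chain of dependent jobs can be scripted by capturing each ID in a shell variable. In the sketch below qsub is stubbed with a shell function so the flow can be shown without a live PBS server (step1.sh and step2.sh are hypothetical scripts); with real PBS, delete the stub:

```shell
# Stub standing in for the real qsub, which prints the new JobID on stdout:
qsub() { echo "316.megamind"; }

first=$(qsub step1.sh)   # submit the first job and capture its JobID
# The follow-up job would then be submitted with a dependency on it:
echo "qsub -W depend=afterok:$first step2.sh"
```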

Submit Job with priority:
There are two ways in which we can set the priority of jobs that are going to execute.
1) Using single queue with different jobs with different priority:
       To change the sequence of jobs queued in an execution queue, open the "$PBS_HOME/sched_priv/sched_config" file; normally $PBS_HOME is "/var/spool/PBS/". Open this file and uncomment the line below if it is present, otherwise add it:
job_sort_key : "job_priority HIGH"
After saving this file you need to restart the pbs_sched daemon on the head node; you may use the command below:
service pbs restart
After completing this task you have to submit jobs with the -p option to specify the priority of a job within the queue. This value may range from -1024 to 1023, where -1024 is the lowest priority and 1023 is the highest priority in the queue.
qsub -p 100 ./
qsub -p 101 ./
qsub -p 102 ./ 
In this case PBS will execute the jobs as explained in the diagram given below.

2) Using different queues with specified priority: We are going to discuss this point in PBS Queue section.

In this example all jobs in queue 2 will complete first, then queue 3, then queue 1, since the priority of queue 2 > queue 3 > queue 1.
Because of this, the job execution flow is as shown below:

J4 => J5 => J6 => J7 => J8 => J9 => J1 => J2 => J3
PBS Queue:
PBS Pro can manage multiple queues as per the users' requirements. By default every job is queued in "workq" for execution. Two types of queue are available: execution and routing. Jobs in an execution queue are taken up by the PBS server for execution. Jobs in a routing queue cannot be executed; they can be redirected to an execution queue or to another routing queue using the qmove command. By default the queue "workq" is an execution queue. The sequence of jobs in a queue may be changed using the priority defined at job submission, as specified above in the job submission section.

Useful qmgr commands:
First type qmgr, which is the manager interface of PBS Pro.
To create queue: 
Qmgr: create queue test2

To set type of queue you created: 
Qmgr: set queue test2 queue_type=execution

Qmgr: set queue test2 queue_type=route

To enable queue: 
Qmgr: set queue test2 enabled=True

To set priority of queue: 
Qmgr: set queue test2 priority=50

Jobs in a queue with higher priority get first preference. After the completion of all jobs in the higher-priority queue, the jobs in lower-priority queues are scheduled. There is a high probability of job starvation in queues with lower priority.

To start queue: 
Qmgr: set queue test2 started = True

To make all queues (at the default server) active, so that subsequent commands apply to all of them:
Qmgr: active queue @default

To restrict a queue to specified users: you need to set the acl_user_enable attribute to true, which tells PBS to allow only the users present in the acl_users list to submit jobs.
 Qmgr: set queue test2 acl_user_enable=True

To set users permitted (to submit job in a queue): 
Qmgr: set queue test2 acl_users="user1@..,user2@..,user3@.."

(in place of .. you have to specify the hostname of a node in the PBS complex. A user name alone, without a hostname, allows users with that name to submit jobs from all nodes in the PBS complex that are permitted to submit jobs.)

To delete queues we created: 
Qmgr: delete queue test2

To see the status of all queues:
qstat -Q 

You may specify a specific queue name: qstat -Q test2
To see full details of all queues: qstat -Q -f
You may specify a specific queue name: qstat -Q -f test2


Tuesday, 5 May 2015

Linux commands for Text Manipulating

Posted by Unknown
    Text filtering/manipulation is usually a building block for most of the problems we solve. Linux and its shell turn out to be very rich in tools and utilities for this purpose, equipping the user with numerous handy commands for text filtering and manipulation. Here I try to cover most of the Linux text-manipulation tools, which should certainly cover most of your needs related to text manipulation. They are listed and detailed in alphabetical order below:

awk [ -F fs ] [ -v var=value ] [ 'prog' | -f progfile ] [ file ... ] : Awk is an interpreted programming language in itself, which executes complex pattern matching on streams of textual data. It makes heavy use of associative arrays, strings and regular expressions. Its usefulness for parsing system data and generating automated reports is commendable.

Few essential arguments:
-F fs Sets the input field separator to the regular expression fs.
-v var=value Assigns the value value to the variable var before executing the awk program.
'prog' An awk program.
-f progfile Specify a file, progfile, which contains the awk program to be executed.
file ... A file to be processed by the specified awk program.
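As a quick illustration of the -F and -v options, the sketch below pulls fields out of colon-separated records (the sample data and the tag variable are made up):

```shell
# Print the first and third colon-separated fields, prefixed by a label
# passed in with -v:
printf 'root:x:0:0\ndaemon:x:1:1\n' | awk -F: -v tag=user '{print tag, $1, $3}'
# prints "user root 0" and "user daemon 1"
```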

comm [options] [FILE1 FILE2] : comm compares two sorted files FILE1, FILE2 line by line.
-1 : Suppress lines unique to the left file (FILE1)
-2 : Suppress lines unique to the right file (FILE2)
-3 : Suppress lines that appear in both files

For example:
comm -12 file1 file2
                 Print only lines present in both file1 and file2.
comm -3   file1 file2
                 Print lines in file1 not in file2, and vice versa.
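For instance, with two small sorted files (contents made up):

```shell
# comm requires its inputs to be sorted:
printf 'apple\nbanana\ncherry\n' > file1
printf 'banana\ncherry\ndate\n'  > file2
comm -12 file1 file2   # prints the lines common to both: banana, cherry
```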

csplit [options] [file] [pattern] : Splits a file into sections depending upon the context/pattern

Few essential arguments:
-f, --prefix=PREFIX   Use PREFIX instead of 'xx'
-z   Remove empty output files
-n, --digits=DIGITS   Use the specified number of digits instead of 2
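A small sketch that splits a file at every line matching a pattern (the file name and the SECTION marker are made up):

```shell
# Split log.txt into pieces named part00, part01, ... at every line
# containing SECTION; -z drops empty pieces, -f sets the prefix.
printf 'one\nSECTION\ntwo\nSECTION\nthree\n' > log.txt
csplit -z -f part log.txt '/SECTION/' '{*}'
ls part*
```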

cut [options] [file] : This command can be used to extract a portion of text from a file by selecting columns.
-c range :  Outputs only the characters in the range

For example:
This is your text file "file.txt"

$ cat file.txt
This is a test for cut.
Linux text filtering commands.
Linux text manipulating commands. 

The following example displays the 4th character from each line of the file "file.txt".

$ cut -c4 file.txt
s
u
u

diff [options] [file1] [file2] : This command shows the differences between the two given files. A handy utility for a quick check of the difference between two files.
Few essential arguments:
-a   Treat all files as text and compare them line-by-line, even if they do not seem to be text.
-b     Ignore changes in amount of white space.
-c   Use the context output format.

echo [options] [string] : Prints the given input string.
Few essential arguments:
-n    do not output the trailing newline
-e    enable interpretation of the backslash-escaped characters listed below

For example: echo is useful in checking what values your environment variables holds.
$ echo $PATH 

fold [options] [files] : Wraps each line of a text file to fit a specified width. By default the output goes to stdout; one can redirect it to a file if needed.

fold -sw [SIZE] [input.txt] > [output.txt]

-s   break at spaces
-w {SIZE}   use SIZE as the width in columns instead of the default 80.
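For example, wrapping a long line at word boundaries (the sample sentence is made up); every output line stays within the given width:

```shell
# Wrap at column 15, breaking at spaces rather than mid-word:
echo "the quick brown fox jumps over the lazy dog" | fold -sw 15
```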

grep [options] [pattern] [file] : This is a crucial utility for finding the lines of a given plain-text data set that contain a pattern. The name grep comes from globally search a regular expression and print. Some of the more often used options are:

-m <num>   Stop reading a file after <num> matching lines.
-C <num>   Print <num> lines of output context.
-x   Select only those matches that exactly match the whole line.
-i   Do case-insensitive matching.
-l   Just print the names of the files that match the pattern.
-R, -r   Read all files under a directory recursively.
-w   Select only those lines containing matches that form whole words.
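Combining a couple of those flags on a made-up sample file:

```shell
printf 'apple pie\nApple tart\npineapple\n' > fruit.txt
# -i: ignore case, -w: whole words only, so "pineapple" does not match:
grep -iw apple fruit.txt
# prints "apple pie" and "Apple tart"
```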

head [options] [file] : Prints the first 10 lines of the given file.
Essential arguments:
-n N         print the first N lines instead of the first 10

nl [options] [file] : Numbers the lines of the given file, adding a line number to each line and displaying the result on standard output.

$ cat fruits.txt
  apples
  bananas
  Orange
  Jack fruit
$ nl fruits.txt
  1    apples
  2    bananas
  3    Orange
  4    Jack fruit

sed [expression] [file] : sed (stream editor) is a marvelous utility; IMHO you can do almost any kind of text filtering and transformation with it once you learn its intricacies. In certain ways it is similar to an editor which allows scripted edits, but sed makes only one pass over the input(s) and is consequently more efficient. sed’s power to filter text in a pipeline makes it stand out from other types of editors.
Few essential arguments:
-n   suppress automatic printing of pattern space
-e script   add the script to the commands to be executed
-f script-file   add the contents of script-file to the commands to be executed
-l N   specify the desired line-wrap length for the ‘l’ command
-r   use extended regular expressions in the script.
Essential command: s - substitution
There is much more to sed than can be covered in this article, but s (substitution) is the most used and most widely known command. The example shown below replaces the input old, passed to sed, with new.
$ echo old | sed s/old/new/  
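Two more small examples in the same spirit (sample data made up): -n together with the p command turns sed into a grep-like filter, and s edits the stream line by line:

```shell
# Print only the lines matching a pattern (-n suppresses default printing):
printf 'alpha\nbeta\ngamma\n' | sed -n '/beta/p'
# prints "beta"

# Substitute the first occurrence on every line:
printf 'old shoes\nold hat\n' | sed 's/old/new/'
# prints "new shoes" and "new hat"
```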

sort [options] [file] : sort is an easy-to-use, useful command which sorts the lines of the given file alphabetically or numerically.

Few essential arguments:
-b   Ignore leading blanks.
-d   Consider only blanks and alphanumeric characters.
-g   Compare according to general numerical value.
-i   Consider only printable characters.
-R   Sort by random hash of keys.
-r   Reverse the result of comparisons.
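The difference between lexical and numeric sorting is worth seeing once (sample numbers made up):

```shell
printf '10\n2\n33\n' | sort      # lexical order: 10, 2, 33
printf '10\n2\n33\n' | sort -g   # numeric order: 2, 10, 33
```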

tail [options] [file] : Prints last 10 lines of the file on the standard output.
Essential arguments:
-f   Output appended data as the file grows.
-n <N>   Print last N lines of the file instead of the default last 10 lines.

tee [options] [file] : Sends the current output stream to the file while at the same time displaying it on standard output.
For example : The command below prints the output of ls command to both the standard output as well as file.txt
$ ls | tee file.txt
One can also have tee send the output to multiple files with a command like the one shown below:
$ ls | tee file1.txt file2.txt file3.txt

uniq [options] [input] [output] : Removes/filters out adjacent duplicate lines in the input. If [input] is not specified it reads from stdin, and if [output] is not specified it writes to stdout.
Essential arguments:
-c  Prefix lines with a number representing how many times they occurred.
-d  Only print duplicate lines
-i  Comparisons are case-sensitive by default; this option enables case-insensitive comparisons.
-u  Only print unique lines
-w <N>  Compare no more than N characters of each line.
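Because uniq only collapses adjacent duplicates, unsorted input is normally piped through sort first (sample data made up):

```shell
# sort groups the duplicates together; -d then prints one copy of each
# duplicated line:
printf 'b\na\nb\na\na\n' | sort | uniq -d
# prints "a" and "b"
```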

wc [options] [file] : Prints the newline, word and byte counts for each input file.
Essential arguments:
-c   Print the byte counts.
-l   Print the newline counts.
-m   Print the character counts.
-w   Print the word counts.