JAliEn documentation - how to - catalogue

This document is the JAliEn update of the original AliEn tutorial.

Basic commands

If you got this far, you have a working installation of alien and are able to authenticate to the system. Congratulations!!

Now, let's take a look at the different commands that you can execute inside alien. The first thing that you have to do is enter a supported environment:

Either use an AliPhysics JALIEN version, which also enables the alien.py shell, or enable the standalone module:

/cvmfs/alice.cern.ch/bin/alienv enter xjalienfs/1.2.2-9

nhardi@pcalice ~> /cvmfs/alice.cern.ch/bin/alienv enter AliPhysics/vAN-20200330_JALIEN-1
[AliPhysics/vAN-20200330_JALIEN-1] ~ > alien.py
Welcome to the ALICE GRID
support mail: adrian.sevcenco@cern.ch

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >

From this prompt, you can pass any commands to the AliEn System. Be careful, because even if it looks like a UNIX shell, it IS NOT a UNIX shell.

To get the full list of the commands that you can use, press tab:

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ > <tab>
$?                        df                        kill                      mkdir                     quota                     token-init
$?err                     dirs                      la                        motd                      resubmit                  top
?                         du                        lfn2guid                  mv                        rm                        touch
access                    edit                      listFilesFromCollection   nano                      rmdir                     type
addFileToCollection       error                     listSEDistance            packages                  run                       uptime
cat                       exec                      listSEs                   pfn                       setSite                   user
cd                        exit                      listTransfer              pfn-status                showTagValue              uuid
cert-info                 exitcode                  ll                        ping                      stat                      version
changeDiff                find                      lla                       popd                      submit                    vi
chgroup                   find2                     logout                    prompt                    testSE                    vim
chown                     getSE                     ls                        ps                        time                      w
commandlist               grep                      masterjob                 pushd                     toXml                     whereis
commit                    groups                    mcedit                    pwd                       token                     whoami
cp                        guid2lfn                  md5sum                    queryML                   token-destroy             whois
deleteMirror              help                      mirror                    quit                      token-info                xrdstat

Most of the commands are similar to the standard UNIX commands:

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >whoami
nhardi
AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >ls -l | tail
-rwxr-xr-x     nhardi   nhardi         3748 Feb 04 12:48    input.jdl
-rwxr-xr-x     nhardi   nhardi         3772 Feb 04 12:42    input.jdl_20200204_124400
drwxr-xr-x     nhardi   nhardi            0 Jan 29 13:33    jalien-job-1759206333/
drwxr-xr-x     nhardi   nhardi            0 Jan 29 13:42    jalien-job-1759209346/
drwxr-xr-x     nhardi   nhardi            0 Jan 29 13:43    jalien-job-1759209959/
drwxr-xr-x     nhardi   nhardi            0 Feb 06 22:19    LHC18q/
drwxr-xr-x     nhardi   nhardi            0 Dec 20 15:07    minimal_example/
drwxr-xr-x     nhardi   nhardi            0 Jan 14 16:51    output/
-rwxr-xr-x     nhardi   nhardi          472 Feb 11 15:23    test.zip
drwxr-xr-x     nhardi   nhardi            0 Jan 15 10:07    workspace/
AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >cd workspace/                                      
AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/workspace/ >ls
test/
AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/workspace/ >ls -l
drwxr-xr-x     nhardi   nhardi            0 Jan 15 10:07    test/

Finally, to exit the catalogue, you can use 'exit', 'quit', or Ctrl+c

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >quit
Exit
[AliPhysics/vAN-20200330_JALIEN-1] ~ >

If you want to execute only one command, you can also do it from the UNIX prompt (not the alien prompt!) by typing alien.py <command>:

[AliPhysics/vAN-20200330_JALIEN-1] ~ > alien.py ls -l
drwxr-xr-x     nhardi   nhardi            0 Feb 24 16:52    alien-job-1794849428/
drwxr-xr-x     nhardi   nhardi            0 Feb 24 17:17    alien-job-1794875442/
drwxr-xr-x     nhardi   nhardi            0 Jan 14 15:46    CharmTriggerEfficiency/
-rwxr-xr-x     nhardi   nhardi         3748 Feb 04 12:48    input.jdl
-rwxr-xr-x     nhardi   nhardi         3772 Feb 04 12:42    input.jdl_20200204_124400
drwxr-xr-x     nhardi   nhardi            0 Jan 29 13:33    jalien-job-1759206333/
drwxr-xr-x     nhardi   nhardi            0 Jan 29 13:42    jalien-job-1759209346/
drwxr-xr-x     nhardi   nhardi            0 Jan 29 13:43    jalien-job-1759209959/
drwxr-xr-x     nhardi   nhardi            0 Feb 06 22:19    LHC18q/
drwxr-xr-x     nhardi   nhardi            0 Dec 20 15:07    minimal_example/
drwxr-xr-x     nhardi   nhardi            0 Jan 14 16:51    output/
-rwxr-xr-x     nhardi   nhardi          472 Feb 11 15:23    test.zip
drwxr-xr-x     nhardi   nhardi            0 Jan 15 10:07    workspace/
[AliPhysics/vAN-20200330_JALIEN-1] ~ >

You can also use aliases of the form alien_<command> for the commands that have a predefined alias. Currently, the following aliases are available:

[AliPhysics/vAN-20200330_JALIEN-1] ~ > alien_ <tab>
alien_cmd       alien_lfn2guid  alien_mv        alien_rmdir     alien_whereis
alien_cp        alien_ls        alien_pfn       alien_rsync.sh  
alien_find      alien_mirror    alien_ps        alien_stat      
alien_guid2lfn  alien_mkdir     alien_rm        alien_submit

Host shell interactions

A command that starts with ! is executed in the shell on the local machine. For instance, running !ls $HOME lists the files in the local home directory.
AliEn commands can be piped to one or more host shell commands: the output of the command on the left side of the first pipe is passed as input to the shell command on the right side.

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >whereis input.jdl | awk '/CERN/ {print $NF}'
root://eosalice.cern.ch:1094//02/44400/313c07a0-57bb-11ea-8f07-e83935243962

Working with files

Downloading files from the Grid

All the entries that you can see in the catalogue are not real files, but Logical File Names (LFN). You can think of an LFN like an index that points to one (or more) Physical File Names (PFN). The physical file name points to a copy of the file. It can be on a disk, tape, or any kind of mass storage system.

You can use the commands whereis and xrdstat to get the list of replicas of an LFN and their status.

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >whereis input.jdl
the file input.jdl is in

     SE => ALICE::CERN::EOS         pfn => root://eosalice.cern.ch:1094//02/62390/335c0bc0-4744-11ea-8877-0242599c3dd1
     SE => ALICE::ICM::EOS          pfn => root://a-se.grid.icm.edu.pl:1094//02/62390/335c0bc0-4744-11ea-8877-0242599c3dd1
     SE => ALICE::PRAGUE::SE        pfn => root://xrdhead.farm.particle.cz:1094//02/62390/335c0bc0-4744-11ea-8877-0242599c3dd1

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >xrdstat input.jdl
Checking the replicas of /alice/cern.ch/user/n/nhardi/input.jdl
    ALICE::CERN::EOS        root://eosalice.cern.ch:1094//02/62390/335c0bc0-4744-11ea-8877-0242599c3dd1 OK
    ALICE::ICM::EOS         root://a-se.grid.icm.edu.pl:1094//02/62390/335c0bc0-4744-11ea-8877-0242599c3dd1 OK
    ALICE::Prague::SE       root://xrdhead.farm.particle.cz:1094//02/62390/335c0bc0-4744-11ea-8877-0242599c3dd1 OK

If an LFN points to more than one PFN, all these PFNs are identical copies (also known as mirrors or replicas) of the same file. However, users usually work with just logical filenames. The translation from logical to physical filenames is transparent to the user and is done automatically by the framework.
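
If you need the replica list in a script, the text printed by whereis can be parsed. The following is a minimal sketch, assuming the output keeps the "SE => ... pfn => ..." layout shown above; the helper name parse_whereis is hypothetical and is not part of alien.py.

```python
import re
from typing import Dict

# Hypothetical helper (not part of alien.py).
# Assumes each replica line looks like: "SE => <name>   pfn => <url>".
_REPLICA_LINE = re.compile(r"SE\s*=>\s*(\S+)\s+pfn\s*=>\s*(\S+)")


def parse_whereis(text: str) -> Dict[str, str]:
    """Map each storage element name to the physical file name it holds."""
    replicas: Dict[str, str] = {}
    for line in text.splitlines():
        match = _REPLICA_LINE.search(line)
        if match:
            replicas[match.group(1)] = match.group(2)
    return replicas


# Sample text copied from the whereis transcript in this tutorial.
sample = """the file input.jdl is in

     SE => ALICE::CERN::EOS         pfn => root://eosalice.cern.ch:1094//02/62390/335c0bc0-4744-11ea-8877-0242599c3dd1
     SE => ALICE::ICM::EOS          pfn => root://a-se.grid.icm.edu.pl:1094//02/62390/335c0bc0-4744-11ea-8877-0242599c3dd1
"""

for se_name, pfn in parse_whereis(sample).items():
    print(se_name, "->", pfn)
```

One LFN maps to several entries here, which mirrors the LFN-to-PFN relation described above.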

Use the command cp to copy a file to and from the Grid. This command requires two arguments to transfer a file. You can print the help message by passing the -h flag; it lists further options, such as using filename patterns and parallel transfers to copy multiple files at once.

The paths must have a label to denote the path type:

  • alien: or alien:// to specify a remote (GRID) file
  • file: or file:// to specify the local host location

Either form of a prefix can be used, but at least one of the two paths must carry a prefix.
Any path without a prefix is considered to be a remote path by default.

WARNING: with the legacy AliEn shell, aliensh, file paths without a prefix are considered to be local files. So for compatibility purposes always specify a prefix, for both local and Grid files.
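
The prefix rules above can be summarised in a few lines of Python. This is only an illustrative sketch of the convention described in this section; classify_path is a hypothetical name and this is not how alien.py itself parses paths.

```python
from typing import Tuple

# Longer prefixes come first so that "alien://" is not mistaken for "alien:".
_PREFIXES = (
    ("alien://", "grid"),
    ("alien:", "grid"),
    ("file://", "local"),
    ("file:", "local"),
)


def classify_path(path: str) -> Tuple[str, str]:
    """Return (kind, bare_path), where kind is 'grid' or 'local'."""
    for prefix, kind in _PREFIXES:
        if path.startswith(prefix):
            return kind, path[len(prefix):]
    # As described above, alien.py treats un-prefixed paths as remote;
    # the legacy aliensh treated them as local.
    return "grid", path


for example in ("file:///tmp/local_file.jdl", "alien://%ALIEN/remote_file.jdl", "input.jdl"):
    print(example, "->", classify_path(example))
```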

The file path alias %ALIEN can be used in place of the full path to the Grid home directory. Alternatively, use a relative path; it is resolved against the current working directory automatically.

Remote (GRID) paths, whether used as source or destination, support the @ qualifier.
It can specify the number of replicas to use and, as a comma-separated list, the Storage Elements (SEs) to be used or excluded:
@disk:3,SE1,!SE2,!SE3
This means: get 3 replicas, excluding SE2 and SE3, and additionally one on SE1 (for a total of 4 replicas).
When downloading, a specification like SE1,SE2,!SE3,!SE4 means: download only from SE1 and SE2 and exclude SE3 and SE4.
The location of the replicas can be found with xrdstat, see above.
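
To make the structure of the @ qualifier concrete, here is a minimal sketch that splits a specification into its three parts. The names ReplicaSpec and parse_spec are hypothetical, and the rule used to tell a tag:count item (such as disk:3) from an SE name is an assumption made for this example; the actual parsing is done by alien.py and the central services.

```python
from typing import Dict, List, NamedTuple


class ReplicaSpec(NamedTuple):
    qos: Dict[str, int]   # e.g. {"disk": 3}
    include: List[str]    # SEs explicitly requested
    exclude: List[str]    # SEs prefixed with "!"


def parse_spec(spec: str) -> ReplicaSpec:
    """Split a qualifier such as '@disk:3,SE1,!SE2,!SE3' into its parts."""
    qos: Dict[str, int] = {}
    include: List[str] = []
    exclude: List[str] = []
    for item in spec.lstrip("@").split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("!"):
            exclude.append(item[1:])
            continue
        tag, sep, count = item.rpartition(":")
        # Assumption: "tag:<digits>" is a replica count; anything else
        # (including names like ALICE::CERN::EOS) is a storage element.
        if sep and count.isdigit() and "::" not in item:
            qos[tag] = int(count)
        else:
            include.append(item)
    return ReplicaSpec(qos, include, exclude)


spec = parse_spec("@disk:3,SE1,!SE2,!SE3")
print(spec)
# For an upload, the text above counts the tagged replicas plus the named SEs.
print("total replicas:", sum(spec.qos.values()) + len(spec.include))
```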

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >cp
at least 2 arguments are needed : src dst
the command is of the form of (with the strict order of arguments):
cp args src dst
...

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >pwd
/alice/cern.ch/user/n/nhardi/

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >!pwd
/home/nhardi

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >ls -l input.jdl
-rwxr-xr-x     nhardi   nhardi         3748 Feb 04 12:48    input.jdl

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >!ls -l input.jdl
ls: cannot access 'input.jdl': No such file or directory

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >cp input.jdl file://input.jdl
jobID: 1/1 >>> Start
jobID: 1/1 >>> ERRNO/CODE/XRDSTAT 0/0/0 >>> STATUS OK >>> SPEED 215.51 B/s MESSAGE: [SUCCESS] 

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >!ls -l input.jdl
-rw-r--r-- 1 nhardi nhardi 3748 Feb 25 11:28 input.jdl

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >!ls -l /tmp/local_file.jdl
-rw-r--r-- 1 nhardi nhardi 3748 Feb 25 11:28 /tmp/local_file.jdl

You can also use the commands cat, less and vim to view and edit files directly on the Grid. After editing a file, the updated version will be uploaded automatically. A backup version of the file is kept under the original filename with a tilde (~) character appended.

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >cat input.jdl
jobID: 1/1 >>> Start
jobID: 1/1 >>> ERRNO/CODE/XRDSTAT 0/0/0 >>> STATUS OK >>> SPEED 64.61 B/s MESSAGE: [SUCCESS] 
LPMJobTypeID = "20351"; 
InputDataListFormat = "xml-single"; 
MasterJobId = "1766663504"; 
PWG = "COMMON"; 
LegoResubmitZombies = "1"; 
ValidationCommand = "/alice/cern.ch/user/a/aliprod/QA/validation_merge.sh"; 
JDLPath = "/alice/cern.ch/user/a/aliprod/QA/QA_merge.jdl"; 
LPMChainID = "718"; 
...

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >vim input.jdl
jobID: 1/2 >>> Start
jobID: 1/2 >>> ERRNO/CODE/XRDSTAT 0/0/0 >>> STATUS OK >>> SPEED 53.70 KiB/s MESSAGE: [SUCCESS] 
jobID: 2/2 >>> Start
200225 11:40:09 17580 cryptossl_X509CreateProxy: Your identity: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=nhardi/CN=801402/CN=Nikola Hardi
jobID: 2/2 >>> ERRNO/CODE/XRDSTAT 0/0/0 >>> STATUS OK >>> SPEED 4.35 KiB/s MESSAGE: [SUCCESS] 

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >ls -l input.jdl*
-rwxr-xr-x     nhardi   nhardi         3749 Feb 25 11:40    input.jdl
-rwxr-xr-x     nhardi   nhardi         3748 Feb 04 12:48    input.jdl~

Uploading files to the Grid

To create a new file, just copy a local file to the Grid using the copy command cp. Specify the local file as the first parameter (source) and the remote file as the second parameter (destination). Note that the alien:// prefix and the %ALIEN Grid home directory alias are not required.

You can inspect the details of the new file registered in the AliEn file catalog with the stat command. By default, the cp command creates two replicas and links them to an LFN in the catalog.

AliEn[nhardi]:/alice/ >cp file:///tmp/local_file.jdl alien://%ALIEN/remote_file.jdl
jobID: 1/2 >>> Start
jobID: 1/2 >>> ERRNO/CODE/XRDSTAT 0/0/0 >>> STATUS OK >>> SPEED 6.35 KiB/s MESSAGE: [SUCCESS] 
jobID: 2/2 >>> Start
jobID: 2/2 >>> ERRNO/CODE/XRDSTAT 0/0/0 >>> STATUS OK >>> SPEED 5.32 KiB/s MESSAGE: [SUCCESS] 

AliEn[nhardi]:/alice/ >xrdstat remote_file.jdl 

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >stat remote_file.jdl
File: /alice/cern.ch/user/n/nhardi/remote_file.jdl
Type: f
Owner: nhardi:nhardi
Permissions: 755
Last change: 2020-02-25 11:56:41.0 (1582628201000)
Size: 3748 (3.66 KB)
MD5: 6f605d1c2301e8c0753281e22179bd4b
GUID: 3ce6b710-57bd-11ea-8f07-e83935243962
    GUID created on Tue Feb 25 11:54:47 CET 2020 (1582628087041) by e8:39:35:24:39:62
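
The MD5 field reported by stat can be compared with the checksum of a local copy, for instance after a download. The following sketch uses only the Python standard library; md5_of_file and matches_catalogue are hypothetical helper names, and the catalogue value has to be copied in by hand from the stat output.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_catalogue(path: str, catalogue_md5: str) -> bool:
    """Compare a local file with the MD5 value shown by the stat command."""
    return md5_of_file(path) == catalogue_md5.strip().lower()
```

For the file in the transcript above, the call would look like matches_catalogue("/tmp/local_file.jdl", "6f605d1c2301e8c0753281e22179bd4b").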

Replicating files

For critical files (accessed from many jobs, or needing to be distributed around the world for performance / availability reasons) you can either:

  • create more replicas upfront, with cp file://... alien://<filename>@disk:5
  • use the mirror command to increase the number of replicas for a file: mirror <filename> -S disk:5

Searching for files

As in a UNIX file system, you can search for files in the catalogue. You can specify a path and a pattern for the files you are looking for. Regex patterns are also supported. See the find command help for more information.

Check the cp command help for more information on how to use filename patterns with the copy command and its -select and -name arguments.

WARNING: the find command in the legacy AliEn shell aliensh printed XML collections to stdout, while the JAliEn alien.py shell creates the output collection on the Grid directly.

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >find . *.jdl     
/alice/cern.ch/user/n/nhardi/.input.jdl~
/alice/cern.ch/user/n/nhardi/CharmTriggerEfficiency/output/CharmTriggerStudy.jdl
/alice/cern.ch/user/n/nhardi/CharmTriggerEfficiency/output/CharmTriggerStudy_merge.jdl
/alice/cern.ch/user/n/nhardi/CharmTriggerEfficiency/output/CharmTriggerStudy_merge_final.jdl
/alice/cern.ch/user/n/nhardi/input.jdl
/alice/cern.ch/user/n/nhardi/input.jdl~
/alice/cern.ch/user/n/nhardi/minimal_example/.minimal.jdl~
/alice/cern.ch/user/n/nhardi/minimal_example/minimal.jdl
/alice/cern.ch/user/n/nhardi/minimal_example/minimal.jdl~
/alice/cern.ch/user/n/nhardi/minimal_example/minimal_any.jdl
/alice/cern.ch/user/n/nhardi/minimal_example/minimal_any.jdl~
/alice/cern.ch/user/n/nhardi/remote_file.jdl

AliEn[nhardi]:/alice/cern.ch/user/n/nhardi/ >find -r . .*jdl$              
/alice/cern.ch/user/n/nhardi/CharmTriggerEfficiency/output/CharmTriggerStudy.jdl
/alice/cern.ch/user/n/nhardi/CharmTriggerEfficiency/output/CharmTriggerStudy_merge.jdl
/alice/cern.ch/user/n/nhardi/CharmTriggerEfficiency/output/CharmTriggerStudy_merge_final.jdl
/alice/cern.ch/user/n/nhardi/input.jdl
/alice/cern.ch/user/n/nhardi/minimal_example/minimal.jdl
/alice/cern.ch/user/n/nhardi/minimal_example/minimal_any.jdl
/alice/cern.ch/user/n/nhardi/remote_file.jdl
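
The second search drops the backup files ending in ~ because the regex is anchored with $. The effect of the anchor can be reproduced locally with Python's re module on a few of the paths listed above. This is only an illustration of the anchoring; the regex engine used by the catalogue may differ in details.

```python
import re

# A few paths copied from the first find output above.
paths = [
    "/alice/cern.ch/user/n/nhardi/input.jdl",
    "/alice/cern.ch/user/n/nhardi/input.jdl~",
    "/alice/cern.ch/user/n/nhardi/minimal_example/.minimal.jdl~",
    "/alice/cern.ch/user/n/nhardi/minimal_example/minimal.jdl",
]

# Same pattern as in "find -r . .*jdl$": the path must END with "jdl".
pattern = re.compile(r".*jdl$")
matching = [path for path in paths if pattern.match(path)]

for path in matching:
    print(path)
```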

MetaData catalogue

Files can be associated with custom tags; one place in particular where this feature is used is calibration objects. To see the metadata associated with a file you can use the showTagValue command.

Advanced usage

The alien.py command prints JSON output if the -json argument is used.
If used for the main invocation (alien.py -json command) it is enabled globally for the whole session, but it can also be used per command (alien.py command -json). The same applies in shell mode.
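
From a script, the global form (alien.py -json command) can be driven through subprocess. The sketch below only relies on the invocation shown above; alien_argv and run_alien_json are hypothetical helper names, and the structure of the returned JSON document is not described in this tutorial, so the result is handed back unparsed beyond json.loads.

```python
import json
import shlex
import subprocess
from typing import Any, List, Tuple


def alien_argv(command: str, as_json: bool = False) -> List[str]:
    """Build the argument list for a one-shot alien.py invocation."""
    argv = ["alien.py"]
    if as_json:
        argv.append("-json")  # global form: alien.py -json <command>
    argv.extend(shlex.split(command))
    return argv


def run_alien_json(command: str) -> Tuple[int, Any]:
    """Run one command with -json; requires alien.py on PATH and a valid token.

    Assumption: stdout holds a single JSON document when -json is used.
    """
    done = subprocess.run(alien_argv(command, as_json=True),
                          capture_output=True, text=True, check=False)
    payload = json.loads(done.stdout) if done.stdout.strip() else None
    return done.returncode, payload


print(alien_argv("ls -l", as_json=True))
```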

  • Unknown commands will be automatically passed for execution to the shell.
  • Multiple commands, delimited by ; or \n, can be passed as input.
  • If the argument of alien.py is a file, it will be parsed as input.
  • cp is aware of globbing (*).
  • Site administrators can use commands like: listSEDistance, listSEs, getSE, queryML.
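
The delimiter convention for multiple commands can be pictured with a small sketch. It is not alien.py's own parser: split_commands is a hypothetical name, and quoting (a ; inside quotes) is deliberately ignored here.

```python
import re
from typing import List


def split_commands(text: str) -> List[str]:
    """Split an input string on ';' or newlines, dropping empty entries."""
    return [part.strip() for part in re.split(r"[;\n]", text) if part.strip()]


script = "pwd; ls -l\nwhoami"
for command in split_commands(script):
    print(command)
```

The same text could be saved to a file and passed to alien.py as its argument, as described in the list above.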

Environment variables

Basic usage

ALIENPY_PROMPT_DATE: enable printing of date in alien.py shell prompt
ALIENPY_PROMPT_CWD: enable printing of cwd in alien.py shell prompt
ALIENPY_NO_CWD_RESTORE: do not restore the last (saved) session cwd

Advanced usage

ALIENPY_DEBUG: enable DEBUG output to logfile
ALIENPY_DEBUG_FILE: set to new filepath; default to ~/alien_py.log
ALIENPY_JSON: all output will be in the json format. N.B.!! client-side implementations have no json output. Use -json argument for per-command usage.

Expert usage

ALIENPY_TIMECONNECT: enable timing of connections (it will be shown in log file)
ALIENPY_KEEP_META: cp operations: keep the metafile used for download operations; it will be found in $TMPDIR

ALIENPY_TIMEOUT: connection timeout when waiting for server answer
ALIENPY_NO_STAGGER: disable the staggered socket connection to the server
ALIENPY_CONNECT_TRIES: try this many times to connect before giving up (3)
ALIENPY_CONNECT_TRIES_INTERVAL: wait this many seconds between tries (0.5)
ALIENPY_JCENTRAL: set the remote endpoint
ALIENPY_JCENTRAL_PORT: set the remote endpoint port
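
As an illustration of how retry settings like ALIENPY_CONNECT_TRIES and ALIENPY_CONNECT_TRIES_INTERVAL are typically consumed, here is a minimal sketch. It is not the alien.py implementation: connect_with_retries and the connect callable are hypothetical, and only the variable names and the defaults (3 tries, 0.5 seconds) are taken from the list above.

```python
import os
import time
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


def connect_with_retries(connect: Callable[[], T],
                         env: Mapping[str, str] = os.environ,
                         sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds or the configured tries run out."""
    tries = int(env.get("ALIENPY_CONNECT_TRIES", "3"))
    interval = float(env.get("ALIENPY_CONNECT_TRIES_INTERVAL", "0.5"))
    last_error: Optional[Exception] = None
    for attempt in range(1, tries + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < tries:
                sleep(interval)  # wait before the next attempt
    raise ConnectionError(f"gave up after {tries} tries") from last_error
```

Passing env and sleep as parameters keeps the sketch easy to try without touching the real environment or waiting between attempts.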

API usage

Internal Python interpreter

alien.py can be used from within a Python environment.
The quickest option is alien.py term:

alien.py term
Welcome to the ALICE GRID - Python interpreter shell
support mail: adrian.sevcenco@cern.ch
AliEn seesion object is >jalien< ; try jalien.help()
>>> jalien.help()
Methods of AliEn session:
.run(cmd, opts) : alias to SendMsg(cmd, opts); It will return a RET object: named tuple (exitcode, out, err, ansdict)
.ProcessMsg(cmd_list) : alias to ProcessCommandChain, it will have the same output as in the alien.py interaction
.wb() : return the session WebSocket to be used with other function within alien.py
>>> jalien.ProcessMsg('pwd')
/alice/cern.ch/user/a/asevcenc/
0
Note that this is a Python interpreter shell running within the context of the alien.py code, with a preloaded AliEn object.

High level/Basic API usage

The class AliEn is a helper that makes it easy to establish a websocket connection to the central services.
The basic usage is shown below:

import alienpy.alien as alien
alien.setup_logging()
j = alien.AliEn()
j.ProcessMsg('whoami -v')

Instantiating the class establishes the connection automatically.
The help() method shows the descriptions of the available methods.

Low level/Advanced API usage

With the exception of functions designed to be used exclusively within alien.py (although there are no hard limits), all user-facing functions return a RET object:

from typing import NamedTuple

class RET(NamedTuple):
    exitcode: int = -1
    out: str = ''
    err: str = ''
    ansdict: dict = {}

A function is provided to handle the printing of such an object:
retf_print(ret_info: RET, opts: str = '') -> int
It returns the exitcode and prints the out and err messages.

The message printing/logging (opts string) can be steered by:

  1. json: it will pretty print the json and return the exitcode

  2. if exitcode != 0 (that means error and there is NO stdout)

    • info/warn/err/debug : the error message will be logged to the corresponding logging facility

    • noerr/noprint : no error message will be printed

  3. noout/noprint: no output message will be printed
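
The option handling listed above can be summarised by a small stand-in. This is not the library's retf_print; print_ret is a hypothetical re-implementation written only from the description in this section, and it assumes that opts is a string of space-separated words.

```python
import json
import logging
import sys
from typing import NamedTuple


class RET(NamedTuple):
    exitcode: int = -1
    out: str = ''
    err: str = ''
    ansdict: dict = {}


_LOG_LEVELS = {"info": logging.INFO, "warn": logging.WARNING,
               "err": logging.ERROR, "debug": logging.DEBUG}


def print_ret(ret: RET, opts: str = "") -> int:
    """Print/log a RET according to opts and return its exitcode."""
    flags = set(opts.split())
    if "json" in flags:
        # 1. json: pretty print the answer dictionary
        print(json.dumps(ret.ansdict, sort_keys=True, indent=4))
        return ret.exitcode
    if ret.exitcode != 0:
        # 2. error case: log to the requested facility, print unless silenced
        for name, level in _LOG_LEVELS.items():
            if name in flags:
                logging.log(level, ret.err)
        if ret.err and not flags & {"noerr", "noprint"}:
            print(ret.err, file=sys.stderr)
        return ret.exitcode
    # 3. success case: print the output unless silenced
    if ret.out and not flags & {"noout", "noprint"}:
        print(ret.out)
    return ret.exitcode


print_ret(RET(0, "/alice/cern.ch/user/n/nhardi/", "", {}))
```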

Connection is established by:
InitConnection(token_args, use_usercert: bool = False, localConnect: bool = False) -> websockets.client.WebSocketClientProtocol

  • token_args are token arguments to be passed to token command for the token creation IF connection is created with user certificate,

  • use_usercert if True will force connection re-initialization with user certificate,

  • localConnect (WIP) if True will try to connect to a local socket of a host-level websocket proxy service (long lived websocket endpoint)