CRAB Rucio Stageout Tutorial¶
Prerequisite¶
Using Rucio CLI¶
Note that $USER stands for your CERN account username or the name of a Rucio group account.
# Note: run the Rucio CLI outside the CMSSW environment.
voms-proxy-init --voms cms
source /cvmfs/cms.cern.ch/rucio/setup-py3.sh
export RUCIO_ACCOUNT=$USER
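
The two conditions in the setup above (no active CMSSW environment, RUCIO_ACCOUNT exported) can be checked before running any Rucio command. The following is only a minimal sketch: `check_rucio_env` is a hypothetical helper, not part of Rucio or CRAB, and it assumes that an active CMSSW environment can be recognised by the `CMSSW_BASE` variable that `cmsenv` sets.

```python
import os


def check_rucio_env(env):
    """Return a list of problems found in the given environment mapping.

    Hypothetical helper for illustration; not part of Rucio or CRAB.
    """
    problems = []
    # Assumption: cmsenv sets CMSSW_BASE, so its presence indicates
    # that a CMSSW environment is active in this shell.
    if env.get("CMSSW_BASE"):
        problems.append("CMSSW environment detected (CMSSW_BASE is set)")
    # The tutorial exports RUCIO_ACCOUNT before using the CLI.
    if not env.get("RUCIO_ACCOUNT"):
        problems.append("RUCIO_ACCOUNT is not set")
    return problems


# Report any problems found in the current shell environment.
for problem in check_rucio_env(os.environ):
    print("WARNING:", problem)
```

An empty list means both conditions hold; the proxy created by voms-proxy-init is not checked here.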
Quota¶
First, check your Rucio quota:
rucio list-account-usage $RUCIO_ACCOUNT
Expected output:
[tseethon@lxplus808 ~]$ rucio list-account-usage tseethon
+------------+-----------+------------+--------------+
| RSE | USAGE | LIMIT | QUOTA LEFT |
|------------+-----------+------------+--------------|
| T2_CH_CERN | 0.000 B | 100.000 GB | 100.000 GB |
| T2_IT_Rome | 19.070 GB | 2.000 TB | 1.981 TB |
+------------+-----------+------------+--------------+
+------------------+---------+---------+--------------+
| RSE EXPRESSION | USAGE | LIMIT | QUOTA LEFT |
|------------------+---------+---------+--------------|
+------------------+---------+---------+--------------+
If you do not have any quota yet, see the quota request entry in the FAQ.
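
If you want to check the quota from a script, the table printed above can be parsed as plain text. This is a rough sketch that assumes the output keeps the four-column layout shown in the example; `parse_account_usage` and `rses_with_quota_left` are hypothetical helper names, and the sample text is copied from the expected output above.

```python
SAMPLE_OUTPUT = """\
+------------+-----------+------------+--------------+
| RSE | USAGE | LIMIT | QUOTA LEFT |
|------------+-----------+------------+--------------|
| T2_CH_CERN | 0.000 B | 100.000 GB | 100.000 GB |
| T2_IT_Rome | 19.070 GB | 2.000 TB | 1.981 TB |
+------------+-----------+------------+--------------+
"""

COLUMNS = ("rse", "usage", "limit", "quota_left")


def parse_account_usage(text):
    """Turn the printed usage table into a list of dicts, one per RSE row."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Keep only "| ... |" rows; borders start with "+" or "|-".
        if not line.startswith("|") or line.startswith("|-"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header rows and anything without exactly four columns.
        if len(cells) != 4 or cells[0] in ("RSE", "RSE EXPRESSION"):
            continue
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def rses_with_quota_left(rows):
    """Return the RSE names whose QUOTA LEFT value is greater than zero."""
    return [row["rse"] for row in rows
            if float(row["quota_left"].split()[0]) > 0]


rows = parse_account_usage(SAMPLE_OUTPUT)
print(rses_with_quota_left(rows))
```

The values are kept as the strings Rucio prints (for example "19.070 GB"); no unit conversion is attempted.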
Submit task with Rucio stageout¶
We will submit a simple analysis task that uses a HammerCloud dataset as input.
pset.py:
from __future__ import division
import FWCore.ParameterSet.Config as cms
process = cms.Process('NoSplit')
process.source = cms.Source("PoolSource", fileNames = cms.untracked.vstring('root://cms-xrd-global.cern.ch///store/mc/HC/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v2/00000/8ADD04E5-1776-E711-A1BA-FA163E6741E0.root'))
process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(10))
process.options = cms.untracked.PSet(wantSummary = cms.untracked.bool(True))
process.output = cms.OutputModule("PoolOutputModule",
    outputCommands = cms.untracked.vstring("drop *", "keep recoTracks_globalMuons_*_*"),
    fileName = cms.untracked.string('output.root'),
)
process.out = cms.EndPath(process.output)
crabConfig.py:
from WMCore.Configuration import Configuration
config = Configuration()
config.section_('General')
config.General.transferLogs = False
config.General.requestName = 'rucio_transfers_tutorial'
config.section_('JobType')
config.JobType.pluginName = 'Analysis'
config.JobType.psetName = 'pset.py'
config.JobType.maxJobRuntimeMin = 60
config.section_('Data')
config.Data.totalUnits = 10
config.Data.splitting = 'LumiBased'
config.Data.publication = True
config.Data.unitsPerJob = 1
config.Data.outputDatasetTag = 'ruciotransfer-tutorial'
config.Data.outLFNDirBase = '/store/user/rucio/tseethon/'
config.Data.inputDataset = '/GenericTTbar/HC-CMSSW_9_2_6_91X_mcRun1_realistic_v2-v2/AODSIM'
config.section_('User')
config.section_('Site')
config.Site.storageSite = 'T2_CH_CERN'
config.section_('Debug')
The most important part of the CRAB config is config.Data.outLFNDirBase.
CRAB recognizes Rucio stage-out only when the output LFN is prefixed with /store/{user,group}/rucio/${rucioaccount}.
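
The prefix rule can be checked before submitting. The sketch below only mirrors the rule as stated in this tutorial; how CRAB itself performs the check is not shown here, and `rucio_account_from_lfn` is a hypothetical helper name.

```python
import re

# /store/user/rucio/<account>/... or /store/group/rucio/<account>/...
RUCIO_LFN_PATTERN = re.compile(r"^/store/(user|group)/rucio/([^/]+)(/|$)")


def rucio_account_from_lfn(lfn_dir_base):
    """Return the account in the LFN prefix, or None if the prefix does not match."""
    match = RUCIO_LFN_PATTERN.match(lfn_dir_base)
    return match.group(2) if match else None


# Value of config.Data.outLFNDirBase from crabConfig.py above.
print(rucio_account_from_lfn('/store/user/rucio/tseethon/'))
# A plain /store/user/<username>/ path does not match the Rucio prefix.
print(rucio_account_from_lfn('/store/user/tseethon/'))
```

A return value of None means the path does not carry the Rucio prefix, so by the rule above the task would not use Rucio stage-out.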
Then submit the task with the usual crab submit command:
crab submit -c crabConfig.py
Wait until some jobs finish and move to the PostJob stage (the job state changes from "running" to "transferring" in the crab status output).
Inspect "transferring" status¶
Assume that crab submit has returned the task name of our task.
To inspect the transferring status:
- Run crab status and look for the line "Transfer container's rule:". Copy the link on that line and open it in your web browser.
- Check the state field: "OK" means that the files have been transferred to the destination (the destination is shown in the rse_expression field).
- You can look at individual files in Locks Overview.
- You can click the hyperlink in the "name" field to see the content of the container.
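
The first step can also be scripted by searching saved crab status output for the line that carries the rule link. This is only a sketch: `find_rule_link` is a hypothetical helper, it assumes the link sits on the same line as the "Transfer container's rule:" label (as described above), and the sample text uses a placeholder URL rather than real crab status output.

```python
import re

RULE_LABEL = "Transfer container's rule:"


def find_rule_link(status_output):
    """Return the first URL on the line containing RULE_LABEL, or None."""
    for line in status_output.splitlines():
        if RULE_LABEL in line:
            match = re.search(r"https?://\S+", line)
            if match:
                return match.group(0)
    return None


# Placeholder text for illustration only; real output contains more lines
# and a real link to the rule page.
sample = (
    "Some other status line\n"
    "Transfer container's rule:\thttps://example.invalid/rule?rule_id=PLACEHOLDER\n"
)
print(find_rule_link(sample))
```

The returned link is the one to open in the browser; checking the state and rse_expression fields is still done on the page it points to.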