User Guides

Accessing Felina

Use an SSH client to connect to the host felina.math.utep.edu.

Cray XD1 Documentation

See the documentation in the felina:/opt/XD1/documents/ directory. Recommended reading:

  • S-2429-131_CrayXD1SystemOverview.pdf
  • S-2433-131_CrayXD1Programming.pdf
  • man sge_intro

Scratch Directories

Do not store large data sets in your home directory. Use /home/HOSTNAME/scratch instead, where HOSTNAME is the name of a node (f740-1, f740-2, ... f742-6). These directories are mounted automatically when accessed. Please keep them clean: store your data in a subdirectory named after your login name, and delete data you no longer need. The scratch directories are excluded from backup.
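The naming convention above can be sketched as a small shell helper. The function name scratch_dir is hypothetical (it is not installed on felina); it only assembles the path /home/HOSTNAME/scratch/LOGIN described in this section, defaulting to the login name reported by id -un.

```shell
# Hypothetical helper (not part of the system): print the per-user scratch
# path on a given node, following the /home/HOSTNAME/scratch convention.
scratch_dir() {
    node="$1"                   # node name, e.g. f740-1
    user="${2:-$(id -un)}"      # login name; defaults to the current user
    printf '/home/%s/scratch/%s\n' "$node" "$user"
}

# Typical use on felina (left commented out so the sketch has no side effects):
#   mkdir -p "$(scratch_dir f740-1)"     # create your directory on node f740-1
#   rm -rf "$(scratch_dir f740-1)/old"   # delete data you no longer need
```

Accessing the path, for example with mkdir -p, is what triggers the automatic mount.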

There is also a distributed parallel file system, pvfs2, virtually mounted in the /scratch directory. It is a faster way to read and write data from a parallel program. To access this file system you must use parallel I/O.

Compiling an MPI program

Use one of the MPI compiler wrappers:

mpicc    C Language
mpiCC    C++ Language
mpif77   FORTRAN 77

For example:

> mpicc cpi.c -o cpi

Running an MPI program

  • Create a job script, say runit, that contains the following line:
    mpirun -np $NSLOTS -hostfile $TMPDIR/machines $HOME/path/to/your/mpi_executable
    
    For a description of the command options and arguments, see the mpirun(1) man page. In addition, Cray XD1 System Administration (S-2430) describes some aspects of its usage.
  • Submit the job to run on all processors in the compute partition:
    > qsub -pe am.mpi 68 -l pn=compute runit
    
    The command displays the job ID.
  • Monitor the progress of the job by using the following command (repeatedly if necessary):
    > qstat
    
    In the output of this command, check the state value of the job. If this is the only job in the system, the state should move quickly from qw (queued, waiting) to r (running). After the job terminates, it is removed from the workload management (WLM) system; if no other jobs remain, the qstat command produces no output.
  • Examine the output file of the job in your working directory; for example:
    > more ~/runit.o###
    
    where ### is the job ID. The output file contains both messages from the job management subsystem and the output of the application.
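The steps above can be sketched in shell as follows. The first part writes the one-line job script to a file named runit. The two functions job_in_queue and wait_for_job are hypothetical helpers, not part of the system; they assume that qstat prints the job ID in the first column of each job row, and the 30-second polling interval is an arbitrary default.

```shell
# Write the job script from the first step to a file named runit.
# The quoted 'EOF' keeps $NSLOTS, $TMPDIR and $HOME unexpanded, so they are
# resolved when the job runs, not when this file is written.
cat > runit <<'EOF'
#!/bin/sh
mpirun -np $NSLOTS -hostfile $TMPDIR/machines $HOME/path/to/your/mpi_executable
EOF

# Submit it as shown above (run this on felina only):
#   qsub -pe am.mpi 68 -l pn=compute runit

# Hypothetical helper: succeed while job ID $1 appears in the first column
# of the qstat output (assumed layout), fail once it is no longer listed.
job_in_queue() {
    qstat | awk -v id="$1" '$1 == id { found = 1 } END { exit !found }'
}

# Hypothetical helper: poll qstat until job $1 has left the queue.
# $2 is the polling interval in seconds (default 30).
wait_for_job() {
    while job_in_queue "$1"; do
        sleep "${2:-30}"
    done
}
```

For example, if qsub reported job ID 123, running wait_for_job 123 followed by more ~/runit.o123 would show the output once the job has finished. Replace the placeholder path in runit with the location of your own executable before submitting.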

You can also submit a job using the Active Manager GUI (only from UTEP).

FAQ

Q: What are the rules regarding the use of the machine?

  • Use the master node (f742-6) only for preparing your computation (editing and compiling), not for running it.
  • Felina is intended for parallel computing. If you need to run a long-running serial job, let us know.

Q: Is there a limit to the number of processors one can use?

No. You are limited only by the number of processors in the compute partition (currently 68 processors are available).