29. May 2026 · Chimera · Categories: News, Novinky, Tutorials

Accessing the Chimera Cluster

Most of the Chimera cluster services are accessible using your existing SIS login/password.

To log in to the head node of the Chimera cluster, use the ssh command and enter your SIS credentials:

ssh <sis_login_name>@hpc.troja.mff.cuni.cz

This head node is not intended for any CPU-intensive tasks.

Other services:

  1. GitLab account at gitlab.mff.cuni.cz (Note: the local Karlín GitLab is a separate service at gitlab.karlin.mff.cuni.cz)
  2. Initialize your JupyterHub Chimera account by logging in at https://hpc.troja.mff.cuni.cz:8000
  3. Optional: Mattermost (collaboration platform designed for technical and operational teams).
  4. To report a general problem with the Chimera cluster, and to get notifications from admins about planned outages, use https://gitlab.mff.cuni.cz/mff/hpc/clusters/-/work_items

Users from the mathematical section: please send an email to hron@karlin.mff.cuni.cz to be added to the math account, which gives priority access to the partition of the mathematical section (= hardware resources).

Computational resources

The cluster is managed by the SLURM system, which divides the resources into partitions that serve as queues for jobs with different priorities. Each user belongs to certain accounts, which give them access to selected partitions.

Accounts

To see the accounts you are a member of, use the command sshare -U. Every user should be a member of the ffa (free for all) account.
Please send an email to hron@karlin.mff.cuni.cz to be added to the math account and get priority access to the partition of the mathematical section.

Partitions

The list of all partitions can be generated with the command sinfo -s. Partitions starting with ffa (ffa-short, ffa-preempt, ffa-checkpoint, ffa-long) are available to everyone, with low priority. The math partition is available to all members of the math account.

You can use our utility freenodes to see the status of nodes and partitions. To see a particular partition (for example math), use freenodes -p math.

freenodes -p math

Cluster: chimera
node name ↔ busy cores state alloc cores in: partitions
r41⇄ ( 1 of 36x2) part 2BJMUV [..................|..................]
r42⇄ ( 0 of 36x2) free 2BJMUV [..................|..................]
r43⇄ ( 0 of 36x2) free 2BJMUV [..................|..................]
r44⇄ ( 0 of 36x2) free BJMUV [..................|..................]
r45⇄ ( 0 of 36x2) free BJMUV [..................|..................]
r46⇄ ( 0 of 36x2) free BJMUV [..................|..................]
r47⇄ ( 0 of 36x2) free BJMUV [..................|..................]
r50⇄ ( 0 of 192x2) free JUV [..... ...... .... .... ..... ....... ....... ... ....... ...... ... ........ ... ..... ..... .... ...... ... .....| ................ ................ ........... ............... ............... .............. .........]
There are 111 running jobs and 37 queued pending jobs.
partition allocation duration cores jobs running [ queue % ] jobs in queue next job to go in hh:mm
J math 1 day (max 0h) 444 0 ( 0 cores) [..........] 0% 0 ( 0 cores)
TOTAL 444 0 ( 0 cores) [..........] 0% 0 ( 0 cores)

A more detailed tutorial on SLURM usage is in preparation.
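
As a starting point, the following is only a minimal sketch of a batch job, not an official template. It assumes that the ffa account may submit to the ffa-short partition and that a 5-minute job fits that partition's time limit; the file name hello.sbatch and the job name are made up for illustration.

```shell
# Write a minimal batch script to the current directory.
cat > hello.sbatch <<'EOF'
#!/bin/bash
# Job name shown in the queue (hypothetical).
#SBATCH --job-name=hello
# Account and partition: assumed values, check "sshare -U" and "sinfo -s".
#SBATCH --account=ffa
#SBATCH --partition=ffa-short
# One task on one core, for at most 5 minutes.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:05:00
# Standard output goes to hello-<jobid>.out (%j expands to the job ID).
#SBATCH --output=hello-%j.out

# The actual work: print the name of the compute node.
srun hostname
EOF

# On the head node, submit the job and watch the queue with:
#   sbatch hello.sbatch
#   squeue -u "$USER"
```

Running the job through sbatch keeps the computation on a compute node rather than on the head node.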

Data storage options

Every user has a home folder /home/<sis_login_name> with a quota of 10 GB and a hard limit of 15 GB of disk space.

If that is not sufficient, please send an email request to J. Hron (hron@karlin.mff.cuni.cz) or J. Eliášek (Jiri.Eliasek@mff.cuni.cz) to set up a work folder with a larger amount of disk space.

  • /home/... home folder, fast, space limited
  • /work/... storage for large data and computed results
  • /scratch/... temporary fast storage
  • /archive/... slow, long-term data storage
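
To keep an eye on the home quota, a small helper like the one below may be handy. It is a rough sketch: usage_report is a hypothetical function name, it measures usage with du rather than querying whatever quota mechanism the cluster actually enforces, and the 10 GB figure is taken from the text above.

```shell
# usage_report DIR QUOTA_GB
# Print the disk usage of DIR and the percentage of a quota given in GB.
usage_report() {
    dir=$1
    quota_gb=$2
    # du -sk reports the total size in units of 1024 bytes.
    used_kb=$(du -sk "$dir" 2>/dev/null | cut -f1)
    quota_kb=$((quota_gb * 1024 * 1024))
    pct=$((used_kb * 100 / quota_kb))
    echo "$dir: $used_kb KB used, ${pct}% of $quota_gb GB quota"
}

# Example call for the home folder and its 10 GB quota:
#   usage_report "$HOME" 10
```

Because du walks the whole directory tree, the call can take a while on a large home folder.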

Environment modules

Software packages for users:

module available

--------------------------------------------------------- /etc/modulefiles ----------------------------------------------------------
DMTCP oneapi/debugger/2023.2.0 oneapi/intel_ippcp_intel64/2021.8.0
Mathematica/13.3 oneapi/debugger/2025.2.0 (D) oneapi/intel_ippcp_intel64/2025.2 (D)
UKRmol oneapi/dev-utilities/latest oneapi/ishmem/latest
....

Migration of additional modules from the Sněhurka cluster is still ongoing. Send an email to J. Hron (hron@karlin.mff.cuni.cz) to request additional modules.

module use /home/hron/WORK/pkg/modulefiles
module load Arch/linux-rocky9-x86_64
module available

------------------------------------- /home/hron/WORK/pkg/Modules/User/linux-rocky9-x86_64/Math -------------------------------------
pari/2.17.3

------------------------------------- /home/hron/WORK/pkg/Modules/User/linux-rocky9-x86_64/Sets -------------------------------------
fenics/2019.1.0 (E) firedrake-env/2025.10 firedrake/2025.10 python-env/2026.04

------------------------------------- /home/hron/WORK/pkg/Modules/User/linux-rocky9-x86_64/Core -------------------------------------
R/4.5.2 (E) dmd/2.081.1 python/3.13.12 rclone/1.70.2 (T,L)

-------------------------------------------------- /home/hron/WORK/pkg/modulefiles --------------------------------------------------
Arch/linux-rocky9-cascadelake Arch/linux-rocky9-x86_64 (L) Arch/StdEnv (D)
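
The listing above only shows what is available. Loading a package might look like the sketch below; load_math_env is a hypothetical helper name, and python/3.13.12 is just one module picked from the listing, so substitute whatever you need.

```shell
# Load the additional module tree and one package from it.
load_math_env() {
    # The "module" command is only defined on the cluster; bail out elsewhere.
    if ! type module >/dev/null 2>&1; then
        echo "module command not available on this machine" >&2
        return 1
    fi
    # Make the additional modulefiles visible, then pick the architecture.
    module use /home/hron/WORK/pkg/modulefiles &&
    module load Arch/linux-rocky9-x86_64 &&
    # Load a package that became visible after the Arch module was loaded.
    module load python/3.13.12 &&
    # Show what is loaded now.
    module list
}

# Example call in an interactive session on the cluster:
#   load_math_env
```

Wrapping the steps in a function stops at the first failing step, so a later module load is not attempted with a missing prerequisite.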

Moving data to/from the cluster

Basic commands to transfer your data from Karlín (r3d3.karlin.mff.cuni.cz) to Chimera (hpc.troja.mff.cuni.cz): first log in to r3d3.karlin.mff.cuni.cz with your Karlín account, then copy your files to Troja as follows:

  • Copy one file from the local machine to the Chimera cluster
    scp local.dat <sis_login_name>@hpc.troja.mff.cuni.cz:/path/to/
  • Copy whole local folder
    scp -r local_folder <sis_login_name>@hpc.troja.mff.cuni.cz:/path/to/
  • Synchronize a directory (recommended for repeated or large transfers)
    rsync -avzP ./results/ <sis_login_name>@hpc.troja.mff.cuni.cz:/path/to/results/

This can also be done by logging in to the Chimera cluster first and then copying the data from the remote location, for example
scp -r <karlin_login>@r3d3.karlin.mff.cuni.cz:/path/to/folder ~/destination/
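
Before deleting anything at the source, it can be worth checking that the copy is complete. The following sketch assumes GNU coreutils and findutils on both machines; make_manifest, verify_manifest and the manifest file name SHA256SUMS are hypothetical names chosen for this example.

```shell
# make_manifest DIR
# Record a SHA-256 checksum for every file under DIR in DIR/SHA256SUMS.
make_manifest() {
    (cd "$1" && find . -type f ! -name SHA256SUMS -print0 |
        sort -z | xargs -0 -r sha256sum > SHA256SUMS)
}

# verify_manifest DIR
# Re-check the files under DIR against DIR/SHA256SUMS.
# Returns non-zero if any file is missing or differs.
verify_manifest() {
    (cd "$1" && sha256sum --check --quiet SHA256SUMS)
}

# Typical use:
#   on the source machine:   make_manifest ./results
#   transfer ./results (including SHA256SUMS) with scp -r or rsync
#   on the destination:      verify_manifest /path/to/results
```

The manifest travels with the data, so the check on the destination needs no connection back to the source machine.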

If you encounter any problems, please contact us at clusteradmin@karlin.mff.cuni.cz.

Useful Chimera resources

Collection of useful notes for migration from Sněhurka cluster to Chimera cluster.

Official Chimera information sources

Contacts

  • general questions about the Chimera cluster: GitLab cluster work items at gitlab.mff.cuni.cz/mff/hpc/clusters or email to J. Eliášek (Jiri.Eliasek@mff.cuni.cz)
  • questions or software additions for the mathematical section: email to J. Hron (hron@karlin.mff.cuni.cz)
  • data transfers from the Sněhurka cluster: email to clusteradmin@karlin.mff.cuni.cz

02. March 2026 · Sněhurka cluster relocation to Troja HPC facility · Categories: Downtime, News, Novinky

Last update: April 22, 2026

Most of the machines in the Sněhurka computing cluster in Karlín were moved to a new server room in Troja and are being integrated into the Chiméra cluster there. For the current status of the relocation process, see Relocation schedule.

This change will affect Sněhurka cluster users in two ways:

  1. During the move (expected to take a few days), the relevant Sněhurka cluster computing nodes will be unavailable. Chiméra cluster nodes will be available during this time.
  2. The control and rules of the Chiméra cluster differ in some respects from those of the Sněhurka cluster. On February 6, 2026, Jaroslav Hron organized a cluster training session where these changes were explained (see below for more detailed information about the training).

Details of the relocation are provided below:

Relocation schedule

The dates listed are tentative and subject to change depending on how the situation develops.

  • April 8, 2026 (postponed from mid-February): completion of the new server room in Troja (if completion is delayed, the relocation dates will also have to be postponed)
  • Probably April 9, 2026 (Thursday; postponed from February 24 and then March 2, 2026): first wave of relocation = transfer of the Troja Chiméra cluster from the old server room to the new one (i.e., Chiméra cluster downtime – a few days)
  • Probably April 15 (Wednesday; postponed from March 10, 2026): second wave of relocation = transfer of the Karlín Sněhurka cluster (and also servers from other locations, e.g., Malá Strana, Jinonice, Ovocný trh) to the new server room in Troja
    • April 14 (Tuesday), 12:00 p.m. (half a day earlier): Sněhurka cluster downtime – preparations for the cluster relocation (disconnecting nodes and cabling)
    • April 15 (Wednesday): relocation of most cluster nodes to Troja
      • ✅ The following was moved to Troja: CPU nodes (r31-r50), InfiniBand 100 Gb switch (and related InfiniBand cabling: for all nodes + 2 more), Ethernet switch (48 x 1 Gb/s, karc)
      • ✅ The following stays in Karlín: GPU nodes (g1–g6), InfiniBand 40 Gb switch, head nodes (r3d3, r0d0), r6 (for GitLab continuous integration), disk array ("home" and "work", pole2018)
    • After the relocation of hardware (as soon as possible, but it may take a few days):
      • ✅ Karlín: Reconnecting the remaining cluster components (g1–g6, switches, etc.) and bringing them back online. The head node (r3d3), disk array ("home" and "work"), and GPU nodes (g1–g6) will continue to function as before; however, all CPU nodes (r31–r50) will no longer be present.
        • ✅ Notify users that the GPU nodes (g1–g6) and disk array are up and running again.
      • Troja: Integrating all Karlín CPU nodes (r31–r50) into the Chimera cluster and setting up an environment similar to the one in Karlín:
        • ✅ Installation of nodes and switches in a rack, and cabling. Connecting cluster nodes (r31–r50) to the original InfiniBand switch (100 Gb/s) and connecting that switch to the local InfiniBand infrastructure.
        • ✅ All nodes: OS installation and configuration, integration with SLURM.
        • Modules (software) – installation and verification
        • Creating partitions (queues) similar to those in Karlín
        • Create a user group (probably named "math") that will have priority access to the cluster nodes from Karlín
        • Notify users that the Karlín cluster relocation is complete
      • Troja: disk space expansion:
        • ✅ New storage servers (with all those storage disks) arrived
        • Get storage servers up and running
        • Notify users that disk space has been expanded (and that they can start transferring data from Karlín)

Transfer of cluster data from Karlín to Troja

Instructions on how to transfer data to the Chimera cluster are available on the Chimera (useful notes for migration) webpage.

The current situation is as follows:

  1. The data for the Karlín cluster (both cluster "home" and "work") is now stored on an old disk array (pole2018).
    • We purchased the disk array in 2018 (i.e., 8 years ago), and it is no longer under warranty.
    • The disk array has two controllers (components that manage the operation of the entire array) – one serves as a backup for the other. However, one controller is no longer functional, so while the array continues to operate, we no longer have a "backup" controller.
    • The data in cluster "home" is backed up (usually every night), but the data in "work" is not backed up.
  2. Recently, user demand for cluster storage space has increased, so we're struggling with a shortage of available space.
  3. We will need space for users who will continue to use nodes g1–g6 (which will remain in Karlín).
  4. The Chimera cluster in Troja is also currently struggling a bit with a lack of space. They’ve already purchased additional storage, but the priority right now is to complete relocation of all clusters to Troja and ensure that everything is fully operational. The disk space expansion will take place afterward; we expect it to take a matter of weeks.

In light of the above:

  1. Once the storage space in Troja has been expanded (i.e., once there is sufficient space there), we will ask Karlín users who do not use the Karlín GPU nodes (g1–g6) to move their cluster data from Karlín to Troja (not just "copy", but actually "move" – to free up space in Karlín). After that, only the data of GPU node (g1–g6) users should remain in Karlín (on the pole2018 disk array).
  2. We then plan to move the remaining data (i.e., data from users of g1–g6) to a Karlín storage location that is better protected against failure (than the pole2018 disk array). Details are yet to be determined.

What is the Chimera cluster?

The Chimera cluster is an HPC (high-performance computing) cluster currently located in the "old" server room in Troja (and will also be moved to the "new" server room in Troja). More detailed information about the cluster can be found on the website: https://www.mff.cuni.cz/en/hpc-cluster/ (login via CAS is required).

The cluster website also has a section dedicated to Introductory training (June 2022), which includes a link to slides [pptx, 17 slides, 9 MB] (the slides don't contain all the material covered in the hands-on session) and a recording of the entire training session [mp4, 2 h 25 min, 1.7 GB].

Where and why is the Karlín cluster moving?

There are several different computing clusters in various locations at the Faculty of Mathematics and Physics. A new server room is currently being completed in Troja, to which clusters from other locations will gradually be moved and unified under central administration.

Advantages of consolidating and unifying computing clusters:

  • More efficient use of technical resources:
    • Space, electricity, cooling, data network, data storage
  • More efficient use of human resources:
    • Hardware and software management, easier sharing of know-how
  • Easier work for users: it is difficult to work with multiple different clusters (different accounts, controls, settings, rules, etc.)
  • The unified cluster will have support that local clusters do not have:
    • In the first half of 2026, the cluster will be expanded with new computing nodes (CPU and GPU) and other equipment worth a total of approximately CZK 24 million (these new nodes will be available to all users)

How will the sharing of unused computing capacity work?

The sharing of computing nodes will work on the following principle:

  1. Those who provided/financed the computing nodes (schools, departments, groups) will have priority access to these nodes (in the form of higher priority computing queues).
  2. However, when the nodes are not in use, anyone else will be able to use them. Each node in the cluster will be part of a queue that will have the lowest priority but will be of the FFA (free for all) type.

Which Karlín machines will be moved to Troja?

The following will be moved to Troja:

  • all CPU nodes (r31–r50)
  • InfiniBand 100 Gb switch

In addition, two new computing CPU nodes purchased from the UNCE project (prof. J. Málek) will be delivered directly to Troja (expected delivery date is in the first half of 2026).

The following will not be moved to Troja:

  • all GPU nodes (g1–g6), because their form factor (dimensions and other physical characteristics, including cooling requirements) is not suitable for mounting in server room rack stands
  • InfiniBand 40 Gb switch

What is the RSE (Research Software Engineering) group?

The RSE group is a university group of people (currently approx. 3–4 employees) who can help users with the use of the cluster. It serves as an interface between the latest technologies and the academic environment.

The main goal of RSE is to reduce barriers to computing resources, for example:

  • Assistance in the development of scientific code (new features, optimization, parallelization, version control)
  • Deployment on new computing infrastructure
  • Commissioning of complex computing pipelines, selection of suitable tools

Website: https://rse.cuni.cz/

Information about RSE was also presented at Jaroslav Hron's training course.

30. July 2015 · Free disk space · Categories: News, Novinky

Recently, the disk reserved for /usr/nobackup has been slowly filling up. If you have data that you know you will not need, it is time to delete it 🙂

Information and recommendations on how to handle larger volumes of data can be found here.

New disk space with a larger capacity is being prepared and will be put into operation in the near future.