Grids and supercomputers
This page provides information about grids and supercomputers - the technologies used in MetaCentrum. You will learn which problems can be solved with MetaCentrum resources, what software you need, how to access MetaCentrum, and how to prepare and submit your first job.
Grids
The term Grid was introduced in 1998 by Carl Kesselman and Ian Foster in the book The Grid: Blueprint for a New Computing Infrastructure. Since the word has many meanings, it deserves some explanation here.
Evolution of the definition
Carl Kesselman and Ian Foster have defined the grid as follows:
1998/99 A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities. A grid has to fulfill the following three points:
- coordinates resources that are not under centralized management
- uses standard, open, generic protocols and interfaces
- provides a non-trivial quality of service (more than each individual part on its own)
2001 ... coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.
2002 ...a Grid is a system that: coordinates resources that are not subject to centralized control … using standard, open, general-purpose protocols and interfaces … to deliver nontrivial qualities of service …
A computational grid is often described by analogy with the electrical power network. Around 1910, every electrified building had its own electricity generator, and those generators were not interconnected. No consumer could use more capacity than a single generator provided, the total capacity of all generators was used inefficiently, and installing electricity was expensive. The real take-off of electricity came only after large power stations were interconnected and a distribution network brought power to consumers, which made electric power a cheap, ubiquitous, and standardized commodity. Similarly, today every organization manages its own computational capacities (computers, disk space, software, data, specialized hardware) that cannot be effectively shared with others. A computational grid should enable such effective sharing of computational capacities among organizations.
Features of a Grid
- Heterogeneous hardware and software
- Loosely-coupled machines
- Potentially unreliable low-speed network
- Several distributed locations
- Several administrative domains
- Potentially insecure
- Grid middleware (ARC, gLite, etc.)
- Best for embarrassingly parallel jobs
- Focus on high throughput / capacity computing
Scientific Areas of Interest
Mostly Computational Sciences
- Computer Science
- Computational Chemistry, Cheminformatics
- Bioinformatics, Biomedical Computing (e.g., Imaging)
- Physics (High Energy Physics, Plasma Physics, Solid State Physics, Theoretical Physics, QCD, etc.)
- Earth Observation, Satellite Imaging, Meteorology, Climatology, Geography, Geology, etc.
- Nanotechnology
- Many more
Supercomputers
A supercomputer is a computer with a high-level computational capacity compared to a general-purpose computer.
What are supercomputers good for
Supercomputers are ordinary computers, just more powerful than "normal" ones. It is worth understanding where this extra power comes from.
In one type of supercomputer – the kind the public usually considers a "real" supercomputer – the extra power comes from processors that execute certain operations much faster. This is the case of vector supercomputers, whose processors handle whole vectors (n-tuples) of numbers at the speed at which ordinary processors handle scalar operands (single numbers). MetaCentrum, however, does not operate a vector supercomputer yet.
In general, supercomputers need not have faster processors than those available in PCs or simple servers; they just have a lot of them. That has a fundamental impact on which problems a supercomputer can solve faster and which it cannot.
If one labourer does a job in 2 hours, then 100 labourers...
A real-life example: imagine a large corn field and one farmer with a scythe. It will take him two weeks to cut the field. If you send two farmers, it will take one week; 14 farmers cut the field in one day, 140 farmers in an hour, and 8,400 farmers in a minute. Cutting the field is a good example of a parallelizable job: the more workers you use, the sooner the task is done.
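The arithmetic behind the example is plain division of a fixed amount of work among the workers. The following minimal Python sketch reproduces the numbers above; the only assumption added here is a 10-hour working day, which makes the farmer counts consistent:

    # Perfectly parallelizable job: the total amount of work is fixed and
    # the workers split it evenly among themselves.
    # Assumption for illustration: a 10-hour working day, so two weeks of
    # one farmer's work = 140 farmer-hours in total.
    TOTAL_WORK_HOURS = 14 * 10  # 140 farmer-hours

    def time_to_finish(workers: int) -> float:
        """Hours needed when the work divides evenly among the workers."""
        return TOTAL_WORK_HOURS / workers

    for workers in (1, 2, 14, 140, 8400):
        print(f"{workers:5d} farmers -> {time_to_finish(workers):8.3f} hours")

Running it prints roughly 140 hours for one farmer, 10 hours (one working day) for 14 farmers, one hour for 140 farmers, and about a minute for 8,400 farmers.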
Now imagine one broken watch. One watchmaker repairs it in 10 hours. Two watchmakers also need 10 hours, and so do 8,000 watchmakers, because only one watchmaker can work on the watch at a time while the rest have to wait. Repairing the watch is a non-parallelizable job.
A supercomputer is not one worker (i.e. processor) who works faster than the others; it is a large group of well-coordinated workers. It depends only on you whether they all work hard or whether most of them sit idle.
It is worthwhile to use supercomputers for computational jobs that can be solved by many processors working together. If each step of your job depends on the result of the previous step, it is not possible to use more than one processor, and a supercomputer cannot speed up your job.
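How much extra processors actually help depends on the fraction of the job that can run in parallel. This is commonly summarized by Amdahl's law: with a parallel fraction p and n processors, the speedup is 1 / ((1 - p) + p/n). A small sketch follows; the fractions used are arbitrary examples, not measurements of any particular job:

    # Amdahl's law: speedup of a job in which only the fraction `parallel`
    # of the work can be spread over `n` processors; the rest stays serial.
    def amdahl_speedup(parallel: float, n: int) -> float:
        return 1.0 / ((1.0 - parallel) + parallel / n)

    # The corn field is essentially 100 % parallel, the broken watch 0 %;
    # most real jobs fall somewhere in between (0.5 and 0.95 are made up).
    for parallel in (0.0, 0.5, 0.95, 1.0):
        print(f"parallel fraction {parallel:.2f}: "
              f"speedup on 100 CPUs = {amdahl_speedup(parallel, 100):.2f}")

Even with 100 processors, a job that is only half parallelizable runs barely twice as fast, while a fully parallel job runs 100 times faster.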
If your computational jobs can be split into parts processed in parallel, welcome to MetaCentrum.
Long-running jobs
MetaCentrum can offer you an environment for long-running jobs even if your job is not parallelizable. Some jobs take months, and during that time a power blackout or an operating-system crash can occur on your PC, or somebody may simply switch it off.
MetaCentrum operates machines designed to run for long periods. Their power supply is backed up by batteries and a diesel generator, and they are housed in a guarded, air-conditioned hall.
If you have computational jobs that may run for months, welcome to MetaCentrum too!
Clusters and Grids
Let us introduce the supercomputers' relatives – clusters and grids – which are also used in MetaCentrum.
A cluster is a set of identical computers connected by a communication network. A grid is a set of many computers belonging to various organizations and loosely connected over a network; the computers in a grid may differ from one another.
Clusters and grids provide computational power on the same principle as supercomputers: they also offer many processors. The difference lies in the interconnect. In a supercomputer the processors are connected by a very fast internal network with minimal latency and have access to shared memory. Clusters and grids use an ordinary IP network and non-shared (distributed) memory. The advantage of clusters and grids is their lower price, so it is possible to acquire more processors for the same money.
Within a cluster, the processors can be linked by ordinary switched networks, or by special high-speed, low-latency interconnects such as Myrinet or InfiniBand. In grids we depend on wide-area networks, whose total throughput is significantly lower and whose latency is many times higher. More info...
The increasing latency affects parallelizable jobs. Recall the example with the farmers and the corn field: we did not count the time needed to organize their work. It can easily happen that coordinating the job takes more time than solving it.
Parallel computers coordinate their jobs faster than clusters, and clusters faster than grids; on a supercomputer, jobs can exchange messages faster and more often. Despite this, clusters and grids are an effective (economical and productive) alternative to specialized supercomputers, and they can be used to solve large parallelizable jobs.
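A rough way to see why coordination cost matters is to extend the corn-field arithmetic with a per-worker overhead: beyond a certain point, adding workers slows the job down, and the slower the network, the earlier that point comes. All numbers in this sketch are invented for illustration and do not describe any real MetaCentrum interconnect:

    # Toy model: total time = work split evenly among the workers
    #            + a coordination cost that grows with the number of workers.
    WORK_HOURS = 140.0  # the corn field from the example above

    def total_time(workers: int, coord_cost_per_worker: float) -> float:
        return WORK_HOURS / workers + coord_cost_per_worker * workers

    # The "fast" and "slow" coordination costs are arbitrary illustrative values.
    for label, cost in (("fast interconnect", 0.001), ("slow wide-area network", 0.05)):
        best = min(range(1, 2001), key=lambda n: total_time(n, cost))
        print(f"{label}: best around {best} workers, "
              f"{total_time(best, cost):.2f} hours")

With the low coordination cost it pays off to use a few hundred workers; with the high cost the optimum drops to a few dozen workers and the best achievable time is several times longer.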
We offer supercomputers with many processors and shared memory, as well as clusters composed of hundreds of machines with dual- or quad-core processors. We also provide access to international grids, where a large number of processors are available. Keep in mind, though, that a larger number of available processors also makes them harder to work with.
Don't worry, we are able to help you. You can start with dual-core processors and then move on to more advanced configurations. The current list of available machines is on the page List of machines in MetaCentrum.