UNICOS
UNICOS is a family of proprietary Unix-based operating systems developed by Cray Research (later Cray Inc.) for its high-performance supercomputers. It succeeded the earlier Cray Operating System (COS) and was the first 64-bit implementation of Unix, built to support vector processing and massive parallelism.[1][2] Originally released in 1985 and derived primarily from AT&T's UNIX System V with influences from the Fourth Berkeley Software Distribution (4BSD), UNICOS provided a scalable, multi-user environment optimized for scientific computing workloads, including multi-threading, multitasking, and POSIX compliance.[1][3] It supported symmetric multiprocessing (SMP) on 32 or more processors, high-performance I/O subsystems capable of handling file systems up to 8 terabytes, and features like the Network Queuing Environment (NQE) for batch job management and Multilevel Security (MLS) for protected environments.[1][2]
The system evolved through several variants to track advancing Cray hardware architectures. UNICOS itself powered machines such as the Cray-1, X-MP, Y-MP, and C90 from the 1980s to the 1990s; UNICOS MAX used a Mach-based microkernel for massively parallel systems like the T3D; UNICOS/mk employed a Chorus microkernel for the T3E; UNICOS/mp was based on SGI's IRIX 6.5 kernel for the X1 series; and UNICOS/lc moved to a SUSE Linux foundation for the XT3, XT4, and XT5, with a lightweight Catamount microkernel on compute nodes.[3][2] Major releases spanned from UNICOS 1.0 in 1986 to version 10.0.0.8 around 2000, with the /mp and /lc lines continuing into the mid-2000s until they were succeeded by the Cray Linux Environment (CLE), starting with version 2.1.[1][3]
UNICOS's design emphasized low I/O latency, data integrity, and resource accounting, which made it well suited to supercomputing applications in fields like weather modeling, nuclear simulations, and astrophysics. Its command-line interface and extensions for Cray-specific hardware ensured compatibility across generations of vector and massively parallel processors.[1][3]
History
Origins
The Cray Operating System (COS), introduced in 1976 with the Cray-1 supercomputer, served as the initial non-Unix baseline for Cray Research's vector processing machines, including early Cray X-MP systems delivered starting in 1982. COS was a proprietary, batch-oriented operating system designed for high-performance computing workloads, emphasizing efficient job scheduling and resource allocation without Unix compatibility, which limited its appeal for multi-user and networked environments.[4]
In 1984, Cray Research initiated development of CX-OS, the precursor to UNICOS, as a Unix-compatible operating system to address these limitations and support growing demands for standard software portability on supercomputers. CX-OS was first deployed as a guest operating system under COS on the Cray X-MP, running in a dedicated memory partition to enable Unix-like functionality without disrupting the primary batch environment; this prototype allowed initial testing of Unix ports on the X-MP hardware. The system originated from efforts to adapt Unix for vector architectures, marking Cray's shift toward a more open, standards-based OS ecosystem.[5][6]
UNICOS, the renamed and fully evolved version of CX-OS, was introduced in 1985 alongside the Cray-2 supercomputer launch, becoming the primary operating system for subsequent Cray platforms. It was derived from AT&T's UNIX System V Release 2, incorporating elements from the Fourth Berkeley Software Distribution (4BSD) to enhance networking, file systems, and multi-user support tailored for supercomputing. Early experimentation occurred at AT&T Bell Laboratories, where a Cray X-MP/24 was installed in November 1985 and ran UNICOS starting in early 1986, enabling Unix pioneers like Dennis M. Ritchie to port and test components from earlier Unix editions on Cray hardware. This deployment solidified UNICOS as a bridge between traditional supercomputer OSes and the Unix paradigm.[7][8][3]
Development Phases
Following the release of UNICOS 1.0 in 1986 (the system had initially been known as CX-OS), UNICOS underwent significant expansions to support evolving Cray vector processor architectures. It had been developed as a Unix derivative for the Cray-2 to replace the older COS operating system because of its lower development and maintenance costs.[6] By 1988, UNICOS was adapted for the Cray Y-MP, incorporating enhancements for the system's multiple vector processors and improved I/O via the Model-E IOS, enabling scalable multiprocessing on configurations with up to eight CPUs.[6] This adaptation maintained binary compatibility with prior systems while optimizing for the Y-MP's 6 ns clock cycle and for air-cooled variants like the Y-MP EL.[6] In 1991, further development extended UNICOS (version 7.0) to the Cray C90, which doubled the vector pipe sets of the Y-MP design. Multithreaded capabilities followed in UNICOS 8.0 (1994), reducing OS overhead to 2-3% on 16-CPU systems and so supporting higher sustained performance on water-cooled vector platforms.[6][1][9] These adaptations continued through the 1990s for other vector systems, including the T90 in the mid-1990s, which featured total immersion cooling and advanced board interconnects, ensuring UNICOS's role as a stable host OS for scientific computing workloads.[6]
UNICOS's integration with Unix standards progressed steadily from its System V Release 2 foundation, incorporating enhancements from subsequent AT&T releases to improve portability and functionality.[3] Early versions drew on System V for core utilities and file management, with incremental updates adding Berkeley Software Distribution (BSD) elements for networking and security by the late 1980s.[10] By the mid-1990s, this evolution culminated in POSIX compliance for process management and ISO standards for networking, facilitating broader interoperability.[6] A key advancement toward X/Open compliance occurred with UNICOS 10.0 in 1997, which achieved X/Open Base 95 branding, certifying conformance to Unix portability interfaces while supporting internationalization via POSIX locales for collation, character types, numerics, and time formatting.[10]
Cray Research's internal R&D efforts drove much of UNICOS's evolution, with dedicated phases focused on Unix porting and performance tuning beginning in the mid-1980s.[6] Collaborations included contributions from Dennis M. Ritchie of Bell Labs to early implementations on X-MP and experimental systems, aiding the transition from proprietary OS elements to a full Unix environment.[3] In the early 1990s, a partnership with DEC for Y-MP EL distribution (1992-1994) and ARPA funding for massively parallel processor (MPP) integrations like the T3D further influenced UNICOS variants, emphasizing microkernel adaptations while preserving the core system's Unix heritage.[6]
UNICOS reached key scalability milestones in the 1990s. Native 64-bit addressing, present since its inception in 1985, allowed up to 8 terabytes of file system capacity on parallel vector processor (PVP) systems and enabled large-scale scientific simulations without memory constraints.[6][10] Symmetric multiprocessing (SMP) support was enhanced starting with the Y-MP in 1988 and evolved to handle up to 32 processors on later vector systems like the T90, using modified kernels for parallel I/O and process scheduling, with disk striping and mirroring for fault tolerance.[6][10] These developments, spanning over 10 major releases, positioned UNICOS as a mature, mainframe-class Unix variant with C2-level security by the decade's end.[6]
Technical Architecture
Kernel and System Design
UNICOS initially featured a monolithic kernel design, optimized for the vector processing architectures of early Cray systems such as the Cray-1, X-MP, and Y-MP. This kernel, derived from AT&T UNIX System V with influences from the Fourth Berkeley Software Distribution (4BSD), integrated core operating system services including process scheduling, inter-process communication, and device drivers directly into a single address space for efficient execution on shared-memory multiprocessors.[11] Process management supported both multitasking, where multiple processes shared the same address space within multitasking groups (m-groups) to enable parallel execution across CPUs, and multiprocessing via macrotasking directives that allowed concurrent program runs on multiple processors.[11] Memory handling used the system's central memory, consisting of 64-bit words organized in interleaved banks with single-error correction and double-error detection (SECDED) for reliability, while avoiding traditional paging or segmentation to align with the hardware's flat address space and high-speed access requirements.[11] File systems were based on the hierarchical structure of System V UNIX, enhanced with Cray-specific I/O primitives for high-bandwidth transfers to solid-state storage devices.[11]
As Cray systems evolved toward massively parallel processing (MPP), UNICOS transitioned to microkernel architectures to improve scalability and modularity. UNICOS MAX, introduced for the Cray T3D, employed a Mach-based microkernel on the processing elements (PEs), which handled basic thread management, virtual memory, and inter-process communication (IPC) via message passing, while the host Y-MP or C90 system ran the full monolithic UNICOS for I/O and job management.[3] This design minimized kernel footprint on PEs to support lightweight, distributed computation without shared memory across nodes. Similarly, UNICOS/mk for the Cray T3E adopted a Chorus-based microkernel, restructuring the OS into distributed servers for functions like process management, file serving, and networking.[12] The Chorus model used actors (address spaces), threads, ports for IPC, and regions for memory protection, allowing OS services to run as user-level processes across PEs in a scalable, fault-tolerant manner, with support PEs hosting multiple servers and application PEs running minimal kernels.[12] This serverized approach enabled distributed operation without requiring the entire OS image in local memory, in contrast to the monolithic UNICOS's reliance on shared central memory.[12]
UNICOS provided Unix-compatible system calls and APIs, ensuring portability for standard applications while incorporating Cray-specific extensions tailored to supercomputing workloads. Core interfaces adhered to System V and BSD standards, including calls like fork(2), exec(2), open(2), read(2), and write(2) for process creation, execution, and I/O, with modifications to support multitasking groups: fork(2) was restricted within m-groups to maintain shared address spaces.[13] Cray extensions included multitasking primitives such as tfork(2) for creating processes within m-groups and resch(2) for rescheduling, alongside asynchronous I/O calls like reada(2) and writea(2) to overlap data movement with computation.[13] For vector operations, extensions integrated with Cray's programming environment through APIs in libraries like LibSci, enabling directives for vectorization without dedicated system calls, though kernel support for efficient memory access patterns underpinned these capabilities.[11]
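The overlap of data movement and computation that reada(2) and writea(2) were meant to enable can be illustrated with a small double-buffering sketch. It is written in Python rather than against the Cray system calls, and a worker thread stands in for the asynchronous request; the names read_chunk, process, and overlapped_sum are hypothetical and are not part of UNICOS.

```python
from concurrent.futures import ThreadPoolExecutor
import io

CHUNK = 4  # bytes per request; kept tiny so the example is easy to follow


def read_chunk(stream, size=CHUNK):
    """Blocking read; run in a worker thread it plays the role of an async request."""
    return stream.read(size)


def process(chunk):
    """Stand-in for computation on data that has already arrived."""
    return sum(chunk)


def overlapped_sum(stream):
    """Sum all bytes, issuing the next read before processing the current chunk."""
    total = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(read_chunk, stream)      # first request in flight
        while True:
            chunk = pending.result()                   # wait for completion
            if not chunk:
                break
            pending = pool.submit(read_chunk, stream)  # start the next read ...
            total += process(chunk)                    # ... while computing on this one
    return total


data = bytes(range(10))
result = overlapped_sum(io.BytesIO(data))
```

Only one request is ever outstanding, so reads stay in order; the gain comes from computing on one chunk while the next is being fetched.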
UNICOS represented the first 64-bit implementation of a Unix operating system, released in 1985 for Cray systems, leveraging the inherent 64-bit architecture of the Cray-1 and subsequent models.[14] Addressing modes supported a flat 64-bit virtual address space, with direct addressing of up to 128 million 64-bit words in central memory on Y-MP systems, using 44-bit physical addresses for bank selection and word offset.[14] Data types aligned with 64-bit words as the native unit: integers, pointers, and floating-point values were 64 bits wide, so large datasets could be handled without the 32-bit limitations of contemporary Unix variants. For example, the long type was 64 bits, and int was also 64 bits wide for compatibility with vector hardware.[11] This design facilitated high-performance numerical computations by avoiding address truncation and supporting full-word arithmetic in both scalar and vector units.[11]
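Because capacities on these systems are quoted in 64-bit words rather than bytes, a short conversion helps put the figures in familiar units. The sketch below assumes that "128 million words" means 128 x 2^20 words; that reading, and the names used, are assumptions made for illustration.

```python
WORD_BITS = 64
WORD_BYTES = WORD_BITS // 8  # one Cray word is 8 bytes


def words_to_bytes(words):
    """Convert a capacity given in 64-bit words to bytes."""
    return words * WORD_BYTES


# Assumption: "128 million words" is read as 128 * 2**20 words.
Y_MP_WORDS = 128 * 2**20
Y_MP_BYTES = words_to_bytes(Y_MP_WORDS)
Y_MP_GIB = Y_MP_BYTES / 2**30
```

Under that reading, the directly addressable central memory on a Y-MP corresponds to 1 GiB.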
Hardware Support
UNICOS was designed to fully exploit the vector processing capabilities of early Cray supercomputers, including the Cray-2, X-MP, Y-MP, and C90 systems, through deep integration with their hardware architectures.[4] The operating system supported dual vector pipelines in these machines, where pipe 0 handled even-numbered elements and pipe 1 processed odd-numbered elements simultaneously, enabling up to two results per clock period across five pipeline segments for enhanced throughput in scientific computations.[15] Vector registers, configurable to 64 elements in Y-MP mode or 128 elements in C90 mode (each 64 bits wide), were managed via an 8-bit vector length (VL) register, allowing programmable lengths from 1 to 128 elements to optimize data processing without excessive overhead.[16]
Instruction scheduling in UNICOS facilitated efficient vector execution by coordinating exchange, fetch, and issue sequences, reserving memory ports (A, B, C, D) and functional units to minimize conflicts.[16] Chaining mechanisms allowed continuous operand flow between vector registers and dedicated functional units (such as add, multiply, and shift), starting operations at any point in the result stream and achieving full concurrency when aligned with the first element's arrival.[15] For vector instructions, readiness times ranged from (VL)/2 + 3 to (VL)/2 + 14 clock periods, depending on the operation, with hold conditions resolved via priority arbitration to prevent stalls from register reservations or memory port contention.[16] This hardware-software synergy, supported by UNICOS kernel primitives and compilers like CF77, enabled automatic vectorization and up to 10-fold performance gains in loop-dominated workloads on these vector architectures.[15]
To address massively parallel processing, UNICOS variants incorporated extensions for the Cray T3D and T3E systems, adapting to their distributed-memory topologies.[17] In the T3D, UNICOS MAX used a 3D torus interconnect to distribute tasks across up to 1,024 Alpha processors, with the kernel providing lightweight process migration and communication primitives tailored to the network's bidirectional links.[18] For the T3E, enhancements in UNICOS/mk supported a 3D torus interconnect, enabling scalable message passing and barrier synchronization across up to 1,456 processors, including hardware-assisted latency hiding through prefetching in the memory controller; GigaRing provided the I/O subsystem.[19] These adaptations ensured efficient load balancing and fault tolerance in large-scale simulations by abstracting the interconnect as a virtual shared address space where feasible.[20]
Scalability features in UNICOS extended to symmetric multiprocessing (SMP) configurations in the J90 and SV1 series, supporting up to 32 processors in shared-memory environments.[21] The kernel managed SMP coherence through cache protocols and interprocessor interrupts, allowing seamless task distribution across vector-capable CPUs in the J90 for mid-range workloads.[22] In the SV1, UNICOS scaled to 32 CPUs, which could be grouped four at a time into multi-streaming processors (MSPs), each delivering 4 GFLOPS peak via eight vector pipes, with OS-level optimizations for bandwidth allocation in the high-bandwidth memory system.[23]
UNICOS included dedicated I/O subsystems with drivers for high-speed peripherals, such as solid-state disks (SSDs) and front-end processors, to sustain data-intensive supercomputing tasks.[24] The I/O Subsystem (IOS) handled up to 512 million words of SSD capacity in Y-MP/C90 configurations, using channels like VHISP (1,800 MB/s) for rapid transfers to vector pipelines.[14] Front-end support integrated with minicomputers via the IOS, managing peripherals like DD-49 disks and IBM-compatible tape drives, while kernel buffers minimized contention during asynchronous I/O operations.
Key Features
Performance Optimizations
UNICOS introduced asynchronous I/O capabilities to Unix systems, pioneering non-blocking operations that allow applications to continue processing while I/O requests, particularly on large datasets, are handled in the background.[1] This feature, implemented through mechanisms like list I/O (which supports both synchronous and asynchronous modes) and raw I/O (enabling direct data transfer to user process space without kernel buffering), was a key enhancement for multitasking environments on Cray supercomputers.[25] By permitting concurrent computation and data movement, asynchronous I/O significantly improved throughput for high-performance computing workloads involving massive scientific datasets.[26]
The operating system maintains a balanced approach to batch and interactive processing, using the Network Queuing System (NQS) for efficient job submission and execution in batch mode, while supporting interactive shells for real-time user sessions.[25] The Unified Resource Manager (URM) oversees resource queuing, employing a fair-share scheduler to equitably allocate CPU time, memory, and other resources based on historical usage patterns, preventing any single user or job from monopolizing system capacity.[25] This integration ensures seamless transitions between batch queues (managed via daemons and commands like qsub) and interactive multiuser modes (run level 2), with configurable limits such as CPU time (jcpulim) applied uniformly to both.[25]
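The idea behind fair-share scheduling, where priority falls as recorded usage grows, can be sketched in a few lines. This is a generic decayed-usage formulation and not URM's actual algorithm; the Account class, the decay factor, and the priority formula are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    shares: int          # entitlement assigned by the administrator
    usage: float = 0.0   # decayed record of CPU seconds already consumed


def decay(accounts, factor=0.5):
    """Age historical usage so that old consumption matters less over time."""
    for acct in accounts:
        acct.usage *= factor


def charge(acct, cpu_seconds):
    """Record newly consumed CPU time against an account."""
    acct.usage += cpu_seconds


def priority(acct):
    """Higher value is scheduled sooner: entitlement divided by recent usage."""
    return acct.shares / (1.0 + acct.usage)


def next_to_run(accounts):
    return max(accounts, key=priority)


alice = Account("alice", shares=10)
bob = Account("bob", shares=10)
charge(alice, 100.0)               # alice has used the machine heavily
first = next_to_run([alice, bob])  # bob: equal shares but no recorded usage
decay([alice, bob], factor=0.0)    # extreme decay: all history is forgotten
```

With equal shares, the account with less recent usage runs first; once the history has decayed away, both accounts are back on equal footing.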
Memory management in UNICOS is tailored for vector processing, incorporating a scalar cache enabled by default to optimize data access patterns in vectorized computations, though it is disabled for multitasking processes unless restricted to a single CPU to avoid coherency issues.[25] System parameters like NBANKS and CHIPSZ allow administrators to fine-tune memory configuration for vector workloads, enhancing bandwidth and reducing latency.[25] For caching, UNICOS leverages Solid State Disk (SSD) technology, such as the Cray SSD-T90 connected via GigaRing, with a logical device cache (ldcache) in central memory to accelerate I/O by buffering sectors from secondary storage before transfer to user space.[25]
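The role of a sector cache such as ldcache, which keeps recently read sectors in fast memory so that repeated requests avoid the slower device, can be sketched as follows. The class name, the dictionary standing in for the disk, and the least-recently-used eviction policy are assumptions for illustration and do not describe the real ldcache implementation.

```python
from collections import OrderedDict


class SectorCache:
    """Tiny read cache keyed by sector number; LRU eviction is an assumption."""

    def __init__(self, backing, capacity):
        self.backing = backing          # dict-like: sector number -> bytes
        self.capacity = capacity        # number of sectors held in memory
        self.sectors = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, sector):
        if sector in self.sectors:
            self.hits += 1
            self.sectors.move_to_end(sector)   # mark as most recently used
            return self.sectors[sector]
        self.misses += 1
        data = self.backing[sector]            # slow path: go to the device
        self.sectors[sector] = data
        if len(self.sectors) > self.capacity:
            self.sectors.popitem(last=False)   # drop the least recently used
        return data


disk = {n: bytes([n]) * 8 for n in range(16)}
cache = SectorCache(disk, capacity=2)
for sector in (0, 1, 0, 2, 1):
    cache.read(sector)
```

In this access pattern only the second read of sector 0 is served from memory; a larger capacity would also have kept sector 1 resident.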
UNICOS integrates closely with Cray's optimizing compilers, including CFT (Cray Fortran Compiler) and CF77 (Cray Fortran 77), providing OS-level support for vectorization, scalar optimization, and multitasking directives that align code generation with the system's hardware architecture.[27] These compilers benefit from UNICOS kernel enhancements, such as efficient memory mapping and I/O handling, enabling fine-tuned performance for scientific applications through features like autotasking and loop-level optimizations.[25] This synergy allows developers to achieve near-peak vector throughput without extensive manual intervention.[28]
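Vectorizing a loop for registers of limited length is commonly done by strip-mining: the iteration space is cut into strips no longer than the maximum vector length. The sketch below shows that general technique in Python using the 128-element maximum quoted under Hardware Support, along with the (VL)/2 + k readiness figures from the same section. It is not output from CFT or CF77, and the function names are hypothetical.

```python
def strip_mine(n, max_vl=128):
    """Yield (start, length) strips covering n elements, each at most max_vl long."""
    start = 0
    while start < n:
        length = min(max_vl, n - start)
        yield start, length
        start += length


def vector_add(a, b, max_vl=128):
    """Elementwise add processed strip by strip."""
    out = [0.0] * len(a)
    for start, vl in strip_mine(len(a), max_vl):
        # One "vector instruction" per strip: vl elements handled together.
        end = start + vl
        out[start:end] = [x + y for x, y in zip(a[start:end], b[start:end])]
    return out


def ready_time(vl, overhead):
    """Clock periods as (VL)/2 + k, where the text gives k between 3 and 14."""
    return vl / 2 + overhead


strips = list(strip_mine(300))
```

A 300-element loop becomes two full strips of 128 elements and one remainder strip of 44, with VL set to the strip length each time.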
Security and Management
UNICOS provides comprehensive user and resource accounting capabilities tailored for multi-user supercomputing environments, using both standard System V accounting and the proprietary Cray System Accounting (CSA) mechanism. CSA tracks detailed metrics such as per-process and per-job CPU usage, memory allocation, device I/O, connect sessions, and disk space consumption, charging users in configurable system billing units (SBUs) that can be weighted differently for each resource. User database (UDB) entries in /etc/udb store account information, including login IDs, passwords, and resource limits like job CPU time (jcpulim) and memory (jmemlim), enabling administrators to enforce policies via commands such as /etc/xadmin for database management and /usr/lib/acct/csarun for daily reports. Periodic summaries are generated with tools like /usr/lib/acct/csaperiod, integrating data from logs in /usr/adm/acct/day/ and /etc/wtmp to support billing and usage analysis.[25]
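Charging by billing units amounts to a weighted sum over the resources a job consumed. The following sketch illustrates that arithmetic only; the resource names, the weights, and the record layout are hypothetical and do not reflect CSA's actual configuration files or record formats.

```python
# Hypothetical weights: billing units charged per unit of each resource.
SBU_WEIGHTS = {
    "cpu_seconds": 1.0,
    "memory_mword_seconds": 0.25,
    "io_mbytes": 0.5,
}


def job_sbus(usage, weights=SBU_WEIGHTS):
    """Weighted sum of one job's recorded resource usage."""
    return sum(weights[resource] * amount for resource, amount in usage.items())


def charge_by_user(job_records, weights=SBU_WEIGHTS):
    """Aggregate billing units per user from (user, usage) records."""
    totals = {}
    for user, usage in job_records:
        totals[user] = totals.get(user, 0.0) + job_sbus(usage, weights)
    return totals


records = [
    ("ada", {"cpu_seconds": 100, "io_mbytes": 20}),
    ("ada", {"cpu_seconds": 50, "memory_mword_seconds": 40}),
    ("bob", {"cpu_seconds": 10}),
]
totals = charge_by_user(records)
```

Changing a weight changes how expensive that resource is relative to the others, which is the sense in which the billing units are configurable per resource.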
File system quotas in UNICOS control disk space and inode usage per user, group, or account ID, with soft limits that issue warnings at 90% utilization (e.g., default 5000 blocks and 200 inodes) and hard limits to prevent overages. Administration occurs through the .Quota60 configuration file and /etc/fstab for automatic enforcement on mounted file systems, monitored via commands like qudu for display, quadmin for management, and dodisk for usage reports; root file systems are excluded from quotas to maintain system stability. Job scheduling integrates with these accounting features via the Unified Resource Manager (URM), which balances batch and interactive workloads using a fair-share algorithm that adjusts CPU priorities based on historical usage stored in the UDB nice field (ranging 0-19). URM allocates resources like multistreaming processors (MSPs) dynamically, supporting up to six MSPs with four CPUs each, while enforcing limits such as process counts (jproclim) and tape units (jtapelim).[25]
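The quota behaviour described above can be sketched as a three-way decision. The sketch assumes one reading of the text: a warning fires once usage reaches 90% of a limit, and the limit itself acts as the hard cap. The function name and return values are hypothetical and do not correspond to any UNICOS command or kernel interface.

```python
def check_quota(used, requested, limit, warn_percent=90):
    """Return 'deny', 'warn', or 'ok' for an allocation of `requested` units."""
    new_total = used + requested
    if new_total > limit:
        return "deny"                              # hard limit: refuse the allocation
    if new_total * 100 >= warn_percent * limit:
        return "warn"                              # allowed, but the user is warned
    return "ok"


BLOCK_LIMIT = 5000   # example block figure from the text
INODE_LIMIT = 200    # example inode figure from the text

small_write = check_quota(used=4000, requested=400, limit=BLOCK_LIMIT)
near_limit = check_quota(used=4000, requested=600, limit=BLOCK_LIMIT)
over_limit = check_quota(used=4900, requested=200, limit=BLOCK_LIMIT)
```

The same check applies to inodes by passing the inode count and INODE_LIMIT instead of block figures.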
Security in UNICOS extends traditional Unix permissions with Cray-specific controls, including discretionary access control (DAC) using standard Unix permission bits for files, directories, and inter-process communication objects. Owners and administrators can modify attributes to enforce read, write, and execute rules, with root privileges granting full access except where user execute permissions are required; immutable or read-only files deny writes regardless of privileges. Multilevel security (MLS) features, available in variants like UNICOS/mk, incorporate mandatory access control (MAC) policies through security labels and privilege assignment lists (PALs), configurable at boot via system parameters, preserving compatibility with UNICOS 9.2 interfaces while enhancing network security. Audit logging captures security-relevant events such as user authentications, access attempts, configuration changes, and MLS violations, recording timestamps, subject identities, and outcomes in append-only files like /usr/adm/sulog for super-user actions and /usr/adm/sl/slogfile for MLS events. The system supports configurable thresholds for audit trail capacity, triggering actions like alerts or single-user mode if exceeded, with tools such as reduce enabling selective review by authorized administrators only.[29][30][25]
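The discretionary check on standard Unix permission bits can be sketched with the constants in Python's stat module. The sketch covers only the owner/group/other decision; it does not model root override, immutable files, or the MAC labels and PALs of the MLS feature, and the function name dac_allows is hypothetical.

```python
import stat


def dac_allows(mode, file_uid, file_gid, uid, gids, want):
    """Check the operations in `want` (a string drawn from 'rwx').

    Exactly one permission class applies: owner first, then group, then other.
    """
    if uid == file_uid:
        bits = {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR}
    elif file_gid in gids:
        bits = {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP}
    else:
        bits = {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH}
    return all(mode & bits[op] for op in want)


MODE = 0o640  # owner read/write, group read, others nothing
owner_rw = dac_allows(MODE, 100, 20, uid=100, gids={20}, want="rw")
group_w = dac_allows(MODE, 100, 20, uid=101, gids={20}, want="w")
other_r = dac_allows(MODE, 100, 20, uid=102, gids={30}, want="r")
```

Because only one class is consulted, an owner whose own bits deny access is refused even when the group or other bits would have allowed it.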
System administration in UNICOS leverages tools like the Network Queuing Environment (NQE) for distributed job management across heterogeneous networks, comprising components such as the Cray Network Queuing System (NQS) for batch execution, a central database for up to 36 servers, and a network load balancer (NLB) for workload optimization. NQE enables job submission, monitoring, and control from clients via command-line or graphical interfaces, routing requests to NQS servers while integrating with URM for resource allocation in multi-node setups. Compliance with standards is ensured through POSIX adherence, supporting commands, utilities, and APIs for portability, as seen in variants like UNICOS/mp which scale to thousands of processors. UNICOS version 10.0 holds X/Open Base 95 branding, confirming alignment with the X/Open Portability Guide for system interfaces and utilities.[31][32][25]