One-instruction set computer
A one-instruction set computer (OISC) is an abstract computational model or machine architecture that employs only a single type of instruction to execute all possible operations, demonstrating the theoretical minimum required for general-purpose computing.[1] Despite the extreme reduction, well-designed OISC instructions can achieve Turing completeness, enabling simulation of arbitrary algorithms through clever combinations of memory accesses and control flow.[2]

The concept emerged in the late 1980s as an illustrative extreme of reduced instruction set computing (RISC) principles, with early proposals focusing on simplifying hardware by eliminating opcode fields entirely.[1] Pioneering work includes the 1988 "Ultimate RISC" design, which uses a single "move memory to memory" instruction to perform arithmetic and branching via specialized memory-mapped units, requiring multiple microcycles per operation but supporting pipelining for efficiency.[1] Another influential variant is the URISC (Ultimate Reduced Instruction Set Computer), proposed around the same time, which leverages a subtract-and-branch-if-negative operation to implement all necessary functions in a fixed-format instruction.[3]

The SUBLEQ (subtract and branch if less than or equal to zero) instruction represents one of the most studied OISC formats, where a ternary operation subtracts the value at one memory address from another, stores the result, and conditionally branches based on the outcome, allowing implementation of addition, multiplication, and loops through indirect addressing.[4] SUBLEQ-based OISCs have been implemented in hardware, such as multi-processor arrays on field-programmable gate arrays (FPGAs), achieving performance comparable to conventional CPUs for specific tasks.[4]

OISCs serve primarily as educational tools for understanding instruction set design and compiler theory, as well as in research on fault-tolerant computing and minimal architectures for embedded systems.[5]
Practical implementations, like student-built simulators and co-processors, highlight their utility in demonstrating how complexity arises from simplicity.[6]

Fundamentals
Definition and Principles
A one-instruction set computer (OISC) is an abstract computational model and practical architecture that relies on a solitary machine instruction to execute all operations, representing the pinnacle of instruction set minimalism. In contrast to multi-instruction set computers (ISCs), which employ diverse instructions tailored for specific tasks like arithmetic, data transfer, and control flow, an OISC demonstrates that computational universality can be attained through the flexible interpretation of a single instruction's operands and semantics. This approach underscores the theoretical insight that complexity in hardware can be shifted to software, where sequences of the same instruction emulate more elaborate behaviors.[7]

The core principles of OISC design emphasize extreme reduction in hardware complexity while preserving expressive power, achieved by engineering the single instruction to support indirect addressing modes and conditional execution. OISCs are classified into categories such as arithmetic-based, bit-manipulation, and transport-triggered architectures, each leveraging the single instruction differently to achieve universality (detailed in subsequent sections). Through careful selection of operands—typically memory addresses—the instruction can simulate loading and storing data, performing arithmetic or logical operations, and implementing branching for control flow, all without dedicated opcodes for each function.

OISCs typically incorporate a simple memory model, such as an unbounded array of integer cells serving as both program storage and data space, along with an implicit program counter that advances execution by fetching operands from sequential memory locations; registers are often absent or derived from memory to maintain uniformity.
This structure aligns with the minimalist philosophy, minimizing decoding logic and potentially simplifying processor implementation.[7]

OISC instruction formats vary, but many prominent designs use three operands (memory addresses) A, B, and C. In such formats, the operation typically subtracts the value at A from the value at B, stores the result in B, and branches to C if the result is less than or equal to zero (as in the SUBLEQ instruction). Other formats, such as binary bit-manipulation instructions (e.g., XOR or ADD with two operands), achieve similar universality through different mechanisms. Such designs enable the encoding of diverse computations by leveraging memory indirection and zero/non-zero tests, proving the architecture's capacity for Turing completeness.[8]

However, OISCs present design challenges, particularly in code density and execution efficiency relative to RISC or CISC systems. OISC programs require longer sequences of instructions to replicate basic operations, resulting in expanded memory footprints and slower performance due to increased instruction fetches and the absence of hardware optimizations for common patterns. While theoretically elegant, these factors limit practical adoption beyond educational or experimental contexts, though multicore variants have shown potential energy savings in specialized applications.[9][10]

Turing Completeness
A computational system is Turing complete if it can simulate the behavior of any Turing machine, meaning it is capable of performing any computation that can be carried out by a universal Turing machine, given sufficient resources such as time and memory. One-instruction set computers (OISCs) achieve Turing completeness through their ability to emulate a universal computer using sequences of a single instruction type, which provides the foundational operations necessary for arbitrary computation. In particular, certain OISC designs, such as the URISC model, demonstrate this equivalence by constructing all required computational primitives from one instruction.

The proof that an OISC is Turing complete typically involves showing that its single instruction can be composed to implement key mechanisms: conditional branching for control flow, direct memory access for data manipulation, and state transitions to manage program execution. For instance, an instruction supporting subtraction from memory locations combined with a conditional jump if the result is non-positive enables the simulation of zero-testing, data copying, addition, and unconditional jumps; these in turn allow the OISC to model a register machine, which is known to be equivalent in power to a Turing machine. By encoding a Turing machine's tape as an array in the OISC's memory, its head position and states as addressable values, and its transition rules as instruction sequences, the OISC can step through the TM's configuration, reading symbols, updating the tape, and altering states accordingly. This simulation establishes that any computable function executable on a TM is also executable on the OISC.

Theoretical OISC models rely on infinite memory to attain full Turing completeness, mirroring the unbounded tape of a Turing machine, which permits the representation of arbitrarily large inputs and intermediate states without resource exhaustion.
With finite memory, OISCs would be limited to computations within that bound, akin to finite automata. The applicability of the halting problem further confirms this equivalence: determining whether an arbitrary OISC program will halt on a given input is undecidable, just as it is for Turing machines, reinforcing the shared theoretical boundaries of computability.

In comparison to other minimal computational models, such as Brainfuck—an esoteric language that achieves Turing completeness with effectively two instructions for pointer movement and value adjustment under loops—OISCs exemplify greater extremity by relying on just one instruction to encode all necessary operations, including implicit looping via branching. This minimalism underscores key implications for computability theory: under the Church-Turing thesis, which posits that any effectively calculable function can be computed by a Turing machine, OISCs affirm that computational universality depends not on instruction richness but on the capacity for unbounded memory manipulation and conditional control, challenging assumptions about the necessity of complex architectures for general-purpose computing.

Historical Development
Origins in Theoretical Computing
The conceptual foundations of one-instruction set computers (OISC) trace back to early 20th-century efforts in computability theory, which sought to define the minimal mechanisms capable of universal computation. Alan Turing's 1936 paper introduced the Turing machine as an abstract model demonstrating that a simple device with a read-write head, an infinite tape, and a finite set of states could simulate any algorithmic process, establishing the theoretical possibility of computation through minimal rules. This work underscored that universality does not require complex instruction sets, laying groundwork for later explorations into reduced computational models, including those with a single instruction.[11]

Parallel developments in minimal formal systems further influenced OISC concepts. Alonzo Church's lambda calculus, developed in the early 1930s, provided a foundation for functional computation using abstraction and application as primitive operations, proving equivalent in expressive power to the Turing machine and highlighting how a sparse set of rules could encode arbitrary computability. Similarly, Emil Post's 1936 formulation of tag systems and production systems used simple rewriting rules to model combinatory processes, demonstrating Turing completeness with a minimal repertoire of actions that prefigured single-operator designs. These systems collectively emphasized that computational universality could emerge from highly constrained primitives, inspiring theoretical interest in even more stripped-down architectures. An early specific exploration of minimal operations in automatic computers appeared in W.L. van der Poel's 1956 work, which examined essential operations for computation.[11][12]

Despite these theoretical advances, practical focus on OISC remained limited before the 1990s, as computing research prioritized multi-instruction architectures optimized for efficiency, speed, and hardware realization in real-world applications like scientific computation and data processing.[11]

Key Developments and Milestones
The development of one-instruction set computers (OISCs) gained momentum in the 1990s through esoteric programming communities; a later milestone in bit-manipulating OISC design was Oleg Mazonka's introduction of BitBitJump in 2009. BitBitJump employs a single instruction that copies a bit from one memory location to another and then jumps to a third address; because program and data share one bit-addressable memory, overwriting jump targets yields conditional behavior, demonstrating Turing completeness in an unbounded memory model while highlighting the minimalism possible in OISC architectures.[13][14]

In the 2000s, the rise of the esoteric programming languages (esolangs) community, formalized through online resources starting around 2001, led to the widespread adoption and refinement of SUBLEQ as a canonical OISC instruction. This period saw the creation of emulators and assemblers for SUBLEQ, enabling practical experimentation and proving its utility for general-purpose computation. A key advancement was the development of compilers targeting SUBLEQ, such as Oleg Mazonka's Higher Subleq around 2005–2009, which translated simplified C programs into SUBLEQ code, significantly enhancing the practicality of OISCs for non-trivial applications.[15][16]

The 2010s brought hardware-focused prototypes and specialized applications, exemplified by the TOGA computer in 2011, an early hardware OISC implementation using a toggle-and-branch instruction to manipulate bits with minimal circuitry. This prototype underscored the feasibility of physical OISC designs, albeit limited by bounded storage in practice.
In 2013, Cryptoleq emerged as an innovative OISC variant tailored for cryptographic computations, supporting homomorphic encryption via Paillier schemes to process encrypted data alongside unencrypted operands in a heterogeneous environment.[17][18]

Post-2010 efficiency benchmarks highlighted OISC viability in multi-processor configurations; for instance, a 2011 implementation of 28 SUBLEQ processors on an FPGA achieved performance comparable to a modern personal computer's CPU for basic tasks, executing programs at speeds sufficient for real-time simulation.[4]

In the 2020s, software simulations proliferated through open-source projects, including Python-based OISC interpreters like those on GitHub, facilitating accessible education and experimentation. Recent open-source efforts, such as the SIC-1 8-bit SUBLEQ computer integrated into hardware design tools like Tiny Tapeout, have further democratized OISC development. Theoretical extensions to quantum OISC variants remain largely hypothetical as of 2025, with ongoing research exploring single-instruction models for quantum instruction sets but without mature implementations.[19][20]

Architectures
Bit-Manipulation Architectures
Bit-manipulation architectures represent the simplest class of one-instruction set computers (OISCs), where the single instruction operates directly on individual bits within memory modeled as an infinite array of bit strings. In this paradigm, the instruction typically involves a basic bit-level operation, such as inversion or copying of a bit at a specified memory position, combined with a conditional or unconditional jump based on the resulting bit state. For instance, the operation might invert the bit at address A and branch to address B if the inverted bit meets a certain condition, such as being set to 1. A well-known example is the bit-copying instruction, which copies a bit from one position to another (overwriting the destination), enabling Turing-complete computation through sequences of such operations. This mechanism allows simulation of more complex operations like copying data or conditional branching by sequencing multiple such instructions, treating the entire memory as a manipulable bit field without reliance on higher-level arithmetic primitives.[21]

The core appeal of these architectures lies in their hardware simplicity, as they require only minimal logic gates for bit flipping, testing, and address decoding, eliminating the need for dedicated arithmetic logic units or multi-bit processing hardware. This design aligns with theoretical models of computation grounded in digital logic foundations, where all universal computation can be reduced to bit-level manipulations, ensuring Turing completeness through the ability to emulate any Turing machine via sequences of bit operations and jumps.
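The bit-copying mechanism can be illustrated with a small interpreter. The sketch below follows the BitBitJump formulation (copy bit a to bit b, then jump unconditionally to instruction c) but, as a simplification, keeps the program as a separate list of triples rather than storing it in the same bit memory; in real BitBitJump, code and data share one bit-addressable memory, and conditional behavior arises from instructions overwriting each other's jump targets. All names and the halt convention here are illustrative assumptions.

```python
def run_bitcopy(bits, program, halt=-1, max_steps=1000):
    """Execute (a, b, c) triples: copy bits[a] to bits[b], jump to c."""
    pc = 0
    for _ in range(max_steps):
        if pc == halt:
            break
        a, b, c = program[pc]
        bits[b] = bits[a]   # the only data operation: copy one bit
        pc = c              # unconditional jump to the next triple
    return bits

bits = [1, 0, 0, 0]
# copy bit 0 to bit 2, then bit 2 to bit 3, then halt
run_bitcopy(bits, [(0, 2, 1), (2, 3, -1)])
# bits is now [1, 0, 1, 1]
```

Even this toy form shows the architecture's character: any multi-bit operation must be spelled out as one instruction per bit moved.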
However, practical implementation reveals significant disadvantages, particularly inefficiency in handling non-bit-oriented tasks, as emulating standard data structures like integers or arrays demands expansive sequences of instructions, leading to poor performance for typical workloads.[21]

Scalability poses a further challenge for bit-manipulation OISCs in real-world applications, as the bit-array memory model translates to enormous storage requirements; for example, supporting even modest processor counts on resource-constrained hardware like low-cost FPGAs can demand around 1 Mb (128 KB) per processor, rendering it impractical for embedded or high-density systems without specialized optimizations. Despite these limitations, the architecture's emphasis on primitive bit operations provides a theoretical bridge to concepts in digital design, highlighting how minimal instructions can underpin complex behaviors while underscoring the trade-offs between conceptual elegance and practical utility.[8]

Arithmetic-Based Architectures
Arithmetic-based architectures for one-instruction set computers (OISCs) rely on a single arithmetic operation, typically subtraction combined with a conditional branch, to achieve computational universality. The fundamental operation involves subtracting the value stored at one memory location from another and branching based on the result's sign or zero status, enabling the encoding of data movement, comparisons, and logical decisions through the arithmetic outcome. For instance, a negative or zero result after subtraction can trigger a jump, simulating conditional control flow essential for Turing completeness. This approach contrasts with bit-manipulation methods by leveraging integer arithmetic rather than direct bit-level operations.[8]

The memory model in these architectures consists of an array of integer cells, often treated as signed or unsigned values, where the instruction operates on three operands specifying source, destination, and branch addresses. A representative form performs mem[B] = mem[B] - mem[A], followed by a branch to address C if the result is less than or equal to zero; otherwise, execution continues sequentially. This model supports an infinite or large addressable space in theory, with practical implementations using fixed-size words, such as 32-bit signed integers, to store both data and encoded instructions in a von Neumann-style unified memory. Arithmetic operations on these integers allow simulation of more complex behaviors, including logical operations, by exploiting properties like zero-testing for equality and negative results for inequality detection.[8]

To simulate bit-level manipulations within an arithmetic framework, bits are represented as components of large integers, such as encoding individual bits at positions corresponding to powers of 2, allowing subtraction to perform operations like clearing or testing without immediate carry propagation in simple cases.
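The bits-in-integers encoding can be made concrete with a short sketch. The helper names below are hypothetical; the point is that a bit stored at a power-of-two position can be tested and cleared using only subtraction and magnitude comparison, the primitives a subtraction-based OISC provides natively.

```python
def bit_is_set(word, i):
    """Test bit i using only subtraction and comparison (no shifts/ands).

    Strip every bit above position i by subtracting its weight when
    present, then compare the remainder against 2**i.  The 32-bit upper
    bound is an illustrative assumption about the word size.
    """
    for j in range(31, i, -1):
        if word >= 1 << j:
            word -= 1 << j
    return word >= 1 << i

def clear_bit(word, i):
    """Clear bit i by subtracting its weight, if and only if it is set."""
    return word - (1 << i) if bit_is_set(word, i) else word

word = (1 << 0) + (1 << 5)   # bits 0 and 5 set
word = clear_bit(word, 5)
# word == 1: only bit 0 remains
```

Each `if word >= ...` test corresponds to one conditional branch in the OISC, so even this simple routine expands into a long instruction sequence, which is the code-bloat effect discussed below.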
More robust simulations use "gapped" representations, where bits are separated by sufficiently large intervals (e.g., powers of a base much larger than the bit width) to isolate manipulations and prevent interference from arithmetic overflow or carry effects during subtractions. This enables the emulation of boolean logic gates and finite-state machines using sequences of arithmetic instructions.[8]

Efficiency in arithmetic-based OISCs is limited by significant code bloat, as basic tasks like addition or multiplication require multiple instructions to simulate through loops of subtractions and branches. Such expansion arises because the single instruction must encode all primitive actions indirectly, leading to longer programs compared to multi-instruction architectures.[8]

Extensions for handling overflow and negative numbers typically incorporate two's complement representation in signed integer memory cells, allowing subtractions to naturally produce negative results for branching while wrapping around on overflow to maintain computational consistency. This arithmetic convention ensures that negative values can be used for control flow without additional instructions, though practical implementations may impose word-size limits to bound overflow effects.[8]

Transport-Triggered Architectures
Transport-triggered architectures (TTAs) represent a paradigm in one-instruction set computer (OISC) design where the sole instruction is a data transport operation, typically a "move" that copies values between locations, and computation is initiated as a side effect of these transports. In this model, the destination of a move serves as a trigger port connected to specialized function units, such as adders or multipliers, which execute operations upon receiving data without requiring explicit opcodes. This approach embeds functionality directly into the architecture's addressing scheme, allowing memory or register addresses to implicitly specify the intended computation, thereby eliminating the need for a diverse instruction set.[22][23]

The core principle of TTAs draws from dataflow computing concepts, where execution is driven by data availability rather than sequential control flow, enabling parallelism through concurrent moves in a single instruction cycle. For instance, moving operands to an arithmetic unit's input ports triggers the computation, with results potentially routed to output ports via subsequent or simultaneous transports, mimicking reactive systems where operations respond dynamically to data movement. This design exposes the processor's internal buses and datapaths in the instruction encoding, contrasting with traditional architectures by prioritizing transport scheduling over operation specification. Seminal work by Henk Corporaal formalized TTAs as a superclass of very long instruction word (VLIW) processors, emphasizing their simplicity for application-specific customization while supporting multiple concurrent transports in one instruction.[22][23]

Advantages of TTAs in OISC contexts include reduced hardware complexity and opcode overhead, as functions are realized through a fixed layout of trigger-enabled units, fostering energy-efficient datapaths optimized for specific workloads like signal processing.
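The trigger-port mechanism can be sketched in a few lines. Everything here (the port numbers, the single adder, the dict-based port file) is an illustrative assumption rather than any real TTA instruction set; the point is that the program's only operation is a move, and the addition fires as a side effect of moving data into the adder's trigger port.

```python
ADD_OPERAND, ADD_TRIGGER, ADD_RESULT = 100, 101, 102  # assumed port map

mem = [7, 5, 0]   # ordinary data memory, cells 0..2
ports = {}        # function-unit ports, addressed at 100 and above

def move(src, dst):
    """The machine's sole instruction: transport one value."""
    val = ports[src] if src in ports else mem[src]
    if dst == ADD_TRIGGER:                  # trigger port: the adder fires
        ports[ADD_RESULT] = ports.get(ADD_OPERAND, 0) + val
    elif dst in (ADD_OPERAND, ADD_RESULT):  # plain operand/result latch
        ports[dst] = val
    else:                                   # ordinary memory cell
        mem[dst] = val

# mem[2] = mem[0] + mem[1], expressed purely as data transports
move(0, ADD_OPERAND)
move(1, ADD_TRIGGER)
move(ADD_RESULT, 2)
# mem[2] == 12
```

Note that no opcode ever appears: which computation happens is determined entirely by which address the data is moved to.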
By embedding operations in addresses, TTAs achieve Turing completeness through sequences of moves that simulate arithmetic and control flow, such as conditional branching via move-triggered comparisons. However, implementation challenges arise from the need for a rigidly structured functional memory or register file, which complicates exception handling and multitasking, as state is distributed across execution units rather than centralized. The relation to VLIW is evident in the explicit parallelism of multi-move instructions, but TTAs minimize this to a unified move primitive, enhancing modularity for embedded systems.[22][23]

Instruction Types
SUBLEQ Instruction
The SUBLEQ (SUBtract and Branch if Less than or Equal to zero) instruction serves as the foundational operation in arithmetic-based one-instruction set computers (OISCs), enabling universal computation through a single, versatile primitive. In its standard form, SUBLEQ takes three operands a, b, and c, which are memory addresses. The instruction performs the subtraction \mathrm{mem}[b] \leftarrow \mathrm{mem}[b] - \mathrm{mem}[a], storing the result back in \mathrm{mem}[b]. It then branches to the address specified by c if the result is less than or equal to zero; otherwise, execution proceeds to the next instruction, typically located three memory cells ahead to account for the three operands.[15][8]

This design achieves Turing completeness by combining arithmetic manipulation with conditional branching in a single mechanism. The subtraction operation allows for data modification, including negation (by subtracting from a zero-initialized cell) and addition (via double negation). The zero-or-negative check provides the conditional control flow essential for decision-making, while self-modifying code—enabled by treating instruction operands as modifiable memory—facilitates loops and indirect addressing. Collectively, these features permit the emulation of a Minsky register machine, which is known to be Turing-equivalent.[15][8]

SUBLEQ synthesizes higher-level operations through short sequences of instructions, leveraging a dedicated zero cell (denoted Z, initialized to 0) and temporary cells for intermediate values. To emulate addition of \mathrm{mem}[y] to \mathrm{mem}[x] (i.e., \mathrm{mem}[x] \leftarrow \mathrm{mem}[x] + \mathrm{mem}[y]), first clear a temporary cell T to 0 using two SUBLEQ instructions: \mathrm{SUBLEQ}\ T\ T\ \mathrm{next} (sets T = T - T = 0) followed by \mathrm{SUBLEQ}\ Z\ T\ \mathrm{next} (ensures T = 0 - 0 = 0). Then, negate \mathrm{mem}[y] into T: \mathrm{SUBLEQ}\ y\ T\ \mathrm{next} (T = 0 - \mathrm{mem}[y] = -\mathrm{mem}[y]).
Finally, add to x: \mathrm{SUBLEQ}\ T\ x\ \mathrm{next} (\mathrm{mem}[x] = \mathrm{mem}[x] - (-\mathrm{mem}[y]) = \mathrm{mem}[x] + \mathrm{mem}[y]), and clear T again. For movement from a to b (copying \mathrm{mem}[a] to \mathrm{mem}[b] while clearing a), first clear b as above, then perform the addition sequence with y = a and x = b, and finally clear a using the two-instruction zeroing sequence. LOAD and STORE operations are inherent to the memory-addressing model, as all accesses are direct; emulating a load from memory to a "register" (another memory cell) is simply a MOVE, while STORE reverses this by moving from the source cell to the target.[15][24]

To emulate a Turing machine, SUBLEQ programs encode the machine's tape as an array of memory cells, the head position as an offset, and states via a finite state controller implemented through branching sequences. Head movement is simulated by incrementing or decrementing the position index using synthesized ADD/SUBLEQ operations on the offset. Reading the current symbol involves subtracting it from zero to test its value (branching on the result to match transition rules). Writing updates the tape cell via ADD or subtraction of 1 (precomputed as a constant). State transitions are handled by unconditional jumps (\mathrm{SUBLEQ}\ Z\ Z\ \mathrm{target}, since 0 - 0 = 0 \leq 0) or conditional branches to select the next state based on the symbol read, with self-modification adjusting the instruction pointer for rule application. Halting occurs on a special negative address convention. This step-by-step mapping allows any Turing machine to be simulated, confirming universality.[15][8]

Compilation techniques translate high-level languages like C subsets to SUBLEQ by first generating an intermediate assembly and then optimizing the resulting sequences. Tools such as Higher Subleq (HSQ) parse C-like code, managing a stack for variables and function calls (e.g., pushing return addresses via MOVE to stack pointer decrements), and embed library routines for I/O.
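The SUBLEQ semantics and the ADD synthesis walked through above can be checked with a short interpreter. As a simplification, the sketch below clears T with a single instruction (one suffices, since T - T = 0 always) rather than the two-instruction sequence, and it adopts the common convention that a branch to a negative address halts; the memory layout is an illustrative assumption.

```python
def run_subleq(mem):
    """Execute SUBLEQ code in place; a negative branch target halts."""
    pc = 0
    while pc >= 0:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3

# x += y via a zeroed temporary T, following the sequence above.
# Code occupies cells 0..11; data cells follow at 12..14.
X, Y, T = 12, 13, 14
prog = [
    T, T, 3,     # T -= T        (T = 0)
    Y, T, 6,     # T -= y        (T = -y)
    T, X, 9,     # x -= T        (x = x + y; falls through, x > 0)
    T, T, -1,    # clear T, then branch to -1: halt
    7, 5, 0,     # x = 7, y = 5, T = 0
]
run_subleq(prog)
# prog[X] is now 12
```

Note that operands and code share one memory, so a program could just as easily subtract from its own operand fields, which is exactly the self-modification the synthesis techniques above exploit.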
Expressions are broken into temporary assignments using the synthesized ADD/MOVE operations, with pointers handled as integer offsets. Optimizations focus on code size by reusing a pool of temporary cells to minimize clearing overhead, replacing complex operations like multiplication with logarithmic-time library calls, and applying peephole optimizations to combine adjacent zeroing or branching sequences.[25][8]

SUBNEG Instruction
The SUBNEG (Subtract and Branch if Negative) instruction forms the basis of a class of one-instruction set computers (OISCs) optimized for handling signed integers through conditional branching strictly on negative results. In its standard three-operand form, SUBNEG a, b, c subtracts the value stored at memory location a from the value at memory location b, storing the result back in b, and branches to the address specified by c if the result is negative.[26] This operation is defined formally as:

\text{mem}[b] \leftarrow \text{mem}[b] - \text{mem}[a]; \quad \text{if } \text{mem}[b] < 0 \text{ then } \text{PC} \leftarrow c

where PC denotes the program counter. Unlike the related SUBLEQ instruction, which branches if the result is less than or equal to zero, SUBNEG employs stricter branching that ignores zero results, thereby avoiding unintended jumps in scenarios involving zero values during signed arithmetic operations.[7] This distinction makes SUBNEG particularly advantageous for computations in two's complement representations, where zero does not trigger negative flags and helps prevent "zero traps" that could disrupt control flow in arithmetic sequences.

Basic operations in a SUBNEG-based OISC are synthesized through sequences of instructions leveraging temporary memory locations to manage branching and data manipulation. For a zero test on a value at location x, one approach subtracts x from a known positive temporary value t (initialized to 1), storing in another temporary u; if u >= 0 (no branch), the original x was zero, allowing continuation, while a negative u branches to handle non-zero cases—requiring additional SUBNEG steps to restore state using further temporaries.[27] Copying a value from location x to y involves first zeroing y via repeated subtractions to a negative sentinel, then adjusting with negated copies of x stored in temporaries to accumulate the value without erroneous branches on intermediate zeros.
Conditional jumps based on value signs are achieved by subtracting the tested value from zero in a temporary and branching accordingly, with cleanup sequences to preserve memory integrity. These methods rely on auxiliary locations to isolate computations and ensure the stricter < 0 condition does not halt progress on zero intermediates.[7]

A four-operand extension, known as subneg4 or SBN4, enhances efficiency for bit-level simulations and complex operations by specifying a distinct destination for the subtraction result: mem[d] = mem[b] - mem[a]; if mem[d] < 0 then jump to c. This variant reduces overwriting of source operands, facilitating denser code for emulating multi-operand arithmetic in resource-constrained environments, such as multicore stencil processors where it achieves approximately 1/20th the circuit area and 1/10th the power consumption of a MIPS-based design.

SUBNEG architectures excel in emulating two's complement systems due to their native support for negative detection without zero interference, enabling straightforward implementation of signed operations like overflow handling in integer arithmetic. For instance, to add the signed values at memory locations src1 and src2 into dest, a sequence first computes the negation of src1 into a temporary neg_src1 by subtracting src1 from 0 (branching only if src1 > 0 to adjust), then subtracts neg_src1 from dest (effectively adding src1), followed by similar steps for src2, using additional temporaries to track signs and avoid branch disruptions on zero sums. This approach mirrors conventional signed addition while leveraging SUBNEG's branching for conditional sign extensions.

RSSB Instruction
The Reverse Subtract and Skip if Borrow (RSSB) instruction serves as the core operation in certain arithmetic-based one-instruction set computers, leveraging borrow mechanics to enable conditional control flow and low-level arithmetic manipulation. In these systems, RSSB facilitates Turing-complete computation by combining subtraction with implicit branching based on overflow conditions, allowing programs to simulate more complex instructions through sequences of executions.[28][29] The syntax of the RSSB instruction is RSSB a, b, where a and b denote memory addresses. It executes by updating the memory at address a according to the formula mem[a] ← mem[a] − mem[b] − borrow, where borrow is an input value (typically 0 or 1 from prior operations to propagate multi-word subtraction). The next instruction is then skipped if a borrow occurred during the operation, defined as the result being negative (indicating underflow in signed interpretation or the need for borrow from a higher word in unsigned). This borrow-based skip provides a mechanism for conditional execution without dedicated flag registers.[28] The borrow mechanics in RSSB simulate the carry propagation of traditional subtraction, where the skip condition acts as an implicit branch on underflow, enabling the synthesis of bit-wise operations such as shifts and tests through chained subtractions. For instance, addition can be implemented via double negation: to compute mem[c] ← mem[a] + mem[b], a sequence of RSSB instructions first negates mem[a] and mem[b] by subtracting from zero (using borrow to handle signs), then subtracts the negated values, and finally negates the result again, with borrow checks ensuring correct handling of overflows at each step. 
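The borrow-and-skip behavior can be sketched as a single step function. This follows the two-operand, explicit-borrow formulation described above, not necessarily a canonical hardware definition; with two-word instructions, a skip advances the program counter by four words instead of two.

```python
def rssb_step(mem, pc, borrow):
    """One RSSB step: mem[a] -= mem[b] + borrow; skip next on underflow.

    Returns (next_pc, new_borrow).  The new borrow can be fed into the
    following step to chain multi-word subtractions.
    """
    a, b = mem[pc], mem[pc + 1]
    result = mem[a] - mem[b] - borrow
    new_borrow = 1 if result < 0 else 0
    mem[a] = result
    next_pc = pc + 4 if new_borrow else pc + 2   # skip one instruction
    return next_pc, new_borrow

mem = [4, 5, 0, 0, 3, 5]     # instruction (a=4, b=5); mem[4]=3, mem[5]=5
pc, borrow = rssb_step(mem, 0, 0)
# 3 - 5 underflows: mem[4] == -2, borrow == 1, the next instruction is skipped
```

The skip plays the role of the conditional branch in SUBLEQ-style machines: placing an unconditional jump in the slot that gets skipped turns "skip on borrow" into "branch unless borrow".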
This approach relies on the skip to route control around error conditions or to loop for multi-bit processing.[28]

In hardware implementations, the RSSB instruction supports a minimalist ALU design comprising primarily a subtractor for the core operation and a comparator to detect the negative result for skipping, which reduces gate count and power consumption compared to multi-instruction architectures. Such simplicity makes RSSB suitable for resource-constrained environments like FPGAs, where the lack of opcode decoding further streamlines the control unit.[28]

A representative example is a basic loop for decrementing a counter at memory address c (initially set to a positive value). The sequence uses RSSB instructions to perform mem[c] ← mem[c] − 1; if the result is negative (counter reached zero), the skip branches out of the loop, otherwise control flows to repeat the sequence. This demonstrates RSSB's utility in iterative control structures.[28]

The RSSB instruction is commonly employed in arithmetic architectures that support borrow propagation, enhancing efficiency in emulating conventional processors.[28]

Other Specialized Instructions
The arithmetic machine represents a class of OISC architectures centered on addition, achieving Turing-complete computation through manipulations of positive integers. A key example is the addleq instruction, which adds the contents of memory location a to location b (mem[b] += mem[a]) and branches to location c if the result in mem[b] is less than or equal to zero. This design supports simulation of more complex operations, such as subtraction and multiplication, via repeated increments and conditional branching, in the spirit of abacus-like models of computation that rely on successive positive adjustments without direct support for negative values. Such instructions allow efficient encoding of algorithms in domains requiring non-negative arithmetic, though they may need additional steps to handle zero or underflow conditions compared with subtraction-based OISCs such as SUBLEQ.

Cryptoleq introduces an equality-focused OISC instruction tailored for secure, homomorphic computation on encrypted data, particularly cryptographic primitives such as hashing and elliptic curve operations.
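Returning to addleq, its step rule can be sketched as a minimal interpreter. This is a sketch under stated assumptions: instructions occupy three consecutive words (a, b, c), and a negative jump target halting the machine is a convention assumed here, not part of the definition above.

```python
# Sketch of the addleq step rule: each instruction is three consecutive
# words (a, b, c); mem[b] += mem[a], then control jumps to c if the new
# mem[b] <= 0 and falls through otherwise. Halting on a negative jump
# target is an assumption of this sketch.

def run_addleq(mem, pc=0, max_steps=10_000):
    for _ in range(max_steps):
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] += mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        if pc < 0:
            break
    return mem

# One instruction at address 0: add mem[3] (= -10) into mem[4] (= 5);
# the result -5 is <= 0, so control jumps to -1 and the machine halts.
mem = [3, 4, -1, -10, 5]
run_addleq(mem)
assert mem[4] == -5
```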
The core instruction, cryptoleq a, b, c, performs a Paillier-based homomorphic subtraction on encrypted operands (effectively mem[b] -= mem[a] in the decrypted domain) and branches to c if the result satisfies an equality or threshold condition (e.g., via an equality test where Equal(x, y) = 1 if decrypted x equals y).[30] Proposed in 2015, it extends traditional OISC principles to mixed encrypted-unencrypted environments, enabling universal computation while preserving data privacy through partially homomorphic encryption.[30] This makes Cryptoleq suitable for resource-constrained secure computing, where code efficiency is enhanced for equality checks in hashing, at the cost of increased computational overhead from the encryption operations.[30]

Beyond the standard reverse subtract and skip if borrow (RSSB) instruction, variants such as subtract and branch if negative (SBN) and subtract and branch if not zero (SBNZ) provide minimal support for signed operations in OISC designs. In SBN, the instruction subtracts mem[b] from mem[a], stores the result in mem[a], and branches to c if the result is negative, allowing direct handling of signed comparisons without full borrow logic.[28] A specialized form, "subtract from zero and branch," simulates negation or signed decrement by using a zero operand (e.g., mem[a] = 0 − mem[a]; branch if mem[a] < 0), enabling compact signed arithmetic in environments where borrow flags are unavailable.[28] These variants improve code density for signed integer tasks, such as elliptic curve cryptography, though they demand careful memory management to avoid overflow.[28] Overall, arithmetic and reverse-subtract OISCs exhibit varying efficiency: addleq variants excel in positive-only loops, while Cryptoleq and SBN prioritize security and signed operations at higher complexity.[28]

Implementations
Early Bit-Manipulating Examples
One of the earliest examples of a bit-manipulating one-instruction set computer (OISC) is the TOGA computer, which uses a single instruction to toggle (flip) a bit at a specified memory address and conditionally branch based on the result.[17] The instruction, denoted TOGA(a, b), inverts the bit at address a and sets the program counter to b if the resulting bit value is 1 (which occurs when the original bit was 0); otherwise, execution proceeds to the next instruction.[17] This mechanism combines toggling for data manipulation with conditional branching for control flow, allowing basic operations such as bit copying and logical functions to be built from sequences of instructions.[17] Although not Turing-complete under the bounded memory model of basic implementations, the TOGA design highlights the minimal hardware requirements for bit-level computation, potentially realizable with very few transistors.[17]

Another foundational bit-manipulating OISC is BitBitJump, introduced in the late 2000s as a model of ultimate computational simplicity through bit copying.[21] The single instruction, BBJ(A, B, C), copies the bit at memory address A to address B and then unconditionally jumps to address C, operating on an unbounded array of bits organized into words (e.g., 8-bit groups).[21] The design avoids explicit logic gates such as AND or OR, relying instead on bit duplication and referencing to achieve Turing-completeness via self-modifying code and conditional effects from address overlaps.[21] For instance, a left bit shift by one position (effectively multiplying by 2 and setting the lowest bit to 0) can be implemented using a macro that sequentially copies bits within a word:

    .def shiftL X : ZERO : ZERO X'(w-1) X'w X'(w-2) X'(w-1) ... X'1 X'2 X'0 X'1 ZERO X .end

Here, X represents the word address, w is the word size, and ZERO clears the target bit; the sequence propagates bits leftward through repeated copying and jumps.[21]

To handle multi-bit words beyond single bits, BitBitJump and similar designs extend operations via sequential addressing, where n-bit copying is achieved by iterating the single-bit copy instruction across each bit position in the word.[21] For example, copying a full 32-bit word involves 32 invocations of the BBJ instruction, using predefined jump tables to manage bit positions and ensure correct propagation without data corruption.[21] This approach scales conceptually to arbitrary word sizes but incurs linear overhead in instruction count.

Practical implementations of these early bit-manipulating OISCs have primarily been software emulators written in languages such as C or assembly for simulation on conventional hardware.[21] BitBitJump emulators, for instance, support adjustable word sizes and include compilers for higher-level constructs, demonstrating feasibility on standard processors.[21] However, scaling to 32- or 64-bit architectures poses challenges: the dense packing of instructions required for even simple operations (e.g., thousands of BBJ calls for arithmetic) leads to high memory usage and slow execution, and hardware realizations remain theoretical due to the inefficiency compared to multi-instruction sets.[21]
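The BBJ(A, B, C) cycle described above can be sketched as a short emulator over a flat bit array. This is a sketch under stated assumptions: an 8-bit word size, little-endian word encoding, and halting when the program counter runs past memory are simplifications chosen here, not part of the BitBitJump definition.

```python
# Sketch of a BBJ(A, B, C) emulator: each instruction is three W-bit words
# (source bit address, destination bit address, jump target). The machine
# copies one bit, then jumps unconditionally. W = 8, little-endian words,
# and halt-on-out-of-range are assumptions of this sketch.

W = 8  # word size in bits

def load_word(bits, addr):
    """Read a W-bit little-endian word starting at bit address addr."""
    return sum(bits[addr + i] << i for i in range(W))

def store_word(bits, addr, value):
    """Write value as a W-bit little-endian word at bit address addr."""
    for i in range(W):
        bits[addr + i] = (value >> i) & 1

def run_bbj(bits, pc=0, max_steps=100):
    for _ in range(max_steps):
        if pc + 3 * W > len(bits):
            break                      # ran past memory: halt
        a = load_word(bits, pc)
        b = load_word(bits, pc + W)
        c = load_word(bits, pc + 2 * W)
        bits[b] = bits[a]              # copy a single bit
        pc = c                         # unconditional jump
    return bits

# One instruction: copy the bit at address 24 to address 25, then jump
# to address 40 (past the end of memory), halting the machine.
bits = [0] * 40
store_word(bits, 0, 24)    # A: source bit
store_word(bits, 8, 25)    # B: destination bit
store_word(bits, 16, 40)   # C: jump target
bits[24] = 1
run_bbj(bits)
assert bits[25] == 1
```

Because instructions live in the same bit array as data, a program can overwrite its own operand fields with this copy operation, which is how BitBitJump obtains conditional behavior from an unconditional jump.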