Fact-checked by Grok 2 weeks ago

Natural language programming

Natural language programming is a in that allows users to specify computational tasks and create software using everyday human languages, such as English, instead of rigid syntactic structures typical of traditional programming languages. This approach seeks to democratize programming by reducing the need for specialized technical knowledge, relying on (NLP) and techniques to interpret ambiguous instructions and generate executable code. The concept has roots in early human-computer interaction research from the , where studies explored the feasibility of as a programming to bridge the gap between and machine execution. Initial efforts focused on analyzing imperative sentences in English to develop procedural semantics, addressing challenges like dialog focus and context handling in command interpretation. A longstanding debate has centered on the suitability of for programming, weighing its intuitive appeal against inherent ambiguities that could lead to errors in translation to formal code. Key challenges in natural language programming include resolving in human descriptions, managing in multi-step instructions, and ensuring reliable execution, often requiring domain-specific clarification from users or systems. Modern advancements, particularly with large language models (LLMs), have revitalized the field by enabling more accurate interpretation of prompts into structured formats like pseudo-code or flowcharts, thus supporting accessible building for non-experts. These systems typically incorporate logical syntax—such as step types (e.g., , decision) and connections—to mitigate ambiguity while preserving the fluidity of human expression. Overall, programming represents an evolving effort to make software development more inclusive, though it remains constrained by the precision demands of .

Definition and Fundamentals

Core Definition

Natural language programming (NLPg) is a that enables the creation of computer programs through instructions expressed in natural languages, such as English, using (NLP) and (AI) techniques to interpret and translate them into executable code or structures. This approach encompasses various methods, including ontology-assisted systems that map linguistic semantics to computational operations, as well as probabilistic models in large language models (LLMs) for generating code from prompts. Early conceptual foundations were explored in AI research from the and , with researchers like Sandor M. Veres advancing ontology-based frameworks for agents and robots in the 2000s and 2010s. Unlike simple command-line interfaces or general for queries, natural language programming focuses on generating complete, structured programs from descriptions. For instance, a like "If the temperature exceeds 100 degrees, turn on the cooling system" can be interpreted—via ontological mappings or models—to produce conditional logic, such as an if-then statement. In contemporary contexts as of 2025, this intersects with -driven code generation tools like OpenAI's Codex and successors, which use LLMs for probabilistic interpretation alongside or instead of strict ontologies.

Key Concepts and Distinctions

In natural language programming, key concepts include semantic interpretation and contextual understanding to bridge human intent and machine execution, often supported by structured knowledge representations like ontologies or trained models. Ontologies provide explicit definitions of concepts, relationships, and axioms to reduce ambiguity—for example, specifying "move" in as spatial displacement via predicates—but are one tool among others, with modern systems leveraging LLMs for implicit handling of context. A central distinction is between semantics (deriving meaning through context and inference) and (minimal structural rules to allow flexible phrasing), prioritizing intuitive expression over rigid . This enables diverse inputs, from imperatives to narratives, to yield executable outcomes, lowering barriers for non-experts. Natural language programming differs from or domain-specific languages (DSLs) by aiming for direct machine interpretability from , without manual developer translation. is an informal sketch needing conversion to formal code, while DSLs use constrained for domains. In contrast, NLPg employs parsing via ontologies, , or to make unconstrained executable. At its core, natural language programming views programs as sequences of declarative sentences modeling entities, actions, and relationships, akin to everyday discourse. These can define states or procedures, such as "The picks up the block," mapped to actions, fostering intuitive intent expression transformed into operational logic.

Historical Development

Early Pioneering Efforts

The concept of natural language programming emerged in the mid-20th century alongside early research, with initial explorations in the 1950s focusing on and basic language understanding as precursors to more interfaces. By the , researchers began experimenting with systems that could interpret simple human-like instructions for computational tasks, laying groundwork for domain-specific interactions. A seminal milestone came in 1970 with SHRDLU, developed by at , which demonstrated a robot-like system manipulating virtual blocks through English commands such as "Pick up a big red block." This program integrated procedural representations with a simulated world, allowing coherent dialogues that highlighted the potential for to drive programmatic actions in constrained environments. SHRDLU's success in handling and influenced subsequent efforts to bridge human language and machine execution. In the 1980s, research advanced toward practical applications, including L.A. Miller's study on natural language programming styles, which analyzed how non-experts expressed algorithms in English-like forms for problem-solving tasks. Concurrently, commercial systems like , introduced by Artificial Intelligence Corporation, enabled users to query databases using typed English sentences, translating them into retrieval commands for business . These developments emphasized strategies and domain-specific grammars to make programming accessible beyond traditional code syntax. The 1990s saw creative extensions into specialized domains, with Graham Nelson's system originating in 1993 as a tool for authoring , evolving toward constructs in later iterations like Inform 7 (2006). allowed definitions of game logic through declarative English sentences, such as specifying object behaviors and interactions in prose form. Meanwhile, the (SPL), created in 2001 by Jon Åslund and Karl Hasselström, offered a humorous esoteric approach where programs mimicked Shakespearean plays, mapping character dialogues and stage directions to variables, assignments, and . SPL illustrated the feasibility of highly stylized mappings while underscoring challenges in resolution. These pioneering efforts from the 1950s to the 1990s established foundational techniques in syntax parsing and semantic mapping, paving the way for AI-integrated advancements in later decades.

Evolution in the AI Era

The evolution of natural language programming entered a transformative phase in the 2000s, driven by advancements in computational linguistics and early AI interfaces that bridged human language with executable computations. In 2010, Stephen Wolfram published an influential essay advocating for natural language as a viable medium for programming, positing that linguistic parsing could enable users to specify complex computations in everyday English rather than formal syntax. This vision was exemplified by the launch of Wolfram Alpha in 2009, a computational knowledge engine that interprets natural language queries to perform and return results from mathematical and data-driven operations, laying groundwork for ontology-assisted interpretation in programming contexts. These efforts built on earlier inspirations like Inform 7, a domain-specific language for interactive fiction that used English-like commands, but shifted toward broader AI integration for general-purpose use. The 2010s saw further progress through ontology-based approaches that enhanced for specialized autonomous systems, particularly in . Sándor M. Veres advanced this area with publications on programming for belief-desire-intention (BDI) agents in robotic systems, introducing frameworks where English sentences are mapped to ontological structures for generating verifiable agent behaviors. His 2012 work demonstrated how conceptual graphs and theory could translate specifications into executable code for complex robotic tasks, such as and in dynamic environments, emphasizing precision through formal semantics. This period highlighted AI's growing role in disambiguating intent via knowledge bases, making viable for safety-critical applications beyond simple scripting. The marked a leap forward with large language models (LLMs) enabling scalable, end-to-end from prompts. OpenAI's , released in 2021 as a descendant of fine-tuned on code, powered tools like , which autocompletes and generates entire functions or modules from descriptive English inputs, achieving over 37% acceptance rates in real-world developer workflows. By 2023, integrations in platforms like Cursor—an AI-native code editor forked from VS Code—and AI expanded this capability, allowing LLMs to produce full applications from high-level prompts, such as "build a app for ," with iterative refinement based on user feedback. As of 2025, natural language programming has increasingly incorporated AI, where models process text alongside visual inputs to generate code for user interfaces and prototypes. For instance, Google's tool, introduced in 2025, uses LLMs to convert sketches or images into interactive UI code in frameworks like , streamlining prototyping by combining descriptive prompts with visual references for more accurate outputs. This integration represents a maturation of AI-driven , where contextual understanding across modalities enhances the fidelity of generated programs, fostering broader adoption in .

Methods and Techniques

Interpretation Mechanisms

In traditional natural language programming systems, the interpretation of user inputs begins with a parsing process that breaks down English sentences into structured components. Tokenization occurs via a scanner that identifies basic units such as numbers, names, and dictionary words, incorporating morphological analysis and spelling correction to handle variations; for instance, ambiguous tokens may yield multiple possible definitions for further processing. This is followed by syntactic analysis using nondeterministic transition networks—functionally akin to context-free grammars—to delineate subjects, verbs, and objects in imperative sentences, such as parsing "Put 5 in A" to recognize "put" as the verb, "5" as the object, and "A" as the destination. Semantic checks during parsing reject invalid structures, ensuring only coherent subject-verb-object (SVO) triples proceed, often resolving conjunctions through rules that prioritize type similarity among nouns (e.g., linking multiple numbers in a list). Mapping these parsed elements to executable actions involves associating verbs and verb phrases with predefined procedure calls from a library of operations. In early systems like NLC, imperative verbs such as "put" or "add" directly trigger matrix-based computer primitives, like assignment or arithmetic operations, while noun groups are resolved into internal representations called "datareps" (e.g., (ROW 1 2) for a list of values). This step leverages rule-based to disambiguate references, preferring deeper parses or contextually compatible interpretations without relying on probabilistic models. Ontologies play a supporting role here by providing structured mappings between linguistic terms and domain-specific s, enhancing precision in variable or function resolution. Compilation transforms these mappings into high-level code equivalents, often generating intermediate or direct invocations in languages like or skeletons. For example, a like "Generate 10000 random numbers" is parsed into a structure and compiled to a procedural call such as a iterating over a . Systems maintain libraries to support this, drawing from standard operations like or to produce functional outputs. Testing and execution emphasize validation through iterative user feedback and library-based . Parsed commands are executed in a controlled , often with visual display of results (e.g., updates on screen), allowing users to test subroutines against expected behaviors; procedure libraries ensure outputs align with predefined validations, such as checking arithmetic results against known inputs. Intermediate representations like facilitate , where ambiguities unresolved by initial rules prompt user paraphrasing for re-parsing, achieving high success rates in constrained domains (e.g., 81% correct processing of sample sentences). Rule-based systems, exemplified by in NLC, handle residual ambiguities through semantic compatibility rules, avoiding by enforcing strict linguistic constraints.

Ontology and Compilation Approaches

Ontology-centric approaches in natural language programming rely on structured representations to bridge the gap between human-readable sentences and executable , particularly through formal that define domain-specific concepts. These are constructed as explicit, formal specifications of shared conceptualizations, comprising classes (such as perceived objects, imagined objects, or mathematical objects), roles (like "has attribute"), and relations that link them, ensuring compatibility with natural language descriptions. This construction often leverages and the (), a W3C standard for compatibility, allowing ontologies to be machine-readable and interoperable across systems. Tools like Protégé, an open-source ontology editor developed by , facilitate this process by providing graphical interfaces for defining entities, properties, and relations in format, which can then be integrated into pipelines. The compilation pipeline in ontology-based natural language programming typically involves a sequence of steps: initial parses inputs into conceptual graphs, which are formal diagrams representing semantic structures; this is followed by lookup to resolve ambiguities and map concepts to predefined classes and relations; finally, translates these mappings into executable instructions via a deterministic meaning . For instance, in systems like sEnglish developed by Veres, descriptions of behaviors—such as "the moves to the and stops"—are analyzed into conceptual graphs, looked up against a , and compiled into control algorithms that can be executed in various programming languages. This pipeline builds on basic interpretation mechanisms by enforcing ontological constraints to ensure semantic accuracy during translation. In domains like , ontology-based offers significant advantages, including the creation of machine-independent descriptions of events, actions, and world models that reduce programming errors through enhanced clarity and enable to diverse target languages without redesign. By formalizing in reusable ontologies, these approaches promote shared understanding between humans and machines, facilitating the of intelligent agents that can interpret and execute complex commands reliably. s such as the sEnglish Authoring (sEAT) support ontology integration by allowing users to author and validate natural language programs against custom ontologies, while Protégé enables seamless editing and export for workflows.

Examples and Implementations

Traditional Natural Language-Like Languages

Traditional natural language-like programming languages emerged as attempts to make coding more accessible by mimicking everyday English syntax, often prioritizing over conciseness. These languages typically feature declarative or verbose structures that resemble , allowing non-experts to express logic without traditional symbols like brackets or semicolons. Unlike modern AI-driven systems, they rely on fixed rules for interpretation within constrained domains. COBOL, developed in 1959 by the under a U.S. Department of Defense initiative, exemplifies an early business-oriented language with English-like keywords such as "ADD," "MOVE," and "DISPLAY" to facilitate tasks. Its verbose syntax was intentionally designed for clarity and maintainability across diverse mainframe systems, enabling business analysts to read and modify code as if it were English sentences. For instance, a simple addition might be written as "ADD A TO B GIVING C," emphasizing self-documenting intent over brevity. Despite its influence on standardized programming practices, COBOL's wordiness limited its adoption beyond enterprise environments. Inform 7, released in 2006, is a specialized language for authoring games, using to define worlds, objects, and interactions in a declarative style. Developers write sentences like "The kitchen is a room" to create spatial elements or "Instead of taking the cake, try eating the cake" to handle player actions, drawing on linguistic concepts for intuitive rule-based programming. This approach transforms code into narrative prose, making it suitable for writers and game designers without deep technical expertise. Inform 7 compiles these descriptions into executable bytecode, supporting a niche of text adventures. The Shakespeare Programming Language (SPL), created in 2001 by Jon Åslund and Karl Hasselström, takes a theatrical form by structuring programs as Shakespearean plays, where variables are named after characters like "" or "." Dialogue between characters performs operations—such as assignments via questions ("Is the number of pigeons flying equal to the number of times has appeared?")—while soliloquies output results, for example, "!" to print a character's value as text. This esoteric design enforces dramatic flair, with acts and scenes organizing , but requires adherence to strict conventions for Turing-complete functionality. SPL serves primarily as an educational curiosity rather than a practical tool. AppleScript, introduced by Apple in 1993 with , enables automation of Macintosh applications through scripting in a near-natural format, such as "tell application 'Finder' to set the view of the front window to icon view." Its syntax combines English verbs and object references to send interapplication messages, allowing users to orchestrate workflows across apps like or Photoshop without low-level APIs. Designed for end-users and scripters, integrates with macOS scripting additions for extended capabilities, though its domain remains tied to Apple ecosystems. These languages, while innovative in bridging human language and computation, face inherent limitations due to their domain-specificity; for example, Inform 7 is confined to creation, restricting its use to game development without broader applicability. Similarly, COBOL's and SPL's rigidity hinder general-purpose programming, underscoring the trade-offs in prioritizing natural syntax over flexibility.

Modern AI-Powered Tools

Modern AI-powered tools for natural language programming leverage large language models (LLMs) to interpret user prompts in everyday language and generate executable code, marking a shift from rigid syntax to flexible, context-aware generation in the . These tools integrate directly into development environments, enabling developers to describe intentions—such as implementing a specific or refactoring code—and receive suggestions or full implementations. This approach democratizes coding by reducing the need for precise programming syntax, though it relies on the model's training to handle nuances effectively. One of the pioneering examples is , released on June 29, 2021, and powered by OpenAI's model, which was fine-tuned on vast code repositories to suggest code completions from comments or prompts. For instance, a comment like "Sort array by length" in a file might generate the following function:
python
def sort_by_length(arr):
    return sorted(arr, key=len)
This tool operates as an extension in IDEs like , providing inline suggestions that developers can accept or modify, thereby accelerating tasks like writing functions or debugging. Building on this foundation, Cursor emerged in 2023 as an AI-native (IDE) forked from , emphasizing natural language editing for seamless code manipulation. Users can issue commands like "refactor this function to use async" directly in the editor, prompting the tool to rewrite the code in real-time while preserving context across files. Cursor's features, including inline chat and multi-file awareness, support iterative development by allowing developers to converse with the for refinements, making it particularly useful for complex refactoring or prototyping. By 2025, tools like AI have advanced to full application generation from descriptions, enabling users to prompt "Build a with user " and receive a deployable app complete with frontend, backend, and database integration. 's agent-based system automates the entire workflow, from to deployment, and incorporates recent integrations with models like GPT-5 for enhanced capabilities in handling dynamic requirements. Similarly, CodeWhisperer, now integrated into Amazon Q Developer, caters to enterprise environments by supporting in 15 programming languages, including , , and , with features tailored for secure, scalable development in IDEs like AWS SageMaker Studio. At the core of these tools are fine-tuned LLMs such as OpenAI's (released in 2023) and its successors like GPT-4o, which are trained on massive datasets of public code from repositories like to enable prompt-to-code translation. These models use techniques like supervised fine-tuning on instruction-following tasks to align natural language inputs with programming outputs, achieving high fidelity in generating syntactically correct and functionally relevant code. For example, GPT-4o demonstrates improved performance in coding benchmarks compared to earlier versions, powering tools that handle diverse languages and paradigms.

Applications and Impacts

Value in Documentation and Publication

Natural language programming enhances documentation by transforming complex algorithms into readable prose, making it accessible to non-programmers and domain experts who may lack traditional coding skills. In systems like sEnglish, developed by Sándor Veres, programs are expressed in sentences that build conceptual structures for shared understanding between humans and machines, facilitating easier comprehension and maintenance without requiring deep programming knowledge. This approach also improves machine readability, allowing natural language programs to be directly executable by intelligent agents or robotic systems without the need for recompilation into low-level code, as the ontology-based enables runtime processing of the prose-like specifications. Additionally, since these programs are stored as plain text, they support efficient versioning through sentence-level diffs, similar to standard text revision tools, which track changes in requirements or logic more intuitively than alterations. In publication contexts, programs integrate seamlessly with wikis, journals, and collaborative platforms, where they can serve as verifiable appendices in papers, ensuring that described behaviors or controls are both human-readable and computationally testable. For instance, Veres' framework emphasizes publishing for both humans and machines, enabling outputs suitable for technical documentation that maintains fidelity across formats. A practical case is found in NASA's use of for specifications, where English-based commands from a PC were employed to define and execute behaviors, aiding by providing clear, inspectable descriptions of system operations that reduced ambiguity in technical reports. Another application is in , exemplified by the Aptly (as of June 2025), which uses large language models to translate descriptions into visual blocks for in tools like App Inventor. Targeted at young learners such as high school students, Aptly democratizes app creation, enhancing accessibility and fostering creativity without requiring prior programming expertise.

Contributions to Machine Knowledge and AI

Natural language programming facilitates knowledge modeling by parsing natural language sentences into structured representations, such as subject-predicate-object , which form the basis of semantic . For instance, a sentence like "The moves to the " can be decomposed into a triple (, moves_to, ), capturing entities, actions, and relations to build interconnected knowledge structures that machines can query and traverse. This approach draws from early systems like JIMMY3, which employed actor-act-object to store facts in memory, enabling efficient semantic matching and representation of hierarchical , such as (BRANDT, OWN, ). More recent methods extend this by using grammar ontologies to transform into graph semantics, supporting through triple relationships and enhancing machine-readable knowledge bases. These structured outputs from natural language programming directly enhance AI systems by populating knowledge graphs that support advanced reasoning capabilities. By converting descriptive text into , the resulting graphs allow engines to perform logical deductions, such as transitive relations or , integrating seamlessly with ontologies like RDF for scalable knowledge representation. This integration enables models to reason over domain-specific facts derived from human-like instructions, improving accuracy in tasks requiring contextual understanding without manual encoding. The broader impact of programming lies in empowering autonomous agents to acquire and apply knowledge directly from natural language descriptions, fostering self-directed learning and adaptation. In multi-agent systems, this capability allows agents to collaboratively interpret shared textual instructions, coordinating actions based on inferred relations from , which has seen practical applications by 2025 in frameworks like MetaGPT, where agents use natural language to program and execute complex workflows. Such systems demonstrate how natural language programming bridges human intent with machine execution, enabling scalable collaboration in dynamic environments. A notable example is the work of Sándor M. Veres on machine-independent pseudocode through "system English" (sEnglish), a natural language framework designed for programming intelligent agents and robots. sEnglish translates sentences into executable, platform-agnostic structures that agents can parse to learn skills, facts, and behaviors, serving as high-quality training data for AI models by providing unambiguous, human-readable descriptions. This approach, supported by tools like the sEnglish Authoring Tool (sEAT) and Reader Agent (sERA), ensures deterministic interpretation of conceptual graphs, contributing to robust knowledge transfer in AI systems without reliance on low-level code.

Challenges and Limitations

Handling Ambiguity and Precision

In natural language programming, arises at multiple linguistic levels, complicating the of human instructions into code. Lexical ambiguity occurs when a word or phrase has multiple possible meanings, such as "" referring to a or the side of a river, which can lead to incorrect if the system selects the wrong interpretation. involves unclear sentence structures, for instance, the phrase "process the data with the tool" might imply processing data using a specific tool or using a process that involves both data and tool simultaneously, resulting in divergent parse trees. Semantic ambiguity pertains to contextual interpretations, such as quantifier scope in queries like "every student reads some book," which could mean each student reads at least one book (wide scope) or there exists one book read by all students (narrow scope), affecting the logic of the generated program. To resolve these ambiguities, systems employ context disambiguation through ontologies, which provide structured knowledge representations to map ambiguous terms to domain-specific concepts, ensuring alignment with programming semantics. User clarification is another , where the system prompts for additional details to narrow interpretations, as seen in interactive natural language interfaces that query users on potential mismatches. Resolution methods contrast rule-based approaches, which apply predefined grammatical and semantic rules to select preferred interpretations, with probabilistic AI methods that use statistical models to assign likelihoods based on training data, favoring the most probable meaning in context. In historical systems like the 1970s LUNAR natural language query , rule-based semantic rules resolved ambiguities in about 78% of cases, with failures often due to parsing or semantic issues. Natural language programming trades the flexibility of human-like expression—allowing intuitive, varied inputs—for the precision of formal languages, which enforce unambiguous syntax to prevent errors but limit expressiveness. This often results in higher misinterpretation rates in early systems, with error rates around 20-30% attributed to unresolved ambiguities, compared to near-zero errors in rigidly structured code. By 2025, large language models (LLMs) have significantly reduced ambiguity resolution errors through contextual pattern recognition, achieving higher accuracy in translating natural language to code, yet they introduce hallucinations—fabricated outputs that appear correct but deviate from intent—exacerbating precision challenges in critical applications.

Scalability and Practical Constraints

One major constraint in natural language programming stems from computational demands during and phases. Transforming extended descriptions into executable code often requires large language models (LLMs) with billions of parameters, such as the 62B-parameter PACHINCO model used in benchmarks for notebooks, leading to significant and resource consumption. Full lookups for semantic disambiguation in systems exacerbate this, as they involve iterative queries across vast bases, resulting in cubic or higher complexity for context-free grammar equivalents in structures. Multi-round interactions in LLM-based agents further amplify costs, with each inference step consuming substantial GPU resources and tokens, making it impractical for resource-constrained environments like devices. Natural language programming demonstrates greater efficacy in domain-specific applications, such as , where constrained vocabularies and task-oriented instructions align well with predefined ontologies. For instance, frameworks integrating with LLMs have improved robot planning and execution in controlled settings, enabling intuitive command interpretation for tasks like or . However, it encounters substantial limitations in general scenarios, where broad, unstructured requirements demand handling diverse libraries, architectures, and interdependencies that exceed current semantic capabilities. Adoption remains predominantly confined to research prototypes rather than widespread production use, as systems struggle with the open-ended nature of beyond narrow fields. Maintenance of underlying poses ongoing challenges, particularly as programming languages and evolve rapidly. Semantic relies on static or semi-static to map intents to constructs, but updates to these structures—such as incorporating new versions or deprecated features—require curation or retraining, increasing long-term overhead. Integrating -generated with legacy codebases compounds this, as mismatched semantics and unhandled gaps in coverage lead to failures or errors, necessitating extensive human intervention. Recent surveys highlight limited practical adoption among developers, with pure interfaces facing resistance from frustrations like inconsistent outputs and loss, contributing to their niche status outside experimental . While AI-assisted is prevalent—85% of developers regularly employ such tools—due to persistent difficulties and trust issues. This low uptake underscores barriers in transitioning from prototypes to enterprise-scale deployment.

Security and Ethical Concerns

Natural language programming systems, particularly those powered by LLMs, introduce security vulnerabilities such as prompt injection attacks, where malicious inputs manipulate the model's output to execute unintended or leak sensitive data. Ethical challenges include the propagation of biases from training datasets, potentially leading to discriminatory behaviors, and concerns over when generating code from user prompts that may inadvertently incorporate copyrighted material. As of 2025, these issues remain underexplored in production deployments, limiting broader trust and adoption.

Future Directions

Advancements in AI Integration

Recent advancements in (LLM) have significantly enhanced natural language programming by enabling context-aware from textual descriptions. Models such as OpenAI's o, released in 2024, achieve performance comparable to GPT-4 Turbo on coding benchmarks while processing multimodal inputs like text and images, allowing users to describe user interfaces or visual elements in for automated code synthesis. techniques, including parameter-efficient methods like , adapt pre-trained LLMs to specialized tasks with reduced computational demands, improving accuracy on datasets such as HumanEval by up to 20% in targeted domains. These developments facilitate more intuitive programming workflows, where developers specify requirements in everyday language, and the model generates executable code while maintaining contextual coherence across iterations. Hybrid systems integrating ontologies with LLMs address limitations in pure generative approaches by providing structured knowledge for robust interpretation of intents in programming. For instance, ontologies capture domain-specific relationships and constraints, which LLMs query to refine code outputs, enhancing reliability in tasks like system state management. A prominent example is Devin AI, developed by Labs in 2024 and updated through 2025, which combines LLM-driven with tool integration to enable end-to-end application building from prompts; it autonomously handles , coding, testing, and deployment while incorporating verification steps for accuracy. This hybrid paradigm reduces hallucinations in by grounding LLM responses in ontological frameworks, achieving up to 4x faster task completion in real-world scenarios compared to standalone models. Ethical considerations in AI integration for natural language programming emphasize and output to ensure equitable and reliable translations from to code. Studies reveal that LLMs like exhibit gender biases in up to 49% of generated code for sensitive tasks, such as resume screening algorithms, often inheriting stereotypes from training data. strategies, including feedback-driven prompt refinement and chain-of-thought prompting, have proven effective, reducing bias rates to as low as 4.8% in refined outputs without retraining the model. layers, such as automated testing integrated into agent workflows, further check generated code against predefined criteria, promoting fairness and correctness in applications like automated . Ongoing research highlights agentic programming paradigms, where AI agents engage in natural language dialogues to plan and execute complex coding tasks collaboratively. A 2025 arXiv survey outlines techniques for LLM-based agents to decompose user intents into multi-step plans, interact with tools like compilers, and adapt via feedback loops, surpassing traditional by enabling autonomous software lifecycle management. These agentic systems, exemplified in frameworks that structure instructions as executable syntax, foster interactive programming sessions that mimic human- , with benchmarks showing improved success rates on long-horizon tasks like full application prototyping. Such advancements underscore the shift toward verifiable, dialogue-driven in programming, prioritizing transparency and iterative refinement. Recent research in natural language programming has increasingly focused on extending support to multilingual contexts, particularly for non-English and low-resource languages, to democratize access to coding. Efforts in have leveraged techniques to adapt models trained on high-resource languages like English to generate code from prompts in languages such as , , and , enabling non-native English speakers to program more effectively. For instance, the Bridge-Coder framework uses in-context learning from high-resource programming languages like to bridge gaps in low-resource ones such as , D, Racket, and , achieving performance improvements of up to 18.71% on benchmarks like M-HumanEval. These advancements address the English-centric bias in existing tools, promoting inclusivity in global through cross-lingual transfer. Collaborative aspects of programming are gaining traction, with tools designed to facilitate team-based development via shared prompts within integrated development environments (). The CoPrompt system, introduced in 2024, supports prompt sharing, referring, requesting, and linking among collaborators, allowing programmers to build upon each other's descriptions without repetitive communication or updates. By integrating these mechanisms directly into , CoPrompt enhances awareness of team progress and reduces cognitive overhead in , as demonstrated in user studies showing improved efficiency in joint tasks. This approach is particularly valuable for distributed teams, where prompts serve as a common, intuitive medium for iterative development. Emerging interfaces for non-classical computing systems represent another frontier, including natural language-driven tools for quantum and edge environments. In quantum computing, updates to the lambeq toolkit in 2025 have introduced command-line interfaces that enable users without programming expertise to experiment with quantum natural language processing, leveraging compositional models to translate natural language queries into quantum circuits. For edge computing, platforms like Latent AI allow autonomous agent deployment through natural language instructions, optimizing AI models for resource-constrained devices without traditional coding. Additionally, research on verifiable natural language contracts for blockchain has advanced, with NLP techniques enabling the generation and validation of smart contracts from legal prose, improving security and reducing vulnerabilities in automated execution. A 2025 survey highlights how such integrations enhance contract reliability by automating annotation and vulnerability detection, though challenges in semantic precision persist. Open challenges in natural language programming include the of to ensure consistent interpretation across tools and domains. Efforts toward , such as those outlined in IEEE and ISO frameworks, aim to create reusable semantic structures, but face hurdles like bottlenecks and domain-specific variability. Integrating with models in pipelines adds complexity, as rigid structures often conflict with probabilistic learning approaches. Industry forecasts predict significant adoption, with estimating that by 2030, 25% of IT work will be fully automated by and 75% will involve human- augmentation. This trajectory suggests natural language programming could integrate into over half of development workflows by the decade's end, fostering broader innovation.

References

  1. [1]
    [PDF] Natural Language Programming for Controlled Object-Orientated ...
    Jul 13, 2022 · Natural language programming (NLPr) is a subset of NLP tasks that aims at allowing the users to program with natural languages, such as English.
  2. [2]
    Natural language programming | ACM SIGART Bulletin
    The object is to determine the extent to which there exist sufficiently reliable and powerful communication mechanisms which might be employed in natural ...
  3. [3]
    Foundations of the case for natural-language programming
    Foundations of the case for natural-language programming. Author: Mark Halpern.
  4. [4]
    [PDF] How Humans Communicate Programming Tasks in Natural ...
    This suggests that clear IPT communication may be a necessary condition for enabling natural language programming. We expect that this result will hold ...
  5. [5]
    [PDF] Programming in Natural Language: Building Algorithms from Human ...
    One of the difficulties is that natural language programming re- quires a domain-aware counterpart that asks for clarification, thereby overcoming the chief ...
  6. [6]
    (PDF) Theoretical foundations of natural language programming ...
    Dec 22, 2015 · Theoretical foundations of natural language programming and publishing for intelligent agents and robots. January 2010. Authors: Sandor Veres at ...
  7. [7]
    Natural language programming of agents and robotic devices.
    The book provides examples of how conceptual structures can be built up for the purpose of developing shared understanding between man and machine.
  8. [8]
    On the foolishness of "natural language programming". (EWD 667)
    On the foolishness of "natural language programming". Since the early days of automatic computing we have had people that have felt it as a shortcoming that ...
  9. [9]
    Natural language programming of agents and robotic devices
    The book provides examples of how conceptual structures can be built up for the purpose of developing shared understanding between man and machine.
  10. [10]
    Programming with Natural Language Is Actually Going to Work
    Nov 16, 2010 · But as of yesterday we now have an important new source of data: actual examples of natural language programming being done in Mathematica 8.
  11. [11]
    [PDF] Theoretical foundations of natural language programming and ...
    In the computer science community the idea of natural language programming has been largely considered impossible and impractical due to ambiguity. This paper.
  12. [12]
    [PDF] From Natural Language to Programming Language
    language is the rigid and unnatural syntax and semantics. After analysis of ... Natural language programming. In Computer Program. Synthesis ...
  13. [13]
    [PDF] Chapter 17 - Knowing what you're talking about: Natural language ...
    We present MOOIDE (pronounced “moody”), a natural language programming system for a. MOO (an extensible multi-player text-based virtual reality storytelling ...
  14. [14]
    Procedures as a Representation for Data in a Computer Program for ...
    This paper describes a system for the computer understanding of English. The system answers questions, executes commands, and accepts information in normal ...
  15. [15]
    [PDF] shrdlu.pdf - Computer Science
    SHRDLU was an integrated artificial intelligence system could make plans and carry on simple con- versations about a set of blocks on a table. INTRODUCTION.Missing: original | Show results with:original
  16. [16]
    Natural language programming: styles, strategies, and contrasts
    The written texts were examined from five points of view: solution correctness, preferences of expression, contextual referencing, word usage, and formal ...Missing: control | Show results with:control
  17. [17]
    [PDF] The Inform Beginner's Guide
    Inform, the program and its source code, its example games and documentation, are copyright © Graham Nelson 1993—2002. First Edition: April 2002. Second ...<|separator|>
  18. [18]
    The History of Artificial Intelligence - IBM
    1970. Terry Winograd creates SHRDLU, a groundbreaking natural language understanding program. SHRDLU can interact with users in plain English to manipulate ...Missing: 1990s | Show results with:1990s
  19. [19]
    Natural Language Programming of Complex Robotic BDI Agents
    Sep 8, 2012 · This paper presents a natural language design environment that enables the programming of complex robotic agent systems, comprising of a top ...
  20. [20]
    Cursor: The best way to code with AI
    The best LLM applications have an autonomy slider: you control how much independence to give the AI. In Cursor, you can do Tab completion, Cmd+K for targeted ...Models · Blog · Features · Enterprise
  21. [21]
    State of AI Development: 34x growth in AI projects, OpenAI's ...
    Jul 13, 2023 · Replit has grown to become the central platform for AI development. Tools like ChatGPT can generate code, but creators still need infrastructure to run it.
  22. [22]
    From idea to app: Introducing Stitch, a new way to design UIs
    May 20, 2025 · Explore Stitch, a new Google Labs experiment that uses AI to generate UI designs and frontend code from text or image inputs in minutes.
  23. [23]
    [PDF] Toward Natural Language Computation 1 - ACL Anthology
    This format for natural language programming enables users to examine system performance as each command is typed and to detect most errors immedi- ately. 1.2 ...
  24. [24]
    [PDF] NLP (Natural Language Processing) for NLP ... - MIT Media Lab
    Starting with an. English text, we show how a natural language programming system can automatically identify steps, loops, and comments, and convert them into a ...
  25. [25]
    OWL Web Ontology Language Reference - W3C
    Feb 10, 2004 · The Web Ontology Language OWL is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a ...
  26. [26]
    protégé
    Protégé is a free, open-source ontology editor and framework for building intelligent systems.About · Software · Community · SupportMissing: natural | Show results with:natural
  27. [27]
    What Is COBOL? - IBM
    The first version of the COBOL programming language was released in 1960. And though COBOL programming was originally intended to serve as a stopgap measure, ...What is COBOL? · History of COBOL
  28. [28]
    [PDF] A View of The History of Cobol
    COBOL's progenitors were FLOW-MATIC, Commercial Translator, and AIMACO. Early meetings were held in 1959, and Charles Phillips was proposed as leader.
  29. [29]
    Inform 7 | Inform is a natural-language-based programming ...
    Inform is a programming language for creating interactive fiction, using natural language syntax. Using natural language and drawing on ideas from linguistics.Documentation · Downloads · Bugs · Donate
  30. [30]
    Shakespeare - Esolang
    Jul 22, 2025 · The Shakespeare Programming Language (SPL) is an esoteric programming language created by Karl Hasselström and Jon Åslund in 2001.Missing: 1996 | Show results with:1996
  31. [31]
    Introduction to AppleScript Language Guide - Apple Developer
    Jan 25, 2016 · This document is a guide to the AppleScript language—its lexical conventions, syntax, keywords, and other elements. It is intended primarily ...Commands Reference · AppleScript Fundamentals · Script Objects
  32. [32]
    Introducing GitHub Copilot: your AI pair programmer
    Jun 29, 2021 · Developed in collaboration with OpenAI, GitHub Copilot is powered by OpenAI Codex, a new AI system created by OpenAI. OpenAI Codex has broad ...
  33. [33]
    Introducing Codex - OpenAI
    May 16, 2025 · Update on June 3, 2025: Codex is now available to ChatGPT Plus users. We're also enabling users to provide Codex with internet access during ...
  34. [34]
    How Cursor Serves Billions of AI Code Completions Every Day
    Jul 29, 2025 · Cursor is an AI-powered code editor (IDE) that has quickly become a standout tool for developers since its initial release in March 2023 by the ...
  35. [35]
    What is Cursor AI? Features, Benefits & Coding Use Cases Explained
    Aug 29, 2025 · Cursor, built as a full editor, goes further with features like natural language code editing, deeper multi-file context, inline chat, and ...What Is Cursor Ai? · Cursor Ai Vs. Other Ai Code... · The Future Of Ai In Software...
  36. [36]
    Turn natural language into apps and websites - Replit AI
    Tell Replit Agent your app or website idea, and it will build it for you automatically. It's like having an entire team of software engineers on demand.Missing: LLM 2023
  37. [37]
  38. [38]
    Optimize software development with Amazon CodeWhisperer
    May 30, 2023 · CodeWhisperer supports code generation for 15 programming languages. CodeWhisperer can be used in various IDEs like Amazon Sagemaker Studio ...
  39. [39]
    27 of the best large language models in 2025 - TechTarget
    Jul 10, 2025 · GPT-4o. GPT-4 Omni (GPT-4o) is OpenAI's successor to GPT-4 and offers several improvements over the previous model. GPT-4o creates a more ...
  40. [40]
    Natural language programming of agents and robotic devices ...
    The book outlines the principles and its use with electronic personal assistants and intelligent software agents. The focus is on sEnglish and no commitment is ...
  41. [41]
    Laboratory process control using natural language commands from ...
    Apr 1, 1989 · Software required includes the Microsoft Disk Operating System (MS-DOS) operating system, a PC-based FORTRAN-77 compiler, and user-written ...
  42. [42]
    [PDF] American Jo&nsl of Computational Linguistics - ACL Anthology
    They are: a) semantic triples, b) aug- mented transition networks, c ... natural language programming by professionals was initiated. The objective.
  43. [43]
    Grammar to graph—An approach for semantic transformation of ...
    Sep 2, 2025 · The semantics of graphs derived from natural language using grammar ontology contributes to knowledge properties and inference among triples.
  44. [44]
    A Systematic Literature Review on RDF Triple Generation from ...
    Feb 27, 2024 · This survey examines methods for RDF triple generation and Knowledge Graphs (KGs) enhancement from natural language texts.Missing: programming | Show results with:programming
  45. [45]
    Generating Knowledge Graphs by Employing Natural Language ...
    Oct 28, 2020 · This paper presents a new architecture using NLP and Machine Learning to extract entities and relationships from research publications and ...Missing: programming | Show results with:programming
  46. [46]
    FoundationAgents/MetaGPT: The Multi-Agent Framework - GitHub
    Feb. 19, 2025: Today we are officially launching our natural language programming product: MGX (MetaGPT X) - the world's first AI agent development team. More ...
  47. [47]
    AgentAI: A comprehensive survey on autonomous agents in ...
    Oct 1, 2025 · 3. Large language models. Based on the transformer architecture, LLMs represent a significant advancement in Natural Language Processing (NLP).
  48. [48]
    [PDF] Programming in Natural Language: Building Algorithms from Human ...
    One of the difficulties is that natural language programming re- quires a domain-aware counterpart that asks for clarification, thereby overcoming the chief ...
  49. [49]
    Ambiguity in NLP and how to address them - GeeksforGeeks
    Jul 23, 2025 · Types of Ambiguity in NLP · 1. Lexical Ambiguity. Lexical ambiguity occurs when a single word has multiple meanings, making it unclear which ...
  50. [50]
    [PDF] Semantics and Quantification in Natural Language Question ...
    The LUNAR system consists of three principal components: a general purpose grammar and parser for a large subset of natural English, a rule- driven semantic ...
  51. [51]
    LLM Hallucinations in 2025: How to Understand and Tackle AI's ...
    Oct 3, 2025 · LLM hallucinations explained: what they are, classic causes, and the 2025 research breakthroughs reshaping how we measure and reduce them.
  52. [52]
    [PDF] Natural Language to Code Generation in Interactive Data Science ...
    Jul 9, 2023 · ARCADE is a benchmark for generating code from natural language in data science notebooks, using a 62B code language model, PACHINCO, and few- ...
  53. [53]
    [PDF] Computational Complexity of Natural Languages - ACL Anthology
    In particular, the most efficient parsing algorithm for context-free grammars has polynomial (cubic) complexity, while best parsers for regular grammars have ...
  54. [54]
    A Survey on Code Generation with LLM-based Agents - arXiv
    Jul 31, 2025 · Although LLM technology originated from natural language processing, it has also demonstrated remarkable potential in code generation tasks.
  55. [55]
    Natural language boosts LLM performance in coding, planning, and ...
    May 1, 2024 · Three new frameworks from MIT CSAIL reveal how natural language can provide important context for language models that perform coding, AI planning, and ...<|control11|><|separator|>
  56. [56]
    A Survey on Large Language Models for Code Generation
    Jul 8, 2025 · This survey reviews Large Language Models (LLMs) for code generation, covering data, advances, performance, ethics, impact, and applications. ...
  57. [57]
    Handling Ontology Gaps in Semantic Parsing - arXiv
    Jun 27, 2024 · Semantic Parsing (SP) is one of the long-standing tasks in Natural Language Understanding, aiming at mapping complex natural language to machine ...
  58. [58]
    The State of Developer Ecosystem 2025: Coding in the Age of AI ...
    Oct 15, 2025 · AI is becoming a standard in developers' lives: 85% of developers regularly use AI tools for coding and development, and 62% rely on at least ...AI proficiency is becoming a... · Languages and tools · Developer reality
  59. [59]
    2025 Stack Overflow Developer Survey
    84% of respondents are using or planning to use AI tools in their development process, an increase over last year (76%). This year we can see 51% of ...Missing: natural | Show results with:natural
  60. [60]
    Hello GPT-4o
    ### Summary of GPT-4o Capabilities for Code Generation and Multimodal Support in Natural Language Programming
  61. [61]
    [PDF] Enhancing Contextual Memory in LLMs for Software Engineering via ...
    Oct 8, 2025 · Our approach combines the generative capabilities of LLMs with a dynamically updated ontology that captures the evolving state of the system.
  62. [62]
    Devin | The AI Software Engineer
    Devin is an AI coding agent and software engineer that helps developers build better software faster. Parallel cloud agents for serious engineering teams.Devin · Devin Docs · Billing · Pricing
  63. [63]
  64. [64]
    Bias Testing and Mitigation in LLM-based Code Generation
    ### Summary of Bias Mitigation in LLM-based Code Generation
  65. [65]
    AI Agentic Programming: A Survey of Techniques, Challenges, and Opportunities
    ### Summary of Key Points on Agentic Programming Using Natural Language Dialogues
  66. [66]
  67. [67]
    Supporting Prompt Sharing and Referring in Collaborative Natural ...
    Oct 13, 2023 · CoPrompt assists programmers in comprehending collaborators' prompts and building on their collaborators' work, reducing repetitive updates and communication ...Missing: team IDEs 2024-2025
  68. [68]
    CoPrompt: Supporting Prompt Sharing and Referring in ...
    May 11, 2024 · CoPrompt enables programmers to conduct collaborative prompt engineering by building upon collaborators' prompts in natural language programming ...
  69. [69]
    Quantinuum Announces Updates to Quantum Natural Language ...
    The new update is equipped with a command-line interface, making most of the toolkit's functionality available to users with no programming knowledge. It also ...
  70. [70]
    We Did It - Again: Latent AI Named to Constellation ShortList™ for AI ...
    Traditional AI platforms provide tools to enhance human work. Latent Agent goes further, offering fully autonomous edge AI through natural language programming.
  71. [71]
    A Survey on the Integration of Blockchain Smart Contracts and ...
    Aug 6, 2025 · The automatic execution of a legal contract written in natural language is an open research question that can extend the blockchain ...Missing: verifiable | Show results with:verifiable
  72. [72]
    Learning ontologies from natural language texts - ScienceDirect.com
    The major problems in building ontologies are the bottleneck of knowledge acquisition and time-consuming construction of various ontologies for various domains/ ...Missing: programming challenges
  73. [73]
    What are the key challenges in creating and applying ontologies for ...
    Scalability Issues: Large-scale ontologies demand substantial computational resources for storage and processing, especially when dealing with high-dimensional ...
  74. [74]
    AI in software development – boosting productivity or a buzzword?
    Oct 13, 2025 · Gartner forecasts that by 2030, AI-driven environments will automate 70% of routine coding tasks [2]. McKinsey predicts that fully ...<|separator|>