A configure script is a Bourne shell script automatically generated by GNU Autoconf, a component of the GNU Autotools suite, to configure software source code packages for building on various POSIX-compliant systems.[1] Its primary purpose is to probe the host environment for system-specific details, such as available compilers, libraries, header files, and functions, and to produce tailored output files—including Makefiles from Makefile.in templates and configuration headers like config.h—to enable portable compilation without requiring manual user intervention.[2]
Within the broader GNU build system, the configure script integrates with tools like Automake and Libtool to create a standardized, flexible workflow for software distribution.[3] Developers write a template file, typically named configure.ac (or, historically, configure.in), using M4 macros to specify feature tests and conditional logic; Autoconf then processes this input with the GNU M4 macro processor (version 1.4.6 or later required; 1.4.13 or later recommended) to output the executable script.[1] Upon invocation, the script supports standard command-line options for customization: --prefix to set installation directories, --enable-FEATURE and --disable-FEATURE to toggle optional components, --with-PACKAGE to integrate external dependencies, and variables such as CC to override tools. It also handles cross-compilation via the --host and --target options.[2] The script can cache test results in config.cache (when invoked with the --config-cache option) to speed up subsequent runs, and it generates config.status to record results for reproducible output file generation; generated files carry markers indicating their automated origin, facilitating maintenance and debugging in large-scale open-source projects.[1]
Overview
Definition and origins
A configure script is an executable shell script, typically written in the Bourne shell (sh) or Bash, designed to probe the host system during the software build process on Unix-like operating systems. It performs a series of tests to detect system characteristics, such as the presence of specific compilers, libraries, headers, and utilities, and generates customized configuration files—like Makefiles or config.h headers—that adapt the source code to the target environment. This mechanism ensures that the software can be compiled and installed portably across diverse Unix variants without manual intervention for each platform.[2][4]
The origins of the configure script date to 1984, when Larry Wall incorporated the first known example into his rn Usenet newsreader program. Released that year, rn included a hand-written configuration script that automatically detected the host Unix version and any deviations from expected standards, adjusting the build parameters accordingly to facilitate installation on varying systems. This manual approach laid the groundwork for automated tools like GNU Autoconf, introduced in 1991, which generate configure scripts from declarative inputs.[5][6]
In its basic operation, the configure script executes prior to the compilation phase, outputting files that define variables and settings based on the detected environment, thereby enabling source code portability. For instance, an early hand-coded script might check for the availability of the GNU Compiler Collection (GCC) and set the appropriate compiler variable, falling back to the system's default if necessary:
#!/bin/sh
# Detect a C compiler and record the choice in the Makefile.
if which gcc >/dev/null 2>&1; then
    CC=gcc
    echo "Using GCC compiler."
else
    CC=cc
    echo "Using default cc compiler."
fi
echo "CC = $CC" >> Makefile
Such checks exemplify how the script embeds conditional logic to resolve dependencies, a practice that became foundational for subsequent build tools.[2]
Role in build systems
The configure script serves as a foundational component in the traditional "configure-make-install" triad prevalent in open-source software projects, where it executes prior to the make phase to generate customized Makefiles tailored to the host environment by incorporating system-specific details.[2]
A primary function of the configure script is to enhance portability by detecting operating system features, including the presence of libraries, header files, and compiler capabilities, thereby accommodating variations between systems such as Linux, BSD, and Solaris without requiring modifications to the core source code. For instance, in GNU build systems, it identifies the system type (e.g., i686-pc-linux-gnu) using auxiliary scripts like config.guess and config.sub, and adjusts build parameters accordingly to handle platform-specific quirks. This detection mechanism supports cross-compilation scenarios through options like --host and --target, enabling builds for one architecture on another, which is crucial for embedded systems and multi-platform distributions.[2]
The benefits of employing a configure script in build systems include significant reductions in manual configuration efforts, as it automates the resolution of dependencies and environmental variables to produce reproducible builds across different machines. In GNU projects, for example, it sets key variables such as CFLAGS for compiler flags or LIBS for linker libraries based on detected features, ensuring software compiles reliably on diverse architectures while minimizing errors from overlooked system differences. This automation not only accelerates development workflows but also promotes consistency in open-source ecosystems by standardizing the adaptation process.[2]
Generation
Autoconf is a GNU tool within the Autotools suite designed to generate portable configure scripts from declarative input files, enabling software packages to adapt automatically to various POSIX-like systems without manual intervention.[7] It produces shell scripts that detect system features, such as compiler capabilities and library availability, to facilitate cross-platform builds.[1] Initial development of Autoconf began in 1991, with the first stable release (version 1.0) occurring in July 1992.[8][9]
As a prerequisite, Autoconf relies on the GNU M4 macro processor to expand macros during script generation; the minimum required M4 version varies by Autoconf release.[1] It integrates with other Autotools components, such as Automake for Makefile creation and Libtool for library management, to provide a complete ecosystem for automating software builds.
The core process involves Autoconf reading an input file named configure.ac (or configure.in), which contains M4 macros defining configuration checks, and expanding them into a portable Bourne shell script called configure.[1] This output script incorporates extensive feature tests to ensure compatibility across diverse environments.
A basic invocation to generate the configure script uses the autoreconf wrapper, which orchestrates Autoconf and related tools: autoreconf -f -i. This command forces regeneration and installs necessary auxiliary files from configure.ac.[10]
The primary input file for generating a configure script using Autoconf is configure.ac, a declarative script written in the M4 macro language that specifies system checks, feature probes, and output actions through a sequence of macro invocations.[11] This file, sometimes referred to by its legacy name configure.in, serves as the blueprint for the resulting shell script, allowing developers to define portable detection logic without embedding imperative shell code.[12] The M4-based structure ensures that the input remains concise and readable, with macros expanding into the necessary runtime tests during script generation.[13]
Key macros in configure.ac form the core of this declarative system, each handling specific aspects of configuration. The AC_INIT macro must be called first to initialize essential package metadata, including the package name, version, and optional bug report email; for instance, AC_INIT([GNU Project], [1.0], [[email protected]]) sets variables like PACKAGE_NAME and PACKAGE_VERSION for use in outputs.[14] Following initialization, macros like AC_PROG_CC locate and verify a suitable C compiler, setting the CC output variable to the detected tool (such as gcc or cc) and ensuring it supports required standards.[15] For library detection, AC_CHECK_LIB probes for a specific library by attempting to link a test program calling a given function, adding -llibrary to LIBS if successful; its syntax is AC_CHECK_LIB([library], [function], [action-if-found], [action-if-not-found]).[16]
The sequential nature of macro calls in configure.ac defines the order of probes, with each macro potentially setting shell variables or cache results for subsequent use. For example, AC_FUNC_MALLOC tests whether the system's malloc function returns a non-null pointer for a zero-size allocation, defining HAVE_MALLOC=1 and providing a replacement implementation if the check fails, as cached in ac_cv_func_malloc_0_nonnull.[17] Similarly, AC_CHECK_HEADERS verifies the presence of header files like stdlib.h by attempting compilation. The script concludes with AC_OUTPUT, which generates and executes config.status to produce substituted files such as Makefiles, using variables accumulated from prior macros.[18]
A representative snippet of configure.ac illustrates this structure:
AC_INIT([MyProject], [1.0])
AC_PROG_CC
AC_CHECK_HEADERS([stdlib.h])
AC_FUNC_MALLOC
AC_CHECK_LIB([m], [pow])
AC_OUTPUT([Makefile])
This sequence initializes the project, sets up the compiler, checks for standard headers and the malloc function, probes for the math library, and finally outputs the Makefile with substitutions.
Execution and features
Invoking the script
The configure script is typically invoked from the root directory of the extracted source code package, after unpacking it from a distribution archive such as a tarball using a command like tar -xvf package.tar.gz. This placement ensures that the script can locate necessary files and perform its checks relative to the source tree. Build tools such as make must be installed on the system beforehand, as the configuration process prepares the environment for subsequent compilation steps.[19]
To execute the script, ensure it has executable permissions by running chmod +x configure if necessary, then issue the command ./configure [options] in a Bourne-compatible shell environment, such as sh or bash. The script is designed for POSIX-like systems and relies on a standard shell for its operations, producing verbose output during checks while appending detailed logs—including compiler outputs and test results—to config.log in the current directory for debugging purposes if any checks fail.[20][21][19]
A common example is ./configure --prefix=/usr/local, which specifies the installation directory prefix; upon successful completion, this is followed by make to compile the software and make install to deploy it to the designated location. Options can be viewed via ./configure --help for customization, such as setting cache files or source directories, but the basic form suffices for standard setups.[22]
System detection mechanisms
The configure script employs a variety of probing methods to detect system features during its execution, primarily by attempting to compile and link small code snippets using the host system's compiler and linker. These tests are implemented through low-level Autoconf macros such as AC_TRY_COMPILE, which checks whether a given source fragment compiles successfully without linking, often to verify the availability of headers or compiler-specific features. For instance, to detect the presence of a header file like <pthread.h>, the script uses AC_CHECK_HEADER, which invokes AC_TRY_COMPILE with a minimal program that includes the header and a simple statement, reporting success if compilation succeeds. Similarly, AC_TRY_LINK extends this by also attempting to link the object file, useful for verifying function availability or basic library linkage. (In modern Autoconf, these macros are superseded by AC_COMPILE_IFELSE and AC_LINK_IFELSE, which perform the same underlying tests.)[23]
To check for libraries, the script utilizes macros like AC_CHECK_LIB, which constructs a test program calling a specific function from the library and attempts to link it with the corresponding flag, such as -lpthread for the POSIX threads library. A representative example is detecting the pthread library: the script compiles and links a small C program that includes <pthread.h> and calls pthread_create, appending -lpthread to the linker flags; if successful, it sets a shell variable like HAVE_PTHREADS=yes and may add the flag to LIBS for subsequent builds. These probes are non-intrusive, relying on the system's build tools, and are typically wrapped in higher-level macros like AC_SEARCH_LIBS to try multiple library names or paths systematically. Failed compilations or linkages are diagnosed by examining compiler error messages, ensuring the script adapts to diverse environments without requiring user intervention.[23]
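The essence of such a library probe can be sketched in plain shell. This is a simplified illustration of what AC_CHECK_LIB([pthread], [pthread_create]) expands to, not the actual generated code, which additionally consults the cache and records everything in config.log:

```shell
# Write a minimal test program. Autoconf declares the symbol with a dummy
# prototype so the test exercises only the linker, not the headers.
cat > conftest.c <<'EOF'
char pthread_create ();
int main (void) { return pthread_create (); }
EOF
# Attempt to compile and link against -lpthread; success means the
# library provides the symbol on this system.
if ${CC:-cc} conftest.c -lpthread -o conftest >/dev/null 2>&1; then
  have_pthread=yes
  LIBS="-lpthread $LIBS"
else
  have_pthread=no
fi
echo "checking for pthread_create in -lpthread... $have_pthread"
rm -f conftest conftest.c
```

Either outcome is acceptable to the script; the result is simply recorded and used to adjust LIBS for subsequent builds.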
The configure script supports an optional caching mechanism to optimize repeated executions, which can store the results of these probes in a cache file to avoid redundant tests on the same system. This is facilitated by macros such as AC_CACHE_CHECK, which assigns a unique cache variable (e.g., ac_cv_header_pthread_h) to each test result and checks/sets it as needed; within a single run, it uses shell variables for efficiency, while a cache file enables reuse across invocations. By default, no cache file is used (equivalent to --cache-file=/dev/null), so all probes are performed each time. Caching to a file can be enabled with --cache-file=FILE (e.g., --config-cache for config.cache), or disabled explicitly with --no-cache; the cache file, if used, is sourced at the start and updated only at the end of a successful run, just before generating output files. The --no-create option allows the script to perform all detections and caching but aborts without creating Makefiles or other outputs, useful for dry runs or verification. This system significantly reduces execution time in iterative development or when configuring multiple packages on the same host.[24]
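The shell pattern that AC_CACHE_CHECK expands to can be sketched as follows; the check name and cache variable (ac_cv_example_feature) are illustrative, and a real script would run a compile test where the assignment appears:

```shell
# Sketch of the cache-variable pattern in generated configure scripts:
# probe only if the ac_cv_* variable is not already set.
check_example_feature() {
  if test "${ac_cv_example_feature+set}" = set; then
    echo "checking for example feature... (cached) $ac_cv_example_feature"
  else
    ac_cv_example_feature=yes   # a real script runs a compile test here
    echo "checking for example feature... $ac_cv_example_feature"
  fi
}
check_example_feature   # first call performs the probe
check_example_feature   # second call reuses the cached shell variable
```

When a cache file is enabled, these ac_cv_* variables are additionally sourced from and saved back to that file, extending the same reuse across separate configure runs.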
Error handling during these detections is robust, with detailed diagnostics logged to config.log for every test invocation, including the exact compiler commands, input code snippets, and full output from failed compilations or linkages. This log captures environmental variables, include paths, and linker flags used, enabling users to troubleshoot issues like missing dependencies or incompatible toolchains by inspecting the verbose error traces. For example, a failed pthread detection would record the compilation attempt, such as gcc -c conftest.c -o conftest.o followed by the error (e.g., "undefined reference to `pthread_create'"), without halting the entire script unless configured to do so via AC_MSG_ERROR.[25]
Cross-compilation support modifies these mechanisms when the --host option specifies a target different from the build system, entering a mode where runtime execution tests (e.g., AC_TRY_RUN) are disabled to prevent attempting to run unexecutable binaries. Instead, the script relies on compilation and linkage probes or user-provided defaults, setting the cross_compiling variable to yes and adjusting variables like host and host_alias accordingly; for instance, ./configure --host=arm-linux would probe using an ARM cross-compiler for library detections while caching results for the target environment. This ensures portability but requires careful macro selection to avoid unreliable assumptions in cross scenarios.[26]
Output and customization
Generated files
The configure script primarily generates several key files to facilitate the build process of a software package. Among these, the most central output is config.status, a shell script that records the configuration options and results from the configure invocation, enabling subsequent reconfiguration without rerunning the full set of system probes. This file is created during the execution of configure and serves as the mechanism for instantiating other output files by applying substitutions and actions defined in the input template.[27]
One of the primary outputs is the Makefile, generated from template files such as Makefile.in through variable substitution using the AC_CONFIG_FILES macro in configure.ac. The configure script replaces placeholders in the form @VAR@—defined via the AC_SUBST macro in the configure.ac input file—with their corresponding values, such as replacing @CC@ with the detected compiler like gcc. For instance, in a typical package, this might produce src/Makefile with paths adapted to the installation prefix specified via the --prefix option, ensuring build rules reflect the target environment without hardcoding system-specific details. Similarly, config.h is generated from config.h.in using the AC_CONFIG_HEADERS macro, incorporating preprocessor defines based on detection results, such as #define HAVE_STDLIB_H 1 if the standard library header is present on the system.[28][27]
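The substitution step can be sketched in plain shell, using illustrative values for @CC@ and @prefix@. Real config.status drives this with a generated awk program rather than a hand-written sed pipeline:

```shell
# Create a minimal template with @VAR@ placeholders, as a package's
# Makefile.in would contain.
cat > Makefile.in <<'EOF'
CC = @CC@
prefix = @prefix@
EOF
# Substitute the detected values; config.status performs the equivalent
# for every variable registered with AC_SUBST.
sed -e 's|@CC@|gcc|' -e 's|@prefix@|/usr/local|' Makefile.in > Makefile
cat Makefile
```

The resulting Makefile contains concrete values (CC = gcc, prefix = /usr/local) in place of the placeholders, with no system-specific details hardcoded in the template itself.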
These generated files are designed such that the build system avoids unnecessary regeneration to preserve custom or manually edited content; for example, if a Makefile already exists and is newer than its .in template and other dependencies, make will skip invoking config.status unless explicitly forced (e.g., with make -B), preserving user modifications in non-standard setups. Regeneration of outputs can be performed efficiently using config.status --recheck, which reruns the configure script's detection mechanisms—such as those for system features—with the original arguments but without fully reinstantiating files, updating config.status only as needed. This approach leverages the prior configuration state to minimize redundant checks while ensuring outputs remain consistent with any changes in the build environment.[29][27]
Common options
The configure script accepts a variety of command-line options to customize the build process, allowing users to specify installation locations, enable or disable features, and override default settings. These options are typically defined in the input file (configure.ac) using macros such as AC_ARG_ENABLE for feature toggles and AC_ARG_WITH for package dependencies, which generate corresponding help text and handling logic in the script. Running ./configure --help displays a comprehensive list of all available options for a specific script, including package-specific ones alongside standard flags.[30]
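As a hedged illustration, a package's configure.ac might define an --enable-debug toggle as follows; the option name and flags are hypothetical, while AC_ARG_ENABLE, AS_HELP_STRING, and AS_IF are standard Autoconf macros:

```m4
dnl Register --enable-debug/--disable-debug; the fourth argument
dnl supplies the default when neither flag is given.
AC_ARG_ENABLE([debug],
  [AS_HELP_STRING([--enable-debug], [build with debugging symbols])],
  [enable_debug=$enableval],
  [enable_debug=no])
AS_IF([test "x$enable_debug" = xyes],
  [CFLAGS="$CFLAGS -g -O0"])
```

The AS_HELP_STRING text is what ./configure --help displays for this option, so the help output stays synchronized with the actual handling logic.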
Among the most commonly used options is --prefix=DIR, which sets the base directory for installing files, defaulting to /usr/local if unspecified; this affects paths for binaries, libraries, and data files in the generated Makefiles.[30] Feature-related options include --enable-FEATURE to activate optional components (e.g., debugging support) and --disable-FEATURE to deactivate them, with the script's behavior for each depending on the package's configure.ac definitions. Similarly, --with-PACKAGE[=ARG] specifies the location or configuration of external dependencies (e.g., --with-libxml=/usr), while --without-PACKAGE excludes them entirely, aiding in dependency management during configuration.[30]
For advanced customization, --cache-file=FILE enables or directs the use of a cache file to store configuration results, accelerating subsequent runs by avoiding redundant system probes.[30] The --srcdir=DIR option supports out-of-tree builds by pointing to the source directory when running configure from a separate build directory.[30] Users can also override environment variables like the compiler with CC=compiler ./configure, allowing specification of tools such as CC=gcc-12 for targeted builds.[30]
A practical example is ./configure --enable-shared --without-gtk, which configures the build to produce shared libraries while omitting the GTK dependency, useful for creating lighter installations or avoiding unavailable libraries.[30] These options collectively enable flexible adaptation of the build to diverse environments without modifying source code.
History and evolution
Early development
The configure script originated in 1984 with Larry Wall's development of rn, a Usenet newsreader designed to handle the fragmented landscape of Unix systems. Wall hand-coded the initial Configure script as a shell script featuring interactive question-and-answer prompts and humorous commentary to guide users through system detection, such as verifying the operating system with messages like "Congratulations! You’re not running Eunice." This approach used conditional if-then statements to probe for hardware and software variations, including differences between Version 7 Unix and BSD derivatives, such as path separators and library locations. The script's manual construction reflected the era's need for portability across diverse Unix implementations, where even basic features like file archiving varied significantly.
Early adoption of similar configure scripts extended beyond rn, influencing Wall's subsequent projects and the broader Unix software community. In 1987, Wall adapted the rn Configure for Perl's initial release, marking one of the first instances of this detection mechanism in a widely distributed scripting language. The script in Perl inherited rn's probing logic to detect system-specific traits, ensuring compilation across Unix variants without extensive manual intervention. Other tools, including Wall's patch utility for applying software updates, incorporated comparable hand-crafted detection routines, focusing on reconciling inconsistencies like V7 Unix's simpler file handling versus BSD's extensions. These scripts proliferated in free software distributions throughout the late 1980s and early 1990s, as developers sought to distribute portable code amid the explosion of Unix-like systems.
However, the manual nature of these early configure scripts posed significant challenges, as maintaining them required constant updates for emerging system differences, leading to error-prone processes and duplicated efforts across projects. For instance, the original rn Configure included specific checks for terminal handling libraries—distinguishing between termcap in older systems and terminfo in newer BSD implementations—as well as archive formats like ar, which varied in command-line options and output structures. Such custom logic, often spanning hundreds of lines, became burdensome as Unix fragmentation grew, resulting in a landscape of similar yet non-standardized scripts that complicated software installation and portability. This proliferation highlighted the limitations of ad-hoc approaches, where a single overlooked system quirk could break builds on targeted platforms.
Modern developments
Autoconf has evolved significantly since version 2.13, released on January 5, 1999, which introduced support for Fortran 77 and enhanced include statements in macro processing.[31] Subsequent releases, culminating in version 2.72 on December 22, 2023, have expanded the macro library through better integration with Gnulib, providing portable C source code and macros for modern standards like C23 compatibility and Y2038 safety options for 64-bit time_t handling. These updates include refined macros for testing libraries and programs, such as AC_CHECK_LIB and AC_PROG_AR, improving overall portability across POSIX-like systems.[32]
Autoconf supports cross-compilation to Windows environments, including compatibility with MinGW and MinGW-w64 for generating build files adaptable to Windows without native POSIX shells. This addresses limitations in handling Windows-specific paths and libraries.[1]
Recent enhancements include deeper integration with pkg-config, originating in the late 1990s, which supplies Autoconf macros like PKG_CHECK_MODULES for automated dependency detection and flag extraction.[33] This allows configure scripts to query package metadata directly, streamlining library linkages without manual path specification. Additionally, autoreconf supports version control workflows, such as invocation in Git repositories to regenerate build files from source trees, ensuring consistency during clones and updates.
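For instance, a configure.ac might locate libxml2 through pkg-config as follows; the module name and version bound are illustrative:

```m4
dnl On success, PKG_CHECK_MODULES defines and substitutes
dnl LIBXML_CFLAGS and LIBXML_LIBS from the package's .pc metadata.
PKG_CHECK_MODULES([LIBXML], [libxml-2.0 >= 2.9.0])
```

This replaces manual AC_CHECK_LIB probing and hand-maintained include paths with a single query against the library's installed metadata.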
Despite these advances, configure scripts face persistent limitations, including verbose output that floods terminals with diagnostic messages during execution, though this can be mitigated with the --quiet option. On large projects, the scripts' reliance on sequential shell-based tests—compiling and running small programs for feature detection—leads to noticeable slowdowns, often taking minutes or longer without caching mechanisms.[34]
Since the 2010s, Autotools usage has declined in favor of declarative build systems like CMake and Meson, which offer faster configuration, better parallelization, and simpler syntax for cross-platform projects.[35] For instance, major initiatives such as LLVM and GNOME have migrated to these alternatives, citing reduced maintenance overhead and improved performance on diverse hardware.[36]
In Autoconf 2.70 and later, refinements to the AC_REQUIRE macro address macro ordering issues by enforcing explicit invocations and restricting its use in complex shell flows, preventing latent bugs from unordered dependencies.[37] These changes promote safer macro expansion.