Fact-checked by Grok 2 weeks ago

Set packing

Set packing is a classical problem in which, given a finite universe of elements and a collection of subsets of that universe, the goal is to select the largest possible subcollection of these subsets such that no two selected subsets share any common element, meaning they are pairwise disjoint. This unweighted version seeks to maximize the number of selected sets, while the weighted variant assigns positive weights to each subset and aims to maximize the total weight of the selected disjoint subcollection. Formally, the problem can be expressed as an integer linear program: maximize \sum_{j=1}^m w_j x_j subject to \sum_{j: i \in S_j} x_j \leq 1 for each element i in the , where S_1, \dots, S_m are the given , w_j > 0 are the weights, and x_j \in \{0,1\} indicates whether S_j is selected. The corresponding M (with rows for elements and columns for ) is a 0-1 , and the \max \{ w x : x \geq 0, M x \leq 1 \} provides a bound, with the optimum being under special conditions such as when M is totally unimodular or the associated clutter (a of minimal non-empty sets with no inclusions) is ideal. The dual problem relates closely to the set covering problem, where the objective is to minimize the cost to cover all elements. Set packing is NP-hard in general, and even the unweighted version restricted to subsets of size at most 3 (known as 3-set packing) is NP-complete. It remains NP-hard to approximate within a factor of \Omega(k / \log k) for the k-set packing variant, where all subsets have size k \geq 3. However, polynomial-time algorithms exist for restricted cases, such as when the subsets correspond to edges in a (reducible to maximum matching) or when the incidence matrix has perfect or balanced properties. The problem has broad applications across various fields, including airline crew scheduling, where disjoint shifts must be assigned without overlapping personnel; combinatorial auctions, for selecting non-conflicting bids; surgical scheduling to avoid resource overlaps; and packet transmission in communication networks to prevent . Additional uses arise in , logical inference in propositional logic, and problems like maximum cuts in planar graphs or T-joins. Due to its , practical solutions often rely on heuristics, branch-and-bound methods, or approximations, with recent advances encoding it as a maximum weighted set problem for improved performance on benchmark instances.

Definition and Background

Formal Definition

The set packing problem is a fundamental combinatorial optimization problem in which one seeks to select as many pairwise disjoint subsets as possible from a given family of subsets of a universe. Formally, the decision version of the set packing problem is defined as follows: given a finite universe U, a finite family \mathcal{F} of non-empty subsets of U, and a positive integer k, determine whether there exists a subfamily \mathcal{F}' \subseteq \mathcal{F} consisting of at least k sets such that no two sets in \mathcal{F}' share a common element. Pairwise disjointness in this context means that for any two distinct sets S, T \in \mathcal{F}', the S \cap T = \emptyset. The optimization version of the problem asks to find the maximum of such a subfamily \mathcal{F}', denoted by \nu(\mathcal{F}), which represents the of the largest collection of pairwise from \mathcal{F}. This formulation assumes a finite U and a finite family \mathcal{F} consisting of non-empty subsets, ensuring the problem is well-defined over finite instances. Conceptually, set packing is equivalent to finding a maximum matching in a , where the elements of U serve as vertices and the sets in \mathcal{F} as hyperedges.

Historical Context

The study of set packing traces its origins to foundational developments in during the mid-20th century, particularly through connections to matching problems. In the , key results such as on the decomposition of partial orders into chains and the Ford-Fulkerson for bipartite graphs established early min-max relations that underpin packing concepts in networks and matchings. These works highlighted the problem of selecting maximum disjoint subsets, with bipartite matching serving as a special case of set packing where the sets are edges of two. By the 1960s, the framework evolved further with ' contributions to general , including his 1965 association of set packing with stable sets via conflict graphs, where sets correspond to vertices and intersections define edges. This period also saw the emergence of problems in , which influenced set packing by emphasizing disjoint selections that partition the universe, as explored in early contexts. D.R. Fulkerson's anti-blocking theory during this decade linked these packing and covering dualities, providing theoretical foundations for non-bipartite cases through polyhedral approaches. A pivotal milestone occurred in 1972 when Richard Karp formalized set packing as one of 21 NP-complete problems in his seminal paper, demonstrating its reducibility from 3-dimensional matching and solidifying its place in . Building on Stephen Cook's 1971 result for , Karp's work shifted focus from exact solvability to intractability, inspiring subsequent research. In the 1990s, attention turned to approximation algorithms, with Hurkens and Schrijver's 1989 local search heuristic marking an early breakthrough by achieving a k/2 + \epsilon-approximation for the , where \epsilon > 0 is arbitrary and the algorithm examines small neighborhoods for improvements. This approach, rooted in analyzing systems of sets with system of distinct representatives, influenced later greedy and LP-based methods, emphasizing practical heuristics over exact solutions.

Mathematical Formulations

Integer Linear Programming

The set packing problem, in its optimization form known as the maximum set packing problem, seeks to select the maximum number of pairwise from a given family of subsets of a . This can be naturally modeled as an integer linear program (ILP), where binary decision variables indicate whether each set is included in the solution. Let U be the universe of elements and \mathcal{F} the family of subsets. Introduce a variable x_S \in \{0,1\} for each S \in \mathcal{F}, where x_S = 1 if set S is selected and 0 otherwise. The objective is to maximize the number of selected sets: \max \sum_{S \in \mathcal{F}} x_S The key constraints ensure disjointness by limiting each to at most one selected set: for every e \in U, \sum_{\substack{S \in \mathcal{F} \\ e \in S}} x_S \leq 1. These packing inequalities, along with the binary domain x_S \in \{0,1\} for all S \in \mathcal{F}, fully specify the ILP. A standard approach to bounding or approximating solutions involves the natural (LP) relaxation of this ILP, obtained by relaxing the integrality constraints to x_S \in [0,1] for all S \in \mathcal{F}, while retaining the objective and packing inequalities unchanged. This LP provides an upper bound on the optimal ILP value and serves as a foundation for various solution techniques, though its solutions may not be .

Hypergraph Interpretation

In the hypergraph interpretation of set packing, the universe U serves as the vertex set V of a H = (V, E), while the family of subsets \mathcal{F} corresponds to the hyperedge set E, where each hyperedge is a from \mathcal{F}. A set packing in this formulation is equivalent to a matching in the , defined as a collection of hyperedges such that no two hyperedges share a common , ensuring vertex-disjointness. The size of the maximum set packing is thus the matching number \nu(H), which denotes the cardinality of the largest such matching in H. In the case of uniform hypergraphs, where all hyperedges have the same cardinality k (i.e., k-uniform hypergraphs), this interpretation specializes to the k-set packing problem, maintaining the same structural equivalence.

Computational Complexity

NP-Completeness Proofs

The decision version of the set packing problem asks whether, given a family of sets \mathcal{F} over a universe U and an integer k, there exists a subfamily of k pairwise disjoint sets from \mathcal{F}. This problem is in NP because a certificate consisting of the k sets can be verified in polynomial time: for each element in U, check that it appears in at most one of the k sets, which requires time linear in the input size after sorting or hashing the elements across the sets. To establish NP-hardness, consider the reduction from the independent set problem, which is known to be NP-complete. Given an undirected graph G = (V, E) and integer k, construct a universe U consisting of one element e_{uv} for each edge \{u, v\} \in E. For each vertex v \in V, form a set S_v = \{e_{uv} \mid \{u, v\} \in E\} (i.e., the elements corresponding to edges incident to v). The family is \mathcal{F} = \{S_v \mid v \in V\}, and use the same k. This construction is polynomial-time computable. A set of k vertices in G forms an independent set if and only if the corresponding k sets in \mathcal{F} are pairwise disjoint, since S_u \cap S_v = \emptyset precisely when no edge connects u and v. Thus, G has an independent set of size at least k if and only if \mathcal{F} has a packing of size k. A seminal proof of NP-hardness, due to Karp, leverages the exact cover by 3-sets (X3C) problem, which is NP-complete via reduction from 3-dimensional matching. In an X3C instance, the universe U has |U| = 3m and \mathcal{C} is a collection of 3-element subsets of U; the question is whether there is a subcollection of exactly m sets from \mathcal{C} that covers every element of U exactly once. This is equivalent to asking whether \mathcal{C} has a packing of size m, because any m pairwise disjoint 3-element sets would cover exactly $3m distinct elements, hence U entirely. Thus, X3C is a special case of 3-set packing (where all sets in \mathcal{F} have size at most 3), establishing the NP-hardness of 3-set packing in polynomial time via the identity mapping. Since 3-set packing is a special case of general set packing, the latter is also NP-hard. Combining membership in NP and NP-hardness yields NP-completeness.

Approximation Properties

The maximum set packing problem cannot be approximated to within any constant factor unless P = , since it is equivalent to finding a maximum independent set in the of the given sets, where vertices correspond to sets and edges connect intersecting pairs. More precisely, for any ε > 0, there is no polynomial-time algorithm approximating the optimum within a factor of m^{1 - ε} unless ⊆ ZPP, where m is the number of sets. It is NP-hard to approximate k-set packing within Ω(k / log k). A simple for maximum set packing iteratively selects an arbitrary set from the current collection and removes all sets that intersect it, repeating until no sets remain; this achieves an approximation ratio of O(|U|). Improved performance is possible through local search techniques, which start with a solution and iteratively replace a small number of sets in the current packing with a larger disjoint subfamily if possible, yielding an approximation ratio of O(√|U|). LP-based methods, such as primal-dual approaches, also attain the O(√|U|) ratio by solving the natural fractional packing relaxation and the solution while preserving the packing constraints. The standard LP relaxation for set packing maximizes the sum of variables x_S associated with each set S (subject to ∑_{S ∋ u} x_S ≤ 1 for every u ∈ U and x_S ≥ 0), providing an upper bound on the integer optimum. This relaxation exhibits an integrality gap of at least Ω(√|U|), as demonstrated by instances derived from graphs with high girth and independence number, such as certain random or constructions where the fractional solution is roughly √|U| times the integer one; this gap explains the limitations of LP-based approximations and matches the upper bound achieved by local search. Advancements in the have refined approximations for restricted cases of set packing, such as when set sizes are bounded, with local search methods achieving ratios like (k + 1)/3 + ε for k-set packing. For example, as of , weighted 3-set packing can be approximated within 1.786. These results build on earlier techniques and offer better guarantees for practical instances while the general case remains governed by the √|U| barrier under standard assumptions.

Algorithms for Special Cases

Bounded Set Size

In the bounded set size variant of the set packing problem, all sets in the collection have cardinality at most k, for a fixed positive integer k. This restriction, known as k-set packing, corresponds to the case of hypergraphs where all hyperedges have size at most k. When k=1, the problem is trivial: the optimal solution consists of selecting all singleton sets, as they are disjoint by definition and cover the entire universe without overlap. For k=2, the problem reduces to finding a maximum cardinality matching in an undirected , where sets correspond to edges; this can be solved exactly in polynomial time using Edmonds' blossom algorithm, which runs in O(n^3) time for a graph with n vertices. For k \geq 3, the k-set packing problem is NP-hard. Despite the hardness, approximation algorithms exist; a local search achieves an approximation ratio of \frac{k+1 + \epsilon}{3} for any \epsilon > 0, running in polynomial time. For exact solutions when k is small, parameterized algorithms based on techniques like color coding solve instances in time f(k,t) \cdot |U|^{O(1)}, where t is the size of the packing.

Bounded Degree

In the bounded degree case of set packing, the family of sets F has maximum degree Δ(F) ≤ d, meaning that each element belongs to at most d sets in F. This structural restriction limits the overlapping of sets and enables more effective approximation algorithms compared to the general case. The weighted version admits approximation guarantees via LP rounding techniques, leveraging the bounded overlap to reduce the integrality gap and achieve ratios depending on d. For d=2, the problem is exactly solvable in polynomial time, as the hypergraph structure reduces to a collection of paths and cycles, allowing dynamic programming to find the maximum packing. For constant d, improved approximation ratios better than the general case's O(n / log^2 n) are possible via LP rounding techniques, leveraging the bounded overlap to reduce the integrality gap.

Equivalent Formulations

Set packing is computationally equivalent to the maximum independent set problem via a straightforward polynomial-time reduction. Given an instance of set packing consisting of a family of sets \mathcal{F} = \{S_1, S_2, \dots, S_m\} over a universe U, construct the intersection graph G = (V, E) where V = \mathcal{F} (each set is a vertex) and there is an edge between vertices S_i and S_j (for i \neq j) if and only if S_i \cap S_j \neq \emptyset. A maximum set packing in \mathcal{F} then corresponds exactly to a maximum independent set in G, as an independent set in G selects a collection of pairwise non-intersecting sets, and the size of the packing equals the independence number \alpha(G). This equivalence extends to the maximum clique problem through the complement graph construction. Specifically, since maximum in a H is equivalent to maximum independent set in the \overline{H} (where the clique size equals the independence number of \overline{H}), composing the above reduction with this standard transformation yields a from set packing to maximum that preserves optimality. Set packing is also directly isomorphic to hypergraph matching. In this view, the universe U forms the vertex set of a hypergraph H = (U, \mathcal{F}), where the sets in \mathcal{F} are the hyperedges; a maximum set packing then corresponds to a maximum matching in H, consisting of vertex-disjoint hyperedges, with the matching size equal to the packing size. These equivalences are bidirectional: polynomial-time reductions exist in both directions among set packing, maximum independent set, maximum clique, and hypergraph matching, preserving the optimal solution sizes, as all are NP-complete problems with known inter-reductions in the complexity literature.

Broader Connections and Uses

Set packing is closely related to the , which serves as its minimization dual, where the objective is to select a minimum number of sets to cover all elements, contrasting with set packing's maximization of . Another related problem is , a decision variant of set packing that requires selecting a collection of that precisely cover the entire universe without overlap or omission. Generalizations of set packing extend the basic model to more complex scenarios, such as multidimensional set packing, where sets are evaluated across multiple dimensions like cost and capacity constraints beyond simple . Weighted versions incorporate varying profits or costs for sets, transforming the objective into maximizing total weight subject to disjointness, while capacitated variants allow limited overlaps or resource bounds per element. In practical applications, set packing models tasks, such as scheduling disjoint jobs on machines to maximize throughput without conflicts, as seen in systems. In bioinformatics, it aids set selection by identifying non-overlapping functional groups from genomic data to prioritize targets. For network design, set packing formulations support finding maximum disjoint paths in graphs to enhance reliability and capacity. Ongoing addresses open problems in set packing, including the pursuit of improved approximation ratios for instances with specific set densities, such as those where sets have bounded intersections. studies explore fixed-parameter tractability (FPT) algorithms for parameters like , aiming to solve dense or structured instances efficiently.

References

  1. [1]
    [PDF] Combinatorial Optimization: Packing and Covering - andrew.cmu.ed
    The integer programming models known as set packing and set cov- ering have a wide range of applications, such as pattern recognition,.
  2. [2]
    Solving the Set Packing Problem via a Maximum Weighted ...
    Dec 16, 2020 · The set packing problem (SPP) is a significant NP-hard combinatorial optimization problem with extensive applications.Abstract · Introduction · Diversion Local Search Based... · Computational Results
  3. [3]
    [PDF] ON THE COMPLEXITY OF APPROXIMATING k-SET PACKING
    We improve the inapproximability factor for Max- imum k-Set Packing, and prove that it is NP-hard to approximate k-SP to within Ω(k/lnk). We extend this ...
  4. [4]
    Set Packing Optimization by Evolutionary Algorithms with ... - NIH
    The set packing problem is a core NP-complete combinatorial optimization problem which aims to find the maximum collection of disjoint sets from a given ...
  5. [5]
    [PDF] COSC 311: ALGORITHMS
    The Set Packing problem is defined as follows: Input: • A set U consisting of n elements. • m subsets of U, S1,...,Sm. • A positive integer k. Output: Is there ...
  6. [6]
    None
    ### Extracted Formal Definition of the Set Packing Problem
  7. [7]
  8. [8]
    On local search and LP and SDP relaxations for k-Set Packing - arXiv
    Jul 27, 2015 · It is equivalent to hypergraph matching and it is strongly related to the maximum independent set problem. In this thesis we study the k-set ...
  9. [9]
    [PDF] Aspects of Set Packing, Partitioning, and Covering
    we investigate set packing relaxations of combinatorial optimization problems associated with acyclic digraphs and linear orderings, cuts and multicuts ...
  10. [10]
    Exact cover - Wikipedia
    An exact cover problem involves the relation contains between subsets and elements. But an exact cover problem can be represented by any heterogeneous relation ...Missing: 1960s | Show results with:1960s
  11. [11]
    [PDF] Reducibility among Combinatorial Problems
    Together Cook & Karp, and independently Levin laid the foundations of the theory of NP-Completeness. • “… Karp introduced the now standard methodology for.
  12. [12]
    [PDF] On the Size of Systems of Sets Every t of Which Have an SDR ...
    On the Size of Systems of Sets Every t of Which Have an SDR, with an Application to the Worst-Case Ratio of Heuristics for Packing Problems · C. Hurkens, A.
  13. [13]
    [PDF] Lecture 5
    Sep 9, 2014 · This problem is traditionally called the integer set packing problem (we want to pack as many disjoint sets as possible). The linear program is ...
  14. [14]
    [PDF] A Tutorial on Integer Programming
    The set packing problem arises when each set element must appear in at most one subset. In this case, the constraints are of the less-than-or-equal form. The ...
  15. [15]
    [PDF] On linear and semidefinite programming relaxations for hypergraph ...
    The hypergraph matching problem, also known as the set packing problem, is a fun- damental problem in combinatorial optimization with various applications ...
  16. [16]
    [PDF] On a hypergraph matching problem - Princeton Math
    A hypergraph H is an ordered pair H = (V,E) where V is a finite set (the vertex set) and E is a family of distinct subsets of V (the edge set). A hypergraph ...
  17. [17]
    [PDF] reducibility among combinatorial problems - CS@Purdue
    MINIMUM CUT [Edmonds & Karp (1972)] ... INPUT: graph G, positive integer k. PROPERTY: G has a set of k mutually adjacent nodes. SET PACKING. INPUT: Family of sets ...
  18. [18]
    [PDF] approximating covering and packing problems: set cover, vertex ...
    We trace 94 Page 2 3.1 INTRODUCTION 95 the history and development of this approach as it evolved for the set cover, set packing, and related problems.
  19. [19]
    An Improved Approximation for Maximum Weighted $k$-Set Packing
    Jan 18, 2023 · Our algorithm is based on the local search procedure of Berman that attempts to improve the sum of squared weights rather than the problem's ...
  20. [20]
    [PDF] Edmonds' Blossom Algorithm - Stanford University
    Jun 6, 2016 · Edmonds' Blossom algorithm is a polynomial time algorithm for finding a maximum matching in a graph. Definition 1.1. In a graph G, a matching is ...
  21. [21]
    Approximate the k-Set Packing Problem by Local Improvements - arXiv
    Jul 8, 2013 · We study algorithms based on local improvements for the k-Set Packing problem. The well-known local improvement algorithm by Hurkens and ...Missing: 1989 | Show results with:1989
  22. [22]
    [PDF] A faster parameterized algorithm for set packing 1 Introduction - NJIT
    Dec 14, 2004 · We present an efficient parameterized algorithm for the (k, t)-set packing problem, in which we are looking for a collection of k disjoint sets ...
  23. [23]
    Approximating maximum independent set in bounded degree graphs
    Efficient bounds for the stable set, vertex cover, and set packing problems. Disc. Applied Math. 6, (1982), 243-254. [J] D.S. Johnson, The NP-Completeness ...
  24. [24]
    [PDF] EFFICIENT BOUNDS FOR THE STABLE SET, VERTEX COVER ...
    A cover in a graph G is a set of vertices C such that each edge of G has at least one endpoint in C. The vertex cover problem is the problem of finding a ...