Visual servoing
Visual servoing is a robotic control technique that uses real-time visual feedback from cameras to direct and adjust the motion of robots, integrating computer vision for feature extraction with control theory to minimize the error between current and desired visual configurations.[1] This approach enables precise tasks such as positioning, tracking, and manipulation without relying solely on pre-programmed models or external sensors.[2] The concept has roots in vision-guided manipulation experiments of the 1970s and took shape through the 1980s, with foundational work including Weiss et al.'s 1987 demonstration of vision-guided robot control and subsequent schemes by Feddema and Mitchell in 1989, building toward a unified framework by the mid-1990s.[1] A seminal tutorial by Hutchinson, Hager, and Corke in 1996 formalized visual servoing as the fusion of image processing, kinematics, dynamics, and real-time computing to servo robots based on visual features.[3] Over time, the field has expanded from static environments to dynamic scenarios, incorporating robot dynamics for higher speed and accuracy and addressing challenges such as camera calibration and feature occlusion.[2]
Central to visual servoing are two primary paradigms: image-based visual servoing (IBVS), which directly regulates features in the image plane to avoid explicit 3D reconstruction, and position-based visual servoing (PBVS), which estimates the camera's 3D pose relative to targets and controls motion in Cartesian space.[1] Hybrid methods combining these, along with 2.5D or switching schemes, further enhance robustness by decoupling translational and rotational motions or by fusing visual data with other sensors.[3] Camera configurations vary, including eye-in-hand (mounted on the robot) for dexterous manipulation and eye-to-hand (fixed) for broader scene observation.[1]
Applications span mobile robotics for navigation and localization, aerial vehicles for obstacle avoidance, medical systems for minimally invasive procedures, and industrial manipulators for assembly tasks.[2] Recent advances incorporate deep learning for feature detection in unstructured environments and model predictive control for optimal trajectories, improving adaptability to uncertainties such as lighting variations or motion blur.[2] These developments underscore visual servoing's role in enabling autonomous, vision-driven robotics across diverse domains.[1]
Introduction
Definition and principles
Visual servoing is a closed-loop control technique that employs visual feedback from cameras to direct robot motion, allowing the end-effector to attain a desired pose relative to a target object.[1] This approach integrates computer vision data directly into the servo loop, enabling precise and adaptive control without relying on precomputed trajectories.[4] At its core, visual servoing relies on real-time image processing to extract visual features, such as points or contours, which are compared to desired values to generate corrective commands.[1] These features feed into the control loop to minimize positioning errors, setting the approach apart from open-loop vision guidance methods that lack ongoing feedback and are prone to inaccuracies from calibration drift or environmental changes.[4]
The fundamental system architecture includes a vision sensor, typically a camera mounted on the robot (eye-in-hand) or fixed in the environment (eye-to-hand); a feature extractor that identifies and tracks relevant image elements; a controller that processes errors to compute velocity commands; and robot actuators that execute the motions.[1] Visual servoing surpasses traditional sensors, such as tactile or proprioceptive devices, by accommodating unstructured environments through direct use of visual data and by adapting to dynamic scenes via continuous feedback, thus enhancing robustness without requiring full 3D environmental models.[2] For instance, a robotic arm can employ visual servoing to adjust its gripper based on the target's position in the image plane, ensuring reliable manipulation amid minor perturbations.[4]
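The closed-loop structure described above can be summarized in a short sketch. The following Python function is a minimal, generic illustration rather than a standard implementation: the callables acquire_image, extract_features, interaction_matrix, and send_velocity are hypothetical placeholders standing in for the vision sensor, feature extractor, and actuator interface.
```python
import numpy as np

def visual_servo_loop(acquire_image, extract_features, desired_features,
                      interaction_matrix, send_velocity,
                      gain=0.5, tolerance=1e-3, max_iters=1000):
    """Minimal closed-loop visual servoing skeleton (illustrative only).

    acquire_image      : callable returning the current camera frame
    extract_features   : callable mapping a frame to a feature vector (NumPy array)
    desired_features   : target feature vector
    interaction_matrix : callable returning the k x 6 image Jacobian at the current features
    send_velocity      : callable sending a 6-vector velocity command to the robot
    """
    for _ in range(max_iters):
        image = acquire_image()                        # vision sensor
        s = extract_features(image)                    # feature extraction
        error = s - desired_features                   # visual error
        if np.linalg.norm(error) < tolerance:          # converged to the goal
            break
        L = interaction_matrix(s)                      # image Jacobian at s
        velocity = -gain * np.linalg.pinv(L) @ error   # corrective velocity command
        send_velocity(velocity)                        # robot actuators
```
The proportional gain and stopping tolerance are arbitrary illustrative values; a real system would also handle feature loss and actuator limits.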
Historical development
The origins of visual servoing trace back to the integration of computer vision and robotics in the 1970s, with early experiments focusing on visual feedback for robotic manipulation. In 1973, Shirai and Inoue demonstrated one of the first uses of visual feedback to guide a robot in assembly tasks, marking an initial step toward closed-loop vision-based control. By 1979, Hill and Park had introduced the term "visual servoing" and developed a real-time system using a mobile camera attached to a robot for hand-eye coordination, laying foundational concepts for eye-in-hand configurations. Throughout the 1980s, researchers advanced these ideas through taxonomies and control frameworks: Sanderson and Weiss in 1980 classified visual servo systems into look-and-move and direct servo categories, while Weiss et al. in 1987 explored dynamic sensor-based control with visual feedback, emphasizing the need for robust integration of vision data into robot dynamics.
The 1990s saw a surge in theoretical and practical developments that established the field's core paradigms. Espiau, Chaumette, and Rives in 1992 proposed a seminal framework for image-based visual servoing (IBVS), deriving interaction matrices to directly regulate image features for robot control. Position-based visual servoing (PBVS), in which the 3D pose of the target is estimated from visual data to guide robot motion, matured in parallel, building on the earlier taxonomy of Sanderson and Weiss. These contributions were synthesized in the influential 1996 tutorial by Hutchinson, Hager, and Corke, which formalized IBVS and PBVS as the primary control schemes and highlighted their implementation on standard hardware. Key figures such as François Chaumette and Seth Hutchinson drove much of this progress, with Chaumette's work on feature selection and stability analysis becoming central to the field.
In the late 1990s and 2000s, advances focused on hybrid methods and real-time capabilities enabled by improved computational power. Malis, Chaumette, and Boudet in 1999 introduced 2.5D visual servoing, combining 2D image features with partial 3D depth information to mitigate the limitations of pure IBVS and PBVS. Researchers such as Corke further disseminated these methods through open-source toolboxes, facilitating widespread adoption in robotic applications.
Post-2010 developments integrated machine learning to enhance feature robustness and adaptability, particularly for dynamic platforms such as unmanned aerial vehicles (UAVs). For instance, Saxena et al. in 2017 proposed end-to-end visual servoing using convolutional neural networks to predict control commands directly from images, improving performance in unstructured settings.[5] By the 2020s, hybrid ML-enhanced approaches, such as deep model predictive control for visual servoing, had addressed challenges in feature extraction and trajectory optimization, with applications in UAV docking and manipulation tasks.[6]
Fundamentals
Visual feedback mechanisms
Visual feedback in visual servoing relies on specialized vision sensors to capture environmental data, which is then processed to guide robotic actions. The primary configurations include eye-in-hand systems, where the camera is mounted on the robot's end-effector, providing a dynamic viewpoint that moves with the manipulator for precise local tracking; eye-to-hand setups, featuring a fixed camera external to the robot that observes the workspace globally; and eye-in-body arrangements, typically used in mobile robots such as unmanned aerial vehicles, where the camera is attached to the robot's body frame to enable navigation and obstacle avoidance.[7]
The data flow begins with image acquisition, in which the vision sensor captures sequential frames of the scene at high rates to ensure temporal continuity. Preprocessing follows, involving operations such as noise filtering through Gaussian smoothing or histogram equalization to mitigate distortions from sensor artifacts or environmental interference. Feature detection then extracts relevant visual cues, such as edges using the Canny algorithm or corners via the Harris detector, which identifies points of high curvature by computing the autocorrelation matrix of image gradients to localize stable keypoints for tracking.[8][9]
In the feedback loop, these processed features continuously update estimates of the robot's pose relative to the target, forming a closed loop in which visual errors drive corrective velocities. Systems handle challenges such as occlusions, where target features are temporarily obscured, through predictive tracking or multi-view redundancy, and lighting variations through adaptive thresholding or illumination-invariant descriptors, maintaining feature reliability without interrupting the loop.[10]
Sensor fusion enhances feedback robustness by integrating visual data with complementary sensors, such as inertial measurement units (IMUs), which provide acceleration and angular velocity readings to compensate for visual drift or momentary losses in feature tracking, yielding more accurate pose estimates in dynamic environments. Real-time performance is critical, as processing latency, from acquisition delays to computation overhead, can destabilize the feedback loop by introducing phase lags that amplify errors in high-speed tasks; mitigation strategies include parallel hardware acceleration and predictive filtering to keep control-loop rates above roughly 30 Hz for stable servoing.[11]
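As a concrete illustration of the preprocessing and feature-detection stages described above, the sketch below uses OpenCV in Python. It assumes a BGR camera frame as input and uses the Shi-Tomasi corner detector (cv2.goodFeaturesToTrack), a close relative of the Harris criterion; the parameter values are arbitrary examples, not values prescribed by the visual servoing literature.
```python
import cv2
import numpy as np

def detect_corner_features(frame, max_corners=50):
    """Preprocess a camera frame and extract corner features for tracking.

    Returns an (N, 2) array of (x, y) pixel coordinates, or an empty
    array if no corners are found.
    """
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # single-channel image
    gray = cv2.GaussianBlur(gray, (5, 5), 1.0)       # suppress sensor noise
    gray = cv2.equalizeHist(gray)                    # reduce lighting sensitivity
    # Shi-Tomasi corners, based on the autocorrelation matrix of image gradients
    corners = cv2.goodFeaturesToTrack(gray, max_corners, 0.01, 10)
    if corners is None:
        return np.empty((0, 2))
    return corners.reshape(-1, 2)
```
In a servoing context, the returned pixel coordinates would be matched frame to frame and fed into the control loop as the current feature vector.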
Mathematical foundations
Visual servoing relies on well-defined coordinate systems to relate visual observations to robotic motion. The primary frames include the camera frame, attached to the optical center of the imaging sensor; the image plane, where two-dimensional pixel coordinates are measured; and the robot's Cartesian space, encompassing the base frame and end-effector frame. These frames enable the mapping of three-dimensional world points to image features, which is crucial for feedback control.[8]
The projection of three-dimensional points onto the image plane is typically modeled using the pinhole camera equation, which assumes an ideal perspective projection. For a 3D point in homogeneous world coordinates \tilde{\mathbf{X}}_w = [X_w, Y_w, Z_w, 1]^T, the homogeneous image coordinates [u, v, 1]^T are given by s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \mathbf{K} [\mathbf{R} | \mathbf{t}] \tilde{\mathbf{X}}_w, where s is a scaling factor, \mathbf{K} is the intrinsic camera matrix incorporating the focal length and principal point, and [\mathbf{R} | \mathbf{t}] contains the extrinsic parameters defining the rotation \mathbf{R} and translation \mathbf{t} from the world frame to the camera frame. This model forms the basis for interpreting visual data in visual servoing tasks.[12][8][13]
Pose estimation in visual servoing involves determining the relative positions and orientations between the robot's end-effector, the camera, and the target object. This is achieved through homogeneous transformation matrices, which compactly represent rigid-body motions in six degrees of freedom. A homogeneous transformation \mathbf{T} = \begin{bmatrix} \mathbf{R} & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix} describes the pose from one frame to another, such as from the robot base to the end-effector or from the camera to the target. Chains of these transformations link the robot's joint space to the visual observations, enabling pose reconstruction from image correspondences or direct measurements.[8][4]
The interaction matrix, also known as the image Jacobian, bridges the gap between image feature dynamics and camera motion. For a feature vector \mathbf{s} of dimension k in the image plane, the time derivative \dot{\mathbf{s}} relates to the camera velocity \mathbf{v} = [v_x, v_y, v_z, \omega_x, \omega_y, \omega_z]^T via \dot{\mathbf{s}} = \mathbf{L}_s \mathbf{v}, where \mathbf{L}_s is the k \times 6 interaction matrix. This matrix depends on the feature type and the current image coordinates (and, for point features, on their depth), allowing image-space errors to be mapped to camera velocity commands. Its computation is essential for ensuring the stability and convergence of servoing loops.[12][4]
Robot kinematics integrate with visual data by combining forward and inverse kinematic models. Forward kinematics map joint velocities \dot{q} to end-effector velocities via the manipulator Jacobian, \dot{x} = \mathbf{J}(q) \dot{q}, where x is the Cartesian pose. In visual servoing, this is extended to the camera frame, often yielding a composite Jacobian relating joint velocities to image feature changes, \dot{\mathbf{s}} = \mathbf{L}_s \mathbf{V}_c \mathbf{J}(q) \dot{q}, with \mathbf{V}_c transforming end-effector velocities to camera velocities. Inverse kinematics then solve for \dot{q} to achieve the desired visual motion, accommodating constraints such as joint limits.[8][4]
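To make these relations concrete, the following NumPy sketch implements the pinhole projection and the classical interaction matrix of a single normalized image point, whose rows depend on the point coordinates (x, y) and depth Z. It is a minimal illustration under the assumption of a calibrated camera, not a complete servoing implementation.
```python
import numpy as np

def project_point(K, R, t, X_w):
    """Pinhole projection: map a 3D world point to pixel coordinates.

    K    : 3x3 intrinsic matrix
    R, t : rotation (3x3) and translation (3,) from world to camera frame
    X_w  : 3D point in world coordinates
    """
    X_c = R @ X_w + t                 # point expressed in the camera frame
    u, v, w = K @ X_c                 # homogeneous image coordinates
    return np.array([u / w, v / w])   # perspective division

def point_interaction_matrix(x, y, Z):
    """Interaction matrix of a normalized image point (x, y) at depth Z.

    The two rows give the sensitivity of (x_dot, y_dot) to the camera
    velocity [vx, vy, vz, wx, wy, wz].
    """
    return np.array([
        [-1 / Z,      0, x / Z,     x * y, -(1 + x**2),  y],
        [     0, -1 / Z, y / Z,  1 + y**2,      -x * y, -x],
    ])
```
Stacking the 2 x 6 matrices of several points yields the k x 6 interaction matrix used in the control law.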
The visual error in servoing is defined as the discrepancy between current and desired feature configurations. In image-based approaches, the error is \mathbf{e} = \mathbf{s} - \mathbf{s}^*, where \mathbf{s} and \mathbf{s}^* are the current and desired image features, respectively. In position-based methods, the error is defined on the relative pose between the current and desired camera frames, typically parameterized by a translation vector and an axis-angle rotation extracted from the corresponding homogeneous transformations. This error drives the control law, and its minimization ensures task convergence.[12][8]
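As an illustration of the position-based error, the sketch below computes the translation and axis-angle components of the pose error between a current and a desired camera pose, both given as 4x4 homogeneous transforms in a common reference frame. It is a minimal NumPy example of the common translation-plus-axis-angle parameterization, not a prescribed formulation.
```python
import numpy as np

def pbvs_error(T_cur, T_des):
    """Pose error between current and desired camera poses.

    Returns a 6-vector [tx, ty, tz, wx, wy, wz]: translation error and
    axis-angle (theta * u) rotation error.
    """
    T_err = np.linalg.inv(T_des) @ T_cur      # relative transform desired -> current
    R, t = T_err[:3, :3], T_err[:3, 3]
    # Axis-angle from the rotation matrix (log map of SO(3))
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if np.isclose(theta, 0.0):
        theta_u = np.zeros(3)                 # no rotation error
    else:
        # Note: the near-pi singular case is not handled in this minimal sketch.
        axis = np.array([R[2, 1] - R[1, 2],
                         R[0, 2] - R[2, 0],
                         R[1, 0] - R[0, 1]]) / (2.0 * np.sin(theta))
        theta_u = theta * axis
    return np.concatenate([t, theta_u])
```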
Taxonomy and classification
Control schemes
Visual servoing control schemes are primarily classified by the reference frame in which the control law operates, with the two foundational approaches being image-based visual servoing (IBVS) and position-based visual servoing (PBVS). These schemes determine how visual features are mapped to robot velocities or positions to achieve task convergence, balancing computational efficiency, robustness to modeling errors, and trajectory predictability. IBVS emerged in the late 1980s as a direct method that leverages raw image data, while PBVS relied on the 3D pose estimation techniques available at the time.[4]
In IBVS, control is performed directly in the 2D image plane using pixel coordinates or other image features, without explicit 3D reconstruction of the environment. This decoupling from camera calibration and 3D modeling makes IBVS robust to calibration errors and noise in pose estimates, as it operates solely on observable image data. However, it can suffer from nonlinear interactions between image features, leading to potential local minima and curved camera trajectories that may leave the robot's workspace for large initial errors. Early IBVS implementations, such as those using point features, demonstrated real-time feasibility on robotic arms.[4][14]
PBVS, in contrast, reconstructs the 3D relative pose between the camera and target from visual data and then controls the robot in the Cartesian task space to minimize this pose error. This approach allows straight-line trajectories and global asymptotic stability when accurate 3D models are available, making it suitable for tasks requiring precise positioning. Its drawbacks include high sensitivity to calibration inaccuracies, depth estimation errors, and feature occlusions, which can propagate into unstable control if the pose computation fails. PBVS was among the first visual servoing methods proposed, building on pose estimation from stereo or monocular vision.[4][14]
Hybrid schemes, such as 2.5D visual servoing, partition the control between 2D image space and partial 3D information, often using image coordinates for in-plane motions and depth or pose components for out-of-plane adjustments. This partitioning mitigates the local minima of pure IBVS while reducing the calibration dependence of PBVS, enabling more predictable trajectories without full 3D reconstruction. For instance, 2.5D methods employ logarithmic depth features alongside 2D projections to ensure convergence even from distant initial positions. These hybrids evolved in the late 1990s to address limitations of the basic schemes, incorporating techniques such as epipolar geometry for uncalibrated environments.[15][4]
Additional classifications distinguish schemes by control output and target motion. Velocity-based control, the most common framework, computes joint or end-effector velocities from visual errors, integrating robot dynamics for smooth motion in eye-in-hand configurations. Position-output control, less prevalent, directly commands joint positions rather than velocities, which is advantageous for avoiding velocity saturation but requires more complex stability guarantees.
Regarding targets, traditional schemes assume static objects for convergence analysis, whereas extensions for dynamic targets incorporate predictive models or filtering to track moving features, though these are often treated separately from the core control design.[16][4]
The evolution of these schemes reflects advances in computing and vision: late-1980s work focused on PBVS for its intuitive 3D control, but in the early 1990s IBVS gained prominence for its insensitivity to calibration errors, leading to hybrids such as 2.5D control that combine the strengths of both for practical robotics applications. Modern variants include switching strategies that alternate between IBVS and PBVS based on error thresholds, enhancing robustness in unstructured environments; a minimal sketch of such a switching rule follows the comparison table below.[17][4]
| Scheme | Advantages | Disadvantages |
|---|---|---|
| IBVS | Robust to calibration errors; uses direct image feedback for local stability.[4] | Prone to local minima; nonlinear trajectories for large displacements.[14] |
| PBVS | Global stability; enables Cartesian straight-line paths.[4] | Sensitive to pose estimation and calibration inaccuracies.[14] |
| Hybrid (e.g., 2.5D) | Balances 2D robustness with 3D predictability; avoids full reconstruction.[15] | Requires partial depth estimation; increased computational partitioning.[4] |
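The switching strategy mentioned above can be summarized in a short sketch. The Python example below is a hypothetical illustration: ibvs_control and pbvs_control are placeholder callables for the two control laws, and the image-error threshold is an arbitrary example value rather than a standard criterion.
```python
import numpy as np

def switching_visual_servo(s, s_star, T_cur, T_des,
                           ibvs_control, pbvs_control,
                           switch_threshold=50.0):
    """Switching scheme: use PBVS while the image error is large, then
    switch to IBVS for fine, calibration-insensitive convergence.

    s, s_star    : current and desired image features (pixel coordinates)
    T_cur, T_des : current and desired camera poses (4x4 homogeneous transforms)
    ibvs_control : callable(s, s_star) -> 6-vector camera velocity
    pbvs_control : callable(T_cur, T_des) -> 6-vector camera velocity
    """
    image_error = np.linalg.norm(s - s_star)
    if image_error > switch_threshold:
        # Far from the goal: PBVS yields predictable Cartesian trajectories.
        return pbvs_control(T_cur, T_des)
    # Near the goal: IBVS is robust to calibration and pose-estimation errors.
    return ibvs_control(s, s_star)
```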