Scatter matrix
A scatter matrix, also known as a scatterplot matrix or SPLOM (scatterplot matrix), is a graphical tool used in exploratory data analysis (EDA) to visualize the pairwise relationships between multiple variables in a multivariate dataset.[1] It consists of a square grid of small scatter plots, where the rows and columns correspond to the variables, and each off-diagonal cell displays a scatter plot of one variable against another, revealing patterns such as correlations, clusters, or outliers.[2] The diagonal cells typically show univariate representations of each variable, such as histograms, density plots, or box plots, to summarize marginal distributions.[3] Introduced as part of multivariate visualization techniques in the late 1970s, the scatter matrix helps assess linear and nonlinear dependencies across variables simultaneously, aiding in data understanding before more advanced modeling.[4] Common in statistical software like R (via thepairs() function), Python (via seaborn.pairplot() or pandas.plotting.scatter_matrix()), and tools like JMP or SPSS, it is particularly useful for datasets with 3 to 10 continuous variables, though larger sets may require enhancements like smoothing or color coding for clarity.[5] Limitations include overplotting in dense data and challenges in interpreting higher dimensions, often complemented by other plots like heatmaps for correlation matrices.[6]