Interactive Visual Analysis (IVA) is a set of techniques for combining the computational power of computers with the perceptive and cognitive capabilities of humans, in order to extract knowledge from large and complex datasets. The techniques rely heavily on user interaction and the human visual system, and sit at the intersection of visual analytics and big data. IVA is a branch of data visualization. It is suited to analyzing high-dimensional data with a large number of data points, where simple graphing and non-interactive techniques give an insufficient understanding of the information [Steffen Oeltze, Helmut Doleisch, Helwig Hauser, Gunther Weber. "Interactive Visual Analysis of Scientific Data." Presentation at IEEE VisWeek 2012, Seattle (WA), USA].
These techniques involve looking at datasets through different, correlated views and iteratively selecting and examining features the user finds interesting. The objective of IVA is to gain knowledge which is not readily apparent from a dataset, typically in tabular form. This can involve generating, testing or verifying hypotheses, or simply exploring the dataset to look for correlations between different variables.


History

Focus+context visualization and its related techniques date back to the 1970s [Hauser, Helwig. "Generalizing focus+context visualization." Scientific visualization: The visual extraction of knowledge from data. Springer Berlin Heidelberg, 2006. 305-327]. An early attempt at combining these techniques for Interactive Visual Analysis appeared in 2000 in the WEAVE visualization system for cardiac simulation [Gresh, Donna L., et al. "WEAVE: A system for visually linking 3-D and statistical visualizations, applied to cardiac simulation and measurement data." Proceedings of the conference on Visualization'00. IEEE Computer Society Press, 2000]. SimVis appeared in 2003 [Doleisch, Helmut, Martin Gasser, and Helwig Hauser. "Interactive feature specification for focus+context visualization of complex simulation data." Proceedings of the symposium on Data visualisation 2003. Eurographics Association, 2003], and multiple PhD projects have explored the concept since then, notably those of Helmut Doleisch in 2004 [Doleisch, Helmut. Visual analysis of complex simulation data using multiple heterogenous views. 2004], Johannes Kehrer in 2011 [Kehrer, Johannes. Interactive visual analysis of multi-faceted scientific data. PhD dissertation, Department of Informatics, University of Bergen, Norway, 2011] and Zoltan Konyha in 2013. ComVis, which is used in the visualization community, appeared in 2008.


Basics

The objective of Interactive Visual Analysis is to discover information in data which is not readily apparent. The goal is to move from the data itself to the information contained in the data, ultimately uncovering knowledge which was not apparent from looking at the raw numbers.

The most basic form of IVA uses coordinated multiple views [Roberts, Jonathan C. "State of the art: Coordinated & multiple views in exploratory visualization." Coordinated and Multiple Views in Exploratory Visualization, 2007. CMV'07. Fifth International Conference on. IEEE, 2007] displaying different columns of the dataset. At least two views are required for IVA. The views are usually among the common tools of information visualization, such as histograms, scatterplots or parallel coordinates, but volume-rendered views can also be used where appropriate for the data. Typically, one view displays the independent variables of the dataset (e.g. time or spatial location), while the others display the dependent variables (e.g. temperature, pressure or population density) in relation to each other. If the views are linked, the user can select data points in one view and have the corresponding data points automatically highlighted in the other views. This technique, which allows intuitive exploration of higher-dimensional properties of the data, is known as linking and brushing [Martin, Allen R., and Matthew O. Ward. "High dimensional brushing for interactive exploration of multivariate data." Proceedings of the 6th Conference on Visualization'95. IEEE Computer Society, 1995; Keim, Daniel A. "Information visualization and visual data mining." Visualization and Computer Graphics, IEEE Transactions on 8.1 (2002): 1-8].

The selection made in one of the views does not have to be binary. Software packages for IVA can allow for a gradual “degree of interest” [Doleisch, Helmut, and Helwig Hauser. "Smooth brushing for focus+context visualization of simulation data in 3D." Journal of WSCG 10.1 (2002): 147-154] in the selection, where data points are gradually highlighted as we move from low to high interest. This gives the search for information an inherent “focus+context” [Lamping, John, Ramana Rao, and Peter Pirolli. "A focus+context technique based on hyperbolic geometry for visualizing large hierarchies." Proceedings of the SIGCHI conference on Human factors in computing systems. ACM Press/Addison-Wesley Publishing Co., 1995] aspect. For instance, when examining a tumor in a magnetic resonance imaging dataset, the tissue surrounding the tumor might also be of some interest to the operator.
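
A gradual degree of interest can be sketched as a "smooth brush". The following is a minimal illustration only, assuming a linear falloff around the brushed interval; the names `smooth_brush`, `core` and `margin` and the sample values are hypothetical and not taken from any particular IVA package. Values inside the brushed interval receive a degree of interest of 1, values within the margin fall off linearly, and all others receive 0.

```python
def smooth_brush(values, core, margin):
    """Return a degree of interest (DOI) in [0, 1] for each value.

    core   -- (low, high) interval the user brushed; DOI is 1 inside it
    margin -- width of the linear falloff on each side (assumed > 0)
    """
    low, high = core
    doi = []
    for v in values:
        if low <= v <= high:
            doi.append(1.0)
        else:
            # Distance from the nearest edge of the brushed interval.
            dist = low - v if v < low else v - high
            doi.append(max(0.0, 1.0 - dist / margin))
    return doi


# Hypothetical signal intensities along one scan line of an image.
intensity = [10, 35, 50, 80, 90, 55, 20]

# Brush the bright range 70..100 with a falloff margin of 40.
doi = smooth_brush(intensity, core=(70, 100), margin=40)
```

Because linked views share row indices, the same list of DOI values could drive the opacity or color of the corresponding points in every other view, so that the focus stays prominent while the context remains faintly visible.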


The IVA loop

Interactive Visual Analysis is an iterative process. Discoveries made after brushing the data and looking at the linked views can be used as a starting point for repeating the process, leading to a form of information drill-down. As an example, consider the analysis of data from a simulation of a combustion engine. The user brushes the high end of a histogram of the temperature distribution and discovers, from the linked views, that one specific part of one cylinder has dangerously high temperatures. This information can be used to formulate the hypothesis that all cylinders have a problem with heat dissipation. The hypothesis could be verified by brushing the same region in all other cylinders and seeing in the temperature histogram that these cylinders also have higher temperatures than expected.


Data model

The data source for IVA is usually tabular data, represented in columns and rows. The data variables can be divided into two categories: independent and dependent variables. The independent variables represent the domain of the observed values, such as time and space. The dependent variables represent the data being observed, for instance temperature, pressure or height [Konyha, Zoltan, et al. "Interactive visual analysis of families of function graphs." Visualization and Computer Graphics, IEEE Transactions on 12.6 (2006): 1373-1385]. IVA can help the user uncover information and knowledge in datasets with few dimensions as well as in datasets with a very large number of dimensions.
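
As a rough illustration of this data model, the sketch below stores a table column by column and records which columns are independent and which are dependent. The column names, the values and the helper names `row` and `check_table` are assumptions made for the example.

```python
# Column-oriented table: every column has one entry per row (data point).
table = {
    "time":        [0, 0, 1, 1],               # independent
    "x":           [0.0, 1.0, 0.0, 1.0],       # independent
    "temperature": [20.5, 21.0, 22.5, 23.0],   # dependent
    "pressure":    [1.01, 1.00, 0.99, 0.98],   # dependent
}

independent = ["time", "x"]
dependent = ["temperature", "pressure"]


def row(table, i):
    """Return row i as a dict, i.e. one observation across all columns."""
    return {name: column[i] for name, column in table.items()}


def check_table(table, independent, dependent):
    """Verify that the table is rectangular and every column is classified."""
    lengths = {len(column) for column in table.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same number of rows")
    if set(independent) | set(dependent) != set(table):
        raise ValueError("every column must be independent or dependent")
    return lengths.pop()


n_rows = check_table(table, independent, dependent)
```

Keeping the row index as the common key is what allows a selection made on one column to be looked up in any other column.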


Levels of IVA

The IVA tools can be divided into several levels of complexity. These levels provide the user with different interaction tools to analyze the data. For most uses, the first level is sufficient, and it is also the level that gives the user the fastest response to interaction. The higher levels make it possible to uncover more subtle relationships in the data, but they require more knowledge about the tools, and the interaction has a longer response time.


Base level

The simplest form of IVA is the base level, which consists of brushing and linking. Here the user can set up several views with different dataset variables and mark an interesting area in one of the views. The data points corresponding to the selection are marked automatically in the other views. A lot of information can be derived from this level of IVA. For datasets where the relationships between the variables are reasonably simple, this technique is usually sufficient for the user to achieve the required level of understanding [Konyha, Zoltán, et al. "Interactive visual analysis of families of curves using data aggregation and derivation." Proceedings of the 12th International Conference on Knowledge Management and Knowledge Technologies. ACM, 2012].
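
Base-level brushing and linking can be sketched with plain row indices. This is a simplified illustration under the assumption that every view draws from the same rows; the function names `brush` and `highlight` and the sample values are hypothetical.

```python
def brush(column, low, high):
    """Return the set of row indices whose value lies in [low, high]."""
    return {i for i, v in enumerate(column) if low <= v <= high}


def highlight(column, selected):
    """Split a linked view's column into (focus, context) values."""
    focus = [v for i, v in enumerate(column) if i in selected]
    context = [v for i, v in enumerate(column) if i not in selected]
    return focus, context


# Two views over the same rows: a temperature view and a pressure view.
temperature = [12, 31, 18, 35, 29, 8]
pressure = [1010, 995, 1008, 990, 998, 1015]

# The user drags a brush over the warm end of the temperature view...
selected = brush(temperature, 28, 40)
# ...and the linked pressure view highlights the same rows.
focus, context = highlight(pressure, selected)
```

A rendering layer would draw the `focus` values prominently and the `context` values muted; only the selection logic is shown here.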


Second level

Brushing and linking with logical combination of brushes is a more advanced form of IVA. The user can mark several areas in one or several views and combine them with the logical operators and, or, and not. This makes it possible to explore the dataset more deeply and reveal information that a single brush would hide. A simple example is the analysis of weather data: the analyst might want to find regions that have both warm temperatures and low precipitation.
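
If each brush is represented as a set of row indices, the logical operators map directly onto set operations. The sketch below assumes that representation; the thresholds and weather values are invented for illustration.

```python
def brush(column, low, high):
    """Return the set of row indices whose value lies in [low, high]."""
    return {i for i, v in enumerate(column) if low <= v <= high}


# One row per region (hypothetical values).
temperature = [25, 30, 12, 28, 5, 27]      # degrees Celsius
precipitation = [2, 80, 5, 10, 90, 60]     # millimetres per month
all_rows = set(range(len(temperature)))

warm = brush(temperature, 24, 50)
dry = brush(precipitation, 0, 20)

warm_and_dry = warm & dry        # AND: both brushes must match
warm_or_dry = warm | dry         # OR: either brush may match
not_warm = all_rows - warm       # NOT: complement of a brush
warm_but_not_dry = warm - dry    # AND NOT: a composite of the above
```

Here `warm_and_dry` answers the question from the weather example: it holds the rows of the regions that are both warm and have low precipitation.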


Third level

The logical combination of selections might not be sufficient to uncover meaningful information from the dataset. Several techniques are available that make hidden relationships in the data more apparent. One of these is attribute derivation, which allows the user to derive additional attributes from the data, such as derivatives, clustering information or other statistical properties. In principle, the operator can perform any set of calculations on the raw data. The derived attributes can then be linked and brushed like any other attribute.

The second tool in level three of IVA is advanced brushing, such as angular brushing, similarity brushing or percentile brushing. These brushing tools select data points in a more advanced fashion than plain "point and click" selection. Advanced brushing generates a faster response than attribute derivation, but has a steeper learning curve and requires a deeper understanding of the dataset.
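
Both third-level tools can be illustrated on a small time series. The sketch below derives a rate-of-change attribute with a forward difference and then applies a simplified percentile brush to the derived column. The function names and the exact percentile semantics (the top fraction of non-missing values) are assumptions for this example; actual tools may define these differently.

```python
def derive_rate(values, times):
    """Derive a new attribute: forward-difference rate of change.

    The last row has no successor, so its rate is None.
    """
    rates = []
    for i in range(len(values) - 1):
        rates.append((values[i + 1] - values[i]) / (times[i + 1] - times[i]))
    rates.append(None)
    return rates


def percentile_brush(column, fraction):
    """Select the rows holding the top `fraction` of the non-missing values."""
    ranked = sorted(
        (i for i, v in enumerate(column) if v is not None),
        key=lambda i: column[i],
        reverse=True,
    )
    keep = max(1, round(fraction * len(ranked)))
    return set(ranked[:keep])


times = [0, 1, 2, 3, 4, 5]
temperature = [20, 21, 25, 26, 34, 35]

# Attribute derivation: the new column can be brushed like any other.
heating_rate = derive_rate(temperature, times)

# Advanced brushing: pick the rows with the fastest 40% of heating rates.
fastest = percentile_brush(heating_rate, 0.4)
```

The selected rows mark the time steps where the temperature rises fastest, which is not visible from brushing the raw temperature values alone.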


Fourth level

The fourth level of IVA is specific to each dataset and depends on both the dataset and the purpose of the analysis. Any calculated attribute which is specific to the data under consideration belongs to this category. An example from the analysis of flow data is the detection and categorization of vortices or other structures present in the flow. Fourth-level IVA techniques must therefore be individually tailored to the specific application. After detection of higher-order features, the calculated attributes are connected to the original dataset and subjected to the normal technique of linking and brushing.
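
As a hedged illustration of a domain-specific attribute for flow data, the sketch below computes vorticity on a small 2D grid with central differences and attaches it as a brushable column. Thresholding vorticity is used here only as a simple stand-in; it is not a complete vortex detection method, and the grid, the test field and the threshold are assumptions made for the example.

```python
def vorticity(u, v, spacing=1.0):
    """Central-difference curl dv/dx - du/dy on interior grid cells.

    u[j][i] and v[j][i] are velocity components at row j (y) and column i (x).
    Border cells are left as None because they lack two neighbours.
    """
    ny, nx = len(u), len(u[0])
    w = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dv_dx = (v[j][i + 1] - v[j][i - 1]) / (2 * spacing)
            du_dy = (u[j + 1][i] - u[j - 1][i]) / (2 * spacing)
            w[j][i] = dv_dx - du_dy
    return w


n = 5
# Hypothetical test field: rigid rotation about the grid centre (2, 2).
u = [[-(j - 2.0) for i in range(n)] for j in range(n)]
v = [[(i - 2.0) for i in range(n)] for j in range(n)]

w = vorticity(u, v)

# Connect the derived feature to the table: one value per grid cell.
cells = [(i, j) for j in range(n) for i in range(n)]
vorticity_column = [w[j][i] for (i, j) in cells]

# The new column is brushed like any other attribute.
rotating = {k for k, value in enumerate(vorticity_column)
            if value is not None and abs(value) > 1.0}
```

Once the derived column exists, the selection in `rotating` can be linked to spatial or statistical views in the same way as a selection on a raw attribute.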


Patterns of IVA

The "linking and brushing" (selection) concept of IVA can be applied between different types of variables in the dataset. Which pattern to use depends on which aspect of the correlations in the dataset is of interest [Oeltze, Steffen, et al. "Interactive visual analysis of perfusion data." Visualization and Computer Graphics, IEEE Transactions on 13.6 (2007): 1392-1399].


Feature localization

Brushing data points in the dependent variables (e.g. temperature) and seeing where among the independent variables (e.g. space or time) these data points show up is called "feature localization". With feature localization, the user can easily identify the location of features in the dataset. In a meteorological dataset, examples would be finding which regions have a warm climate or which times of the year have a lot of precipitation.
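
Feature localization can be sketched as a brush on a dependent column followed by a lookup in the independent columns. The data, the threshold and the helper name `localize` below are invented for illustration.

```python
def localize(independent_column, rows):
    """Count how many brushed rows fall on each independent value."""
    counts = {}
    for i in rows:
        key = independent_column[i]
        counts[key] = counts.get(key, 0) + 1
    return counts


# One row per (region, month) observation; values are made up.
region = ["north", "north", "coast", "coast", "south", "south"]
month = ["Jan", "Jul", "Jan", "Jul", "Jan", "Jul"]
temperature = [-5, 14, 4, 19, 12, 31]

# Brush the dependent variable: temperatures of 12 degrees or more.
warm_rows = [i for i, t in enumerate(temperature) if t >= 12]

# Read off where the brushed rows fall in the independent variables.
warm_by_region = localize(region, warm_rows)
warm_by_month = localize(month, warm_rows)
```

In a real tool the result would appear as highlighted areas on a map or a timeline rather than as counts; the counts only make the direction of the lookup explicit.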


Local investigation

If independent variables are brushed and we look for the corresponding connection to a dependent view, this is termed "local investigation". This makes it possible to investigate the characteristics of, for example, a specific region or a specific time. In the case of meteorological data, we could for instance examine the temperature distribution during the winter months.
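
Local investigation runs in the opposite direction: the brush is placed on an independent column and the dependent values of the selected rows are inspected. The sketch below assumes months encoded as numbers, with December to February treated as winter; the values and the helper name `summarize` are hypothetical.

```python
def summarize(column, rows):
    """Summarize the dependent values of the brushed rows."""
    values = sorted(column[i] for i in rows)
    return {
        "min": values[0],
        "max": values[-1],
        "mean": sum(values) / len(values),
    }


# One row per monthly observation (made-up values).
month = [12, 1, 2, 6, 7, 8, 12, 1]
temperature = [-3, -8, -5, 18, 22, 20, 2, -6]

# Brush the independent variable: the winter months.
winter_rows = [i for i, m in enumerate(month) if m in (12, 1, 2)]

# Inspect the dependent variable for the brushed rows only.
winter = summarize(temperature, winter_rows)
```

A summary is used here in place of a highlighted histogram; the selection step is the same in both cases.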


Multivariate analysis

Brushing dependent variables and watching the connection to other dependent variables is called multivariate analysis. This could for example be used to find out if high temperatures are correlated with pressure by brushing high temperatures and watching a linked view of pressure distributions. Since each of the linked views usually has two or more dimensions, multivariate analysis can implicitly uncover higher-dimensional features of the data which would not be readily apparent from e.g. a simple scatterplot.
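
The temperature and pressure example can be sketched by brushing one dependent column and comparing the histogram of another dependent column for the brushed rows against the histogram for all rows. The values, bin edges and the helper name `histogram` are assumptions for illustration.

```python
def histogram(values, edges):
    """Count values per half-open bin [edges[k], edges[k + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for k in range(len(counts)):
            if edges[k] <= v < edges[k + 1]:
                counts[k] += 1
                break
    return counts


# Two dependent variables measured on the same rows (made-up values).
temperature = [10, 12, 30, 33, 15, 35, 11, 31]
pressure = [1020, 1018, 1002, 998, 1016, 996, 1022, 1000]

# Brush one dependent variable: high temperatures.
hot = {i for i, t in enumerate(temperature) if t >= 30}

# Watch the linked view of the other dependent variable.
edges = [990, 1000, 1010, 1020, 1030]
all_counts = histogram(pressure, edges)
hot_counts = histogram([pressure[i] for i in sorted(hot)], edges)
```

In this invented dataset the brushed rows fall only into the two lowest pressure bins, which is the kind of pattern that would suggest a relationship between high temperature and low pressure and prompt further investigation.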


Applications

Concepts from Interactive Visual Analysis have been implemented in multiple software packages, both for research and for commercial purposes. ComVis is often used by visualization researchers in academia [Matkovic, Krešimir, et al. "ComVis: A coordinated multiple views system for prototyping new visualization technology." Information Visualisation, 2008. IV'08. 12th International Conference. IEEE, 2008], while SimVis is optimized for analyzing simulation data [Doleisch, Helmut. "SimVis: Interactive visual analysis of large and time-dependent 3D simulation data." Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come. IEEE Press, 2007]. Tableau is another example of a commercial software product utilizing concepts from IVA.


See also

* Information visualization
* Data mining
* Scientific visualization
* Big data
* Data visualization
* Visual analytics
* Interaction design
* Interactivity

