Distribution-based Summarization for Large Scale Simulation Data Visualization and Analysis

Author : Ko-Chih Wang
Total Pages : 170
Release : 2019
OCLC : 1156324155

Book Synopsis: Distribution-based Summarization for Large Scale Simulation Data Visualization and Analysis, by Ko-Chih Wang

Distribution-based Summarization for Large Scale Simulation Data Visualization and Analysis, written by Ko-Chih Wang, was released in 2019 and runs 170 pages. Book excerpt:

The advent of high-performance supercomputers enables scientists to perform extreme-scale simulations that generate millions of cells and thousands of time steps. By exploring and analyzing the simulation outputs, scientists can gain a deeper understanding of the modeled phenomena. When the simulation output is small, the common practice is simply to move the data to the machines that perform post-hoc analysis. As the data grows, however, the limited bandwidth and capacity of the networking and storage devices that connect the supercomputer to the analysis machine become a major bottleneck. Visualizing and analyzing large-scale simulation datasets therefore poses significant challenges.

This dissertation addresses the big data challenge with distribution-based in situ techniques. These techniques use the same supercomputer resources that run the simulation to analyze the raw data and generate compact data proxies, which use distributions to summarize the raw data statistically. Only the compact proxies are moved to the post-analysis machine, which avoids the bottleneck. Because the distribution-based representation preserves the statistical properties of the data, it has the potential to support flexible post-hoc analysis and to enable uncertainty quantification.

We first focus on the problem of large-data volume rendering on resource-limited post-analysis machines. To cope with limited I/O bandwidth and storage space, distributions are used to summarize the data. When visualizing the data, importance sampling is proposed to draw a small number of samples and minimize the demand for computational power. The error of the proxies is quantified and presented visually to scientists through uncertainty animation.

We also tackle the problem of reducing the error introduced when spatial information is approximated in distribution-based representations. This error can lower visualization quality and hinder data exploration. The basic distribution-based approach is augmented with our proposed spatial distribution, which is represented by a three-dimensional Gaussian Mixture Model (GMM). The new representation not only improves visualization quality but can also be used in various visualization techniques, such as volume rendering, uncertain isosurfaces, and salient feature exploration.

Next, a technique is developed for large-scale time-varying datasets. This representation stores the time-varying datasets at a lower temporal resolution and uses temporal coherence to reconstruct the data at non-sampled time steps. Each pixel ray of a view at a non-sampled time step is decoupled into a value distribution and the samples' location information. Our representation uses data coherence to recover the samples' location information, so less data needs to be stored. In addition, similar value distributions from multiple rays are represented by a single distribution to save further storage.

Finally, a statistics-based super-resolution technique is proposed to solve the big data problem caused by a huge parameter space. Simulation runs for a few parameter samples output full-resolution data, which is used to build prior knowledge. Data from the remaining simulation runs in the parameter space is statistically down-sampled in situ to a compact representation to reduce the data size. These compact representations can then be reconstructed to high resolution for data analysis by combining them with the prior knowledge.
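The basic idea of a distribution-based data proxy described in the excerpt can be illustrated with a minimal sketch. It assumes a one-dimensional scalar field split into fixed-size blocks, with each block summarized by a fixed-bin histogram; the excerpt does not say which distribution model is used for value summaries, so the histogram is only one possible choice. All function names and parameters here (`summarize_block`, `summarize_field`, `sample_from_summary`, `bins`, `lo`, `hi`) are hypothetical and are not taken from the dissertation.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def summarize_block(values, bins, lo, hi):
    """Summarize one block of raw values as histogram counts (the proxy)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so values at or beyond the range edges land in the end bins.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return counts


def summarize_field(field, block_size, bins, lo, hi):
    """Split a 1-D field into blocks and keep only one histogram per block."""
    return [
        summarize_block(field[i:i + block_size], bins, lo, hi)
        for i in range(0, len(field), block_size)
    ]


def sample_from_summary(counts, lo, hi, n, rng):
    """Draw n values from a block histogram (post-hoc reconstruction).

    A bin is chosen with probability proportional to its count, then a
    value is drawn uniformly inside that bin. Spatial positions within
    the block are not recovered by this simple sketch.
    """
    bins = len(counts)
    width = (hi - lo) / bins
    cdf = list(accumulate(counts))
    total = cdf[-1]
    samples = []
    for _ in range(n):
        u = rng.random() * total
        b = min(bisect_right(cdf, u), bins - 1)
        samples.append(lo + (b + rng.random()) * width)
    return samples
```

In this sketch only the per-block counts would be moved to the analysis machine. Because the histogram discards where each value sat inside its block, samples drawn from it carry no spatial information, which is the kind of approximation error the excerpt says the spatial GMM is meant to reduce.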


Distribution-based Summarization for Large Scale Simulation Data Visualization and Analysis Related Books

Distribution-based Summarization for Large Scale Simulation Data Visualization and Analysis
Language: en
Pages: 170
Authors: Ko-Chih Wang
Categories: Databases
Type: BOOK - Published: 2019 - Publisher:

The advent of high-performance supercomputers enables scientists to perform extreme-scale simulations that generate millions of cells and thousands of time step
Statistical and Machine Learning Approaches for Visualizing and Analyzing Large-scale Simulation Data
Language: en
Pages: 168
Authors: Subhashis Hazarika
Categories: Information visualization
Type: BOOK - Published: 2019 - Publisher:

This dissertation addresses three broad categories of data analysis and visualization challenges: (i) multivariate distribution-based data summarization, (ii) u
Data Summarization for Large Time-varying Flow Visualization and Analysis
Language: en
Pages: 171
Authors: Chun-Ming Chen
Categories:
Type: BOOK - Published: 2016 - Publisher:

The rapid growth of computing power has expedited scientific simulations which can now generate data in unprecedentedly high quality and quantity. However, this
In Situ Visualization for Computational Science
Language: en
Pages: 464
Authors: Hank Childs
Categories: Mathematics
Type: BOOK - Published: 2022-05-04 - Publisher: Springer Nature

This book provides an overview of the emerging field of in situ visualization, i.e. visualizing simulation data as it is generated. In situ visualization is a p
Introduction to Data Science
Language: en
Pages: 794
Authors: Rafael A. Irizarry
Categories: Mathematics
Type: BOOK - Published: 2019-11-20 - Publisher: CRC Press

Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis ch