Novel Machine Learning Methods for Extraction of Features Characterizing Complex Datasets and Models

Thursday, October 25, 2018 - 11:10am - 12:00pm
Keller 3-180
Velimir Vesselinov (Los Alamos National Laboratory)
The integration of large datasets and powerful computational capabilities has resulted in the widespread use of machine learning (ML) in science, technology, and industry. However, most recent ML developments focus on supervised methods, which require large training sets. These supervised ML methods are not well suited to science-driven data applications, where the availability of training sets is typically very limited. Supervised ML methods are also affected by adversarial problems, which can cause inaccurate ML predictions when random noise is added to the training data. Instead, unsupervised ML methods are generally preferred for data-analytics problems. Unsupervised ML methods can be applied to feature extraction, blind source separation, model diagnostics, detection of disruptions and anomalies, image recognition, discovery of unknown dependencies and phenomena represented in the datasets, and development of physics and reduced-order models representing the data. Recently, we have developed a series of novel unsupervised ML methods based on matrix and tensor factorizations, called NMFk and NTFk. These techniques are powerful tools for objective, unbiased data analyses that extract essential features hidden in data. Our methodology is capable of identifying the unknown number of features characterizing the analyzed datasets, as well as the spatial footprints and temporal signatures of the features in the explored domain.
Here, we present (1) a detailed discussion of the developed methodology, (2) extensive testing and verification of the methods and computational tools, and (3) a series of applications. The applications span a diverse set of problems, including climate modeling, fluid and geothermal extraction, material characterization, polymer phase transitions, groundwater contamination transport, and fast irreversible bimolecular reactions.
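
To make the factorization idea in the abstract concrete, the following is a minimal sketch, not the NMFk algorithm presented in the talk. It is a plain nonnegative matrix factorization (multiplicative updates for the Frobenius norm) combined with a simple scan of the reconstruction error over candidate numbers of features k, used here as a simplified stand-in for estimating the unknown number of features. The function names (nmf, relative_error, scan_k), the synthetic data, and the choice of three hidden features are all assumptions made for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=1000, seed=0, eps=1e-9):
    """Factor a nonnegative matrix X (n x m) as W (n x k) @ H (k x m).

    Uses the standard multiplicative updates for the Frobenius norm.
    This is generic NMF, an illustrative stand-in, not NMFk itself.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(X, W, H):
    """Relative Frobenius reconstruction error of the factorization."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)


def scan_k(X, k_values, **kwargs):
    """Reconstruction error for each candidate number of features k."""
    return {k: relative_error(X, *nmf(X, k, **kwargs)) for k in k_values}


# Synthetic example (assumed data): 3 hidden features mixed into
# 40 observation points over 30 time steps.
rng = np.random.default_rng(42)
W_true = rng.random((40, 3))   # hypothetical spatial footprints
H_true = rng.random((3, 30))   # hypothetical temporal signatures
X = W_true @ H_true

errors = scan_k(X, range(1, 6))
for k, err in errors.items():
    print(f"k={k}: relative error {err:.4f}")
```

In a scan like this, one would look for the value of k beyond which the error stops improving appreciably. The columns of W then play the role of spatial footprints and the rows of H the role of temporal signatures. Reconstruction error alone is a crude criterion, and the result depends on the random initialization, so this should be read as an illustration of the concept rather than a reproduction of the methodology.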