Feature Story | Mathematics and Computer Science

Using statistics to better understand our world

“The best thing about being a statistician is that you get to play in everybody’s backyard.” So stated American mathematician John Wilder Tukey (1915-2000).

Researchers from Argonne’s Mathematics and Computer Science (MCS) division may well agree. Five MCS division members recently participated in the Joint Statistical Meetings annual conference (JSM2018). Their topics spanned a broad landscape, from spectral analysis of wind data to score (likelihood) equations to text analysis and natural language processing.

“Statistics are often regarded as dull, but I find them fascinating,” said Julie Bessac, an assistant computational statistician in the MCS division. Bessac’s “backyard” includes collaboration with colleagues at the National Center for Atmospheric Research in Colorado; the University of Victoria in British Columbia; and the Scripps Institution of Oceanography and the Meteorological Institute of the University of Bonn, Germany. At the JSM2018 conference, Bessac discussed the stochastic framework she and her colleagues have developed to parameterize the influence of turbulent phenomena occurring at scales finer than standard ones. “We used the framework to capture random effects and model the error in using area-averaged wind speeds as opposed to the fine-scale turbulent flux,” Bessac said.

Wind data was also used in a statistical study presented by Charlotte Haley, together with Chris Geoga and Mihai Anitescu, all from the MCS division. Haley, who like Bessac is an assistant computational statistician, described the analysis she and her colleagues had done using spectral methods in place of the conventional covariance method. After formulating two different expansions of the covariance kernel, the researchers compared the results on both simulated data and wind speed measurements having one space and one time dimension.

Nathan Wycoff, a summer student in Argonne’s MCS division and graduate student at Virginia Tech, presented a topic model approach to text analysis and natural language processing in unsupervised data analysis. Topic models assume that the documents being studied can be explained by a relatively small number of topics, or probability distributions of words. These topics are commonly dominated by high-frequency words that often have little semantic meaning. Wycoff and his colleagues from Virginia Tech have devised a method that allows the user to guide and iteratively refine the topic model through the use of term weights inferred based on user interaction with a 2D visualization. The final topic model thus is a synergy of the user’s expertise and the structure present in the text data.
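To make the idea of a topic as a probability distribution over words concrete, here is a minimal Python sketch. It is not the method Wycoff and his colleagues presented and it does not fit a topic model: it builds a single word distribution from a made-up toy corpus, then shows how down-weighting frequent, low-meaning words changes which words dominate. The corpus, the function names and the term weights are all hypothetical, and the weights are set by hand here rather than inferred from user interaction.

```python
from collections import Counter

# Toy corpus, invented for illustration (not data from the study).
docs = [
    "the wind speed and the wind direction",
    "the model and the data and the error",
    "the wind model",
]


def word_distribution(texts, weights=None):
    """Return a probability distribution over words, optionally reweighted."""
    counts = Counter(word for text in texts for word in text.split())
    weights = weights or {}
    # Words without an explicit weight keep a weight of 1.0.
    weighted = {w: c * weights.get(w, 1.0) for w, c in counts.items()}
    total = sum(weighted.values())
    return {w: v / total for w, v in weighted.items()}


def top_words(dist, k=3):
    """Return the k most probable words, breaking ties alphabetically."""
    return sorted(dist, key=lambda w: (-dist[w], w))[:k]


# Raw frequencies: common function words dominate.
unweighted = word_distribution(docs)

# Hypothetical hand-set term weights that shrink two frequent words.
term_weights = {"the": 0.1, "and": 0.1}
weighted = word_distribution(docs, term_weights)

print(top_words(unweighted))
print(top_words(weighted))
```

With raw counts, words such as "the" and "and" lead the list; once they are down-weighted, content words such as "wind" and "model" rise to the top.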

In addition to the MCS presenters at the conference, two people from Argonne’s Environmental Systems Division — V. Rao Kotamarthi and Jiali Wang — delivered a talk on statistical models for high-dimensional computer output.

The Joint Statistical Meetings publishes its proceedings every year in the Journal of the American Statistical Association. Abstracts of the presentations are currently available online at https://ww2.amstat.org/meetings/jsm/2018.