Astrostatistical Challenges for the New Astronomy presents a collection of monographs authored by several of the discipline's leading astrostatisticians, i.e., researchers from the fields of statistics and astronomy-astrophysics who work in the statistical analysis of astronomical and cosmological data. Eight of the ten monographs are enhancements of presentations given by the authors as invited or special-topics papers in astrostatistics at the ISI World Statistics Congress (2011, Dublin, Ireland). The opening chapter, by the editor, was adapted from an invited seminar given at Los Alamos National Laboratory (2011) on the history and current state of the discipline; the second chapter, by Thomas Loredo, was adapted from his invited presentation at the Statistical Challenges in Modern Astronomy V conference (2011, Pennsylvania State University), presenting insights regarding frequentist and Bayesian methods of estimation in astrostatistical analysis. The remaining monographs are research papers discussing various topics in astrostatistics. The monographs provide the reader with an excellent overview of the current state of astrostatistical research and offer guidelines as to subjects of future research. The lead authors of the chapters are, in order, Joseph M. Hilbe (Jet Propulsion Laboratory and Arizona State Univ); Thomas J. Loredo (Dept of Astronomy, Cornell Univ); Stefano Andreon (INAF-Osservatorio Astronomico di Brera, Italy); Martin Kunz (Institute for Theoretical Physics, Univ of Geneva, Switz); Benjamin Wandelt (Institut d'Astrophysique de Paris, Univ Pierre et Marie Curie, France); Roberto Trotta (Astrophysics Group, Dept of Physics, Imperial College London, UK); Phillip Gregory (Dept of Astronomy, Univ of British Columbia, Canada); Marc Henrion (Dept of Mathematics, Imperial College, London, UK); Asis Kumar Chattopadhyay (Dept of Statistics, Univ of Calcutta, India); Marisa March (Astrophysics Group, Dept of Physics, Imperial College, London, UK).
This volume contains a selection of chapters based on papers to be presented at the Fifth Statistical Challenges in Modern Astronomy Symposium. The symposium will be held June 13-15th at Penn State University. Modern astronomical research faces a vast range of statistical issues which have spawned a revival in methodological activity among astronomers. The Statistical Challenges in Modern Astronomy V conference will bring astronomers and statisticians together to discuss methodological issues of common interest. Time series analysis, image analysis, Bayesian methods, Poisson processes, nonlinear regression, maximum likelihood, multivariate classification, and wavelet and multiscale analyses are all important themes to be covered in detail. Many problems will be introduced at the conference in the context of large-scale astronomical projects including LIGO, AXAF, XTE, Hipparcos, and digitized sky surveys.
"As telescopes, detectors, and computers grow ever more powerful, the volume of data at the disposal of astronomers and astrophysicists will enter the petabyte domain, providing accurate measurements for billions of celestial objects. This book provides a comprehensive and accessible introduction to the cutting-edge statistical methods needed to efficiently analyze complex data sets from astronomical surveys such as the Panoramic Survey Telescope and Rapid Response System, the Dark Energy Survey, and the upcoming Large Synoptic Survey Telescope. It serves as a practical handbook for graduate students and advanced undergraduates in physics and astronomy, and as an indispensable reference for researchers. The updates in this new edition will include fixing "code rot," correcting errata, and adding some new sections. In particular, the new sections include new material on deep learning methods, hierarchical Bayes modeling, and approximate Bayesian computation. Statistics, Data Mining, and Machine Learning in Astronomy presents a wealth of practical analysis problems, evaluates techniques for solving them, and explains how to use various approaches for different types and sizes of data sets. For all applications described in the book, Python code and example data sets are provided. The supporting data sets have been carefully selected from contemporary astronomical surveys (for example, the Sloan Digital Sky Survey) and are easy to download and use. The accompanying Python code is publicly available, well documented, and follows uniform coding standards. Together, the data sets and code enable readers to reproduce all the figures and examples, evaluate the methods, and adapt them to their own fields of interest"--
This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminology for astronomical phenomena will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for astronomical problems, including regression techniques, along with their usefulness for data set problems related to size and dimension. Analysis of missing data is an important part of the book because of its significance for work with astronomical data. Both existing and new techniques related to dimension reduction and clustering are illustrated through examples. There is detailed coverage of applications useful for classification, discrimination, data mining and time series analysis. Later chapters explain simulation techniques useful for the development of physical models where it is difficult or impossible to collect data. Finally, coverage of the many R programs for techniques discussed makes this book a fantastic practical reference. Readers may apply what they learn directly to their data sets in addition to the data sets included by the authors.
With the onset of massive cosmological data collection through media such as the Sloan Digital Sky Survey (SDSS), galaxy classification has been accomplished for the most part with the help of citizen science communities like Galaxy Zoo. Seeking the wisdom of the crowd for such Big Data processing has proved extremely beneficial. However, an analysis of one of the Galaxy Zoo morphological classification data sets has shown that a significant majority of all classified galaxies are labelled as “Uncertain”. This book reports on how to use data mining, more specifically clustering, to identify galaxies for which the public has shown some degree of uncertainty as to whether they belong to one morphology type or another. The book shows the importance of transitions between different data mining techniques in an insightful workflow. It demonstrates that clustering makes it possible to identify discriminating features in the analysed data sets, adopting a novel feature selection algorithm called Incremental Feature Selection (IFS). The book shows the use of state-of-the-art classification techniques, Random Forests and Support Vector Machines, to validate the acquired results. It is concluded that a vast majority of these galaxies are, in fact, of spiral morphology, with a small subset potentially consisting of stars, elliptical galaxies or galaxies of other morphological variants.
Statistical literacy is critical for the modern researcher in Physics and Astronomy. This book empowers researchers in these disciplines by providing the tools they will need to analyze their own data. Chapters in this book provide a statistical base from which to approach new problems, including numerical advice and a profusion of examples. The examples are engaging analyses of real-world problems taken from modern astronomical research. The examples are intended to be starting points for readers as they learn to approach their own data and research questions. Acknowledging that scientific progress now hinges on the availability of data and the possibility of improving previous analyses, data and code are distributed throughout the book. The JAGS symbolic language used throughout the book makes it easy to perform Bayesian analysis and is particularly valuable as readers may use it in a myriad of scenarios through slight modifications. This book is comprehensive, well written, and will surely be regarded as a standard text in both astrostatistics and physical statistics. Joseph M. Hilbe, President, International Astrostatistics Association, Professor Emeritus, University of Hawaii, and Adjunct Professor of Statistics, Arizona State University
This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear and mixed or hierarchical models, as well as additional types of models such as ABC and INLA. The book provides code that is largely unavailable elsewhere and includes details on interpreting and evaluating Bayesian models. Initial discussions offer models in synthetic form so that readers can easily adapt them to their own data; later the models are applied to real astronomical data. The consistent focus is on hands-on modeling, analysis of data, and interpretations that address scientific questions. A must-have for astronomers, its concrete approach will also be attractive to researchers in the sciences more generally.
Modern astronomers encounter a vast range of challenging statistical problems, yet few are familiar with the wealth of techniques developed by statisticians. Conversely, few statisticians deal with the compelling problems confronted in astronomy. Astrostatistics bridges this gap. Authored by a statistician-astronomer team, it provides professionals and advanced students in both fields with exposure to issues of mutual interest. In the first half of the book the authors introduce statisticians to stellar, galactic, and cosmological astronomy and discuss the complex character of astronomical data. For astronomers, they introduce the statistical principles of nonparametrics, multivariate analysis, time series analysis, density estimation, and resampling methods. The second half of the book is organized by statistical topic. Each chapter contains examples of problems encountered in astronomical research and highlights methodological issues. The final chapter explores some controversial issues in astronomy that have a strong statistical component. The authors provide an extensive bibliography and references to software for implementing statistical methods. The "marriage" of astronomy and statistics is a natural one and benefits both disciplines. Astronomers need the tools and methods of statistics to interpret the vast amount of data they generate, and the issues related to astronomical data pose intriguing challenges for statisticians. Astrostatistics paves the way to improved statistical analysis of astronomical data and provides a common ground for future collaboration between the two fields.
Practical Guide to Logistic Regression covers the key points of the basic logistic regression model and illustrates how to use it properly to model a binary response variable. This powerful methodology can be used to analyze data from various fields, including medical and health outcomes research, business analytics and data science, ecology, fisheries, astronomy, transportation, insurance, economics, recreation, and sports. By harnessing the capabilities of the logistic model, analysts can better understand their data, make appropriate predictions and classifications, and determine the odds of one value of a predictor compared to another. Drawing on his many years of teaching logistic regression, using logistic-based models in research, and writing about the subject, Professor Hilbe focuses on the most important features of the logistic model. Serving as a guide between the author and readers, the book explains how to construct a logistic model, interpret coefficients and odds ratios, predict probabilities and their standard errors based on the model, and evaluate the model as to its fit. Using a variety of real data examples, mostly from health outcomes, the author offers a basic step-by-step guide to developing and interpreting observation and grouped logistic models as well as penalized and exact logistic regression. He also gives a step-by-step guide to modeling Bayesian logistic regression. R statistical software is used throughout the book to display the statistical models while SAS and Stata codes for all examples are included at the end of each chapter. The example code can be adapted to readers’ own analyses. All the code is available on the author’s website.
Our understanding of the universe has been revolutionized by observations of the cosmic microwave background, the large-scale structure of the universe, and distant supernovae. These studies have shown that we are living in a strange universe: 96% of the present-day energy density of the universe is in the form of so-called dark matter and dark energy. But we still do not know what dark matter and dark energy actually are. This book presents lectures from the 186th Course in the Enrico Fermi International School of Physics entitled New Horizons for Observational Cosmology, held in Varenna, Italy, in July 2013. Topics covered at this school included: cosmic microwave background anisotropies; galaxy clustering; weak lensing; dark energy; dark matter; inflation; modified gravity; neutrino physics; reionization; galaxy formation; and first stars. The anticipated release of Planck data at the end of 2014 will provide a more complete view of temperature anisotropy of the cosmic microwave background, and the reporting of other important results is also expected soon. These new data will undoubtedly address fundamental questions about the universe. This book prepares the ground for future work which may answer some of these exciting questions.
"This entry-level text offers clear and concise guidelines on how to select, construct, interpret, and evaluate count data. Written for researchers with little or no background in advanced statistics, the book presents treatments of all major models using numerous tables, insets, and detailed modeling suggestions. It begins by demonstrating the fundamentals of linear regression and works up to an analysis of the Poisson and negative binomial models, and to the problem of overdispersion. Examples in Stata, R, and SAS code enable readers to adapt models for their own purposes, making the text an ideal resource for researchers working in public health, ecology, econometrics, transportation, and other related fields"--