In everyday language, “astronomical” means “very large” as often as it means having to do with astronomy. That’s for good reason: the known universe contains a lot of stars, galaxies, and other objects. Modern observations gather huge amounts of data on those objects, requiring researchers to come up with new ways to process that information. Astrostatistics is the way astronomers assess the reliability of their measurements, quantify the uncertainties in theoretical models, and turn the raw numbers from observations into something useful.

The galaxy cluster SDSSJ0150+2725 as seen by NASA's Hubble Space Telescope. This cluster was identified by the Sloan Digital Sky Survey (SDSS), which has cataloged millions of objects, requiring the development of powerful statistical techniques to track them all.

Credit: ESA/Hubble & NASA; Judy Schmidt

Responding to a Universe of Data

The beautiful images coming from telescopes are only a small part of the scientific story. Underneath their amazing colors are numbers, and lots of them. The colors and brightness of the various pixels are part of the data collected by telescopes, and it’s that data that contains the scientific information astronomers need.

With ever-larger observatories and more emphasis on monitoring big swaths of the sky for long periods of time, astronomical datasets are growing very quickly. Researchers are developing more powerful statistical methods to handle this data and to draw the big picture out of all the details. Astrostatistics puts numbers on what we learn from these observations and, just as importantly, on how reliable both the measurements and our interpretations are. Along with machine learning, astrostatistics allows astronomers to find patterns that might otherwise be missed.

The advent of “big data” in astronomy came through large-scale surveys of galaxies, hunts for exoplanets, maps of the cosmic microwave background, and other observations where the goal is to study many objects simultaneously, rather than as individuals. For example, with approximately 100 billion galaxies in the known universe, astrostatistics allows researchers to study rare events, the kind that might occur in only one out of every billion galaxies.
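To see why survey size matters for rare events, here is a minimal sketch in Python. It uses the round numbers quoted above and assumes, purely for illustration, that each galaxy hosts the event independently at the same rate, so the counts follow Poisson statistics. The function names and the survey sizes are hypothetical and do not describe any particular survey.

```python
import math


def expected_count(n_galaxies: float, rate_per_galaxy: float) -> float:
    """Expected number of events among n_galaxies, assuming a fixed per-galaxy rate."""
    return n_galaxies * rate_per_galaxy


def prob_at_least_one(n_galaxies: float, rate_per_galaxy: float) -> float:
    """Poisson probability of catching at least one event: 1 - exp(-expected count)."""
    return -math.expm1(-expected_count(n_galaxies, rate_per_galaxy))


# Illustrative rate taken from the text: one event per billion galaxies.
RATE = 1e-9

# Hypothetical survey sizes, up to the roughly 100 billion galaxies quoted above.
for n in (1e6, 1e9, 100e9):
    print(
        f"{n:.0e} galaxies: expect {expected_count(n, RATE):.3g} events, "
        f"P(at least one) = {prob_at_least_one(n, RATE):.3g}"
    )
```

Under these assumptions, the full population of 100 billion galaxies would contain about 100 such events, while a survey of a million galaxies would have only about a 0.1 percent chance of catching even one. That gap is why rare-event science depends on surveys that cover very large numbers of objects.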

Next-generation observatories such as NASA's Transiting Exoplanet Survey Satellite (TESS) and the Vera C. Rubin Observatory (formerly known as the Large Synoptic Survey Telescope, or LSST) are designed to produce maps of large chunks of the sky on a regular basis to find exoplanets and other objects. With many terabytes of data coming through during each period of observation, advanced techniques in astrostatistics are necessary to turn this information into something astronomers can use.