As astronomers build ever-larger observatories capable of seeing more objects in the sky, the amount of data they collect has gone beyond what humans can analyze without help. Instead, researchers teach computers to sift through the data, identifying important patterns and connections that might otherwise be missed. This process is called machine learning, and it’s an essential aspect of modern astronomy at the Center for Astrophysics.

This view of the sky seen by the Pan-STARRS1 telescope in Hawaii contains 3 billion astronomical sources, ranging from nearby asteroids to distant galaxies. Researchers use machine learning techniques to help analyze the 2 petabytes of data collected.

Credit: Danny Farrow, Pan-STARRS1 Science Consortium and Max Planck Institute for Extraterrestrial Physics

A Universe of Data

Machine learning plays a huge role when cataloguing large numbers of anything, like galaxies in surveys of the whole sky. Computers can learn to identify and classify galaxy types, find transient events like supernovas, and pick out features in galaxy clusters. With thousands of potential sources from large surveys, sorting through them by hand would take humans far too much time, which could be better spent on other tasks. Machine learning shifts the effort from astronomers to computers, which excel at tedious detail-oriented tasks. Computer algorithms can also apply what we’ve learned from high-resolution observational data to improve lower-resolution images, allowing astronomers to reconstruct what an object might look like through a more powerful telescope.

In addition, machine learning is essential for “time-domain astronomy”: looking for events that change during observation. Those include:

  • Hunts for exoplanets, which are planets orbiting other stars. When these worlds pass between their host stars and Earth, they block a small amount of light. Tracking how long the dimming lasts and how much light is blocked provides information about the planet’s size and orbit. Several exoplanets have been identified using machine learning, including a few in multiple-planet systems, where the signals are hard for a human to distinguish.

  • Tracking changes in the light from stars. Some stars are extremely “active”, producing flares at unpredictable intervals. Others are variable, changing brightness as they expand and contract. Computers are well suited to catching these variations, which can be subtle and easy to lose in the sheer amount of data that must be searched to find them.

  • Studying patterns in the “weather” on our own Sun. Multiple observatories monitor the Sun literally all the time, which produces a vast amount of data to sift through. Machines are very good at finding important fluctuations in the Sun’s activity within that mass of information.

  • Finding and classifying supernovas. The explosions of stars and white dwarfs are random, unpredictable occurrences for all practical purposes. Catching these transient events requires sifting through data that’s often collected for different purposes, such as catalogs of galaxies.

  • Locating asteroids, comets, and other faint objects in the Solar System. These bodies show up as transients in other astronomical images. Computer searches are well suited to find them in large catalogs of data.

  • Using machine learning to classify objects in astronomical images. For example, in a telescope image of a galaxy or cluster of stars, the resolution might not be good enough to distinguish individual stars. However, computer processing can pick out light from different stars in the population, similar to the way low-resolution photos can be sharpened to reveal details that can’t be seen in the original.
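
The exoplanet item at the top of this list rests on simple geometry: the fraction of starlight blocked during a transit is roughly the area of the planet’s disk divided by the area of the star’s disk, or (planet radius / star radius) squared. The short Python sketch below illustrates that relationship. It is not the method used by any particular survey or CfA team. The brightness values, the 0.005 threshold, and the function names are all invented for illustration, and using the brightest sample as the baseline only works because this toy data has no noise.

```python
import math


def transit_depth(planet_radius, star_radius):
    """Fraction of starlight blocked when the planet crosses the star's disk."""
    return (planet_radius / star_radius) ** 2


def radius_ratio_from_depth(depth):
    """Invert the relation above: planet radius divided by star radius."""
    return math.sqrt(depth)


def measure_transit(flux, threshold=0.005):
    """Estimate transit depth and duration (in samples) from a light curve.

    Assumption: the brightest sample is the out-of-transit baseline, and any
    sample dimmer than the baseline by more than `threshold` is in transit.
    Real light curves are noisy, so this naive cut would not be reliable.
    """
    baseline = max(flux)
    dips = [f for f in flux if baseline - f > threshold]
    if not dips:
        return 0.0, 0
    depth = baseline - sum(dips) / len(dips)
    return depth, len(dips)


# Hypothetical, noise-free brightness measurements (1.0 = normal brightness).
flux = [1.000, 1.000, 0.990, 0.990, 0.990, 1.000, 1.000]

depth, samples_in_transit = measure_transit(flux)
ratio = radius_ratio_from_depth(depth)

print(f"depth: {depth:.4f}")
print(f"samples in transit: {samples_in_transit}")
print(f"planet/star radius ratio: {ratio:.3f}")
```

For scale, a Jupiter-sized planet crossing a Sun-like star dims it by about 1 percent, while an Earth-sized planet dims it by roughly 0.01 percent. Signals that small are easily buried in noise, which is where the machine learning approaches described above come in.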

Current observatories regularly process many terabytes (trillions of bytes) of data, and future ones will too. With that much information, it can be hard to tell what is important and what is not, since that depends on the scientific questions being asked. Astronomers can teach computers to process the flood of data and pick out the important pieces.

Many current and future observatories, including NASA’s Transiting Exoplanet Survey Satellite (TESS), will bring in even more data useful to many areas of research. As a result, machine learning will become more important in the coming years, and CfA scientists are on the cutting edge of that development.