February 16, 1998
"In the mid-19th century, the focus of statistics shifted to the social sciences, and that of astronomy moved to quantum mechanics, thermodynamics and electromagnetism, using such mathematical methods as differential equations," says Dr. Eric Feigelson, professor of astronomy and astrophysics. "Today, astronomers are not taught the latest statistical methods."
This has not always been the case. Newton's description of the motion of the heavens, based on the laws of gravitation, created a need for statistics, and a variety of statistical practices were developed for astronomy.
"Because Newton made it possible to make repetitive, accurate measurements of planetary characteristics, there were more data available than the astronomers could deal with," the researchers note.
"Astronomers needed a way to reduce the data," says Dr. Jogesh Babu, professor of statistics, who is the statistical half of the team.
One attempt that worked was by a French mathematician, Adrien-Marie Legendre, who in 1805 published a new method for determining the orbits of comets: minimizing the sum of the squares of the errors.
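The idea of minimizing the sum of squared errors can be shown with a minimal sketch. It fits a straight line to a handful of points rather than a comet orbit, and the function names `least_squares_line` and `sum_squared_errors` are hypothetical, chosen only for this illustration. It should not be read as Legendre's own procedure.

```python
def least_squares_line(xs, ys):
    """Fit y = intercept + slope * x by minimizing the sum of squared errors.

    Uses the closed-form solution for a straight line. This is an
    illustrative sketch, not an orbit-fitting routine.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Spread of x about its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances between the data and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1]
    a, b = least_squares_line(xs, ys)
    print("intercept:", a, "slope:", b)
    print("sum of squared errors:", sum_squared_errors(xs, ys, a, b))
```

Any other choice of intercept and slope gives a larger sum of squared errors for the same data, which is what makes the fitted line the "best" one under this criterion.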
"The situation is similar today," the researchers told attendees of the annual meeting of American Association for the Advancement of Science, today (Feb. 16) in Philadelphia. "Modern observations produce gigabytes of information everyday. Over a year, terabytes of information are not unusual."
These huge amounts of data pose problems for astronomers not only because of their size, but also because the number of individual properties recorded is large, creating multivariate databases. Modern techniques now also make it possible to record information continuously through time, creating time series. These types of databases are best handled with such statistical methods as time series analysis, sampling theory, multivariate analysis and nonlinear regression. Applying such methods to astronomy forms the basis of the newly named field of astrostatistics.
"The first problem we tackled was a method for dealing with data we know exists but is below our ability to record," say Feigelson and Babu.
That method proved to be survival analysis, the same method used to estimate the lifetime of light bulbs and the survival rate of cancer patients. No one wants to wait around for the last light bulb to sputter out or the last laboratory animal to die to determine their average life spans, so statisticians developed methods to compute the averages before the last subjects expire. This same method works for astronomical objects that are too faint to be detected.
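How averages can be computed before the last subject expires can be sketched with the Kaplan-Meier product-limit estimator, a standard tool of survival analysis. The article does not say which estimator the researchers used, so the choice here is an assumption, and the function name `kaplan_meier` and the sample numbers are hypothetical. The sketch uses the light bulb case, where some bulbs are still burning when the test ends; applying the same idea to objects too faint to detect would require treating the measurements as upper limits rather than lower limits.

```python
def kaplan_meier(times, observed):
    """Estimate the survival curve from possibly censored lifetimes.

    times    -- lifetime, or time at which observation stopped, per subject
    observed -- True if the failure was actually seen, False if the subject
                was still alive when observation stopped (censored)

    Returns a list of (time, estimated fraction surviving beyond that time),
    with one entry per distinct time at which a failure was seen.
    """
    if len(times) != len(observed):
        raise ValueError("times and observed must have the same length")

    data = sorted(zip(times, observed))
    at_risk = len(data)      # subjects still under observation
    survival = 1.0
    curve = []

    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all subjects that share the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                failures += 1
            leaving += 1
            i += 1
        if failures:
            # Censored subjects still count as "at risk" up to time t.
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Made-up bulb lifetimes in thousands of hours, for illustration only.
    # False marks a bulb that was still burning when the test ended.
    lifetimes = [2, 3, 3, 5, 8]
    burned_out = [True, True, False, True, False]
    for t, s in kaplan_meier(lifetimes, burned_out):
        print(t, s)
```

The censored subjects are not discarded: each one contributes to the count of those still at risk until the moment it leaves the study, which is how the method uses partial information.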
"Astronomy had a need and statistics had an answer," says Babu. "There may be many other areas where statistics already has the methods and there may be areas where astronomy can provide new problems for the statisticians to solve."
The researchers note that astronomers were using as many as six different ways to fit regression lines when estimating the expansion rate and age of the universe, and were sometimes mixing up the methods. Advanced methods are also being used to understand problems ranging from the large-scale clustering of galaxies to the internal structure of the sun.
Feigelson and Babu say that dealing with such large amounts of data would not have been possible before the advent of computers. "It is not that computers make things easier; in this case, they make a statistical approach possible," says Feigelson.