AI Algorithms Are Now Shockingly Good at Doing Science

No human, or team of humans, could possibly keep up with the avalanche of information produced by many of today's physics and astronomy experiments. Some of them record terabytes of data every day – and the torrent is only increasing. The Square Kilometer Array, a radio telescope slated to switch on in the mid-2020s, will generate about as much data traffic each year as the entire internet.

Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.

The deluge has many scientists turning to artificial intelligence for help. With minimal human input, AI systems such as artificial neural networks – computer-simulated networks of neurons that mimic the function of brains – can plow through mountains of data, highlighting anomalies and detecting patterns that humans could never have spotted.

Of course, the use of computers to aid in scientific research goes back about 75 years, and the method of manually poring over data in search of meaningful patterns originated millennia earlier. But some scientists argue that the latest techniques in machine learning and AI represent a fundamentally new way of doing science. One such approach, known as generative modeling, can help identify the most plausible theory among competing explanations for observational data, based solely on the data and, importantly, without any preprogrammed knowledge of what physical processes might be at work in the system under study. Proponents of generative modeling see it as novel enough to be considered a potential "third way" of learning about the universe.

Traditionally, we have learned about nature through observation. Think of Johannes Kepler poring over Tycho Brahe's tables of planetary positions and trying to discern the underlying pattern. (He eventually deduced that planets move in elliptical orbits.) Science has also advanced through simulation. An astronomer might model the movement of the Milky Way and its neighboring galaxy, Andromeda, and predict that they will collide in a few billion years. Both observation and simulation help scientists generate hypotheses that can then be tested with further observations. Generative modeling differs from both of these approaches.

"It's basically a third approach, between observation and simulation," says Kevin Schawinski, an astrophysicist and one of generative modeling's most enthusiastic proponents, who worked until recently at the Swiss Federal Institute of Technology in Zurich (ETH Zurich). "It's a different way to attack a problem."

Some scientists see generative modeling and other new techniques simply as power tools for doing traditional science. But most agree that AI is having an enormous impact, and that its role in science will only grow. Brian Nord, an astrophysicist at the Fermi National Accelerator Laboratory who uses artificial neural networks to study the cosmos, is among those who fear that there is nothing a human scientist does that will be impossible to automate. "It's a bit of a chilling thought," he said.

Discovery by Generation

Ever since graduate school, Schawinski has been making a name for himself in data-driven science. While working on his doctorate, he faced the task of classifying thousands of galaxies based on their appearance. Because no readily available software existed for the job, he decided to crowdsource it – and so the Galaxy Zoo citizen science project was born. Beginning in 2007, ordinary computer users helped astronomers by logging their best guesses as to which galaxy belonged in which category, with majority rule typically leading to correct classifications. The project was a success, but, as Schawinski notes, AI has made it obsolete: "Today, a talented scientist with a background in machine learning and access to cloud computing could do the whole thing in an afternoon."

Schawinski turned to the powerful new tool of generative modeling in 2016. Essentially, generative modeling asks how likely it is, given condition X, that you will observe outcome Y. The approach has proved incredibly potent and versatile. As an example, suppose you feed a generative model a set of images of human faces, with each face labeled with the person's age. As the computer program combs through these "training data," it begins to draw a connection between older faces and an increased likelihood of wrinkles. Eventually it can "age" any face that it is given; that is, it can predict what physical changes a given face of any age is likely to undergo.
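The "given X, how likely is Y" question can be illustrated with a few lines of Python. This is only a counting sketch under stated assumptions: the function name `conditional_probability` and the labeled `faces` records are invented for illustration, and a real generative model learns such relationships from raw images rather than from hand-made labels.

```python
from collections import defaultdict


def conditional_probability(records):
    """Estimate P(outcome | condition) from (condition, outcome) pairs by counting."""
    counts = defaultdict(lambda: [0, 0])  # condition -> [times outcome seen, total seen]
    for condition, outcome in records:
        counts[condition][0] += int(outcome)
        counts[condition][1] += 1
    return {cond: hits / total for cond, (hits, total) in counts.items()}


# Invented toy labels: (age group, face shows wrinkles)
faces = [
    ("young", False), ("young", False), ("young", True), ("young", False),
    ("old", True), ("old", True), ("old", False), ("old", True),
]

estimates = conditional_probability(faces)
for group, prob in estimates.items():
    print(f"P(wrinkles | {group}) = {prob:.2f}")
```

With more conditions and outcomes, the same idea becomes a table of probabilities that is far too large to fill by counting, which is roughly where learned models take over.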

None of these faces is real. The faces in the top row (A) and the left-hand column (B) were constructed by a generative adversarial network (GAN) using building-block elements of real faces. The GAN then combined basic features of the faces in A, including their gender, age and face shape, with finer features of the faces in B, such as hair color and eye color, to create all the faces in the rest of the grid.

The best-known generative modeling systems are "generative adversarial networks" (GANs). After adequate exposure to training data, a GAN can repair images that have damaged or missing pixels, or it can make blurry photographs sharp. GANs learn to infer the missing information by means of a competition (hence the term "adversarial"): One part of the network, known as the generator, generates fake data, while a second part, the discriminator, tries to distinguish fake data from real data. As the program runs, both halves get progressively better. You may have seen some of the hyper-realistic, GAN-produced "faces" that have circulated recently – images of "freakishly realistic people who don't actually exist," as one headline put it.
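The generator-versus-discriminator contest can be sketched on one-dimensional numbers instead of images. What follows is a deliberately tiny toy, not how production GANs are built: the generator only learns a shift `b`, the discriminator is a single logistic unit, and every name and setting (`train_toy_gan`, `real_mean`, the learning rate) is an assumption made for illustration.

```python
import math
import random


def sigmoid(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=20, batch=64, lr=0.05, real_mean=4.0, seed=0):
    """Alternate discriminator and generator updates on 1-D data.

    Generator:      x_fake = b + z, with z ~ N(0, 1); only the shift b is learned.
    Discriminator:  D(x) = sigmoid(w * x + c), its estimate that x is real.
    """
    rng = random.Random(seed)
    b, w, c = 0.0, 0.0, 0.0
    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        fake = [b + rng.gauss(0.0, 1.0) for _ in range(batch)]

        # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for x in real:
            err = 1.0 - sigmoid(w * x + c)
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = -sigmoid(w * x + c)
            grad_w += err * x
            grad_c += err
        w += lr * grad_w / batch
        c += lr * grad_c / batch

        # Generator step: gradient ascent on log D(fake) with respect to b,
        # i.e. shift the fakes in whatever direction the discriminator calls "real".
        grad_b = sum((1.0 - sigmoid(w * x + c)) * w for x in fake) / batch
        b += lr * grad_b
    return b, w, c


b, w, c = train_toy_gan()
print(f"generator shift b = {b:.3f}, discriminator weight w = {w:.3f}")
```

Over these first few rounds the discriminator learns that larger values look "real" (positive `w`), and the generator responds by shifting its fakes upward. Longer runs of such simple setups can oscillate rather than settle, which is one reason practical GAN training needs extra care.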

More broadly, generative modeling takes sets of data (typically images, but not always) and breaks each of them down into a set of basic, abstract building blocks – scientists refer to this as the data's "latent space." The algorithm manipulates elements of the latent space to see how this affects the original data, and this helps uncover physical processes that are at work in the system.

The idea of a latent space is abstract and hard to visualize, but as a rough analogy, think of what your brain might be doing when you try to determine the gender of a human face. Perhaps you notice hairstyle, nose shape and so on, as well as patterns you cannot easily put into words. The computer program is similarly looking for salient features among data: Though it has no idea what a mustache is or what gender is, if it has been trained on data sets in which some images are tagged "man" or "woman," and in which some have a "mustache" tag, it will quickly deduce a connection.

Kevin Schawinski, an astrophysicist who runs an AI company called Modulos, argues that a technique called generative modeling offers a third way of learning about the universe. (Photo: Der Beobachter)

In a paper published in December in Astronomy & Astrophysics, Schawinski and his ETH Zurich colleagues Dennis Turp and Ce Zhang used generative modeling to investigate the physical changes that galaxies undergo as they evolve. (The software they used treats the latent space somewhat differently from the way a generative adversarial network treats it, so it is not technically a GAN, though it is similar.) Their model created artificial data sets as a way of testing hypotheses about physical processes. They asked, for instance, how the "quenching" of star formation – a sharp reduction in formation rates – is related to the increasing density of a galaxy's environment.

For Schawinski, the key question is how much information about stellar and galactic processes can be teased out of the data alone. "Let's erase everything we know about astrophysics," he said. "To what degree could we rediscover that knowledge, just using the data itself?"

First, the galaxy images were reduced to their latent space; then Schawinski could tweak one element of that space in a way that corresponded to a particular change in the galaxy's environment, such as the density of its surroundings. Then he could re-generate the galaxy and see what differences turned up. "So now I have a hypothesis-generation machine," he explained. "I can take a whole bunch of galaxies that are originally in a low-density environment and make them look like they're in a high-density environment, by this process." Schawinski, Turp and Zhang saw that, as galaxies go from low-density to high-density environments, they become redder in color and their stars become more centrally concentrated. This matches existing observations about galaxies, Schawinski said. The question is why this is so.
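The encode, tweak, re-generate loop described above can be sketched with a hand-written linear latent space. This is not the model used in the study, which learned its encoder and decoder from galaxy images; the three "observed" features, the numbers in `MEAN` and `BASIS`, and the function names are all hypothetical and chosen only to make the steps concrete.

```python
# Hypothetical "average galaxy": [redness, central concentration, size]
MEAN = [0.5, 0.3, 1.0]

# Two orthonormal latent directions (invented for illustration).
BASIS = [
    [0.6, 0.8, 0.0],  # axis 0, labeled "environment density": redness and concentration move together
    [0.0, 0.0, 1.0],  # axis 1: size only
]


def encode(x):
    """Project observed features onto the latent axes."""
    return [sum((xi - mi) * ui for xi, mi, ui in zip(x, MEAN, axis)) for axis in BASIS]


def decode(z):
    """Rebuild observed features from latent coordinates."""
    return [mi + sum(zk * axis[i] for zk, axis in zip(z, BASIS)) for i, mi in enumerate(MEAN)]


def shift_latent(x, index, delta):
    """Encode x, move one latent coordinate by delta, and decode the result."""
    z = encode(x)
    z[index] += delta
    return decode(z)


galaxy = [0.5, 0.3, 1.2]                      # a toy low-density galaxy
denser = shift_latent(galaxy, index=0, delta=0.5)
print("before:", galaxy)
print("after: ", [round(v, 3) for v in denser])
```

In this toy the link between "density" and redness is put in by hand, so it demonstrates only the mechanics. In the approach the article describes, such links are learned from the data, and comparing the "before" and "after" versions is what generates hypotheses.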

The next step, Schawinski says, has not yet been automated: "I have to come in as a human, and say, 'OK, what kind of physics could explain this effect?'" For the process in question, there are two plausible explanations: Perhaps galaxies become redder in high-density environments because they contain more dust, or perhaps they become redder because of a decline in star formation (in other words, their stars tend to be older). With a generative model, both ideas can be put to the test: Elements in the latent space related to dustiness and star formation rates are changed to see how this affects the galaxies' color. "And the answer is clear," Schawinski said. Redder galaxies are "where the star formation had dropped, not the ones where the dust changed. So we should favor that explanation."

Using generative modeling, astrophysicists could investigate how galaxies change when they go from low-density regions of the cosmos to high-density regions, and what physical processes are responsible for these changes.

The approach is related to traditional simulation, but with critical differences. A simulation is "essentially assumption-driven," Schawinski said. "The approach is to say, 'I think I know what the underlying physical laws are that give rise to everything that I see in the system.' So I have a recipe for star formation, I have a recipe for how dark matter behaves, and so on. I put all of my hypotheses in there, and I let the simulation run. And then I ask: Does that look like reality?" What he has done with generative modeling, he said, is "in some sense, exactly the opposite of a simulation. We don't know anything; we don't want to assume anything. We want the data itself to tell us what might be going on."

The apparent success of generative modeling in a study like this obviously does not mean that astronomers and graduate students have been made redundant, but it appears to represent a shift in the degree to which learning about astrophysical objects and processes can be achieved by an artificial system that has little more at its electronic fingertips than a vast pool of data. "It's not fully automated science, but it demonstrates that we're capable of at least in part building the tools that make the process of science automatic," Schawinski said.

Generative modeling is clearly powerful, but whether it truly represents a new approach to science is open to debate. For David Hogg, a cosmologist at New York University and the Flatiron Institute (which, like Quanta, is funded by the Simons Foundation), the technique is impressive but ultimately just a very sophisticated way of extracting patterns from data, which is what astronomers have been doing for centuries. In other words, it is an advanced form of observation plus analysis. Hogg's own work, like Schawinski's, leans heavily on AI; he has been using neural networks to classify stars according to their spectra and to infer other physical attributes of stars using data-driven models. But he sees his work, as well as Schawinski's, as tried-and-true science. "I don't think it's a third way," he said recently. "I just think we as a community are becoming far more sophisticated about how we use the data. In particular, we are getting much better at comparing data to data. But in my view, my work is still squarely in the observational mode."

Hardworking Assistants

Whether they are conceptually novel or not, it is clear that AI and neural networks have come to play a critical role in contemporary astronomy and physics research. At the Heidelberg Institute for Theoretical Studies, the physicist Kai Polsterer heads the astroinformatics group – a team of researchers focused on new, data-centered methods of doing astrophysics. Recently, they have been using a machine-learning algorithm to extract redshift information from galaxy data sets, a previously arduous task.

Polsterer sees these new AI-based systems as "hardworking assistants" that can comb through data for hours on end without getting bored or complaining about the working conditions. These systems can do all the tedious grunt work, he said, leaving you "to do the cool, interesting science on your own."

But they are not perfect. In particular, Polsterer cautions, the algorithms can only do what they have been trained to do. The system is "agnostic" regarding the input. Give it a galaxy, and the software can estimate its redshift and its age – but feed that same system a selfie, or a picture of a rotting fish, and it will output a (very wrong) age for that, too. In the end, oversight by a human scientist remains essential, he said. "It comes back to you, the researcher. You're the one in charge of doing the interpretation."

For his part, Nord, at Fermilab, cautions that it is crucial that neural networks deliver not only results, but also error bars to go along with them, as every undergraduate is trained to do. In science, if you make a measurement and do not report an estimate of the associated error, no one will take the results seriously, he said.
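As a reminder of what "a result plus an error bar" looks like in practice, here is a minimal sketch that reports a mean together with its standard error. Treating the spread across several independently trained models as the uncertainty is an assumption made here for illustration, not a method attributed to Nord, and the redshift values are invented.

```python
import math
import statistics


def mean_with_error(measurements):
    """Return (mean, standard error of the mean) for repeated measurements."""
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two measurements to estimate an error")
    mean = statistics.mean(measurements)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    sem = statistics.stdev(measurements) / math.sqrt(n)
    return mean, sem


# Invented redshift estimates for one galaxy from five independently trained models.
estimates = [0.98, 1.02, 1.00, 0.99, 1.01]
value, error = mean_with_error(estimates)
print(f"z = {value:.3f} +/- {error:.3f}")
```

The point is the shape of the output: a number that arrives without the second quantity is hard for other scientists to use.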

Like many AI researchers, Nord is also concerned about the impenetrability of results produced by neural networks; often, a system delivers an answer without offering a clear picture of how that result was obtained.

Yet not everyone feels that a lack of transparency is necessarily a problem. Lenka Zdeborová, a researcher at the Institute of Theoretical Physics at CEA Saclay in France, points out that human intuitions are often equally impenetrable. You look at a photograph and instantly recognize a cat – "but you don't know how you know," she said. "Your own brain is in some sense a black box."

It is not only astrophysicists and cosmologists who are migrating toward AI-fueled, data-driven science. Quantum physicists like Roger Melko of the Perimeter Institute for Theoretical Physics and the University of Waterloo in Ontario have used neural networks to solve some of the toughest and most important problems in that field, such as how to represent the mathematical "wave function" describing a many-particle system. AI is essential because of what Melko calls "the exponential curse of dimensionality." That is, the possibilities for the form of a wave function grow exponentially with the number of particles in the system it describes. The difficulty is similar to trying to work out the best move in a game like chess or Go: You try to peer ahead to the next move, imagining what your opponent will play, and then choose the best response, but with each move, the number of possibilities proliferates.

Of course, AI systems have mastered both of these games: chess decades ago, and Go in 2016, when an AI system called AlphaGo defeated a top human player. They are similarly suited to problems in quantum physics, Melko says.
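The exponential growth behind the "curse of dimensionality" is easy to make concrete. The sketch below assumes the simplest case, particles with two possible states each (such as spin-1/2 particles), and assumes each amplitude is stored as a 16-byte complex number; both choices, and the function names, are illustrative assumptions rather than details from Melko's work.

```python
def amplitude_count(n_particles, states_per_particle=2):
    """Number of complex amplitudes in a general n-particle wave function."""
    return states_per_particle ** n_particles


def storage_bytes(n_particles, states_per_particle=2, bytes_per_amplitude=16):
    """Memory needed to store every amplitude as a 128-bit complex number."""
    return amplitude_count(n_particles, states_per_particle) * bytes_per_amplitude


for n in (10, 20, 30, 40, 50):
    gib = storage_bytes(n) / 2**30
    print(f"{n:2d} particles: {amplitude_count(n):>20,d} amplitudes, {gib:.3g} GiB")
```

Each added particle doubles the count, so brute-force storage runs out of memory long before the system becomes physically large. That is the gap that compact representations such as neural networks are meant to fill.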

The Mind of the Machine

Whether Schawinski is right in claiming that he has found a "third way" of doing science, or whether, as Hogg says, it is merely traditional observation and data analysis "on steroids," it is clear that AI is changing the flavor of scientific discovery, and it is certainly accelerating it. How far will the AI revolution go in science?

Occasionally, grand claims are made regarding the achievements of a "robo-scientist." A decade ago, an AI robot named Adam investigated the genome of baker's yeast and worked out which genes are responsible for making certain amino acids. (Adam did this by observing strains of yeast that had certain genes missing and comparing the results to the behavior of strains that had the genes.) Wired's headline read, "Robot Makes Scientific Discovery All by Itself."

More recently, Lee Cronin, a chemist at the University of Glasgow, has been using a robot to randomly mix chemicals to see what sorts of new compounds are formed. Monitoring the reactions in real time with a mass spectrometer, a nuclear magnetic resonance machine and an infrared spectrometer, the system eventually learned to predict which combinations would be the most reactive. Even if it does not lead to further discoveries, Cronin has said, the robotic system could allow chemists to speed up their research by about 90 percent.

Last year, another team of scientists at ETH Zurich used neural networks to deduce physical laws from sets of data. Their system, a sort of robo-Kepler, rediscovered the heliocentric model of the solar system from records of the positions of the sun and Mars in the sky, as seen from Earth, and figured out the law of conservation of momentum by observing colliding balls. Since physical laws can often be expressed in more than one way, the researchers wonder if the system might offer new ways, perhaps simpler ways, of thinking about known laws.

These are all examples of AI kick-starting the process of scientific discovery, though in every case, we can debate just how revolutionary the new approach is. Perhaps most controversial is the question of how much information can be gleaned from data alone – a pressing question in the age of stupendously large (and growing) piles of it. In The Book of Why (2018), the computer scientist Judea Pearl and the science writer Dana Mackenzie assert that data are "profoundly dumb." Questions about causality "can never be answered from data alone," they write. "Anytime you see a paper or a study that analyzes the data in a model-free way, you can be certain that the output of the study will merely summarize, and perhaps transform, but not interpret the data." Schawinski sympathizes with Pearl's position, but he described the idea of working with "data alone" as "a bit of a straw man." He has never claimed to deduce cause and effect that way, he said. "I'm merely saying we can do more with data than we often conventionally do."

Another oft-heard argument is that science requires creativity, and that, at least so far, we have no idea how to program that into a machine. (Simply trying everything, like Cronin's robo-chemist, does not seem especially creative.) "Coming up with a theory, with reasoning, I think demands creativity," Polsterer said. "Every time you need creativity, you will need a human." And where does creativity come from? Polsterer suspects it is related to boredom, something that, he says, a machine cannot experience. "To be creative, you have to dislike being bored. And I don't think a computer will ever feel bored." On the other hand, words like "creative" and "inspired" have often been used to describe programs like Deep Blue and AlphaGo. And the struggle to describe what goes on inside the "mind" of a machine is mirrored by the difficulty we have in probing our own thought processes.

Schawinski recently left academia for the private sector; he now runs a startup called Modulos, which employs a number of ETH scientists and, according to its website, works "in the eye of the storm of developments in AI and machine learning." Whatever obstacles may lie between current AI technology and full-fledged artificial minds, he and other experts feel that machines are poised to do more and more of the work of human scientists. Whether there is a limit remains to be seen.

"Will it be possible, in the foreseeable future, to build a machine that can discover physics or mathematics that the brightest humans alive are not able to do on their own, using biological hardware?" Schawinski wonders. "Will the future of science eventually necessarily be driven by machines that operate on a level that we can never reach? I don't know. It's a good question."
