Bengaluru: K. VijayRaghavan, Principal Scientific Adviser to the Government of India, has highlighted the distinction between raw data and making sense of it, saying data needs to be distilled well enough to extract information from it.
Delivering the special address at Carnegie India’s Global Technology Summit in Bengaluru Thursday, VijayRaghavan spoke about data, knowledge, and understanding, with a focus on how big data is used in science and technology today.
He said that in the age of big data, where everyone from astronomers to medical professionals and transport engineers keeps collecting data, making sense of it to inform meaningful decisions remains a big challenge. As data becomes more accessible, this increasingly becomes the focus of big data, he said.
“Information itself is not knowledge,” VijayRaghavan said. “Knowledge comes from understanding the data, the history of the field, the history of other fields, and so on, which requires a lot of work. Converting information to knowledge is a big challenge, where human intervention is often required in a substantial manner.”
VijayRaghavan also elaborated on the importance of bottom-up understanding of the sciences in order to design data-collecting instruments that can extract and collect data efficiently.
Examples from history
“Knowledge” by itself is not “understanding”, VijayRaghavan said, adding that understanding requires figuring out what data to keep and what to throw away. Sifting through large volumes of data to arrive at understanding is not new at all, he explained, citing the example of astrophysicist Jocelyn Bell Burnell, who discovered the first pulsars in radio telescope data.
Big data isn’t as new as we think it is, he said, explaining how astronomers have used it since time immemorial to make sense of the universe.
“The theory of evolution by natural selection is an example of the extraordinary analysis of big data, essentially by two people [Charles Darwin and Alfred Russel Wallace], and that data has changed the entire world after that,” he said. Darwin and Wallace had used vast amounts of both biological and geological data to discern patterns in the evolution of life.
The ability to draw generalisable rules from the data came from understanding the fundamentals of the diversity of life through the work of people in genetics, such as Thomas Hunt Morgan and Rosalind Franklin.
“If we don’t go deep in understanding variations in data, we end up in trouble,” VijayRaghavan said, referring to the human tendency to infer causation where there is only correlation.
He demonstrated the idea with the example of the Soviet-era biologist Trofim Lysenko, who believed a plant of one species could be converted to another by changing climate. This eventually led to an agricultural disaster in the Soviet bloc. “It also led to a scientific disaster, that of people disowning modern biology,” he added.
Three biggest questions
The three biggest questions with data, VijayRaghavan said, are the provenance of data being collected, what its correlations and causations are, and how it helps us understand and act.
The next challenge, acting on established causal relationships, requires investment in test beds. Lessons can be learned from astronomy and biology on how data was handled, analysed, and used to draw inferences, he added.
Theoretical physics is an example of a field where knowledge is well ahead of technology, and that understanding of the field helps build technology like the Laser Interferometer Gravitational-Wave Observatory (LIGO). In biology, however, we tend to collect disproportionately more data today and invest disproportionately less in understanding it, VijayRaghavan stated.
“In any field, we cannot escape into holistic technologies and not understand the basic principles if we want to understand data,” he said.
He concluded that, at the end of the day, those who have knowledge have power. But knowledge will not always be available to everyone, because training, and with it the ability to gain that knowledge, is expensive. He stressed the need for democratisation of data and learning.
“Data and availability is a liberating force which can overturn possession of knowledge only by those who are powerful. Training at school level is critical, and India as a large democracy can do this and have an impact in a few years,” he said.
ThePrint is a digital partner for Carnegie India’s Global Technology Summit event.
(Edited by Shreyas Sharma)