Complex Data Analytics Overview
Complex data analytics refers to the use of advanced algorithmic methods to process large, unstructured data sets effectively. Computers have long performed analytical processing of data, but in the past this was largely a matter of individual machines acting on individual, well-defined data structures. We called this data analytics: the use of computers to analyze data and find meaningful patterns within it that can be used to make decisions. Today computing is evolving toward cloud platforms, advanced algorithms, and big data, and we can call the result advanced analytics or complex analytics. Complex data analytics, then, is the use of advanced algorithms to process big data structures.
With the convergence of cloud computing platforms, advances in algorithms, the growth of unlabeled big data sources, and now the internet of things, the information revolution is entering a new stage in which the capacities of information technology are greatly expanding. Personal computing, the internet, and mobile devices have created a flood of new data sources. In response, computing is moving up from individual machines executing well-defined instructions on well-defined individual data sets to clusters of machines operating on massive amounts of unstructured data, using qualitatively different algorithms in the form of machine learning. In this process we are collecting ever more data about ever more aspects of our world, bringing it into data centers, and bringing ever more sophisticated mathematical models and computer science methods to bear on building algorithms that allow us to look into this big data and see what we have never seen before. A world that was previously accessible only through our imagination is being presented to us as real data and visualizations.
Our data production is on an exponential growth curve with no end in sight. Our global information network is now growing at some 205,000 new gigabytes per second. A constant barrage of web searches, email, e-commerce transactions, chats, blog posts, and social media feeds, together with data streams from factories, cars, closed-circuit TV, financial markets, transport systems, mining equipment, and buildings, creates a continuous stream of structured and unstructured data. More information crosses the internet every second now than was stored on the entire internet just 15 years ago. As we begin to instrument our world with sensors and mobile computing, our every action becomes data. Your movements, your purchases, the traffic around you: all of it is sent to the cloud, where huge modular algorithmic frameworks process it and cross-correlate it with the data from everyone else. It is no longer just about what you do but about what everyone else does as well.
We begin to be able to see and connect our individual actions with those of others, and with the whole of the systems that we form part of, like never before. When hundreds of millions of people and devices start to contribute data, we can start to see patterns emerge across society or across the whole world. The challenge is that 80% of all the data created is dark, unstructured data [1]. Because the computers we have developed over the last 40 years are not able to analyze this kind of data efficiently, we miss the knowledge inside that 80%. Unlocking this complex set of unstructured data sources requires new models and algorithms, and this is the third piece of the puzzle, one that has clicked into place only recently.
Not only do we have a new computing infrastructure available to organizations on demand and a wealth of new data sources, but we now also have a new paradigm for algorithms in the form of machine learning systems. The algorithms of the past were well-defined rules that were pre-specified and hard-coded into the software; they were mechanistic in nature. Recent breakthroughs in machine learning, neural nets, and deep learning techniques have opened up new possibilities for processing large and unstructured data sets. Today a deep learning model can easily involve tens of millions of parameters and billions of connections, which means it can do things that were previously unimaginable: drive cars, detect security anomalies, analyze job applications, process insurance claims, coordinate traffic, and the list is ever expanding. These machine learning algorithms often take the form of self-organizing computational networks, as exemplified by the hugely successful approach of deep learning. This approach enables computers to act on large unstructured data sets and to derive insight from them. As a consequence, algorithms are no longer confined to the internal workings of your computer but can expand out into the world, acting on ever larger and more complex data structures.
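The difference between a pre-specified rule and a rule derived from data can be illustrated with a deliberately tiny sketch. It is not deep learning, and it is not taken from any system discussed here: it fits a single threshold (a so-called decision stump) to a handful of labeled examples. The function names, the fixed threshold of 1000.0 and the example data are all hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def hard_coded_rule(amount: float) -> bool:
    """Mechanistic rule: the programmer fixes the threshold in advance."""
    return amount > 1000.0  # hypothetical, hand-picked threshold


def learn_threshold(examples: List[Tuple[float, bool]]) -> float:
    """Choose the threshold that misclassifies the fewest labeled examples."""
    values = sorted(value for value, _ in examples)
    # Candidate thresholds: midpoints between neighbouring observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold: float) -> int:
        # Count examples where the prediction disagrees with the label.
        return sum((value > threshold) != label for value, label in examples)

    return min(candidates, key=errors)


# Hypothetical labeled data: (observed value, was it actually an anomaly?)
examples = [
    (120.0, False), (250.0, False), (310.0, False),
    (480.0, True), (900.0, True), (1500.0, True),
]

threshold = learn_threshold(examples)


def learned_rule(amount: float) -> bool:
    """Rule whose threshold came from the data rather than the programmer."""
    return amount > threshold


print(threshold)               # 395.0
print(hard_coded_rule(480.0))  # False: the fixed rule misses this case
print(learned_rule(480.0))     # True: the fitted rule catches it
```

The point of the sketch is only the shift in where the rule comes from. In the first function the behavior is written down ahead of time; in the second it is derived from examples, so it changes when the data changes. Deep learning applies the same idea at a vastly larger scale, with millions of fitted parameters instead of one.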
An algorithmic revolution is underway as we shift more and more of our systems of organization to cloud platforms. At the heart of those platforms will be advanced analytics, used to coordinate and optimize the network, whether we are talking about on-demand car sharing or e-commerce and logistics platforms. With the current rise of cloud platforms, we are in the process of converting centralized, closed organizations into large open networks. These networks will be based around market dynamics, but coordinating and optimizing such complex systems will require the use of advanced analytics. Just as the vast user networks of Facebook, Uber, Alibaba, and Amazon are coordinated via advanced analytics, the same will be true for all organizations. Mastering this new paradigm means not just understanding data science and machine learning but also understanding how they operate in the context of this emerging platform economy.
The last major component is the integration of complex analytics with the internet of things. Machine learning will be delivered as a service over the internet, and the smartness that it delivers will flow to all kinds of things, as physical systems of all kinds start to exhibit new forms of adaptive, responsive, autonomous, and smart behavior. The internet is in the process of coming offline, into the physical world, and machine learning is a central element in this: because it can ingest large amounts of unstructured data, it enables machines to interpret and understand the physical environment and human behavior, and likewise to interact with people in a fluid fashion. Not only do these advances in algorithms enable mass automation and the proliferation of autonomous robots into the everyday world, but, more significantly, advanced analytics is increasingly being connected into whole physical infrastructure systems. The smart grid will throw off massive amounts of big data and be coordinated via complex analytical systems performing dynamic load balancing, dynamic pricing, performance reporting, predictive maintenance, and so on. The same will be true for transport systems (whole metro systems, like that of Dubai, are now automated), for mines, and for fleets of ships. Rolls-Royce, for example, is partnering with Google and using its Cloud Machine Learning Engine as it researches and develops the next generation of autonomous ships.
The rise of big data and advanced analytics represents a profound change in how we understand the world, make decisions, and act on those decisions, a change whose depth, scope, and significance are difficult to overstate. In a recent paper from Ericsson [2], the authors capture some of this significance when they note: “In contrast to digitalisation, which enabled productivity improvements and efficiency gains on already-existing processes, datafication promises to completely redefine nearly every aspect of our existence as humans on this planet. Significantly beyond digitalisation, this trend challenges the very foundations of our established methods of measurement and provides the opportunity to recreate societal frameworks, many of which have dictated human existence for over 250 years.”
Advanced data analytics can be interpreted as simply how we manage our world in an age of complex systems. It holds out the possibility of actually seeing and understanding the complex systems that now run our world, from transport networks and social networks to cities and global supply chains, and so of managing these systems in a new way. Sean Gourley, founder of Quid, states it clearly [3] when he says: “We live in a very complex world, there are 7 billion minds now, and those 7 billion minds have created a world that not one of them can understand, and yet we still have to make decisions. We have to decide whether or not to send troops to Iraq, we have to decide what to do about climate change and we have to decide how to deal with a global financial market that doesn’t want to stay still.”
Complex analytics enables a new form of data-driven understanding of our world, in that it lets us visualize, and in some way see, systems that have become so complex that no one person can comprehend them. Big data and new visualization methods can abstract away the underlying complexity to present a quick, high-level view of an otherwise impenetrably complicated data set. Financial markets that today are hidden behind layers of opaque complication could be seen and grasped by every trader in the market. Billions of data points from around the planet could be ingested, cross-correlated, and visualized to deliver a real-time picture of global security threats in a form that anyone could understand in a few seconds. The threats of climate change, the risks of cybersecurity, the real social and environmental impacts of your current purchase: all could be made transparent to any one of us, enabling us all to take responsibility for our actions and incentivizing us to make the right decisions. As MIT Professor Alex Pentland [4] put it, “This is the first time in human history that we have the ability to see enough about ourselves that we can hope to actually build social systems that work qualitatively better than the systems we’ve always had.” This is true not just for society; it is true for all of the complex systems that now make up our engineered environment.
Datafication changes the nature of how decisions are made in society. Companies’ decision-making processes have undergone a tremendous shift in the last 20 years. Enterprises are moving the center of gravity of their decision-making units from human expertise to big data-driven systems. This shift can be attributed to people’s limited information-processing capabilities in relation to the explosion of data. Take, for example, the shipping company Maersk Line, which operates a global network of over 2 million containers that travel to 350 different ports and moves about 15 percent of the world’s seaborne freight. The company estimates that it spends more than 1 billion dollars a year moving empty containers back and forth. No human could begin to reason about how to coordinate such a system effectively, but Maersk is using data and analytics to automate and optimize where empty containers go next and to strip wasted resources out of the network [5].
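To give a feel for the kind of decision being automated, the following is a minimal sketch of empty-container repositioning as a matching problem between ports with surplus empties and ports with a shortage. It is not Maersk's method, which the source does not describe. It is a simple greedy heuristic that fills the cheapest lanes first, and the port names, quantities and per-container costs are hypothetical. A production system would more likely use a proper optimization model, since a greedy pass does not guarantee the lowest total cost.

```python
from typing import Dict, List, Tuple

Lane = Tuple[str, str]  # (origin port, destination port)


def reposition_empties(
    surplus: Dict[str, int],
    deficit: Dict[str, int],
    cost: Dict[Lane, float],
) -> List[Tuple[str, str, int]]:
    """Greedy heuristic: ship empties along the cheapest lanes first."""
    surplus = dict(surplus)  # work on copies so the inputs stay untouched
    deficit = dict(deficit)
    moves: List[Tuple[str, str, int]] = []
    # Order every origin/destination pair by cost per container.
    lanes = sorted((cost[(s, d)], s, d) for s in surplus for d in deficit)
    for _, origin, dest in lanes:
        quantity = min(surplus[origin], deficit[dest])
        if quantity > 0:
            moves.append((origin, dest, quantity))
            surplus[origin] -= quantity
            deficit[dest] -= quantity
    return moves


# Hypothetical network: two ports holding spare empties, two ports short of them.
surplus = {"port_a": 300, "port_b": 200}
deficit = {"port_c": 350, "port_d": 150}
cost = {
    ("port_a", "port_c"): 9.0, ("port_a", "port_d"): 7.0,
    ("port_b", "port_c"): 5.0, ("port_b", "port_d"): 8.0,
}

moves = reposition_empties(surplus, deficit, cost)
total_cost = sum(cost[(o, d)] * q for o, d, q in moves)

for origin, dest, quantity in moves:
    print(f"{origin} -> {dest}: {quantity} containers")
print(total_cost)  # 3400.0
```

With four ports the answer is easy to check by hand. With hundreds of ports, millions of containers and demand that changes daily, the same question is far beyond manual reasoning, which is the point the Maersk example makes.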
This is, though, a relatively simple example compared to the data challenges that societies face going forward. As the sea of data gets larger, the haystack grows and it becomes more difficult to find the needle. While those who fail to evolve get lost in the noise and paralyzed by the complexity, the winners at this game are those who can use this technology to see through that complexity, finding the signal in the noise so that they can move fast and strategically and, in so doing, radically outperform their peers. In an information economy, it is not the big fish that eat the small; it is the smart fish, those able to see what is coming and adapt the fastest, that survive. For large organizations, becoming those smart, adaptive fish is a huge challenge, and mastering big data analytics is at the heart of it. The key feature of successful organizations in the age of datafication is their ability to capture and effectively analyze the wealth of data available to them and quickly convert it into actionable insights.
Not only does big data analytics offer new ways of knowing our world through data and visualizations, and new ways of making decisions through advanced algorithms, but mass automation likewise offers new ways to execute on those decisions. For better or worse, mass automation of physical systems and basic services is now here. Around the planet, from Germany to Japan, physical systems are becoming automated and connected to the cloud. With the rise of cyber-physical technologies and autonomous systems, the nature of how we manage and control our environment is also changing fast. As Steve Lohr of the New York Times puts it [6], “indeed the long view of the technology is that it will become a layer of data-driven artificial intelligence that resides on top of both the digital and the physical realms and today we’re seeing the early steps toward that vision.” Although Lohr’s statement has a touch of science fiction to it, the surprising thing is that science fiction is becoming a new reality in our world.
1. Emc.com. (2018). [online] Available at: https://www.emc.com/collateral/analyst-reports/idc-the-digital-universe-in-2020.pdf [Accessed 8 Feb. 2018].
2. Ericsson.com. (2018). [online] Available at: https://www.ericsson.com/assets/local/news/2014/4/the-impact-of-datafication-on-strategic-landscapes.pdf [Accessed 8 Feb. 2018].
3. YouTube. (2018). Big Data and the Rise of Augmented Intelligence: Sean Gourley at TEDxAuckland. [online] Available at: https://www.youtube.com/watch?v=mKZCa_ejbfg [Accessed 8 Feb. 2018].
4. Edge.org. (2018). REINVENTING SOCIETY IN THE WAKE OF BIG DATA | Edge.org. [online] Available at: https://www.edge.org/conversation/reinventing-society-in-the-wake-of-big-data [Accessed 8 Feb. 2018].
5. YouTube. (2018). Maersk Line: Using the Internet of Things, Data, and Analytics to Change Their Culture and Strengthe. [online] Available at: https://www.youtube.com/watch?v=KEC5DQqCykI [Accessed 8 Feb. 2018].
6. Google Books. (2018). Data-ism. [online] Available at: https://goo.gl/jB1xpq [Accessed 8 Feb. 2018].