By William Schmarzo
February 18, 2017 04:00 PM EST
My son Max is home from college and that always leads to some interesting conversations. Max is in graduate school at Iowa State University where he is studying kinesiology and strength training. As part of his research project, he is applying physics to athletic training in order to understand how certain types of exercises can lead to improvements in athletic speed, strength, agility, and recovery.
Figure 1: The Laws of Kinesiology
Max was showing me one drill designed to increase the speed and thrust associated with jumping (Max added 5 inches to his vertical leap over the past 6 weeks, and can now dunk over the old man). When I asked him about the science behind the drill, he went into great detail about the interaction between physics, biomechanics and human anatomy.
Max could explain to me how the laws of physics (the study of the properties of matter and energy), kinesiology (the study of human motion that mainly focuses on muscles and their functions) and biomechanics (the study of movement involved in strength exercise or in the execution of a sport skill) interacted to produce the desired outcomes. He could explain why it worked.
And that is the heart of my challenges with treating data science as a science. As a data scientist, I can predict what is likely to happen, but I cannot explain why it is going to happen. I can predict when someone is likely to attrite, or respond to a promotion, or commit fraud, or pick the pink button over the blue button, but I cannot tell you why that’s going to happen. And I believe that the inability to explain why something is going to happen is why I struggle to call “data science” a science.
Okay, let the hate mail rain down on me, but let me explain why this is an important distinction!
What Is Science?
Science is the intellectual and practical activity encompassing the systematic study of the structure and behavior of the physical and natural world through observation and experiment.
Science works within systems of laws such as the laws of physics, thermodynamics, mathematics, electromagnetism, aerodynamics, electricity (like Ohm’s law), Newton’s laws of motion, and chemistry. Scientists can apply these laws to understand why certain actions lead to certain outcomes. In many disciplines, it is critical (life and death critical in some cases) that the scientists (or engineers) know why something is going to occur:
- In pharmaceuticals, chemists need to understand how certain chemicals can be combined in certain combinations (recipes) to drive human outcomes or results.
- In mechanical engineering, building engineers need to know how certain materials and designs can be combined to support the weight of a 40-story building (that looks like it was made out of Lego blocks).
- In electrical engineering, electrical engineers need to understand how much wiring, what type of wiring and the optimal designs are required to support the electrical needs of buildings or vehicles.
Again, the laws that underpin these disciplines can be used to understand why certain actions or combinations lead to predictable outcomes.
Big Data and the “Death” of Why
A 2008 Wired article by Chris Anderson titled “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete” really called into question the “science” nature of the data science role. The premise of the article was that massive amounts of data were yielding insights about human behaviors without requiring the heavy statistical modeling typically needed when using sampled data sets. This is the quote that most intrigued me:
“Google conquered the advertising world with nothing more than applied mathematics. It didn’t pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right.”
With the vast amounts of detailed data available and high-powered analytic tools, it is possible to identify what works without having to worry about why it worked. Maybe when it comes to human behaviors, there are no laws that can be used to understand (or codify) why humans take certain actions under certain conditions. In fact, we already know that humans are illogical decision-making machines (see “Human Decision-Making in a Big Data World”).
However, there are some new developments that I think will require “data science” to become more like other “sciences.”
Internet of Things and the “Birth” of Why
The Internet of Things (IoT) will require organizations to understand and codify why certain inputs lead to predictable outcomes. For example, it will be critical for manufacturers to understand and codify why certain components in a product break down most often, by trying to address questions such as:
- Was the failure caused by the materials used to build the component?
- Was the failure caused by the design of the component?
- Was the failure caused by the use of the component?
- Was the failure caused by the installation of the component?
- Was the failure caused by the maintenance of the component?
As we move into the world of IoT, we will start to see increased collaboration between analytics and physics. See what organizations like GE are doing with the concept of “Digital Twins”.
The Digital Twin involves building a digital model, or twin, of every machine – from a jet engine to a locomotive – to grow and create new business and service models through the Industrial Internet.
Digital twins are computerized companions of physical assets that can be used for various purposes. Digital twins use data from sensors installed on physical objects to represent their real-time status, working condition or position.
GE is building digital models that mirror the physical structures of its products and components. This allows the company not only to accelerate the development of new products, but also to test the products in a greater number of situations to determine metrics such as mean-time-to-failure, stress capability and structural loads.
As the worlds of physics and IoT collide, data scientists will become more like other “scientists” as their digital world will begin to be governed by the laws that govern disciplines such as physics, aerodynamics, chemistry and electricity.
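To make the mean-time-to-failure idea concrete, here is a minimal sketch of how a digital model might be run many times to estimate that metric. It is not GE's method: the wear model, the `load_factor` parameter, the 0.001 base wear rate and the function names are all hypothetical, chosen only to show a model where the "why" (load drives wear) is written into the code rather than inferred from correlations.

```python
import random
from statistics import mean


def simulate_time_to_failure(load_factor, rng):
    """Run one simulated life of a component and return hours until failure.

    Hypothetical wear model (an assumption, not a real physical law):
    each operating hour adds damage proportional to the load, with some
    random variation, and the component fails once damage reaches 1.0.
    """
    damage = 0.0
    hours = 0
    while damage < 1.0:
        # Base wear of 0.001 per hour, scaled by load and random noise.
        damage += 0.001 * load_factor * rng.uniform(0.5, 1.5)
        hours += 1
    return hours


def estimate_mttf(load_factor, runs=200, seed=42):
    """Estimate mean time to failure by averaging many simulated lives."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    return mean(simulate_time_to_failure(load_factor, rng) for _ in range(runs))


if __name__ == "__main__":
    for load in (1.0, 2.0):
        print(f"load factor {load}: estimated MTTF = {estimate_mttf(load):.0f} hours")
```

Because the wear rule is explicit, doubling the load roughly halves the estimated life, and the model can say why. A purely data-driven model could make the same prediction without offering that explanation.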
Data Science and the Cost of Wrong
Another potential driver in the IoT world is the substantial cost of being wrong. As discussed in my blog “Understanding Type I and Type II Errors”, the cost of being wrong (false positives and false negatives) has minimal impact when trying to predict human behaviors such as which customers might respond to which ads, or which customers are likely to recommend you to their friends.
However, in the world of IoT, being wrong (false positives and false negatives) can carry severe or even catastrophic financial, legal and liability costs. Organizations cannot afford to have planes falling out of the skies or autonomous cars driving into crowds or pharmaceuticals accidentally killing patients.
Historically, big data was not concerned with understanding or quantifying “why” certain actions occurred because, for the most part, organizations were using big data to understand and predict customer behaviors (e.g., acquisition, up-sell, fraud, theft, attrition, advocacy). The costs associated with false positives and false negatives were relatively small compared to the financial benefit or return.
And while there may never be “laws” that dictate human behaviors, in the world of IoT where organizations are melding analytics (machine learning and artificial intelligence) with physical products, we will see “data science” advancing beyond just “data” science. In IoT, the data science team must expand to include scientists and engineers from the physical sciences so that the team can understand and quantify the “why things happen” aspect of the analytic models. If not, the costs could be catastrophic.
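The difference in stakes can be sketched with a simple expected-cost calculation. Every number below is invented for illustration (the error rates, the per-error dollar costs and the two scenarios are assumptions, not figures from any real deployment), and the function name `expected_error_cost` is hypothetical.

```python
def expected_error_cost(n_cases, fp_rate, fn_rate, cost_fp, cost_fn):
    """Expected total cost of a model's mistakes over n_cases decisions.

    fp_rate and fn_rate are the fractions of ALL cases that end up as
    false positives and false negatives; cost_fp and cost_fn are the
    dollar costs of one error of each kind.
    """
    return n_cases * (fp_rate * cost_fp + fn_rate * cost_fn)


# Hypothetical ad-targeting scenario: errors are cheap.
ad_cost = expected_error_cost(
    n_cases=1_000_000,
    fp_rate=0.05, cost_fp=0.01,   # wasted ad impression
    fn_rate=0.02, cost_fn=0.50,   # missed sale
)

# Hypothetical predictive-maintenance scenario: a missed failure is very expensive.
iot_cost = expected_error_cost(
    n_cases=10_000,
    fp_rate=0.05, cost_fp=2_000,       # unnecessary teardown
    fn_rate=0.001, cost_fn=5_000_000,  # undetected failure in service
)

if __name__ == "__main__":
    print(f"ad targeting:           ${ad_cost:,.0f}")
    print(f"predictive maintenance: ${iot_cost:,.0f}")
```

Under these made-up inputs, the maintenance model makes far fewer decisions and has a much lower false-negative rate, yet its expected cost of being wrong is several thousand times higher, because a single missed failure dwarfs every other term.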