Open Network Insight: Changing InfoSec Data Science Forever

Working in information security (InfoSec) is a data scientist’s dream job. InfoSec is the only part of the organization where a data scientist’s well-articulated argument for new data ingestion is met with high-fives instead of groans. Over time, I have adopted terms I had never previously thought of using to describe a data store subjectively. Security professionals see better data as the only way to lift the fog of war. They don’t need to throw around terms like “Big Data” and “Data Lake”; instead, people just ask about visibility. InfoSec is still a field where a literal team of people may be working to undo your best work, where a planning failure may suddenly turn budgets exponential, and where “secure” is a brass ring often chased alone on the bleeding edge.

In a field where secrecy is often the default, open source projects may present a compelling path forward to a next-generation security strategy. For organizations running event-driven security teams, Open Network Insight (ONI) is an advanced detection solution that proposes a unique paradigm for threat analysis, using big data analytics to provide actionable insights into operational and security threats. An extensible open source project layered on Cloudera’s enterprise data hub (EDH), ONI can analyze billions of events per day in order to detect unknown threats and insider threats and to gain a new level of visibility.


Events in a queue aren’t enough

Visibility describes both the quantity of data available and the quality and effectiveness with which that data is used. Visibility is an important topic of discussion in security because it is the fuel that feeds a team’s ability to apply their own skill to the problems at hand. The skill level of the individual actors on a security team and the ability to empower those actors are both important. Security teams at companies I’ve been lucky enough to engage with will sometimes try to convey a sense of the day-to-day schedule of their antiparts (literal groups of people working to break or circumvent anything they build).

Visual interaction is a core piece of ONI’s operational analytics. In its current state, the front end of ONI shifts the focus from list reading to pattern searches. Visual keys are designed as callouts to analysts using the system. Large-scale data exfiltration is surfaced through symbols that are visibly large. These types of intuitive cues are mainstays of the BI reporting world that now need to make their home in security.
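As a rough illustration of that kind of visual cue, here is a minimal sketch that maps the volume of a network flow to the size of the symbol that represents it. This is not ONI's front-end code; the `marker_radius` helper, its parameters, and the sample flows are all hypothetical and exist only to show the idea of sizing a symbol by bytes transferred.

```python
import math

def marker_radius(bytes_out, min_radius=2.0, max_radius=40.0, max_bytes=10**12):
    """Map bytes transferred to a symbol radius (hypothetical helper, not ONI code)."""
    if bytes_out <= 0:
        return min_radius
    # Log scaling keeps small flows visible while letting very large
    # transfers stand out; anything above max_bytes is capped at max_radius.
    fraction = min(math.log10(bytes_out + 1) / math.log10(max_bytes + 1), 1.0)
    return min_radius + fraction * (max_radius - min_radius)

# Made-up flows: one ordinary request and one very large outbound transfer.
flows = [
    {"src": "10.0.0.5", "dst": "203.0.113.9", "bytes_out": 1_200},
    {"src": "10.0.0.7", "dst": "198.51.100.4", "bytes_out": 85_000_000_000},
]
for flow in flows:
    flow["radius"] = marker_radius(flow["bytes_out"])
```

With a mapping like this, the large outbound transfer is drawn with a much bigger symbol than the ordinary request, so an analyst scanning the view sees it without reading a list.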

The Dangers of Complexity

If the antiparts of your security organization weren’t bad enough, the current vendor landscape isn’t exactly a model for team play. The organizational need to purchase, deploy, and ultimately support multiple solutions creates an ever-increasing burden. The impact of this burden is usually presented as cost, but the true burden is the real human cost of teams having to manage workflows using “n” number of tools. Failures in interoperability between disparate products prevent organizations from unlocking the true value of their human security capital.

Scale is often a very relative term for a security organization. It is more than just the raw count of endpoints; it is the unique combination of endpoints weighed against the way the business layers value estimations over assets. The unique thumbprint of an organization requires scaling a security posture that is equally unique. Any company that scales long enough will increasingly find threat exposures in areas where the market is simply not innovating. In these areas, the security organization’s ability to create institutional memory around its unique use cases is vitally important. More than captured processes, this institutional memory must be quantified in code.

Establishing ONI as an open project has been a core part of its DNA. The current generation of the underlying machine learning is purpose-driven, but that does not restrict the expandability of the remaining features. Contributions from the community that extend and incorporate additional workflows, information sources, and data models position ONI as a core project InfoSec teams can build on for many years to come.

How can we innovate?

Analytics teams wanting to support organizations serious about solving these challenges have a number of obstacles to overcome. ONI opens an important new path for organizations to get past challenges at scale, and it provides a framework for analytics teams to overcome obstacles and create effective value-add analytics. The base state of ONI contains algorithms that address difficult security use cases and that, once deployed, can be used without the direct intervention of data scientists. Advanced machine learning takes analyst feedback, continually training and improving the outputs of the system. Finally, the integration of Jupyter notebooks gives a security analyst the ability to capture process, incorporate external data, leverage existing Python libraries, or even pull rule-based events from an existing SIM into their workflows. The constant feedback loop created by this approach, combined with the license-free, open nature of the platform, presents a unique opportunity for organizations to consolidate workflows into a common working area that can be supported and improved by various actors inside InfoSec.
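To make the feedback loop concrete, here is a minimal sketch of the kind of thing an analyst could prototype in a notebook. It is not ONI's algorithm: I have swapped in a simple frequency-based rarity score, and the `RarityScorer` class, the `weight` parameter, and the sample events are all hypothetical. The point is only to show analyst feedback changing what the system surfaces next.

```python
from collections import Counter

class RarityScorer:
    """Rank events by how rarely their (source, destination port) pair occurs.

    Illustrative stand-in only; this is not the model ONI uses.
    """

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    @staticmethod
    def _key(event):
        return (event["src"], event["dport"])

    def fit(self, events):
        for event in events:
            self.counts[self._key(event)] += 1
            self.total += 1

    def add_feedback(self, event, weight=50):
        # An analyst marked this pattern benign: treat it as if it had been
        # observed `weight` more times, so it stops looking rare.
        self.counts[self._key(event)] += weight
        self.total += weight

    def score(self, event):
        # Smoothed relative frequency; lower means more suspicious.
        return (self.counts[self._key(event)] + 1) / (self.total + 1)

    def most_suspicious(self, events, n=1):
        return sorted(events, key=self.score)[:n]

# Made-up traffic: lots of ordinary web traffic plus two rare patterns.
web = {"src": "10.0.0.5", "dport": 443}
irc = {"src": "10.0.0.5", "dport": 6667}
backup = {"src": "10.0.0.9", "dport": 8443}

scorer = RarityScorer()
scorer.fit([web] * 98 + [irc, backup])

score_before = scorer.score(backup)
scorer.add_feedback(backup)          # analyst: "this is the nightly backup job"
score_after = scorer.score(backup)
top_after = scorer.most_suspicious([backup, irc, web], n=1)[0]
```

After the feedback, the backup pattern no longer ranks as rare, and the remaining rare connection (port 6667 in this made-up data) rises to the top of the list for the next review.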


Learn more about ONI here.

Learn more about how Cloudera helps cybersecurity professionals here.


Austin Leahy is a contributor on Open Network Insight, currently living in the Bay Area. When he isn’t contributing to open source, Austin consults as the principal data scientist for global threat management at eBay. Austin has built a career advising large organizations such as PepsiCo, Blue Cross Blue Shield, General Mills, and ACS on building advanced analytics.

