Category Archives: Data Science

Continuous Ingest in the Face of Data Drift (Part 2)

Categories: Analytic Database, Data Science, Enterprise Data Hub, General, Partners, Product

In my previous post, I discussed the causes and impacts of data drift, a natural consequence of Big Data that creates serious data quality and pipeline operations issues. Now I will describe the features of StreamSets Data Collector, explain how they address ingesting data in a “drifty” environment, and walk through some common use cases. StreamSets…


Continuous Ingest in the Face of Data Drift (Part 1)

Categories: Analytic Database, Data Science, Enterprise Data Hub, General, Partners, Product

Big data has come a long way, with adoption accelerating as CIOs recognize the business value of extracting insights from the troves of data collected by their companies and business partners. But, as is often the case with innovations, mainstream adoption of big data has exposed a new challenge: how to ingest data continuously from…


Trifacta & Cloudera Navigator – Bringing User-Generated Context to Apache Hadoop Metadata

Categories: Analytic Database, Data Science, Enterprise Data Hub, General, Partners

Previously, we announced that the leaders in the data governance space have joined Cloudera to provide a unified foundation for open metadata and end-to-end visibility for governance. Today, we are happy to host this guest blog from Sean Ma, Director of Product Management at Trifacta.

In the last couple of years, organizations have dramatically changed the way they…
