Control data distribution while allowing the flexibility to deliver data anywhere.

Cloudera DataFlow for the Public Cloud (CDF-PC) is a cloud-native universal data distribution service powered by Apache NiFi that lets developers connect to any data source anywhere, with any structure, process the data, and deliver it to any destination.

CDF-PC offers a flow-based, low-code development paradigm that aligns with how developers design, develop, and test data distribution pipelines. With more than 400 connectors and processors across the ecosystem of hybrid cloud services (including data lakes, lakehouses, cloud warehouses, and on-premises sources), CDF-PC provides indiscriminate data distribution. These data distribution flows can then be version-controlled into a catalog where operators can self-serve deployments to different runtimes.



Universal Data Distribution Service Powered by Apache NiFi

CDF for Public Cloud diagram

Connect to any data source anywhere, process, and deliver to any destination

Use cases

  • Data lakehouse ingest
  • Cybersecurity & log optimization
  • IoT & streaming data collection

Data lakehouse ingest

Modernize data pipelines with a single tool that works with any data lakehouse or warehouse.

With support for more than 400 processors, CDF-PC makes it easy to collect and transform data into the format that your lakehouse of choice requires.

CDF-PC gives you the flexibility to treat unstructured data as such and achieve high throughput by not enforcing a schema, or to give that data a structure by applying a schema and then use the NiFi Expression Language or SQL queries to transform it.

Data Lakehouse ingest diagram

Cybersecurity & log optimization

Enable data analysts to detect and analyze events faster and more accurately by curating SIEM data.

Lower the cost of your cybersecurity solution by modernizing the data collection pipelines to collect and filter real-time data from thousands of sources worldwide.

Ingesting all device and application logs into your SIEM solution is not a scalable approach from a cost and performance perspective. CDF-PC allows you to collect log data from anywhere and filter out the noise, keeping the data stored in your SIEM system manageable.
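As a rough illustration of what "filtering out the noise" can mean, here is a minimal Python sketch that keeps only events at or above a severity threshold before they would be forwarded to a SIEM. It is not CDF-PC or NiFi code: the JSON log format, the `level` field, the severity names, and the `filter_events` function are all assumptions made for this example. In an actual flow, this kind of step would be configured with NiFi processors rather than written by hand.

```python
import json

# Hypothetical severity ordering; real log sources may use other names.
SEVERITY_RANK = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}


def filter_events(lines, min_severity="WARN"):
    """Return the parsed events whose level is at or above min_severity.

    Each input line is assumed to be one JSON object with a "level" field.
    Lines that are not JSON objects, or that carry an unknown level, are dropped.
    """
    threshold = SEVERITY_RANK[min_severity]
    kept = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # not valid JSON: treat as noise in this sketch
        if not isinstance(event, dict):
            continue  # valid JSON, but not an object with fields
        level = str(event.get("level", "")).upper()
        if SEVERITY_RANK.get(level, -1) >= threshold:
            kept.append(event)
    return kept
```

Dropping unparseable lines keeps the sketch short. A production pipeline would more likely route them to a separate destination for inspection instead of discarding them.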

DataFlow Monitoring screenshot

IoT & streaming data collection

Send data from IoT devices at the edge to a central data flow in the cloud that scales up and down as needed.

CDF-PC is built for handling streaming data at scale, allowing organizations to start their IoT projects small, but with the confidence that their data flows can manage data bursts caused by adding more source devices as well as handle intermittent connectivity issues.

IoT and Streaming Data Collection diagram

Key features

NiFi Flow Deployments automatically scale up and down based on CPU utilization. Infrastructure costs can be controlled by setting minimum and maximum boundaries for auto-scaling.
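To make the role of the minimum and maximum boundaries concrete, here is a minimal Python sketch of a CPU-based scaling decision. It assumes a proportional rule (scale the node count by the ratio of observed to target CPU utilization, rounding up) and then clamps the result to the configured bounds. The rule, the `desired_nodes` function, and its parameter names are assumptions for illustration only; the source does not describe the scaling algorithm CDF-PC actually uses.

```python
def desired_nodes(current_nodes, cpu_percent, target_percent, min_nodes, max_nodes):
    """Suggest a node count from CPU utilization, clamped to [min_nodes, max_nodes].

    All arguments are integers; cpu_percent and target_percent are 0-100 values.
    The proportional rule used here is an assumption, not CDF-PC's documented logic.
    """
    if target_percent <= 0:
        raise ValueError("target_percent must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    # Integer ceiling of current_nodes * cpu_percent / target_percent.
    raw = (current_nodes * cpu_percent + target_percent - 1) // target_percent
    # The boundaries cap cost on the way up and preserve capacity on the way down.
    return max(min_nodes, min(max_nodes, raw))
```

For example, three nodes running at 90% CPU against a 60% target would suggest five nodes, but with a maximum of four the deployment stops at four, which is how the upper bound limits infrastructure cost.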

Connect to any data source or target using NiFi's rich processor library, including on-premises data sources, cloud data storage, cloud data warehouses, log data sources, cloud data analytics services, or cloud business process services. Developers can also quickly deploy a predefined set of data flows with minimal configuration called ReadyFlows to implement the most common data flow use cases.

Monitor all your NiFi flow deployments in a single dashboard, regardless of which cloud they run on. Track important flow performance metrics by defining KPI alerts for your flow deployments.

Provision secure, stable, and scalable endpoints so that any application can easily send data to flow deployments.

CDF-PC is built with automation in mind. Any action that is performed in the UI can be turned into a CLI statement for automation. Deploying a new NiFi flow is as easy as executing a single CLI command.

Define who can enable CDF-PC, create new flow deployments, or monitor existing flow deployments by assigning predefined roles such as Flow Administrator or Flow User to individual CDP users or groups.

DataFlow for Public Cloud flow diagram



Experience DataFlow for the Public Cloud for yourself

Collect data from the edge and stream to the universal distribution service

Use Cloudera Edge Management to manage, control, and monitor the edge for streaming and IoT initiatives, and to deliver real-time streaming data with no-code ingestion and management.

Get started


Find technical specs, architecture, and tutorials about Cloudera DataFlow for the Public Cloud.

Learn more


Evaluate Cloudera DataFlow for the Public Cloud pricing across public cloud instances.

Get the details


Get a hands-on tour of Cloudera DataFlow for the Public Cloud.

Access now


Connect with your peers, ask questions, troubleshoot, and learn more about Apache NiFi.

Explore now


Book a three-day, hands-on training course on Apache NiFi fundamentals and more.

Go learn


Watch the introduction and demonstration of Cloudera DataFlow for the Public Cloud.

Go watch


End to End Universal Data Distribution Demo


Moving enterprise data from anywhere to any system made easy


Blog: Streaming Edge Data Collection and Global Data Distribution


Ask the experts: Cloudera DataFlow for the Public Cloud

