March 20, 2026
Inputs: Tabular, API, Text
Outputs: Tabular, API, Text
Easy data preparation with AI-powered operators.

Overview

OpenDCAI/DataFlow is a data preparation and training system. It generates, refines, evaluates, and filters high-quality data for AI from noisy sources such as PDFs, plain text, and low-quality QA.

It aims to improve the performance of large language models (LLMs) through targeted training in specific domains such as healthcare, finance, legal, and academic research.

The system uses an operator-based design to turn the entire data cleaning workflow into a reproducible, reusable, and shareable pipeline. It is intended as core infrastructure for the Data-Centric AI community.
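To make the operator-based idea concrete, here is a minimal sketch in plain Python. It does not use DataFlow's actual API: the names `Pipeline`, `strip_whitespace`, `drop_short`, and `deduplicate` are hypothetical, chosen only to show how small, single-purpose operators can be chained into a reusable cleaning pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration; not DataFlow's real interfaces.
Record = Dict[str, str]
Operator = Callable[[List[Record]], List[Record]]


def strip_whitespace(records: List[Record]) -> List[Record]:
    """Refine: collapse runs of whitespace in each record's text."""
    return [{**r, "text": " ".join(r["text"].split())} for r in records]


def drop_short(min_chars: int) -> Operator:
    """Filter: build an operator that drops records shorter than min_chars."""
    def op(records: List[Record]) -> List[Record]:
        return [r for r in records if len(r["text"]) >= min_chars]
    return op


def deduplicate(records: List[Record]) -> List[Record]:
    """Filter: keep the first occurrence of each text, ignoring case."""
    seen = set()
    unique = []
    for r in records:
        key = r["text"].lower()
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


@dataclass
class Pipeline:
    """An ordered list of operators applied one after another."""
    operators: List[Operator] = field(default_factory=list)

    def run(self, records: List[Record]) -> List[Record]:
        for op in self.operators:
            records = op(records)
        return records


# The same pipeline object can be reused on any batch of records.
pipeline = Pipeline([strip_whitespace, drop_short(10), deduplicate])
raw = [
    {"text": "  What is   overfitting? "},
    {"text": "what is overfitting?"},
    {"text": "ok"},
]
cleaned = pipeline.run(raw)
print(cleaned)
```

Because each operator takes and returns a list of records, the pipeline definition is just an ordered list, which is what makes it easy to rerun, reuse, and share.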

OpenDCAI/DataFlow also includes an intelligent agent that can dynamically assemble new pipelines, either by recombining existing operators or by creating new ones on demand.
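One way to picture recombining existing operators is a registry that maps operator names to functions, so that a plan (here a hard-coded list of names standing in for whatever an agent would produce) can be turned into a runnable pipeline. This is a hedged sketch, not DataFlow's implementation: `REGISTRY`, `register`, and `assemble` are hypothetical names.

```python
from typing import Callable, Dict, List

# Hypothetical registry for illustration; not DataFlow's real API.
Operator = Callable[[List[str]], List[str]]
REGISTRY: Dict[str, Operator] = {}


def register(name: str) -> Callable[[Operator], Operator]:
    """Decorator that stores an operator in the registry under a name."""
    def decorator(fn: Operator) -> Operator:
        REGISTRY[name] = fn
        return fn
    return decorator


@register("strip")
def strip(texts: List[str]) -> List[str]:
    return [t.strip() for t in texts]


@register("drop_empty")
def drop_empty(texts: List[str]) -> List[str]:
    return [t for t in texts if t]


@register("lowercase")
def lowercase(texts: List[str]) -> List[str]:
    return [t.lower() for t in texts]


def assemble(names: List[str]) -> Operator:
    """Build a pipeline function from an ordered list of operator names."""
    missing = [n for n in names if n not in REGISTRY]
    if missing:
        raise KeyError(f"unknown operators: {missing}")
    steps = [REGISTRY[n] for n in names]

    def pipeline(texts: List[str]) -> List[str]:
        for step in steps:
            texts = step(texts)
        return texts

    return pipeline


# In this sketch the plan is fixed; an agent would choose it dynamically.
plan = ["strip", "drop_empty", "lowercase"]
run = assemble(plan)
result = run(["  Hello ", "   ", "World"])
print(result)
```

Validating the plan against the registry before building the pipeline means an unknown operator name fails early, which is the point at which a new operator would need to be created and registered.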

It helps generate high-quality LLM training datasets from raw data through visual, low-code pipelines that can be orchestrated flexibly across domains and use cases.

The tool also supports text, math, and code data generation, and provides tools such as AgenticRAG and Text2SQL for data creation. Other features include large-scale PDF-to-QA conversion and structured data extraction.


Releases

Initial release
March 20, 2026
Initial release of DataFlow.

Pricing

Pricing model: Free
Paid options from: Free