CSV data cleaning, validation, and column mapping.

YoBulk is an open-source CSV importer that uses OpenAI GPT-3 for column matching, data cleaning, and JSON schema generation.

It is designed to scale to files in the gigabyte range. Transformations run on stream buffers rather than on the whole file at once, with backpressure and pacing handled inside the pipeline.
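To make the streaming idea concrete, here is a minimal sketch in Python. It is not YoBulk's implementation; the function names (read_rows, clean_rows, write_rows, run_pipeline) are hypothetical and the whitespace trimming stands in for real cleaning rules. It relies on generators being pull-based: the reader only advances when the writer asks for the next row, which gives a simple form of backpressure and keeps one row in memory at a time.

```python
import csv
import io
from typing import Iterable, Iterator, TextIO


def read_rows(source: TextIO) -> Iterator[dict]:
    """Yield one row at a time; csv.DictReader reads lazily from the file object."""
    yield from csv.DictReader(source)


def clean_rows(rows: Iterable[dict]) -> Iterator[dict]:
    """Trim whitespace from headers and values (a stand-in for real cleaning rules)."""
    for row in rows:
        yield {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None  # skip overflow cells from rows longer than the header
        }


def write_rows(rows: Iterable[dict], sink: TextIO) -> int:
    """Write rows as they arrive and return how many were written."""
    writer = None
    count = 0
    for row in rows:
        if writer is None:
            # Create the writer lazily so the header comes from the first cleaned row.
            writer = csv.DictWriter(sink, fieldnames=list(row.keys()), lineterminator="\n")
            writer.writeheader()
        writer.writerow(row)
        count += 1
    return count


def run_pipeline(source: TextIO, sink: TextIO) -> int:
    """Chain the stages; nothing is read until write_rows pulls the next row."""
    return write_rows(clean_rows(read_rows(source)), sink)


if __name__ == "__main__":
    # In-memory buffers keep the demo self-contained; real use would pass open files.
    demo_in = io.StringIO(" name ,email\n Ada , ada@example.com \n")
    demo_out = io.StringIO()
    run_pipeline(demo_in, demo_out)
    print(demo_out.getvalue(), end="")
```

Because each stage is a generator, swapping the in-memory buffers for files opened with open() processes a large file without changing the pipeline code.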

The spreadsheet interface highlights errors clearly, which simplifies data cleaning. Developers can also build a custom CSV importer with their own validation rules based on JSON Schema.
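As an illustration of schema-driven validation rules, the sketch below checks CSV rows against a small JSON Schema-style definition. The schema, field names, and the validate_row helper are assumptions made for this example and are not taken from YoBulk. Only a hand-rolled subset of keywords is handled (required, type, pattern, minimum); a real importer would more likely rely on a full JSON Schema validator.

```python
import re
from typing import Any

# Hypothetical schema for illustration; not a template shipped with YoBulk.
SCHEMA = {
    "type": "object",
    "required": ["email", "age"],
    "properties": {
        "email": {"type": "string", "pattern": r"^[^@\s]+@[^@\s]+\.[^@\s]+$"},
        "age": {"type": "integer", "minimum": 0},
    },
}


def coerce(value: str, expected_type: str) -> Any:
    """CSV cells are always text, so convert them before checking numeric rules."""
    if expected_type == "integer":
        return int(value)
    if expected_type == "number":
        return float(value)
    return value


def validate_row(row: dict, schema: dict) -> list:
    """Return a list of readable errors; an empty list means the row passed."""
    errors = []
    for field in schema.get("required", []):
        if not row.get(field):
            errors.append(f"{field}: required value is missing")
    for field, rules in schema.get("properties", {}).items():
        raw = row.get(field) or ""
        if raw == "":
            continue  # absence is already reported by the "required" check
        expected_type = rules.get("type", "string")
        try:
            value = coerce(raw, expected_type)
        except ValueError:
            errors.append(f"{field}: expected {expected_type}, got {raw!r}")
            continue
        # As in JSON Schema, "pattern" applies to strings and is an unanchored search.
        if "pattern" in rules and isinstance(value, str) and not re.search(rules["pattern"], value):
            errors.append(f"{field}: does not match pattern")
        if "minimum" in rules and not isinstance(value, str) and value < rules["minimum"]:
            errors.append(f"{field}: must be >= {rules['minimum']}")
    return errors


if __name__ == "__main__":
    print(validate_row({"email": "not-an-email", "age": "-1"}, SCHEMA))
```

Returning every error for a row, rather than stopping at the first, suits a review interface where all problems in a row are shown together.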

YoBulk also provides a Docker image for installing the tool on your own server, so all data cleaning and onboarding can be done in-house, which eases data privacy concerns.

YoBulk's features include GPT-3 integration, intelligent column mapping, a framework for writing your own validation rules, no-code template generation, a delightful error review experience, bring-your-own-database support, and a backend API for headless CSV importing.
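Column mapping means pairing each header in an uploaded file with a field in the target template. YoBulk does this with GPT-3; the sketch below deliberately swaps in plain string similarity from Python's standard difflib module, so it shows only the shape of the problem, not YoBulk's method. The function names and the 0.6 cutoff are assumptions chosen for the example.

```python
import difflib
import re


def normalize(name: str) -> str:
    """Lowercase and keep only letters and digits, so 'E-mail' becomes 'email'."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_columns(csv_headers, template_fields, cutoff=0.6):
    """Map each CSV header to the closest template field, or None if nothing is close enough."""
    normalized_fields = {normalize(field): field for field in template_fields}
    mapping = {}
    for header in csv_headers:
        matches = difflib.get_close_matches(
            normalize(header), list(normalized_fields), n=1, cutoff=cutoff
        )
        mapping[header] = normalized_fields[matches[0]] if matches else None
    return mapping


if __name__ == "__main__":
    print(map_columns(["First Name", "E-mail", "zzz"], ["first_name", "email", "phone"]))
```

Headers that map to None are the ones a user would need to resolve by hand. String similarity cannot match synonyms such as "surname" and "last_name", which is the kind of case a language model is better placed to handle.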

Upcoming features include Postgres and MySQL support, one-click data error fixing, cloud and multi-tenant hosting, NLP models for automatic data correction, webhooks for custom data processing, and more.

YoBulk has an open-source community on Slack and GitHub, along with demo videos and a newsletter.



