Analyze data using a natural language interface.
RTutor is an AI-based tool for data analysis. It provides a natural language interface for users to interact with their data. It can be used to generate R and Python code for various statistical analyses and generate reports in HTML format.

It is built on OpenAI’s text-davinci-003 language model, which translates natural-language requests into R and Python code. It supports data files in CSV, TSV (tab-delimited text), and Excel formats.

It can also detect data types automatically, convert numeric columns to factors, and generate descriptive summaries and plots, as well as code for correlation, GGpairs, and other analyses. RTutor accepts requests in dozens of human languages, including Chinese, Ukrainian, Arabic, Hindi, Spanish, German, French, Luxembourgish, Vietnamese, Portuguese, Japanese, Italian, Persian, and more. It can also answer generic questions that do not mention column names, because it can detect and understand context.
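To make the idea of automatic type detection concrete, here is a minimal sketch in Python using only the standard library. It is not RTutor's actual implementation: the function names `infer_column_types` and `summarize`, the `max_factor_levels` threshold, and the sample rows are all illustrative assumptions. The sketch treats a column as numeric when every value parses as a number, and relabels a numeric column as a factor when it has only a few distinct values.

```python
import csv
import io


def infer_column_types(rows, max_factor_levels=3):
    """Classify each column as 'numeric', 'factor', or 'text'.

    A column is numeric when every non-empty value parses as a float.
    A numeric column with at most `max_factor_levels` distinct values is
    labeled a factor. The threshold is an assumption for illustration,
    not a documented RTutor rule.
    """
    types = {}
    for name in rows[0].keys():
        values = [row[name] for row in rows if row[name] != ""]
        if not values:
            types[name] = "text"  # nothing to infer from an empty column
            continue
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            types[name] = "text"
            continue
        if len(set(numbers)) <= max_factor_levels:
            types[name] = "factor"
        else:
            types[name] = "numeric"
    return types


def summarize(rows, name):
    """Return a small descriptive summary for one numeric column."""
    values = [float(row[name]) for row in rows if row[name] != ""]
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


# Small illustrative sample standing in for an uploaded CSV file.
sample_csv = """mpg,cyl,model
21.0,6,Mazda RX4
22.8,4,Datsun 710
21.4,6,Hornet 4 Drive
18.7,8,Hornet Sportabout
18.1,6,Valiant
14.3,8,Duster 360
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
col_types = infer_column_types(rows)
summary = summarize(rows, "mpg")

print(col_types)
print(summary)
```

With this sample, `cyl` has only three distinct values and is labeled a factor, while `mpg` stays numeric and `model` is text. A real tool would likely use a more careful rule, for example one that also considers the number of rows.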

RTutor is released as a prototype for testing and improvement. It is a personal project of Steven Ge and is freely available to academic and non-profit organizations only.

Commercial use beyond testing is not allowed.


Pros and Cons


Generates R and Python code
Supports CSV, TSV, Excel formats
Auto-detects data types
Generates descriptive summaries
Generates correlation, GGpairs analyses
Supports numerous human languages
Detects and understands context
Auto-converts numeric columns to factors
Available for academic, non-profit
Generates HTML reports
Prototype for testing, improvement
Translates natural language to code
Generic question answering
Generates code chunks
Data frame auto-loading as 'df'
Voice input optional
Code refinement for analytics
Data visualization and exploration
Generates and runs Python code
Auto-converts a column to row names
Creates interactive plots
Filtering options for data
Provides code chunk records
Sends error messages and results


Only supports CSV, TSV, Excel formats
Commercial use not allowed
Still in prototype/testing phase
Requires data preparation in Excel
Does not handle large datasets (>10 MB)
Accuracy of generated code not guaranteed
Generates code only in R and Python
Does not include installed R packages
Can fail due to server load


