GitHub search tool 2023-01-12
Explore GitHub events with natural-language queries and SQL.
Data Explorer is an AI-powered tool for exploring GitHub event data quickly. It is built with Chat2Query, an AI-powered SQL generator in TiDB Cloud, and draws on GH Archive, which has collected and archived GitHub event data since 2011.

Data Explorer lets users ask questions in natural language and automatically generates the corresponding SQL queries. The query results are then displayed visually, helping users discover insights in the data quickly. Data Explorer can be used to explore any dataset, not just GitHub data.

Data Explorer can handle complex analytical queries and is optimized to store and serve massive amounts of data.

It can also suggest popular questions near the search box to help users explore faster. Data Explorer does have limitations: it lacks context and domain knowledge, and it may not produce the most efficient SQL statement for large and complex queries.

To help the AI understand your intent, use clear, specific phrases in your question. Query templates near the search box also offer a starting point for exploration.
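
To make the question-to-SQL step more concrete, here is a minimal sketch in Python. It does not call Data Explorer or Chat2Query. It runs a hand-written query, of the kind a generator might plausibly produce for a clear, specific question, against an in-memory SQLite database. The `github_events` table, its columns, and the sample rows are assumptions made for illustration, and the real schema may differ. The one grounded detail is that GH Archive records a star as a `WatchEvent`.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only; the tables that
# Data Explorer actually queries may be named and shaped differently.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE github_events ("
    " id INTEGER PRIMARY KEY,"
    " type TEXT,"
    " repo_name TEXT,"
    " created_at TEXT)"
)

# Made-up sample rows (not real GitHub data).
rows = [
    ("WatchEvent", "pingcap/tidb", "2022-03-01"),
    ("WatchEvent", "pingcap/tidb", "2022-05-17"),
    ("WatchEvent", "torvalds/linux", "2022-07-09"),
    ("PushEvent", "torvalds/linux", "2022-07-10"),
    ("WatchEvent", "pingcap/tidb", "2021-12-31"),
]
conn.executemany(
    "INSERT INTO github_events (type, repo_name, created_at) VALUES (?, ?, ?)",
    rows,
)

# A clear, specific question: it names the metric (stars) and the period (2022).
question = "Which repositories received the most stars in 2022?"

# Hand-written SQL standing in for generated output (an assumption, not
# captured from the tool). In GH Archive data, a star is a WatchEvent.
generated_sql = """
SELECT repo_name, COUNT(*) AS stars
FROM github_events
WHERE type = 'WatchEvent'
  AND created_at >= '2022-01-01'
  AND created_at < '2023-01-01'
GROUP BY repo_name
ORDER BY stars DESC, repo_name ASC
LIMIT 10
"""

result = conn.execute(generated_sql).fetchall()
for repo_name, stars in result:
    print(f"{repo_name}: {stars}")
```

A vaguer question such as "What is popular?" leaves the metric and the time range open, which is where a generator without context or domain knowledge is most likely to guess wrong.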


Pros and Cons


Pros
Explore GitHub event data
Built with Chat2Query
Utilizes GH Archive
Natural language SQL generation
Visual query results
Handles complex analytical queries
Optimized for massive data
Question suggestions for faster exploration
Query templates for initial exploration
Capable of exploring any dataset
Optimized for large-volume data
Real-time data updates
Offers pay-as-you-go pricing model
Can analyze up to 5 billion events
Interactive query in 2D and 3D
Live data updates
Query results showcased visually
Supports trend analysis
Comparison between different datasets
Frequent query suggestions
Quick discovery of insights
No SQL knowledge required
Custom dataset import
Historical data from 2011
Database stats tracking
Possible to report issues
Supports complex queries
Shows data in ranking format
Shows language-related trends
Handles multiple workload types


Cons
Lack of context understanding
No domain knowledge
Inefficient SQL generation
Occasional service instability
Limited by dataset
No results for misinterpreted questions
Incorrect SQL queries
Chart generation failure
Unable to handle excessive requests
Limited to 15 questions per hour
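
For scripted or repeated use, the hourly question limit listed above can be respected on the client side. The sketch below is one possible approach, not part of Data Explorer: `HourlyQuestionBudget` is a hypothetical helper, and it assumes the limit behaves like a sliding one-hour window, which the listing does not confirm.

```python
import time
from collections import deque


class HourlyQuestionBudget:
    """Hypothetical client-side guard for a questions-per-hour limit.

    Assumes a sliding window; the service may count questions differently.
    """

    def __init__(self, max_questions=15, window_seconds=3600.0, clock=time.monotonic):
        self._max = max_questions
        self._window = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of questions already asked

    def try_ask(self):
        """Return True and record the question if budget remains, else False."""
        now = self._clock()
        # Forget questions that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            self._sent.append(now)
            return True
        return False


# Usage with a fake clock, so the example runs instantly.
fake_now = [0.0]
budget = HourlyQuestionBudget(clock=lambda: fake_now[0])

accepted = [budget.try_ask() for _ in range(16)]  # the 16th attempt is refused
fake_now[0] = 3600.0                              # one hour later
allowed_after_wait = budget.try_ask()             # the window has cleared
```

Checking the budget before sending a question avoids spending a request that would be rejected anyway.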


