Gradio for Data Science

Closed May 3, 2023 89% complete

Gradio for Data Science Notes

Goal: Make Gradio a go-to tool in the day-to-day work of a data scientist.

Key takeaways:
1. Data scientists spend 90% of their time cleaning and understanding data.
2. They spend the other 10% demoing machine learning models.
3. We think Gradio can add value for users by making the intermediate outputs of points 1 and 2 easier to share.
   Merve shared a story about having to share the results of an analysis via a Flask app in her previous job 😂
4. Observations from conferences: we need to make onboarding new users as easy as possible.
5. Demos built by data scientists

Suggested Improvements to UX/UI:
* Dataframes could not display a dynamic number of columns, so this dataset profiler had to use a file uploader instead of a dataframe component. This may have been fixed since. Action Item: Verify, and if so, update the demo to use a dataframe.
* The dataset profiler demo could not display HTML output, so the output had to be uploaded to a new space. Action Item: Check whether gr.HTML can fix the problem.
* Data scientists will likely want to display multiple plots in their demo. It is not intuitive how to control the layout of multiple plots, especially when the number of plots is dynamic. Users have to either put all the plots in one mega plot in Python code or get creative with for loops in the layout creation code. Example: this anomaly detection demo. Related Discord thread. Action Item: Can the Gallery be used for this example?
* There is some confusion about what to return from matplotlib (fig vs plt) when populating a gr.Plot component. Related Discord thread. Action Item: Can we make this easier for users?
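
To make the last two points concrete, here is a minimal sketch of the "one mega plot" workaround: a variable number of subplots is drawn into a single matplotlib Figure, and the function returns that Figure object (`fig`) rather than the `plt` module. The helper names `grid_shape`, `make_grid_figure`, and `build_demo` are hypothetical, the data is made up, and the three-column layout is an arbitrary choice. It assumes gr.Plot accepts a matplotlib Figure as its value.

```python
import math

import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is needed on a server
import matplotlib.pyplot as plt


def grid_shape(n_plots, max_cols=3):
    """Return (rows, cols) for a grid that holds n_plots subplots."""
    if n_plots < 1:
        raise ValueError("n_plots must be at least 1")
    cols = min(n_plots, max_cols)
    rows = math.ceil(n_plots / cols)
    return rows, cols


def make_grid_figure(n_plots):
    """Draw n_plots toy line charts into one Figure and return the Figure."""
    n_plots = int(n_plots)  # a slider may deliver a float
    rows, cols = grid_shape(n_plots)
    fig, axes = plt.subplots(rows, cols, figsize=(4 * cols, 3 * rows), squeeze=False)
    xs = list(range(10))
    for i, ax in enumerate(axes.flatten()):
        if i < n_plots:
            ax.plot(xs, [(i + 1) * x for x in xs])  # made-up data
            ax.set_title(f"Series {i + 1}")
        else:
            ax.set_visible(False)  # hide unused cells in the last row
    fig.tight_layout()
    return fig  # the Figure object, not the pyplot module


def build_demo():
    """Wire the figure function to a slider inside a Blocks layout."""
    import gradio as gr  # imported lazily so the helpers above work without Gradio

    with gr.Blocks() as demo:
        n = gr.Slider(1, 9, value=4, step=1, label="Number of plots")
        plot = gr.Plot()
        n.change(make_grid_figure, inputs=n, outputs=plot)
    return demo
```

Calling `build_demo().launch()` would start the app. Keeping everything in one Figure sidesteps the dynamic-layout problem, at the cost of handing subplot sizing over to matplotlib instead of Gradio.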

New Features/Integrations:
* Make it possible to automatically turn a skops model on the Hub into a Gradio app, just like we do for transformers with gr.Interface.load("...")
  * Short term -> Dataframe input to either a Label or Number output. skops models have a concept of examples, so add those examples to the Gradio app as well.
  * Long term -> Richer input components depending on the expected model dtypes, e.g. Dropdown for categoricals, Checkbox for bools. Automatically add prediction explanations.
* Gradio for dashboards
  * Make it easy to connect Gradio to a live data feed: programmatically query for new data on a fixed schedule and update the demo.
* Gradio for EDA
  * Make it easy to generate exploratory reports of data health with Gradio. Example: Streamlit demo.
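
As a rough illustration of the long-term skops idea (richer input components chosen from the expected dtypes), the sketch below maps a column's dtype name to a Gradio component. Everything here is an assumption for illustration only: the mapping table, the helper names `component_name` and `build_inputs`, and the example schema are hypothetical, and nothing in it reflects how skops actually stores or exposes model metadata.

```python
from typing import Optional

# Hypothetical mapping from a dtype name to a Gradio component class name.
# Neither skops nor Gradio defines this table; it is an illustration only.
COMPONENT_FOR_DTYPE = {
    "bool": "Checkbox",
    "category": "Dropdown",
    "int64": "Number",
    "float64": "Number",
    "object": "Textbox",
}

# Made-up schema: column name -> dtype name.
EXAMPLE_SCHEMA = {"age": "int64", "smoker": "bool", "region": "category"}


def component_name(dtype: str) -> str:
    """Pick a component class name for a dtype, falling back to Textbox."""
    return COMPONENT_FOR_DTYPE.get(dtype, "Textbox")


def build_inputs(schema: dict, choices: Optional[dict] = None) -> list:
    """Create one Gradio input component per column in the schema."""
    import gradio as gr  # imported lazily so the mapping works without Gradio

    choices = choices or {}
    inputs = []
    for column, dtype in schema.items():
        name = component_name(dtype)
        if name == "Dropdown":
            # Categorical columns need their allowed values supplied separately.
            inputs.append(gr.Dropdown(choices=choices.get(column, []), label=column))
        else:
            inputs.append(getattr(gr, name)(label=column))
    return inputs
```

The resulting list could be passed as `inputs` to `gr.Interface` together with a prediction function and a `gr.Label` or `gr.Number` output, which matches the short-term shape described above.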