J.J. Allaire: Founder & CEO of Posit, PBC
RStudio → Posit (July 2022)
Goal: Become a multi-language data science tools company
Early in the journey, but have made lots of investments already
Today I’ll share some of our work and talk about what’s next
Quarto Basics & Workflow
Introducing Quarto Dashboards
Posit and Python: History and Future
An open-source scientific and technical publishing system that builds on standard markdown with features essential for scientific communication.
Pandoc Markdown
Jupyter Kernels
Dozens of Output Formats
Specialized Project Types
Render to output formats:
# ipynb notebook
quarto render notebook.ipynb
quarto render notebook.ipynb --to docx
# plain text qmd
quarto render notebook.qmd
quarto render notebook.qmd --to pdf
Live preview server (re-render on save):
# ipynb notebook
quarto preview notebook.ipynb
# plain text qmd
quarto preview notebook.qmd
.qmd Files
penguins.qmd
---
title: "Palmer Penguins"
author: Norah Jones
date: March 12, 2023
format: html
jupyter: python3
---
```{python}
#| echo: false
import pandas as pd
df = pd.read_csv("palmer-penguins.csv")
df = df[["species", "island", "year",
         "bill_length_mm", "bill_depth_mm"]]
```
## Exploring the Data
See @fig-bill-sizes for an exploration of bill sizes.
```{python}
#| label: fig-bill-sizes
#| fig-cap: Bill Sizes by Species
import matplotlib.pyplot as plt
import seaborn as sns
g = sns.FacetGrid(df, hue="species", height=3)
g.map(plt.scatter, "bill_length_mm", "bill_depth_mm") \
  .add_legend()
```
Editable with any text editor (extensions for VS Code, Neovim, and Emacs)
Cells always run in the same order
Integrates well with version control
Cache output with Jupyter Cache or Quarto freezer
Lots of pros and cons vis-à-vis the traditional .ipynb format and editors; use the right tool for each job
Notebook workflow (no execution occurs by default):
Plain text workflow (.qmd => .ipynb, then execute cells):
A new output format for easily creating
dashboards from Jupyter Notebooks
Navigation Bar and Pages: Icon, title, and author, along with links to sub-pages (if more than one page is defined).
Sidebars, Rows & Columns, and Tabsets: Rows and columns are defined with markdown headings (with optional attributes to control height, width, etc.). Sidebars hold interactive inputs. Tabsets further divide content.
Cards (Plots, Tables, Value Boxes, Content): Cards are containers for cell outputs and free-form markdown text. The content of cards typically maps to cells in your notebook or source document.
All of these components can be authored and customized within a notebook UI or in plain text .qmd.
```{python}
#| title: GDP and Life Expectancy
import plotly.express as px
df = px.data.gapminder()
px.scatter(
    df, x="gdpPercap", y="lifeExp",
    animation_frame="year", animation_group="country",
    size="pop", color="continent", hover_name="country",
    facet_col="continent", log_x=True, size_max=45,
    range_x=[100, 100000], range_y=[25, 90]
)
```
## Row
```{python}
#| component: valuebox
#| title: "Current Price"
dict(icon = "currency-dollar",
     color = "secondary",
     value = get_price(data))
```
```{python}
#| component: valuebox
#| title: "Change"
change = get_change(data)
dict(value = change['amount'],
     icon = change['icon'],
     color = change['color'])
```
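The two value box cells above call `get_price()` and `get_change()` without showing them. A minimal sketch of what such helpers might look like, assuming `data` is a list of prices ordered oldest to newest. The helper bodies, the sample numbers, and the choice of icon and color names are illustrative assumptions, not taken from the talk:

```python
def get_price(data):
    """Return the most recent price, rounded for display."""
    return round(data[-1], 2)


def get_change(data):
    """Compare the last two prices and pick a matching icon and color."""
    delta = data[-1] - data[-2]
    rising = delta >= 0
    return {
        "amount": f"{delta:+.2f}",                       # signed, two decimals
        "icon": "arrow-up" if rising else "arrow-down",  # assumed icon names
        "color": "success" if rising else "danger",      # assumed color names
    }


# Made-up prices, oldest first (hypothetical data for illustration)
data = [100.0, 102.5]
```

Each cell then only has to return a `dict` with `value`, `icon`, and `color` keys, as the cells above do.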
## Column
```{python}
#| title: Population
px.area(df, x="year", y="pop",
        color="continent",
        line_group="country")
```
```{python}
#| title: Life Expectancy
px.line(df, x="year", y="lifeExp",
        color="continent",
        line_group="country")
```
::: {.card}
Gapminder combines data from multiple sources
into unique coherent time-series that can’t be
found elsewhere. Learn more about the Gapminder
dataset at <https://www.gapminder.org/data/>.
:::
Cards provide an Expand button which appears at bottom right on hover:
Dashboards are typically just static HTML pages, so they can be deployed to any web server or web host.
Static | Rendered a single time (e.g. when underlying data won’t ever change) |
Scheduled | Rendered on a schedule (e.g. via cron job) to accommodate changing data. |
Parameterized | Variations of static or scheduled dashboards based on parameters. |
Interactive | Fully interactive dashboard using Shiny (requires a server for deployment). |
Add a parameters tag to the first cell (based on papermill):
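As a rough sketch, a tagged first cell could look like the following. In a .qmd file this would sit inside a `{python}` cell; the variable names and default values here are hypothetical:

```python
#| tags: [parameters]
# Defaults used when no override is supplied at render time.
# Both names are hypothetical, chosen only for illustration.
ticker = "BA"
lookback_days = 30
```

Later cells read `ticker` and `lookback_days` like any other variables, so the rest of the document does not need to know whether a value was overridden.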
Use the -P command line option to vary the parameter:
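The `-P` option takes `name:value` pairs on `quarto render`. A minimal sketch that assembles one such command per variant from Python, assuming a source file called `dashboard.qmd` and a parameter called `ticker` (both hypothetical). It only builds and prints the commands and does not invoke Quarto:

```python
def render_command(source, params, output=None):
    """Build a `quarto render` argument list with one -P flag per parameter."""
    cmd = ["quarto", "render", source]
    for name, value in params.items():
        cmd += ["-P", f"{name}:{value}"]
    if output is not None:
        # Give each variant its own output file so variants do not overwrite each other
        cmd += ["--output", output]
    return cmd


# Hypothetical file and parameter names, for illustration only
commands = [
    render_command("dashboard.qmd", {"ticker": t}, output=f"dashboard-{t}.html")
    for t in ["BA", "GOOG"]
]

for cmd in commands:
    # To actually render, pass cmd to subprocess.run(cmd, check=True)
    print(" ".join(cmd))
```

Building the argument list separately from running it keeps the variants easy to inspect, and the same list can be handed to a scheduler for the scheduled case.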
https://quarto.org/docs/dashboards/interactivity/shiny-python/
For interactive exploration, some dashboards can benefit from a live Python backend
To do this with Quarto Dashboards, add interactive Shiny components
Note that this requires a server for deployment
---
title: "Penguin Bills"
format: dashboard
server: shiny
---
```{python}
import seaborn as sns
penguins = sns.load_dataset("penguins")
```
## {.sidebar}
```{python}
from shiny import render, ui
ui.input_select("x", "Variable:",
                choices=["bill_length_mm", "bill_depth_mm"])
ui.input_select("dist", "Distribution:", choices=["hist", "kde"])
ui.input_checkbox("rug", "Show rug marks", value = False)
```
## Column
```{python}
@render.plot
def displot():
    sns.displot(
        data=penguins, hue="species", multiple="stack",
        x=input.x(), rug=input.rug(), kind=input.dist())
```
Shiny for Python applications are built on Starlette and ASGI, and can be deployed in server environments that support WebSockets and sticky sessions.
Alternatively, deploy serverless using Pyodide. See the Retirement Simulation example for details.
Founded 14 years ago to create open source software for data science.
Posit is a Public Benefit Corporation with a mission to create open source software for data science, scientific research, and technical communication.
This is built into our charter, and our directors and officers have a fiduciary duty to pursue these public benefits along with balancing the needs of all our stakeholders.
Posit is an independent company and is committed to always being one
Corporate control lies within the company (not with outside investors)
Our imperative is not growth at all costs but rather to build something that is organically sustainable and still here fulfilling its mission in 100 years
Our core aspiration is to be a durable, trustworthy provider of open source software for science.
quarto | Scientific and technical publishing |
quartodoc | Python package documentation |
shiny | Interactive PyData web applications |
vetiver | Deploy and monitor ML models |
siuba | Data manipulation for pandas, duckdb, etc. |
plotnine | Grammar of graphics for Python |
Principal Corporate Sponsor of NumFOCUS
Working on standards for scientific notebook publishing in NotebooksNow
About Quarto | https://quarto.org/ |
Quarto Dashboards | https://quarto.org/docs/dashboards/ |
Shiny for Python | https://shiny.posit.co/py/ |
https://jjallaire.github.io/pydata-quarto-dashboards/