Improve documentation on how to handle binary and non-binary files (local/remote, up-/download) #595

Closed
@do-me

Description

Checklist

  • I added a descriptive title
  • I searched for other issues and couldn't find a duplication
  • I already searched in Google and didn't find any good information or help

What is the issue/comment/problem?

There are a few issues around here concerned with file handling (#588, #558, #463, #151 amongst others).
It would be nice to have a dedicated section in the docs with the recommended way of doing things for binary and non-binary files.
Summed up:

Local

  • Load local file to browser (covered here or here)
  • Download file from browser to local (two examples here with file picker, but non-binary data only)

Remote

Because binary files (e.g. Excel or, more generally, zip files) and non-binary files need different handling, it would be very useful to document the distinction explicitly; otherwise one stumbles over missing awaits or similar problems.
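To illustrate the "missing await" pitfall, here is a minimal sketch that runs in a plain Python interpreter. `fetch_bytes` is a hypothetical stand-in for a call such as `response.bytes()`; it is not part of Pyodide. In PyScript one would use top-level await instead of `asyncio.run`.

```python
import asyncio


async def fetch_bytes():
    # Hypothetical stand-in for an awaitable such as `response.bytes()`.
    # The payload is just the four magic bytes of a zip/xlsx file.
    return b"PK\x03\x04"


async def main():
    pending = fetch_bytes()  # no await: this is a coroutine object, not bytes
    was_coroutine = asyncio.iscoroutine(pending)
    data = await pending     # awaiting it yields the actual bytes
    return was_coroutine, data


was_coroutine, data = asyncio.run(main())
```

Passing the un-awaited object on to something like `BytesIO` fails with a type error, which is easy to misread as a problem with the binary file itself.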

I think most of the above points are already described somewhere, but I'm missing an example of how to conveniently access the virtual file system in order to download a file to the local machine.

Let's consider this:

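from io import BytesIO

import openpyxl  # engine pandas uses to read .xlsx files
import pandas as pd
from pyodide.http import pyfetch

# Top-level await works in PyScript/Pyodide.
response = await pyfetch(url="/downloads/test.xlsx", method="GET")
# bytes() is a coroutine too, so it needs its own await.
bytes_response = await response.bytes()
# Wrap the raw bytes in a file-like object so pandas can parse them.
df = pd.read_excel(BytesIO(bytes_response))
df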

That's the (currently) easiest way of loading binary files. If I call df.to_excel("test_output.xlsx") and df.to_csv("test_output.csv") pandas will save the output to the virtual file system.
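To make the starting point concrete: a file written by path can be read back as raw bytes with ordinary Python file APIs, and in Pyodide those paths live in the in-memory file system. The sketch below uses the standard `csv` module as a stand-in for `df.to_csv("test_output.csv")` so that it runs without pandas; the temporary directory and the helper name `read_saved_file` are assumptions for illustration.

```python
import csv
import tempfile
from pathlib import Path


def read_saved_file(path):
    """Return the raw bytes of a file that was previously written by path."""
    return Path(path).read_bytes()


# Stand-in for df.to_csv("test_output.csv"): write a small CSV by path.
out_dir = Path(tempfile.mkdtemp())
csv_path = out_dir / "test_output.csv"
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["a", "b"])
    writer.writerow([1, 2])

# The bytes read back here are what a browser download would need to receive.
csv_bytes = read_saved_file(csv_path)
```

The same `read_bytes()` call works for binary output such as an `.xlsx` file, since no text decoding is involved.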

What's the best way to automatically start the download from the browser to the local machine once pandas has finished saving to the virtual file system? Could the virtual file system even be skipped in some way? Do we need a JS proxy or a JS buffer for the hooks, or would you simply use some Pyodide function for this?
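One possible approach, offered as a sketch under assumptions rather than as the recommended way: hand the bytes to the browser as a `Blob` and click a temporary link. The helper names `mime_for` and `download_bytes` and the small MIME table are made up for illustration. `download_bytes` only works inside Pyodide/PyScript and assumes a version that provides `pyodide.ffi.to_js`.

```python
from pathlib import Path

# Illustrative table only; extend it for other file types.
MIME_TYPES = {
    ".csv": "text/csv",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def mime_for(filename):
    """Pick a MIME type from the file extension, with a generic fallback."""
    return MIME_TYPES.get(Path(filename).suffix.lower(), "application/octet-stream")


def download_bytes(data, filename):
    """Offer `data` (bytes) to the user as a download named `filename`.

    Browser-only: the imports below exist only inside Pyodide/PyScript.
    """
    from js import Blob, Object, URL, document
    from pyodide.ffi import to_js

    # bytes -> Uint8Array, then wrap it in a JS array for the Blob constructor.
    parts = to_js([to_js(data)])
    options = to_js({"type": mime_for(filename)}, dict_converter=Object.fromEntries)
    blob = Blob.new(parts, options)

    # Create a temporary <a download> element and click it.
    url = URL.createObjectURL(blob)
    link = document.createElement("a")
    link.href = url
    link.download = filename
    document.body.appendChild(link)
    link.click()
    document.body.removeChild(link)
    URL.revokeObjectURL(url)
```

Inside the browser this could be called as `download_bytes(Path("test_output.xlsx").read_bytes(), "test_output.xlsx")` after `df.to_excel(...)`. Passing a `BytesIO` buffer to `df.to_excel` and using `buffer.getvalue()` would skip the virtual file system entirely.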

Metadata

Labels

backlog (issue has been triaged but has not been earmarked for any upcoming release)
tag: docs (Related to the documentation)

Projects

Status

Next
