Database-backed component args/kwargs #119

Closed
@Archmonger

Current Situation

Currently, args/kwargs are serialized and stored in client-side HTML. This makes them susceptible to spoofing, and makes it impossible to pass in non-serializable values such as Python objects.

Proposed Actions

During an HTTP request...

  1. Check if the component's signature contains any args/kwargs
    • Exit here if no args/kwargs exist
  2. Store the args/kwargs into a container (dataclass)
  3. Serialize this container into a byte string using dill.dumps.
  4. Store this byte string in a uniquely identifiable way.
    • This will be locked down to a specific component uuid.
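The HTTP-request steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ComponentParamData`, `PARAM_STORE`, `accepts_params`, and `save_params` are hypothetical names, a plain dict stands in for the database, and `pickle` is used as a stand-in when `dill` is not installed.

```python
import inspect
import uuid
from dataclasses import dataclass, field

try:
    import dill as serializer  # the serializer proposed in this issue
except ImportError:
    import pickle as serializer  # stand-in so the sketch runs without dill


@dataclass
class ComponentParamData:
    """Hypothetical container for one component's args/kwargs."""

    args: tuple = ()
    kwargs: dict = field(default_factory=dict)


# Stand-in for the database table: component uuid -> serialized bytes.
PARAM_STORE = {}


def accepts_params(component):
    """Return True if the component's signature declares any parameters."""
    return len(inspect.signature(component).parameters) > 0


def save_params(component, *args, **kwargs):
    """Serialize args/kwargs and store them under a fresh component uuid."""
    component_uuid = uuid.uuid4().hex
    if not accepts_params(component):
        return component_uuid  # step 1: nothing to store, exit early
    container = ComponentParamData(args, kwargs)  # step 2
    payload = serializer.dumps(container)  # step 3
    PARAM_STORE[component_uuid] = payload  # step 4: keyed by component uuid
    return component_uuid
```

The returned uuid is the only value that would need to reach the client, so the args/kwargs themselves never appear in the HTML.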

During a WS component load...

  1. Check if the component's signature contains any args/kwargs
    • Render the component without args/kwargs if none exist
  2. Retrieve the byte string from the database using the uuid and/or session identifier.
  3. Deserialize the data back into an object using dill.loads
  4. Render the component using the deserialized args/kwargs
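A minimal sketch of the WS-load steps above, assuming the serialized params are available in a mapping keyed by component uuid. `render_component`, `ComponentParamData`, and the `store` argument are hypothetical names for illustration; a real implementation would query the database instead, and `pickle` is only a stand-in when `dill` is not installed.

```python
import inspect
from dataclasses import dataclass, field

try:
    import dill as serializer  # the serializer proposed in this issue
except ImportError:
    import pickle as serializer  # stand-in so the sketch runs without dill


@dataclass
class ComponentParamData:
    """Hypothetical container for one component's args/kwargs."""

    args: tuple = ()
    kwargs: dict = field(default_factory=dict)


def render_component(component, component_uuid, store):
    """Render a component, restoring its args/kwargs from `store` if needed."""
    if not inspect.signature(component).parameters:
        return component()  # step 1: signature has no args/kwargs
    payload = store[component_uuid]  # step 2: KeyError if missing or expired
    params = serializer.loads(payload)  # step 3
    return component(*params.args, **params.kwargs)  # step 4
```

Because the payload is only ever read from server-side storage, the client can no longer tamper with the values passed to the component; the most it can do is present a uuid that has no stored entry.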

This raises a question about data retention. Automatically deleting stale data is going to be fairly important, and it is a common issue with things such as Django's database-backed sessions model.

Given that, we will need to store an expiration_date value within the ComponentParams database model. In that scenario, we have three options we can mix and match.

Automatic deletion of expired data on...

  1. Django Start-up
    • I recommend we implement this regardless
  2. Periodic Schedule (background task)
    • If technologically feasible without external dependencies, we should do this.
    • Django Channels supports background worker processes, but I'll need to dig into whether we can programmatically start it.
    • My guess is this won't be technologically feasible without requiring users to run a separate python manage.py runworker command alongside their webserver, which doesn't feel clean or simple.
    • Maybe we can implement this as an "alternative option" for users willing to go through the hassle of manually running a background worker?
  3. WS connection
    • Doing a DB operation on every WS connection would be cumbersome and overkill
    • However, it's possible for us to rate-limit this by storing a last-run timestamp in the cache, and only re-running once some configured time threshold has passed.
