This repository was archived by the owner on Feb 12, 2024. It is now read-only.

RFC: js-ipfs Garbage Collection #2012

Closed
@dirkmc

This issue is to discuss how best to implement Garbage Collection in js-ipfs.

The go-ipfs Garbage Collector

We would like to learn from the experience of go-ipfs when building a Garbage Collector for js-ipfs.

Triggers

  • Manual: ipfs repo gc
  • Daemon flag: --enable-gc causes GC to run
    • only if storage exceeds StorageGCWatermark% of StorageMax (90% of 10G by default)
    • periodically every GCPeriod (1 hour by default)
    • when a file is added to unixfs (currently disabled)
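
The storage-threshold condition in the list above can be sketched as a simple check. This is a minimal TypeScript illustration, not go-ipfs code: `shouldRunGC` and its parameter names are hypothetical, and reading "10G" as 10 × 10^9 bytes is an assumption.

```typescript
// Hypothetical helper: decides whether the watermark condition is met.
// The defaults mirror the values quoted above (90% of 10G).
const DEFAULT_STORAGE_MAX = 10 * 1000 * 1000 * 1000 // "10G", assumed to mean 10 * 10^9 bytes
const DEFAULT_WATERMARK_PERCENT = 90

function shouldRunGC(
  usedBytes: number,
  storageMax: number = DEFAULT_STORAGE_MAX,
  watermarkPercent: number = DEFAULT_WATERMARK_PERCENT
): boolean {
  // GC is allowed to run only when usage exceeds watermark% of the maximum.
  return usedBytes > (storageMax * watermarkPercent) / 100
}
```

With the defaults, the check passes only once the repo holds more than 9 × 10^9 bytes.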

Algorithm

Source code

// GC performs a mark and sweep garbage collection of the blocks in the blockstore
// first, it creates a 'marked' set and adds to it the following:
// - all recursively pinned blocks, plus all of their descendants (recursively)
// - bestEffortRoots, plus all of its descendants (recursively)
// - all directly pinned blocks
// - all blocks utilized internally by the pinner
//
// The routine then iterates over every block in the blockstore and
// deletes any block that is not found in the marked set.
func GC(ctx context.Context, bs bstore.GCBlockstore, dstor dstore.Datastore, pn pin.Pinner, bestEffortRoots []cid.Cid) <-chan Result {

Note that bestEffortRoots currently only contains the MFS root.
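
To make the mark and sweep steps described in the comment above concrete, here is a minimal TypeScript sketch over an in-memory map. It is not a port of the go-ipfs code: the `Blockstore`, `Roots`, `markDescendants` and `gc` names are hypothetical, CIDs are plain strings, and as a simplification any block missing from the store is skipped during traversal for every kind of root.

```typescript
type Cid = string

// In-memory stand-in for a blockstore: each block lists the CIDs it links to.
type Blockstore = Map<Cid, { links: Cid[] }>

interface Roots {
  recursivePins: Cid[]   // pinned together with all of their descendants
  directPins: Cid[]      // pinned on their own, without descendants
  bestEffortRoots: Cid[] // e.g. the MFS root
  internalPins: Cid[]    // blocks used internally by the pinner
}

// Mark helper: adds root and everything reachable from it to the marked set.
function markDescendants(store: Blockstore, root: Cid, marked: Set<Cid>): void {
  const stack: Cid[] = [root]
  while (stack.length > 0) {
    const cid = stack.pop() as Cid
    if (marked.has(cid)) continue
    marked.add(cid)
    const block = store.get(cid)
    if (block === undefined) continue // not stored locally, nothing to traverse
    for (const link of block.links) stack.push(link)
  }
}

// Deletes every block that is not in the marked set and returns the removed CIDs.
function gc(store: Blockstore, roots: Roots): Cid[] {
  // Mark phase
  const marked = new Set<Cid>()
  for (const cid of roots.recursivePins) markDescendants(store, cid, marked)
  for (const cid of roots.bestEffortRoots) markDescendants(store, cid, marked)
  for (const cid of roots.directPins) marked.add(cid)
  for (const cid of roots.internalPins) marked.add(cid)

  // Sweep phase: copy the keys first so deletion does not disturb iteration.
  const removed: Cid[] = []
  for (const cid of Array.from(store.keys())) {
    if (!marked.has(cid)) {
      store.delete(cid)
      removed.push(cid)
    }
  }
  return removed
}
```

Note that a directly pinned block keeps only itself alive, so its children are swept unless something else marks them.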

Proposal for a js-ipfs Garbage Collector

Requirements

  • Manually Garbage Collect all blocks that are not pinned, not in MFS and not used internally
    ipfs repo gc: mark and sweep - remove all unreachable blocks
  • Passively Garbage Collect old blocks when new blocks are added to the blockstore
    ipfs daemon --enable-gc causes GC to run
    • whenever a block is added to blockstore
    • only if storage exceeds StorageGCWatermark% of StorageMax (90% of 10G by default)
    • remove only excess blocks, least recently accessed blocks first
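
The "remove only excess blocks, least recently accessed blocks first" requirement could look roughly like the following TypeScript sketch. The `BlockInfo` shape and `selectBlocksToEvict` are hypothetical names, and the stopping point (evict until usage is back at the watermark) is an assumption, since the proposal does not say how far below the threshold GC should go.

```typescript
interface BlockInfo {
  cid: string
  size: number        // size in bytes
  lastAccess: number  // timestamp of the most recent access
  referenced: boolean // true if pinned, in MFS or used internally
}

// Returns the CIDs to delete, oldest access first, until the excess is covered.
function selectBlocksToEvict(
  blocks: BlockInfo[],
  usedBytes: number,
  storageMax: number,
  watermarkPercent: number
): string[] {
  const limit = (storageMax * watermarkPercent) / 100
  let excess = usedBytes - limit
  if (excess <= 0) return [] // below the watermark: nothing to do

  // Only unreferenced blocks are candidates; filter() copies, so the input is not reordered.
  const candidates = blocks
    .filter((b) => !b.referenced)
    .sort((a, b) => a.lastAccess - b.lastAccess)

  const evict: string[] = []
  for (const block of candidates) {
    if (excess <= 0) break
    evict.push(block.cid)
    excess -= block.size
  }
  return evict
}
```

If the unreferenced blocks do not cover the excess, the sketch simply returns all of them; how js-ipfs should react in that case is left open here.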

Algorithm for passive GC

  • Track reference counts using a persistent reference counting algorithm
    • reference count increment:
      • each time a block is pinned directly or indirectly
        • pinner must ensure that multiple pins of the same hash are only registered once
      • on mfs add
    • reference count decrement:
      • each time a block is unpinned directly or indirectly
      • on mfs rm
    • maintain the last access time of each block
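
The bookkeeping listed above might be sketched as follows in TypeScript. Everything here is illustrative: `RefTracker` and its methods are hypothetical names, the maps are in-memory stand-ins for whatever persistent store an implementation would use, and the caller is assumed to supply the full list of blocks covered by a pin or an MFS entry.

```typescript
class RefTracker {
  private refCounts = new Map<string, number>()
  private lastAccess = new Map<string, number>()
  private pinnedRoots = new Set<string>()

  // Registers a pin. `blocks` is the root alone for a direct pin, or the root
  // plus its descendants for a recursive pin. Repeat pins of a root are ignored.
  pin(root: string, blocks: string[]): boolean {
    if (this.pinnedRoots.has(root)) return false
    this.pinnedRoots.add(root)
    for (const cid of blocks) this.increment(cid)
    return true
  }

  // Removes a pin; returns false if the root was not pinned.
  unpin(root: string, blocks: string[]): boolean {
    if (!this.pinnedRoots.delete(root)) return false
    for (const cid of blocks) this.decrement(cid)
    return true
  }

  mfsAdd(blocks: string[]): void {
    for (const cid of blocks) this.increment(cid)
  }

  mfsRm(blocks: string[]): void {
    for (const cid of blocks) this.decrement(cid)
  }

  // Records an access so that eviction can order blocks by recency.
  touch(cid: string, now: number = Date.now()): void {
    this.lastAccess.set(cid, now)
  }

  refCount(cid: string): number {
    return this.refCounts.get(cid) || 0
  }

  lastAccessed(cid: string): number | undefined {
    return this.lastAccess.get(cid)
  }

  private increment(cid: string): void {
    this.refCounts.set(cid, this.refCount(cid) + 1)
  }

  private decrement(cid: string): void {
    const next = this.refCount(cid) - 1
    if (next <= 0) this.refCounts.delete(cid)
    else this.refCounts.set(cid, next)
  }
}
```

A block whose count is zero is a candidate for passive GC; the last access time then decides the order in which candidates are removed.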
