roadmap: what should ipfsspec do?

This issue is meant to discuss the purpose of the ipfsspec fsspec backend and to sharpen the overall design.

## background
Because IPFS -> HTTP gateways are available, a specialized IPFS backend is not strictly required for `fsspec`-based read access: any CID can be opened with the http backend by accessing
```
http(s)://<gateway>/ipfs/<CID>
```
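
A minimal sketch of this gateway-based access, for illustration only. The helper names `gateway_url` and `read_via_http` are made up here and are not part of `fsspec` or ipfsspec; the only library call used is `fsspec.open`, which for `http(s)` URLs needs the optional HTTP dependencies (e.g. `aiohttp`) to be installed.

```python
def gateway_url(gateway: str, cid: str) -> str:
    """Build the location-based URL http(s)://<gateway>/ipfs/<CID>."""
    return f"{gateway.rstrip('/')}/ipfs/{cid}"


def read_via_http(gateway: str, cid: str) -> bytes:
    """Read the content behind a CID through fsspec's generic http backend."""
    # Imported lazily so that the URL helper above works without fsspec installed.
    import fsspec

    with fsspec.open(gateway_url(gateway, cid), "rb") as f:
        return f.read()
```

Note that the gateway has to be chosen by the caller, which is exactly the coupling discussed next.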

The downside of this approach is that user code has to translate content-based addressing into location-based addressing. Gateway-aware URLs in user code make it harder
* to use local gateways
* to do automatic fallback between multiple gateways
* to define a preferred gateway based on the local computing environment

To overcome these downsides, it seems beneficial to refer to IPFS resources via a gateway-unaware URL like
```
ipfs://<CID>
```
and to translate it to HTTP or IPFS only when the resource is accessed, based on the local computing environment and settings. **This was the initial idea of ipfsspec**.
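
As a rough sketch of what such a late translation could look like: the function names, the default gateway list and the use of an `IPFS_GATEWAY` environment variable are assumptions made for this example, not a description of what ipfsspec does.

```python
import os
from urllib.parse import urlparse

# Assumed defaults for illustration: a local gateway first, then a public one.
DEFAULT_GATEWAYS = ["http://127.0.0.1:8080", "https://ipfs.io"]


def candidate_gateways(environ=None):
    """Return the gateways to try, preferring one configured in the environment."""
    environ = os.environ if environ is None else environ
    configured = environ.get("IPFS_GATEWAY")  # assumed variable name
    if configured:
        return [configured.rstrip("/")]
    return list(DEFAULT_GATEWAYS)


def to_gateway_url(url: str, gateway: str) -> str:
    """Translate ipfs://<CID>[/path] into <gateway>/ipfs/<CID>[/path]."""
    parsed = urlparse(url)
    if parsed.scheme != "ipfs":
        raise ValueError(f"not an ipfs:// URL: {url!r}")
    # netloc keeps the original case, which matters for case-sensitive CIDs.
    return f"{gateway.rstrip('/')}/ipfs/{parsed.netloc}{parsed.path}"
```

With this split, user code only ever contains `ipfs://<CID>`, and the choice of gateway stays a property of the environment.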

## design questions

### Is such a library useful at all?
Or should this translation be implemented on a different layer?

### Should this library do automatic load balancing / fallback between multiple gateways?
* Doing load balancing or fallback properly is not trivial to implement (especially with `async`).
* If the library should just work without user configuration, a solution with fallback is likely required, as otherwise it is not possible to use public gateways and still prefer the local gateway if it is available.
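
To make the discussion more concrete, here is a deliberately naive sketch of sequential fallback with `asyncio`. The `fetch` coroutine is a hypothetical, caller-supplied function (for example a thin wrapper around an HTTP client), and the sketch does no load balancing, retries or health tracking, which is where the real difficulty lies.

```python
import asyncio


async def fetch_with_fallback(cid, gateways, fetch, timeout=5.0):
    """Try each gateway in order and return the first successful result.

    `fetch` is an assumed coroutine function taking a URL and returning bytes.
    """
    errors = []
    for gateway in gateways:
        url = f"{gateway.rstrip('/')}/ipfs/{cid}"
        try:
            # A slow gateway is treated like a failing one.
            return await asyncio.wait_for(fetch(url), timeout)
        except Exception as exc:
            errors.append((gateway, exc))
    raise OSError(f"all gateways failed for {cid}: {errors}")
```

Putting the local gateway first in `gateways` gives "prefer local, fall back to public" behaviour, at the cost of waiting up to `timeout` seconds whenever the local gateway is unresponsive.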

### Should the library provide write support?
... and if yes, how?

IPFS is content-addressable storage, so one cannot choose the filename when adding content. Instead, the "filename" is computed from the stored content. As a result, the signature of a `put` function would rather look like
```
cid = put(content)
```
instead of
```
put(content, filename)
```
and thus wouldn't directly fit into `fsspec`.
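
The mismatch can be illustrated with a toy in-memory store. This is only a sketch: `ContentStore` is a made-up class, and the SHA-256 hex digest is a stand-in for a real CID (actual CIDs are multihash-based and depend on chunking and encoding).

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the content."""

    def __init__(self):
        self._blocks = {}

    def put(self, content: bytes) -> str:
        # The caller cannot pick the name; it follows from the bytes.
        key = hashlib.sha256(content).hexdigest()  # stand-in for a CID
        self._blocks[key] = content
        return key

    def get(self, key: str) -> bytes:
        return self._blocks[key]
```

Storing the same bytes twice yields the same key, and there is no way to overwrite content under an existing key, which is why a path-oriented `put` does not map onto this model directly.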

A way out might be to use the [IPFS mutable filesystem](https://docs.ipfs.io/concepts/file-systems/#mutable-file-system-mfs), which adds a local mutable overlay on top of the immutable filesystem. Using MFS, it would be possible to incrementally construct a local filesystem hierarchy and ask for a root CID after construction has finished. The downside of this approach is that it only works locally (or at least local to one gateway) and is thus probably not suited for larger datasets. So there is probably not much benefit compared to writing data into a local temporary folder and then running `ipfs add -r -H` on the entire folder.

A related option might be to pin data blocks one by one and keep the virtual directory in memory. After writing out a larger dataset this way, a root CID for the remotely stored dataset could be created. An advantage of this approach might be that writing could be distributed across multiple remote gateways.
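
A very rough sketch of that idea, under heavy simplifications: `VirtualDirectory` and `put_block` are hypothetical names, the directory is flat rather than nested, and the root identifier is a hash over a JSON listing rather than a real UnixFS directory CID.

```python
import hashlib
import json


class VirtualDirectory:
    """Keeps a path -> block id mapping in memory while blocks are stored elsewhere."""

    def __init__(self, put_block):
        # put_block(content: bytes) -> str stores one block (possibly remotely)
        # and returns its content-derived identifier.
        self._put_block = put_block
        self._entries = {}

    def write(self, path: str, content: bytes) -> None:
        self._entries[path] = self._put_block(content)

    def root_id(self) -> str:
        # Sorting makes the root independent of the order of writes.
        listing = json.dumps(sorted(self._entries.items())).encode()
        return self._put_block(listing)


def sha256_block(content: bytes) -> str:
    """Stand-in for pinning a block: hashes the content but stores nothing."""
    return hashlib.sha256(content).hexdigest()
```

Because `put_block` is injected, individual writes could in principle be sent to different gateways, and only the final listing has to be assembled in one place.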
