feat: require content-type parser to set content-type (#423)
* adds `contentTypeParser` function to createVerifiedFetch options & implements it.
* renamed `getStreamAndContentType` to `getStreamFromAsyncIterable` that now returns a stream with the firstChunk seen, so we can pass it to the `contentTypeParser` function.
* updates tests in packages/verified-fetch & packages/interop
* updates packageDocumentation with example

Related ipfs/helia#416
Fixes ipfs/helia#422
---------

Co-authored-by: achingbrain <alex@achingbrain.net>
SgtPooki and achingbrain authored Feb 8, 2024
1 parent 7cbeed0 commit b39d07c
Showing 11 changed files with 286 additions and 130 deletions.
8 changes: 8 additions & 0 deletions .aegir.js
@@ -0,0 +1,8 @@
/** @type {import('aegir').PartialOptions} */
const options = {
build: {
bundlesizeMax: '132KB'
}
}

export default options
24 changes: 24 additions & 0 deletions README.md
@@ -124,6 +124,30 @@ const resp = await fetch('ipfs://bafy...')
const json = await resp.json()
```

### Custom content-type parsing

By default, `@helia/verified-fetch` sets the `Content-Type` header to `application/octet-stream`, because the `.json()`, `.text()`, `.blob()`, and `.arrayBuffer()` methods will usually work as expected without a detailed content type.

If you require an accurate content type, you can provide a `contentTypeParser` function as an option to `createVerifiedFetch` to handle parsing the content type.

The function you provide will be called with the first chunk of bytes from the file (and the file name, if it is available) and should return a string or a promise of a string. If it returns or resolves to `undefined`, `application/octet-stream` is used.

## Example - Customizing content-type parsing

```typescript
import { createVerifiedFetch } from '@helia/verified-fetch'
import { fileTypeFromBuffer } from '@sgtpooki/file-type'

const fetch = await createVerifiedFetch({
gateways: ['https://trustless-gateway.link'],
routers: ['http://delegated-ipfs.dev'],
contentTypeParser: async (bytes) => {
// call to some magic-byte recognition library like magic-bytes, file-type, or your own custom byte recognition
// fileTypeFromBuffer is async, so await it before reading `.mime`
const result = await fileTypeFromBuffer(bytes)
return result?.mime
}
})
```
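
If a detection library is more than you need, a hand-written parser can cover the handful of formats you care about. The following is a minimal sketch, not part of this module: `sniffContentType`, `SIGNATURES`, and `EXTENSIONS` are hypothetical names, and the formats listed are only a sample. It checks a few magic-byte signatures, falls back to the file extension, and otherwise returns `undefined` so the default `application/octet-stream` applies.

```typescript
// Hypothetical helper, not exported by @helia/verified-fetch.
// Magic-byte signatures for a small sample of formats.
const SIGNATURES: Array<{ mime: string, magic: number[] }> = [
  { mime: 'image/png', magic: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/jpeg', magic: [0xff, 0xd8, 0xff] },
  { mime: 'application/pdf', magic: [0x25, 0x50, 0x44, 0x46] } // "%PDF"
]

// Extension fallback for formats without a reliable signature.
const EXTENSIONS = new Map<string, string>([
  ['json', 'application/json'],
  ['html', 'text/html'],
  ['txt', 'text/plain']
])

function sniffContentType (bytes: Uint8Array, fileName?: string): string | undefined {
  for (const { mime, magic } of SIGNATURES) {
    // the chunk must be at least as long as the signature and match it byte for byte
    if (bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b)) {
      return mime
    }
  }

  // no signature matched: try the part of the file name after the last dot
  const ext = fileName?.split('.').pop()?.toLowerCase()

  return ext == null ? undefined : EXTENSIONS.get(ext)
}
```

A function with this shape can be passed directly as `contentTypeParser: sniffContentType`, since the option accepts synchronous parsers as well as async ones.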

## Comparison to fetch

This module attempts to act as similarly to the `fetch()` API as possible.
7 changes: 4 additions & 3 deletions package.json
@@ -157,19 +157,20 @@
"@libp2p/peer-id": "^4.0.5",
"hashlru": "^2.3.0",
"ipfs-unixfs-exporter": "^13.5.0",
"mime-types": "^2.1.35",
"multiformats": "^13.0.1",
"progress-events": "^1.0.0"
},
"devDependencies": {
"@libp2p/logger": "^4.0.5",
"@libp2p/peer-id-factory": "^4.0.5",
"@types/mime-types": "^2.1.4",
"@sgtpooki/file-type": "^1.0.1",
"@types/sinon": "^17.0.3",
"aegir": "^42.2.2",
"helia": "^4.0.1",
"magic-bytes.js": "^1.8.0",
"sinon": "^17.0.1",
"sinon-ts": "^2.0.0"
"sinon-ts": "^2.0.0",
"uint8arrays": "^5.0.1"
},
"sideEffects": false
}
66 changes: 59 additions & 7 deletions src/index.ts
@@ -75,7 +75,7 @@
* const fetch = await createVerifiedFetch({
* gateways: ['https://trustless-gateway.link'],
* routers: ['http://delegated-ipfs.dev']
*})
* })
*
* const resp = await fetch('ipfs://bafy...')
*
@@ -112,6 +112,31 @@
* const json = await resp.json()
* ```
*
* ### Custom content-type parsing
*
* By default, `@helia/verified-fetch` sets the `Content-Type` header to `application/octet-stream`, because the `.json()`, `.text()`, `.blob()`, and `.arrayBuffer()` methods will usually work as expected without a detailed content type.
*
* If you require an accurate content type, you can provide a `contentTypeParser` function as an option to `createVerifiedFetch` to handle parsing the content type.
*
* The function you provide will be called with the first chunk of bytes from the file (and the file name, if it is available) and should return a string or a promise of a string. If it returns or resolves to `undefined`, `application/octet-stream` is used.
*
* @example Customizing content-type parsing
*
* ```typescript
* import { createVerifiedFetch } from '@helia/verified-fetch'
* import { fileTypeFromBuffer } from '@sgtpooki/file-type'
*
* const fetch = await createVerifiedFetch({
* gateways: ['https://trustless-gateway.link'],
* routers: ['http://delegated-ipfs.dev'],
* contentTypeParser: async (bytes) => {
* // call to some magic-byte recognition library like magic-bytes, file-type, or your own custom byte recognition
* const result = await fileTypeFromBuffer(bytes)
* return result?.mime
* }
* })
* ```
*
* ## Comparison to fetch
*
* This module attempts to act as similarly to the `fetch()` API as possible.
@@ -257,11 +282,34 @@ export interface VerifiedFetch {
}

/**
* Instead of passing a Helia instance, you can pass a list of gateways and routers, and a HeliaHTTP instance will be created for you.
* Instead of passing a Helia instance, you can pass a list of gateways and
* routers, and a HeliaHTTP instance will be created for you.
*/
export interface CreateVerifiedFetchWithOptions {
export interface CreateVerifiedFetchOptions {
gateways: string[]
routers?: string[]

/**
* A function to handle parsing content type from bytes. The function you
* provide will be passed the first set of bytes we receive from the network,
* and should return a string that will be used as the value for the
* `Content-Type` header in the response.
*/
contentTypeParser?: ContentTypeParser
}

/**
* A ContentTypeParser attempts to return the mime type of a given file. It
* receives the first chunk of the file data and the file name, if it is
* available. The function can be sync or async and if it returns/resolves to
* `undefined`, `application/octet-stream` will be used.
*/
export interface ContentTypeParser {
/**
* Attempt to determine a mime type, either from the passed bytes or from the
* filename if it is available.
*/
(bytes: Uint8Array, fileName?: string): Promise<string | undefined> | string | undefined
}

export type BubbledProgressEvents =
@@ -280,17 +328,21 @@ export type VerifiedFetchProgressEvents =
/**
* Options for the `fetch` function returned by `createVerifiedFetch`.
*
* This method accepts all the same options as the `fetch` function in the browser, plus an `onProgress` option to
* listen for progress events.
* This interface contains all the same fields as the [options object](https://developer.mozilla.org/en-US/docs/Web/API/fetch#options)
* passed to `fetch` in browsers, plus an `onProgress` option to listen for
* progress events.
*/
export interface VerifiedFetchInit extends RequestInit, ProgressOptions<BubbledProgressEvents | VerifiedFetchProgressEvents> {
}

/**
* Create and return a Helia node
*/
export async function createVerifiedFetch (init?: Helia | CreateVerifiedFetchWithOptions): Promise<VerifiedFetch> {
export async function createVerifiedFetch (init?: Helia | CreateVerifiedFetchOptions): Promise<VerifiedFetch> {
let contentTypeParser: ContentTypeParser | undefined

if (!isHelia(init)) {
contentTypeParser = init?.contentTypeParser
init = await createHeliaHTTP({
blockBrokers: [
trustlessGateway({
@@ -301,7 +353,7 @@ export async function createVerifiedFetch (init?: Helia | CreateVerifiedFetchWit
})
}

const verifiedFetchInstance = new VerifiedFetchClass({ helia: init })
const verifiedFetchInstance = new VerifiedFetchClass({ helia: init }, { contentTypeParser })
async function verifiedFetch (resource: Resource, options?: VerifiedFetchInit): Promise<Response> {
return verifiedFetchInstance.fetch(resource, options)
}
55 changes: 0 additions & 55 deletions src/utils/get-content-type.ts

This file was deleted.

@@ -1,27 +1,25 @@
import { CustomProgressEvent } from 'progress-events'
import { getContentType } from './get-content-type.js'
import type { VerifiedFetchInit } from '../index.js'
import type { ComponentLogger } from '@libp2p/interface'

/**
* Converts an async iterator of Uint8Array bytes to a stream and attempts to determine the content type of those bytes.
* Converts an async iterator of Uint8Array bytes to a stream and returns the first chunk of bytes
*/
export async function getStreamAndContentType (iterator: AsyncIterable<Uint8Array>, path: string, logger: ComponentLogger, options?: Pick<VerifiedFetchInit, 'onProgress'>): Promise<{ contentType: string, stream: ReadableStream<Uint8Array> }> {
const log = logger.forComponent('helia:verified-fetch:get-stream-and-content-type')
export async function getStreamFromAsyncIterable (iterator: AsyncIterable<Uint8Array>, path: string, logger: ComponentLogger, options?: Pick<VerifiedFetchInit, 'onProgress'>): Promise<{ stream: ReadableStream<Uint8Array>, firstChunk: Uint8Array }> {
const log = logger.forComponent('helia:verified-fetch:get-stream-from-async-iterable')
const reader = iterator[Symbol.asyncIterator]()
const { value, done } = await reader.next()
const { value: firstChunk, done } = await reader.next()

if (done === true) {
log.error('No content found for path', path)
throw new Error('No content found')
}

const contentType = await getContentType({ bytes: value, path })
const stream = new ReadableStream({
async start (controller) {
// the initial value is already available
options?.onProgress?.(new CustomProgressEvent<void>('verified-fetch:request:progress:chunk'))
controller.enqueue(value)
controller.enqueue(firstChunk)
},
async pull (controller) {
const { value, done } = await reader.next()
@@ -40,5 +38,8 @@ export async function getStreamAndContentType (iterator: AsyncIterable<Uint8Arra
}
})

return { contentType, stream }
return {
stream,
firstChunk
}
}
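
The pattern in this file (pull one chunk off the iterator, hand it back to the caller, and re-queue it so the stream is still complete) can be tried in isolation. What follows is a simplified sketch, not the code from this commit: `peekFirstChunk`, `chunks`, and `collect` are hypothetical names, logging and progress events are left out, and the `pull` body is an assumption because the commit view elides that part of the original.

```typescript
// Simplified, hypothetical stand-in for getStreamFromAsyncIterable.
async function peekFirstChunk (source: AsyncIterable<Uint8Array>): Promise<{ stream: ReadableStream<Uint8Array>, firstChunk: Uint8Array }> {
  const reader = source[Symbol.asyncIterator]()
  const first = await reader.next()

  if (first.done === true) {
    throw new Error('No content found')
  }

  const firstChunk: Uint8Array = first.value
  const stream = new ReadableStream<Uint8Array>({
    start (controller) {
      // re-queue the peeked chunk so consumers still receive every byte
      controller.enqueue(firstChunk)
    },
    async pull (controller) {
      // assumed behaviour: forward remaining chunks, close when the iterator ends
      const next = await reader.next()

      if (next.done === true) {
        controller.close()
        return
      }

      controller.enqueue(next.value)
    }
  })

  return { stream, firstChunk }
}

// Example source yielding two chunks.
async function * chunks (): AsyncGenerator<Uint8Array> {
  yield Uint8Array.from([1, 2])
  yield Uint8Array.from([3])
}

// Drain a stream into a flat array of byte values.
async function collect (stream: ReadableStream<Uint8Array>): Promise<number[]> {
  const out: number[] = []
  const reader = stream.getReader()

  while (true) {
    const { value, done } = await reader.read()
    if (done) break
    if (value != null) out.push(...value)
  }

  return out
}
```

Because the first chunk is enqueued in `start`, a content-type parser can inspect `firstChunk` without consuming anything from the stream that is later handed to `Response`.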
51 changes: 41 additions & 10 deletions src/verified-fetch.ts
@@ -10,10 +10,10 @@ import { code as dagPbCode } from '@ipld/dag-pb'
import { code as jsonCode } from 'multiformats/codecs/json'
import { decode, code as rawCode } from 'multiformats/codecs/raw'
import { CustomProgressEvent } from 'progress-events'
import { getStreamAndContentType } from './utils/get-stream-and-content-type.js'
import { getStreamFromAsyncIterable } from './utils/get-stream-from-async-iterable.js'
import { parseResource } from './utils/parse-resource.js'
import { walkPath, type PathWalkerFn } from './utils/walk-path.js'
import type { CIDDetail, Resource, VerifiedFetchInit as VerifiedFetchOptions } from './index.js'
import type { CIDDetail, ContentTypeParser, Resource, VerifiedFetchInit as VerifiedFetchOptions } from './index.js'
import type { Helia } from '@helia/interface'
import type { AbortOptions, Logger } from '@libp2p/interface'
import type { UnixFSEntry } from 'ipfs-unixfs-exporter'
@@ -32,9 +32,8 @@ interface VerifiedFetchComponents {
/**
* Potential future options for the VerifiedFetch constructor.
*/
// eslint-disable-next-line @typescript-eslint/no-empty-interface
interface VerifiedFetchInit {

contentTypeParser?: ContentTypeParser
}

interface FetchHandlerFunctionArg {
@@ -72,6 +71,7 @@ export class VerifiedFetch {
private readonly json: JSON
private readonly pathWalker: PathWalkerFn
private readonly log: Logger
private readonly contentTypeParser: ContentTypeParser | undefined

constructor ({ helia, ipns, unixfs, dagJson, json, dagCbor, pathWalker }: VerifiedFetchComponents, init?: VerifiedFetchInit) {
this.helia = helia
@@ -87,6 +87,7 @@ export class VerifiedFetch {
this.json = json ?? heliaJson(helia)
this.dagCbor = dagCbor ?? heliaDagCbor(helia)
this.pathWalker = pathWalker ?? walkPath
this.contentTypeParser = init?.contentTypeParser
this.log.trace('created VerifiedFetch instance')
}

@@ -133,13 +134,13 @@ export class VerifiedFetch {
private async handleDagCbor ({ cid, path, options }: FetchHandlerFunctionArg): Promise<Response> {
this.log.trace('fetching %c/%s', cid, path)
options?.onProgress?.(new CustomProgressEvent<CIDDetail>('verified-fetch:request:start', { cid: cid.toString(), path }))
const result = await this.dagCbor.get(cid, {
const result = await this.dagCbor.get<Uint8Array>(cid, {
signal: options?.signal,
onProgress: options?.onProgress
})
options?.onProgress?.(new CustomProgressEvent<CIDDetail>('verified-fetch:request:end', { cid: cid.toString(), path }))
const response = new Response(JSON.stringify(result), { status: 200 })
response.headers.set('content-type', 'application/json')
const response = new Response(result, { status: 200 })
await this.setContentType(result, path, response)
return response
}

@@ -179,11 +180,11 @@ export class VerifiedFetch {
options?.onProgress?.(new CustomProgressEvent<CIDDetail>('verified-fetch:request:end', { cid: resolvedCID.toString(), path: '' }))
this.log('got async iterator for %c/%s', cid, path)

const { contentType, stream } = await getStreamAndContentType(asyncIter, path ?? '', this.helia.logger, {
const { stream, firstChunk } = await getStreamFromAsyncIterable(asyncIter, path ?? '', this.helia.logger, {
onProgress: options?.onProgress
})
const response = new Response(stream, { status: 200 })
response.headers.set('content-type', contentType)
await this.setContentType(firstChunk, path, response)

return response
}
@@ -194,10 +195,36 @@ export class VerifiedFetch {
const result = await this.helia.blockstore.get(cid)
options?.onProgress?.(new CustomProgressEvent<CIDDetail>('verified-fetch:request:end', { cid: cid.toString(), path }))
const response = new Response(decode(result), { status: 200 })
response.headers.set('content-type', 'application/octet-stream')
await this.setContentType(result, path, response)
return response
}

private async setContentType (bytes: Uint8Array, path: string, response: Response): Promise<void> {
let contentType = 'application/octet-stream'

if (this.contentTypeParser != null) {
try {
let fileName = path.split('/').pop()?.trim()
fileName = fileName === '' ? undefined : fileName
const parsed = this.contentTypeParser(bytes, fileName)

if (isPromise(parsed)) {
const result = await parsed

if (result != null) {
contentType = result
}
} else if (parsed != null) {
contentType = parsed
}
} catch (err) {
this.log.error('Error parsing content type', err)
}
}

response.headers.set('content-type', contentType)
}

/**
* Determines the format requested by the client, defaults to `null` if no format is requested.
*
@@ -321,3 +348,7 @@ export class VerifiedFetch {
await this.helia.stop()
}
}

function isPromise <T> (p?: any): p is Promise<T> {
return p?.then != null
}
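
The fallback behaviour of `setContentType` can be exercised without a Helia node. The sketch below is a hypothetical standalone version, not the code in this commit: `resolveContentType` and `DEFAULT_CONTENT_TYPE` are invented names, the `ContentTypeParser` type is redeclared locally to keep the example self-contained, and errors are swallowed rather than logged. It also relies on the fact that `await` passes non-promise values through unchanged, which is one possible alternative to the `isPromise` check.

```typescript
// Local copy of the parser signature so the sketch is self-contained.
type ContentTypeParser = (bytes: Uint8Array, fileName?: string) => Promise<string | undefined> | string | undefined

const DEFAULT_CONTENT_TYPE = 'application/octet-stream'

// Hypothetical standalone version of the fallback logic in setContentType.
async function resolveContentType (bytes: Uint8Array, path: string, parser?: ContentTypeParser): Promise<string> {
  if (parser == null) {
    return DEFAULT_CONTENT_TYPE
  }

  try {
    // the last path segment is the file name; an empty segment means "no name"
    const last = path.split('/').pop()?.trim()
    const fileName = last === '' ? undefined : last

    // `await` handles sync and async parsers alike
    const parsed = await parser(bytes, fileName)

    return parsed ?? DEFAULT_CONTENT_TYPE
  } catch {
    // a throwing or rejecting parser falls back to the default
    return DEFAULT_CONTENT_TYPE
  }
}
```

The trade-off is that awaiting a synchronous result defers it by one microtask, which is unlikely to matter here because the surrounding method is already async.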