@pexip/media-processor

A package for media data processing using Web APIs.

Install

npm install @pexip/media-processor

Usage

Use Analyzer to get the MediaStream data

import {
    createAnalyzerGraphNode,
    createAudioGraph,
    createStreamSourceGraphNode,
} from '@pexip/media-processor';

const stream = await navigator.mediaDevices.getUserMedia({audio: true});

const fftSize = 64;
// Set up an audio graph with `source` -> `analyzer`
const source = createStreamSourceGraphNode(stream);
const analyzer = createAnalyzerGraphNode({fftSize});
const audioGraph = createAudioGraph([[source, analyzer]]);

// Grab the current time domain data in floating point representation
const buffer = new Float32Array(fftSize);
analyzer.node?.getFloatTimeDomainData(buffer);
// Do some work with the buffer
buffer.forEach(sample => {
    // ...
});

// Get the current volume, in the range [0, 1]
const volume = analyzer.getAverageVolume(buffer);

// Release the resources when you are done with the analyzer
await audioGraph.release();
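
To meter the volume continuously, you can poll the analyzer in an animation-frame loop. The following is a minimal sketch reusing `analyzer` and `buffer` from above; the polling approach is an assumption, not something the package prescribes (it also ships push-based helpers such as createAnalyzerSubscribableGraphNode and subscribeTimeoutAnalyzerNode, listed in the Functions reference):

const meter = () => {
    analyzer.node?.getFloatTimeDomainData(buffer);
    const volume = analyzer.getAverageVolume(buffer);
    // Do something with the volume, e.g. drive a level indicator
    requestAnimationFrame(meter);
};
requestAnimationFrame(meter);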

Use AudioGraph to control the audio gain to mute/unmute

import {
    createAudioGraph,
    createGainGraphNode,
    createStreamDestinationGraphNode,
    createStreamSourceGraphNode,
} from '@pexip/media-processor';

const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
});

const mute = !stream.getAudioTracks()[0]?.enabled;

// Set up an audio graph with `source` -> `gain` -> `destination`
const source = createStreamSourceGraphNode(stream);
const gain = createGainGraphNode(mute);
const destination = createStreamDestinationGraphNode();

const audioGraph = createAudioGraph([[source, gain, destination]]);

// Use the output MediaStream for the altered AudioTrack
const alteredStream = new MediaStream([
    ...stream.getVideoTracks(),
    ...destination.stream.getAudioTracks(),
]);

// Mute the audio
if (gain.node) {
    gain.node.mute = true;
}

// Check if the audio is muted
gain.node?.mute; // returns `true`, since we have just set the gain to 0

// Release the resources when you are done with the audio graph
await audioGraph.release();
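
Since mute is just a property on the gain node, toggling it is a one-liner. A small helper sketch around the `gain` node from the example above:

// Hypothetical helper, not part of the package API
const setMuted = (muted) => {
    if (gain.node) {
        gain.node.mute = muted;
    }
};

setMuted(true); // audio muted (gain set to 0)
setMuted(false); // audio unmuted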

Use noise suppression

import {
    createAudioGraph,
    createDenoiseWorkletGraphNode,
    createStreamDestinationGraphNode,
    createStreamSourceGraphNode,
} from '@pexip/media-processor';

const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
});

// Fetch the denoise WebAssembly module
const response = await fetch(
    new URL('@pexip/denoise/denoise_bg.wasm', import.meta.url).href,
);
const wasmBuffer = await response.arrayBuffer();

// Set up an audio graph with `source` -> `destination`
const source = createStreamSourceGraphNode(stream);
const destination = createStreamDestinationGraphNode();

const audioGraph = createAudioGraph([[source, destination]]);
// Add the worklet module
await audioGraph.addWorklet(
    new URL(
        '@pexip/media-processor/dist/worklets/denoise.worklet',
        import.meta.url,
    ).href,
);
const denoise = createDenoiseWorkletGraphNode(wasmBuffer);
// Route the source through the denoise node
audioGraph.connect([source, denoise, destination]);
audioGraph.disconnect([source, destination]);

// Release the resources when you are done with the audio graph
await audioGraph.release();
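
As in the gain example, the processed audio is available from the destination node's MediaStream, which you can combine with the original video track:

// Assumes the same `stream` and `destination` as above
const denoisedStream = new MediaStream([
    ...stream.getVideoTracks(),
    ...destination.stream.getAudioTracks(),
]);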

Use background blur

import {
    createCanvasTransform,
    createSegmenter,
    createVideoProcessor,
    createVideoTrackProcessor,
    createVideoTrackProcessorWithFallback,
} from '@pexip/media-processor';

// Grab the user's camera stream
const stream = await navigator.mediaDevices.getUserMedia({video: true});

// Set the path to the `@mediapipe/tasks-vision` assets
// It will be passed directly to
// [FilesetResolver.forVisionTasks()](https://ai.google.dev/edge/api/mediapipe/js/tasks-vision.filesetresolver#filesetresolverforvisiontasks)
const tasksVisionBasePath =
    'A base path to specify the directory the Wasm files should be loaded from';

const modelAsset = {
    /**
     * Path to the mediapipe selfie segmentation model asset
     */
    path: 'A path to the selfie segmentation model',
    modelName: 'selfie' as const,
};

const segmenter = createSegmenter(tasksVisionBasePath, {modelAsset});
// Create a processing transformer and set the effects to `blur`
const transformer = createCanvasTransform(segmenter, {effects: 'blur'});
const videoProcessor = createVideoProcessor([transformer]);

// Start the processor
await videoProcessor.open();

// Pass the raw MediaStream to apply the effects
// Then, use the output stream for whatever purpose
const processedStream = await videoProcessor.process(stream);
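
The processed stream is a regular MediaStream, so it can, for example, be rendered in a video element or sent over WebRTC (standard browser APIs, nothing specific to this package):

const video = document.querySelector('video');
if (video) {
    video.srcObject = processedStream;
    await video.play();
}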

Profiling Web Audio

You can do this with Chrome's built-in tracing tools; see Chrome's developer documentation on profiling Web Audio.

How AudioWorkletNode and the AudioWorkletProcessor work together

┌─────────────────────────┐             ┌──────────────────────────┐
│                         │             │                          │
│    Main Global Scope    │             │ AudioWorkletGlobalScope  │
│                         │             │                          │
│  ┌───────────────────┐  │             │  ┌────────────────────┐  │
│  │                   │  │ MessagePort │  │                    │  │
│  │   AudioWorklet    │◄─┼─────────────┼─►│    AudioWorklet    │  │
│  │       Node        │  │             │  │     Processor      │  │
│  │                   │  │             │  │                    │  │
│  └───────────────────┘  │             │  └────────────────────┘  │
│                         │             │                          │
└─────────────────────────┘             └──────────────────────────┘
        Main Thread                        WebAudio Render Thread
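
A minimal sketch of this pairing using the standard Web Audio API (the module path and processor name here are hypothetical, not part of this package). Each side holds one end of the same MessagePort:

// main.js — runs in the Main Global Scope
const context = new AudioContext();
await context.audioWorklet.addModule('my.worklet.js');
const node = new AudioWorkletNode(context, 'my-processor');
node.port.onmessage = ({data}) => {
    // Messages posted by the processor arrive here
    console.log('from processor:', data);
};
node.port.postMessage({hello: 'processor'});

// my.worklet.js — runs in the AudioWorkletGlobalScope
class MyProcessor extends AudioWorkletProcessor {
    constructor() {
        super();
        this.port.onmessage = ({data}) => {
            // Echo main-thread messages back through the port
            this.port.postMessage({received: data});
        };
    }
    process() {
        // Keep the processor alive; audio I/O is not used in this sketch
        return true;
    }
}
registerProcessor('my-processor', MyProcessor);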

Constraints when using the AudioWorklet

  • Each BaseAudioContext possesses exactly one AudioWorklet
  • Audio is processed in fixed blocks of 128 sample-frames (the render quantum), as shown in the sketch after this list
  • No fetch API in the AudioWorkletGlobalScope
  • No TextEncoder/TextDecoder APIs in the AudioWorkletGlobalScope
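
A minimal processor sketch illustrating the fixed block size (standard Web Audio API; the processor name is hypothetical). Because fetch and TextEncoder/TextDecoder are unavailable in the AudioWorkletGlobalScope, binary assets such as the denoise Wasm above are fetched on the main thread and handed to the worklet instead:

class PassthroughProcessor extends AudioWorkletProcessor {
    process(inputs, outputs) {
        const input = inputs[0];
        const output = outputs[0];
        for (let channel = 0; channel < input.length; channel++) {
            // Each channel buffer holds one render quantum of 128 sample-frames
            output[channel].set(input[channel]);
        }
        return true; // keep processing
    }
}
registerProcessor('passthrough-processor', PassthroughProcessor);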

References

A library for media analysis using Web APIs.

Enumerations

  • AbortReason

Interfaces

  • Analyzer
  • AnalyzerSubscribableOptions
  • AsyncAssets
  • AudioGraph
  • AudioGraphOptions
  • AudioNodeInit
  • AudioNodeProps
  • AudioStats: Data structure for the audio statistics while processing
  • BackgroundImage
  • Benchmark
  • Clock: Clock interface to get the current time with a now method
  • Denoise: A wrapper for the Denoise wasm module
  • Detector
  • Frame
  • Gain
  • ImageRecord
  • ImageSegmenterOptions
  • MPMask: Wrapper for a mask produced by a Segmentation Task
  • Point: Interface for a Point, consisting of coordinates x and y
  • Process
  • ProcessEventClose: Message event to stop the processing
  • ProcessEventDebug: Message event for debugging
  • ProcessEventDestroy: Message event to release all the resources
  • ProcessEventOpen: Message event to initialize the Processor
  • ProcessEventProcess: Message event to process the video frame
  • ProcessEventUpdate: Message event to update processor options
  • ProcessorEvent
  • ProcessorOptions
  • ProcessorProcessOptions
  • ProcessWorkerDebug
  • ProcessWorkerEvent
  • ProcessWorkerEventDebug
  • ProcessWorkerEventOpened
  • ProcessWorkerEventProcessed
  • ProcessWorkerEventUpdated
  • Rect
  • Renderer
  • RendererOptions
  • RenderEventHandlers
  • Runner
  • Segmentation
  • SegmentationModelAsset: Segmentation model asset that can be used to load a segmentation model
  • SegmentationParams
  • SegmentationSmoothingConfig
  • Segmenter
  • SegmenterOptions
  • Size
  • StatsOptions: Stats options for calculating the audio stats
  • SubscribableOptions
  • ThrottleOptions: Limit the rate of flow in terms of milliseconds, with a provided Clock
  • Transform
  • VideoFrameLike
  • VideoProcessor
  • WorkletMessagePortOptions
  • WorkletModule

Type Aliases

  • AnalyzerNodeInit
  • AsyncCallback
  • AudioBufferBytes: Same as the return from AnalyserNode.getByteFrequencyData()
  • AudioBufferFloats: Same as AudioBuffer, or the return from AnalyserNode.getFloatFrequencyData()
  • AudioDestinationNodeInit
  • AudioNodeConnectParam
  • AudioNodeInitConnection
  • AudioNodeInitConnections
  • AudioNodeInitConnectParam
  • AudioNodeParam
  • AudioSamples: Audio samples from each channel, either in float or byte form
  • BaseAudioNode
  • Callback
  • Canvas
  • CanvasContext
  • ChannelSplitterNodeInit
  • Color
  • ConnectInitParamBaseType
  • ConnectInitParamType
  • ConnectParamBase
  • ConnectParamBaseType
  • ConnectParamType
  • DelayNodeInit
  • Delegate
  • DenoiseWorkletNodeInit
  • ExtractMessageEventType
  • GainNodeInit
  • ImageMapIterable
  • ImageType
  • IncludeMessageEventDataType
  • InputFrame
  • IsVoice
  • MediaElementAudioSourceNodeInit
  • MediaStreamAudioDestinationNodeInit
  • MediaStreamAudioSourceNodeInit
  • Node
  • NodeConnectionAction
  • Nodes
  • ProcessEvents: Message events to be sent to the Processor
  • ProcessInputType
  • ProcessorEvents
  • ProcessorWorkerEvents
  • ProcessStatus
  • ProcessVideoTrack
  • ProcessWorkerEvents: Message events returned from the processor
  • RenderBackend
  • RenderEffects
  • RenderingEvents
  • RunnerCreator
  • SegmentationModel
  • SegmentationTransform
  • TupleOf: From https://github.com/Microsoft/TypeScript/issues/26223#issuecomment-674500430
  • UniversalAudioContextState: Adds the missing type def to work with AudioContextState in Safari; see https://developer.mozilla.org/en-US/docs/Web/API/BaseAudioContext/state#resuming_interrupted_play_states_in_ios_safari
  • Unsubscribe: Unsubscribe the subscription
  • WasmPaths

Variables

  • BACKGROUND_BLUR_AMOUNT
  • calculateMaxBlurPass
  • clamping
  • CLIP_COUNT_THRESHOLD: Default clipping count threshold; the number of consecutive clipThreshold-level samples that indicate clipping
  • CLIP_THRESHOLD: Default clipping detection threshold
  • createLazyProps
  • createObjectUpdater
  • createRemoteImageBitmap
  • EDGE_BLUR_AMOUNT
  • FOREGROUND_THRESHOLD
  • FRAME_RATE
  • getErrorMessage
  • handleWebGLContextLoss
  • isRenderingEvents
  • isSegmentationModel
  • LOW_VOLUME_THRESHOLD: Default low volume detection threshold
  • MASK_COMBINE_RATIO
  • MONO_THRESHOLD: Default mono detection threshold; data must be identical within one LSB (16-bit) to be identified as mono
  • PLAY_VIDEO_TIMEOUT
  • PROCESS_STATUS: Process status
  • PROCESSING_HEIGHT
  • PROCESSING_WIDTH
  • RENDER_BACKEND
  • RENDER_EFFECT
  • RENDERING_EVENTS
  • resize: Calculate the coordinates and size of the source and destination to fit the target aspect ratio
  • SEGMENTATION_MODELS
  • SILENT_THRESHOLD: Default silent threshold; at least one LSB of 16-bit data (comparison is on the absolute value)
  • toRenderingEvents
  • urls
  • VOICE_PROBABILITY_THRESHOLD: Default voice probability threshold

Functions

  • avg: Average an array of numbers
  • calculateDistance: Calculate the distance between two Points
  • closedCurve: Create a cubic Bezier curve path that turns back to the starting point, using a provided point of reference
  • copyByteBufferToFloatBuffer: Copy data from a Uint8Array buffer to a Float32Array buffer with byte-to-float conversion
  • createAnalyzerGraphNode: Create an AnalyzerNodeInit
  • createAnalyzerSubscribableGraphNode: Create an analyzer node with push-based subscription
  • createAsyncCallbackLoop: Create an async callback loop to be called recursively with a delay based on the frameRate
  • createAudioContext: Create an AudioContext using the constructor or factory function, depending on what the browser supports
  • createAudioDestinationGraphNode: Create an AudioDestinationNode
  • createAudioGraph: Accepts AudioNodeInitConnections to build the audio graph within a single audio context
  • createAudioGraphProxy
  • createAudioSignalDetector: Create a function to process the AudioStats and check for silence; the onSignalDetected callback is called under two situations
  • createAudioStats: AudioStats builder
  • createBenchmark
  • createCanvas: Create a Canvas element with the provided width and height
  • createCanvasTransform
  • createChannelMergerGraphNode: Create a ChannelMergerNode
  • createChannelSplitterGraphNode: Create a ChannelSplitterNode
  • createDelayGraphNode: Create a DelayNode
  • createDenoiseWorkletGraphNode: Create a noise suppression node
  • createFrameCallbackRequest: Create a callback loop for video frame processing, using requestVideoFrameCallback under the hood when available, otherwise a fallback implementation based on setTimeout
  • createGainGraphNode: Create a GainNodeInit
  • createMediaElementSourceGraphNode: Create a MediaElementAudioSourceNodeInit
  • createSegmenter
  • createStreamDestinationGraphNode: Create a MediaStreamAudioDestinationNodeInit
  • createStreamSourceGraphNode: Create a MediaStreamAudioSourceNodeInit
  • createVADetector: Create a voice detector based on the provided params
  • createVideoProcessor
  • createVideoTrackProcessor: Video track processor using the MediaStreamTrackProcessor API
  • createVideoTrackProcessorWithFallback: Video track processor using the Canvas captureStream API
  • createVoiceDetectorFromProbability: A function to check whether the provided probability is considered voice activity
  • createVoiceDetectorFromTimeData: A function to check whether the provided time series data is considered voice activity
  • curve: Create a cubic Bezier curve path command
  • fromByteToFloat: Convert a byte to a float, according to the Web Audio spec
  • fromFloatToByte: Convert a float to a byte, according to the Web Audio spec
  • getAudioStats: Calculate the audio stats; expects the samples in float form
  • getBezierCurveControlPoints: Spline interpolation for a Bezier curve
  • isAnalyzerNodeInit
  • isAudioNode
  • isAudioNodeInit
  • isAudioParam
  • isClipping: Check if there is clipping
  • isEqualSize: Compare the provided width and height to see if they are the same
  • isLowVolume: Check the provided gain against the low volume threshold to decide whether it is considered low volume
  • isMono: Check if the provided channels are mono or stereo
  • isRenderEffects
  • isSilent: Simple silence detection that only checks the first and last bit of the sample
  • isVoiceActivity: A naive voice activity detection
  • line: Create a straight line path command
  • loadScript
  • loadWasms
  • pow: The pow function from Math in functional (curried) form: number -> number -> number
  • processAverageVolume: Calculate the averaged volume using Root Mean Square, assuming the data is in float form
  • resumeAudioOnInterruption: Resume the stream whenever interrupted
  • resumeAudioOnUnmute: Resume the AudioContext whenever the source track is unmuted
  • rms: Calculate the Root Mean Square from the provided numbers
  • round: Round a floating point number away from zero, which is different from Math.round
  • subscribeTimeoutAnalyzerNode: Subscribe to a timeout loop to get data from the Analyzer
  • subscribeWorkletNode: Subscribe to MessagePort messages from an AudioWorkletNode
  • sum: Sum an array of numbers
  • toDecibel: Convert a floating point gain value into a dB representation without any reference (dBFS); see https://en.wikipedia.org/wiki/DBFS