# @pexip/media-processor
A package for media data processing using Web APIs.
## Install

```sh
npm install @pexip/media-processor
```
## Usage

### Use `Analyzer` to get the `MediaStream` data
```typescript
const stream = await navigator.mediaDevices.getUserMedia({audio: true});

const fftSize = 64;

// Set up an audio graph: `source` -> `analyzer`
const source = createStreamSourceGraphNode(stream);
const analyzer = createAnalyzerGraphNode({fftSize});
const audioGraph = createAudioGraph([[source, analyzer]]);

// Grab the current time-domain data in floating-point representation
const buffer = new Float32Array(fftSize);
analyzer.node?.getFloatTimeDomainData(buffer);

// Do some work with the buffer
buffer.forEach(...);

// Get the current volume, in the range [0, 1]
const volume = analyzer.getAverageVolume(buffer);

// Release the resources when you are done with the analyzer
await audioGraph.release();
```
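The API reference below describes `getAverageVolume` as an RMS-based calculation. As a rough illustration only (the standalone `averageVolume` function here is hypothetical, not part of the library API), the idea can be sketched as:

```typescript
// Sketch: average volume as the Root Mean Square (RMS) of float
// time-domain samples in [-1, 1]. Silence yields 0; a constant
// full-scale signal yields 1.
function averageVolume(samples: Float32Array): number {
    const sumOfSquares = samples.reduce((acc, s) => acc + s * s, 0);
    return Math.sqrt(sumOfSquares / samples.length);
}
```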
### Use `AudioGraph` to control the audio gain to mute/unmute
```typescript
const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
});

const mute = !stream.getAudioTracks()[0]?.enabled;

// Set up an audio graph: `source` -> `gain` -> `destination`
const source = createStreamSourceGraphNode(stream);
const gain = createGainGraphNode(mute);
const destination = createStreamDestinationGraphNode();
const audioGraph = createAudioGraph([[source, gain, destination]]);

// Use the output MediaStream for the altered AudioTrack
const alteredStream = new MediaStream([
    ...stream.getVideoTracks(),
    ...destination.stream.getAudioTracks(),
]);

// Mute the audio
if (gain.node) {
    gain.node.mute = true;
}

// Check if the audio is muted
gain.node?.mute; // returns `true`, since we have just set the gain to 0

// Release the resources when you are done with the graph
await audioGraph.release();
```
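Linear gain values like the ones used above can be expressed in dBFS via the `toDecibel` helper listed in the References. A hedged sketch of that conversion, assuming the standard 20·log10 amplitude formula:

```typescript
// Sketch of a linear-gain -> dBFS conversion: 20 * log10(gain).
// A gain of 1 is 0 dBFS, 0.5 is roughly -6 dB, and 0 is -Infinity.
function toDecibel(gain: number): number {
    return 20 * Math.log10(gain);
}
```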
### Use noise suppression
```typescript
const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
});

// Fetch the denoise WebAssembly module
const response = await fetch(
    new URL('@pexip/denoise/denoise_bg.wasm', import.meta.url).href,
);
const wasmBuffer = await response.arrayBuffer();

// Set up an audio graph: `source` -> `destination`
const source = createStreamSourceGraphNode(stream);
const destination = createStreamDestinationGraphNode();
const audioGraph = createAudioGraph([[source, destination]]);

// Add the worklet module
await audioGraph.addWorklet(
    new URL(
        '@pexip/media-processor/dist/worklets/denoise.worklet',
        import.meta.url,
    ).href,
);

const denoise = createDenoiseWorkletGraphNode(wasmBuffer);

// Route the source through the denoise node
audioGraph.connect([source, denoise, destination]);
audioGraph.disconnect([source, destination]);

// Release the resources when you are done with the graph
await audioGraph.release();
```
### Use background blur
```typescript
import {
    createSegmenter,
    createCanvasTransform,
    createVideoProcessor,
} from '@pexip/media-processor';

// Grab the user's camera stream
const stream = await navigator.mediaDevices.getUserMedia({video: true});

// Set the path to the `@mediapipe/tasks-vision` assets
// It will be passed directly to
// [FilesetResolver.forVisionTasks()](https://ai.google.dev/edge/api/mediapipe/js/tasks-vision.filesetresolver#filesetresolverforvisiontasks)
const tasksVisionBasePath =
    'A base path to specify the directory the Wasm files should be loaded from';

const modelAsset = {
    /**
     * Path to the mediapipe selfie segmentation model asset
     */
    path: 'A path to selfie segmentation model',
    modelName: 'selfie' as const,
};

const segmenter = createSegmenter(tasksVisionBasePath, {modelAsset});

// Create a processing transformer and set the effects to `blur`
const transformer = createCanvasTransform(segmenter, {effects: 'blur'});
const videoProcessor = createVideoProcessor([transformer]);

// Start the processor
await videoProcessor.open();

// Pass the raw MediaStream to apply the effects,
// then use the output stream for whatever purpose
const processedStream = await videoProcessor.process(stream);
```
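The segmenter tracks the main person between frames using Intersection over Union (see `iou` in the References). A minimal sketch of that metric over binary masks; the flat `Uint8Array` mask representation here is an assumption for illustration:

```typescript
// Sketch of Intersection over Union (IoU) between two binary masks:
// the overlapping area divided by the combined area.
// 1.0 = perfect match, 0 = no overlap.
function iou(a: Uint8Array, b: Uint8Array): number {
    let intersection = 0;
    let union = 0;
    for (let i = 0; i < a.length; i++) {
        const inA = a[i] !== 0;
        const inB = b[i] !== 0;
        if (inA && inB) intersection++;
        if (inA || inB) union++;
    }
    return union === 0 ? 0 : intersection / union;
}
```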
## Profiling Web Audio

You can do this with Chrome DevTools; see the WebAudio panel under **More tools**.
### How `AudioWorkletNode` and `AudioWorkletProcessor` work together
```
┌─────────────────────────┐                ┌──────────────────────────┐
│                         │                │                          │
│    Main Global Scope    │                │  AudioWorkletGlobalScope │
│                         │                │                          │
│  ┌───────────────────┐  │                │  ┌────────────────────┐  │
│  │                   │  │  MessagePort   │  │                    │  │
│  │   AudioWorklet    │◄─┼────────────────┼─►│   AudioWorklet     │  │
│  │   Node            │  │                │  │   Processor        │  │
│  │                   │  │                │  │                    │  │
│  └───────────────────┘  │                │  └────────────────────┘  │
│                         │                │                          │
└─────────────────────────┘                └──────────────────────────┘
        Main Thread                           WebAudio Render Thread
```
### Constraints when using the `AudioWorklet`

- Each `BaseAudioContext` possesses exactly one `AudioWorklet`
- 128 sample-frames per render quantum
- No `fetch` API in the `AudioWorkletGlobalScope`
- No `TextEncoder`/`TextDecoder` APIs in the `AudioWorkletGlobalScope`
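One practical consequence of the 128 sample-frame render quantum: a processor whose model expects larger frames (such as a denoiser) has to buffer samples across `process()` calls. A hypothetical sketch of such buffering, with a made-up 480-sample model frame size:

```typescript
// Hypothetical sketch: accumulate 128-sample render quanta until a
// full model frame (here 480 samples, an assumed size) is available.
class FrameBuffer {
    private pendingSamples: number[] = [];

    constructor(private readonly frameSize: number) {}

    // Push one render quantum; return any complete frames produced.
    push(quantum: Float32Array): Float32Array[] {
        for (const sample of quantum) {
            this.pendingSamples.push(sample);
        }
        const frames: Float32Array[] = [];
        while (this.pendingSamples.length >= this.frameSize) {
            frames.push(
                Float32Array.from(
                    this.pendingSamples.splice(0, this.frameSize),
                ),
            );
        }
        return frames;
    }

    // Number of samples buffered but not yet emitted as a frame
    get pending(): number {
        return this.pendingSamples.length;
    }
}
```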
## References
A library for media analysis using Web APIs.
### Enumerations
| Enumeration | Description |
|---|---|
| AbortReason | - |
### Interfaces
| Interface | Description |
|---|---|
| Segmentation | - |
| VideoFrameLike | - |
| ImageRecord | - |
| Rect | - |
| ProcessEventOpen | Message event to initialize the Processor |
| ProcessEventProcess | Message event to process the video frame |
| ProcessEventUpdate | Message event to update processor options |
| ProcessEventClose | Message event to stop the processing |
| ProcessEventDestroy | Message event to release all the resources |
| ProcessEventDebug | Message event for debugging |
| ProcessWorkerEventProcessed | - |
| ProcessWorkerEventOpened | - |
| ProcessWorkerEventUpdated | - |
| ProcessWorkerEvent | - |
| ProcessWorkerEventDebug | - |
| ProcessorOptions | - |
| ProcessorProcessOptions | - |
| ProcessWorkerDebugMessage | - |
| ProcessorEvent | - |
| RendererOptions | Options controlling the rendering and effects for person/background segmentation. All values are stable per-frame and can be safely updated in real-time. |
| Renderer | - |
| RenderEventHandlers | - |
| SegmentationModelAsset | Segmentation model asset that can be used to load a segmentation model. |
| ImageSegmenterOptions | - |
| SegmenterOptions | - |
| SegmentationSmoothingConfig | - |
| MPMask | Wrapper for a mask produced by a Segmentation Task. |
| Stats | - |
| Weights | - |
| SelectionOptions | - |
| Benchmark | - |
| Point | Interface for a Point consisting of coordinates x and y |
| Size | - |
| Frame | - |
| AudioStats | Data structure for the audio statistics while processing |
| StatsOptions | Stats options for calculating the audio stats |
| SubscribableOptions | - |
| WorkletMessagePortOptions | - |
| AnalyzerSubscribableOptions | - |
| Denoise | A wrapper for the Denoise wasm module |
| Gain | - |
| Analyzer | - |
| AudioNodeProps | - |
| AudioNodeInit | - |
| WorkletModule | - |
| AudioGraphOptions | - |
| AudioGraph | - |
| Clock | Clock interface to get the current time with the now method |
| ThrottleOptions | Limit the rate of flow in milliseconds, using the provided Clock |
| Runner | - |
| Transform | - |
| AsyncAssets | - |
| Process | - |
| BackgroundImage | - |
| Segmenter | - |
| Detector | - |
| SegmentationParams | Options controlling the rendering and effects for person/background segmentation. All values are stable per-frame and can be safely updated in real-time. |
| VideoProcessor | - |
### Type Aliases

### Variables

### Functions
| Function | Description |
|---|---|
| nearestPowerOfTwo | Coerce any positive number to the nearest power of two within [minPow, maxPow]. Example: nearestPowerOfTwo(13) => 16; nearestPowerOfTwo(3.2) => 4 |
| iou | Intersection over Union (IoU), how much two masks overlap as a fraction of either's total. 1.0 = perfect match, 0 = no overlap. Used for stable, robust tracking of the main person between frames, especially with bystander removal and temporal instance switching. |
| labelComponents | - |
| scoreComponent | - |
| createAudioContext | A function to create an AudioContext using the constructor or a factory function, depending on browser support |
| resumeAudioOnInterruption | Resume the stream whenever interrupted |
| resumeAudioOnUnmute | Resume the AudioContext whenever the source track is unmuted |
| subscribeWorkletNode | Subscribe to MessagePort messages from an AudioWorkletNode |
| subscribeTimeoutAnalyzerNode | Subscribe to a timeout loop to get the data from Analyzer |
| createStreamSourceGraphNode | Create a MediaStreamAudioSourceNodeInit |
| createMediaElementSourceGraphNode | Create a MediaElementAudioSourceNodeInit |
| createAnalyzerSubscribableGraphNode | Create an analyzer node with push-based subscription |
| createDenoiseWorkletGraphNode | Create a noise suppression node |
| createGainGraphNode | Create a GainNodeInit |
| createAnalyzerGraphNode | Create an AnalyzerNodeInit |
| createStreamDestinationGraphNode | Create a MediaStreamAudioDestinationNodeInit |
| createAudioDestinationGraphNode | Create an AudioDestinationNode |
| createDelayGraphNode | Create a DelayNode |
| createChannelSplitterGraphNode | Create a ChannelSplitterNode |
| createChannelMergerGraphNode | Create a ChannelMergerNode |
| createAudioGraph | Accepts AudioNodeInitConnections to build the audio graph within a single audio context |
| createAudioGraphProxy | - |
| createBenchmark | Creates a benchmarking utility for measuring frame durations and calculating frames per second (FPS). |
| createWindowedStats | - |
| sum | Sum an array of numbers |
| avg | Average an array of numbers |
| pow | pow function from Math in functional form number -> number -> number |
| rms | Calculate the Root Mean Square from provided numbers |
| round | Round the floating point number away from zero, which is different from Math.round |
| createAudioStats | AudioStats builder |
| fromByteToFloat | Convert a byte to float, according to web audio spec |
| fromFloatToByte | Convert a float to byte, according to web audio spec |
| copyByteBufferToFloatBuffer | Copy data from Uint8Array buffer to Float32Array buffer with byte to float conversion |
| toDecibel | Convert a floating-point gain value into a dB representation without any reference (dBFS, https://en.wikipedia.org/wiki/DBFS) |
| processAverageVolume | Calculate the averaged volume using Root Mean Square, assuming the data is in float form |
| isSilent | Simple silent detection to only check the first and last bit from the sample |
| isLowVolume | Check the provided gain against the low-volume threshold to determine whether it is considered low volume |
| isClipping | Check if there is clipping |
| isMono | Check if provided channels are mono or stereo |
| getAudioStats | Calculate the audio stats, expecting the samples in float form |
| isVoiceActivity | A naive voice activity detection |
| isEqualSize | Compare the provided width and height to see if they are the same |
| createVoiceDetectorFromTimeData | A function to check whether the provided time-series data is considered voice activity |
| createVoiceDetectorFromProbability | A function to check whether the provided probability is considered voice activity |
| createVADetector | Create a voice detector based on provided params |
| createAudioSignalDetector | Create a function to process the AudioStats and check for silence; the onSignalDetected callback is called under 2 situations |
| isAudioNode | - |
| isAudioParam | - |
| isAudioNodeInit | - |
| isAnalyzerNodeInit | - |
| createAsyncCallbackLoop | Create an async callback loop to be called recursively with delay based on the frameRate |
| createCanvasTransform | - |
| loadScript | - |
| loadWasms | - |
| createSegmenter | - |
| isRenderEffects | - |
| createCanvas | Create a Canvas element with provided width and height |
| setVideoElementSrc | - |
| playVideo | - |
| createFrameCallbackRequest | Create a callback loop for video frame processing using requestVideoFrameCallback under-the-hood when available otherwise our fallback implementation based on setTimeout. |
| createVideoProcessor | - |
| createVideoTrackProcessor | Video track processor using MediaStreamTrackProcessor API |
| createVideoTrackProcessorWithFallback | Video track processor using the Canvas captureStream API |
| calculateDistance | Calculate the distance between two Points |
| getBezierCurveControlPoints | Spline Interpolation for Bezier Curve |
| line | Create a straight line path command |
| curve | Create a cubic Bezier curve path command |
| closedCurve | Create a cubic Bezier curve path, then turn back to the starting point with the provided point of reference |
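To illustrate two of the helpers above: a hedged sketch of `nearestPowerOfTwo` and the byte/float sample conversions from the Web Audio spec, where byte samples centre on 128 (so byte 128 maps to float 0.0). The default clamping bounds and exact rounding behaviour here are assumptions, not the library's actual signatures:

```typescript
// Sketch of nearestPowerOfTwo: round the base-2 logarithm, then clamp
// the exponent into [minPow, maxPow] (default bounds assumed).
function nearestPowerOfTwo(n: number, minPow = 0, maxPow = 15): number {
    const pow = Math.min(maxPow, Math.max(minPow, Math.round(Math.log2(n))));
    return 2 ** pow;
}

// Sketches of the byte <-> float conversions: bytes centre on 128,
// so the float step size is 1/128 and byte 128 maps to 0.0.
function fromByteToFloat(byte: number): number {
    return (byte - 128) / 128;
}

function fromFloatToByte(float: number): number {
    return Math.max(0, Math.min(255, Math.round(float * 128 + 128)));
}
```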