@pexip/media-processor
A package for media data processing using Web APIs.
Install
npm install @pexip/media-processor
Usage
Use Analyzer to get the MediaStream data
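import {
    createAnalyzerGraphNode,
    createAudioGraph,
    createStreamSourceGraphNode,
} from '@pexip/media-processor';

const stream = await navigator.mediaDevices.getUserMedia({audio: true});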
const fftSize = 64;
// Setup an Audio Graph with `source` -> `analyzer`
const source = createStreamSourceGraphNode(stream);
const analyzer = createAnalyzerGraphNode({fftSize});
const audioGraph = createAudioGraph([[source, analyzer]]);
// Grab the current time domain data in floating point representation
const buffer = new Float32Array(fftSize);
analyzer.node?.getFloatTimeDomainData(buffer);
// Do some work with the buffer
buffer.forEach(sample => {
    /* ... */
});
// Get the current volume, [0, 1]
const volume = analyzer.getAverageVolume(buffer);
// Release the resources when you are done with the analyzer
await audioGraph.release();
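As a rough illustration of what a volume in `[0, 1]` can mean for float time-domain data, here is a minimal sketch of a Root Mean Square calculation. The helper name `rmsVolume` is hypothetical and is not part of `@pexip/media-processor`; it only assumes the samples are floats in `[-1, 1]`, as produced by `getFloatTimeDomainData`.

```typescript
// Hypothetical helper for illustration only, not the library's implementation.
// Computes the Root Mean Square of float samples expected to be in [-1, 1],
// which yields a value in [0, 1].
function rmsVolume(samples: Float32Array): number {
    if (samples.length === 0) {
        return 0;
    }
    // Sum the squares, then take the square root of the mean.
    const sumOfSquares = samples.reduce(
        (acc, sample) => acc + sample * sample,
        0,
    );
    return Math.sqrt(sumOfSquares / samples.length);
}
```

A buffer of silence gives `0`, while a full-scale square wave gives `1`.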
Use AudioGraph to control the audio gain to mute/unmute
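import {
    createAudioGraph,
    createGainGraphNode,
    createStreamDestinationGraphNode,
    createStreamSourceGraphNode,
} from '@pexip/media-processor';

const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
});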
const mute = !stream.getAudioTracks()[0]?.enabled;
// Setup an Audio Graph with `source` -> `gain`
const source = createStreamSourceGraphNode(stream);
const gain = createGainGraphNode(mute);
const destination = createStreamDestinationGraphNode();
const audioGraph = createAudioGraph([[source, gain, destination]]);
// Use the output MediaStream for the altered AudioTrack
const alteredStream = new MediaStream([
    ...stream.getVideoTracks(),
    ...destination.stream.getAudioTracks(),
]);
// Mute the audio
if (gain.node) {
    gain.node.mute = true;
}
// Check if the audio is muted
gain.node?.mute; // returns `true`, since we have just set the gain to 0
// Release the resources when you are done with the audio graph
await audioGraph.release();
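The comment in the example above notes that muting corresponds to a gain of 0. A minimal sketch of that idea follows, assuming only an object shaped like a Web Audio `GainNode` (a `gain` parameter with a numeric `value`). The names `GainLike` and `createMuteControl` are hypothetical and do not describe how the library implements `mute`.

```typescript
// Structural stand-in for a Web Audio GainNode: only `gain.value` is needed.
interface GainLike {
    gain: {value: number};
}

// Hypothetical helper for illustration only.
// Exposes a boolean `mute` property backed by the node's gain value.
function createMuteControl(node: GainLike, unmutedGain = 1) {
    return {
        // Muted means the gain has been set to exactly 0.
        get mute(): boolean {
            return node.gain.value === 0;
        },
        // Muting sets the gain to 0, unmuting restores the given gain.
        set mute(muted: boolean) {
            node.gain.value = muted ? 0 : unmutedGain;
        },
    };
}
```

Because the helper only depends on the shape of the node, it can be exercised with a plain object such as `{gain: {value: 1}}` outside of a browser.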
Use noise suppression
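import {
    createAudioGraph,
    createDenoiseWorkletGraphNode,
    createStreamDestinationGraphNode,
    createStreamSourceGraphNode,
} from '@pexip/media-processor';

const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
});
// Fetch denoise WebAssembly
const response = await fetch(
    new URL('@pexip/denoise/denoise_bg.wasm', import.meta.url).href,
);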
const wasmBuffer = await response.arrayBuffer();
// Setup an Audio Graph with `source` -> `destination`
const source = createStreamSourceGraphNode(stream);
const destination = createStreamDestinationGraphNode();
const audioGraph = createAudioGraph([[source, destination]]);
// Add worklet module
await audioGraph.addWorklet(
    new URL(
        '@pexip/media-processor/dist/worklets/denoise.worklet',
        import.meta.url,
    ).href,
);
const denoise = createDenoiseWorkletGraphNode(wasmBuffer);
// Route the source through the denoise node
audioGraph.connect([source, denoise, destination]);
audioGraph.disconnect([source, destination]);
// Release the resources when you are done with the audio graph
await audioGraph.release();
Use background blur
import {
    createSegmenter,
    createCanvasTransform,
    createVideoProcessor,
    createVideoTrackProcessor,
    createVideoTrackProcessorWithFallback,
} from '@pexip/media-processor';
// Grab the user's camera stream
const stream = await navigator.mediaDevices.getUserMedia({video: true});
// Set the path to the `@mediapipe/tasks-vision` assets
// It will be passed directly to
// [FilesetResolver.forVisionTasks()](https://ai.google.dev/edge/api/mediapipe/js/tasks-vision.filesetresolver#filesetresolverforvisiontasks)
const tasksVisionBasePath =
    'A base path to specify the directory the Wasm files should be loaded from';
const modelAsset = {
    /**
     * Path to mediapipe selfie segmentation model asset
     */
    path: 'A path to selfie segmentation model',
    modelName: 'selfie' as const,
};
const segmenter = createSegmenter(tasksVisionBasePath, {modelAsset});
// Create a processing transformer and set the effects to `blur`
const transformer = createCanvasTransform(segmenter, {effects: 'blur'});
const videoProcessor = createVideoProcessor([transformer]);
// Start the processor
await videoProcessor.open();
// Pass the raw MediaStream to apply the effects
// Then, use the output stream for whatever purpose
const processedStream = await videoProcessor.process(stream);
Profiling Web Audio
You can profile Web Audio with Chrome. See here.
How AudioWorkletNode and the AudioWorkletProcessor work together
┌─────────────────────────┐ ┌──────────────────────────┐
│ │ │ │
│ Main Global Scope │ │ AudioWorkletGlobalScope │
│ │ │ │
│ ┌───────────────────┐ │ │ ┌────────────────────┐ │
│ │ │ │ MessagePort │ │ │ │
│ │ AudioWorklet │◄─┼─────────────┼─►│ AudioWorklet │ │
│ │ Node │ │ │ │ Processor │ │
│ │ │ │ │ │ │ │
│ └───────────────────┘ │ │ └────────────────────┘ │
│ │ │ │
└─────────────────────────┘ └──────────────────────────┘
Main Thread WebAudio Render Thread
Constraints when using the AudioWorklet
- Each BaseAudioContext possesses exactly one AudioWorklet
- Audio is processed in blocks of 128 sample-frames
- No fetch API in the AudioWorkletGlobalScope
- No TextEncoder/Decoder APIs in the AudioWorkletGlobalScope
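The 128 sample-frame constraint matters when an algorithm expects a different frame size: the incoming blocks have to be buffered until a full frame is available. The sketch below shows one way to do that. `FrameAccumulator` and the 480-sample frame size used in the usage note are hypothetical and chosen for illustration; they are not taken from this package's worklets.

```typescript
// Number of sample-frames delivered per AudioWorkletProcessor.process() call.
const RENDER_QUANTUM = 128;

// Hypothetical helper for illustration only.
// Collects fixed-size input blocks and emits frames of `frameSize` samples.
class FrameAccumulator {
    private readonly buffer: Float32Array;
    private readonly frameSize: number;
    private readonly onFrame: (frame: Float32Array) => void;
    private filled = 0;

    constructor(frameSize: number, onFrame: (frame: Float32Array) => void) {
        if (!Number.isInteger(frameSize) || frameSize <= 0) {
            throw new RangeError('frameSize must be a positive integer');
        }
        this.frameSize = frameSize;
        this.onFrame = onFrame;
        this.buffer = new Float32Array(frameSize);
    }

    // Append one block, emitting a frame each time the buffer fills up.
    push(block: Float32Array): void {
        let offset = 0;
        while (offset < block.length) {
            const count = Math.min(
                this.frameSize - this.filled,
                block.length - offset,
            );
            this.buffer.set(block.subarray(offset, offset + count), this.filled);
            this.filled += count;
            offset += count;
            if (this.filled === this.frameSize) {
                // Copy so the consumer owns the frame. A real-time processor
                // may prefer to reuse a pool to avoid allocating here.
                this.onFrame(this.buffer.slice());
                this.filled = 0;
            }
        }
    }
}
```

For example, pushing 15 blocks of `RENDER_QUANTUM` samples into `new FrameAccumulator(480, onFrame)` calls `onFrame` four times, since 15 × 128 = 4 × 480.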
References
A library for media analysis using Web APIs.
Enumerations
| Enumeration | Description |
|---|---|
| AbortReason | - |
Interfaces
Type Aliases
Variables
| Variable | Description |
|---|---|
| BACKGROUND_BLUR_AMOUNT | - |
| calculateMaxBlurPass | - |
| clamping | - |
| CLIP_COUNT_THRESHOLD | Default clipping count threshold. Number of consecutive clipThreshold level samples that indicate clipping. |
| CLIP_THRESHOLD | Default clipping detection threshold |
| createLazyProps | - |
| createObjectUpdater | - |
| createRemoteImageBitmap | - |
| EDGE_BLUR_AMOUNT | - |
| FOREGROUND_THRESHOLD | - |
| FRAME_RATE | - |
| getErrorMessage | - |
| handleWebGLContextLoss | - |
| isRenderingEvents | - |
| isSegmentationModel | - |
| LOW_VOLUME_THRESHOLD | Default low volume detection threshold |
| MASK_COMBINE_RATIO | - |
| MONO_THRESHOLD | Default mono detection threshold. Data must be identical within one LSB 16-bit to be identified as mono. |
| PLAY_VIDEO_TIMEOUT | - |
| PROCESS_STATUS | Process status |
| PROCESSING_HEIGHT | - |
| PROCESSING_WIDTH | - |
| RENDER_BACKEND | - |
| RENDER_EFFECT | - |
| RENDERING_EVENTS | - |
| resize | Calculate the coordinates and size of the source and destination to fit the target aspect ratio. |
| SEGMENTATION_MODELS | - |
| SILENT_THRESHOLD | Default silent threshold. At least one LSB 16-bit data (compare is on absolute value). |
| toRenderingEvents | - |
| urls | - |
| VOICE_PROBABILITY_THRESHOLD | Default Voice probability threshold |
Functions
| Function | Description |
|---|---|
| avg | Average an array of numbers |
| calculateDistance | Calculate the distance between two Points |
| closedCurve | Create a cubic Bezier curve path that turns back to the starting point, using the provided point of reference |
| copyByteBufferToFloatBuffer | Copy data from Uint8Array buffer to Float32Array buffer with byte to float conversion |
| createAnalyzerGraphNode | Create an AnalyzerNodeInit |
| createAnalyzerSubscribableGraphNode | Create an analyzer node with push-based subscription |
| createAsyncCallbackLoop | Create an async callback loop to be called recursively with delay based on the frameRate |
| createAudioContext | A function to create an AudioContext using the constructor or a factory function, depending on what the browser supports |
| createAudioDestinationGraphNode | Create an AudioDestinationNode |
| createAudioGraph | Accepts AudioNodeInitConnections to build the audio graph within a single audio context |
| createAudioGraphProxy | - |
| createAudioSignalDetector | Create a function to process the AudioStats and check for silence. The onSignalDetected callback is called under 2 situations: |
| createAudioStats | AudioStats builder |
| createBenchmark | - |
| createCanvas | Create a Canvas element with provided width and height |
| createCanvasTransform | - |
| createChannelMergerGraphNode | Create a ChannelMergerNode |
| createChannelSplitterGraphNode | Create a ChannelSplitterNode |
| createDelayGraphNode | Create a DelayNode |
| createDenoiseWorkletGraphNode | Create a noise suppression node |
| createFrameCallbackRequest | Create a callback loop for video frame processing using requestVideoFrameCallback under-the-hood when available otherwise our fallback implementation based on setTimeout. |
| createGainGraphNode | Create a GainNodeInit |
| createMediaElementSourceGraphNode | Create a MediaStreamAudioSourceNodeInit |
| createSegmenter | - |
| createStreamDestinationGraphNode | Create a MediaStreamAudioDestinationNodeInit |
| createStreamSourceGraphNode | Create a MediaStreamAudioSourceNodeInit |
| createVADetector | Create a voice detector based on provided params |
| createVideoProcessor | - |
| createVideoTrackProcessor | Video track processor using MediaStreamTrackProcessor API |
| createVideoTrackProcessorWithFallback | Video track processor using Canvas[captureStream] API |
| createVoiceDetectorFromProbability | A function to check whether the provided probability is considered voice activity |
| createVoiceDetectorFromTimeData | A function to check whether the provided time series data is considered voice activity |
| curve | Create a cubic Bezier curve path command |
| fromByteToFloat | Convert a byte to float, according to web audio spec |
| fromFloatToByte | Convert a float to byte, according to web audio spec |
| getAudioStats | Calculate the audio stats, expected the samples are in float form |
| getBezierCurveControlPoints | Spline Interpolation for Bezier Curve |
| isAnalyzerNodeInit | - |
| isAudioNode | - |
| isAudioNodeInit | - |
| isAudioParam | - |
| isClipping | Check if there is clipping |
| isEqualSize | Compare the provided width and height to see if they are the same |
| isLowVolume | Check if the provided gain is above the low volume threshold, which is considered as low volume. |
| isMono | Check if provided channels are mono or stereo |
| isRenderEffects | - |
| isSilent | Simple silent detection to only check the first and last bit from the sample |
| isVoiceActivity | A Naive Voice activity detection |
| line | Create a straight line path command |
| loadScript | - |
| loadWasms | - |
| pow | pow function from Math in functional form number -> number -> number |
| processAverageVolume | Calculate the averaged volume using Root Mean Square, assuming the data is in float form |
| resumeAudioOnInterruption | Resume the stream whenever interrupted |
| resumeAudioOnUnmute | Resume the AudioContext whenever the source track is unmuted |
| rms | Calculate the Root Mean Square from provided numbers |
| round | Round the floating point number away from zero, which is different from Math.round |
| subscribeTimeoutAnalyzerNode | Subscribe to a timeout loop to get the data from Analyzer |
| subscribeWorkletNode | Subscribe MessagePort message from an AudioWorkletNode |
| sum | Sum an array of numbers |
| toDecibel | Convert a floating point gain value into a dB representation without any reference, dBFS, https://en.wikipedia.org/wiki/DBFS |
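
Several of the functions above are small numeric conversions. The sketch below shows what such conversions can look like, assuming the standard definitions: dBFS as `20 * log10(gain)`, and the byte mapping used by `AnalyserNode.getByteTimeDomainData` in the Web Audio specification. The function names carry a `Sketch` suffix because they are illustrations, not the library's `toDecibel`, `fromFloatToByte`, or `fromByteToFloat`.

```typescript
// Linear gain to dBFS. A gain of 1 is 0 dBFS, a gain of 0 is -Infinity.
function toDecibelSketch(gain: number): number {
    return 20 * Math.log10(gain);
}

// Float sample in [-1, 1] to a byte in [0, 255], following the Web Audio
// mapping b = floor(128 * (1 + x)), clamped to the byte range.
function fromFloatToByteSketch(sample: number): number {
    const scaled = Math.floor(128 * (1 + sample));
    return Math.max(0, Math.min(255, scaled));
}

// Inverse mapping: byte 128 is the zero level, byte 0 is -1.
function fromByteToFloatSketch(byte: number): number {
    return (byte - 128) / 128;
}
```

Under these definitions every byte value round-trips exactly through `fromByteToFloatSketch` and back through `fromFloatToByteSketch`, while a float of exactly `1` clamps to `255`.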