INTRODUCTION
A pure Node.js computer vision engine. Zero native dependencies. Custom codecs, SharedArrayBuffer architecture, and dynamic kernel fusion.
Traditional Node.js image libraries like Sharp rely on libvips, shipped through native C++ addons. Cervid Vision breaks from this by implementing its own decoders (JPEG, PNG, PPM) in pure JavaScript and placing the raw pixel data directly into contiguous SharedArrayBuffer memory.
Image operations (kernels) are written in optimized JavaScript over TypedArray views. Rather than copying the image eagerly at every step, the VisionPipeline can detect common workflows (like Grayscale → Blur → Sobel edges) and fuse them into a single multi-threaded pass across your CPU cores.
INSTALLATION
Cervid Vision requires Node.js 18+ for SharedArrayBuffer and Worker Threads support.
npm i @cervid/vision
import { Vision, VisionImage, VisionPipeline } from '@cervid/vision'
QUICKSTART
A complete example that reads an image, detects edges, draws bounding boxes around the connected components, and saves the result to disk.
import { Vision } from '@cervid/vision'

// 1. Read a JPEG or PNG
const img = await Vision.read('input.jpg')

// 2. Optimized Pipeline: Grayscale -> Blur -> Sobel Edges
await img.pipeline()
  .grayscale()
  .blur(1)
  .edges()
  .runAsync() // Fuses these 3 ops and runs in parallel

// 3. Find Connected Components
const components = img.connectedComponents({ minArea: 50 })

// 4. Draw bounding boxes on the result
img.drawBoxes(components, { r: 255, g: 0, b: 0 }, 2)

// 5. Save back to disk
await img.save('output_edges.png')
VISIONIMAGE
The primary wrapper class for image manipulation. Kernels attach themselves to this class dynamically.
Vision.read(path): Reads an image from disk into memory. Supported formats: .jpg, .png, .ppm.
img.save(path): Saves the image to disk. The format is inferred from the file extension (e.g. .jpg, .png). A 1-channel grayscale image is automatically converted to 3-channel RGB when saved as JPEG.
img.pipeline(): Creates an optimizable VisionPipeline for this image. Useful for fusing multiple operations together.
A separate method converts a 1-channel grayscale image back to 3-channel RGB.
Kernels (e.g. .blur(), .resize(), .invert()) are automatically attached as methods to VisionImage, so you can chain them directly.
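To make "kernels attach themselves dynamically" concrete, here is a minimal sketch of one way such attachment can work. It does not show Cervid Vision's actual internals: `SketchImage` and `registerKernel` are hypothetical names invented for this illustration, and the `>=` comparison in the threshold kernel is an assumption.

```javascript
// Hypothetical stand-in for VisionImage; not the library's real class.
class SketchImage {
  constructor(data) {
    this.data = data; // Uint8Array of pixel values
  }
}

// Hypothetical registry: adds a kernel to the prototype as a method.
function registerKernel(name, fn) {
  SketchImage.prototype[name] = function (...args) {
    // Returning a new image from every call is what allows chaining.
    return new SketchImage(fn(this.data, ...args));
  };
}

registerKernel('invert', (data) => data.map((v) => 255 - v));
registerKernel('threshold', (data, t = 128) =>
  data.map((v) => (v >= t ? 255 : 0))
);

const out = new SketchImage(Uint8Array.of(10, 200)).invert().threshold(128);
console.log(Array.from(out.data)); // [ 255, 0 ]
```

Because the methods live on the prototype, a kernel registered after an image was created is still callable on that image.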
VISIONPIPELINE
Lazy execution, kernel fusion, and parallel worker pools for complex workloads.
While you can execute methods eagerly (img.grayscale().blur().edges()), this creates an intermediate buffer copy in RAM for every step. The VisionPipeline instead records the operations and defers execution, so compatible steps can be fused into a single pass.
Known Fusions:
- gbe: Grayscale + Blur(1) + Edges (Sobel)
- be: Blur(1) + Edges (Sobel)
- gt: Grayscale + Threshold
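As an illustration of what fusing buys, the sketch below compares an unfused Grayscale + Threshold with a fused single-loop version. It is not the library's kernel: the function names are hypothetical, the weights 77/150/29 are one common integer approximation of BT.601 and are assumed here, and so is the `>=` comparison.

```javascript
// Unfused: grayscale writes an intermediate buffer, threshold reads it back.
function grayThenThreshold(rgb, t) {
  const gray = new Uint8Array(rgb.length / 3);
  for (let i = 0, p = 0; p < gray.length; i += 3, p++) {
    gray[p] = (77 * rgb[i] + 150 * rgb[i + 1] + 29 * rgb[i + 2]) >> 8;
  }
  return gray.map((v) => (v >= t ? 255 : 0)); // second pass, second buffer
}

// Fused: one loop, one output buffer, no intermediate grayscale array.
function fusedGrayThreshold(rgb, t) {
  const out = new Uint8Array(rgb.length / 3);
  for (let i = 0, p = 0; p < out.length; i += 3, p++) {
    const g = (77 * rgb[i] + 150 * rgb[i + 1] + 29 * rgb[i + 2]) >> 8;
    out[p] = g >= t ? 255 : 0;
  }
  return out;
}

// Three RGB pixels: white, black, dark red.
const rgbPixels = Uint8Array.of(255, 255, 255, 0, 0, 0, 200, 10, 10);
const unfused = grayThenThreshold(rgbPixels, 128);
const fused = fusedGrayThreshold(rgbPixels, 128);
console.log(Array.from(fused)); // [ 255, 0, 0 ]
```

Both functions produce the same mask; the fused one touches each pixel once and allocates one buffer instead of two.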
The main-thread path executes the recorded operations sequentially, fusing compatible ops to skip intermediate arrays.
runAsync() is the fastest path. If a fusion is detected and the image is large enough (>= 2 megapixels), the workload is distributed across the PipelineWorkerPool.
Because the pixel data lives in a SharedArrayBuffer, workers compute horizontal slices of the image in parallel without copying memory.
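The horizontal slicing described above can be sketched as follows. This is an assumption-laden illustration, not the PipelineWorkerPool itself: `splitRows` is a hypothetical helper, and the `fill` call stands in for the kernel a real worker would run on its slice.

```javascript
// Split `height` rows into at most `workers` contiguous, near-equal slices.
function splitRows(height, workers) {
  const slices = [];
  const base = Math.floor(height / workers);
  const extra = height % workers; // the first `extra` slices get one more row
  let y = 0;
  for (let i = 0; i < workers; i++) {
    const rows = base + (i < extra ? 1 : 0);
    if (rows === 0) break;
    slices.push({ yStart: y, yEnd: y + rows });
    y += rows;
  }
  return slices;
}

const width = 4;
const height = 10;
const shared = new SharedArrayBuffer(width * height); // 1-channel image
const slices = splitRows(height, 3);

slices.forEach(({ yStart, yEnd }, i) => {
  // A view over this slice's rows: same memory, no copy.
  const rows = new Uint8Array(shared, yStart * width, (yEnd - yStart) * width);
  rows.fill(i + 1); // stand-in for the work a worker thread would do
});

console.log(slices);
```

In a real pool each slice descriptor would be posted to a worker together with the SharedArrayBuffer, which is shared rather than cloned. Kernels with a neighborhood, such as blur or Sobel, would also need to read a few rows beyond their own slice.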
VISIONFRAME
The internal memory primitive backed by a SharedArrayBuffer and a Uint8Array view.
| Property | Type | Description |
|---|---|---|
| width | number | Image width in pixels |
| height | number | Image height in pixels |
| channels | number | 1 (Grayscale), 3 (RGB), or 4 (RGBA) |
| buffer | SharedArrayBuffer | The underlying raw memory block |
| data | Uint8Array | Flat view over the buffer, length = w*h*c |
Use frame.get(x, y, c) and frame.set(x, y, c, value) to safely access pixels directly.
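The properties above suggest a layout like the one sketched here. `SketchFrame` is a hypothetical class written for illustration; it assumes interleaved, row-major storage and throws on out-of-range coordinates, neither of which the documentation confirms for the real VisionFrame.

```javascript
// Hypothetical frame: a SharedArrayBuffer plus a flat Uint8Array view.
class SketchFrame {
  constructor(width, height, channels) {
    this.width = width;
    this.height = height;
    this.channels = channels;
    this.buffer = new SharedArrayBuffer(width * height * channels);
    this.data = new Uint8Array(this.buffer); // length = w * h * c
  }

  // Assumed layout: row-major pixels with interleaved channels.
  index(x, y, c) {
    if (
      x < 0 || y < 0 || c < 0 ||
      x >= this.width || y >= this.height || c >= this.channels
    ) {
      throw new RangeError(`(${x}, ${y}, ${c}) is outside the frame`);
    }
    return (y * this.width + x) * this.channels + c;
  }

  get(x, y, c) {
    return this.data[this.index(x, y, c)];
  }

  set(x, y, c, value) {
    this.data[this.index(x, y, c)] = value;
  }
}

const frame = new SketchFrame(4, 2, 3); // 4x2 RGB
frame.set(1, 1, 2, 200); // blue channel of pixel (1, 1)
console.log(frame.get(1, 1, 2), frame.data.length); // 200 24
```

Under this layout the flat offset of channel c at pixel (x, y) is (y * width + x) * channels + c, which is what direct access to data would have to compute by hand.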
ADJUSTMENTS
Color and value adjustments. These methods can be called directly on a VisionImage.
| Method | Description |
|---|---|
| grayscale() | Converts RGB/RGBA to a 1-channel grayscale image (BT.601 integer approx). |
| invert() | Inverts pixel values (255 - x). |
| threshold(value = 128) | Binarizes a grayscale image to 0 or 255 based on the threshold. |
| adaptiveThreshold({ blockSize, c, invert }) | Local adaptive thresholding using an Integral Image O(1) box sum approach. |
| brightnessContrast(b, c) | Fast LUT-based adjustment. Brightness (-255 to 255), Contrast (0.0 flat to >1 high). |
| gamma(g) | Gamma correction via LUT. g < 1 brightens, g > 1 darkens. |
| normalize() | Stretches values to use the full 0–255 range per-channel. |
| equalizeHistogram() | Flattens the contrast distribution globally (Grayscale only). |
| extractChannel(c) | Extracts a single channel as a grayscale image (0=R, 1=G, 2=B). |
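The LUT-based adjustments in this table share one idea: compute the transfer function once for all 256 possible values, then map every pixel through the table. A minimal sketch for gamma, with hypothetical function names and an assumed rounding rule:

```javascript
// Precompute out = 255 * (in / 255) ^ g for every possible 8-bit value.
function buildGammaLut(g) {
  const lut = new Uint8Array(256);
  for (let i = 0; i < 256; i++) {
    lut[i] = Math.round(255 * Math.pow(i / 255, g));
  }
  return lut;
}

// Applying the adjustment is then one table lookup per byte.
function applyLut(data, lut) {
  const out = new Uint8Array(data.length);
  for (let i = 0; i < data.length; i++) {
    out[i] = lut[data[i]];
  }
  return out;
}

const brighten = buildGammaLut(0.5); // g < 1 lifts midtones
const darken = buildGammaLut(2.0); // g > 1 pushes midtones down
const sample = Uint8Array.of(0, 128, 255);
console.log(Array.from(applyLut(sample, brighten)));
console.log(Array.from(applyLut(sample, darken)));
```

The endpoints 0 and 255 map to themselves for any positive g; only the values in between move.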
FILTERS & EDGES
Spatial domain convolution filters.
| Method | Description |
|---|---|
| gaussianBlur(radius = 1, sigma?) | Separable Gaussian Blur. If radius=1, uses an ultra-fast hardcoded 3x3 pass. |
| boxBlur(radius = 1) | Mean filter convolution. |
| sobel() | Detects edges via X and Y gradients. Requires grayscale input. |
| sharpen(strength = 1) | Unsharp mask using a 3x3 convolution kernel. |
| convolve(kernel, kw, kh) | Apply any custom flat array kernel. Must provide odd dimensions (e.g. 3, 3). |
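A flat-array kernel with odd dimensions can be applied as in the sketch below. This is an illustration for 1-channel data, not the library's convolve: the function name is hypothetical, borders are handled by clamping coordinates to the image edge (an assumption), and the kernel is applied without flipping, which is the usual convention in image code and is identical to true convolution for symmetric kernels.

```javascript
// Apply a kw x kh kernel (flat, row-major) to a 1-channel image.
function convolveGray(src, width, height, kernel, kw, kh) {
  if (kw % 2 === 0 || kh % 2 === 0 || kernel.length !== kw * kh) {
    throw new Error('kernel must have odd dimensions and kw * kh entries');
  }
  const out = new Uint8Array(src.length);
  const rx = kw >> 1;
  const ry = kh >> 1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0;
      for (let ky = 0; ky < kh; ky++) {
        for (let kx = 0; kx < kw; kx++) {
          // Clamp sample coordinates so border pixels reuse the edge value.
          const sx = Math.min(width - 1, Math.max(0, x + kx - rx));
          const sy = Math.min(height - 1, Math.max(0, y + ky - ry));
          sum += src[sy * width + sx] * kernel[ky * kw + kx];
        }
      }
      out[y * width + x] = Math.min(255, Math.max(0, Math.round(sum)));
    }
  }
  return out;
}

// A vertical step edge: two dark columns, then one bright column.
const step = Uint8Array.of(0, 0, 255, 0, 0, 255, 0, 0, 255);
const sobelX = [-1, 0, 1, -2, 0, 2, -1, 0, 1];
const identity = [0, 0, 0, 0, 1, 0, 0, 0, 0];
const gx = convolveGray(step, 3, 3, sobelX, 3, 3);
const same = convolveGray(step, 3, 3, identity, 3, 3);
console.log(gx[4]); // 255 (the raw response of 1020 is clamped)
```

Clamping the result to 0..255 discards negative gradients, so a full Sobel would normally combine the X and Y responses before clamping.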
GEOMETRY
Structural transformations.
| Method | Description |
|---|---|
| crop(x, y, w, h) | Extracts a sub-region into a new frame. |
| resize(w, h, method?) | Resizes using 'nearest', 'bilinear', or 'area' (for downsizing). |
| scale(factor, method?) | Scales by a multiplier (e.g. 0.5 for half size). |
| flipH() / flipV() | Flips the image horizontally or vertically. |
| rotate90() | Rotates the image 90 degrees clockwise. |
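As a worked example of the index arithmetic behind these transforms, the sketch below rotates an interleaved image 90 degrees clockwise. The function name and the returned object shape are hypothetical; only the coordinate mapping is the point.

```javascript
// Rotate clockwise: source pixel (x, y) lands at (height - 1 - y, x),
// and the output swaps width and height.
function rotate90Clockwise(src, width, height, channels) {
  const dst = new Uint8Array(src.length);
  const newWidth = height;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const dx = height - 1 - y;
      const dy = x;
      const s = (y * width + x) * channels;
      const d = (dy * newWidth + dx) * channels;
      for (let c = 0; c < channels; c++) {
        dst[d + c] = src[s + c];
      }
    }
  }
  return { data: dst, width: height, height: width };
}

// 3 wide, 2 tall, 1 channel:
//   1 2 3
//   4 5 6
const rotated = rotate90Clockwise(Uint8Array.of(1, 2, 3, 4, 5, 6), 3, 2, 1);
console.log(rotated.width, rotated.height, Array.from(rotated.data));
// 2 3 [ 4, 1, 5, 2, 6, 3 ]
```

The left column of the source, read bottom to top, becomes the top row of the result.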
MORPHOLOGY
Operations for binary masks and shapes. Require 1-channel input.
| Method | Description |
|---|---|
| erode(radius = 1) | Shrinks bright regions. Requires all pixels in neighborhood to be >0. |
| dilate(radius = 1) | Expands bright regions. Requires any pixel in neighborhood to be >0. |
| open(radius = 1) | Erode followed by Dilate. Removes small noise. |
| close(radius = 1) | Dilate followed by Erode. Closes small holes. |
| inRangeGray(min, max) | Creates a binary mask where pixels within range are 255, else 0. |
| inRangeRGB({r, g, b}) | Creates a mask based on RGB boundary arrays (e.g. { r: [100, 255] }). |
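The all-versus-any rule behind erode and dilate can be sketched as follows. This is not the library's implementation: the names are hypothetical, the neighborhood is assumed to be a square of side 2 * radius + 1, and pixels outside the image are simply ignored.

```javascript
// requireAll = true gives erosion, requireAll = false gives dilation.
function morph(src, width, height, radius, requireAll) {
  const out = new Uint8Array(src.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let bright = 0;
      let total = 0;
      for (let dy = -radius; dy <= radius; dy++) {
        for (let dx = -radius; dx <= radius; dx++) {
          const sx = x + dx;
          const sy = y + dy;
          if (sx < 0 || sy < 0 || sx >= width || sy >= height) continue;
          total++;
          if (src[sy * width + sx] > 0) bright++;
        }
      }
      const hit = requireAll ? bright === total : bright > 0;
      out[y * width + x] = hit ? 255 : 0;
    }
  }
  return out;
}

const erodeSketch = (s, w, h, r = 1) => morph(s, w, h, r, true);
const dilateSketch = (s, w, h, r = 1) => morph(s, w, h, r, false);
const openSketch = (s, w, h, r = 1) =>
  dilateSketch(erodeSketch(s, w, h, r), w, h, r);

// 5x5 mask with a bright 3x3 block in the middle.
const block = new Uint8Array(25);
for (let y = 1; y <= 3; y++) {
  for (let x = 1; x <= 3; x++) block[y * 5 + x] = 255;
}
const eroded = erodeSketch(block, 5, 5); // only the center pixel survives
const reopened = openSketch(block, 5, 5); // the block is restored

// A single bright pixel is noise that opening removes.
const speck = new Uint8Array(25);
speck[12] = 255;
const cleaned = openSketch(speck, 5, 5);
```

Opening keeps shapes at least as large as the neighborhood and deletes anything smaller, which is why it works as a noise filter on binary masks.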
ANALYSIS
Functions that extract data or structural information from the image.
| Method | Description |
|---|---|
| histogram() | Returns one Uint32Array(256) per channel, holding the frequency of each pixel value. |
| integral() | Returns an Integral Image representation for O(1) area sums. |
| connectedComponents(opts?) | Finds disjoint shapes in a binary image. Returns an array of objects containing bounding boxes, centroids, and area. Allows filtering by minArea, maxArea, and connectivity (4 or 8). |
| filterComponents(comps, opts) | Utility to filter a components array by area. |
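The O(1) area sums mentioned for integral() rest on a summed-area table. A minimal sketch for 1-channel data follows; the function names and the padded layout (one extra zero row and column) are choices made for this example, not the library's documented format.

```javascript
// sat[(y + 1) * (width + 1) + (x + 1)] holds the sum of all pixels in the
// rectangle from (0, 0) to (x, y) inclusive. Row 0 and column 0 stay zero.
function buildIntegral(src, width, height) {
  const w1 = width + 1;
  const sat = new Uint32Array(w1 * (height + 1));
  for (let y = 0; y < height; y++) {
    let rowSum = 0;
    for (let x = 0; x < width; x++) {
      rowSum += src[y * width + x];
      sat[(y + 1) * w1 + (x + 1)] = sat[y * w1 + (x + 1)] + rowSum;
    }
  }
  return sat;
}

// Sum of the w x h box whose top-left corner is (x, y): four lookups.
function boxSum(sat, width, x, y, w, h) {
  const w1 = width + 1;
  return (
    sat[(y + h) * w1 + (x + w)] -
    sat[y * w1 + (x + w)] -
    sat[(y + h) * w1 + x] +
    sat[y * w1 + x]
  );
}

// 3x3 image:
//   1 2 3
//   4 5 6
//   7 8 9
const tiny = Uint8Array.of(1, 2, 3, 4, 5, 6, 7, 8, 9);
const sat = buildIntegral(tiny, 3, 3);
console.log(boxSum(sat, 3, 0, 0, 3, 3)); // 45
console.log(boxSum(sat, 3, 1, 1, 2, 2)); // 28 (5 + 6 + 8 + 9)
```

Dividing a box sum by its area gives the local mean, which is the quantity an adaptive threshold compares each pixel against. A Uint32Array holds sums up to about 4.29 billion, enough for roughly 16 million 8-bit pixels at full brightness.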
DRAWING
Rasterizing shapes directly onto the image buffer.
Colors are provided as objects: { r: 255, g: 0, b: 0 }. If drawing on grayscale, you can provide { gray: 255 }. The outline methods accept a thickness parameter; the filled variants (drawFilledRect, drawFilledCircle) do not.
| Method | Signature |
|---|---|
| drawPoint | (x, y, color, thickness) |
| drawLine | (x0, y0, x1, y1, color, thickness) — Bresenham's algorithm |
| drawRect | (x, y, w, h, color, thickness) |
| drawFilledRect | (x, y, w, h, color) |
| drawCircle | (cx, cy, r, color, thickness) |
| drawFilledCircle | (cx, cy, r, color) |
| drawBoxes | (boxes[], color, thickness) — Accepts components array directly |
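Since drawLine is listed as using Bresenham's algorithm, here is a minimal sketch of the integer-only form of that algorithm on a 1-channel buffer. The function name and signature are hypothetical, the line is one pixel thick, and coordinates are assumed to be integers.

```javascript
// Bresenham's line algorithm: steps along the line using only integer
// additions and comparisons, covering all octants.
function drawLineGray(data, width, height, x0, y0, x1, y1, value) {
  const dx = Math.abs(x1 - x0);
  const dy = -Math.abs(y1 - y0);
  const sx = x0 < x1 ? 1 : -1;
  const sy = y0 < y1 ? 1 : -1;
  let err = dx + dy;
  while (true) {
    // Skip pixels that fall outside the image instead of wrapping around.
    if (x0 >= 0 && y0 >= 0 && x0 < width && y0 < height) {
      data[y0 * width + x0] = value;
    }
    if (x0 === x1 && y0 === y1) break;
    const e2 = 2 * err;
    if (e2 >= dy) {
      err += dy;
      x0 += sx;
    }
    if (e2 <= dx) {
      err += dx;
      y0 += sy;
    }
  }
}

const canvas = new Uint8Array(16); // 4x4 grayscale
drawLineGray(canvas, 4, 4, 0, 0, 3, 3, 255);
console.log(Array.from(canvas));
```

For the diagonal above, the pixels at flat indices 0, 5, 10, and 15 are set. A thickness greater than one could be built by stamping a small square or disc at each step, though that is only one possible approach.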