Pixel Type Design (Color / Gray / BW)
Design notes for delivering Color (24-bit RGB), Gray (8-bit grayscale), and BW (1-bit bitonal) images consistently across Native Transfer and File Transfer in BN Tech Virtual Scanner.
1. Requirements
The virtual scanner must produce images in the pixel type chosen by the user / application. TWAIN maps these to:
- TWPT_RGB: 24-bit true color (R / G / B, 8 bits each)
- TWPT_GRAY: 8-bit grayscale (0..255)
- TWPT_BW: 1-bit bitonal (0 / 1, semantics defined by ICAP_PIXELFLAVOR)
Functional requirements:
- When an application sets ICAP_PIXELTYPE, the DS must actually convert pixels to the target format (not merely tag the image with the type while returning 24-bit).
- The settings UI offers Color / Gray / BW with behavior identical to capability-driven changes.
- Native Transfer DIBs (BITMAPINFOHEADER + palette + rows) must use the correct biBitCount:
  - RGB → 24
  - GRAY → 8 with 256-entry grayscale palette
  - BW → 1 with 2-entry black / white palette
- File Transfer must save PNG / JPG / BMP / TIFF with bit depth matching the pixel type. JPEG cannot encode 1-bit, so a fallback path is required.
- Pixel-type conversion must happen after DPI rescaling so that BW quantization is not smeared back into grayscale.
- Pixel type, DPI, and page size must be orthogonal: any combination must produce a coherent image and TW_IMAGEINFO.
- All output respects ICAP_PIXELFLAVOR = TWPF_CHOCOLATE: 0 = black, 1 = white.
Non-functional requirements:
- Conversion is as lossless as possible: BW via thresholding, Gray via perceptual luma, RGB pass-through.
- No new third-party dependencies beyond FreeImage.
- Each scan converts pixel type at most once.
- Output files must open correctly in Windows Photos, XnView, Photoshop, NAPS2.
2. Domain knowledge
2.1 TWAIN pixel types
ICAP_PIXELTYPE (TW_UINT16) common values:
| Value | Name | Meaning |
|---|---|---|
| 0 | TWPT_BW | 1-bit bitonal |
| 1 | TWPT_GRAY | 8-bit grayscale |
| 2 | TWPT_RGB | 24-bit true color |
Registered as an ENUMERATION with default TWPT_RGB.
ICAP_BITDEPTH is derived from ICAP_PIXELTYPE:
- RGB → 24, GRAY → 8, BW → 1.
Applications cannot set ICAP_BITDEPTH directly in this project; it is read-only and follows ICAP_PIXELTYPE.
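The derivation of ICAP_BITDEPTH from ICAP_PIXELTYPE is small enough to sketch directly. The helper below borrows the name bppFromPixelType from §5.3, but its body is an assumption, and the kTwpt* constants are local stand-ins for the twain.h values so that the snippet compiles without the TWAIN headers.

```cpp
#include <cstdint>

// Local stand-ins for the twain.h constants (values as listed in the table above).
constexpr std::uint16_t kTwptBw   = 0;  // TWPT_BW
constexpr std::uint16_t kTwptGray = 1;  // TWPT_GRAY
constexpr std::uint16_t kTwptRgb  = 2;  // TWPT_RGB

// Derive the read-only ICAP_BITDEPTH value from the current ICAP_PIXELTYPE.
// Returns 0 for pixel types this project does not expose.
constexpr std::uint16_t bppFromPixelType(std::uint16_t pixel_type) {
    switch (pixel_type) {
        case kTwptBw:   return 1;
        case kTwptGray: return 8;
        case kTwptRgb:  return 24;
        default:        return 0;
    }
}
```

Returning 0 for unknown values is one possible convention; the real helper may instead assert or fall back to 24.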
ICAP_PIXELFLAVOR controls 0 / 1 semantics:
- TWPF_CHOCOLATE: 0 = black, 1 = white (project default).
- TWPF_VANILLA: 0 = white, 1 = black.
2.2 DIB bit-depth and palette rules
Windows DIB hard requirements:
| biBitCount | Palette | Row layout |
|---|---|---|
| 24 | none | BGR BGR BGR ..., row padded to 4 bytes |
| 8 | 256-entry RGBQUAD required | one palette index per byte; row padded to 4 bytes |
| 1 | 2-entry RGBQUAD required | one bit per pixel, MSB first; row padded to 4 bytes |
- 8-bit grayscale DIB palette: 256 entries (i, i, i, 0).
- 1-bit DIB palette: index 0 = black (0,0,0,0), index 1 = white (255,255,255,0).
- Row stride: ((biWidth * biBitCount + 31) / 32) * 4.
A DIB header that declares 8-bit without a palette is malformed.
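The stride formula and palette sizes above can be checked with a short sketch. The function names dibStride and dibPaletteBytes are hypothetical and are not taken from the project sources; the constant 4 stands in for sizeof(RGBQUAD).

```cpp
#include <cstdint>

// Bytes per DIB row, rounded up to a multiple of 4 (one DWORD).
constexpr std::uint32_t dibStride(std::uint32_t width, std::uint32_t bpp) {
    return ((width * bpp + 31u) / 32u) * 4u;
}

// Bytes occupied by the color table that follows BITMAPINFOHEADER.
// Each RGBQUAD is 4 bytes; 24-bit DIBs carry no palette.
constexpr std::uint32_t dibPaletteBytes(std::uint32_t bpp) {
    return bpp == 1 ? 2u * 4u : (bpp == 8 ? 256u * 4u : 0u);
}
```

For a 2480-pixel-wide row (an illustrative width), dibStride gives 312 bytes at 1 bpp, 2480 bytes at 8 bpp, and 7440 bytes at 24 bpp.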
2.3 Pixel types in FreeImage
| TWAIN | FreeImage | API |
|---|---|---|
| TWPT_RGB | 24 bpp BGR (FIT_BITMAP) | FreeImage_ConvertTo24Bits |
| TWPT_GRAY | 8 bpp + 256 gray palette | FreeImage_ConvertToGreyscale |
| TWPT_BW | 1 bpp + 2-entry palette | FreeImage_Threshold / FreeImage_Dither |
Notes:
- FreeImage_ConvertTo8Bits does not guarantee a linear grayscale palette; for a true grayscale DIB use FreeImage_ConvertToGreyscale.
- FreeImage_Threshold(src, T) thresholds 8-bit grayscale data (the FreeImage documentation describes other depths being converted to 8-bit grayscale first); T = 128 is a reasonable default.
- FreeImage_Dither(src, FID_FS) performs Floyd-Steinberg dithering: better for photos, worse for OCR.
2.4 Grayscale weights
- BT.601: Y = 0.299 R + 0.587 G + 0.114 B
- BT.709: Y = 0.2126 R + 0.7152 G + 0.0722 B
FreeImage_ConvertToGreyscale uses BT.601. For scanner test content the difference is negligible.
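To see how the two weightings behave, the formulas can be evaluated directly. This is a minimal sketch with hypothetical function names; it rounds to the nearest integer and does not reproduce FreeImage's internal arithmetic.

```cpp
// Luma from 8-bit R, G, B using the BT.601 weights listed above.
constexpr int lumaBt601(int r, int g, int b) {
    return static_cast<int>(0.299 * r + 0.587 * g + 0.114 * b + 0.5);
}

// Luma from 8-bit R, G, B using the BT.709 weights listed above.
constexpr int lumaBt709(int r, int g, int b) {
    return static_cast<int>(0.2126 * r + 0.7152 * g + 0.0722 * b + 0.5);
}
```

Both functions return 128 for the neutral gray (128, 128, 128). For pure green (0, 255, 0) they return 150 and 182 respectively, so the two standards agree on neutral content and diverge only on saturated colors.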
2.5 File-format support per pixel type
| Format | 1-bit | 8-bit Gray | 24-bit RGB |
|---|---|---|---|
| PNG | yes | yes | yes |
| TIFF | yes (G4 / LZW / Raw) | yes | yes |
| BMP | yes | yes | yes |
| JPEG | no | yes | yes |
JPEG + BW requires either fallback to 8-bit gray, or rejection. The project picks fallback so applications never fail mid-scan, while still reporting TW_IMAGEINFO.PixelType = TWPT_BW to honor the negotiated contract.
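The fallback rule can be stated as a pure function of file format and bitmap depth. The FileFormat enum and onDiskBpp name below are hypothetical and exist only for this sketch; the project itself switches on the TWFF_* constants.

```cpp
// Hypothetical enum mirroring the rows of the table above.
enum class FileFormat { Png, Tiff, Bmp, Jpeg };

// Bit depth actually written to disk for a bitmap with `bpp` bits per pixel.
// Only JPEG + 1-bit deviates: it is promoted to 8-bit grayscale.
constexpr int onDiskBpp(FileFormat fmt, int bpp) {
    return (fmt == FileFormat::Jpeg && bpp == 1) ? 8 : bpp;
}
```

The reported TW_IMAGEINFO.PixelType is deliberately not an input here: it keeps following the negotiated pixel type regardless of what the encoder writes.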
2.6 Order vs. DPI rescaling
See implement_dpi_design.md §2.3: always rescale in 24-bit BGR first, then convert to the target pixel type. Quantizing before rescaling corrupts 1-bit / 8-bit semantics with intermediate grays.
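A four-pixel toy row shows why the order matters. The sketch below uses a 2:1 pair-averaging downscale as a stand-in for the real resampler, which is an assumption; all names are hypothetical.

```cpp
#include <array>
#include <cstdint>

using Row4 = std::array<std::uint8_t, 4>;
using Row2 = std::array<std::uint8_t, 2>;

// Quantize one gray value to pure black (0) or white (255), cut at 128.
constexpr std::uint8_t toBw(std::uint8_t v) {
    return static_cast<std::uint8_t>(v < 128 ? 0 : 255);
}

constexpr Row4 thresholdRow(const Row4& r) {
    return Row4{{toBw(r[0]), toBw(r[1]), toBw(r[2]), toBw(r[3])}};
}

constexpr Row2 thresholdRow(const Row2& r) {
    return Row2{{toBw(r[0]), toBw(r[1])}};
}

// 2:1 downscale by averaging neighbouring pixels.
constexpr Row2 halve(const Row4& r) {
    return Row2{{static_cast<std::uint8_t>((r[0] + r[1]) / 2),
                 static_cast<std::uint8_t>((r[2] + r[3]) / 2)}};
}

constexpr Row4 kSource{{40, 200, 230, 250}};

// Quantize first, then rescale: an intermediate gray (127) reappears.
constexpr Row2 kQuantizeThenScale = halve(thresholdRow(kSource));

// Rescale first, then quantize: the output stays strictly 0 / 255.
constexpr Row2 kScaleThenQuantize = thresholdRow(halve(kSource));
```

Here kQuantizeThenScale is {127, 255} while kScaleThenQuantize is {0, 255}; only the second is valid bitonal data.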
3. Design
3.1 Data flow
[App / UI sets ICAP_PIXELTYPE]
│
▼
TwainDataSource::handleDatCapability (MSG_SET pixel_type)
│ stored in caps_
▼
TwainDataSource::updateScannerFromCaps()
│ settings_.pixel_type = caps_[ICAP_PIXELTYPE]
▼
VirtualScanner::preScanPrep(ScannerSettings)
│ acquireImage → ensure24BitDib
│ applyPageSizeScaling (still in 24-bit BGR)
│ applyPixelFormat(s.pixel_type) → 1 / 8 / 24 bpp
│ applyDpiMetadata
▼
current_fibitmap_
│
├──► Native: getDibImage → BITMAPINFOHEADER(biBitCount)
│ + palette
│ + rows (4-byte aligned)
│
└──► File: saveImageToFile → format-specific encoder
(JPEG + 1-bit fallback)
3.2 ScannerSettings.pixel_type
enum class PixelType : uint16_t {
BW = TWPT_BW,
Gray = TWPT_GRAY,
RGB = TWPT_RGB,
};
3.3 Capability layer
addCap(ICAP_PIXELTYPE, TWTY_UINT16, TWON_ENUMERATION,
TWPT_RGB,
{TWPT_BW, TWPT_GRAY, TWPT_RGB});
addCap(ICAP_PIXELFLAVOR, TWTY_UINT16, TWON_ONEVALUE,
TWPF_CHOCOLATE, {TWPF_CHOCOLATE});
ICAP_BITDEPTH returns 24 / 8 / 1 for GET/GETCURRENT/GETDEFAULT based on the current ICAP_PIXELTYPE; SET is rejected.
3.4 Settings UI
<select name="pixel_type">
<option value="2" selected>Color (24-bit)</option>
<option value="1">Gray (8-bit)</option>
<option value="0">BW (1-bit)</option>
</select>
Submit writes back to caps_[ICAP_PIXELTYPE].
3.5 Pixel conversion implementation
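void VirtualScanner::applyPixelFormat(const ScannerSettings& s) {
    if (!current_fibitmap_) return;
    FIBITMAP* dst = nullptr;  // stays null when no conversion is needed or it fails
    // The cast lets the TWPT_* labels compile whether pixel_type is a
    // TW_UINT16 or the PixelType enum class from §3.2.
    switch (static_cast<TW_UINT16>(s.pixel_type)) {
        case TWPT_RGB:
            // Normally already 24-bit after rescaling; convert only if not.
            if (FreeImage_GetBPP(current_fibitmap_) != 24) {
                dst = FreeImage_ConvertTo24Bits(current_fibitmap_);
            }
            break;
        case TWPT_GRAY:
            dst = FreeImage_ConvertToGreyscale(current_fibitmap_);
            break;
        case TWPT_BW: {
            // Two-step: RGB -> 8-bit gray -> 1-bit threshold at 128.
            FIBITMAP* gray = FreeImage_ConvertToGreyscale(current_fibitmap_);
            if (gray) {
                dst = FreeImage_Threshold(gray, 128);
                FreeImage_Unload(gray);
            }
            break;
        }
        default:
            break;  // unsupported pixel type: leave the bitmap untouched
    }
    if (dst) {
        // Swap in the converted bitmap and release the old one.
        FreeImage_Unload(current_fibitmap_);
        current_fibitmap_ = dst;
    }
}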
Pipeline order is fixed at applyPageSizeScaling → applyPixelFormat → applyDpiMetadata.
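What the BW branch produces at the byte level can be illustrated without FreeImage. The sketch below thresholds eight gray pixels and packs them MSB first into one byte of a 1-bit row, with chocolate semantics (0 = black, 1 = white). The name packByte and the sample values are hypothetical.

```cpp
#include <cstdint>

// Threshold 8 gray pixels at `t` and pack them into one byte, MSB first.
// A set bit means white (gray >= t); a clear bit means black.
constexpr std::uint8_t packByte(const std::uint8_t (&gray)[8],
                                std::uint8_t t = 128) {
    std::uint8_t out = 0;
    for (int i = 0; i < 8; ++i) {
        if (gray[i] >= t) {
            out = static_cast<std::uint8_t>(out | (0x80u >> i));
        }
    }
    return out;
}

constexpr std::uint8_t kSample[8] = {0, 255, 200, 100, 128, 127, 30, 250};
constexpr std::uint8_t kPacked = packByte(kSample);  // bits 0110 1001
```

The sample packs to 0x69: the pixel at exactly 128 becomes white and the pixel at 127 becomes black.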
3.6 Native Transfer DIB construction
WORD bpp = FreeImage_GetBPP(current_fibitmap_);
bih.biBitCount = bpp;
bih.biCompression = BI_RGB;
bih.biSizeImage = BYTES_PERLINE(width, bpp) * height;
DWORD palette_bytes = 0;
if (bpp == 1) palette_bytes = sizeof(RGBQUAD) * 2;
else if (bpp == 8) palette_bytes = sizeof(RGBQUAD) * 256;
Palette content:
- 1-bit: {0,0,0,0} + {255,255,255,0} (chocolate).
- 8-bit: 256 entries {i,i,i,0}.
Rows are copied bottom-up with BYTES_PERLINE 4-byte alignment.
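The two palettes can be generated as follows. This is a sketch under stated assumptions: Rgbq is a local stand-in with the same field order as the Windows RGBQUAD, and GrayPalette, makeGrayPalette, and kBwChocolate are hypothetical names.

```cpp
#include <cstdint>

// Same field order as RGBQUAD: blue, green, red, reserved.
struct Rgbq {
    std::uint8_t blue;
    std::uint8_t green;
    std::uint8_t red;
    std::uint8_t reserved;
};

struct GrayPalette {
    Rgbq entries[256];
};

// Linear gray ramp: entry i is {i, i, i, 0}.
constexpr GrayPalette makeGrayPalette() {
    GrayPalette p{};
    for (int i = 0; i < 256; ++i) {
        const std::uint8_t v = static_cast<std::uint8_t>(i);
        p.entries[i].blue = v;
        p.entries[i].green = v;
        p.entries[i].red = v;
        p.entries[i].reserved = 0;
    }
    return p;
}

// Chocolate: index 0 = black, index 1 = white.
constexpr Rgbq kBwChocolate[2] = {{0, 0, 0, 0}, {255, 255, 255, 0}};
```

With this layout, the vanilla variant discussed in §4.7 would be the same two entries in the opposite order, with no change to the pixel data.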
getImageInfo():
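info.BitsPerPixel    = bpp;
info.SamplesPerPixel = (bpp == 24) ? 3 : 1;
// Fill one entry per sample: 8 / 8 / 8 for RGB, bpp for Gray and BW.
for (int i = 0; i < info.SamplesPerPixel; ++i) {
    info.BitsPerSample[i] = (bpp == 24) ? 8 : bpp;
}
info.PixelType       = settings_.pixel_type;
info.Planar          = TWPC_CHUNKY;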
3.7 File Transfer encoding
switch (image_file_format) {
    case TWFF_PNG:  FreeImage_Save(FIF_PNG, bmp, path, 0); break;
    case TWFF_BMP:  FreeImage_Save(FIF_BMP, bmp, path, 0); break;
    case TWFF_TIFF:
        // CCITT G4 only applies to 1-bit data; everything else uses LZW.
        FreeImage_Save(FIF_TIFF, bmp, path,
                       (bpp == 1) ? TIFF_CCITTFAX4 : TIFF_LZW);
        break;
    case TWFF_JFIF: {
        FIBITMAP* to_save = bmp;
        FIBITMAP* fallback = nullptr;
        if (bpp == 1) {
            // JPEG cannot encode 1-bit data: save an 8-bit grayscale copy.
            fallback = FreeImage_ConvertToGreyscale(bmp);
            to_save = fallback;
        }
        // Skip the save if the fallback conversion failed (to_save is null).
        if (to_save) FreeImage_Save(FIF_JPEG, to_save, path, JPEG_QUALITYGOOD);
        if (fallback) FreeImage_Unload(fallback);
        break;
    }
}
After saving, patchSavedDpiMetadata (see file_dpi_design.md) writes container-level DPI regardless of bit depth.
4. Key decisions and rationale
4.1 Expose only BW / Gray / RGB
- Decision: ENUMERATION values {TWPT_BW, TWPT_GRAY, TWPT_RGB}.
- Rationale: Covers the vast majority of real scanner workflows; CMY / CMYK / Palette have no value in a virtual scanner and would add significant code paths.
4.2 BW uses threshold by default
- Decision: FreeImage_Threshold(gray, 128).
- Rationale: Sharper text edges; better OCR fidelity. Dither stays available for a future setting.
4.3 Gray uses FreeImage_ConvertToGreyscale, not ConvertTo8Bits
- Decision: Use the grayscale-specific API.
- Rationale: ConvertTo8Bits does not guarantee a palette of contiguous gray entries, which can show up as apparent "color noise" in the DIB.
4.4 BW pipeline is two-step
- Decision: RGB → Gray (8-bit) → Threshold (1-bit).
- Rationale: FreeImage_Threshold operates on 8-bit grayscale data; making the grayscale step explicit is predictable and matches FreeImage's intended use.
4.5 Pixel conversion happens after DPI rescaling
- Decision: Fixed Rescale → applyPixelFormat order.
- Rationale: Quantizing before rescaling re-introduces grays into 1-bit data; rescaling already-quantized 8-bit gray destroys edges.
4.6 JPEG + BW falls back to 8-bit grayscale
- Decision: If bpp == 1 and target is JPEG, convert to grayscale before save.
- Rationale: JPEG cannot encode 1-bit. Failing the scan is worse than fallback. TW_IMAGEINFO.PixelType still reports BW per the contract.
4.7 1-bit DIB palette is fixed to chocolate semantics
- Decision: palette[0] = black, palette[1] = white.
- Rationale: Matches ICAP_PIXELFLAVOR = TWPF_CHOCOLATE. A future vanilla mode would swap palette entries, not invert pixels.
4.8 TIFF: CCITT G4 for 1-bit, LZW otherwise
- Decision: (bpp == 1) ? TIFF_CCITTFAX4 : TIFF_LZW.
- Rationale: CCITT G4 is the canonical 1-bit document codec (fax / PDF/A). LZW is general-purpose lossless for 8/24-bit.
4.9 Pixel-type work owned by VirtualScanner; DS reads bpp only
- Decision: TwainDataSource derives DIB layout from FreeImage_GetBPP(current_fibitmap_).
- Rationale: Single source of truth. Both Native and File Transfer derive from the same final bitmap and cannot diverge.
5. Architectural component changes
5.1 src/capability.cpp
- ICAP_PIXELTYPE: ENUMERATION (default RGB, values BW / Gray / RGB), all operations.
- ICAP_PIXELFLAVOR: ONEVALUE TWPF_CHOCOLATE.
- ICAP_BITDEPTH: derived from PIXELTYPE on GET; SET returns TWCC_BADCAP.
- CAP_SUPPORTEDCAPS includes the above.
5.2 src/twain_data_source.cpp
- updateScannerFromCaps() propagates pixel type into settings_.pixel_type.
- handleDatImageInfo() derives BitsPerPixel, SamplesPerPixel, BitsPerSample[] from FreeImage_GetBPP; reports PixelType from settings.
- allocAndFillDibHeader() chooses palette size (0 / 2 / 256), writes RGBQUADs.
- copyDibPixelData() computes row stride via BYTES_PERLINE(width, bpp), bottom-up.
- getScanStrip() uses the current bpp for strip sizing.
5.3 src/virtual_scanner.h/.cpp
- Add applyPixelFormat(const ScannerSettings&).
- Fix pipeline order in preScanPrep.
- saveImageToFile() adds JPEG + BW fallback and TIFF compression selection.
- Helper bppFromPixelType(TW_UINT16).
5.4 src/settings_server.cpp
- pixel_type dropdown with Color / Gray / BW.
- i18n labels.
- Submit writes caps_[ICAP_PIXELTYPE].
5.5 Test impact
- Matrix: 3 pixel types × 4 DPIs × 2 transfer modes × 4 file formats.
- Focus cases:
  - BW + JPEG → fallback path, file opens, TW_IMAGEINFO.PixelType == TWPT_BW.
  - Gray + Native → 256-entry grayscale palette in DIB.
  - BW + Native → row stride ((w+31)/32)*4, 2-entry palette.
  - BW + TIFF → CCITT G4 compression.
6. Limitations
- Only BW / Gray / RGB supported; CMY / CMYK / YUV / palette unsupported.
- BW threshold fixed at 128; no adaptive threshold or dither switch yet.
- Gray uses BT.601 only.
- JPEG + BW writes 8-bit grayscale on disk while reporting BW; accepted trade-off.
- 16-bit gray and 48-bit RGB not supported.
- No multi-channel concurrent output.
- ICAP_BITDEPTH is read-only.
- TWPF_VANILLA not supported.
7. Next steps
- Add "BW mode: threshold / dither" UI option (FreeImage_Dither(FID_FS)).
- Configurable BW threshold (64..192) and adaptive thresholds (Otsu / Sauvola).
- Support TWPF_VANILLA by swapping palette entries.
- Add 16-bit gray and 48-bit RGB for high-end workflows.
- Provide a user-visible preference for BW + JPEG: fallback vs. reject.
- Automated tests: read back each saved file with Pillow and assert mode (1/L/RGB), palette, and pixel dimensions.
- Record pixel-type-related behavior changes in CHANGELOG.md.