Pixel Type Design (Color / Gray / BW)

Design notes for delivering Color (24-bit RGB), Gray (8-bit grayscale), and BW (1-bit bitonal) images consistently across Native Transfer and File Transfer in BN Tech Virtual Scanner.

1. Requirements

The virtual scanner must produce images in the pixel type chosen by the user / application. TWAIN maps these to:

  • TWPT_RGB — 24-bit true color (R / G / B, 8 bits each)
  • TWPT_GRAY — 8-bit grayscale (0..255)
  • TWPT_BW — 1-bit bitonal (0 / 1, semantics defined by ICAP_PIXELFLAVOR)

Functional requirements:

  • When an application sets ICAP_PIXELTYPE, the DS must actually convert pixels to the target format (not merely tag the image with the type while returning 24-bit).
  • The settings UI offers Color / Gray / BW with behavior identical to capability-driven changes.
  • Native Transfer DIBs (BITMAPINFOHEADER + palette + rows) must use the correct biBitCount:
      • RGB → 24
      • GRAY → 8 with 256-entry grayscale palette
      • BW → 1 with 2-entry black / white palette
  • File Transfer must save PNG / JPG / BMP / TIFF with bit depth matching the pixel type. JPEG cannot encode 1-bit, so a fallback path is required.
  • Pixel-type conversion must happen after DPI rescaling so that BW quantization is not smeared back into grayscale.
  • Pixel type, DPI, and page size must be orthogonal — any combination must produce a coherent image and TW_IMAGEINFO.
  • All output respects ICAP_PIXELFLAVOR = TWPF_CHOCOLATE: 0 = black, 1 = white.

Non-functional requirements:

  • Conversion discards no more information than the target format requires: RGB passes through unchanged, Gray is a perceptual luma of RGB, and BW is a fixed threshold of that luma.
  • No new third-party dependencies beyond FreeImage.
  • Each scan converts pixel type at most once.
  • Output files must open correctly in Windows Photos, XnView, Photoshop, NAPS2.

2. Domain knowledge

2.1 TWAIN pixel types

ICAP_PIXELTYPE (TW_UINT16) common values:

Value  Name       Meaning
0      TWPT_BW    1-bit bitonal
1      TWPT_GRAY  8-bit grayscale
2      TWPT_RGB   24-bit true color

Registered as an ENUMERATION with default TWPT_RGB.

ICAP_BITDEPTH is derived from ICAP_PIXELTYPE:

  • RGB → 24, GRAY → 8, BW → 1.

Applications cannot set ICAP_BITDEPTH directly in this project; it is read-only and follows ICAP_PIXELTYPE.

ICAP_PIXELFLAVOR controls 0 / 1 semantics:

  • TWPF_CHOCOLATE: 0 = black, 1 = white (project default).
  • TWPF_VANILLA : 0 = white, 1 = black.

2.2 DIB bit-depth and palette rules

Windows DIB hard requirements:

biBitCount  Palette                      Row layout
24          none                         BGR BGR BGR ..., row padded to 4 bytes
8           256-entry RGBQUAD, required  one palette index per byte; row padded to 4 bytes
1           2-entry RGBQUAD, required    one bit per pixel, MSB first; row padded to 4 bytes

  • 8-bit grayscale DIB palette: 256 entries (i, i, i, 0).
  • 1-bit DIB palette: index 0 = black (0,0,0,0), index 1 = white (255,255,255,0).
  • Row stride in bytes (integer division): ((biWidth * biBitCount + 31) / 32) * 4.

A DIB header that declares 8-bit without a palette is malformed.
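
The stride formula and the two palettes above can be sketched in plain C++ without Windows headers. This is an illustration only: Rgbq is a hypothetical stand-in for RGBQUAD, and dibStride, makeGrayPalette, and makeBwPalette are assumed names, not functions from the project.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the Windows RGBQUAD layout (blue, green, red, reserved).
struct Rgbq {
  std::uint8_t blue, green, red, reserved;
};

// Bytes per DIB row: the row's bits rounded up to a whole number of 32-bit words.
inline std::uint32_t dibStride(std::uint32_t width, std::uint32_t bpp) {
  assert(bpp == 1 || bpp == 8 || bpp == 24);
  return ((width * bpp + 31u) / 32u) * 4u;
}

// 256-entry linear gray ramp for 8-bit DIBs: entry i is (i, i, i, 0).
inline std::array<Rgbq, 256> makeGrayPalette() {
  std::array<Rgbq, 256> pal{};
  for (std::size_t i = 0; i < pal.size(); ++i) {
    const auto v = static_cast<std::uint8_t>(i);
    pal[i] = Rgbq{v, v, v, 0};
  }
  return pal;
}

// 2-entry palette for 1-bit DIBs: index 0 = black, index 1 = white.
inline std::array<Rgbq, 2> makeBwPalette() {
  std::array<Rgbq, 2> pal{{Rgbq{0, 0, 0, 0}, Rgbq{255, 255, 255, 0}}};
  return pal;
}
```

For example, a 33-pixel-wide 1-bit row needs 33 bits, which rounds up to two 32-bit words, so dibStride(33, 1) is 8 bytes.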

2.3 Pixel types in FreeImage

TWAIN      FreeImage                   API
TWPT_RGB   24 bpp BGR (FIT_BITMAP)     FreeImage_ConvertTo24Bits
TWPT_GRAY  8 bpp + 256 gray palette    FreeImage_ConvertToGreyscale
TWPT_BW    1 bpp + 2-entry palette     FreeImage_Threshold / FreeImage_Dither

Notes:

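  • FreeImage_ConvertTo8Bits does not guarantee a gray ramp: palettized input can keep its color palette. For a true grayscale DIB use FreeImage_ConvertToGreyscale, which always produces a linear 256-entry gray palette.
  • FreeImage_Threshold(src, T) compares 8-bit gray values against T (values below T become 0, all others 1). Per the FreeImage documentation it converts other bit depths to 8-bit grayscale itself before thresholding. T = 128 is a reasonable default.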
  • FreeImage_Dither(src, FID_FS) performs Floyd-Steinberg dithering — better for photos, worse for OCR.
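
To make the threshold rule concrete, here is a minimal sketch of thresholding one row of 8-bit gray samples and packing the result MSB first, as a 1-bit DIB row expects. It does not call FreeImage; thresholdRow is a hypothetical helper that follows the "below T is 0, otherwise 1" rule described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs one row of 8-bit gray samples into 1-bit pixels, MSB first.
// A sample below `threshold` becomes 0 (black); any other sample becomes 1 (white).
// The row is zero-padded to a 4-byte boundary, as a DIB row requires.
inline std::vector<std::uint8_t> thresholdRow(const std::vector<std::uint8_t>& gray,
                                              std::uint8_t threshold = 128) {
  const std::size_t stride = ((gray.size() + 31) / 32) * 4;
  std::vector<std::uint8_t> packed(stride, 0);
  for (std::size_t x = 0; x < gray.size(); ++x) {
    if (gray[x] >= threshold) {
      // Pixel 0 lands in bit 7 of byte 0, pixel 7 in bit 0, pixel 8 in bit 7 of byte 1.
      packed[x / 8] |= static_cast<std::uint8_t>(0x80u >> (x % 8));
    }
  }
  return packed;
}
```

With the default threshold, a sample of 127 maps to black and 128 maps to white, which is the boundary an OCR-oriented test page would exercise.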

2.4 Grayscale weights

  • BT.601: Y = 0.299 R + 0.587 G + 0.114 B
  • BT.709: Y = 0.2126 R + 0.7152 G + 0.0722 B

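This design assumes FreeImage_ConvertToGreyscale applies BT.601-style weights. The exact weights depend on the FreeImage version (recent 3.x sources appear to use the Rec. 709 coefficients instead), so verify against the linked version if exact gray values matter. For scanner test content the difference is negligible.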
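
The two weightings can be compared directly with a small sketch. LumaWeights, kBt601, kBt709, and toGray are hypothetical names used for illustration; this is not FreeImage's internal implementation.

```cpp
#include <cmath>
#include <cstdint>

// Per-channel weights of a luma formula; each set sums to 1.
struct LumaWeights {
  double r, g, b;
};

constexpr LumaWeights kBt601{0.299, 0.587, 0.114};
constexpr LumaWeights kBt709{0.2126, 0.7152, 0.0722};

// Weighted sum of the channels, rounded to the nearest 8-bit gray value.
inline std::uint8_t toGray(std::uint8_t r, std::uint8_t g, std::uint8_t b,
                           const LumaWeights& w) {
  const double y = w.r * r + w.g * g + w.b * b;
  return static_cast<std::uint8_t>(std::lround(y));
}
```

Neutral pixels (R = G = B) map to the same gray under both weightings; the formulas diverge only on saturated colors, for example pure green gives 150 under BT.601 and 182 under BT.709.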

2.5 File-format support per pixel type

Format  1-bit                 8-bit Gray  24-bit RGB
PNG     yes                   yes         yes
TIFF    yes (G4 / LZW / Raw)  yes         yes
BMP     yes                   yes         yes
JPEG    no                    yes         yes

JPEG + BW requires either fallback to 8-bit gray, or rejection. The project picks fallback so applications never fail mid-scan, while still reporting TW_IMAGEINFO.PixelType = TWPT_BW to honor the negotiated contract.
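
The fallback rule reduces to a single decision about the bit depth written to disk. The sketch below is illustrative: FileFormat is a hypothetical stand-in for the TWFF_* constants, and onDiskBpp is an assumed helper name.

```cpp
#include <cassert>

// Hypothetical stand-ins for the TWFF_* file-format constants.
enum class FileFormat { Png, Tiff, Bmp, Jpeg };

// Bit depth actually written to disk for an in-memory depth of 1, 8, or 24.
// JPEG has no 1-bit mode, so bitonal images are widened to 8-bit gray.
inline int onDiskBpp(FileFormat format, int bpp) {
  assert(bpp == 1 || bpp == 8 || bpp == 24);
  if (format == FileFormat::Jpeg && bpp == 1) return 8;
  return bpp;
}
```

Only the encoded file changes; the pixel type reported through TW_IMAGEINFO stays at the negotiated value.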

2.6 Order vs. DPI rescaling

See implement_dpi_design.md §2.3: always rescale in 24-bit BGR first, then convert to the target pixel type. Quantizing before rescaling corrupts 1-bit / 8-bit semantics with intermediate grays.
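
A one-dimensional toy example shows why the order matters. The sketch assumes pair averaging as a stand-in for whatever interpolating filter the real rescale uses; Row, downscale2x, and threshold are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Row = std::vector<std::uint8_t>;

// Halves a row by averaging adjacent pairs (a stand-in for any interpolating rescale).
inline Row downscale2x(const Row& src) {
  Row out;
  for (std::size_t i = 0; i + 1 < src.size(); i += 2) {
    out.push_back(static_cast<std::uint8_t>((src[i] + src[i + 1]) / 2));
  }
  return out;
}

// Maps each sample to 0 (below the threshold) or 255 (at or above it).
inline Row threshold(const Row& src, std::uint8_t t = 128) {
  Row out;
  for (std::uint8_t v : src) out.push_back(v >= t ? 255 : 0);
  return out;
}
```

For the row {0, 255, 255, 255}, thresholding first and then rescaling yields {127, 255}, so a supposedly bitonal image contains a mid-gray. Rescaling first and then thresholding yields {0, 255}, which is still strictly two-level.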

3. Design

3.1 Data flow

[App / UI sets ICAP_PIXELTYPE]
        │
        ▼
TwainDataSource::handleDatCapability (MSG_SET pixel_type)
        │  stored in caps_
        ▼
TwainDataSource::updateScannerFromCaps()
        │  settings_.pixel_type = caps_[ICAP_PIXELTYPE]
        ▼
VirtualScanner::preScanPrep(ScannerSettings)
        │  acquireImage → ensure24BitDib
        │  applyPageSizeScaling (still in 24-bit BGR)
        │  applyPixelFormat(s.pixel_type) → 1 / 8 / 24 bpp
        │  applyDpiMetadata
        ▼
current_fibitmap_
        │
        ├──► Native: getDibImage → BITMAPINFOHEADER(biBitCount)
        │                          + palette
        │                          + rows (4-byte aligned)
        │
        └──► File: saveImageToFile → format-specific encoder
                                     (JPEG + 1-bit fallback)

3.2 ScannerSettings.pixel_type

enum class PixelType : uint16_t {
  BW   = TWPT_BW,
  Gray = TWPT_GRAY,
  RGB  = TWPT_RGB,
};

3.3 Capability layer

addCap(ICAP_PIXELTYPE, TWTY_UINT16, TWON_ENUMERATION,
       TWPT_RGB,
       {TWPT_BW, TWPT_GRAY, TWPT_RGB});

addCap(ICAP_PIXELFLAVOR, TWTY_UINT16, TWON_ONEVALUE,
       TWPF_CHOCOLATE, {TWPF_CHOCOLATE});

ICAP_BITDEPTH returns 24 / 8 / 1 for GET/GETCURRENT/GETDEFAULT based on the current ICAP_PIXELTYPE; SET is rejected.
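
The derivation of ICAP_BITDEPTH from the pixel type could look like the sketch below. It redeclares PixelType with the numeric TWPT_* values so that it compiles without twain.h; the signature is an assumption and may differ from the project's bppFromPixelType helper.

```cpp
#include <cstdint>

// Values match the TWAIN constants TWPT_BW (0), TWPT_GRAY (1), TWPT_RGB (2).
enum class PixelType : std::uint16_t { BW = 0, Gray = 1, RGB = 2 };

// Bit depth reported for ICAP_BITDEPTH, derived from the current pixel type.
inline std::uint16_t bppFromPixelType(PixelType type) {
  switch (type) {
    case PixelType::BW:   return 1;
    case PixelType::Gray: return 8;
    case PixelType::RGB:  return 24;
  }
  return 0;  // Value outside the enumeration: no valid depth.
}
```

Because the depth is computed on every GET, it cannot drift out of sync with ICAP_PIXELTYPE.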

3.4 Settings UI

<select name="pixel_type">
  <option value="2" selected>Color (24-bit)</option>
  <option value="1">Gray (8-bit)</option>
  <option value="0">BW (1-bit)</option>
</select>

Submit writes back to caps_[ICAP_PIXELTYPE].

3.5 Pixel conversion implementation

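void VirtualScanner::applyPixelFormat(const ScannerSettings& s) {
  if (!current_fibitmap_) return;
  FIBITMAP* dst = nullptr;  // Stays null when no conversion is needed or one fails.
  // The cast lets the TWPT_* labels compile whether pixel_type is the
  // PixelType enum class (§3.2) or a raw TW_UINT16.
  switch (static_cast<TW_UINT16>(s.pixel_type)) {
    case TWPT_RGB:
      // Normally a no-op: the bitmap is already 24-bit after ensure24BitDib.
      if (FreeImage_GetBPP(current_fibitmap_) != 24) {
        dst = FreeImage_ConvertTo24Bits(current_fibitmap_);
      }
      break;
    case TWPT_GRAY:
      dst = FreeImage_ConvertToGreyscale(current_fibitmap_);
      break;
    case TWPT_BW: {
      // Two steps: 24-bit → 8-bit gray → 1-bit threshold.
      FIBITMAP* gray = FreeImage_ConvertToGreyscale(current_fibitmap_);
      if (gray) {
        dst = FreeImage_Threshold(gray, 128);
        FreeImage_Unload(gray);
      }
      break;
    }
    default:
      break;  // Unknown pixel type: leave the bitmap unchanged.
  }
  if (dst) {
    // Swap in the converted bitmap and release the old one.
    FreeImage_Unload(current_fibitmap_);
    current_fibitmap_ = dst;
  }
}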

Pipeline order is fixed at applyPageSizeScaling → applyPixelFormat → applyDpiMetadata.

3.6 Native Transfer DIB construction

WORD bpp = FreeImage_GetBPP(current_fibitmap_);
bih.biBitCount    = bpp;
bih.biCompression = BI_RGB;
bih.biSizeImage   = BYTES_PERLINE(width, bpp) * height;

DWORD palette_bytes = 0;
if (bpp == 1)      palette_bytes = sizeof(RGBQUAD) * 2;
else if (bpp == 8) palette_bytes = sizeof(RGBQUAD) * 256;

Palette content:

  • 1-bit: {0,0,0,0} + {255,255,255,0} (chocolate).
  • 8-bit: 256 entries {i,i,i,0}.

Rows are copied bottom-up with BYTES_PERLINE 4-byte alignment.
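
The resulting memory layout (header, then palette, then rows) can be checked with a small calculation. The sketch hard-codes the sizes of BITMAPINFOHEADER (40 bytes) and RGBQUAD (4 bytes) so that it compiles without Windows headers; DibLayout and computeDibLayout are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Byte offsets inside a packed DIB: header, then palette, then pixel rows.
struct DibLayout {
  std::uint32_t paletteOffset;  // Equals the header size.
  std::uint32_t pixelOffset;    // Header plus palette.
  std::uint32_t stride;         // Bytes per row, 4-byte aligned.
  std::uint32_t totalBytes;     // Size of the whole block.
};

inline DibLayout computeDibLayout(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t bpp) {
  assert(bpp == 1 || bpp == 8 || bpp == 24);
  constexpr std::uint32_t kHeaderBytes = 40;  // sizeof(BITMAPINFOHEADER)
  constexpr std::uint32_t kRgbQuadBytes = 4;  // sizeof(RGBQUAD)
  const std::uint32_t paletteEntries = (bpp == 1) ? 2u : (bpp == 8) ? 256u : 0u;

  DibLayout out{};
  out.paletteOffset = kHeaderBytes;
  out.pixelOffset = kHeaderBytes + paletteEntries * kRgbQuadBytes;
  out.stride = ((width * bpp + 31u) / 32u) * 4u;
  out.totalBytes = out.pixelOffset + out.stride * height;
  return out;
}
```

For a 100 × 10 image this gives 208 bytes at 1 bpp, 2064 bytes at 8 bpp, and 3040 bytes at 24 bpp, which is a quick sanity check for the buffer handed to the application.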

getImageInfo():

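info.BitsPerPixel    = bpp;
info.SamplesPerPixel = (bpp == 24) ? 3 : 1;
// Fill one entry per sample: 8 / 8 / 8 for RGB, bpp for Gray and BW.
for (int i = 0; i < info.SamplesPerPixel; ++i) {
  info.BitsPerSample[i] = (bpp == 24) ? 8 : bpp;
}
// The cast is needed if pixel_type is the PixelType enum class from §3.2.
info.PixelType       = static_cast<TW_INT16>(settings_.pixel_type);
info.Planar          = TWPC_CHUNKY;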

3.7 File Transfer encoding

BOOL ok = FALSE;  // Stays FALSE for an unsupported format or a failed encode.
switch (image_file_format) {
  case TWFF_PNG:  ok = FreeImage_Save(FIF_PNG,  bmp, path, 0);              break;
  case TWFF_BMP:  ok = FreeImage_Save(FIF_BMP,  bmp, path, 0);              break;
  case TWFF_TIFF: ok = FreeImage_Save(FIF_TIFF, bmp, path,
                                      (bpp == 1) ? TIFF_CCITTFAX4 : TIFF_LZW); break;
  case TWFF_JFIF: {
    // JPEG has no 1-bit mode: widen bitonal images to 8-bit gray first.
    FIBITMAP* fallback = (bpp == 1) ? FreeImage_ConvertToGreyscale(bmp) : nullptr;
    FIBITMAP* to_save = fallback ? fallback : bmp;
    // Skip the save if the fallback conversion itself failed.
    if (bpp != 1 || fallback) {
      ok = FreeImage_Save(FIF_JPEG, to_save, path, JPEG_QUALITYGOOD);
    }
    if (fallback) FreeImage_Unload(fallback);
    break;
  }
  default:
    break;
}

After saving, patchSavedDpiMetadata (see file_dpi_design.md) writes container-level DPI regardless of bit depth.

4. Key decisions and rationale

4.1 Expose only BW / Gray / RGB

  • Decision: ENUMERATION values {TWPT_BW, TWPT_GRAY, TWPT_RGB}.
  • Rationale: Covers the vast majority of real scanner workflows; CMY / CMYK / Palette have no value in a virtual scanner and would add significant code paths.

4.2 BW uses threshold by default

  • Decision: FreeImage_Threshold(gray, 128).
  • Rationale: Sharper text edges; better OCR fidelity. Dither stays available for a future setting.

4.3 Gray uses FreeImage_ConvertToGreyscale, not ConvertTo8Bits

  • Decision: Use the grayscale-specific API.
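  • Rationale: ConvertTo8Bits does not guarantee contiguous gray palette entries (palettized input can keep its color palette), which would show up as apparent "color noise" in the DIB. ConvertToGreyscale always yields a linear gray ramp.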

4.4 BW pipeline is two-step

  • Decision: RGB → Gray (8-bit) → Threshold (1-bit).
  • Rationale: Thresholding operates on 8-bit gray values. Converting explicitly, rather than relying on FreeImage_Threshold's internal conversion, keeps the luma step visible and identical to the Gray path.

4.5 Pixel conversion happens after DPI rescaling

  • Decision: Fixed Rescale → applyPixelFormat order.
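  • Rationale: An interpolating rescale blends neighboring pixels. Applied after quantization, it re-introduces intermediate grays into 1-bit data (or forces a second threshold pass) and softens edges. Applied before, quantization runs once on the final pixel grid.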

4.6 JPEG + BW falls back to 8-bit grayscale

  • Decision: If bpp == 1 and target is JPEG, convert to grayscale before save.
  • Rationale: JPEG cannot encode 1-bit. Failing the scan is worse than fallback. TW_IMAGEINFO.PixelType still reports BW per the contract.

4.7 1-bit DIB palette is fixed to chocolate semantics

  • Decision: palette[0] = black, palette[1] = white.
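  • Rationale: Matches ICAP_PIXELFLAVOR = TWPF_CHOCOLATE. A future vanilla mode (0 = white) would have to invert the stored bits and swap the two palette entries together; swapping the palette alone would render a negative image.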

4.8 TIFF: CCITT G4 for 1-bit, LZW otherwise

  • Decision: (bpp == 1) ? TIFF_CCITTFAX4 : TIFF_LZW.
  • Rationale: CCITT G4 is the canonical 1-bit document codec (fax / PDF/A). LZW is general-purpose lossless for 8/24-bit.

4.9 Pixel-type work owned by VirtualScanner; DS reads bpp only

  • Decision: TwainDataSource derives DIB layout from FreeImage_GetBPP(current_fibitmap_).
  • Rationale: Single source of truth. Both Native and File Transfer derive from the same final bitmap and cannot diverge.

5. Architectural component changes

5.1 src/capability.cpp

  • ICAP_PIXELTYPE ENUMERATION (default RGB, values BW / Gray / RGB), all operations.
  • ICAP_PIXELFLAVOR ONEVALUE TWPF_CHOCOLATE.
  • ICAP_BITDEPTH derived from PIXELTYPE on GET; SET fails with TWCC_CAPBADOPERATION (TWCC_BADCAP denotes an unknown capability, which does not apply here).
  • CAP_SUPPORTEDCAPS includes the above.

5.2 src/twain_data_source.cpp

  • updateScannerFromCaps() propagates pixel type into settings_.pixel_type.
  • handleDatImageInfo() derives BitsPerPixel, SamplesPerPixel, BitsPerSample[] from FreeImage_GetBPP; reports PixelType from settings.
  • allocAndFillDibHeader() chooses palette size (0 / 2 / 256), writes RGBQUADs.
  • copyDibPixelData() computes row stride via BYTES_PERLINE(width, bpp), bottom-up.
  • getScanStrip() uses the current bpp for strip sizing.

5.3 src/virtual_scanner.h/.cpp

  • Add applyPixelFormat(const ScannerSettings&).
  • Fix pipeline order in preScanPrep.
  • saveImageToFile() adds JPEG + BW fallback and TIFF compression selection.
  • Helper bppFromPixelType(TW_UINT16).

5.4 src/settings_server.cpp

  • pixel_type dropdown with Color / Gray / BW.
  • i18n labels.
  • Submit writes caps_[ICAP_PIXELTYPE].

5.5 Test impact

  • Matrix: 3 pixel types × 4 DPIs × (Native Transfer + File Transfer in each of 4 file formats) = 60 combinations; file format applies only to File Transfer.
  • Focus cases:
      • BW + JPEG → fallback path, file opens, TW_IMAGEINFO.PixelType == TWPT_BW.
      • Gray + Native → 256-entry grayscale palette in DIB.
      • BW + Native → row stride ((w+31)/32)*4, 2-entry palette.
      • BW + TIFF → CCITT G4 compression.

6. Limitations

  • Only BW / Gray / RGB supported; CMY / CMYK / YUV / palette unsupported.
  • BW threshold fixed at 128; no adaptive threshold or dither switch yet.
  • Gray uses FreeImage's built-in luma weights only (see §2.4); they are not configurable.
  • JPEG + BW writes 8-bit grayscale on disk while reporting BW; accepted trade-off.
  • 16-bit gray and 48-bit RGB not supported.
  • No multi-channel concurrent output.
  • ICAP_BITDEPTH is read-only.
  • TWPF_VANILLA not supported.

7. Next steps

  • Add "BW mode: threshold / dither" UI option (FreeImage_Dither(FID_FS)).
  • Configurable BW threshold (64..192) and adaptive thresholds (Otsu / Sauvola).
  • Support TWPF_VANILLA by inverting the stored bits and swapping the palette entries.
  • Add 16-bit gray and 48-bit RGB for high-end workflows.
  • Provide a user-visible preference for BW + JPEG: fallback vs. reject.
  • Automated tests: read back each saved file with Pillow and assert mode (1 / L / RGB), palette, and pixel dimensions.
  • Record pixel-type-related behavior changes in CHANGELOG.md.