
Design: Producing Output Images at Different DPI

Design notes for actually producing image data at different DPI (150 / 200 / 300 / 600) for both Native Transfer and File Transfer, end-to-end from capability negotiation through FreeImage rescaling to DIB / file output.

1. Requirements

The virtual scanner must produce real image pixels at the user-selected DPI (150 / 200 / 300 / 600), not merely tag the output with DPI metadata. The pixel dimensions delivered to the application must equal page_inches × DPI for the chosen page size.

Functional requirements:

  • When a TWAIN application sets ICAP_XRESOLUTION / ICAP_YRESOLUTION, the DS must rescale the output so that the returned pixel width / height equals page_inches × DPI.
  • The settings UI must offer DPI choices 150 / 200 / 300 / 600 with identical behavior to capability-driven changes.
  • Both Native Transfer and File Transfer paths must return identical pixels for the same settings.
  • All three pixel types (TWPT_BW / TWPT_GRAY / TWPT_RGB) must compose cleanly with any DPI choice.
  • TW_IMAGEINFO.XResolution / YResolution, DIB biXPelsPerMeter / biYPelsPerMeter, and saved-file metadata must report DPI consistent with the actual pixel density.
  • Page size (A4 / Letter / Custom / ...) must combine with DPI to drive the final pixel dimensions.
  • Resampling quality must be high enough that downscaled text remains readable and upscaled photos do not show severe blocking.

Non-functional requirements:

  • Each scan rescales at most once; subsequent strip / row reads do not re-touch pixels.
  • Low-resolution source images must still be upscaled to the target DPI; refusing or returning the original size is not allowed.
  • DPI changes must not depend on disk caches; each scan computes from the latest source image and settings.

2. Domain knowledge

2.1 TWAIN DPI and image dimensions

TWAIN expresses scanner resolution through ICAP_XRESOLUTION / ICAP_YRESOLUTION (FIX32, unit driven by ICAP_UNITS). This project pins ICAP_UNITS = TWUN_INCHES, so the FIX32 value is literally DPI.

The pixel size seen by the application is determined by three factors:

  • Scan area (inches): driven by ICAP_SUPPORTEDSIZES, which the UI maps to a CustomPageSize (width and height in inches).
  • Horizontal and vertical DPI from the resolution capabilities.
  • Pixel type only affects bits-per-pixel, not width / height.

Therefore TW_IMAGEINFO.ImageWidth = round(page_width_inch × XResolution) and ImageLength = round(page_height_inch × YResolution). A correct DS must enforce this identity, otherwise applications doing imposition, OCR, or scaling will produce incorrect results.
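The identity above can be sketched as a small helper. PageInches, PixelSize, and expectedPixelSize are hypothetical names used only for illustration; they are not taken from the project.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for illustration; not the project's own structs.
struct PageInches { double width; double height; };
struct PixelSize  { long width; long height; };

// ImageWidth  = round(page_width_inch  * XResolution)
// ImageLength = round(page_height_inch * YResolution)
inline PixelSize expectedPixelSize(PageInches page, double x_dpi, double y_dpi) {
  assert(page.width > 0 && page.height > 0 && x_dpi > 0 && y_dpi > 0);
  return { std::lround(page.width * x_dpi), std::lround(page.height * y_dpi) };
}
```

For US Letter (8.5 × 11 in) at 300 DPI this yields 2550 × 3300 pixels, and at 600 DPI it yields 5100 × 6600.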

2.2 FreeImage rescaling

FreeImage offers FIBITMAP* FreeImage_Rescale(FIBITMAP*, int dst_w, int dst_h, FREE_IMAGE_FILTER filter):

  • Filters include FILTER_BOX, FILTER_BILINEAR, FILTER_BICUBIC, FILTER_LANCZOS3, etc.
  • For the 24-bit bitmaps used in this pipeline, the output keeps the input pixel format. The function returns a new bitmap, so the caller owns both the old and the new one and must unload each.

For mixed document / photo scanning content, FILTER_LANCZOS3 and FILTER_BICUBIC are reasonable defaults. The project picks FILTER_LANCZOS3 for sharper text.

2.3 Pixel-format conversion order

Pixel-format conversion (24-bit RGB → 8-bit gray → 1-bit BW) should happen at the final target resolution, not before resampling:

  • Quantizing to 1-bit before Rescale means the resampler produces intermediate gray values that are no longer valid 1-bit data, so the result must be thresholded a second time.
  • Quantizing to 8-bit gray before Rescale is less destructive but still loses fidelity around edges.

Canonical pipeline:

Source RGB → 24-bit BGR → Rescale to target (w,h) → convert to pixel_type → write DPI metadata
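Why the order matters can be shown with a toy one-dimensional example. The sketch below uses a plain box average as a stand-in for the real resampler (it is not Lanczos and not FreeImage), and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Box-average a run of 8-bit samples down to one sample (toy 1-D "rescale").
inline std::uint8_t boxAverage(const std::vector<std::uint8_t>& px) {
  assert(!px.empty());
  unsigned sum = std::accumulate(px.begin(), px.end(), 0u);
  return static_cast<std::uint8_t>(sum / px.size());
}

// Threshold an 8-bit sample to pure black (0) or white (255).
inline std::uint8_t threshold(std::uint8_t v, std::uint8_t cut = 128) {
  return v >= cut ? 255 : 0;
}

// Canonical order: resample first, quantize at the target resolution.
inline std::uint8_t resampleThenThreshold(const std::vector<std::uint8_t>& px) {
  return threshold(boxAverage(px));
}

// Reversed order: quantize first, resample, then re-threshold the gray result.
inline std::uint8_t thresholdThenResample(std::vector<std::uint8_t> px) {
  for (auto& v : px) v = threshold(v);
  return threshold(boxAverage(px));
}
```

For the samples {120, 120, 120, 200} the canonical order averages to 140 and yields white, while the reversed order first turns the samples into {0, 0, 0, 255}, averages to 63, and yields black. The early quantization has thrown away the tonal detail that decided the outcome.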

2.4 Page-size fill modes

When the source aspect ratio differs from the target ratio (page_w × DPI : page_h × DPI), three policies are available:

  • Stretch: non-uniform scale to (w,h). Simple but distorts.
  • Fit: uniform scale to fit, fill the rest with white (simulating paper).
  • Fill: uniform scale to cover, crop overflow.

Selected through ScannerSettings.page_fill_mode. Regardless of policy, the final image handed to TWAIN must be exactly page_w × DPI by page_h × DPI.
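The geometry behind the three policies can be sketched without any image library. FillMode, Placement, and planPlacement below are hypothetical names for illustration; the project's own RescaleFit / RescaleFill helpers may compute this differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdlib>

enum class FillMode { Stretch, Fit, Fill };  // mirrors the three policies above

struct Placement {
  long scaled_w, scaled_h;  // size the source is rescaled to
  long offset_x, offset_y;  // Fit: paste position on the canvas; Fill: crop origin
};

inline Placement planPlacement(long src_w, long src_h, long dst_w, long dst_h,
                               FillMode mode) {
  assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
  if (mode == FillMode::Stretch) return {dst_w, dst_h, 0, 0};

  const double sx = static_cast<double>(dst_w) / src_w;
  const double sy = static_cast<double>(dst_h) / src_h;
  // Fit uses the smaller factor (image inside page), Fill the larger (page covered).
  const double s = (mode == FillMode::Fit) ? std::min(sx, sy) : std::max(sx, sy);
  long w = std::lround(src_w * s);
  long h = std::lround(src_h * s);
  // Clamp so rounding can never leave Fit larger, or Fill smaller, than the page.
  if (mode == FillMode::Fit) { w = std::min(w, dst_w); h = std::min(h, dst_h); }
  else                       { w = std::max(w, dst_w); h = std::max(h, dst_h); }
  return {w, h, std::abs(dst_w - w) / 2, std::abs(dst_h - h) / 2};
}
```

A 1000 × 1000 source on a 2550 × 3300 target gives 2550 × 2550 pasted at (0, 375) for Fit, and 3300 × 3300 cropped from (375, 0) for Fill. In every mode the canvas handed on is still exactly 2550 × 3300.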

2.5 DPI metadata and pixel DPI must match

Resampling-to-DPI and writing-DPI-metadata are two independent steps but they must agree:

  • Pixel resampling determines TW_IMAGEINFO, DIB width / height, and file pixel dimensions.
  • Metadata determines what Explorer / Photoshop displays as the physical size.

If only the metadata is changed, Photoshop will report a paper-sized image that actually contains too few pixels, and consumers such as OCR engines will silently underperform.

3. Design

3.1 Data flow

[App or UI sets DPI]
        │
        ▼
TwainDataSource::handleDatCapability (ICAP_XRESOLUTION / YRESOLUTION SET)
        │  stores into caps_
        ▼
TwainDataSource::updateScannerFromCaps()
        │  reads DPI / page size / pixel type from caps_
        │  populates ScannerSettings
        ▼
VirtualScanner::preScanPrep(ScannerSettings)
        │  1. acquireImage()
        │  2. ensure24BitDib()
        │  3. applyPageSizeScaling()   FreeImage_Rescale to (page_w*DPI, page_h*DPI)
        │  4. applyPixelFormat()       Convert to BW / GRAY / RGB
        │  5. applyDpiMetadata()       SetDotsPerMeterX/Y
        ▼
current_fibitmap_  (final DPI, final pixel format)
        │
        ├──► Native Transfer: getDibImage() → BITMAPINFOHEADER, pixels copied bottom-up
        │
        └──► File Transfer:   saveImageToFile() → FreeImage_Save → patchSavedDpiMetadata

3.2 ScannerSettings fields

struct ScannerSettings {
  TW_UINT16 pixel_type;        // TWPT_BW / TWPT_GRAY / TWPT_RGB
  double    x_resolution;      // DPI, default 300
  double    y_resolution;      // DPI, default 300
  PageSize  page_size;         // A4 / Letter / Custom (inches)
  PageFillMode page_fill_mode; // Stretch / Fit / Fill
  // ...
};

DPI is stored as double; FIX32 values from caps are converted on the way in.
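A minimal sketch of that conversion follows. Fix32 is a local stand-in with the same two fields as TWAIN's TW_FIX32 (signed 16-bit Whole, unsigned 16-bit Frac); the function names are hypothetical and the project's own floatToFix32 may differ in detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Stand-in with the same fields as TW_FIX32.
struct Fix32 {
  std::int16_t  Whole;
  std::uint16_t Frac;   // fraction in units of 1/65536
};

inline double fix32ToDouble(Fix32 f) {
  return static_cast<double>(f.Whole) + static_cast<double>(f.Frac) / 65536.0;
}

inline Fix32 doubleToFix32(double v) {
  assert(v >= -32768.0 && v <= 32767.0);
  // Scale to 16.16 fixed point and round to the nearest representable step.
  const auto scaled = static_cast<std::int64_t>(std::floor(v * 65536.0 + 0.5));
  Fix32 f;
  // floor keeps Frac non-negative even when v is negative.
  f.Whole = static_cast<std::int16_t>(std::floor(static_cast<double>(scaled) / 65536.0));
  f.Frac  = static_cast<std::uint16_t>(scaled - static_cast<std::int64_t>(f.Whole) * 65536);
  return f;
}
```

The four supported DPI values are whole numbers, so they convert with Frac = 0 and survive the round trip exactly.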

3.3 Capability layer

capability.cpp registers ICAP_XRESOLUTION / ICAP_YRESOLUTION as ENUMERATION:

  • Type TW_FIX32, default 300, values {150, 200, 300, 600}.
  • All operations: GET / GETCURRENT / GETDEFAULT / SET / RESET.
  • MSG_SET rejects values outside the set with TWRC_FAILURE / TWCC_BADVALUE.
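The SET / RESET behavior described above could look roughly like this. SetResult and ResolutionCap are hypothetical stand-ins; the real code returns the TWRC_* / TWCC_* pair and operates on the capability container in capability.cpp.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical result type standing in for TWRC_FAILURE / TWCC_BADVALUE.
enum class SetResult { Success, BadValue };

constexpr std::array<int, 4> kSupportedDpi = {150, 200, 300, 600};
constexpr int kDefaultDpi = 300;

struct ResolutionCap {
  int current = kDefaultDpi;

  // Accepts only exact members of the enumeration.
  SetResult set(double requested) {
    const bool ok = std::any_of(kSupportedDpi.begin(), kSupportedDpi.end(),
                                [requested](int v) { return requested == v; });
    if (!ok) return SetResult::BadValue;  // current value is left untouched
    current = static_cast<int>(requested);
    return SetResult::Success;
  }

  void reset() {
    current = kDefaultDpi;
    // Invariant: the default must itself be one of the enumerated values.
    assert(std::find(kSupportedDpi.begin(), kSupportedDpi.end(), current) !=
           kSupportedDpi.end());
  }
};
```

A rejected SET leaves the previous value in place, so a later GETCURRENT still reports the last accepted resolution.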

3.4 Settings UI

settings_server.cpp renders a DPI dropdown with the same four values, preselected from current caps. Submit writes the chosen DPI back to both X and Y resolution capabilities.

3.5 Rescaling

void VirtualScanner::applyPageSizeScaling(const ScannerSettings& s) {
  if (!current_fibitmap_) return;
  // Target size in pixels: page inches × DPI, rounded to the nearest pixel.
  int dst_w = static_cast<int>(std::round(s.page_size.width_inch  * s.x_resolution));
  int dst_h = static_cast<int>(std::round(s.page_size.height_inch * s.y_resolution));
  if (dst_w <= 0 || dst_h <= 0) return;

  FIBITMAP* dst = nullptr;
  switch (s.page_fill_mode) {
    case PageFillMode::Stretch:  // non-uniform scale straight to (dst_w, dst_h)
      dst = FreeImage_Rescale(current_fibitmap_, dst_w, dst_h, FILTER_LANCZOS3);
      break;
    case PageFillMode::Fit:      // uniform scale, centered on a white canvas
      dst = RescaleFit(current_fibitmap_, dst_w, dst_h);
      break;
    case PageFillMode::Fill:     // uniform scale to cover, overflow cropped
      dst = RescaleFill(current_fibitmap_, dst_w, dst_h);
      break;
  }
  // Swap in the rescaled bitmap. If rescaling failed (dst == nullptr),
  // the previous bitmap is kept at its old size.
  if (dst) {
    FreeImage_Unload(current_fibitmap_);
    current_fibitmap_ = dst;
  }
}

Fit allocates a white 24-bit canvas, rescales the source preserving aspect, and pastes it centered. Fill rescales to cover and crops the overflow.

3.6 Metadata write

After rescaling and pixel-format conversion:

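// 1 inch = 0.0254 m; round with std::lround (<cmath>) rather than truncate.
FreeImage_SetDotsPerMeterX(current_fibitmap_,
                           static_cast<unsigned>(std::lround(s.x_resolution / 0.0254)));
FreeImage_SetDotsPerMeterY(current_fibitmap_,
                           static_cast<unsigned>(std::lround(s.y_resolution / 0.0254)));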

Downstream:

  • DIB header copies biXPelsPerMeter from this value.
  • FreeImage_Save uses it to write PNG / BMP / TIFF metadata.
  • Byte-level patchers (see file_dpi_design.md) then enforce exact DPI on the saved file.
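Because the stored unit is dots per meter while the settings are in DPI, the conversion and its round trip are worth checking in isolation. The helpers below are a hypothetical sketch, not project code.

```cpp
#include <cassert>
#include <cmath>

constexpr double kMetersPerInch = 0.0254;  // exact by definition

// DPI -> dots per meter, rounded to the nearest integer.
inline long dpiToDotsPerMeter(double dpi) {
  assert(dpi > 0);
  return std::lround(dpi / kMetersPerInch);
}

// Dots per meter -> DPI, as a reader of the metadata would compute it.
inline double dotsPerMeterToDpi(long dpm) {
  return dpm * kMetersPerInch;
}
```

The four supported values map to 5906, 7874, 11811, and 23622 dots per meter. Reading them back gives values such as 150.01 rather than exactly 150, which is one reason consumers usually round the reported DPI.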

3.7 Native Transfer DPI reporting

info.XResolution = floatToFix32(settings_.x_resolution);
info.YResolution = floatToFix32(settings_.y_resolution);
info.ImageWidth  = FreeImage_GetWidth(current_fibitmap_);
info.ImageLength = FreeImage_GetHeight(current_fibitmap_);

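// 1 inch = 0.0254 m; round with std::lround (<cmath>) rather than truncate.
bih.biXPelsPerMeter = static_cast<LONG>(std::lround(settings_.x_resolution / 0.0254));
bih.biYPelsPerMeter = static_cast<LONG>(std::lround(settings_.y_resolution / 0.0254));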
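The pixel width reported here also fixes the DIB row size, which depends on the pixel type. As a sketch, assuming the usual mapping of TWPT_BW / TWPT_GRAY / TWPT_RGB to 1 / 8 / 24 bits per pixel and the standard DIB rule that rows are padded to 4 bytes (function names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// Bytes per DIB scan line: rows are padded to a multiple of 4 bytes (32 bits).
inline std::size_t dibStride(std::size_t width_px, unsigned bits_per_pixel) {
  assert(bits_per_pixel == 1 || bits_per_pixel == 8 || bits_per_pixel == 24);
  return ((width_px * bits_per_pixel + 31) / 32) * 4;
}

// Pixel-data size for a full image (header and palette excluded).
inline std::size_t dibImageBytes(std::size_t width_px, std::size_t height_px,
                                 unsigned bits_per_pixel) {
  return dibStride(width_px, bits_per_pixel) * height_px;
}
```

For a 2550-pixel-wide Letter page at 300 DPI the stride is 7652 bytes in RGB, 2552 bytes in gray, and 320 bytes in BW.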

3.8 File Transfer DPI reporting

saveImageToFile() writes the FreeImage-level metadata via applyDpiMetadata, calls FreeImage_Save, then patchSavedDpiMetadata(path, x_dpi, y_dpi) finalizes:

  • PNG pHYs
  • JFIF APP0 density
  • BMP biXPelsPerMeter / biYPelsPerMeter
  • TIFF XResolution / YResolution / ResolutionUnit

All sourced from ScannerSettings.x_resolution / y_resolution.
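As an illustration of what such a patcher writes for PNG, the data field of a pHYs chunk is nine bytes: two big-endian 32-bit pixels-per-unit values followed by a unit byte (1 = meter). The sketch below builds only that payload; the chunk length, type, and CRC are left out, and makePhysPayload is a hypothetical name rather than the project's patcher.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Builds the 9-byte data field of a PNG pHYs chunk from X / Y DPI.
inline std::array<std::uint8_t, 9> makePhysPayload(double x_dpi, double y_dpi) {
  assert(x_dpi > 0 && y_dpi > 0);
  const auto x = static_cast<std::uint32_t>(std::lround(x_dpi / 0.0254));
  const auto y = static_cast<std::uint32_t>(std::lround(y_dpi / 0.0254));
  std::array<std::uint8_t, 9> out{};
  for (int i = 0; i < 4; ++i) {
    out[i]     = static_cast<std::uint8_t>(x >> (24 - 8 * i));  // big-endian X
    out[4 + i] = static_cast<std::uint8_t>(y >> (24 - 8 * i));  // big-endian Y
  }
  out[8] = 1;  // unit specifier: 1 = meter
  return out;
}
```

At 300 DPI each axis encodes 11811 pixels per meter, i.e. the bytes 00 00 2E 23.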

4. Key decisions and rationale

4.1 Real pixel resampling, not metadata-only DPI

  • Decision: applyPageSizeScaling calls FreeImage_Rescale to (page_w × DPI, page_h × DPI).
  • Rationale: Real scanning applications (XnView "Scan to PDF", NAPS2, OCR middleware) trust TW_IMAGEINFO.ImageWidth / Length and the actual pixel count. Metadata-only DPI would silently break layout, OCR, and downstream PDF page size.

4.2 Fixed enum

  • Decision: ENUMERATION instead of RANGE.
  • Rationale: Bounded test matrix, simple dropdown, and protection against pathological values such as A4 at 4800 DPI (≈ 39685 × 56126 pixels, about 2.2 gigapixels). The enum can grow if needed.
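The size argument can be made concrete with a quick estimate. The helpers below are hypothetical; they take A4 as 210 × 297 mm and ignore row padding.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// A4 is 210 x 297 mm; 25.4 mm per inch.
constexpr double kA4WidthIn  = 210.0 / 25.4;
constexpr double kA4HeightIn = 297.0 / 25.4;

inline std::uint64_t pixelCount(double w_in, double h_in, double dpi) {
  assert(w_in > 0 && h_in > 0 && dpi > 0);
  return static_cast<std::uint64_t>(std::llround(w_in * dpi)) *
         static_cast<std::uint64_t>(std::llround(h_in * dpi));
}

// Approximate 24-bit buffer size in bytes (row padding ignored).
inline std::uint64_t rgb24Bytes(double w_in, double h_in, double dpi) {
  return pixelCount(w_in, h_in, dpi) * 3;
}
```

A4 at 600 DPI is about 34.8 million pixels (roughly 104 MB in 24-bit), whereas A4 at 4800 DPI exceeds the range of a signed 32-bit pixel count before any buffer is even allocated.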

4.3 FILTER_LANCZOS3 as the default filter

  • Decision: All rescales use FILTER_LANCZOS3.
  • Rationale: Best built-in quality for mixed text / photo content. BILINEAR looks blurry and BICUBIC slightly softens text. Per-scan cost is acceptable because only one image is processed at a time, although large pages at 600 DPI are noticeably slower (see Limitations).

4.4 Resample before pixel-format conversion

  • Decision: Rescale in 24-bit BGR, then convert to BW / GRAY.
  • Rationale: Quantizing first discards tonal information before the resampler can use it, and resampling then blurs the quantized edges. Converting at the final resolution gives sharper thresholds and cleaner grayscale.

4.5 VirtualScanner owns all pixel work; DS only packages

  • Decision: TwainDataSource calls VirtualScanner::preScanPrep(settings_) and consumes current_fibitmap_.
  • Rationale: Single-responsibility. Both Native and File Transfer share one final FIBITMAP and cannot drift.

4.6 round instead of floor for inch × DPI

  • Decision: std::round.
  • Rationale: Keeps nominal page sizes consistent; floor would lose up to 1 pixel per dimension and back-compute to a slightly smaller page.
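The difference shows up on metric page sizes, where inches × DPI is rarely a whole number. A small comparison, using hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>

inline long pixelsRounded(double inches, double dpi) {
  assert(inches > 0 && dpi > 0);
  return std::lround(inches * dpi);
}

inline long pixelsFloored(double inches, double dpi) {
  assert(inches > 0 && dpi > 0);
  return static_cast<long>(std::floor(inches * dpi));
}

// Page size an application would back-compute from pixel count and DPI.
inline double impliedInches(long pixels, double dpi) {
  return pixels / dpi;
}
```

For the A4 width (210 mm ≈ 8.2677 in) at 200 DPI the product is about 1653.54, so rounding gives 1654 pixels and flooring gives 1653. The floored value back-computes to 8.265 in, slightly narrower than the nominal page.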

4.7 Three page_fill_mode policies

  • Decision: Expose Stretch / Fit / Fill.
  • Rationale: Different testing scenarios need different policies (OCR ≠ visual diff ≠ bleed-crop testing). All three are cheap to implement.

4.8 Resampling completed before strip transfer

  • Decision: preScanPrep does all pixel work; getScanStrip only copies rows.
  • Rationale: Native Transfer strip loop calls getScanStrip many times; per-call resampling would be slow and risk inconsistent intermediate states. Precomputing also lets getImageInfo report the true final dimensions immediately.
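The copy-only strip loop can be sketched as follows. PreparedImage and readStrip are hypothetical stand-ins for current_fibitmap_ and getScanStrip; the point is only that each call slices rows out of an already finished buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the prepared image: all rows are already at the
// final DPI and pixel format before the first strip is requested.
class PreparedImage {
 public:
  PreparedImage(std::size_t stride, std::size_t rows)
      : stride_(stride), rows_(rows), data_(stride * rows, 0xFF) {}

  // Copies up to max_rows whole rows into out; returns rows copied (0 when done).
  std::size_t readStrip(std::vector<std::uint8_t>& out, std::size_t max_rows) {
    assert(max_rows > 0);
    const std::size_t n = std::min(max_rows, rows_ - next_row_);
    out.assign(data_.begin() + next_row_ * stride_,
               data_.begin() + (next_row_ + n) * stride_);
    next_row_ += n;
    return n;
  }

 private:
  std::size_t stride_, rows_, next_row_ = 0;
  std::vector<std::uint8_t> data_;
};
```

With 10 rows and strips of at most 4 rows, successive calls return 4, 4, 2, and then 0 rows, and no call touches pixel values.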

5. Architectural component changes

5.1 src/capability.cpp

  • Register ICAP_XRESOLUTION / ICAP_YRESOLUTION as ENUMERATION, default 300, values {150, 200, 300, 600}.
  • Validate MSG_SET against the enum, return TWCC_BADVALUE otherwise.
  • Pair with ICAP_UNITS = TWUN_INCHES so semantics never shift.

5.2 src/twain_data_source.cpp

  • updateScannerFromCaps() converts FIX32 → double for ScannerSettings.x_resolution / y_resolution.
  • handleDatImageInfo() returns XResolution / YResolution from settings and ImageWidth / ImageLength from the prepared bitmap.
  • allocAndFillDibHeader() fills biXPelsPerMeter / biYPelsPerMeter from settings (DPI converted to pixels per meter, i.e. DPI / 0.0254 ≈ DPI × 39.37).
  • enableDs() invokes updateScannerFromCaps() and virtual_scanner_.preScanPrep(settings_) after the UI submits.

5.3 src/virtual_scanner.h/.cpp

  • Add applyPageSizeScaling(const ScannerSettings&).
  • Add applyDpiMetadata(const ScannerSettings&).
  • Refactor preScanPrep: acquireImage → ensure24BitDib → applyPageSizeScaling → applyPixelFormat → applyDpiMetadata.
  • saveImageToFile() no longer rescales; it only saves and patches metadata.

5.4 src/settings_server.cpp

  • DPI dropdown 150 / 200 / 300 / 600, default 300, selected from caps.
  • On submit, write both X and Y resolution caps.
  • i18n label for the DPI control.

5.5 Test impact

  • Coverage matrix: 4 DPIs × 3 pixel types × 2 transfer modes × ≥2 page sizes.
  • Output files validated to satisfy pixel_w == round(page_w × DPI) and pixel_h == round(page_h × DPI), plus DPI metadata equality.
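Enumerating the matrix up front makes its size explicit. In the sketch below the two page sizes are assumed to be A4 and Letter, and TestCase / buildCoverageMatrix are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

struct TestCase {
  int dpi;
  std::string pixel_type;
  std::string transfer;
  std::string page;
};

// Cartesian product of DPI x pixel type x transfer mode x page size.
inline std::vector<TestCase> buildCoverageMatrix() {
  const std::array<int, 4> dpis = {150, 200, 300, 600};
  const std::array<const char*, 3> pixel_types = {"TWPT_BW", "TWPT_GRAY", "TWPT_RGB"};
  const std::array<const char*, 2> transfers = {"Native", "File"};
  const std::array<const char*, 2> pages = {"A4", "Letter"};  // assumed pair

  std::vector<TestCase> cases;
  for (int dpi : dpis)
    for (const char* pt : pixel_types)
      for (const char* tr : transfers)
        for (const char* pg : pages)
          cases.push_back({dpi, pt, tr, pg});
  assert(cases.size() ==
         dpis.size() * pixel_types.size() * transfers.size() * pages.size());
  return cases;
}
```

With two page sizes the matrix has 4 × 3 × 2 × 2 = 48 cases; each added page size contributes another 24.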

6. Limitations

  • Only 150 / 200 / 300 / 600 are accepted; other values fail with TWCC_BADVALUE.
  • Upscaling from low-resolution sources visibly softens the output; supply high-resolution images in images/ for best results.
  • FILTER_LANCZOS3 on A3 at 600 DPI (about 7016 × 9921 pixels, roughly 200 MB at 24-bit) is noticeably heavy on CPU and memory; there is no parallelization yet.
  • Page size is limited to a few standard sheets plus Custom; ICAP_FRAMES (arbitrary crop frames in inches) is not implemented.
  • The UI exposes one DPI dropdown only; asymmetric X / Y DPI is data-model-supported but not surfaced in the UI.
  • A DPI change takes effect only on the next scan; the image is prepared once in preScanPrep, and the source image is not re-loaded between strips.
  • Extreme combinations (e.g. Custom 0.5×0.5" @ 150 DPI = 75×75 px) produce very small images; no minimum size guard.

7. Next steps

  • Extend the DPI enum to {75, 100, 150, 200, 300, 400, 600, 1200} or switch to RANGE for stress tests.
  • Cache current_fibitmap_ keyed by (source_path, page_size, x_dpi, y_dpi, fill_mode).
  • Add automated post-scan validation: a Python tool reads each saved file and asserts pixel dimensions and DPI tags.
  • Implement ICAP_FRAMES so applications can request custom inch-coordinate crops.
  • Expose asymmetric X / Y DPI behind an "advanced" toggle.
  • Investigate faster rescalers (multi-threaded Lanczos, or fallback to Bicubic for small scale ratios).
  • Add multi-page consistency tests for future ADF / batch scenarios.