Image Source Folder Design

Design notes for loading and round-robin iterating images from %APPDATA%\bntech\images as the virtual scanner's image source in BN Tech Virtual Scanner.

1. Requirements

The virtual scanner must have "paper" to feed. This project uses a local image folder as the source of virtual pages: on every scan request, the DS picks the next image from the folder and renders it with the current settings (DPI, pixel type, page size).

Functional requirements:

  • Read candidate images from %APPDATA%\bntech\images\.
  • Support common formats: PNG, JPG, JPEG, BMP, TIF, TIFF.
  • Files are sorted alphabetically (case-insensitive, locale-stable); each scan advances to the next file.
  • Wrap around to the first file after the last (round-robin) for long-running test loops.
  • Persist the scan index across DLL unload / reload.
  • When the folder is missing, empty, or contains only unsupported files, fall back to the bundled TWAIN_logo.png from the DS install directory — never fail the scan.
  • Resetting progress must be trivial: delete one file (info.json), no UI step required.
  • Users may add or remove images between scans; the next scan must reflect the new state.
  • Thread-safety: UI thread, TWAIN dispatcher thread, and any strip-copy work must coexist safely.
  • ADF expansion (future): support emitting N images per session, still in alphabetical order.

Non-functional requirements:

  • Index file format must be human-readable and easy to edit.
  • No third-party JSON dependency; a tiny hand-written parser suffices.
  • Image loading stays on FreeImage.
  • Ordering must not depend on filesystem timestamps.

2. Domain knowledge

2.1 TWAIN scan session and "next image"

A simplified TWAIN session looks like:

OpenDS → EnableDS → (XferReady) → DAT_IMAGEINFO → DAT_IMAGE{NATIVE|FILE}XFER → DAT_PENDINGXFERS → DisableDS → CloseDS

The Count field of TW_PENDINGXFERS, returned by the DAT_PENDINGXFERS operation, declares how many more images the scanner has ready. For a flatbed simulation this is 0 (one image per session). Here, "next image" means that each new EnableDS session takes the next item from the folder.

2.2 %APPDATA% and user-scoped data

%APPDATA% typically resolves to C:\Users\<user>\AppData\Roaming. It suits per-user data because:

  • No admin rights required.
  • Per-user isolation.
  • MSI uninstall does not remove user data.

Resolve via:

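PWSTR p = nullptr;
std::wstring appdata;  // stays empty if the lookup fails
if (SUCCEEDED(SHGetKnownFolderPath(FOLDERID_RoamingAppData, 0, nullptr, &p)))
  appdata = p;         // copy before freeing the shell-allocated buffer
CoTaskMemFree(p);      // required on success and on failure; nullptr is safe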

The project pins <%APPDATA%>\bntech\images\ for images, info.json for the index, and config.ini for language.

2.3 Directory enumeration and sort stability

FindFirstFileW / FindNextFileW do not guarantee any particular order; the order depends on the file system. The project therefore sorts explicitly with _wcsicmp (case-insensitive) so behavior is identical across NTFS / FAT32 / SMB shares and across uppercase / lowercase filenames.
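
The explicit sort can be illustrated with a small portable sketch. It assumes a hypothetical helper, lessNoCase, that stands in for `_wcsicmp(a, b) < 0` by lowercasing each character with std::towlower; the project itself uses _wcsicmp, so treat this only as a model of the ordering, not as the shipped comparator.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical stand-in for `_wcsicmp(a, b) < 0`: compares character by
// character after lowercasing each one.
bool lessNoCase(const std::wstring& a, const std::wstring& b) {
  return std::lexicographical_compare(
      a.begin(), a.end(), b.begin(), b.end(),
      [](wchar_t x, wchar_t y) { return std::towlower(x) < std::towlower(y); });
}

// Returns a sorted copy so the caller's enumeration order is irrelevant.
std::vector<std::wstring> sortedNoCase(std::vector<std::wstring> names) {
  std::sort(names.begin(), names.end(), lessNoCase);
  return names;
}
```

Whatever order the enumeration produced, {b.png, A.PNG, c.png} comes out as A.PNG, b.png, c.png. Names that differ only in case compare as equivalent, so their relative order after std::sort is unspecified.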

2.4 Index persistence options

Choices considered:

  • INI (last_index=3).
  • JSON ({"next_index": 3, ...}).
  • Registry.
  • File next to the DLL.

Constraints: the DLL is loaded and unloaded per application, so the index must persist on disk; writing to the DLL directory (C:\Windows\twain_64\bntech\) requires admin rights; the registry is opaque to casual inspection. JSON in %APPDATA%\bntech\images\info.json wins on simplicity and debuggability.

2.5 Format detection

FreeImage_GetFileTypeU detects format by signature so extension renames still work. But the pre-filter (which files qualify as candidates) uses extensions for human transparency: .png / .jpg / .jpeg / .bmp / .tif / .tiff (case-insensitive).
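
One possible body for the extension pre-filter is sketched below. The function name isSupportedExt comes from the enumeration code in section 3.3; the body is an assumption, since the document does not show it.

```cpp
#include <algorithm>
#include <array>
#include <cwctype>
#include <string>

// Assumed implementation: accepts a file name whose last extension,
// lowercased, is one of the six listed extensions.
bool isSupportedExt(const std::wstring& name) {
  static const std::array<const wchar_t*, 6> kExts = {
      L".png", L".jpg", L".jpeg", L".bmp", L".tif", L".tiff"};
  const auto dot = name.find_last_of(L'.');
  if (dot == std::wstring::npos) return false;  // no extension at all
  std::wstring ext = name.substr(dot);
  std::transform(ext.begin(), ext.end(), ext.begin(), [](wchar_t c) {
    return static_cast<wchar_t>(std::towlower(c));
  });
  return std::any_of(kExts.begin(), kExts.end(),
                     [&](const wchar_t* e) { return ext == e; });
}
```

Because only the last extension is examined, info.json and a file such as photo.png.bak are both rejected, so the index file never becomes a scan candidate.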

2.6 Fallback image (TWAIN_logo.png)

The MSI installs TWAIN_logo.png next to the .ds so a permanent fallback always exists. Locate the DLL's own directory through GetModuleHandleExW + GetModuleFileNameW using a function pointer inside the module (NOT nullptr, which resolves the host EXE path).

2.7 Cross-process concurrency

Multiple TWAIN applications can load the DS concurrently. The project relies on atomic file rename for info.json updates and does not add cross-process mutexes; the worst case is two apps scanning the same image, which is acceptable in a test scanner.
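
The write-then-rename pattern can be sketched portably as follows. This is a minimal sketch, not the project's code: writeAtomically is a hypothetical helper, and std::filesystem::rename stands in for MoveFileExW(..., MOVEFILE_REPLACE_EXISTING). How rename treats an existing target on Windows depends on the standard library implementation, so the Win32 call remains the reference there.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Writes `body` to "<target>.tmp", then renames that file over `target`.
// Returns false on any failure so the caller can keep its in-memory state.
bool writeAtomically(const fs::path& target, const std::string& body) {
  fs::path tmp = target;
  tmp += ".tmp";
  {
    std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << body;
    out.flush();
    if (!out) return false;
  }  // the stream is closed here, before the rename
  std::error_code rename_ec;
  fs::rename(tmp, target, rename_ec);
  if (rename_ec) {
    std::error_code cleanup_ec;
    fs::remove(tmp, cleanup_ec);  // best effort; ignore cleanup errors
    return false;
  }
  return true;
}
```

A reader in another process then sees either the old info.json or the new one, never a half-written file. Two writers can still overwrite each other's index, which is the accepted "two apps scan the same image" case.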

3. Design

3.1 Layout

%APPDATA%\bntech\
├── config.ini
└── images\
    ├── info.json
    ├── 001_a4_color.png
    ├── 002_a4_text.png
    └── ...

Fallback:

<install_dir>\TWAIN_logo.png

3.2 ImageSource component

class ImageSource {
 public:
  void refresh();
  std::wstring acquireNext();   // wrap-around; returns fallback if empty
  size_t size() const;
  void reset();
 private:
  void loadIndex();
  void saveIndex() const;

  std::vector<std::wstring> files_;
  size_t next_index_ = 0;
  std::wstring images_dir_;
  std::wstring fallback_path_;
  mutable std::mutex mutex_;
};

Used by VirtualScanner::preScanPrep → acquireImage.

3.3 Enumeration

void ImageSource::refresh() {
  files_.clear();
  WIN32_FIND_DATAW fd{};
  HANDLE h = FindFirstFileW((images_dir_ + L"\\*").c_str(), &fd);
  if (h == INVALID_HANDLE_VALUE) return;
  do {
    if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) continue;
    if (!isSupportedExt(fd.cFileName)) continue;
    files_.push_back(images_dir_ + L"\\" + fd.cFileName);
  } while (FindNextFileW(h, &fd));
  FindClose(h);
  std::sort(files_.begin(), files_.end(),
            [](const std::wstring& a, const std::wstring& b) {
              return _wcsicmp(a.c_str(), b.c_str()) < 0;
            });
}

3.4 info.json format

{
  "next_index": 3,
  "last_file": "002_a4_text.png",
  "total": 12,
  "updated_at": "2026-05-26T10:21:33+08:00"
}
  • next_index: index to use on the next scan (0-based).
  • last_file: name used on the previous scan (for logs).
  • total: candidate count seen at last refresh().
  • updated_at: write timestamp.

Read: parse leniently; a missing or malformed file is treated as next_index = 0.

Write: serialize → write info.json.tmp → MoveFileExW(..., MOVEFILE_REPLACE_EXISTING) for atomic replacement.
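
A lenient reader for the single field that matters might look like the sketch below. parseNextIndex is a hypothetical name, and the approach (search for the key, then read digits) is an assumption about what "tiny hand-written parser" means here; it does not validate the rest of the document.

```cpp
#include <cctype>
#include <cstddef>
#include <limits>
#include <string>

// Returns the non-negative integer that follows "next_index": in `text`.
// Any problem (missing key, non-numeric value, overflow) yields 0.
std::size_t parseNextIndex(const std::string& text) {
  const std::string key = "\"next_index\"";
  std::size_t pos = text.find(key);
  if (pos == std::string::npos) return 0;
  pos += key.size();
  auto isSpace = [&](std::size_t i) {
    return std::isspace(static_cast<unsigned char>(text[i])) != 0;
  };
  auto isDigit = [&](std::size_t i) {
    return std::isdigit(static_cast<unsigned char>(text[i])) != 0;
  };
  while (pos < text.size() && isSpace(pos)) ++pos;
  if (pos >= text.size() || text[pos] != ':') return 0;
  ++pos;
  while (pos < text.size() && isSpace(pos)) ++pos;
  if (pos >= text.size() || !isDigit(pos)) return 0;
  const std::size_t kMax = std::numeric_limits<std::size_t>::max();
  std::size_t value = 0;
  for (; pos < text.size() && isDigit(pos); ++pos) {
    const std::size_t digit = static_cast<std::size_t>(text[pos] - '0');
    if (value > (kMax - digit) / 10) return 0;  // would overflow: reset
    value = value * 10 + digit;
  }
  return value;
}
```

Because every failure path returns 0, a hand-edited or truncated info.json restarts the sequence from the first image and never fails the scan.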

3.5 Round-robin & set changes

if (files_.empty()) return fallback_path_;
size_t idx = next_index_ % files_.size();
auto path = files_[idx];
next_index_ = (next_index_ + 1) % files_.size();
saveIndex();
return path;

Modulo keeps the persisted index inside [0, total).
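
The selection step can be exercised in isolation with a small in-memory model. The RoundRobin struct below is hypothetical; it mirrors the snippet above but leaves out persistence and locking, and takes the freshly enumerated list as a parameter.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// In-memory model of the selection step only (no disk I/O, no mutex).
struct RoundRobin {
  std::size_t next_index = 0;

  // `files` is the sorted candidate list from the latest enumeration.
  std::wstring pick(const std::vector<std::wstring>& files,
                    const std::wstring& fallback) {
    if (files.empty()) return fallback;  // index is left untouched
    const std::size_t idx = next_index % files.size();
    next_index = (next_index + 1) % files.size();
    return files[idx];
  }
};
```

With three files, five calls yield files 1, 2, 3, 1, 2 and leave next_index at 2. If the folder then shrinks to two files, the stale index 2 maps to 2 % 2 = 0, so the next scan restarts at the first file without going out of range.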

3.6 Caching policy

  • loadIndex() runs once at construction.
  • refresh() runs on every acquireNext() so additions / removals between scans take effect.
  • saveIndex() runs after every successful acquireNext().

3.7 Fallback path resolution

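HMODULE h = nullptr;
// FROM_ADDRESS: find the module that contains this function (the DS itself),
// not the host EXE. UNCHANGED_REFCOUNT: do not pin the DLL in memory.
GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
                   GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
                   reinterpret_cast<LPCWSTR>(&resolveFallbackPath), &h);
wchar_t buf[MAX_PATH];
// n == 0 means failure; n == MAX_PATH means the path was truncated.
const DWORD n = GetModuleFileNameW(h, buf, MAX_PATH);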

Works both for installed DS (C:\Windows\twain_64\bntech) and dev builds (build\win64\Release).
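
Once the module path is known, the fallback path is the same directory with the file name swapped. A minimal sketch follows, assuming a hypothetical helper fallbackFromModulePath and an illustrative example.ds file name (the real .ds name is not given in this document).

```cpp
#include <string>

// Replaces the file-name component of `module_path` with TWAIN_logo.png.
// Accepts both backslash and forward-slash separators.
std::wstring fallbackFromModulePath(const std::wstring& module_path) {
  const std::wstring logo = L"TWAIN_logo.png";
  const auto slash = module_path.find_last_of(L"\\/");
  if (slash == std::wstring::npos) return logo;  // bare file name
  return module_path.substr(0, slash + 1) + logo;
}
```

For example, C:\Windows\twain_64\bntech\example.ds maps to C:\Windows\twain_64\bntech\TWAIN_logo.png, and a dev-build path under build\win64\Release maps to the logo in that same folder.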

3.8 VirtualScanner integration

preScanPrep(settings):
  path = image_source_.acquireNext();
  FIBITMAP* bmp = FreeImage_LoadU(detectFormat(path), path.c_str());
  ensure24BitDib() → applyPageSizeScaling() → applyPixelFormat() → applyDpiMetadata()
  current_fibitmap_ = bmp;

4. Key decisions and rationale

4.1 Hard-coded %APPDATA%\bntech\images

  • Decision: no UI configuration for the source folder.
  • Rationale: Stable for automation, docs, and CI; %APPDATA% is per-user and admin-free; co-locates with config.ini and info.json.

4.2 Alphabetical order, not mtime

  • Decision: _wcsicmp ascending.
  • Rationale: Predictable; users can prefix filenames (001_, 002_) to control order; mtime would be perturbed by git operations or copy/paste.

4.3 Round-robin instead of stopping at end

  • Decision: wrap.
  • Rationale: Long-running test loops never stall; reset is "delete info.json".

4.4 JSON file for index

  • Decision: info.json + atomic rename.
  • Rationale: Human-readable; separates state from preferences (config.ini); extensible to richer fields.

4.5 No third-party JSON library

  • Decision: hand-written tiny parser.
  • Rationale: One numeric field is enough; saves build complexity.

4.6 Fallback image when folder is empty

  • Decision: TWAIN_logo.png instead of error.
  • Rationale: First-run users have an empty folder; failing the scan would be a poor first impression.

4.7 Fallback bundled next to .ds

  • Decision: MSI installs TWAIN_logo.png next to the binary.
  • Rationale: Resolvable via GetModuleFileNameW; not dependent on %APPDATA%.

4.8 Refresh on every acquireNext

  • Decision: no directory cache between scans.
  • Rationale: Newly added files must appear immediately; FindFirstFile is cheap.

4.9 No cross-process locking

  • Decision: only intra-process std::mutex, plus atomic rename for info.json.
  • Rationale: Acceptable trade-off; named mutexes risk orphans without meaningful benefit for a test scanner.

4.10 Modulo index by files_.size() on save

  • Decision: persist (next + 1) % size.
  • Rationale: Keeps info.json easy to interpret at a glance.

5. Architectural component changes

5.1 src/virtual_scanner.h/.cpp

  • Introduce ImageSource member with images_dir_ and fallback_path_.
  • acquireImage() calls image_source_.acquireNext() + FreeImage_LoadU.
  • Optional resetImageIndex() for testing.

5.2 src/twain_data_source.cpp

  • No special branching for image selection; rely on VirtualScanner.
  • Log last_file for diagnostics.

5.3 src/ds_entry.cpp

  • No direct changes; ensure clean destruction on DLL unload.

5.4 Installer (installer/*.wxs)

  • Add TWAIN_logo.png to a component installed alongside the .ds.
  • Do NOT create %APPDATA%\bntech\images at install time; create on demand.

5.5 Documentation

  • README "Image source folder" describes folder, extensions, and reset.
  • New blog/post about replacing the scan source.

5.6 Test impact

  • Empty folder → TWAIN_logo.png returned.
  • 1 image, 3 scans → all return the same image.
  • 3 images, 5 scans → 1, 2, 3, 1, 2; next_index lands correctly.
  • Delete info.json → next scan starts from 0.
  • Add a file between scans → it appears next round.
  • Two concurrent applications → no crash; info.json stays well-formed.

6. Limitations

  • Folder path is hard-coded; no UI override.
  • Lexicographic only; image10 sorts before image2. No natural ordering.
  • No recursion into subdirectories.
  • One image per session; no ADF batch yet.
  • Index granularity is "next index" only; no per-file history.
  • Weak cross-process arbitration; two apps may pick the same file.
  • Fallback is fixed at TWAIN_logo.png; not user-replaceable through UI.
  • Very large folders (thousands of files) make refresh() + sort noticeable.
  • Exclusively-opened files cause FreeImage_LoadU to fail without retry.

7. Next steps

  • Settings UI "Reset image index" button calling VirtualScanner::resetImageIndex().
  • Settings UI "Choose image folder" persisted in config.ini.
  • Natural-order sort (image2 < image10).
  • ADF simulation: emit N images per session and report DAT_PENDINGXFERS.Count accordingly.
  • Shuffle mode (pseudo-random order) for stress testing.
  • Add history array to info.json for traceability.
  • LockFileEx on info.json.lock to better coordinate concurrent applications.
  • ReadDirectoryChangesW watcher for live refresh.
  • Tutorial in docs/ / blog: preparing ADF test sets and regression suites that pair with info.json.
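
For the natural-order item in the list above, one possible comparator is sketched here. naturalLess is a hypothetical name, and the rules (ASCII digit runs compared by numeric value with leading zeros ignored, everything else compared case-insensitively) are assumptions, not a decided design.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Natural-order "less than": image2 sorts before image10.
bool naturalLess(const std::wstring& a, const std::wstring& b) {
  auto isDigit = [](wchar_t c) { return c >= L'0' && c <= L'9'; };
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size()) {
    if (isDigit(a[i]) && isDigit(b[j])) {
      // Skip leading zeros, then find the end of each digit run.
      while (i < a.size() && a[i] == L'0') ++i;
      while (j < b.size() && b[j] == L'0') ++j;
      std::size_t ie = i, je = j;
      while (ie < a.size() && isDigit(a[ie])) ++ie;
      while (je < b.size() && isDigit(b[je])) ++je;
      // A longer run (without leading zeros) is the larger number.
      if (ie - i != je - j) return (ie - i) < (je - j);
      const int c = a.compare(i, ie - i, b, j, je - j);
      if (c != 0) return c < 0;
      i = ie;
      j = je;
    } else {
      const auto x = std::towlower(a[i]);
      const auto y = std::towlower(b[j]);
      if (x != y) return x < y;
      ++i;
      ++j;
    }
  }
  // One string is exhausted: the one with nothing left sorts first.
  return (a.size() - i) < (b.size() - j);
}
```

Passing naturalLess to std::sort in place of the _wcsicmp lambda would change the scan order of existing folders, so it probably belongs behind an opt-in setting.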