Twain
1. What is TWAIN
TWAIN is an open industry standard API for image-acquisition devices (scanners, digital cameras). Introduced in 1992 by Hewlett-Packard, Kodak, Logitech, Aldus, and Caere, it is maintained by the TWAIN Working Group (https://www.twain.org).
The name is jokingly expanded as "Technology Without An Interesting Name". Officially it comes from Kipling's line "and never the twain shall meet", a reference to how hard it once was to connect scanners and applications; the standard is meant to let the two meet.
TWAIN solves two problems:
- Image-acquisition applications (Photoshop, XnView, Office, PDF tools) can talk to scanners and cameras of any brand / model / bus without writing per-device code.
- Hardware vendors ship one Data Source (DS) DLL and it works in every TWAIN-compliant application.
TWAIN is an application-layer API, not a bus protocol; it sits on top of the OS and delegates the actual device I/O to vendor drivers.
2. Comparison with WIA and eSCL
2.1 Snapshot
| Protocol | Long name | Steward | Platform | Transport |
|---|---|---|---|---|
| TWAIN | TWAIN Protocol | TWAIN Working Group | Windows / macOS / Linux | In-process DLL (DS) + DSM dispatch |
| WIA | Windows Image Acquisition | Microsoft | Windows-only | COM + WIA service (stisvc) + vendor minidriver |
| eSCL | (Mopria / Apple) AirScan / eSCL | Mopria Alliance / Apple | OS-agnostic | HTTP + XML over IPP-style URLs |
2.2 Design comparison
| Aspect | TWAIN | WIA | eSCL |
|---|---|---|---|
| Communication | In-process DLL (DS_Entry) | Out-of-process COM + driver | Network HTTP / REST + XML |
| UI | DS may show its own dialog | System wizard | App-owned, no driver UI |
| Control granularity | Very fine (hundreds of capabilities) | Medium | Medium-coarse |
| Vendor burden | Heavy (write full DS DLL) | Medium (WIA driver) | Light (HTTP endpoints in firmware) |
| App burden | Medium (state machine + DSM calls) | Medium (COM calls) | Light (standard HTTP) |
| Cross-platform | Windows + macOS, partial Linux | Windows only | Any HTTP-capable platform |
| Network scan | Needs TWAIN Direct | Indirect via WSD | Native |
| 32/64-bit | Historical pain (need both DS) | Driver-level, no issue | Bit-width irrelevant |
| Vendor UI quality | Varies widely | Uniform Windows style | App decides |
| Advanced features | Strong (barcode, patch, ICC) | Weak | Medium (growing) |
2.3 When to pick which
- Need deep control of scan parameters (duplex, ADF, brightness, threshold, compression) and target professional / commercial scanners → TWAIN.
- Windows-only and lightweight (Outlook image insert, Office "Scan to document") → WIA is simpler.
- Modern / mobile / cross-platform, scanner is on the network (home MFP, office MFP) → eSCL is the de-facto modern standard (iOS, Android, macOS use it by default).
- Mixed (legacy USB + modern network) → wrap both in your own abstraction.
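For the mixed case, one way to wrap several protocols is a table of function pointers per backend. The sketch below only illustrates the shape; every name in it (scan_backend, scan_options, pick_backend, the "twain:" URI prefix, the stub functions) is hypothetical and not part of TWAIN, WIA, or eSCL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical settings shared by every backend. */
typedef struct {
    int dpi;
    int duplex;   /* 0 = simplex, 1 = duplex */
} scan_options;

/* Hypothetical backend interface: one entry per protocol. */
typedef struct {
    const char *name;                           /* "twain", "escl", ... */
    int (*can_handle)(const char *device_uri);  /* 1 if this backend owns the URI */
    int (*scan)(const char *device_uri, const scan_options *opts);
} scan_backend;

static int has_prefix(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Stubs standing in for real TWAIN / eSCL code. */
static int twain_can_handle(const char *uri) { return has_prefix(uri, "twain:"); }
static int escl_can_handle(const char *uri) {
    return has_prefix(uri, "http://") || has_prefix(uri, "https://");
}
static int stub_scan(const char *uri, const scan_options *opts) {
    (void)uri;
    return opts->dpi > 0 ? 0 : -1;  /* 0 = success in this sketch */
}

static const scan_backend BACKENDS[] = {
    { "twain", twain_can_handle, stub_scan },
    { "escl",  escl_can_handle,  stub_scan },
};

/* Return the first backend that claims the device, or NULL if none does. */
const scan_backend *pick_backend(const char *device_uri) {
    for (size_t i = 0; i < sizeof BACKENDS / sizeof BACKENDS[0]; ++i)
        if (BACKENDS[i].can_handle(device_uri))
            return &BACKENDS[i];
    return NULL;
}
```

Application code then calls pick_backend(uri)->scan(uri, &opts) and never needs to know which protocol is underneath.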
2.4 Complementary roles
- The TWAIN Working Group released TWAIN Direct (HTTPS + JSON) to compete with eSCL on network scanning.
- WIA on Windows 7+ gained indirect network scan via WSD, but the API feels dated.
- Commercial document scanners (Fujitsu, Kodak, Canon DR, Brother, Epson WorkForce) almost universally ship TWAIN DS, often plus WIA / ISIS / eSCL.
- Home / office MFPs (HP, Brother, Canon, Epson) now ship eSCL primarily, with TWAIN / WIA available over USB.
3. TWAIN architecture
3.1 Three layers
┌──────────────────────────────┐
│  Application (TWAIN-aware)   │  Photoshop, XnView, NAPS2, Acrobat, ...
└──────────────┬───────────────┘
               │ DSM_Entry(app, ds, DG, DAT, MSG, pData)
┌──────────────▼───────────────┐
│  Data Source Manager (DSM)   │  TWAINDSM.dll (TWAIN 2.x)
│                              │  TWAIN_32.dll (TWAIN 1.x, 32-bit)
└──────────────┬───────────────┘
               │ DS_Entry(app, DG, DAT, MSG, pData)
┌──────────────▼───────────────┐
│  Data Source (DS, .ds)       │  Per-device DLL written by vendor.
│  Talks to actual hardware    │  Encapsulates USB / network specifics.
└──────────────────────────────┘
3.2 The DSM / DS entry point
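The application calls a single function, DSM_Entry:
TW_UINT16 DSM_Entry(pTW_IDENTITY origin,
                    pTW_IDENTITY dest,
                    TW_UINT32    DG,
                    TW_UINT16    DAT,
                    TW_UINT16    MSG,
                    TW_MEMREF    pData);
The DSM forwards operations addressed to a source to that source's DS_Entry, which takes the same arguments minus dest.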
The DG / DAT / MSG triple expresses every operation:
- DG (Data Group): CONTROL / IMAGE / AUDIO.
- DAT (Data Argument Type): payload type, e.g. DAT_IDENTITY, DAT_CAPABILITY, DAT_IMAGEINFO, DAT_IMAGENATIVEXFER, DAT_IMAGEFILEXFER.
- MSG (Message): action, e.g. MSG_OPENDS, MSG_ENABLEDS, MSG_GET, MSG_SET, MSG_PROCESSEVENT.
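To make the triple concrete, here is a minimal sketch of how a Data Source might route an incoming DG / DAT / MSG combination to a handler through a lookup table. It does not include twain.h: the constants are local stand-ins whose numeric values are placeholders, and triple_route, dispatch_triple, and the handlers are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for constants a real build takes from twain.h (values are placeholders). */
enum { DG_CONTROL = 1, DG_IMAGE = 2 };
enum { DAT_CAPABILITY = 1, DAT_IMAGEINFO = 2 };
enum { MSG_GET = 1, MSG_SET = 2 };
enum { TWRC_SUCCESS = 0, TWRC_FAILURE = 1 };

typedef uint16_t (*triple_handler)(void *pData);

/* One row per supported operation. */
typedef struct {
    uint32_t dg;
    uint16_t dat;
    uint16_t msg;
    triple_handler handle;
} triple_route;

static int g_resolution = 300;  /* toy device state */

static uint16_t cap_get(void *pData) { *(int *)pData = g_resolution; return TWRC_SUCCESS; }
static uint16_t cap_set(void *pData) { g_resolution = *(const int *)pData; return TWRC_SUCCESS; }
static uint16_t image_info_get(void *pData) { *(int *)pData = 2480; return TWRC_SUCCESS; }

static const triple_route ROUTES[] = {
    { DG_CONTROL, DAT_CAPABILITY, MSG_GET, cap_get },
    { DG_CONTROL, DAT_CAPABILITY, MSG_SET, cap_set },
    { DG_IMAGE,   DAT_IMAGEINFO,  MSG_GET, image_info_get },
};

/* Look up the triple and run its handler; unknown triples fail. */
uint16_t dispatch_triple(uint32_t dg, uint16_t dat, uint16_t msg, void *pData) {
    for (size_t i = 0; i < sizeof ROUTES / sizeof ROUTES[0]; ++i) {
        const triple_route *r = &ROUTES[i];
        if (r->dg == dg && r->dat == dat && r->msg == msg)
            return r->handle(pData);
    }
    return TWRC_FAILURE;
}
```

A table keeps the set of supported operations visible in one place, and an unsupported triple fails without any extra code.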
Opening a device:
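/* State 2 -> 3: open the DSM; pData points at the parent window handle. */
DSM_Entry(app, NULL, DG_CONTROL, DAT_PARENT, MSG_OPENDSM, &dsm_window);
/* Ask the DSM for the identity of the default source. */
DSM_Entry(app, NULL, DG_CONTROL, DAT_IDENTITY, MSG_GETDEFAULT, &ds_id);
/* State 3 -> 4: open the source; still addressed to the DSM, so dest is NULL. */
DSM_Entry(app, NULL, DG_CONTROL, DAT_IDENTITY, MSG_OPENDS, &ds_id);
/* State 4 -> 5: enable the source; ui is a TW_USERINTERFACE. */
DSM_Entry(app, &ds_id, DG_CONTROL, DAT_USERINTERFACE, MSG_ENABLEDS, &ui);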
3.3 State machine (7 states)
| State | Name | Meaning |
|---|---|---|
| 1 | Pre-Session | Nothing started |
| 2 | Source Manager Loaded | DSM DLL loaded (legacy LoadLibrary) |
| 3 | Source Manager Open | DSM MSG_OPENDSM succeeded |
| 4 | Source Open | DS MSG_OPENDS succeeded; capabilities readable / writable |
| 5 | Source Enabled | DS MSG_ENABLEDS; UI may show; waiting for Scan |
| 6 | Transfer Ready | DS posted MSG_XFERREADY |
| 7 | Transferring | App actively pulling an image |
Typical flow:
1 → 2 LoadDSM → 3 OPENDSM → 4 OPENDS → 5 ENABLEDS
                                         │
                                         ▼  (DS posts MSG_XFERREADY)
                                       6 Transfer Ready
                                         │  DAT_IMAGEINFO
                                         │  DAT_IMAGENATIVEXFER / DAT_IMAGEFILEXFER
                                         ▼
                                       7 Transferring
                                         │  DAT_PENDINGXFERS (MSG_ENDXFER)
                                         ▼
                                       6 (more images pending) or
                                       5 (no images pending)
5 → 4 (DisableDS) → 3 (CloseDS) → 2 (CloseDSM) → 1
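The flow above can be checked mechanically with a small transition table. This is a sketch, not TWAIN API code: the OP_* names, next_state, and run_ops are hypothetical. It assumes that ending a transfer returns to state 6 while images remain pending and to state 5 once none are left.

```c
#include <stddef.h>

/* Operations that move a session between the seven states (illustrative names). */
typedef enum {
    OP_LOAD_DSM, OP_OPENDSM, OP_OPENDS, OP_ENABLEDS, OP_XFERREADY,
    OP_BEGIN_XFER, OP_ENDXFER_MORE, OP_ENDXFER_DONE,
    OP_DISABLEDS, OP_CLOSEDS, OP_CLOSEDSM, OP_UNLOAD_DSM
} twain_op;

typedef struct { int from; twain_op op; int to; } transition;

static const transition TRANSITIONS[] = {
    { 1, OP_LOAD_DSM,     2 }, { 2, OP_OPENDSM,      3 },
    { 3, OP_OPENDS,       4 }, { 4, OP_ENABLEDS,     5 },
    { 5, OP_XFERREADY,    6 }, { 6, OP_BEGIN_XFER,   7 },
    { 7, OP_ENDXFER_MORE, 6 }, { 7, OP_ENDXFER_DONE, 5 },
    { 5, OP_DISABLEDS,    4 }, { 4, OP_CLOSEDS,      3 },
    { 3, OP_CLOSEDSM,     2 }, { 2, OP_UNLOAD_DSM,   1 },
};

/* Return the state reached by applying op in `state`, or 0 if op is illegal there. */
int next_state(int state, twain_op op) {
    for (size_t i = 0; i < sizeof TRANSITIONS / sizeof TRANSITIONS[0]; ++i)
        if (TRANSITIONS[i].from == state && TRANSITIONS[i].op == op)
            return TRANSITIONS[i].to;
    return 0;
}

/* Apply a sequence of operations; stop and return 0 at the first illegal one. */
int run_ops(int state, const twain_op *ops, size_t n) {
    for (size_t i = 0; i < n && state != 0; ++i)
        state = next_state(state, ops[i]);
    return state;
}
```

A guard like this is useful in test harnesses: calling an operation in the wrong state is one of the most common TWAIN application bugs, and a table makes the legal sequences explicit.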
3.4 Capability negotiation
TWAIN models device features as Capabilities:
- CAP_*: general (e.g. CAP_FEEDERENABLED, CAP_UICONTROLLABLE, CAP_XFERCOUNT).
- ICAP_*: image (e.g. ICAP_PIXELTYPE, ICAP_XRESOLUTION, ICAP_UNITS, ICAP_IMAGEFILEFORMAT, ICAP_XFERMECH).
- ACAP_*: audio.
Each value is expressed in one of four containers:
- TWON_ONEVALUE: scalar.
- TWON_ENUMERATION: set + default + current.
- TWON_RANGE: min / max / step / default / current.
- TWON_ARRAY: list (e.g. CAP_SUPPORTEDCAPS).
Supported operations: MSG_GET / GETCURRENT / GETDEFAULT / SET / RESET / QUERYSUPPORT.
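As an illustration of how an application might reason about these containers, the sketch below models three of them with simplified stand-in structs and answers two questions: what is the current value, and would a requested value be accepted. The types and functions (capability_value, current_value, accepts) are hypothetical and use plain 32-bit integers; the real twain.h structures carry an item type and are laid out differently.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { CT_ONEVALUE, CT_ENUMERATION, CT_RANGE } container_type;

/* Simplified stand-ins for the TWON_ONEVALUE / ENUMERATION / RANGE payloads. */
typedef struct { int32_t item; } one_value;
typedef struct {
    size_t num_items, current_index, default_index;
    const int32_t *items;
} enumeration;
typedef struct { int32_t min, max, step, def, current; } range;

typedef struct {
    container_type type;
    union { one_value one; enumeration enm; range rng; } u;
} capability_value;

/* Current value of the capability, whatever container carries it. */
int32_t current_value(const capability_value *c) {
    switch (c->type) {
    case CT_ONEVALUE:    return c->u.one.item;
    case CT_ENUMERATION: return c->u.enm.items[c->u.enm.current_index];
    case CT_RANGE:       return c->u.rng.current;
    }
    return 0;
}

/* 1 if the source would accept v according to the advertised container. */
int accepts(const capability_value *c, int32_t v) {
    switch (c->type) {
    case CT_ONEVALUE:
        return v == c->u.one.item;
    case CT_ENUMERATION:
        for (size_t i = 0; i < c->u.enm.num_items; ++i)
            if (c->u.enm.items[i] == v) return 1;
        return 0;
    case CT_RANGE:
        if (v < c->u.rng.min || v > c->u.rng.max) return 0;
        if (c->u.rng.step <= 0) return v == c->u.rng.min;
        return (v - c->u.rng.min) % c->u.rng.step == 0;
    }
    return 0;
}
```

Checking a value against the advertised container before sending MSG_SET lets an application report an unsupported setting itself instead of relying on the source to reject it.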
3.5 Three transfer mechanisms
ICAP_XFERMECH selects how pixels reach the application:
- Native Transfer (TWSX_NATIVE): DS hands back a handle to a complete in-memory image (a DIB on Windows) that it allocated; the application takes ownership and must free it. Every DS must support this mode, which makes it the most common.
- File Transfer (TWSX_FILE): DS writes the image to an application-specified path; application reads the file. Convenient for automation and PDF generation.
- Memory Transfer (TWSX_MEMORY): strip-by-strip in memory buffers; application controls each chunk. Good for very large images.
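The strip-by-strip pattern of a memory transfer can be sketched without any TWAIN headers. Below, a fake source hands out a few rows at a time and the application copies each strip to the right offset of its own buffer until the source signals the last one. The names (strip, fake_source, receive_image) are hypothetical; XFER_MORE and XFER_DONE stand in for the TWRC_SUCCESS and TWRC_XFERDONE return codes a real memory transfer uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { XFER_MORE, XFER_DONE };  /* stand-ins for TWRC_SUCCESS / TWRC_XFERDONE */

/* One chunk of rows, loosely modeled on what a memory transfer reports. */
typedef struct {
    size_t bytes_per_row, rows, y_offset, bytes_written;
    const uint8_t *memory;
} strip;

/* Fake device: serves an in-memory image rows_per_strip rows at a time. */
typedef struct {
    const uint8_t *pixels;
    size_t bytes_per_row, total_rows, rows_per_strip, next_row;
} fake_source;

static int fake_source_get_strip(fake_source *src, strip *out) {
    size_t remaining = src->total_rows - src->next_row;
    size_t rows = remaining < src->rows_per_strip ? remaining : src->rows_per_strip;
    out->bytes_per_row = src->bytes_per_row;
    out->rows = rows;
    out->y_offset = src->next_row;
    out->bytes_written = rows * src->bytes_per_row;
    out->memory = src->pixels + src->next_row * src->bytes_per_row;
    src->next_row += rows;
    return src->next_row == src->total_rows ? XFER_DONE : XFER_MORE;
}

/* Pull strips until the source reports the last one.
   Returns the number of bytes copied, or 0 if dest is too small. */
size_t receive_image(fake_source *src, uint8_t *dest, size_t dest_size) {
    size_t total = 0;
    int rc;
    if (src->rows_per_strip == 0) return 0;  /* would never make progress */
    do {
        strip s;
        rc = fake_source_get_strip(src, &s);
        size_t offset = s.y_offset * s.bytes_per_row;
        if (offset + s.bytes_written > dest_size) return 0;
        memcpy(dest + offset, s.memory, s.bytes_written);
        total += s.bytes_written;
    } while (rc == XFER_MORE);
    return total;
}
```

Because the application decides what to do with each strip, it could just as well stream strips to disk or an encoder instead of assembling the whole image, which is why this mode suits very large scans.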
3.6 Event loop (Windows)
A TWAIN application must forward window messages to the DS:
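MSG msg;
while (GetMessage(&msg, NULL, 0, 0) > 0) {
    TW_EVENT te = { &msg, MSG_NULL };
    /* Only meaningful while the source is enabled (state 5 and up). */
    TW_UINT16 rc = DSM_Entry(app, &ds, DG_CONTROL, DAT_EVENT, MSG_PROCESSEVENT, &te);
    if (rc == TWRC_NOTDSEVENT) {
        /* Not for the source: handle it as an ordinary window message. */
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    } else if (te.TWMessage == MSG_XFERREADY) {
        // Begin transfer.
    } else if (te.TWMessage == MSG_CLOSEDSREQ) {
        // User closed the DS UI.
    }
    /* Any other TWRC_DSEVENT was consumed by the source; do not dispatch it again. */
}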
The DS uses PostMessage to push async notifications (MSG_XFERREADY, MSG_CLOSEDSREQ, MSG_DEVICEEVENT) into the application queue, which the application then forwards back through DSM.
3.7 32-bit vs 64-bit
Historically TWAIN DS was 32-bit only (TWAIN_32.dll + C:\Windows\twain_32), so 64-bit applications needed a 32-bit broker. TWAIN 2.x introduced a 64-bit DSM (TWAINDSM.dll) and C:\Windows\twain_64\. This project builds both 32-bit and 64-bit DS to cover both worlds (see README "Dual architecture").
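As a small illustration of the split, a process can tell from its own pointer width which of the two directories named above applies to it. This is only a sketch: ds_directory_for and ds_directory are hypothetical helpers, and the paths are hard-coded here for brevity, whereas real code would resolve the Windows directory at run time.

```c
#include <stddef.h>

/* Directory matching a given pointer width, using the paths named above. */
const char *ds_directory_for(size_t pointer_bytes) {
    return pointer_bytes == 8 ? "C:\\Windows\\twain_64"
                              : "C:\\Windows\\twain_32";
}

/* Directory for the process this code is compiled into. */
const char *ds_directory(void) {
    return ds_directory_for(sizeof(void *));
}
```

A 64-bit build of an application therefore only ever sees sources installed under twain_64, which is why a DS that wants to be visible to every application has to ship in both bit widths.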
4. Use cases
- Office document management: PDF/A archival, invoices, contracts (Adobe Acrobat, ABBYY, Foxit, Kofax, NAPS2).
- Professional imaging / prepress: high-resolution film / reflective scanning (Silverfast, VueScan, Photoshop).
- Medical imaging: film digitization, pathology pre-scans (PACS front-ends).
- Finance: cheque scanners (Fujitsu fi, Canon DR, Kodak i series).
- OCR / RPA: OmniPage, Tesseract front-ends, UiPath, Blue Prism document ingestion.
- Industry: legal evidence scanning, government archive digitization, hospital medical records.
- Test / development: this project, BN Tech Virtual Scanner, provides a complete TWAIN DS without real hardware for regression testing of scanning applications, PDF tools, and RPA flows.
5. Version history
5.1 TWAIN 1.x (1992–2005)
- 1992: TWAIN 1.0, first public release.
- 1997: TWAIN 1.7, more capabilities, Mac support.
- 32-bit only (Win32, Mac OS Classic, macOS PowerPC).
- DSM: TWAIN_32.dll.
- ASCII strings (TW_STR32 etc.).
- State machine already defined; smaller capability and format coverage.
5.2 TWAIN 2.x (2008–today)
- 2008: TWAIN 2.0, new DSM TWAINDSM.dll (separate open-source LGPL project).
- Native 64-bit and unified cross-platform DSM.
- Optional Unicode strings (TWTY_UNI512).
- New capabilities: ICAP_AUTODISCARDBLANKPAGES, ICAP_AUTOMATICDESKEW, ICAP_BARCODEDETECTIONENABLED, etc.
- ICC profiles, JPEG 2000, PDF/A.
- 2013: TWAIN 2.3, metric units, stronger ADF support.
- 2015: TWAIN 2.4, bug fixes, macOS sandbox compatibility.
- 2022: TWAIN 2.5, target of this project; tightened capability negotiation and transfer descriptions.
5.3 TWAIN Direct (2017+)
- Not a replacement for TWAIN 2.x; brings TWAIN semantics to the network.
- HTTPS + JSON, REST-style.
- Devices advertise themselves via mDNS.
- Competes with eSCL for network scanning; eSCL currently has wider adoption.
- Works from browsers and mobile apps without installing a DS.
5.4 Version matrix
| Aspect | TWAIN 1.x | TWAIN 2.x | TWAIN Direct |
|---|---|---|---|
| Year | 1992+ | 2008+ | 2017+ |
| DSM | TWAIN_32.dll | TWAINDSM.dll (32 + 64) | embedded HTTPS server in device |
| Bit width | 32-bit only | 32 + 64-bit | bit-width irrelevant |
| Strings | ASCII | ASCII + optional Unicode | UTF-8 (JSON) |
| Transport | in-process DLL | in-process DLL | network HTTPS / JSON |
| Vendor UI | DS owns | DS or app | app |
| Cross-platform | Win + Mac Classic | Win + macOS + partial Linux | any HTTP platform |
| Network scan | not direct | not direct | native |
| Discovery | DSM scans directory | DSM scans directory | mDNS / Bonjour |
| Deployment | legacy | current mainstream | growing |
6. Summary
- TWAIN is the oldest and most thoroughly featured scanner standard with very fine-grained capability negotiation.
- WIA is a Windows-only lightweight alternative; good for simple scenarios but limited.
- eSCL is the de-facto modern standard for network and cross-platform scanning, increasingly the default for consumer MFPs.
- This project implements a TWAIN 2.5-compatible virtual Data Source for testing applications without real hardware; see the other design docs in this folder (docs/native_transfer_design.md, docs/file_transfer_design.md, docs/pixel_type_design.md, ...) for module-level details.