I18N / Localization Design
Design notes for adding zh_CN and en_US localization support to BN Tech Virtual Scanner.
1. Requirement
The project must support zh_CN and en_US. Localized content includes settings UI titles, labels, dropdown options, buttons, confirmation pages, the folder picker title, and selected TWAIN identity strings. The language is selected through a setting in %APPDATA%\bntech\config.ini; the default is en_US.
2. Domain knowledge
A TWAIN DS exposes device information through TW_IDENTITY. Language-related fields include Version.Language, Version.Country, ProductFamily, and ProductName. The settings UI is generated HTML served by SettingsServer, so localization focuses on C++ strings used for HTML and HTTP responses. HTML uses UTF-8, while Windows folder picker APIs use UTF-16.
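The UTF-8 / UTF-16 split can be illustrated with a small portable sketch. The function name utf8ToUtf16 is hypothetical, and the hand-written decoder is an assumption chosen so the example compiles on any platform; on Windows the module would more likely call MultiByteToWideChar with CP_UTF8. The sketch does not reject overlong encodings or encoded surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Malformed or truncated sequences become U+FFFD instead of throwing.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0xFFFD;  // replacement character by default
        std::size_t len = 1;
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }

        // Fold in the continuation bytes (10xxxxxx), six bits each.
        bool ok = (i + len <= in.size());
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok || cp > 0x10FFFF) { cp = 0xFFFD; len = 1; }

        if (cp >= 0x10000) {
            // Code points above the BMP need a surrogate pair in UTF-16.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
        i += len;
    }
    return out;
}
```

On Windows, wchar_t is 16 bits wide, so the resulting code units could be copied into a std::wstring before being passed to the Unicode folder picker APIs.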
3. Design goals
- Support en_US and zh_CN.
- Default to en_US.
- Select language at runtime through config.ini.
- Keep the existing settings UI architecture.
- Centralize user-visible strings.
- Display Chinese correctly in both browser UI and folder picker.
Non-goals: no automatic system-language detection, no live switching for already opened pages, no resource DLL, no third-party parser, and no localization for logs or technical identifiers.
4. Overall design
Add src/localization.h and src/localization.cpp. The module provides language enums, string tables, config parsing, current-language selection, and UTF-8 to UTF-16 conversion. Callers use localization::strings().
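A minimal sketch of how such a module interface might look is shown here. Only the localization::strings() entry point comes from these notes; the Language enumerators, the Strings fields, tableFor(), currentLanguage(), and the Chinese wording are illustrative assumptions. The in-memory currentLanguage() stands in for the config.ini lookup.

```cpp
namespace localization {

enum class Language { EnUS, ZhCN };

// Hypothetical subset of the user-visible strings; the real table is larger.
struct Strings {
    const char* settingsTitle;
    const char* scanButton;
    const char* cancelButton;
    const char* folderPickerTitle;
};

// One static table per language. The Chinese literals are placeholder
// translations and rely on UTF-8 source files (and /utf-8 on MSVC).
inline const Strings& tableFor(Language lang) {
    static const Strings en{"Scanner Settings", "Scan", "Cancel",
                            "Select output folder"};
    static const Strings zh{"扫描仪设置", "扫描", "取消", "选择输出文件夹"};
    return lang == Language::ZhCN ? zh : en;
}

// Stand-in for the config-driven selection: defaults to en_US.
inline Language& currentLanguage() {
    static Language lang = Language::EnUS;
    return lang;
}

// Single entry point used by callers that need UI text.
inline const Strings& strings() {
    return tableFor(currentLanguage());
}

}  // namespace localization
```

A caller would then write something like localization::strings().scanButton wherever a literal button label was previously hard-coded.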
5. Key decisions and rationale
- In-code string tables: simplest fit for C++ generated HTML and two languages.
- Fixed en_US default: matches the requirement and keeps tests predictable.
- Runtime config read: changes apply after reopening the UI or reloading the DS.
- UTF-8 HTML plus Unicode Win32 APIs: browser handles UTF-8 and Windows dialogs handle UTF-16 reliably.
- Technical identifiers are not translated: names such as PNG, DPI, A4, and form fields are standards or internal protocol details.
6. Flowcharts
6.1 Language selection
Need UI text
-> localization::strings()
-> Read config.ini
-> If no language/lang/locale, default to en_US
-> If zh_CN compatible, return Chinese strings
-> Otherwise return English strings
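The selection flow above could be sketched as follows. The function names, the comment and section handling, and the exact alias set treated as "zh_CN compatible" (zh_cn, zh-cn, zh, in any letter case) are assumptions for illustration; the sketch parses config text that has already been read from disk.

```cpp
#include <cctype>
#include <sstream>
#include <string>

enum class UiLanguage { EnUS, ZhCN };

// Remove leading and trailing ASCII whitespace, including '\r' from CRLF files.
static std::string trimAscii(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const auto e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Lower-case the text and treat '-' like '_' so "ZH-cn" matches "zh_cn".
static std::string normalizeToken(std::string v) {
    for (char& c : v) {
        c = (c == '-') ? '_'
                       : static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return v;
}

// Returns the language named by the first language/lang/locale key,
// or en_US when no such key exists or the value is not recognized.
UiLanguage parseLanguage(const std::string& configText) {
    std::string text = configText;
    if (text.compare(0, 3, "\xEF\xBB\xBF") == 0) text.erase(0, 3);  // UTF-8 BOM

    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        line = trimAscii(line);
        // Skip blanks, comments, and section headers (assumed INI conventions).
        if (line.empty() || line[0] == ';' || line[0] == '#' || line[0] == '[') continue;
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;
        const std::string key = normalizeToken(trimAscii(line.substr(0, eq)));
        const std::string value = normalizeToken(trimAscii(line.substr(eq + 1)));
        if (key != "language" && key != "lang" && key != "locale") continue;
        if (value == "zh_cn" || value == "zh") return UiLanguage::ZhCN;
        return UiLanguage::EnUS;  // invalid or English value
    }
    return UiLanguage::EnUS;  // key missing: default
}
```

Keeping the parser independent of file I/O makes the default and fallback rules easy to exercise in tests without touching %APPDATA%.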
6.2 Settings UI
TWAIN app enables DS
-> If ShowUI=true, create SettingsServer
-> buildHtmlPage gets localized strings
-> Generate UTF-8 HTML
-> Browser shows page
-> User clicks Scan/Cancel
-> /submit returns localized confirmation
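As an illustration of the page-generation step, the sketch below builds a small UTF-8 page and wraps it in an HTTP response. The PageStrings struct, the function signatures, and the markup are assumptions and do not reflect the actual SettingsServer code; only the buildHtmlPage name and the /submit endpoint appear in these notes. Form field names and values stay untranslated because they are internal protocol details.

```cpp
#include <string>

// Hypothetical subset of localized strings needed by the page.
struct PageStrings {
    std::string title;
    std::string scan;
    std::string cancel;
};

// Build the settings page. The strings are expected to be UTF-8 already,
// and the meta tag tells the browser to decode the page as UTF-8.
std::string buildHtmlPage(const PageStrings& s) {
    std::string html;
    html += "<!DOCTYPE html><html><head><meta charset=\"utf-8\">";
    html += "<title>" + s.title + "</title></head><body>";
    html += "<form method=\"post\" action=\"/submit\">";
    html += "<button name=\"action\" value=\"scan\">" + s.scan + "</button>";
    html += "<button name=\"action\" value=\"cancel\">" + s.cancel + "</button>";
    html += "</form></body></html>";
    return html;
}

// Wrap a body in a minimal HTTP response. Content-Length counts bytes,
// not characters, which matters once the body contains Chinese text.
std::string buildHttpResponse(const std::string& body) {
    return "HTTP/1.1 200 OK\r\n"
           "Content-Type: text/html; charset=utf-8\r\n"
           "Content-Length: " + std::to_string(body.size()) + "\r\n"
           "Connection: close\r\n\r\n" + body;
}
```

Declaring the charset in both the HTTP header and the meta tag is a defensive choice: either one alone is normally enough for a browser to decode the page correctly.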
7. Component changes
- CMakeLists.txt: add src/localization.cpp.
- localization module: add language enum, string tables, config parser, aliases, BOM handling, normalization, and UTF conversion.
- settings_server.cpp: localize generated HTML, /submit, and /browse; switch folder picker to Unicode APIs.
- twain_data_source.cpp: set localized TWAIN language/country and ProductFamily/ProductName.
- README.md: document language switching.
8. End-to-end data flow
TWAIN ShowUI request
-> TwainDataSource creates SettingsServer
-> SettingsServer builds HTML
-> localization reads config.ini
-> en_US or zh_CN string table is selected
-> browser displays localized UI
-> user submits
-> DS continues scanning
9. Test plan
- Delete config.ini and verify English default.
- Set language=zh_CN and verify Chinese UI, buttons, Browse dialog, and confirmation page.
- Set language=en_US and verify English.
- Set an invalid value and verify fallback to English.
- Test output directories containing Chinese characters.
- Close and reopen scanning applications after config changes, since a host application may keep the DS DLL loaded.
10. Risks
- TWAIN ProductName/ProductFamily are narrow character arrays; old TWAIN apps may interpret UTF-8 as ANSI and show garbled Chinese.
- HTML does not fully escape user-provided output directory and filename values yet.
- Config is read for each string lookup; caching may be needed later.
- Chinese text depends on UTF-8 source files and the /utf-8 compiler option.
- Browser auto-translation may alter visible text.
11. Current limitations
Only en_US and zh_CN are supported. There is no system-language detection, live switching, UI language selector, log localization, or external translation file.
12. Future improvements
Add htmlEscape(), HTML lang attribute, an in-UI language selector, config caching, more languages, external translation resources, TWAIN identity encoding improvements, and automated i18n tests.
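A minimal sketch of what htmlEscape() might look like is given here, assuming it takes UTF-8 text and is applied to user-provided values such as the output directory and filename before they are placed in generated HTML. The signature is an assumption; only the function name comes from the list above.

```cpp
#include <string>

// Escape the characters that are significant in HTML text and
// quoted attribute values. Multi-byte UTF-8 sequences pass through
// unchanged because none of their bytes match these ASCII characters.
std::string htmlEscape(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (const char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&#39;";  break;
            default:   out += c;        break;
        }
    }
    return out;
}
```

For example, a directory value of C:\scans\<a&b> would be emitted as C:\scans\&lt;a&amp;b&gt;, so it cannot close a tag or an attribute in the settings page.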