Program
Tutorials, Keynotes & Presentations
NOTE: THE CONFERENCE TIMES BELOW ARE SET AS PACIFIC DAYLIGHT TIME (PDT) - UTC/GMT-7 HOURS
WEDNESDAY, OCTOBER 13, 2021
09:00-10:30 - Session 1 (Tutorials)
TRACK 1
This tutorial helps you understand the unique characteristics of non-Latin writing systems that impinge on the implementation of Unicode-based applications. It doesn’t provide detailed coding advice, but focuses on essential concepts and requirements you must understand to deploy Unicode-based solutions, and does so across a representative range of all the world’s scripts (including Chinese, Japanese, Korean, Arabic, Hebrew, Thai, Hindi/Tamil, Russian and Greek). It also provides memorable examples to help you understand the buzzwords used in the rest of the conference and your future work with Unicode.
The tutorial starts with basic character encoding principles, but goes much further, covering things such as input of ideographs, combining characters, context-dependent shape variation, text direction, vowel signs, ligatures, punctuation, wrapping and editing, font issues, sorting and indexing, keyboards, and more. It has a proven track record as an orientation for newcomers to the conference, but also appeals to people at intermediate and advanced levels, due to the breadth of concepts discussed and the way they are related to real-world script usage. No prior knowledge is needed.
Richard Ishida, Internationalization Lead, W3C
TRACK 2
This tutorial gives you the knowledge you need to implement Unicode correctly and process text in any language. Unicode is the text encoding standard covering every major language on the planet.
Taught by software internationalization experts, this tutorial will introduce you to the key principles of Unicode, its design and architecture, and provide you with examples of real-world implementation. Attendees will come away with a basic knowledge of Unicode and how to be more effective at processing, handling, and debugging multilingual text content. The modules of the tutorial will cover:
- Why is the Unicode standard necessary? What problems does it solve?
- How computers work with text: Introduction to glyphs, character sets, and encodings
- Unicode Standard Specification and Related Data and Content
- Principles of Unicode’s Design
- Components of the Unicode Standard
- Encoding Forms, Behavior, Technical Reports, Database
- How to Use the Unicode Standard
- International Components for Unicode (ICU)
- The Unicode Consortium umbrella: Unicode, CLDR, ICU, and more
- Unicode Implementation Details and Recommendations
- Attributes, Compatibility, Non-spacing Characters, Directionality, Normalization, Graphemes, Complex Scripts, Surrogates, Collation, Regular Expressions and More
- Unicode and the Real World
- Support for Unicode in Software Platforms
- Unicode implementations on practically every modern device – in operating systems, browsers, applications, programming languages, and more
- How Unicode is Evolving
Craig Cummings, Sr. Product Manager, Amazon; Mike McKenna, Director World Ready Engineering, PayPal, Inc.; Tex Texin, Chief Globalization Architect, Xencraft
TRACK 3
This tutorial gives attendees everything they need to get started working with Unicode text in computer systems using the International Components for Unicode (ICU) library. ICU is a very popular internationalization solution and is hosted by the Unicode Consortium itself. While it vastly simplifies the internationalization of products, there can be a learning curve.
The goal of this tutorial is to help new users of ICU install and use the library. The tutorial will walk through code snippets and examples to illustrate common usage models, followed by demonstration applications and discussion of core features and conventions, advanced techniques and how to obtain further information. It is helpful if participants are familiar with Java, C, or C++ programming. Issues relating to ICU4C/C++ as well as ICU4J (Java) will be discussed. After the tutorial, participants should be able to install and use ICU for solving their internationalization problems.
Topics include: installation (C++ libraries, Java .jar files, Java SPI for JDK integration), verification of installation, and an introduction to and detailed usage analysis of ICU’s frameworks (normalization, formatting with the fluent API, calendars, collation, break iteration, Unicode properties, transliteration). We will also cover the packaging of ICU data, integrating ICU into an application’s development process, and how to get involved in the ICU development community.
Steven Loomis, Senior Software Engineer; Craig Cornelius, Senior Software Engineer, Google, Inc.
10:30-11:00 - Morning Refreshment Break
11:00-12:30 - Session 2 (Tutorials)
TRACK 1
This tutorial helps you understand the unique characteristics of non-Latin writing systems that impinge on the implementation of Unicode-based applications. It doesn’t provide detailed coding advice, but focuses on essential concepts and requirements you must understand to deploy Unicode-based solutions, and does so across a representative range of all the world’s scripts (including Chinese, Japanese, Korean, Arabic, Hebrew, Thai, Hindi/Tamil, Russian and Greek). It also provides memorable examples to help you understand the buzzwords used in the rest of the conference and your future work with Unicode.
The tutorial starts with basic character encoding principles, but goes much further, covering things such as input of ideographs, combining characters, context-dependent shape variation, text direction, vowel signs, ligatures, punctuation, wrapping and editing, font issues, sorting and indexing, keyboards, and more. It has a proven track record as an orientation for newcomers to the conference, but also appeals to people at intermediate and advanced levels, due to the breadth of concepts discussed and the way they are related to real-world script usage. No prior knowledge is needed.
Richard Ishida, Internationalization Lead, W3C
TRACK 2
The Unicode in Action tutorial is a 90-minute session that demonstrates programming with Unicode and related best practices. This tutorial will build a simple web application and demonstrate the code and resulting behavior as internationalization functions are added. Attendees will be able to relate these prototype examples to the requirements of their own applications and reference them to code solutions.
The program will show sorting at different strengths, regular expressions, Unicode normalization, bidirectional languages, and other features of the Unicode Standard. The tutorial will highlight why each of these functions is needed so you can determine when to use them in your applications. This tutorial is updated for IUC45.
Craig Cummings, Sr. Product Manager, Amazon; Mike McKenna, Director World Ready Engineering, PayPal, Inc.; Tex Texin, Chief Globalization Architect, Xencraft
TRACK 3
The CLDR Keyboard project is an initiative to collect layouts and transforms of user input tools in a standard format. This tutorial will first describe the basic structural components of a keyboard definition as specified in UTS #35 LDML (https://unicode.org/reports/tr35/tr35-keyboards.html). It will then demonstrate a basic layout for a single-layer keyboard using existing tools such as visual editors. This will be extended to include shift, control, and other layers. The example will then be enhanced with transform rules to illustrate code substitution and code point reordering. The tutorial will also review some methods for implementing such CLDR keyboards on digital platforms.
Further details of LDML’s capabilities will be reviewed, covering additional features needed in keyboard implementations. Attendees will also learn about tooling for creating platform-specific keyboard applications or modules that can be installed on user devices. Tools such as Keyman Developer can be used to prototype implementations before exporting to formats for the CLDR Keyboard database. Finally, the tutorial will outline the process of proposing and adding new items to the CLDR keyboard repository.
Craig Cornelius, Senior Software Engineer, Google, Inc.; Mark Durdin, Keyman Team Lead, SIL International; Joshua Horton, Keyman Predictive Text Lead, SIL International; Steven Loomis, Senior Software Engineer