
Simple Case Folding Vs Full Case Folding In Python Regex Module

The module I'm asking about is Matthew Barnett's regex: https://pypi.org/project/regex/. On the project description page, the difference in behavior between V0 and V1 ar

Solution 1:

The module follows the Unicode case folding table (CaseFolding.txt). Excerpt:

# The entries in this file are in the following machine-readable format:
#
# <code>; <status>; <mapping>; # <name>
#
# The status field is:
# C: common case folding, common mappings shared by both simple and full mappings.
# F: full case folding, mappings that cause strings to grow in length. Multiple characters are separated by spaces.
# S: simple case folding, mappings to single characters where different from F.

[...]

# Usage:
#  A. To do a simple case folding, use the mappings with status C + S.
#  B. To do a full case folding, use the mappings with status C + F.

The two foldings differ only for a few special characters. Examples are the small and capital Latin sharp s:

00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S

[...]

1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
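
To make the C + S versus C + F rule concrete, here is a minimal sketch in plain Python that parses a few entries in the table's format and folds the sharp s characters both ways. The names `SAMPLE`, `build_fold_map` and `fold` are made up for this illustration, and the `0041` line is an extra common (C) entry added for contrast. This is not how the regex module implements folding internally; it only applies the usage rules quoted from the table.

```python
# Sample entries in CaseFolding.txt format: the sharp-s lines quoted above,
# plus one common (C) mapping for contrast. Names here are illustrative only.
SAMPLE = """\
0041; C; 0061; # LATIN CAPITAL LETTER A
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
"""


def build_fold_map(text, statuses):
    """Return {char: folded_string} for entries whose status is in `statuses`."""
    mapping = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        code, status, target = [field.strip() for field in data.split(";")[:3]]
        if status in statuses:
            mapping[chr(int(code, 16))] = "".join(
                chr(int(cp, 16)) for cp in target.split()
            )
    return mapping


def fold(s, mapping):
    """Fold each character through `mapping`; unmapped characters stay as-is."""
    return "".join(mapping.get(ch, ch) for ch in s)


SIMPLE = build_fold_map(SAMPLE, {"C", "S"})  # usage rule A: C + S
FULL = build_fold_map(SAMPLE, {"C", "F"})    # usage rule B: C + F

print(fold("\u1e9e", SIMPLE))  # capital sharp s folds to small sharp s
print(fold("\u1e9e", FULL))    # capital sharp s folds to "ss"
print(fold("\u00df", SIMPLE))  # small sharp s has no C or S entry, so it is unchanged
print(fold("\u00df", FULL))    # small sharp s folds to "ss"
```

Under simple folding, a string never changes length, so "ß" can only ever match "ß" or "ẞ". Under full folding, both characters become "ss", so they can also match the two-letter sequence. Python's built-in `str.casefold()` performs full case folding, which is why `"ß".casefold()` returns `"ss"`.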
