*****************************************************************************
* ZTreeRegexPuzzleSolutionStrict.html
* A more precise regex solution to Laurent Duchastel's file renaming puzzle
* proposed here: http://www.ztw3.com/forum/forum_entry.php?id=83661
*
* This enhanced solution verifies that the days (01-31) and the months (01-12)
* are within their valid ranges and allows session counts with a variable
* number of digits, i.e. not just 1 of 9 but also 10 of 100 or 9999 of 100000,
* e.g. LLLL_20071216_400-1000.JPG. It also converts dash delimiters to
* underscores and the session delimiter from 'Of' to a single dash, i.e. 1Of2
* becomes 1-2. It makes sure all required fields exist and that no extra
* characters are present. These regular expressions were developed and tested
* using the Find and Replace function in UltraEdit32, which uses the Boost C++
* Regex Library. This solution builds upon the regex solutions submitted by
* Ian Binnie.
*
* Author:  Jeff Roberson
* Date:    17-Dec-07
* Updated: 03-Dec-08
****************************************************************************
* regex #1:                 Step 1 of 2
****************************************************************************
Given:
    A filename containing a date having either of the following two structures:
    LLLL-DDMMYY-XOfX.JPG or
    LLLL_DDMMYY_X-X.JPG
    Where valid months are from 01 to 12, valid days are from 01 to 31 and
    valid years are from 00 to 99. Also, each of the session count number
    fields ('X') may have more than one digit, all of which are to be preserved.

Find:
    A strict regex search pattern which matches only filenames with the given
    structure, and a replacement string to swap the day and year fields and
    extend the year field from 2 to 4 digits to look like the following:
    LLLL_20YYMMDD_X-X.JPG
    Note that years from 10 through 99 will erroneously become 2010 through
    2099 at this step.

Solution:

    First we chop up the task into its components and assign the captured groups
    to substitution variables $1 through $7:

STEP VAR  FIELD       SOURCE STRING       REGULAR EXPRESSION
------------------------------------------------------------
1    $1 = pre:       'LLLL-' or 'LLLL_'  '^([A-Z]{4})[-_]'
2    $2 = day:       'DD'                '(0[1-9]|[12]\d|3[01])'
3    $3 = month:     'MM'                '(0[1-9]|1[012])'
4    $4 = year:      'YY'                '(\d\d)'
5    $5 = firstdig:  '-X' or '_X'        '[-_](\d+)'
6    $6 = separator: '-' or 'Of'         '(-|Of)'
7    $7 = post:      'X.JPG'             '(\d+\.JPG)$'

STEP   VERBOSE DESCRIPTION OF REGULAR EXPRESSION INTERPRETATION
---------------------------------------------------------------
1      match the beginning of string position followed by (exactly four
       uppercase letters)=$1 followed by either a dash or an underscore
2      match (a '0' followed by one of '1'-'9', OR a '1' or '2' followed by a
       digit '0'-'9', OR a '3' followed by a '0' or '1')=$2
3      match (a '0' followed by one of '1'-'9', OR a '1' followed by a '0',
       '1' or '2')=$3
4      match (a digit followed by another digit)=$4
5      match a dash or an underscore followed by (one or more digits)=$5
6      match (either a dash OR the letters 'O' then 'f')=$6
7      match (one or more digits then a dot '.' then the uppercase letters
       'J', 'P', then 'G')=$7 followed by the end of string position

Final regex #1 =
'^([A-Z]{4})[-_](0[1-9]|[12]\d|3[01])(0[1-9]|1[012])(\d\d)[-_](\d+)(-|Of)(\d+\.JPG)$'
Replacement string #1 = '$1_20$4$3$2_$5-$7'
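
The step table above can be assembled into the final pattern mechanically. The
following is a minimal sketch using Python's re module, which is an assumption
of this example (the original was tested in UltraEdit32 with the Boost
library). Python writes the replacement groups as \g<1> through \g<7> instead
of $1 through $7, and the names PIECES and rename_step1 are hypothetical.

```python
import re

# One entry per row of the step table, in order.
PIECES = [
    r'^([A-Z]{4})[-_]',        # step 1: $1 = pre
    r'(0[1-9]|[12]\d|3[01])',  # step 2: $2 = day
    r'(0[1-9]|1[012])',        # step 3: $3 = month
    r'(\d\d)',                 # step 4: $4 = year
    r'[-_](\d+)',              # step 5: $5 = first session number
    r'(-|Of)',                 # step 6: $6 = separator (matched, not reused)
    r'(\d+\.JPG)$',            # step 7: $7 = post
]
REGEX_1 = re.compile(''.join(PIECES))

# Replacement string #1 with $n spelled as \g<n> for Python.
REPLACEMENT_1 = r'\g<1>_20\g<4>\g<3>\g<2>_\g<5>-\g<7>'


def rename_step1(filename):
    """Return the renamed file name, or None if the name does not match."""
    if REGEX_1.match(filename) is None:
        return None
    return REGEX_1.sub(REPLACEMENT_1, filename)


print(rename_step1('LLLL-311299-1Of2.JPG'))  # LLLL_20991231_1-2.JPG
print(rename_step1('LDUC_280906_1-3.JPG'))   # LDUC_20060928_1-3.JPG
```

Joining the pieces keeps each capture group next to the comment that names it,
which makes it easier to check the group numbers used in the replacement.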

See the RegexBuddy description of regex #1 pattern here.

**************************************************************************
* regex #2:                 Step 2 of 2
**************************************************************************
Given:
    File name output from previous regex having the following structure:
    LLLL_2010MMDD_X-X.JPG through LLLL_2099MMDD_X-X.JPG

Find:
    A regex search pattern that finds years from 2010 through 2099 and changes
    them to 1910 through 1999, thereby correcting the wrong century that the
    previous regex replacement assigns to 20th-century years.

Solution:

STEP VAR  FIELD     SOURCE STRING   REGULAR EXPRESSION
--------------------------------------------------------------
1    $1 = prefix:   'LLLL'          '^([A-Z]{4})'
2    $2 = century:  '_20'           '(_20)'
3    $3 = decade:   '1'-'9'         '([1-9])'

STEP   DESCRIPTION OF REGULAR EXPRESSION INTERPRETATION
-------------------------------------------------------
1      match the beginning of string followed by (exactly four uppercase
       letters)=$1
2      match (an underscore followed by a '2' followed by a '0')=$2
3      match (one of '1'-'9', the tens digit of the two-digit year)=$3

Final regex #2 = '^([A-Z]{4})(_20)([1-9])'
Replacement string #2 = '$1_19$3'
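
The century correction can be sketched the same way. This is a minimal
example assuming Python's re module rather than the Boost engine used for the
original tests; the function name fix_century is hypothetical, and $1 and $3
are written as \g<1> and \g<3>.

```python
import re

# Regex #2 and replacement string #2, in Python replacement syntax.
REGEX_2 = re.compile(r'^([A-Z]{4})(_20)([1-9])')
REPLACEMENT_2 = r'\g<1>_19\g<3>'


def fix_century(filename):
    """Move years 2010-2099 back to 1910-1999; leave 2000-2009 untouched."""
    return REGEX_2.sub(REPLACEMENT_2, filename, count=1)


print(fix_century('LLLL_20991231_1-2.JPG'))  # LLLL_19991231_1-2.JPG
print(fix_century('LDUC_20060928_1-3.JPG'))  # LDUC_20060928_1-3.JPG (unchanged)
```

Because the pattern requires a digit from 1 to 9 right after '_20', names
whose year starts with 0 fall through unchanged.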

See the RegexBuddy description of regex #2 pattern here.

**************************************************************************
Notes:

day range        = 01 to 31
month range      = 01 to 12
year range 19xx  = 10 to 99
year range 20xx  = 00 to 09

weaknesses:
 * allows February '30' and '31', February '29' in non-leap years, and day
   '31' in the 30-day months
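
Calendar errors of this kind are awkward to catch in the regex itself. One
possible workaround, which is not part of the original solution, is to check
the renamed file afterwards with a calendar-aware routine. The sketch below
assumes Python and its datetime module; the pattern RENAMED and the function
has_real_date are hypothetical helpers for illustration.

```python
import datetime
import re

# Assumed layout of a renamed file: LLLL_YYYYMMDD_X-X.JPG
RENAMED = re.compile(r'^[A-Z]{4}_(\d{4})(\d\d)(\d\d)_\d+-\d+\.JPG$')


def has_real_date(renamed):
    """True only if the YYYYMMDD field is a date that exists on the calendar."""
    m = RENAMED.match(renamed)
    if m is None:
        return False
    year, month, day = (int(g) for g in m.groups())
    try:
        datetime.date(year, month, day)  # raises ValueError for bad dates
    except ValueError:
        return False
    return True


print(has_real_date('LDUC_20060928_1-3.JPG'))  # True
print(has_real_date('LDUC_20060230_1-3.JPG'))  # False (30 February)
```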

Test Data:
ORIGINAL                   AFTER REGEX #1              AFTER REGEX #2
---------------------------------------------------------------------------
LLLL-311299-1Of2.JPG       LLLL_20991231_1-2.JPG       LLLL_19991231_1-2.JPG
LDUC_280906_1-3.JPG        LDUC_20060928_1-3.JPG
LDUC_280906_2-3.JPG        LDUC_20060928_2-3.JPG
LDUC_280906_3-3.JPG        LDUC_20060928_3-3.JPG
LDUC_290806_1-1.JPG        LDUC_20060829_1-1.JPG
LDUC_180498_1-9.JPG        LDUC_20980418_1-9.JPG       LDUC_19980418_1-9.JPG

Extended valid filespecs   Extended valid filespecs    Extended valid filespecs
LDUC_311207_10-999.JPG     LDUC_20071231_10-999.JPG
LLLL-020157-123Of4321.JPG  LLLL_20570102_123-4321.JPG  LLLL_19570102_123-4321.JPG

Invalid filespecs check
ABCD_320101_1-9.JPG   Day too big
ABCD_000101_1-9.JPG   Day too small
ABCD_011301_1-9.JPG   Month too big
ABCD_010001_1-9.JPG   Month too small
BCD_010101_1-9.JPG    Missing prefix char
ABCDE_010101_1-9.JPG  Extra prefix char
ABCD_010101_1-9.PG    Missing post char
ABCD_010101_1-9.JPEG  Extra post char
**************************************************************************
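
The invalid filespecs listed above can be checked in one pass. This is a
minimal sketch assuming Python's re module instead of UltraEdit32; the
dictionary INVALID and the function is_valid are hypothetical names, and the
pattern is regex #1 as given in the solution.

```python
import re

REGEX_1 = re.compile(
    r'^([A-Z]{4})[-_](0[1-9]|[12]\d|3[01])(0[1-9]|1[012])(\d\d)'
    r'[-_](\d+)(-|Of)(\d+\.JPG)$'
)

# File names from the "Invalid filespecs check" list, with their reasons.
INVALID = {
    'ABCD_320101_1-9.JPG': 'Day too big',
    'ABCD_000101_1-9.JPG': 'Day too small',
    'ABCD_011301_1-9.JPG': 'Month too big',
    'ABCD_010001_1-9.JPG': 'Month too small',
    'BCD_010101_1-9.JPG': 'Missing prefix char',
    'ABCDE_010101_1-9.JPG': 'Extra prefix char',
    'ABCD_010101_1-9.PG': 'Missing post char',
    'ABCD_010101_1-9.JPEG': 'Extra post char',
}


def is_valid(filename):
    """True if the file name matches regex #1."""
    return REGEX_1.match(filename) is not None


for name, reason in INVALID.items():
    status = 'accepted' if is_valid(name) else 'rejected'
    print(f'{name:22} {status}  ({reason})')
```

Every name in the list should be reported as rejected, while a valid name such
as LDUC_311207_10-999.JPG is accepted by is_valid.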