Elegant Way To Use Regex To Match Order-indifferent Groups Of Characters While Limiting How Many Times A Given Character Can Appear?

I am looking for a way to use Python regular expressions to match groups of characters with limits on how many times a character can appear in the match. The main problem is that t

Solution 1:

Use this pattern:



To match a substring, use this pattern:


For ABCD with A{0,4} B{0,4} C{0,2} D{0,1}, use this pattern:
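
One way to express limits like these, offered as a sketch and not necessarily the pattern this answer used, is one negative lookahead per character that rejects the string as soon as that character can be found one time too many. The helper name `build_limited_pattern` and the `MAX_COUNTS` dict are hypothetical, and the sketch assumes the whole string must consist of the listed characters with every minimum set to 0.

```python
import re

# Hypothetical limits: maximum allowed count per character (minimum is 0).
MAX_COUNTS = {'A': 4, 'B': 4, 'C': 2, 'D': 1}


def build_limited_pattern(max_counts):
    """Build a whole-string pattern allowing each character at most N times."""
    # One negative lookahead per character: fail if that character
    # can be found (N + 1) times anywhere after the start of the string.
    lookaheads = ''.join(
        '(?!(?:[^{c}]*{c}){{{n}}})'.format(c=re.escape(ch), n=limit + 1)
        for ch, limit in max_counts.items()
    )
    # Only the listed characters may appear, in any order.
    char_class = '[' + ''.join(re.escape(ch) for ch in max_counts) + ']'
    return '^' + lookaheads + char_class + '+$'


pattern = re.compile(build_limited_pattern(MAX_COUNTS))
```

With these limits, `pattern.match('DCABCAB')` succeeds, while `pattern.match('CCC')` and `pattern.match('ADD')` return `None` because C and D exceed their maximums.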


Solution 2:


Try this:

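import re

# Negative lookahead: reject any position that has two or more C's ahead of it.
# (?:...) is non-capturing, so findall returns the matched text itself.
p = re.compile(r'(?!(?:.*?C){2})[ABC]{3}', re.IGNORECASE)
test_str = 'ABCABC'  # sample input, an assumption; substitute your own string
re.findall(p, test_str)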
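
To see what this pattern accepts, here is a small sketch that checks a few hypothetical candidate strings with `fullmatch`. It uses a non-capturing group inside the lookahead, which is an adjustment on my part so that no capture group is added to the results. Note that the lookahead scans everything after the current position, not just the next three characters, so on longer inputs the C limit effectively applies to the rest of the string.

```python
import re

# Lookahead-based pattern: three characters from A, B, C with fewer than two C's.
p = re.compile(r'(?!(?:.*?C){2})[ABC]{3}', re.IGNORECASE)

# Hypothetical inputs, checked as whole strings.
candidates = ['AAA', 'ABC', 'CAB', 'ACC', 'CBC', 'abc', 'ABCD']
accepted = [s for s in candidates if p.fullmatch(s)]
print(accepted)  # ['AAA', 'ABC', 'CAB', 'abc']

# On a longer string the lookahead sees both C's from positions 0 to 2,
# so only the final 'ABC' is reported.
print(p.findall('ABCABC'))  # ['ABC']
```

If the limit should apply per three-character window, the explicit alternation in Solution 3 avoids this issue.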

Solution 3:

One thing you could do is programmatically generate an explicit alternation that you can then embed in other regexes:

from collections import Counter, namedtuple
from itertools import product

# You could just hardcode tuples in `limits` instead and access their indices in
# `is_valid`; I just happen to like `namedtuple`.
Limit = namedtuple('Limit', ['low', 'high'])

# conditions
length = 3
valid_characters = 'ABC'
limits = {
    'A': Limit(low=0, high=3),
    'B': Limit(low=0, high=3),
    'C': Limit(low=0, high=1),
}

# determines whether a single string is valid
def is_valid(string):
    if len(string) != length:
        return False
    counts = Counter(string)
    for character in limits:
        if not (limits[character].low <= counts[character] <= limits[character].high):
            return False
    return True

# constructs a (foo|bar|baz)-style alternation of all valid strings
def generate_alternation():
    possible_strings = map(''.join,
                           product(valid_characters, repeat=length))
    valid_strings = filter(is_valid, possible_strings)
    alternation = '(' + '|'.join(valid_strings) + ')'
    return alternation

Given the conditions I included above, generate_alternation() would give:

(AAA|AAB|AAC|ABA|ABB|ABC|ACA|ACB|BAA|BAB|BAC|BBA|BBB|BBC|BCA|BCB|CAA|CAB|CBA|CBB)

Which would do what you wanted. You can embed the resulting alternation in further regexes freely.
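
As a rough illustration of embedding the alternation in a larger regex, the sketch below uses a condensed variant of the generator that takes its conditions as arguments and checks upper limits only. The surrounding pattern (a valid three-letter code, a dash, then digits) and the sample text are hypothetical.

```python
import re
from collections import Counter
from itertools import product


def generate_alternation(valid_characters, length, max_counts):
    """Return a (foo|bar|baz)-style alternation of every allowed string."""
    valid = []
    for chars in product(valid_characters, repeat=length):
        counts = Counter(chars)
        # Keep the string only if no character exceeds its maximum.
        if all(counts[ch] <= max_counts[ch] for ch in max_counts):
            valid.append(''.join(chars))
    return '(' + '|'.join(valid) + ')'


alternation = generate_alternation('ABC', 3, {'A': 3, 'B': 3, 'C': 1})

# Hypothetical surrounding pattern: a valid code, a dash, then digits.
code_re = re.compile(r'\b' + alternation + r'-(\d+)\b')

text = 'ABC-12 ACC-7 BBA-301'
print(code_re.findall(text))  # [('ABC', '12'), ('BBA', '301')]
```

Because the alternation is already wrapped in parentheses, it behaves as a single capture group inside the larger pattern. Keep in mind that the number of alternatives grows quickly with `length`, so this approach suits short strings best.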

