
Regular Expressions But For Writing In The Match

When using regular expressions we generally, if not always, use them to extract some kind of information. What I need is to replace the match value with some other value... Right n

Solution 1:

sub(replacement, string[, count=0])

sub returns the string obtained by replacing the leftmost non-overlapping occurrences of the RE in string with replacement. If the pattern isn't found, string is returned unchanged. The optional count argument limits how many occurrences are replaced; the default of 0 replaces all of them.

    >>> import re
    >>> p = re.compile('(blue|white|red)')
    >>> p.sub('colour', 'blue socks and red shoes')
    'colour socks and colour shoes'
    >>> p.sub('colour', 'blue socks and red shoes', count=1)
    'colour socks and red shoes'

Solution 2:

You want to use re.sub:

>>> import re
>>> re.sub(r'aaa...bbb', 'aaaooobbb', "hola aaaiiibbb como estas?")
'hola aaaooobbb como estas?'

To re-use variable parts from the pattern, use \g<n> in the replacement string to access the n-th () group:

>>> re.sub(r"(svcOrdNbr +)..", r"\g<1>XX", "svcOrdNbr               IASZ0080")
'svcOrdNbr               XXSZ0080'
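
A small sketch of why the \g<n> form is handy: when the group reference is followed directly by a digit, \10 would be read as a reference to group 10, while \g<1>0 unambiguously means "group 1, then a literal 0". The sample text below is made up for illustration.

```python
import re

# Made-up sample text, only for illustration.
text = "item 42"

# r"\g<1>0" means "group 1, followed by the character 0".
result = re.sub(r"(\d+)", r"\g<1>0", text)
print(result)  # item 420

# Named groups can be referenced the same way with \g<name>.
named = re.sub(r"(?P<num>\d+)", r"[\g<num>]", text)
print(named)  # item [42]
```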

Solution 3:

Of course. See the 'sub' and 'subn' methods of compiled regular expressions, or the 're.sub' and 're.subn' functions. You can either make it replace the matches with a string argument you give, or you can pass a callable (such as a function) which will be called to supply the replacement. See https://docs.python.org/library/re.html
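
As a rough illustration of both options, the sketch below passes a function as the replacement and then uses re.subn to get the number of substitutions. The function name shout and the sample text are assumptions made for this example.

```python
import re

def shout(match):
    # Called once per match; the returned string is used as the replacement.
    return match.group(0).upper()

text = "blue socks and red shoes"

# A callable replacement receives the match object for each occurrence.
replaced = re.sub(r"blue|white|red", shout, text)
print(replaced)  # BLUE socks and RED shoes

# subn returns a (new_string, number_of_substitutions) tuple.
new_text, count = re.subn(r"blue|white|red", "colour", text)
print(new_text, count)  # colour socks and colour shoes 2
```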

Solution 4:

If you want to continue using the syntax you mentioned (replace the match value instead of replacing the part that didn't match), and considering you will only have one group, you could use the code below.

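import re

def getExpandedText(pattern, text, replaceValue):
    # Replace only the text captured by group 1 of the first match.
    m = re.search(pattern, text)
    if m is None:
        return text  # no match: leave the text unchanged
    return text[:m.start(1)] + replaceValue + text[m.end(1):]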
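
The same idea can probably be extended to every match by letting re.sub call a function that rebuilds each match around group 1. This is only a sketch: the helper name replace_group_everywhere is hypothetical, and it assumes group 1 takes part in every match.

```python
import re

def replace_group_everywhere(pattern, text, replace_value):
    # Hypothetical helper: replaces the text captured by group 1 in every
    # match and keeps the rest of each match untouched.
    # Assumes group 1 participates in every match.
    def rebuild(m):
        whole = m.group(0)
        # Offsets of group 1 relative to the start of the whole match.
        g_start = m.start(1) - m.start(0)
        g_end = m.end(1) - m.start(0)
        return whole[:g_start] + replace_value + whole[g_end:]
    return re.sub(pattern, rebuild, text)

html = '<div> abc <span id="x"> def </span> ghi </div>'
renamed = replace_group_everywhere(r'</?(span\b)[^>]*>', html, 'div')
print(renamed)  # <div> abc <div id="x"> def </div> ghi </div>
```

Because the replacement value is inserted as a plain string inside the function, backslashes in it are not interpreted as group references.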

Solution 5:

import re

def getExpandedText(pattern, text, *group):
    r""" Searches for pattern in the text and replaces
    all captures with the values in group.

    Tag renaming:
    >>> html = '<div> abc <span id="x"> def </span> ghi </div>'
    >>> getExpandedText(r'</?(span\b)[^>]*>', html, 'div')
    '<div> abc <div id="x"> def </div> ghi </div>'

    Nested groups, capture-references:
    >>> getExpandedText(r'A(.*?Z(.*?))B', "abAcdZefBgh", r'<\2>')
    'abA<ef>Bgh'
    """
    pattern = re.compile(pattern)
    ret = []
    last = 0
    for m in pattern.finditer(text):
        for i in range(len(m.groups())):
            start, end = m.span(i + 1)

            # nested or skipped group
            if start < last or group[i] is None:
                continue

            # text between the previous and current match
            if last < start:
                ret.append(text[last:start])

            last = end
            ret.append(m.expand(group[i]))

    ret.append(text[last:])
    return ''.join(ret)

Edit: Allow capture-references in the replacement strings.
