Splitting Strings In Python Using Specific Characters
Solution 1:
You could try using re.split() instead:
>>> import re
>>> re.split(r"[\[\]]", "I need to [go out] to lunch")
['I need to ', 'go out', ' to lunch']
The odd-looking regular expression [\[\]] is a character class that means "split on either [ or ]". The inner \[ and \] must be backslash-escaped because they are the same characters as the [ and ] that surround the character class.
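If hand-escaping the brackets feels error-prone, one option is to let re.escape() build the character class from the delimiter characters. This is only a sketch: the helper name split_on_any is hypothetical, and it assumes a non-empty string of single-character delimiters.

```python
import re

def split_on_any(text, delimiters):
    """Split text on any single character in delimiters (hypothetical helper)."""
    # re.escape() backslash-escapes regex metacharacters such as [ and ],
    # so "[]" becomes \[\] and the full pattern becomes [\[\]]
    pattern = "[" + re.escape(delimiters) + "]"
    return re.split(pattern, text)

print(split_on_any("I need to [go out] to lunch", "[]"))
# ['I need to ', 'go out', ' to lunch']
```

An empty delimiters string would produce the invalid pattern [], so a real helper would need to guard against that.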
Solution 2:
str.split() splits at the exact string you pass to it, not at any of its characters. Passing "[]" would split at occurrences of [], but not at individual brackets. Possible solutions are:
- splitting twice:
  words = [z for y in x.split("[") for z in y.split("]")]
- using re.split().
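The difference between the three calls can be seen side by side. This is a small sketch using the sample sentence from Solution 1; the variable names whole, twice and regex are chosen here only for illustration.

```python
import re

x = "I need to [go out] to lunch"

# str.split() looks for the literal two-character string "[]",
# which never occurs in x, so nothing is split
whole = x.split("[]")

# Splitting twice: first on "[", then each piece on "]"
twice = [z for y in x.split("[") for z in y.split("]")]

# re.split() with a character class matches either bracket
regex = re.split(r"[\[\]]", x)

print(whole)  # ['I need to [go out] to lunch']
print(twice)  # ['I need to ', 'go out', ' to lunch']
print(regex)  # ['I need to ', 'go out', ' to lunch']
```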
Solution 3:
string.split(s), the one you are using, treats the entire content of 's' as a single separator. In other words, your input would have had to look like "[]'I need to []go out[] to lunch', 'and eat []some food[].'[]" for it to give you the results you want.
You need to use split() from the re module instead, which treats the separator as a regular expression:
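import re

def main():
    # docread and doclist are the names from the asker's code
    for x in docread:
        # '[]' alone is not a valid pattern; the brackets must be
        # escaped inside a character class to match either one
        words = re.split(r'[\[\]]', x)
        for word in words:
            doclist.append(word)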
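For a version that runs on its own, the loop can be wrapped in a function that takes the lines as an argument. This is a sketch under assumptions: the function name split_lines is hypothetical, and docread is assumed to be a list of strings like the two sample lines quoted above.

```python
import re

# Compile once: a character class matching either "[" or "]"
BRACKETS = re.compile(r"[\[\]]")

def split_lines(lines):
    """Split every line on brackets and collect the pieces (hypothetical helper)."""
    doclist = []
    for line in lines:
        # extend() appends each piece, like the inner for loop above
        doclist.extend(BRACKETS.split(line))
    return doclist

# Assumed sample input, taken from the example strings in this answer
docread = ["I need to [go out] to lunch", "and eat [some food]."]
print(split_lines(docread))
# ['I need to ', 'go out', ' to lunch', 'and eat ', 'some food', '.']
```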