python - Replacing regex with new regex
Say I have
"a acrobat jumped on bridge"
and want to change it to
"an acrobat jumped on bridge".
Right now, I'm using
lyrics = re.sub(r" (a|e|i|o|u|y){1}([a-z]+|[a-z]+)", r" (a|e|i|o|u|y){1}([a-z]+|[a-z]+)", lyrics)
and the resulting string isn't replaced the way I'd hoped or expected. How else can I do this?
To clarify, I want to be able to generalize this to every case, not just the example used above.
According to English grammar, "an" comes before a word that starts with a vowel. You can use this:
>>> import re
>>> re.sub(r'\ba\b(?=\s+[aeiouAEIOU])', 'an', "a acrobat jumped on bridge")
'an acrobat jumped on bridge'
>>> re.sub(r'\ba\b(?=\s+[aeiouAEIOU])', 'an', "a elephant")
'an elephant'
Notice that the a before acrobat has been changed to an, whereas an a before bridge would not be changed, since bridge does not start with a vowel. The a before elephant has also been changed to an, so the above regex is general and works for other words too.
The pattern used here is '\ba\b(?=\s+[aeiouAEIOU])':
\ba\b matches a literal a with a word boundary on either side.
(?=\s+[aeiouAEIOU]) is a positive lookahead that ensures the a is followed by whitespace and then a vowel character, without consuming them.
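As a minimal sketch, the lookahead pattern can be wrapped in a small helper. The names A_BEFORE_VOWEL and fix_article are hypothetical and chosen only for illustration; the pattern itself is the one described above. Note that it only looks at spelling, so words such as "hour" or "university" are not handled, and a capital "A" at the start of a sentence is not matched.

```python
import re

# Hypothetical constant name; the pattern is the lookahead regex from the answer.
A_BEFORE_VOWEL = re.compile(r'\ba\b(?=\s+[aeiouAEIOU])')

def fix_article(text):
    """Replace a standalone 'a' with 'an' when the next word starts with a vowel letter."""
    # The lookahead only checks what follows; only the 'a' itself is replaced.
    return A_BEFORE_VOWEL.sub('an', text)

print(fix_article("a acrobat jumped on a bridge"))  # an acrobat jumped on a bridge
print(fix_article("a elephant and a owl"))          # an elephant and an owl
```

Because the lookahead consumes nothing, the following word is left untouched and does not need to be re-inserted in the replacement.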
To replace every standalone a with an, regardless of the word that follows, you can use this:
>>> re.sub(r'\ba\b', 'an', "a bridge")
'an bridge'
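For comparison, the attempt in the question passes a pattern as the second argument of re.sub. That argument is a replacement template, not a regex, so its text is inserted literally. If you prefer to consume the following word rather than use a lookahead, one possible sketch uses capture groups and refers back to them with \1 and \2 in the replacement. The function name fix_with_backrefs is hypothetical.

```python
import re

def fix_with_backrefs(text):
    """Insert 'n' after a standalone 'a' that precedes a vowel-initial word."""
    # Group 1 captures the article; group 2 captures the whitespace and the
    # vowel-initial word. \1 and \2 in the replacement refer back to them.
    return re.sub(r'\b(a)(\s+[aeiouAEIOU][a-z]*)', r'\1n\2', text)

lyrics = "a acrobat jumped on bridge"
print(fix_with_backrefs(lyrics))  # an acrobat jumped on bridge
```

Both approaches give the same result on these inputs; the lookahead version is shorter because nothing has to be put back.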