Skip to content
Snippets Groups Projects
Select Git revision
  • benchmark-tools
  • postgres-lambda
  • master default
  • REL9_4_25
  • REL9_5_20
  • REL9_6_16
  • REL_10_11
  • REL_11_6
  • REL_12_1
  • REL_12_0
  • REL_12_RC1
  • REL_12_BETA4
  • REL9_4_24
  • REL9_5_19
  • REL9_6_15
  • REL_10_10
  • REL_11_5
  • REL_12_BETA3
  • REL9_4_23
  • REL9_5_18
  • REL9_6_14
  • REL_10_9
  • REL_11_4
23 results

regcomp.c

Blame
    • Tom Lane's avatar
      2a3dbc15
      Fix misoptimization of "{1,1}" quantifiers in regular expressions. · 2a3dbc15
      Tom Lane authored
      A bounded quantifier with m = n = 1 might be thought a no-op.  But
      according to our documentation (which traces back to Henry Spencer's
      original man page) it still imposes greediness, or non-greediness in the
      case of the non-greedy variant "{1,1}?", on whatever it's attached to.
      
      This turns out not to work though, because parseqatom() optimizes away
      the m = n = 1 case without regard for whether it's supposed to change
      the greediness of the argument RE.
      
      We can fix this by just not applying the optimization when the greediness
      needs to change; the subsequent general cases handle it fine.
      
      The three cases in which we can still apply the optimization are
      (a) no quantifier, or quantifier does not impose a preference;
      (b) atom has no greediness property, implying it cannot match a
      variable amount of text anyway; or
      (c) quantifier's greediness is same as atom's.
      Note that in most cases where one of these applies, we'd have exited
      earlier in the "not a messy case" fast path.  I think it's now only
      possible to get to the optimization when the atom involves capturing
      parentheses or a non-top-level backref.
      
      Back-patch to all supported branches.  I'd ordinarily be hesitant to
      put a subtle behavioral change into back branches, but in this case
      it's very hard to see a reason why somebody would write "{1,1}?" unless
      they're trying to get the documented change-of-greediness behavior.
      
      Discussion: https://postgr.es/m/5bb27a41-350d-37bf-901e-9d26f5592dd0@charter.net
      2a3dbc15
      History
      Fix misoptimization of "{1,1}" quantifiers in regular expressions.
      Tom Lane authored
      A bounded quantifier with m = n = 1 might be thought a no-op.  But
      according to our documentation (which traces back to Henry Spencer's
      original man page) it still imposes greediness, or non-greediness in the
      case of the non-greedy variant "{1,1}?", on whatever it's attached to.
      
      This turns out not to work though, because parseqatom() optimizes away
      the m = n = 1 case without regard for whether it's supposed to change
      the greediness of the argument RE.
      
      We can fix this by just not applying the optimization when the greediness
      needs to change; the subsequent general cases handle it fine.
      
      The three cases in which we can still apply the optimization are
      (a) no quantifier, or quantifier does not impose a preference;
      (b) atom has no greediness property, implying it cannot match a
      variable amount of text anyway; or
      (c) quantifier's greediness is same as atom's.
      Note that in most cases where one of these applies, we'd have exited
      earlier in the "not a messy case" fast path.  I think it's now only
      possible to get to the optimization when the atom involves capturing
      parentheses or a non-top-level backref.
      
      Back-patch to all supported branches.  I'd ordinarily be hesitant to
      put a subtle behavioral change into back branches, but in this case
      it's very hard to see a reason why somebody would write "{1,1}?" unless
      they're trying to get the documented change-of-greediness behavior.
      
      Discussion: https://postgr.es/m/5bb27a41-350d-37bf-901e-9d26f5592dd0@charter.net