C# Regex – How to Mask Credit Card Number Before and After Spaces

c# regex

I am having an issue when there is a space before or after a CreditCardNo.

For example: from INV 2420852290 to SAV 0165487. Here 2420852290 has only 10 digits, yet it is still being masked.

For a CreditCardNo, the valid range is 12-19 digits. I think the reason is that the spaces before and after the digits are being counted as the 11th and 12th characters.

The regex in use is

(?<=(?<![\d-*])(?=(?:-?[\d*\s]){12,19}(?![\d-*\s]))[\d-*\s]*)[\d*](?!(?:-?[\d*\s]){0,3}(?![\d-*\s]))

I tried the code below with the regex above. Expected behavior: all existing scenarios should keep working, and the case described in this question should be handled as well. Every digit except the last four should be masked with x.

They can be tested using the URL – https://dotnetfiddle.net/Gopzoz. Thanks

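        // Requires: using System.Text.RegularExpressions;
        // As an extension method, this must be declared inside a static class.
        public static string MaskNewCCNo(this string value)
        {
            var a = Regex.Replace(value, @"(?<=(?<![\d-*])(?=(?:-?[\d*\s]){12,19}(?![\d-*\s]))[\d-*\s]*)[\d*](?!(?:-?[\d*\s]){0,3}(?![\d-*\s]))", "x");

            return a;
        }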

Best Answer

Assuming these numbers can only be separated by a single space or hyphen, here are two ideas:

By use of \G to chain matches:
```
(?:\G(?!^)|(?<!\d[ -]?)(?=(?:[ -]?\d){12,19}(?![ -]?\d)))\d(?=(?:[ -]?\d){4})([ -]?)
```
See this demo at regexstorm or your updated sample. Replace with x$1 (group 1 captures the optional separator, so it is preserved).

This will first find a number of 12 to 19 digits and chain the matches from there. At each matched digit, the second lookahead checks that at least four more digits follow, so the last four digits are left unmasked.

Similar to your current pattern:
```
\d(?<=(?=(?:[ -]?\d){12,19}(?![ -]?\d))(?<!\d)(?>[ -]?\d)*)(?=(?:[ -]?\d){4})
```
Demo at regexstorm or the updated .NET demo. Replace with just x (as in your current code).

This performs all of the lookaround checks at every digit found, so it is probably more costly. The atomic group in (?>[ -]?\d)* prevents matching inside longer strings such as 0 1234567890123456789.
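Outside .NET, neither pattern ports directly: Python's built-in re module, for instance, supports neither \G nor variable-length lookbehind. The following is a minimal sketch of the same rule in Python using a replacement callback instead of lookarounds. It is a swapped-in technique, not a translation of the patterns above; the names CANDIDATE and mask_card_numbers are made up for this example, and it assumes the same constraints as the answer (digits separated by at most one space or hyphen, 12 to 19 digits, last four kept).

```python
import re

# A candidate is a run of digits separated by at most one space or hyphen.
CANDIDATE = re.compile(r"(?<![0-9])[0-9](?:[ -]?[0-9])*")


def mask_card_numbers(text, keep=4):
    """Mask all but the last `keep` digits of every 12-19 digit number."""

    def mask(match):
        run = match.group()
        digits = sum(ch.isdigit() for ch in run)
        if not 12 <= digits <= 19:
            return run  # too short or too long: leave untouched
        to_mask = digits - keep
        out = []
        for ch in run:
            if ch.isdigit() and to_mask > 0:
                out.append("x")
                to_mask -= 1
            else:
                out.append(ch)  # separators and the kept digits
        return "".join(out)

    return CANDIDATE.sub(mask, text)


print(mask_card_numbers("from INV 2420852290 to SAV 0165487"))
print(mask_card_numbers("card 4111 1111 1111 1111 ok"))
```

Counting digits in the callback keeps the separators out of the 12-19 check, which is the part the original pattern got wrong.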

The reason your current regex fails on the sample lies in (?<![\d-*]), which is meant to separate the whole number from the surrounding text but only checks the single character before it. Combined with (?:-?[\d*\s]){12,19}, which counts whitespace as well as digits, the 10 digits plus the space on either side add up to the 12 characters required.
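The counting problem can be reproduced in isolation. This is a small sketch in Python (the counting group uses only syntax that Python's re module also accepts; the variable names are illustrative):

```python
import re

# The counting part of the original pattern: each repetition accepts
# a digit, an asterisk, or a whitespace character.
counted = re.compile(r"(?:-?[\d*\s]){12,19}")

with_spaces = " 2420852290 "  # the 10 digits plus the space on each side
digits_only = "2420852290"

print(len(with_spaces), counted.fullmatch(with_spaces) is not None)
print(len(digits_only), counted.fullmatch(digits_only) is not None)
```

With the surrounding spaces included, the 10-digit number reaches the 12-character minimum; without them it does not.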

Besides, I would not use something like [\d-*\s]. In this case (.NET regex) there is no error, but it still looks ugly: an unescaped hyphen inside a character class normally denotes a character range. To match a literal hyphen, put it at the start or end of the character class or escape it with a backslash.
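To see why hyphen placement matters, here is a short sketch in Python, whose re module follows the same range convention. The class [+-9] is an illustrative example, not taken from the question:

```python
import re

# "+-9" is read as a range from "+" (U+002B) to "9" (U+0039),
# which also covers ",", "-", ".", "/" and every digit.
as_range = re.compile(r"[+-9]")
# Escaped, or placed last, the hyphen is just a literal character.
escaped = re.compile(r"[+\-9]")
at_end = re.compile(r"[+9-]")

for ch in "+-95,":
    print(repr(ch),
          bool(as_range.fullmatch(ch)),
          bool(escaped.fullmatch(ch)),
          bool(at_end.fullmatch(ch)))
```

Only the range form accepts "5" and ",", which is rarely what was intended.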

Related Solutions

C# Regex – Add Spaces Before Capital Letters

The regexes will work fine (I even voted up Martin Brown's answer), but they are expensive (and personally I find any pattern longer than a couple of characters prohibitively obtuse).

This function

string AddSpacesToSentence(string text, bool preserveAcronyms)
{
    if (string.IsNullOrWhiteSpace(text))
        return string.Empty;
    // Worst case every character gains a space, so reserve twice the length.
    StringBuilder newText = new StringBuilder(text.Length * 2);
    newText.Append(text[0]);
    for (int i = 1; i < text.Length; i++)
    {
        if (char.IsUpper(text[i]))
            // Insert a space after a character that is neither a space nor a
            // capital, or (optionally) before the capital that ends an acronym.
            if ((text[i - 1] != ' ' && !char.IsUpper(text[i - 1])) ||
                (preserveAcronyms && char.IsUpper(text[i - 1]) &&
                 i < text.Length - 1 && !char.IsUpper(text[i + 1])))
                newText.Append(' ');
        newText.Append(text[i]);
    }
    return newText.ToString();
}

will do it 100,000 times in 2,968,750 ticks; the regex takes 25,000,000 ticks (and that's with the regex compiled).

It's better, for a given value of better (i.e. faster); however, it's more code to maintain. "Better" is often a compromise between competing requirements.

Update
It's a good long while since I looked at this, and I just realised the timings haven't been updated since the code changed (it only changed a little).

On a string with 'Abbbbbbbbb' repeated 100 times (i.e. 1,000 bytes), a run of 100,000 conversions takes the hand-coded function 4,517,177 ticks and the regex 59,435,719, so the hand-coded function runs in 7.6% of the time the regex takes.

Update 2
Will it take acronyms into account? It will now! The logic of the if statement is fairly obscure, and as you can see, expanding it to this ...

if (char.IsUpper(text[i]))
    if (char.IsUpper(text[i - 1]))
        if (preserveAcronyms && i < text.Length - 1 && !char.IsUpper(text[i + 1]))
            newText.Append(' ');
        else ;
    else if (text[i - 1] != ' ')
        newText.Append(' ');

... doesn't help at all!
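One way to make the condition easier to follow is to give its two halves names. This is a minimal sketch of the same loop in Python rather than C#; the function and variable names are made up for this example, and the timings quoted above do not apply to it.

```python
def add_spaces_to_sentence(text, preserve_acronyms=False):
    """Insert a space before capitals that start a new word."""
    if not text or text.isspace():
        return ""
    out = [text[0]]
    for i in range(1, len(text)):
        cur, prev = text[i], text[i - 1]
        if cur.isupper():
            # A capital right after a character that is neither a space
            # nor another capital starts a new word.
            starts_word = prev != " " and not prev.isupper()
            # A capital after a capital and before a non-capital is the
            # first letter of the word that follows an acronym.
            ends_acronym = (
                preserve_acronyms
                and prev.isupper()
                and i < len(text) - 1
                and not text[i + 1].isupper()
            )
            if starts_word or ends_acronym:
                out.append(" ")
        out.append(cur)
    return "".join(out)


print(add_spaces_to_sentence("ParseXMLFile", preserve_acronyms=True))
print(add_spaces_to_sentence("ParseXMLFile"))
```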

Here's the original simple method that doesn't worry about Acronyms

string AddSpacesToSentence(string text)
{
    if (string.IsNullOrWhiteSpace(text))
        return "";
    StringBuilder newText = new StringBuilder(text.Length * 2);
    newText.Append(text[0]);
    for (int i = 1; i < text.Length; i++)
    {
        if (char.IsUpper(text[i]) && text[i - 1] != ' ')
            newText.Append(' ');
        newText.Append(text[i]);
    }
    return newText.ToString();
}

Python – Remove All Special Characters, Punctuation, and Spaces from String

This can be done without regex:

>>> string = "Special $#! characters   spaces 888323"
>>> ''.join(e for e in string if e.isalnum())
'Specialcharactersspaces888323'

You can use str.isalnum:

S.isalnum() -> bool

Return True if all characters in S are alphanumeric
and there is at least one character in S, False otherwise.

If you insist on using regex, other solutions will do fine. However note that if it can be done without using a regular expression, that's the best way to go about it.
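For comparison, here is roughly what the regex route looks like next to the str.isalnum approach. This is a minimal sketch: the helper names are made up for this example, and the character class assumes that only ASCII letters and digits should be kept, whereas str.isalnum also accepts non-ASCII letters and digits.

```python
import re


def strip_with_isalnum(s):
    # Keep every character that str.isalnum() accepts.
    return "".join(ch for ch in s if ch.isalnum())


def strip_with_regex(s):
    # Delete every run of characters outside A-Z, a-z and 0-9.
    return re.sub(r"[^A-Za-z0-9]+", "", s)


sample = "Special $#! characters   spaces 888323"
print(strip_with_isalnum(sample))
print(strip_with_regex(sample))
```

The two agree on ASCII input like the sample; they differ on text such as "café", where the regex version drops the accented letter.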
