Excel Tricks | MacLochlainns Weblog

Archive for the ‘Excel Tricks’ tag

LIKE operator in Excel

While discussing the limitations of wild card comparisons in the Microsoft Excel’s IF function, I resolved to share my ISLIKE function. It acts like the LIKE operator in SQL. You can’t use the word like as the function name because it’s a reserved word in Visual Basic for Applications (VBA).

The function lets you compare any string to see if it is contained in another string. The default mode is case-insensitive, and the override mode case-sensitive. The optional third element lets you override the default comparison method.

You should remember or note that a number 0 is always evaluated as false, and any other number (negative or positive) is always true. You can put the ISLIKE function in a macro sheet or an Add-in library. I recently posted instructions on how to create your own Add-in library.

Here is the VBA code for the user defined ISLIKE function:

Function ISLIKE( inValue As Variant _
               , hasValue As Variant _
               , Optional caseSensitive As Integer = 0)
 
  ' Define a logical return variable and assign a default value.
  Dim logicReturn As Boolean
  logicReturn = False
 
  If caseSensitive Then
 
    ' Finds a case-insensitive value as a component in a larger value.
    If inValue Like "*" + hasValue + "*" Then
      logicReturn = True
    End If
 
  Else
 
    ' Finds a case-sensitive value as a component in a larger value.
    If UCase(inValue) Like "*" + UCase(hasValue) + "*" Then
      logicReturn = True
    End If
 
  End If
  ' Return the logical value.
  ISLIKE = logicReturn
 
End Function

Line numbers 12 and 19 use the "*", which is equivalent to the regular expression of "\.*". I’ve no idea why they made this choice, and it’s not well explained on Microsoft’s web site covering SmartTags.

You can shorten the program and make it slightly less efficient by performing the case sensitive match twice. The following doesn’t rely on nested block IF statements. You also must change the third parameter to a Boolean or the Not operator won’t work correctly.

Function ISLIKE( lookupValue As Variant _
               , containsValue As Variant _
               , Optional caseSensitive As Boolean = 0)
 
  ' Define a logical return variable and assign a default value.
  Dim logicReturn As Boolean
  logicReturn = False
 
  ' Finds a case-insensitive value as a component in a larger value.
  If caseSensitive And _
     lookupValue Like "*" + containsValue + "*" Then
      logicReturn = True
 
  ' Finds a case-sensitive value as a component in a larger value.
  ElseIf Not caseSensitive And _
    UCase(lookupValue) Like "*" + UCase(containsValue) + "*" Then
      logicReturn = True
 
  End If
 
  ' Return the logical value.
  ISLIKE = logicReturn
 
End Function

Somebody asked me, “How this can be useful?” That meant from their perspective, how is it useful beyond a comparison of cell values. A quick example of that additional utility can be demonstrated from an early post. In that post on replacing a VLOOKUP function with a UDF, there’s compound logic used to find the occurrence of a substring in a string. The ISLIKE function can simplify that logic.

Here’s that snippet of the code from that referenced post. The snippet evaluates whether the base string is greater than one character and checks the second character of the string to see if it is a - (minus sign).

58	ElseIf Len(LetterGrades(i)) > 1 And Mid(LetterGrades(i), 2, 1) = "-" Then

You can replace the compound logic with a call to the LIKE function. It would replace line 58 as follows:

58	ElseIf ISLIKE(LetterGrades(i),"-") Then

The logic of the function examines to see if the first parameter contains the second parameter. This is what is meant by like. In the following illustration, you examine if a string contains a substring (or is like a substring). The cell A2 is also assigned a Range name. Range names can increase the readability of your models and functions. All but the last one perform case insensitive comparisons that find whether the substring in cell A2 is found with cell range of the B5:B10. The B10 formula performs a case sensitive match.

Another logic comparison uses grades to see if a minus sign is found in them. The example compares the cell values in A1 and A2 against the B1 and B2 values. Returning true for one and false for the other cases in column C of the respective rows.

Here are the calls to the ISLIKE user defined function. The first one is case-insensitive and the second case-sensitive, but there’s no case sensitivity with a minus sign, -.

Spreadsheet Functions
Cell	Formula
C1	`=ISLIKE(A1,B1)`
C2	`=ISLIKE(A2,B2,TRUE)`

As always, I hope this helps. If you’ve comments or alternative approaches, please let me know.

Written by maclochlainn

June 4th, 2010 at 11:28 am

Posted in Microsoft Excel,Microsoft Office,VBA,VBA Tutorial

Tagged with Excel Development, Excel Tricks, Excel UDF, Excel User-Defined Function

Excel string parsing

with 14 comments

Parsing strings isn’t the easiest thing to do in Excel 2007 or Excel 2008 but it is an important thing to know how to do. You’ll learn how to parse a set names into first, middle and last names. You can find the data set for these examples at the bottom of the blog post.

Parsing the left substring ↓

This shows you how to parse a left substring from a text cell in Microsoft Excel.

There are two built in functions that let you parse dynamic substrings from the left of a string. They are the LEFT and FIND functions. The LEFT function actually does the parsing but the FIND function lets you dynamically find a delimiting character, like a space.

Assuming cell A1 holds the value of Joseph F. Smith, you can parse Joseph by using a static value of seven for the first white space in the string. This works because each character maps to a string, and strings start with the number one. The LEFT function supports three formal parameters, the first is the string you’re parsing, the second is where to stop, and the third is where to start. The position you start at is assumed to be one, which makes the last parameter an optional parameter. You only provide the start with parameter when you want to start someplace other than the left most position.

This is the simplest use of the LEFT function with its two required parameters.

=LEFT(A1,7)

It returns the substring Joseph from the string Joseph F. Smith, which is stored in cell A1.

The simplest solution merely illustrates a concept. You’re not going to do this unless you have a bit more data. Naturally, the list of first names have different lengths in the real world. You must include the FIND function inside the call to the LEFT function to make your function dynamic. The FIND function lets you capture the position of a white space delimiter.

The FIND function supports three formal parameters, the first is the substring you’re looking for in the string, the second is the string to search, and the third is where to start. Like the LEFT function, the position you start at is assumed to be one, which makes the last parameter an optional parameter. You only provide the start with parameter when you want to start someplace other than the left most position. While not required, you should consider always providing the start with parameter to the FIND function.

The following shows you how to use the FIND function to locate the first occurrence of a white space in a string stored in cell A1.

=FIND(" ",A1,1)

You can now make you’re LEFT function dynamic by putting the FIND function inside it. As qualified, the FIND function looks for the first occurrence of a white space, and returns it into the second parameter of the LEFT function. The sample formula now has a nested function, which makes it a mega formula in Microsoft Excel parlance. The following sample also includes the optional start with parameter for clarity.

=LEFT(A1,FIND(" ",A1,1),1)

The problem with the foregoing solution occurs when a name in the list doesn’t have a middle initial or name, or last name. The lack of a second name means that there won’t be any white space between to substrings in the base string. This situation causes the logic to fail because the FIND function returns a #VALUE! error when it can’t find a white space in the string.

You can prevent the error by wrapping the nested FIND function and LEFT function inside two IFERROR functions. This IFERROR function returns the positional value of the FIND function, or an alternative copy of the base string.

=IFERROR(FIND(" ",A1,1),A1)

This type of logic inside the LEFT function causes the LEFT function to fail when the base string is returned to it instead of a position number. You must wrap the LEFT function inside another IFERROR function to guarantee that you don’t throw an error. This also lets you return the base string as the valid substring when appropriate.

=IFERROR(LEFT(A1,IFERROR(FIND(" ",A1),A1)-1),A1)

Assuming you put this formula in cell C1, you could copy this relative reference formula down in a column. It would require that you had a list of full names that followed a first name, white space, middle initial or name, white space, and last name in the A column.

Parsing the right substring ↓

This shows you how to parse a right substring from a text cell in Microsoft Excel.

There are three built in functions that let you parse dynamic substrings from the right of strings. They are the RIGHT and FIND functions that you may have covered when reading how to parse from the left. The LEN function is the other function, and it lets you find the length of a string. Together these functions lets you find the length of a substring on the right.

The RIGHT function actually does the parsing but the FIND and LEN functions let you dynamically find where to cut a substring out of a base string. This example continues to use the string Joseph F. Smith.

The RIGHT function has only two required parameters. The first parameter is the string that you’re parsing. The second parameter is the length of the substring. If you inspect the string, Smith is only five characters long. A static call to the RIGHT function is shown below.

=RIGHT(A1,5)

It returns the substring Smith from the string Joseph F. Smith, which is stored in cell A1.

Like the LEFT function example, this static approach to parsing merely illustrates a concept. You’re not going to do this with real data because the list of first names have different lengths in the real world. You must include the FIND function twice inside your call to the RIGHT function because you’re parsing the string based on the second occurrence of a white space in the string. While the FIND function lets you dynamically capture the position of the white space delimiter, a nested FIND function lets you capture the correct start with parameter value. That value is one position after the first occurrence of a white space.

The following shows how to use a FIND function to locate the position one beyond where the first occurrence of a white space is found in a string.

=FIND(" ",A1,1)+1

If you pass the preceding FIND function call as the start with parameter to another FIND function, you can locate the second occurrence of a white space in a string stored in cell A1. Nesting function calls inside functions creates what are known as mega formulas in Excel. The following demonstrates a mega formula to find the second instance of a white space in a string.

=FIND(" ",A1,FIND(" ",A1,1)+1)

This returns the value of ten. You now know where to start but not the length of the substring on the right. You first need to find the length of the total string. You use the LEN function to find that, like the following.

=LEN(A1)

The LEN function returns fifteen. You can calculate the length of the substring as five by subtracting the position of the second white space from the length of the string. While you could inspect that value in this one cell, you can’t do that when there are one hundred or one hundred thousand names in a list. The way to dynamically capture the right hand side substring length is shown below.

=LEN(A1)-FIND(" ",A1,FIND(" ",A1,1)+1)

Now that you know how to capture the length of the substring, you can create a larger mega forumla to parse the substring on the right from the base string. The working example follows below.

=RIGHT(A1,LEN(A1)-FIND(" ",A1,FIND(" ",A1,1)+1))

The problem with the foregoing solution is that it fails when one of the names in the list doesn’t have a middle initial or name. The failure occurs because there would only be one white space in the base string, and the logic expects two. The FIND function looking for the second white space returns a #VALUE! error.

You can prevent this error by wrapping the nested FIND function calls and RIGHT function with calls with an IFERROR function. An IFERROR function returns the positional value of the FIND function, or an alternate value, like a fresh copy of the base string.

=IFERROR(RIGHT(A1,LEN(A1)-IFERROR(FIND(" ",A1,FIND(" ",A1)+1),FIND(" ",A1))),A1)

The IFERROR function inside the RIGHT function can cause the RIGHT function to fail when a base string is returned instead of a position number. Therefore, you must also wrap the RIGHT function inside another IFERROR function to avoid an error. This guarantees the return of the base string as a valid substring.

Parsing the middle substring ↓

This shows you how to parse a middle substring from a text cell in Microsoft Excel.

There are three built in functions that are required to let you parse dynamic substrings from the middle of base strings. Two of them are the MID and FIND functions. You’ve seen how to use the FIND function in the left and right parsing examples of this blog. The third function is the IFERROR function, which is required when a middle string doesn’t exist. There are two more functions that let you trap for the possibility of a single base string. They are the IF and ISNUMBER functions.

The MID function takes three required parameters. The first is the text value or cell reference, the second is the start with value, and the third is the length of the substring. Dynamic substrings require you to bracket them, which means you need to find their beginning and ending positions and measure their length.

This basic idea means you parse the middle string from a set of three strings by finding their delimiters. As in the other examples, you’ll work with the string Joseph F. Smith as a base string. First, you find the first character of the middle string. You do this by finding the position of the first delimiting white space with a FIND function, and then you add one to the returned result value. The example is below.

=FIND(" ",A1,1)+1

Next, you find the position of the second delimiter. This requires that you create what is known as a mega formula, which you create by nesting one or more formulas in another. You can use the following formula to do that.

=FIND(" ",A1,FIND(" ",A1)+1)

You can then calculate the length of the middle string by subracting the first result from the second one, as shown below:

=FIND(" ",A1,FIND(" ",A1)+1)-FIND(" ",A1)-1

After you’ve mastered those formulas, you need to create a mega formula with the MID function. This doesn’t have any error trapping yet, so it is very dependent on data that contains three substrings separated by white spaces.

=MID(A1,FIND(" ",A1)+1,FIND(" ",A1,FIND(" ",A1)+1)-FIND(" ",A1)-1)

The preceding function returns F.. If the middle string were a middle name, it would return the middle name. It fails when you have a base string that lacks three substrings. You need to wrap the nested FIND function and MID function inside two respective IFERROR functions. An IFERROR function returns the positional value of the FIND function, or an alternate value, like a fresh copy of the base string or a numeric equivalent. The following example uses an IFERROR function call to substitutes a zero value because the absence of a second white space means there isn’t a middle string.

=MID(A5,FIND(" ",A5)+1,IFERROR(FIND(" ",A5,FIND(" ",A5)+1)-FIND(" ",A5)-1,0))

A second IFERROR wrapping the MID function lets you return a null value for middle name when there is only one name in the base string, like Joseph.

=IFERROR(MID(A5,FIND(" ",A5)+1,IFERROR(FIND(" ",A5,FIND(" ",A5)+1)-FIND(" ",A5)-1,0)),"")

Assuming you put this formula in cell D1, you could copy this relative reference formula down in a column. It would require that you had a list of full names that followed a first name, white space, middle initial or name, white space, and last name in the A column.

Data set ↓

A list of native strings, parsed first, middle, and last names, and concatenated names.

Original Name         First Name  Middle Name  Last Name  Resorted Name
Joseph Smith          Joseph                   Smith      Smith, Joseph 
Brigham Young	      Brigham                  Young      Young, Brigham 
John Taylor	      John                     Taylor     Taylor, John 
Wilford Woodruff      Wilford                  Woodruff   Woodruff, Wilford 
Lorenzo Snow          Lorenzo	               Snow       Snow, Lorenzo 
Joseph F. Smith       Joseph      F            Smith      Smith, Joseph F
Heber J. Grant        Heber       J            Grant      Grant, Heber J
George Albert Smith   George      Albert       Smith      Smith, George Albert
David O. Mckay        David       O            Mckay      Mckay, David O
Joseph Fielding Smith Joseph      Fielding     Smith      Smith, Joseph Fielding
Harold B. Lee         Harold      B            Lee        Lee, Harold B
Spencer W. Kimball    Spencer     W	       Kimball    Kimball, Spencer W
Ezra Taft Benson      Ezra        Taft         Benson     Benson, Ezra Taft
Howard W. Hunter      Howard      W            Hunter     Hunter, Howard W
Gordon B. Hinckley    Gordon      B            Hinckley   Hinckley, Gordon B
Thomas S. Monson      Thomas      S            Monson     Monson, Thomas S

Written by maclochlainn

February 27th, 2009 at 10:15 pm

Posted in basic programming,Excel 2007,Excel 2008

Tagged with Excel Development, Excel Functions, Excel Tricks

MacLochlainns Weblog

Archive for the ‘Excel Tricks’ tag

LIKE operator in Excel

Excel string parsing

Recent Posts

Things Written About

Pages

Blogroll

Archives