Excel 2007 | MacLochlainns Weblog

A colleague asked how a User Defined Function (UDF) could simplify their grading. They were using a VLOOKUP() function and range component that they’d copy into spreadsheets generated from Blackboard.

You typically want a letter grade before you transcribe it into the student record system. What you usually start with is a percentage and a defined range of percentages that map to letter grades. While you can solve that probelm with a VLOOKUP() function, is it the best way. I’d suggest this type of activity is ideally suited to a UDF solution.

The VLOOKUP() function call in cell C2 is:

=VLOOKUP(B2,$E$2:$G$13,MATCH("Letter Grade",$E$1:$G$1,FALSE),TRUE)

A User Defined Function (UDF) can replace the VLOOKUP() function, the nested MATCH() function, and lookup range. You create a record type, like VirtualLookup, which must be defined outside of a subroutine or function.

The letterGrade() UDF call in cell C2 is much simpler:

=letterGrade(B2)

You can refer to the following blog post for clarification on how to create a UDF. UDFs can simplify our lives for routine operations. If you’re not interested in the programming, just copy it into your Visual Basic for Application library and it should work.

These letterGrade examples require a custom data type. It must be defined before any function or subroutine in a module. It’s generally a good idea to make these available in other modules by making them Public.

' Define a record type (must happen outside of a function or sub block.
Type VirtualLookupRecord
  lowerLimit As Double
  upperLimit As Double
  letterGrade As String * 2    ' A variable length string up to 2 characters.
End Type

For those interested in understanding how to program this UDF. You leverage that User Defined Type (UDT) when you define the function. In this example, you create a fifteen element array of the record structure. The example defines an array with 1-based numbering index. The default for Microsoft Excel is 0-based numbering. You would define a 0-based numbered array of fifteen elements like the one in the example.

Here’s the complete function that you can copy into your spreadsheet.

Dim VirtualLookup(15) As VirtualLookup

Since the formal parameter of the function is optional, you need an if-block to replace a null value with a zero. Then, you need a loop to read through the array and find where the call parameter value is within the lower and upper grade range values. When it finds a match, it assigns the letter grade to a variable and then breaks out (exits) the loop. If you implement this, don’t forget to comment out the debugging MsgBox() function call.

' Define a function to convert a percentage into a letter grade.
Function letterGrade(Optional percentage As Double = 0)
 
  ' Define a return variable
  Dim grade As String * 2
 
  ' Define a single dimension array of a UDT (record)
  Dim VirtualLookup(1 To 12) As VirtualLookupRecord
 
  ' Record initialization
  VirtualLookup(1).lowerLimit = 0.93
  VirtualLookup(1).upperLimit = 1#
  VirtualLookup(1).letterGrade = "A"
  VirtualLookup(2).lowerLimit = 0.9
  VirtualLookup(2).upperLimit = 0.93
  VirtualLookup(2).letterGrade = "A-"
  VirtualLookup(3).lowerLimit = 0.87
  VirtualLookup(3).upperLimit = 0.9
  VirtualLookup(3).letterGrade = "B+"
  VirtualLookup(4).lowerLimit = 0.83
  VirtualLookup(4).upperLimit = 0.87
  VirtualLookup(4).letterGrade = "B"
  VirtualLookup(5).lowerLimit = 0.8
  VirtualLookup(5).upperLimit = 0.83
  VirtualLookup(5).letterGrade = "B-"
  VirtualLookup(6).lowerLimit = 0.77
  VirtualLookup(6).upperLimit = 0.8
  VirtualLookup(6).letterGrade = "C+"
  VirtualLookup(7).lowerLimit = 0.73
  VirtualLookup(7).upperLimit = 0.77
  VirtualLookup(7).letterGrade = "C"
  VirtualLookup(8).lowerLimit = 0.7
  VirtualLookup(8).upperLimit = 0.73
  VirtualLookup(8).letterGrade = "C-"
  VirtualLookup(9).lowerLimit = 0.67
  VirtualLookup(9).upperLimit = 0.7
  VirtualLookup(9).letterGrade = "D+"
  VirtualLookup(10).lowerLimit = 0.63
  VirtualLookup(10).upperLimit = 0.67
  VirtualLookup(10).letterGrade = "D"
  VirtualLookup(11).lowerLimit = 0.6
  VirtualLookup(11).upperLimit = 0.63
  VirtualLookup(11).letterGrade = "D-"
  VirtualLookup(12).lowerLimit = 0#
  VirtualLookup(12).upperLimit = 0.6
  VirtualLookup(12).letterGrade = "F"
 
  ' Read through the possible lookup array values.
  For i = 1 To (UBound(VirtualLookup) + 1)
 
    ' Assign a grade if the percentage criterion or criteria match.
    If percentage > VirtualLookup(1).lowerLimit Then
      grade = VirtualLookup(1).letterGrade
      ' Exit the loop.
      Exit For
    ElseIf percentage > 0 And _
           percentage >= VirtualLookup(i).lowerLimit And _
           percentage < VirtualLookup(i).upperLimit Then
      grade = VirtualLookup(i).letterGrade
      Exit For
    Else
      grade = "UW"
    End If
 
  Next i
 
  ' A debug message (remark out for deployment).
  MsgBox ("Completed [" + grade + "]")
 
  ' Return the letter grade.
  letterGrade = grade
 
End Function

' Define a function to convert a percentage into a letter grade. Function letterGrade(Optional percentage As Double = 0) ' Define a return variable Dim grade As String * 2 ' Define a single dimension array of a UDT (record) Dim VirtualLookup(1 To 12) As VirtualLookupRecord ' Record initialization VirtualLookup(1).lowerLimit = 0.93 VirtualLookup(1).upperLimit = 1# VirtualLookup(1).letterGrade = "A" VirtualLookup(2).lowerLimit = 0.9 VirtualLookup(2).upperLimit = 0.93 VirtualLookup(2).letterGrade = "A-" VirtualLookup(3).lowerLimit = 0.87 VirtualLookup(3).upperLimit = 0.9 VirtualLookup(3).letterGrade = "B+" VirtualLookup(4).lowerLimit = 0.83 VirtualLookup(4).upperLimit = 0.87 VirtualLookup(4).letterGrade = "B" VirtualLookup(5).lowerLimit = 0.8 VirtualLookup(5).upperLimit = 0.83 VirtualLookup(5).letterGrade = "B-" VirtualLookup(6).lowerLimit = 0.77 VirtualLookup(6).upperLimit = 0.8 VirtualLookup(6).letterGrade = "C+" VirtualLookup(7).lowerLimit = 0.73 VirtualLookup(7).upperLimit = 0.77 VirtualLookup(7).letterGrade = "C" VirtualLookup(8).lowerLimit = 0.7 VirtualLookup(8).upperLimit = 0.73 VirtualLookup(8).letterGrade = "C-" VirtualLookup(9).lowerLimit = 0.67 VirtualLookup(9).upperLimit = 0.7 VirtualLookup(9).letterGrade = "D+" VirtualLookup(10).lowerLimit = 0.63 VirtualLookup(10).upperLimit = 0.67 VirtualLookup(10).letterGrade = "D" VirtualLookup(11).lowerLimit = 0.6 VirtualLookup(11).upperLimit = 0.63 VirtualLookup(11).letterGrade = "D-" VirtualLookup(12).lowerLimit = 0# VirtualLookup(12).upperLimit = 0.6 VirtualLookup(12).letterGrade = "F" ' Read through the possible lookup array values. For i = 1 To (UBound(VirtualLookup) + 1) ' Assign a grade if the percentage criterion or criteria match. If percentage > VirtualLookup(1).lowerLimit Then grade = VirtualLookup(1).letterGrade ' Exit the loop. Exit For ElseIf percentage > 0 And _ percentage >= VirtualLookup(i).lowerLimit And _ percentage < VirtualLookup(i).upperLimit Then grade = VirtualLookup(i).letterGrade Exit For Else grade = "UW" End If Next i ' A debug message (remark out for deployment). MsgBox ("Completed [" + grade + "]") ' Return the letter grade. letterGrade = grade End Function

Somebody asked for a variable array set by the input parameters. That’s a bit more programming and requires understanding the two types of array initialization and the difference between 0-based and 1-based numbering systems. Since this program assigns a constructed Array type to a Variant data type variable, 0-based arrays are consistently used in the example. That’s a change from the foregoing example.

Here is a function that lets you set the hundredth limits to something other than the traditional .x3 and .x7. It substitutes default values if you attempt to enter something greater than a tenth for either limit.

' Define a function to convert a percentage into a letter grade.
Function letterGrade(Optional percentage As Double = 0, _
                     Optional minusTenth As Double = 0.03, _
                     Optional plusTenth As Double = 0.07)
 
  ' Define a variable length string variable.
  Dim grade As String * 2
 
  ' Define local variables.
  Dim minusInverse As Double
  Dim plusInverse As Double
  Dim tenth As Double
 
  ' Assign value to counter.
  Dim counter As Double
 
  ' Define a single dimension array of a UDT (record).
  Dim LetterGrades As Variant
  Dim VirtualLookup(0 To 11) As VirtualLookupRecord
 
  ' Fix incorrect numeric data entry of lower bound.
  If minusTenth >= 0.1 Then
    minusTenth = 0.03
  End If
 
  ' Fix incorrect numeric data entry of upper bound.
  If plusTenth >= 0.1 Then
    plusTenth = 0.07
  End If
 
  ' Assign values to local variables.
  counter = 1#
  minusInverse = 0.1 - minusTenth
  plusInverse = 0.1 - plusTenth
  tenth = 0.1
 
  ' Initialize letter array.
  LetterGrades = Array("A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "F")
 
  ' Use the array of letter grades to dynamically assign lower and upper bounds.
  For i = 0 To UBound(LetterGrades)
 
    ' There are three models for grades.
    ' -------------------------------------------------------------------
    '  [If]    =[A]     Has two possible values, a + or unadorned.
    '  [ElseIf]=[B,C,D] Have three possible values, a +, -, or unadorned.
    '  [ElseIf]=[F]     Has only an unadorned.
    ' -------------------------------------------------------------------
    If Left(LetterGrades(i), 1) = "A" Then
 
      ' The grade is one character for an unadorned grade.
      If Len(LetterGrades(i)) = 1 Then
        VirtualLookup(i).lowerLimit = counter - minusInverse
        VirtualLookup(i).upperLimit = counter
        VirtualLookup(i).letterGrade = LetterGrades(i)
 
      ' The grade is more than one character and second character a minus.
      ElseIf Len(LetterGrades(i)) > 1 And Mid(LetterGrades(i), 2, 1) = "-" Then
        VirtualLookup(i).lowerLimit = counter - tenth
        VirtualLookup(i).upperLimit = counter - minusInverse
        VirtualLookup(i).letterGrade = LetterGrades(i)
      End If
 
    ElseIf Left(LetterGrades(i), 1) = "B" Or _
           Left(LetterGrades(i), 1) = "C" Or _
           Left(LetterGrades(i), 1) = "D" Then
 
      ' The grade is one character for an unadorned grade.
      If Len(LetterGrades(i)) = 1 Then
        VirtualLookup(i).lowerLimit = counter - minusInverse
        VirtualLookup(i).upperLimit = counter - plusInverse
        VirtualLookup(i).letterGrade = LetterGrades(i)
 
      ' The grade is more than one character.
      ElseIf Len(LetterGrades(i)) > 1 Then
 
        ' The second character is a plus.
        If Mid(LetterGrades(i), 2, 1) = "+" Then
          VirtualLookup(i).lowerLimit = counter - plusInverse
          VirtualLookup(i).upperLimit = counter
          VirtualLookup(i).letterGrade = LetterGrades(i)
 
        ' The second character is a minus.
        ElseIf Mid(LetterGrades(i), 2, 1) = "-" Then
          VirtualLookup(i).lowerLimit = counter - tenth
          VirtualLookup(i).upperLimit = counter - minusInverse
          VirtualLookup(i).letterGrade = LetterGrades(i)
        End If
 
      End If
 
    ElseIf Left(LetterGrades(i), 1) = "F" Then
 
      VirtualLookup(i).lowerLimit = 0#
      VirtualLookup(i).upperLimit = counter
      VirtualLookup(i).letterGrade = LetterGrades(i)
 
    End If
 
    ' Debug message demonstrating the changed values of the dynamically
    ' constructed array (remark out for deployment).
    ' MsgBox (" Counter [" + CStr(counter) + _
    '         "] Index [" + CStr(i) + _
    '         "] Grade [" + CStr(LetterGrades(i)) + _
    '         "] LLimit [" + CStr(VirtualLookup(i).lowerLimit) + _
    '         "] ULimit [" + CStr(VirtualLookup(i).upperLimit) + "]")
 
    ' Increment the tenth value when letter grades change, like A to B, et cetera.
    If LetterGrades(i) = LetterGrades(UBound(LetterGrades)) Then
 
      ' Force a loop exit to avoid reading past the last index value.
      Exit For
 
    ElseIf Not Left(LetterGrades(i), 1) = Left(LetterGrades(i + 1), 1) Then
 
      ' Decrement the tenth placeholder.
      counter = counter - tenth
 
    End If
 
  Next i
 
  ' Read through the possible lookup array values.
  For i = 0 To UBound(VirtualLookup)
 
    ' Assign a grade if the percentage criterion or criteria match, and
    ' force loop exits until the last element of the array is read.
    ' -------------------------------------------------------------------
    '  [If]     Assumes an A is always the first letter.
    '  [ElseIf] Handles all numbers greater than zero.
    '  [Else]   Assumes a zero or null indicate an unofficial withdrawal.
    ' -------------------------------------------------------------------
    If percentage > VirtualLookup(0).lowerLimit Then
 
      ' Assign grade, alsways the first element of the array.
      grade = VirtualLookup(0).letterGrade
 
      ' Force a loop exit when match found.
      Exit For
 
    ElseIf percentage > 0 And _
           percentage >= VirtualLookup(i).lowerLimit And _
           percentage < VirtualLookup(i).upperLimit Then
 
      ' Assign grade.
      grade = VirtualLookup(i).letterGrade
 
      ' Force a loop exit.
      Exit For
 
    Else
 
      ' Assign grade.
      grade = "UW"
 
    End If
 
  Next i
 
  ' A debug message (remark out for deployment).
  ' MsgBox ("Completed [" + grade + "]")
 
  ' Return the letter grade.
  letterGrade = grade
 
End Function

' Define a function to convert a percentage into a letter grade. Function letterGrade(Optional percentage As Double = 0, _ Optional minusTenth As Double = 0.03, _ Optional plusTenth As Double = 0.07) ' Define a variable length string variable. Dim grade As String * 2 ' Define local variables. Dim minusInverse As Double Dim plusInverse As Double Dim tenth As Double ' Assign value to counter. Dim counter As Double ' Define a single dimension array of a UDT (record). Dim LetterGrades As Variant Dim VirtualLookup(0 To 11) As VirtualLookupRecord ' Fix incorrect numeric data entry of lower bound. If minusTenth >= 0.1 Then minusTenth = 0.03 End If ' Fix incorrect numeric data entry of upper bound. If plusTenth >= 0.1 Then plusTenth = 0.07 End If ' Assign values to local variables. counter = 1# minusInverse = 0.1 - minusTenth plusInverse = 0.1 - plusTenth tenth = 0.1 ' Initialize letter array. LetterGrades = Array("A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "F") ' Use the array of letter grades to dynamically assign lower and upper bounds. For i = 0 To UBound(LetterGrades) ' There are three models for grades. ' ------------------------------------------------------------------- ' [If] =[A] Has two possible values, a + or unadorned. ' [ElseIf]=[B,C,D] Have three possible values, a +, -, or unadorned. ' [ElseIf]=[F] Has only an unadorned. ' ------------------------------------------------------------------- If Left(LetterGrades(i), 1) = "A" Then ' The grade is one character for an unadorned grade. If Len(LetterGrades(i)) = 1 Then VirtualLookup(i).lowerLimit = counter - minusInverse VirtualLookup(i).upperLimit = counter VirtualLookup(i).letterGrade = LetterGrades(i) ' The grade is more than one character and second character a minus. ElseIf Len(LetterGrades(i)) > 1 And Mid(LetterGrades(i), 2, 1) = "-" Then VirtualLookup(i).lowerLimit = counter - tenth VirtualLookup(i).upperLimit = counter - minusInverse VirtualLookup(i).letterGrade = LetterGrades(i) End If ElseIf Left(LetterGrades(i), 1) = "B" Or _ Left(LetterGrades(i), 1) = "C" Or _ Left(LetterGrades(i), 1) = "D" Then ' The grade is one character for an unadorned grade. If Len(LetterGrades(i)) = 1 Then VirtualLookup(i).lowerLimit = counter - minusInverse VirtualLookup(i).upperLimit = counter - plusInverse VirtualLookup(i).letterGrade = LetterGrades(i) ' The grade is more than one character. ElseIf Len(LetterGrades(i)) > 1 Then ' The second character is a plus. If Mid(LetterGrades(i), 2, 1) = "+" Then VirtualLookup(i).lowerLimit = counter - plusInverse VirtualLookup(i).upperLimit = counter VirtualLookup(i).letterGrade = LetterGrades(i) ' The second character is a minus. ElseIf Mid(LetterGrades(i), 2, 1) = "-" Then VirtualLookup(i).lowerLimit = counter - tenth VirtualLookup(i).upperLimit = counter - minusInverse VirtualLookup(i).letterGrade = LetterGrades(i) End If End If ElseIf Left(LetterGrades(i), 1) = "F" Then VirtualLookup(i).lowerLimit = 0# VirtualLookup(i).upperLimit = counter VirtualLookup(i).letterGrade = LetterGrades(i) End If ' Debug message demonstrating the changed values of the dynamically ' constructed array (remark out for deployment). ' MsgBox (" Counter [" + CStr(counter) + _ ' "] Index [" + CStr(i) + _ ' "] Grade [" + CStr(LetterGrades(i)) + _ ' "] LLimit [" + CStr(VirtualLookup(i).lowerLimit) + _ ' "] ULimit [" + CStr(VirtualLookup(i).upperLimit) + "]") ' Increment the tenth value when letter grades change, like A to B, et cetera. If LetterGrades(i) = LetterGrades(UBound(LetterGrades)) Then ' Force a loop exit to avoid reading past the last index value. Exit For ElseIf Not Left(LetterGrades(i), 1) = Left(LetterGrades(i + 1), 1) Then ' Decrement the tenth placeholder. counter = counter - tenth End If Next i ' Read through the possible lookup array values. For i = 0 To UBound(VirtualLookup) ' Assign a grade if the percentage criterion or criteria match, and ' force loop exits until the last element of the array is read. ' ------------------------------------------------------------------- ' [If] Assumes an A is always the first letter. ' [ElseIf] Handles all numbers greater than zero. ' [Else] Assumes a zero or null indicate an unofficial withdrawal. ' ------------------------------------------------------------------- If percentage > VirtualLookup(0).lowerLimit Then ' Assign grade, alsways the first element of the array. grade = VirtualLookup(0).letterGrade ' Force a loop exit when match found. Exit For ElseIf percentage > 0 And _ percentage >= VirtualLookup(i).lowerLimit And _ percentage < VirtualLookup(i).upperLimit Then ' Assign grade. grade = VirtualLookup(i).letterGrade ' Force a loop exit. Exit For Else ' Assign grade. grade = "UW" End If Next i ' A debug message (remark out for deployment). ' MsgBox ("Completed [" + grade + "]") ' Return the letter grade. letterGrade = grade End Function

I hope this helps folks that have to perform grading activities in Excel.

Microsoft Excel supports macros but it also supports library functions, known as User Defined Functions (UDF). Library functions are less risky than macros because they must return a value to a cell. This is a quick tutorial, mostly for my students, but as usual for anybody who’s interested.

Microsoft Excel User Defined Functions (UDFs) are different than standard VBA macros. They’re behavior is restricted. You can’t access other cells in a workbook, and may only return a value (also known as an expression) to the cell that uses the formula. That having been said, they can dramatically hide the complexity of mega-formulas and remove them from the editing control of users.

Here are the steps to configure your Excel 2007 environment to work with Visual Basic for Applications (VBA) and UDFs. They show you how to display the developer ribbon, open a module for editing, save VBA code into a library (*.xlam file), and add the library file as an “Add-in” library to your Excel 2007 installation. The name of the library will be the same as the Workbook where you created it.

Displaying Excel’s Developer Tab

There are four steps to make this visible. They are:

Choose the Office Button. You’ll see the following:

Click the Excel Options button.

Click the Popular tab if not highlighted (it’s the default). Inside the right side pane, click the Show Developer tab in the Ribbon check box to enable it, and click the OK button to set it.

Click the Developer ribbon tab. Click the left most icon to launch the Visual Basic Editor screen.

Open a Module in the Visual Basic Editor by clicking the Insert menu item and choosing the Module element.

Copy the following function definition into the open Module. The name of the UDF will be the case-sensistive name of the function in the VBA module. That means you’ll now have a newSerialDate function in your list of functions when you click the insert function button.

I struggled to come up with a simple function to illustrate this and how you debug UDFs. A word to the wise, you can’t use a numeric variable inside a MsgBox by itself. If you attempt it, Excel will return a #Value! error in the cell where you call the UDF. You must include a numeric variable as an argument (also known as a call parameter) to the CStr() function. An example is: Cstr(myVariable). You call this sample function by entering:

=monthEnd(cell_reference)

When you pass a date (actually an integer in the spreadsheet), the function handles it as a string. While no data type is assigned the dateIn variable in the example, variables without an explicit data type are always of the Variant data type. The Variant data type is a master data type and can hold any other type, which makes it like the Object data type in Java.

This program parses the string, then uses the DateSerial() function to return it as a number. It’s critical to note that the last line returns the value in the newSerialDate variable, and that variable must always be the function name. Place a single quote mark before all MsgBox() function calls after verifying that the function works.

Public Function newSerialDate(dateIn)
 
  ' Define local variables.
  Dim day As String
  Dim month As String
  Dim year As String
  Dim startPosition As Integer
  Dim endPosition As Integer
  Dim length As Integer
 
  ' Initialize local variables and parse the month from the left.
  startPosition = 1
  endPosition = InStr(startPosition, dateIn, "/")
  month = Left(dateIn, endPosition - 1)
 
  ' This is a debugging tool to track variables during execution.
  MsgBox ("Month [" + month + "]")
 
  ' Shift the start position and parse the day from the middle.
  startPosition = endPosition + 1
  endPosition = InStr(startPosition, dateIn, "/")
  day = Mid(dateIn, startPosition, endPosition - startPosition)
 
  ' This is a debugging tool to track variables during execution.
  MsgBox ("Day [" + day + "]")
 
  ' Get the remainder of the string.
  startPosition = endPosition
  length = Len(dateIn) - startPosition
  year = Right(dateIn, length)
 
  ' This is a debugging tool to track variables during execution.
  MsgBox ("Year [" + year + "]")
 
  ' Return a value from a User Defined Function (UDF) by using
  ' the function name as the return variable.
  newSerialDate = DateSerial(year, month, day)
 
End Function

Aside from the fact that all this parsing isn’t really necessary because the problem is much simpler and cleaner. At least, it becomes so when you understand the breadth of built-in functions in VBA. You can solve the problem by designating the formal parameter as a Double like the example below.

Function newSerialDate(dateIn As Double)
  newSerialDate = dateIn
End Function

Alternatively, you can accept a Variant, which maps to a String. Then, you convert it to a Date like this:

Function newSerialDate(dateIn)
  newSerialDate = DateValue(dateIn)
End Function

Create the following spreadsheet, the formula values are noted below the screen shot. You should be able to copy and past them into the spreadsheet. After you’ve created the spreadsheet, entering a new date in cell A1 causes the UDF to run. When the UDF runs “as-is”, you’ll see message dialog boxes that show values set during runtime.

Spreadsheet Functions
Cell	Formula
A2	`=newSerialDate(A1)`
C1	`=VLOOKUP(TYPE(A1),$E$3:$F$7,MATCH("Meaning",$E$2:$F$2,FALSE))`
C2	`=VLOOKUP(TYPE(A2),$E$3:$F$7,MATCH("Meaning",$E$2:$F$2,FALSE))`

You can now save this as an Add In library but comment out those debug MsgBox() function calls. Click the Office Button and click SaveAs in the menu, then accept Excel Workbook initially. When you get to the SaveAs dialog, choose Excel Add-In as the file type. Below is a screen capture of the drop down selection box.

Open a new Excel Workbook. Click the Excel Options button. Click the Add-Ins tab. Inside the right side pane, make sure the Manage drop down says Excel Add-ins before you click the Go button.

Check the Samplefunction check box as shown below. Samplefunction is the name of the Workbook that contains the module, and it is saved as an *.xlam file. Click the OK button to add the library. You’ve now created, and added an Add-In library to your new spreadsheet. It will create a SampleFunction.xlam file in the users directory.

The file is found by default in:

C:\Users\UserName\AppData\Roaming\Microsoft\AddIns

Ranjit asked how you could call a UDF from inside a module. In the answer noted below, I show how to do it with the Unary function:

Function Unary(number As Integer)
  Unary = number + 1
End Function

Then, enter the following ordinary Increment macro:

Sub Increment()
  ActiveCell.FormulaR1C1 = Unary(ActiveCell.Value())
End Sub

Enter a number in a cell, then navigate to Tools -> Macro -> Macros… and choose the increment macro, which increments the value previously in the cell by calling the Unary function.

As always, I hope this is helpful to a few folks.

I popped for QuickTime Pro for Windows 7 ($29.99). The reason for doing so, was to create native screen capture that would integrate with Mac Adobe Premiere Pro, CS4. That’s because my Camtasia output didn’t work due to a compression mismatch.

Unfortunately, QuickTime Pro on Windows 7 doesn’t support screen capture. Quite a difference between what the product does on a Mac OS X versus a Windows 7 OS. I thought only Microsoft shorted users on the other platform, like Microsoft Excel 2007 versus Microsoft Excel 2008 (a far inferior product). Oops, I was wrong! Apple does it too. 🙁

Over a year ago, I put out a blog page on how to query the contents of a table from an Oracle database through Excel. I meant to get back to provide much more about this. Specifically, I want to cover the XML mechanism used to accomplish the task. However, a blog reader augmented my page with a solution comment. I’m thrilled he did because it tells me to get a basic thing out rather than hang folks up halfway. My thanks go to Vernon.

Here’s a quick addendum to the original post with screen shots because folks tell me they’re very useful. It demonstrates how you write queries in Microsoft Excel against the Oracle database. While I plan a full tutorial on a different URL, this should help everybody in the meantime. This content is dependent on the generic mechanism, which I covered in this older post.

It starts with an alternative to Step #9 in the older blog page, and therefore, I’ve started the numbering as 9 there.

The Import Data dialog asks you how and where you want to import it. When you don’t want the contents of a simple table, click the Properties button.

Clicking the prior dialog’s Property button brings you to the Connection Properties dialog. The Usage tab is first for a reason but the Definition tab is where you need to go to enter a free form query. Click the Definition tab highlighted in yellow below.

In this dialog, the most important boxes are the Command type (as you may guess, Table is the default value) and Command text boxes. The first thing to do, is click on the Command type multiple select widget and choose SQL.

Now, you can write any query that you’d like against the selected Oracle schema. You may reference any tables or views for which your user (schema) has SELECT permissions. You don’t need to limit yourself to a single table because ANSI SQL:89 (comma delimited tables), and ANSI SQL:92 (INNER JOIN, et cetera) are both supported. Don’t forget that Oracle uses a different concatenation method than Microsoft Access or SQL Server, you can find that here. After you’ve entered your query, click the OK button to proceed.

When you clicked OK, you’ve instructed Microsoft Excel 2007 to change the connection, it raises this dialog box (click on it if you want to read it). You click Yes to continue and save your query.

You’re back to where you began, more or less, this it the Import Data dialog. This is Step #9 but this time it’ll run your query not return a simple table content. Click OK to run the query. Don’t be surprised if you get a message saying no password was provided. It just means you’ll be prompted to enter one. You should never store that password because it’s stored in plain text inside an XML file.

You can find those data source files in this directory:

C:\Document and Settings\<some_user>\My Documents\My Data Sources

It’s always interesting troubleshooting things that students run into when they work with Excel. For example, in Excel when is one not equal to one, or for that matter when is one less than one? If you use the CODE(cell_reference) function, you’d find both cells contain the same ASCII value of 49.

If you’re wondering why the one on the left is less than the one on the right, the one on the right is a text value and the one on the left is a number. If you use the TYPE(cell_reference) function, you’d find that a text field returns a 2 and a number (or date) cell returns a 1. All equality comparisons are identity comparisons in Microsoft Excel, which means they compare type and value. Inequality comparisons, compare type and return a value based on the type number and never compare value when the types differ. There are three other types: (1) A logical value is 4; (2) An error message is 16; and (3) An array is 64.

The answer is when one cell contains a number and the other cell contains a string (text value). That’s done here by putting an apostrophe before the number and right aligning the cell. You solve this by using the VALUE(cell_reference) function to ensure that you’re comparing values not unknown data types because there’s no implicit casting resolution in Microsoft Excel and all comparisons are identity operations.

I haven’t checked Open Office or Numbers, but I’ll do that and update this posting with findings.

I put together a post on how to upload to MySQL from a CSV file with dates. It was more or less for my students but one of them was curious how the mega formula worked. As commented, the easier solution is to create a custom format. Oddly, Open Office does support the MySQL default format natively.

Excel doesn’t support the native MySQL date format as a default format mask, which is YYYY-MM-DD, or 2009-06-02 for June 2, 2009. That means you have to convert it from a scalar date to a string or create a custom format mask (see Dmitri’s comment below). If you just shook your head at the term scalar date, maybe a custom format mask is best. However, if you want a programming solution let me explain that Excel supports only three data types. They’re a string literal, a numeric literal, and a formula. Dates in Excel are merely formatted numbers. When the numbers are integers, the date is a date, but when the number has a fractional component, the date is really a timestamp.

Here’s a brief description of the process required to convert a date in Excel into a MySQL date format string literal in a CSV file. You need the following Excel functions:

Date Functions

The DAY(date) function returns a 1 or 2 digit numeric value for the day of the month, with ranges of 1 to 28, 1 to 29, 1 to 30, or 1 to 31 dependent on the month and year.
The MONTH(date) function returns a 1 or 2 digit numeric value for the month of the year.
The YEAR(date) function returns a 4 digit numeric value for the year.

Logical Functions

The IF(logical_expression,truth_action,false_action) function returns the truth action when the expression is true, and the false action when the expression isn’t true.

MySQL Server

CONCATENATE(string_1, string_2, ...) glues strings together.
LEN(numeric_value) function returns the length of a string or number.

MySQL requires that you return an eight character string of numbers. The first four numbers must be a valid year, the fifth and sixth numbers a valid month, and the seventh and eigth numbers a valid day in the context of the year and month provided. Unfortunately, the DAY() and MONTH() functions may return a 1 or 2 digit value. That can’t happen in the CSV file’s string for a date, so you use the IF() and LEN() functions to guarantee a 2 digit return value.

Here are the examples that guarantee 2 digit day and month values, assuming that the base date is in the A1 cell. The concatenation of a "0" (zero between two double quotes) or the "" (two double quotes or a string null) ensures the number data types become strings.

=IF(LEN(DAY(A1))=1,CONCATENATE("0",DAY(A1)),DAY(A1))
=IF(LEN(MONTH(A1))=1,CONCATENATE("0",MONTH(A1)),MONTH(A1))

A zero is placed before the day or month when the logical condition is met, which means the day or month value is a single digit string. A null is place before the day or month when the logical condition isn’t met, which means the day or month value is a two digit string. There’s only one problem with these mega functions. They return a number.

The year calculation doesn’t require the explicit casting when you concatenate it with the other strings because it is implicitly cast as a string. However, it’s a better practice to include it for clarity (most folks don’t know about the implicit casting behaviors in Excel).

=CONCATENATE(YEAR(A1),"-",IF(LEN(MONTH(A1))=1,CONCATENATE("0",MONTH(A1)),MONTH(A1)),"-",IF(LEN(DAY(A1)) = 1,CONCATENATE("0",DAY(A1)),DAY(A1)))

As Goodwin reported in a comment, there’s an easier way that I missed. You can simply use the TEXT function when the source column is a valid serialized date value.

=TEXT(A1,"YYYYMMDD")

You can see the full MySQL import from CSV in the previous post. Naturally, you may want to copy and paste special the value before creating the CSV file. Also, don’t forget to delete any unused columns to the right or rows beneath because if you don’t your file won’t map to your table definition.

You have one approach to uploading CSV file data into a MySQL database. You use the LOAD statement. There two caveats about using the LOAD command, and they are:

You must convert dates to the default MySQL format – YYYYMMDD, or the four digit year, two digit month, and two digit day.
You can’t use AUTO_INCREMENT from the source to the loading table.

It’s untrue that you must input dates as strings and convert them in the database. However, you can do that if you like. I’d suggest you open the file in Microsoft Excel, then create a column with the correct data format. You can do that by creating a custom format, or with a formula. I figure the custom format is easiest but sometimes a formula is necessary for applications.

Excel Formula to Convert a Date to MySQL Format ↓

This is an example formula that’ll convert any supported Excel date format into a CSV-compliant MySQL date string.

The formula is large because MySQL can’t process a date when a one character month or day occurs in the string. You can find more on the logic of the Excel mega formula in this blog post.

=CONCATENATE(CONCATENATE("",YEAR(A1)),IF(LEN(MONTH(A1)),CONCATENATE("0",MONTH(A1))),IF(LEN(DAY(A1)) = 1,CONCATENATE("0",DAY(A1)),CONCATENATE("",DAY(A1))))

You should note that loading a date at the end of a line has problems in MySQL. The easiest fix that I’ve found is to place a comma at the end of each line. In a Microsoft world, that eliminates the need for the \r from the LINES TERMINATED BY clause.

While this shows a trivial amount of data, here’s a CSV file to use in the test case. Create a directory like Data off your base logical drive or mount point. Then create a file named transaction_upload.csv and put the following data set in it.

Create an importing.sql file in the C:\Data folder with the contents below. It’s only a recommendation but generally import tables aren’t useful in between uploads, and that’s why I create the table as an in-memory table. In-memory tables are dropped when you recycle (boot) the MySQL instance.

-- Conditionally drop the table.
DROP TABLE IF EXISTS transaction_upload;
 
-- Create the new upload target table.
CREATE TABLE transaction_upload
( transaction_date   DATE
, transaction_amount FLOAT ) ENGINE=MEMORY;
 
-- Load the data from a file, don't forget the \n after the \r on Windows or it won't work.
LOAD DATA INFILE 'c:/Data/mysql/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';
 
-- Select the uploaded records.
SELECT * FROM transaction_upload;

Before you connect as anything other than the root account, you’ll need to grant the global FILE privilege to the targeted user. This is true because you’ve not qualified a LOCAL file in the import statement. Users that have all privileges on a given database also have read-write access to LOCAL tables, which is read write to client-side files. When the LOCAL key word is omitted, you’re instructing a read from the server by the client tool. You must therefore grant a global permission to enable a connected user can call a server-side file.

If you want to import without granting any additional global permissions, you can rewrite the statement like this:

LOAD DATA LOCAL INFILE 'c:/Data/mysql/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';

If you don’t want to rewrite the statement, you’ll get the following error:

ERROR 1045 (28000): Access denied FOR USER 'student'@'localhost' (USING password: YES)
Empty SET (0.01 sec)

Grant Global File Permission ↓

This shows you how to grant a global file permissions to a restricted user. It is only necessary when you want the user to read or write server-side files.

Connect as the root user:

C:\Data> mysql -uroot -pcangetin -P3306

Then, grant the global privilege to the user:

mysql> GRANT FILE ON *.* TO 'student'@'localhost' IDENTIFIED BY 'student';

Now, you can connect to the data base as the restricted user and read an external server-side file:

C:\Data> mysql -ustudent -pstudent -P3306

Once you’ve granted these permissions, the user is no longer truly a restricted user. I’d strongly discourage doing this if the user is accessed via web applications.

Run the script:

mysql> \. importing.sql

Then, you should see the following:

+------------------+--------------------+
| transaction_date | transaction_amount |
+------------------+--------------------+
| 2009-01-02       |                  5 |
| 2009-12-31       |                  5 |
| 2009-11-28       |                  5 |
| 2009-06-16       |                  5 |
+------------------+--------------------+
4 rows in set (0.00 sec)

Another quick tidbit, dates can be tricky when they’re the last column in the file. The best solution is to put a comma at the end of each string. If you put the comma at the end of each line, you can also dispense with the \r on the Windows platform.

Somebody tweeted tonight and asked how to left pad a string in Microsoft Excel 2007. I thought it was an interesting question, and replied. Then, I thought oops … it should be on the blog.

You can use a combination of the REPLACE and REPT (believe it or not that’s the repeat function). The first step to left pad requires you to master the REPLACE function. It takes four actual parameters. The first is the old or base string, which is typically a cell reference. The second is the starting position, the third is the length of the base string to replace, and fourth the new string. The following example assumes you want to replace the whole string with a white space before the original string.

=REPLACE(A1,1,LEN(A1)," "&A1)

If you don’t like using the ampersand to concatenate, you can replace it with the CONCATENATE function, like this:

=REPLACE(A1,1,LEN(A1),CONCATENATE(" ",A1))

While more compact, the ampersand does the trick and ultimately saves space for complex mega formulas. The next step requires you learn how to use the REPT function.

The REPT function takes two parameters, which are the string you want to repeat and the number of times you want to repeat it. Assuming that you expect all strings to be smaller than twenty, you can use the REPT function this way to create a dynamic padding of physical white space.

=REPT(" ",20-LEN(A1))

When you put the concepts together, you’d get the following command to dynamically replace a string in cell A1 with a 20 character cell that is right aligned. The new value would be in the cell where you put this formula.

=REPLACE(A1,1,LEN(A1),REPT(" ",20-LEN(A1))&A1)

This lets you left pad a string, which I suppose is great when you want to output a flat position specific file. I’d prefer a CSV and a more robust programming environment but I hope it helps those who want to do it.

As Andy pointed out, you can simplify it with this:

=REPT(" ",20-LEN(A1))&A1

As an update to the original blog post, here’s a Visual Basic for Applications (VBA) User Defined Function (UDF) that would provide you with an add-in function to perform this task. It’s designed to LPAD numbers or text, and always return text. You can find how to develop and deploy Excel VBA UDFs in this more recent blog post of mine.

' The function takes any input as a cell and manages numbers different than strings.
Function LPAD(cell As Variant _
             , Optional offset As Variant = 0 _
             , Optional padWith As String = "0")
 
  ' Return variable.
  Dim returnString As String
  Dim whiteSpace As String * 1
 
  ' Reports a meaningful error when a non-numeric offset parameter is provided as the offset call parameter.
  If Not IsNumeric(offset) Then
 
    ' This means the function is called incorrectly but suppresses a #VALUE! error.
    MsgBox ("Offset should not be a non-numeric value of [" + offset + "]!")
 
  ' Dynamically assign the offset for the optional offset variable.
  ElseIf offset = 0 Then
    offset = Len(cell)
  Else
    ' This allows you to see the offset value when it is provided as a number.
    ' MsgBox ("offset is [" + CStr(offset) + "]")
  End If
 
  ' Assign default value.
  whiteSpace = " "
 
  ' Ensure size is at least the length of the cell and padding value.
   If IsNumeric(offset) And Len(cell) > offset Then
    offset = Len(cell) + Len(padWith)
   End If
 
  ' Assign default padding value when cell is a number.
  If IsNumeric(cell) And IsNumeric(padWith) Then
    padNumberWith = CInt(padWith)
  Else
    padNumberWith = 0
  End If
 
  ' Convert to string when numeric, use padWith text value when not null and whitespace if null.
  If IsNumeric(cell) Then
    returnString = Application.Rept("0", offset - Len(Application.Text(cell, padNumberWith))) + CStr(cell)
  ElseIf padWith <> "0" Then
    returnString = Application.Rept(padWith, offset - Len(cell)) + cell
  ElseIf padWith = "0" Then
    returnString = Application.Rept(whiteSpace, offset - Len(cell)) + cell
  Else
    returnString = Application.Rept(" ", offset - Len(cell)) + cell
  End If
 
  ' Return formatted string.
  LPAD = returnString
 
End Function

The formula for a number is different than it would be natively in Excel. That’s because some behaviors are implicitly provided. The equivalent function in Excel is:

=REPT("0",20-LEN(TEXT(A6,"0")))&A6

Microsoft Excel performs the LEN() function implicitly if you leave it out. For example, this works the same as the more complete formula above.

=REPT("0",20-(TEXT(A6,"0"))&A6

If you forget the LEN() function call in the VBA module, it raises an Excel Error 20150. This error is suppressed by a #VALUE! error. That error is returned because an error message can’t be concatenated with a string. Once you identify where the error occurs you can enclose it in a CStr() function call. The CStr() function explicitly casts the Err.number value to a string, which is then returned to the worksheet in the cell where you’ve called the function.

All that being said, it’s not nearly that complicated. You can provide the LPAD UDF in a simpler function that doesn’t use the built-in workbook functions. All you need to know is how to us the String() function, like this:

Function LPAD(padding AS Variant, character AS String, cell As String)
 
  ' Define and assign a default value, which must be a string data type.
  Dim character_default As String * 1
  character_default = " " 
  Dim number_default As String * 1
  number_default = "0" 
 
  ' Check if you've got adequate values to proceed.
  If IsNumeric(padding) And IsNumeric(cell) Then
    If IsNumeric(character) Then
      LPAD = String(padding, character) + cell
    Else
      LPAD = String(padding, number_default) + cell
    End If
  Else
    If IsNumeric(character) Then
      LPAD = String(padding, character) + cell
    Else
      LPAD = String(padding, character_default) + cell
    End If
  End If
 
End Function

You would call this UDF with any of the following function calls. You skip the last parameter because its an optional parameter.

User Defined Function Calls
Cell	Formula
A2	`=LPAD(A1,10)`
A3	`=LPAD(A1,30,"-")`
A4	`=LPAD(A1,20,0)`

Parsing strings isn’t the easiest thing to do in Excel 2007 or Excel 2008 but it is an important thing to know how to do. You’ll learn how to parse a set names into first, middle and last names. You can find the data set for these examples at the bottom of the blog post.

Parsing the left substring ↓

This shows you how to parse a left substring from a text cell in Microsoft Excel.

There are two built in functions that let you parse dynamic substrings from the left of a string. They are the LEFT and FIND functions. The LEFT function actually does the parsing but the FIND function lets you dynamically find a delimiting character, like a space.

Assuming cell A1 holds the value of Joseph F. Smith, you can parse Joseph by using a static value of seven for the first white space in the string. This works because each character maps to a string, and strings start with the number one. The LEFT function supports three formal parameters, the first is the string you’re parsing, the second is where to stop, and the third is where to start. The position you start at is assumed to be one, which makes the last parameter an optional parameter. You only provide the start with parameter when you want to start someplace other than the left most position.

This is the simplest use of the LEFT function with its two required parameters.

=LEFT(A1,7)

It returns the substring Joseph from the string Joseph F. Smith, which is stored in cell A1.

The simplest solution merely illustrates a concept. You’re not going to do this unless you have a bit more data. Naturally, the list of first names have different lengths in the real world. You must include the FIND function inside the call to the LEFT function to make your function dynamic. The FIND function lets you capture the position of a white space delimiter.

The FIND function supports three formal parameters, the first is the substring you’re looking for in the string, the second is the string to search, and the third is where to start. Like the LEFT function, the position you start at is assumed to be one, which makes the last parameter an optional parameter. You only provide the start with parameter when you want to start someplace other than the left most position. While not required, you should consider always providing the start with parameter to the FIND function.

The following shows you how to use the FIND function to locate the first occurrence of a white space in a string stored in cell A1.

=FIND(" ",A1,1)

You can now make you’re LEFT function dynamic by putting the FIND function inside it. As qualified, the FIND function looks for the first occurrence of a white space, and returns it into the second parameter of the LEFT function. The sample formula now has a nested function, which makes it a mega formula in Microsoft Excel parlance. The following sample also includes the optional start with parameter for clarity.

=LEFT(A1,FIND(" ",A1,1),1)

The problem with the foregoing solution occurs when a name in the list doesn’t have a middle initial or name, or last name. The lack of a second name means that there won’t be any white space between to substrings in the base string. This situation causes the logic to fail because the FIND function returns a #VALUE! error when it can’t find a white space in the string.

You can prevent the error by wrapping the nested FIND function and LEFT function inside two IFERROR functions. This IFERROR function returns the positional value of the FIND function, or an alternative copy of the base string.

=IFERROR(FIND(" ",A1,1),A1)

This type of logic inside the LEFT function causes the LEFT function to fail when the base string is returned to it instead of a position number. You must wrap the LEFT function inside another IFERROR function to guarantee that you don’t throw an error. This also lets you return the base string as the valid substring when appropriate.

=IFERROR(LEFT(A1,IFERROR(FIND(" ",A1),A1)-1),A1)

Assuming you put this formula in cell C1, you could copy this relative reference formula down in a column. It would require that you had a list of full names that followed a first name, white space, middle initial or name, white space, and last name in the A column.

Parsing the right substring ↓

This shows you how to parse a right substring from a text cell in Microsoft Excel.

There are three built in functions that let you parse dynamic substrings from the right of strings. They are the RIGHT and FIND functions that you may have covered when reading how to parse from the left. The LEN function is the other function, and it lets you find the length of a string. Together these functions lets you find the length of a substring on the right.

The RIGHT function actually does the parsing but the FIND and LEN functions let you dynamically find where to cut a substring out of a base string. This example continues to use the string Joseph F. Smith.

The RIGHT function has only two required parameters. The first parameter is the string that you’re parsing. The second parameter is the length of the substring. If you inspect the string, Smith is only five characters long. A static call to the RIGHT function is shown below.

=RIGHT(A1,5)

It returns the substring Smith from the string Joseph F. Smith, which is stored in cell A1.

Like the LEFT function example, this static approach to parsing merely illustrates a concept. You’re not going to do this with real data because the list of first names have different lengths in the real world. You must include the FIND function twice inside your call to the RIGHT function because you’re parsing the string based on the second occurrence of a white space in the string. While the FIND function lets you dynamically capture the position of the white space delimiter, a nested FIND function lets you capture the correct start with parameter value. That value is one position after the first occurrence of a white space.

The FIND function supports three formal parameters, the first is the substring you’re looking for in the string, the second is the string to search, and the third is where to start. Like the LEFT function, the position you start at is assumed to be one, which makes the last parameter an optional parameter. You only provide the start with parameter when you want to start someplace other than the left most position. While not required, you should consider always providing the start with parameter to the FIND function. It improves readability.

The following shows how to use a FIND function to locate the position one beyond where the first occurrence of a white space is found in a string.

=FIND(" ",A1,1)+1

If you pass the preceding FIND function call as the start with parameter to another FIND function, you can locate the second occurrence of a white space in a string stored in cell A1. Nesting function calls inside functions creates what are known as mega formulas in Excel. The following demonstrates a mega formula to find the second instance of a white space in a string.

=FIND(" ",A1,FIND(" ",A1,1)+1)

This returns the value of ten. You now know where to start but not the length of the substring on the right. You first need to find the length of the total string. You use the LEN function to find that, like the following.

=LEN(A1)

The LEN function returns fifteen. You can calculate the length of the substring as five by subtracting the position of the second white space from the length of the string. While you could inspect that value in this one cell, you can’t do that when there are one hundred or one hundred thousand names in a list. The way to dynamically capture the right hand side substring length is shown below.

=LEN(A1)-FIND(" ",A1,FIND(" ",A1,1)+1)

Now that you know how to capture the length of the substring, you can create a larger mega forumla to parse the substring on the right from the base string. The working example follows below.

=RIGHT(A1,LEN(A1)-FIND(" ",A1,FIND(" ",A1,1)+1))

The problem with the foregoing solution is that it fails when one of the names in the list doesn’t have a middle initial or name. The failure occurs because there would only be one white space in the base string, and the logic expects two. The FIND function looking for the second white space returns a #VALUE! error.

You can prevent this error by wrapping the nested FIND function calls and RIGHT function with calls with an IFERROR function. An IFERROR function returns the positional value of the FIND function, or an alternate value, like a fresh copy of the base string.

=IFERROR(RIGHT(A1,LEN(A1)-IFERROR(FIND(" ",A1,FIND(" ",A1)+1),FIND(" ",A1))),A1)

The IFERROR function inside the RIGHT function can cause the RIGHT function to fail when a base string is returned instead of a position number. Therefore, you must also wrap the RIGHT function inside another IFERROR function to avoid an error. This guarantees the return of the base string as a valid substring.

Assuming you put this formula in cell C1, you could copy this relative reference formula down in a column. It would require that you had a list of full names that followed a first name, white space, middle initial or name, white space, and last name in the A column.

Parsing the middle substring ↓

This shows you how to parse a middle substring from a text cell in Microsoft Excel.

There are three built in functions that are required to let you parse dynamic substrings from the middle of base strings. Two of them are the MID and FIND functions. You’ve seen how to use the FIND function in the left and right parsing examples of this blog. The third function is the IFERROR function, which is required when a middle string doesn’t exist. There are two more functions that let you trap for the possibility of a single base string. They are the IF and ISNUMBER functions.

The MID function takes three required parameters. The first is the text value or cell reference, the second is the start with value, and the third is the length of the substring. Dynamic substrings require you to bracket them, which means you need to find their beginning and ending positions and measure their length.

This basic idea means you parse the middle string from a set of three strings by finding their delimiters. As in the other examples, you’ll work with the string Joseph F. Smith as a base string. First, you find the first character of the middle string. You do this by finding the position of the first delimiting white space with a FIND function, and then you add one to the returned result value. The example is below.

=FIND(" ",A1,1)+1

Next, you find the position of the second delimiter. This requires that you create what is known as a mega formula, which you create by nesting one or more formulas in another. You can use the following formula to do that.

=FIND(" ",A1,FIND(" ",A1)+1)

You can then calculate the length of the middle string by subracting the first result from the second one, as shown below:

=FIND(" ",A1,FIND(" ",A1)+1)-FIND(" ",A1)-1

After you’ve mastered those formulas, you need to create a mega formula with the MID function. This doesn’t have any error trapping yet, so it is very dependent on data that contains three substrings separated by white spaces.

=MID(A1,FIND(" ",A1)+1,FIND(" ",A1,FIND(" ",A1)+1)-FIND(" ",A1)-1)

The preceding function returns F.. If the middle string were a middle name, it would return the middle name. It fails when you have a base string that lacks three substrings. You need to wrap the nested FIND function and MID function inside two respective IFERROR functions. An IFERROR function returns the positional value of the FIND function, or an alternate value, like a fresh copy of the base string or a numeric equivalent. The following example uses an IFERROR function call to substitutes a zero value because the absence of a second white space means there isn’t a middle string.

=MID(A5,FIND(" ",A5)+1,IFERROR(FIND(" ",A5,FIND(" ",A5)+1)-FIND(" ",A5)-1,0))

A second IFERROR wrapping the MID function lets you return a null value for middle name when there is only one name in the base string, like Joseph.

=IFERROR(MID(A5,FIND(" ",A5)+1,IFERROR(FIND(" ",A5,FIND(" ",A5)+1)-FIND(" ",A5)-1,0)),"")

Assuming you put this formula in cell D1, you could copy this relative reference formula down in a column. It would require that you had a list of full names that followed a first name, white space, middle initial or name, white space, and last name in the A column.

Data set ↓

A list of native strings, parsed first, middle, and last names, and concatenated names.

Original Name         First Name  Middle Name  Last Name  Resorted Name
Joseph Smith          Joseph                   Smith      Smith, Joseph 
Brigham Young	      Brigham                  Young      Young, Brigham 
John Taylor	      John                     Taylor     Taylor, John 
Wilford Woodruff      Wilford                  Woodruff   Woodruff, Wilford 
Lorenzo Snow          Lorenzo	               Snow       Snow, Lorenzo 
Joseph F. Smith       Joseph      F            Smith      Smith, Joseph F
Heber J. Grant        Heber       J            Grant      Grant, Heber J
George Albert Smith   George      Albert       Smith      Smith, George Albert
David O. Mckay        David       O            Mckay      Mckay, David O
Joseph Fielding Smith Joseph      Fielding     Smith      Smith, Joseph Fielding
Harold B. Lee         Harold      B            Lee        Lee, Harold B
Spencer W. Kimball    Spencer     W	       Kimball    Kimball, Spencer W
Ezra Taft Benson      Ezra        Taft         Benson     Benson, Ezra Taft
Howard W. Hunter      Howard      W            Hunter     Hunter, Howard W
Gordon B. Hinckley    Gordon      B            Hinckley   Hinckley, Gordon B
Thomas S. Monson      Thomas      S            Monson     Monson, Thomas S

Since Excel 2007 and 2008 don’t support a direct exclusive or – =XOR() – function, you’d think the logical operators could do it. Something like this:

=AND(NOT(AND(A1,B1)),OR(A1,B1))

It surprised me when it didn’t work. Curious, I figured out why it wouldn’t work. The AND() function doesn’t work as qualified in the help file.

excellogicdefinitions

The AND() function only works correctly when both cell values are not null values. It returns a Boolean true when one value is true and the other value is null, which was quite a surprise.

Fortunately, the IF() function works correctly. If you check a cell with a null value, the IF() function finds it false. That means you can wrap the cells in the IF() function, like:

=AND(NOT(AND(IF(A1,TRUE(),FALSE()),IF(B1,TRUE(),FALSE()))),OR(IF(A1,TRUE(),FALSE()),IF(B1,TRUE(),FALSE())))

Naturally, you can substitute the respective string literal for either the TRUE() or FALSE() function calls. You could also rewrite the formula to check for a null value by using two double quotes. By the way, it’s fun to note that the TYPE() function recognizes a null value cell as a number data type.

It’s ashame it doesn’t work the way you’d think it should.

MacLochlainns Weblog

Archive for the ‘Excel 2007’ Category

UDF replaces VLOOKUP

Excel UDF Tutorial

QuickTime Pro on Windows

SQL Query in Excel 2007

Excel doesn’t implicitly cast

Excel date conversion

MySQL Upload from CSV

How to LPAD in Excel

Excel string parsing

Excel 2007 AND() function

Recent Posts

Things Written About

Pages

Blogroll

Archives