MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Archive for the ‘Excel 2008’ Category

Excel: How-to-use SUMIF

with 2 comments

Debbie asked a question about why a comparison wasn’t working in the Microsoft Excel SUMIF function when the initial array value was empty. This post answers that question.

The SUMIF function takes a lookup range, which anticipates a one column with a lookup value and another column with a value. The SUMIFS function works along the same line but accepts multiple lookups against a series of lookup list columns.

This shows a simple model of the SUMIF function:

SUMIF_Drawing1

The formula in cell G3 is:

=SUMIF(B$3:B$5,"<"&$E$3,C$3:C$5)

If you left off the $ symbol and made the comparison "<"&E3, you could add the sums for any number of values that are less than a value in the same row (or counting index with named ranges). You could also make the inequality symbol a parameter with this modified formula:

=SUMIF(B$3:B$5,$E$5&$E$3,C$3:C$5)

SUMIF_Drawing3

Now, your analysis model lets you choose whether you want to look for the sum greater than or less than a specific number in the lookup range. Named ranges compare index values between the lookup range and comparison range values, which allows them to be in different rows or spreadsheets. The only problem is that the label has an inverted inequality sign. You can fix that by constantly editing the description or by simply replace the text of the label with a dynamic text string, like this:

=CONCATENATE("Comparison """,$E$5,""" Value")

After applying the fix to the label, it looks like:

SUNIF_Drawing3Fixed

When the lookup column contains all nulls, the same formula returns a zero, as shown in the illustration.

SUMIF_Drawing2Fixed

You'll also get an error flag, unless you unchecked the Flag formulas that refer to empty cells in the error checking options. The IFERROR function can't be used to suppress this type of error.

Why doesn't the error return a null value? That's because an equality operator compares values in the array against a static or dynamic lookup value. While equality operators don't implicit cast values for comparison, they also don't compare null values. Only the ISBLANK function lets you find an empty cell and it can't be used inside a SUMIF function call.

Written by maclochlainn

January 13th, 2013 at 5:15 pm

Excel 2011 Needs ODBC Files

with one comment

I’d hoped for a better solution with Excel 2011 on Mac OS X but it appears we still need the third party drivers to connect Excel to MySQL. It’s the same as I reported earlier on Excel 2008. Here’s the message and you can click on it to go the web site and links to buy the drivers.

I tried getting away with installing the MySQL Connector/ODBC before downloading one of those recommended by Microsoft. I discovered that it didn’t work.

Therefore, I download and installed the OpenLink Software ODBC Software, instructions with screen shots are in this blog page. I discovered that this software support track isn’t purely native Mac OS X on Intel-based software because it requires Rosetta like Excel 2008. It also only connects with Microsoft Query, which appears to be a Power PC native application too, at least based on the interface and look and feel. I didn’t do much more research because I’ve concluded that connectivity isn’t ready for prime time on the Mac OS X platform. Microsoft still has more work to do.

Written by maclochlainn

February 10th, 2011 at 3:15 am

Excel Data Validation List

with 4 comments

A second Excel 2011 question in as many days, February is off and running. The question posed: “How do you create a data validation list without referring to a range of cells?”

You can create a data validation list without referring to a cell range. There’s a small trick to it. You enter the list of values as comma delimited strings, but you don’t enclose them in double quotes. Excel reads a set of literal text strings as if they’re a list of cell values, which is why you don’t provide the double quotes that you use in every function.

The desired list was either TRUE or FALSE, and the required behavior was that they function like Boolean variables. Fortunately, that’s straightforward when you remember that the case insensitive text strings TRUE and FALSE are Boolean values.

You create a data validation list by navigating to the Data Tab (Windows Excel 2010 it’s a Ribbon), and choose Validate. You’ll see the following list. Click on Data Validation… in the list, as shown below.

The selection launches the Data Validation dialog box. Choose List from the Allow poplist. Then, enter TRUE,FALSE in the Source entry field, like this:

Now click OK and you have a poplist. Hope this answer the question for others too.

Written by maclochlainn

February 2nd, 2011 at 9:30 pm

Excel 1st Day of Next Month

with 50 comments

I had an interesting reaction when I told my students they had to create a dynamic Excel model that required them to enter the months of the year without AutoFill feature. They were stumped. They couldn’t find a function to perform it for them.

It’s really quite easy, the EDATE function lets you find it directly. Enter any day in the current month as the first parameter and one as the number of months to add as the second parameter to the function. Voilà, you have a function to add a month to any date. The only time it returns the first of the next month is when the source date was the first day of the current month.

Here’s the solution when cell B1 contains January 14, 2011 and you want cell C1 calculated as the fourteenth day of February in the same year or February 14, 2011:

=EDATE(B1,1)

Here’s the solution when cell B1 contains January 14, 2011 and you want cell C1 calculated as the fourteenth day of December in the prior year or December 14, 2010:

=EDATE(B1,-1)

You can also use the EOMONTH function to find the first or last day of the month. It lets you find it the first day of the next month regardless of the start date. All you need is a simple trick.

Enter any day in the current month as the first parameter and zero as the number of months to add as the second parameter to the function, and then add one.

Here’s the solution when cell B1 contains any day in January and you want cell C1 calculated as the last day of January in the same year (rounding up).

=EOMONTH(B1,0)

Here’s the solution when cell B1 contains any day in January and you want cell C1 calculated as the first day of January in the same year (rounding down).

=EOMONTH(B1,-1)+1

A subsequent question asked how to calculate the 15th day of the current month, and here’s the formula:

=EOMONTH("10-Jul-2012",-1)+15

Hope this helps them and others looking for the trick.

Written by maclochlainn

February 1st, 2011 at 7:54 pm

Between Vlookup Key

with 4 comments

While the VLOOKUP and HLOOKUP functions are powerful tools in Excel, they limit their search to the first column or row of a range. This post shows you how to leverage the COUNTIFS function to perform an inclusive between search against the first two columns or rows of a range.

Moreover, the VLOOKUP function searches the first column of a range and looks for an approximate or exact match. The approximate match search throws an exception when the lookup key’s value precedes the first item in the range, and picks up the row immediately less than the lookup key. This type of search requires two condition. The data must be presorted into an ascending order for the sort column, the range must be contiguous, and two matching keys shouldn’t exist. That means that each search column or row cell points to the row of interest for any search key greater than it and less than the next. An exact match search finds the row that matches the lookup key and throws an error when there isn’t an exact match.

Neither of these allow for range searches between non-contiguous sets, like the one below. The date ranges where a value should be found are from the 16th of a month to the end of a month rather than the range between the 16th of one month to the 15th of the next. While this could be done by structuring a row with zeros the gap periods, a more effective solution is possible by using the COUNTIFS. At least, this is true from Excel 2007 forward.

The solution to this problem starts with recognizing how a COUNTIFS works. The COUNTIF provides the opportunity to compare a range of values against a single value, and the COUNTIFS allows multiple comparisons of values against ranges of values. The COUNTIFS function returns the number of matches that meet all conditions. Therefore, when a value is found in only one of the ranges the COUNTIFS function returns a 1, and when a value is found n times it returns n as a number.

The formula in cell F2 checks for the number of times the value in F1 exists:

=COUNTIFS($A$1:$A$13,"<="&F1,$B$1:$B$13,">="&F1)

If you evaluate when the foregoing function returns 1 before performing a VLOOKUP function, you can guarantee a match within a non-contiguous range of values. That formula is:

=IF(COUNTIFS($A$1:$A$13,"<="&F1,$B$1:$B$13,">="&F1)=1,VLOOKUP(F1,A2:C13,3),0)

Hope this helps some folks, as always …

Written by maclochlainn

January 26th, 2011 at 1:17 am

QuickTime Pro on Windows

with 5 comments

I popped for QuickTime Pro for Windows 7 ($29.99). The reason for doing so, was to create native screen capture that would integrate with Mac Adobe Premiere Pro, CS4. That’s because my Camtasia output didn’t work due to a compression mismatch.

Unfortunately, QuickTime Pro on Windows 7 doesn’t support screen capture. Quite a difference between what the product does on a Mac OS X versus a Windows 7 OS. I thought only Microsoft shorted users on the other platform, like Microsoft Excel 2007 versus Microsoft Excel 2008 (a far inferior product). Oops, I was wrong! Apple does it too. 🙁

Written by maclochlainn

May 22nd, 2010 at 10:47 am

Excel doesn’t implicitly cast

with 10 comments

It’s always interesting troubleshooting things that students run into when they work with Excel. For example, in Excel when is one not equal to one, or for that matter when is one less than one? If you use the CODE(cell_reference) function, you’d find both cells contain the same ASCII value of 49.

ExcelCellComp

If you’re wondering why the one on the left is less than the one on the right, the one on the right is a text value and the one on the left is a number. If you use the TYPE(cell_reference) function, you’d find that a text field returns a 2 and a number (or date) cell returns a 1. All equality comparisons are identity comparisons in Microsoft Excel, which means they compare type and value. Inequality comparisons, compare type and return a value based on the type number and never compare value when the types differ. There are three other types: (1) A logical value is 4; (2) An error message is 16; and (3) An array is 64.

The answer is when one cell contains a number and the other cell contains a string (text value). That’s done here by putting an apostrophe before the number and right aligning the cell. You solve this by using the VALUE(cell_reference) function to ensure that you’re comparing values not unknown data types because there’s no implicit casting resolution in Microsoft Excel and all comparisons are identity operations.

ExcelValueComp

I haven’t checked Open Office or Numbers, but I’ll do that and update this posting with findings.

Written by maclochlainn

November 10th, 2009 at 7:39 pm

Excel date conversion

with 24 comments

I put together a post on how to upload to MySQL from a CSV file with dates. It was more or less for my students but one of them was curious how the mega formula worked. As commented, the easier solution is to create a custom format. Oddly, Open Office does support the MySQL default format natively.

Excel doesn’t support the native MySQL date format as a default format mask, which is YYYY-MM-DD, or 2009-06-02 for June 2, 2009. That means you have to convert it from a scalar date to a string or create a custom format mask (see Dmitri’s comment below). If you just shook your head at the term scalar date, maybe a custom format mask is best. However, if you want a programming solution let me explain that Excel supports only three data types. They’re a string literal, a numeric literal, and a formula. Dates in Excel are merely formatted numbers. When the numbers are integers, the date is a date, but when the number has a fractional component, the date is really a timestamp.

Here’s a brief description of the process required to convert a date in Excel into a MySQL date format string literal in a CSV file. You need the following Excel functions:

Date Functions

  • The DAY(date) function returns a 1 or 2 digit numeric value for the day of the month, with ranges of 1 to 28, 1 to 29, 1 to 30, or 1 to 31 dependent on the month and year.
  • The MONTH(date) function returns a 1 or 2 digit numeric value for the month of the year.
  • The YEAR(date) function returns a 4 digit numeric value for the year.

Logical Functions

  • The IF(logical_expression,truth_action,false_action) function returns the truth action when the expression is true, and the false action when the expression isn’t true.

MySQL Server

  • CONCATENATE(string_1, string_2, ...) glues strings together.
  • LEN(numeric_value) function returns the length of a string or number.

MySQL requires that you return an eight character string of numbers. The first four numbers must be a valid year, the fifth and sixth numbers a valid month, and the seventh and eigth numbers a valid day in the context of the year and month provided. Unfortunately, the DAY() and MONTH() functions may return a 1 or 2 digit value. That can’t happen in the CSV file’s string for a date, so you use the IF() and LEN() functions to guarantee a 2 digit return value.

Here are the examples that guarantee 2 digit day and month values, assuming that the base date is in the A1 cell. The concatenation of a "0" (zero between two double quotes) or the "" (two double quotes or a string null) ensures the number data types become strings.

=IF(LEN(DAY(A1))=1,CONCATENATE("0",DAY(A1)),DAY(A1))
=IF(LEN(MONTH(A1))=1,CONCATENATE("0",MONTH(A1)),MONTH(A1))

A zero is placed before the day or month when the logical condition is met, which means the day or month value is a single digit string. A null is place before the day or month when the logical condition isn’t met, which means the day or month value is a two digit string. There’s only one problem with these mega functions. They return a number.

The year calculation doesn’t require the explicit casting when you concatenate it with the other strings because it is implicitly cast as a string. However, it’s a better practice to include it for clarity (most folks don’t know about the implicit casting behaviors in Excel).

=CONCATENATE(YEAR(A1),"-",IF(LEN(MONTH(A1))=1,CONCATENATE("0",MONTH(A1)),MONTH(A1)),"-",IF(LEN(DAY(A1)) = 1,CONCATENATE("0",DAY(A1)),DAY(A1)))

As Goodwin reported in a comment, there’s an easier way that I missed. You can simply use the TEXT function when the source column is a valid serialized date value.

=TEXT(A1,"YYYYMMDD")

You can see the full MySQL import from CSV in the previous post. Naturally, you may want to copy and paste special the value before creating the CSV file. Also, don’t forget to delete any unused columns to the right or rows beneath because if you don’t your file won’t map to your table definition.

Written by maclochlainn

June 16th, 2009 at 8:15 pm

MySQL Upload from CSV

with 6 comments

You have one approach to uploading CSV file data into a MySQL database. You use the LOAD statement. There two caveats about using the LOAD command, and they are:

  • You must convert dates to the default MySQL format – YYYYMMDD, or the four digit year, two digit month, and two digit day.
  • You can’t use AUTO_INCREMENT from the source to the loading table.

It’s untrue that you must input dates as strings and convert them in the database. However, you can do that if you like. I’d suggest you open the file in Microsoft Excel, then create a column with the correct data format. You can do that by creating a custom format, or with a formula. I figure the custom format is easiest but sometimes a formula is necessary for applications.

You should note that loading a date at the end of a line has problems in MySQL. The easiest fix that I’ve found is to place a comma at the end of each line. In a Microsoft world, that eliminates the need for the \r from the LINES TERMINATED BY clause.

While this shows a trivial amount of data, here’s a CSV file to use in the test case. Create a directory like Data off your base logical drive or mount point. Then create a file named transaction_upload.csv and put the following data set in it.

20090102,5
20091231,5
20091128,5
20090616,5

Create an importing.sql file in the C:\Data folder with the contents below. It’s only a recommendation but generally import tables aren’t useful in between uploads, and that’s why I create the table as an in-memory table. In-memory tables are dropped when you recycle (boot) the MySQL instance.

-- Conditionally drop the table.
DROP TABLE IF EXISTS transaction_upload;
 
-- Create the new upload target table.
CREATE TABLE transaction_upload
( transaction_date   DATE
, transaction_amount FLOAT ) ENGINE=MEMORY;
 
-- Load the data from a file, don't forget the \n after the \r on Windows or it won't work.
LOAD DATA INFILE 'c:/Data/mysql/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';
 
-- Select the uploaded records.
SELECT * FROM transaction_upload;

Before you connect as anything other than the root account, you’ll need to grant the global FILE privilege to the targeted user. This is true because you’ve not qualified a LOCAL file in the import statement. Users that have all privileges on a given database also have read-write access to LOCAL tables, which is read write to client-side files. When the LOCAL key word is omitted, you’re instructing a read from the server by the client tool. You must therefore grant a global permission to enable a connected user can call a server-side file.

If you want to import without granting any additional global permissions, you can rewrite the statement like this:

LOAD DATA LOCAL INFILE 'c:/Data/mysql/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';

If you don’t want to rewrite the statement, you’ll get the following error:

ERROR 1045 (28000): Access denied FOR USER 'student'@'localhost' (USING password: YES)
Empty SET (0.01 sec)

Run the script:

mysql> \. importing.sql

Then, you should see the following:

+------------------+--------------------+
| transaction_date | transaction_amount |
+------------------+--------------------+
| 2009-01-02       |                  5 |
| 2009-12-31       |                  5 |
| 2009-11-28       |                  5 |
| 2009-06-16       |                  5 |
+------------------+--------------------+
4 rows in set (0.00 sec)

Another quick tidbit, dates can be tricky when they’re the last column in the file. The best solution is to put a comma at the end of each string. If you put the comma at the end of each line, you can also dispense with the \r on the Windows platform.

Written by maclochlainn

June 16th, 2009 at 6:42 pm

How to LPAD in Excel

with 23 comments

Somebody tweeted tonight and asked how to left pad a string in Microsoft Excel 2007. I thought it was an interesting question, and replied. Then, I thought oops … it should be on the blog.

You can use a combination of the REPLACE and REPT (believe it or not that’s the repeat function). The first step to left pad requires you to master the REPLACE function. It takes four actual parameters. The first is the old or base string, which is typically a cell reference. The second is the starting position, the third is the length of the base string to replace, and fourth the new string. The following example assumes you want to replace the whole string with a white space before the original string.

=REPLACE(A1,1,LEN(A1)," "&A1)

If you don’t like using the ampersand to concatenate, you can replace it with the CONCATENATE function, like this:

=REPLACE(A1,1,LEN(A1),CONCATENATE(" ",A1))

While more compact, the ampersand does the trick and ultimately saves space for complex mega formulas. The next step requires you learn how to use the REPT function.

The REPT function takes two parameters, which are the string you want to repeat and the number of times you want to repeat it. Assuming that you expect all strings to be smaller than twenty, you can use the REPT function this way to create a dynamic padding of physical white space.

=REPT(" ",20-LEN(A1))

When you put the concepts together, you’d get the following command to dynamically replace a string in cell A1 with a 20 character cell that is right aligned. The new value would be in the cell where you put this formula.

=REPLACE(A1,1,LEN(A1),REPT(" ",20-LEN(A1))&A1)

This lets you left pad a string, which I suppose is great when you want to output a flat position specific file. I’d prefer a CSV and a more robust programming environment but I hope it helps those who want to do it.

As Andy pointed out, you can simplify it with this:

=REPT(" ",20-LEN(A1))&A1

As an update to the original blog post, here’s a Visual Basic for Applications (VBA) User Defined Function (UDF) that would provide you with an add-in function to perform this task. It’s designed to LPAD numbers or text, and always return text. You can find how to develop and deploy Excel VBA UDFs in this more recent blog post of mine.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
' The function takes any input as a cell and manages numbers different than strings.
Function LPAD(cell As Variant _
             , Optional offset As Variant = 0 _
             , Optional padWith As String = "0")
 
  ' Return variable.
  Dim returnString As String
  Dim whiteSpace As String * 1
 
  ' Reports a meaningful error when a non-numeric offset parameter is provided as the offset call parameter.
  If Not IsNumeric(offset) Then
 
    ' This means the function is called incorrectly but suppresses a #VALUE! error.
    MsgBox ("Offset should not be a non-numeric value of [" + offset + "]!")
 
  ' Dynamically assign the offset for the optional offset variable.
  ElseIf offset = 0 Then
    offset = Len(cell)
  Else
    ' This allows you to see the offset value when it is provided as a number.
    ' MsgBox ("offset is [" + CStr(offset) + "]")
  End If
 
  ' Assign default value.
  whiteSpace = " "
 
  ' Ensure size is at least the length of the cell and padding value.
   If IsNumeric(offset) And Len(cell) > offset Then
    offset = Len(cell) + Len(padWith)
   End If
 
  ' Assign default padding value when cell is a number.
  If IsNumeric(cell) And IsNumeric(padWith) Then
    padNumberWith = CInt(padWith)
  Else
    padNumberWith = 0
  End If
 
  ' Convert to string when numeric, use padWith text value when not null and whitespace if null.
  If IsNumeric(cell) Then
    returnString = Application.Rept("0", offset - Len(Application.Text(cell, padNumberWith))) + CStr(cell)
  ElseIf padWith <> "0" Then
    returnString = Application.Rept(padWith, offset - Len(cell)) + cell
  ElseIf padWith = "0" Then
    returnString = Application.Rept(whiteSpace, offset - Len(cell)) + cell
  Else
    returnString = Application.Rept(" ", offset - Len(cell)) + cell
  End If
 
  ' Return formatted string.
  LPAD = returnString
 
End Function

The formula for a number is different than it would be natively in Excel. That’s because some behaviors are implicitly provided. The equivalent function in Excel is:

=REPT("0",20-LEN(TEXT(A6,"0")))&A6

Microsoft Excel performs the LEN() function implicitly if you leave it out. For example, this works the same as the more complete formula above.

=REPT("0",20-(TEXT(A6,"0"))&A6

If you forget the LEN() function call in the VBA module, it raises an Excel Error 20150. This error is suppressed by a #VALUE! error. That error is returned because an error message can’t be concatenated with a string. Once you identify where the error occurs you can enclose it in a CStr() function call. The CStr() function explicitly casts the Err.number value to a string, which is then returned to the worksheet in the cell where you’ve called the function.

All that being said, it’s not nearly that complicated. You can provide the LPAD UDF in a simpler function that doesn’t use the built-in workbook functions. All you need to know is how to us the String() function, like this:

Function LPAD(padding AS Variant, character AS String, cell As String)
 
  ' Define and assign a default value, which must be a string data type.
  Dim character_default As String * 1
  character_default = " " 
  Dim number_default As String * 1
  number_default = "0" 
 
  ' Check if you've got adequate values to proceed.
  If IsNumeric(padding) And IsNumeric(cell) Then
    If IsNumeric(character) Then
      LPAD = String(padding, character) + cell
    Else
      LPAD = String(padding, number_default) + cell
    End If
  Else
    If IsNumeric(character) Then
      LPAD = String(padding, character) + cell
    Else
      LPAD = String(padding, character_default) + cell
    End If
  End If
 
End Function

You would call this UDF with any of the following function calls. You skip the last parameter because its an optional parameter.

User Defined Function Calls
Cell Formula
A2 =LPAD(A1,10)
A3 =LPAD(A1,30,"-")
A4 =LPAD(A1,20,0)

Written by maclochlainn

March 4th, 2009 at 12:17 am